Merged
6 changes: 6 additions & 0 deletions docker/Dockerfile
@@ -169,6 +169,12 @@ COPY --chown=runner:runner metrics-server.sh /usr/local/bin/metrics-server.sh
COPY --chown=runner:runner metrics-collector.sh /usr/local/bin/metrics-collector.sh
RUN chmod +x /usr/local/bin/metrics-server.sh /usr/local/bin/metrics-collector.sh

# Copy job lifecycle hook scripts (Phase 3: DORA Metrics)
# TASK-028: Runner calls these via ACTIONS_RUNNER_HOOK_JOB_STARTED/COMPLETED
COPY --chown=runner:runner job-started.sh /usr/local/bin/job-started.sh
COPY --chown=runner:runner job-completed.sh /usr/local/bin/job-completed.sh
Comment on lines +174 to +175

Severity: medium

To improve Docker image layer efficiency and align with best practices, you can combine these two COPY instructions into a single one since they target the same destination directory.

COPY --chown=runner:runner job-started.sh job-completed.sh /usr/local/bin/

RUN chmod +x /usr/local/bin/job-started.sh /usr/local/bin/job-completed.sh


# Final image runs as unprivileged runner user.
USER runner
6 changes: 6 additions & 0 deletions docker/Dockerfile.chrome
@@ -246,6 +246,12 @@ COPY --chown=runner:runner metrics-server.sh /usr/local/bin/metrics-server.sh
COPY --chown=runner:runner metrics-collector.sh /usr/local/bin/metrics-collector.sh
RUN chmod +x /usr/local/bin/metrics-server.sh /usr/local/bin/metrics-collector.sh

# Copy job lifecycle hook scripts (Phase 3: DORA Metrics)
# TASK-028: Runner calls these via ACTIONS_RUNNER_HOOK_JOB_STARTED/COMPLETED
COPY --chown=runner:runner job-started.sh /usr/local/bin/job-started.sh
COPY --chown=runner:runner job-completed.sh /usr/local/bin/job-completed.sh
Comment on lines +251 to +252

Severity: medium

To improve Docker image layer efficiency and align with best practices, you can combine these two COPY instructions into a single one since they target the same destination directory.

COPY --chown=runner:runner job-started.sh job-completed.sh /usr/local/bin/

RUN chmod +x /usr/local/bin/job-started.sh /usr/local/bin/job-completed.sh

# TASK-014: Expose Prometheus metrics port
EXPOSE 9091

6 changes: 6 additions & 0 deletions docker/Dockerfile.chrome-go
@@ -278,6 +278,12 @@ COPY --chown=runner:runner metrics-server.sh /usr/local/bin/metrics-server.sh
COPY --chown=runner:runner metrics-collector.sh /usr/local/bin/metrics-collector.sh
RUN chmod +x /usr/local/bin/metrics-server.sh /usr/local/bin/metrics-collector.sh

# Copy job lifecycle hook scripts (Phase 3: DORA Metrics)
# TASK-028: Runner calls these via ACTIONS_RUNNER_HOOK_JOB_STARTED/COMPLETED
COPY --chown=runner:runner job-started.sh /usr/local/bin/job-started.sh
COPY --chown=runner:runner job-completed.sh /usr/local/bin/job-completed.sh
Comment on lines +283 to +284

Severity: medium

To improve Docker image layer efficiency and align with best practices, you can combine these two COPY instructions into a single one since they target the same destination directory.

COPY --chown=runner:runner job-started.sh job-completed.sh /usr/local/bin/

RUN chmod +x /usr/local/bin/job-started.sh /usr/local/bin/job-completed.sh

# TASK-015: Expose Prometheus metrics port
EXPOSE 9091

12 changes: 12 additions & 0 deletions docker/entrypoint-chrome.sh
@@ -137,6 +137,18 @@ if [ -z "$RUNNER_TOKEN" ] || [ "$RUNNER_TOKEN" == "null" ]; then
exit 1
fi

# --- JOB LIFECYCLE HOOKS (Phase 3: DORA Metrics) ---
# TASK-028: Set runner hook env vars for job tracking
# The runner (v2.300.0+) will call these scripts before/after each job
export ACTIONS_RUNNER_HOOK_JOB_STARTED=/usr/local/bin/job-started.sh
export ACTIONS_RUNNER_HOOK_JOB_COMPLETED=/usr/local/bin/job-completed.sh
echo "Job lifecycle hooks configured:"
echo " - Job started hook: ${ACTIONS_RUNNER_HOOK_JOB_STARTED}"
echo " - Job completed hook: ${ACTIONS_RUNNER_HOOK_JOB_COMPLETED}"

# Create job state directory for duration tracking
mkdir -p /tmp/job_state

# Configure the runner
echo "Configuring runner..."
./config.sh \
12 changes: 12 additions & 0 deletions docker/entrypoint.sh
@@ -118,6 +118,18 @@ if [ -z "$RUNNER_TOKEN" ] || [ "$RUNNER_TOKEN" == "null" ]; then
exit 1
fi

# --- JOB LIFECYCLE HOOKS (Phase 3: DORA Metrics) ---
# TASK-028: Set runner hook env vars for job tracking
# The runner (v2.300.0+) will call these scripts before/after each job
export ACTIONS_RUNNER_HOOK_JOB_STARTED=/usr/local/bin/job-started.sh
export ACTIONS_RUNNER_HOOK_JOB_COMPLETED=/usr/local/bin/job-completed.sh
echo "Job lifecycle hooks configured:"
echo " - Job started hook: ${ACTIONS_RUNNER_HOOK_JOB_STARTED}"
echo " - Job completed hook: ${ACTIONS_RUNNER_HOOK_JOB_COMPLETED}"

# Create job state directory for duration tracking
mkdir -p /tmp/job_state

# Configure the runner
echo "Configuring runner..."
./config.sh \
142 changes: 142 additions & 0 deletions docker/job-completed.sh
@@ -0,0 +1,142 @@
#!/bin/bash
# job-completed.sh - Runner hook script invoked after each job completes
# Called via ACTIONS_RUNNER_HOOK_JOB_COMPLETED environment variable
#
# Implementation: Phase 3, TASK-027, TASK-028
# Records job completion event to /tmp/jobs.log with duration and status
#
# The GitHub Actions runner (v2.300.0+) sets these env vars before calling this hook:
# GITHUB_JOB - Job name
# GITHUB_RUN_ID - Workflow run ID
# GITHUB_RUN_NUMBER - Workflow run number
# GITHUB_WORKFLOW - Workflow name
# GITHUB_REPOSITORY - Repository (owner/repo)
#
# Additionally, at job completion the runner provides result context.
# We detect success/failure from the runner's internal result code.

set -euo pipefail

# Configuration
JOBS_LOG="${JOBS_LOG:-/tmp/jobs.log}"
JOB_STATE_DIR="${JOB_STATE_DIR:-/tmp/job_state}"
HOOK_LOG="${HOOK_LOG:-/tmp/job-hooks.log}"

# Logging function
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] [job-completed] $*" | tee -a "$HOOK_LOG"
}

# Derive a unique job identifier (must match job-started.sh logic)
get_job_id() {
local run_id="${GITHUB_RUN_ID:-0}"
local job_name="${GITHUB_JOB:-unknown}"
echo "${run_id}_${job_name}"
}

# Convert ISO 8601 timestamp to epoch seconds (portable)
iso_to_epoch() {
local ts="$1"
# Use date -d for GNU date, fall back to python3 for macOS/BSD
if date -d "$ts" +%s 2>/dev/null; then
return
fi
python3 -c "from datetime import datetime; print(int(datetime.fromisoformat('${ts}'.replace('Z','+00:00')).timestamp()))" 2>/dev/null || echo "0"

Severity: critical (security)

The iso_to_epoch function is vulnerable to Python injection. The variable ${ts} is directly interpolated into a Python script string passed to python3 -c. If an attacker can control the input to this function (e.g., via the GITHUB_RUN_CREATED_AT environment variable), they can execute arbitrary Python code by breaking out of the single quotes.

To remediate this, pass the timestamp as a command-line argument to the Python script and access it via sys.argv.

Suggested change
-    python3 -c "from datetime import datetime; print(int(datetime.fromisoformat('${ts}'.replace('Z','+00:00')).timestamp()))" 2>/dev/null || echo "0"
+    python3 -c "import sys; from datetime import datetime; print(int(datetime.fromisoformat(sys.argv[1].replace('Z','+00:00')).timestamp()))" "$ts" 2>/dev/null || echo "0"
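The argv-based form can be sanity-checked in isolation. A minimal sketch (the function name `iso_to_epoch_safe` is illustrative, not from the PR): the timestamp travels as `sys.argv[1]`, so quote characters in the input stay data and never become Python code.

```shell
#!/bin/bash
# Sketch: timestamp passed as an argument, not interpolated into the -c string.
iso_to_epoch_safe() {
  python3 -c "import sys; from datetime import datetime; print(int(datetime.fromisoformat(sys.argv[1].replace('Z','+00:00')).timestamp()))" "$1" 2>/dev/null || echo "0"
}

iso_to_epoch_safe "1970-01-01T00:01:40Z"   # valid timestamp: prints 100
iso_to_epoch_safe "'); import os #"        # hostile input: parse fails, prints 0
```

With the original string-interpolated form, the second input would have closed the quote and executed attacker-controlled Python; here it simply fails to parse.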

}

# Determine job status from available signals
# The runner hook doesn't directly pass a "status" env var in all versions.
# We check multiple sources:
# 1. GITHUB_JOB_STATUS (set by some runner versions)
# 2. Runner's result file if available
# 3. Default to "success" (runner only calls completed hook on non-crash)
determine_status() {
# Check for explicit status env var (runner v2.304.0+)
if [[ -n "${GITHUB_JOB_STATUS:-}" ]]; then
echo "${GITHUB_JOB_STATUS,,}" # lowercase
return
fi

# Check runner's internal result context file
local job_id="$1"
local result_file="${JOB_STATE_DIR}/${job_id}.result"
if [[ -f "$result_file" ]]; then
cat "$result_file"
return
fi

# Default: if the completed hook is called, the job finished
# (cancelled/crashed jobs may not trigger the hook at all)
echo "success"
}

# Main logic
main() {
local job_id
local timestamp
local start_timestamp
local start_epoch
local end_epoch
local duration_seconds
local queue_time_seconds
local status

job_id=$(get_job_id)
timestamp=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
end_epoch=$(date +%s)

log "Job completed: id=${job_id} job=${GITHUB_JOB:-unknown} run_id=${GITHUB_RUN_ID:-0}"

# Calculate duration from start timestamp
duration_seconds=0
if [[ -f "${JOB_STATE_DIR}/${job_id}.start" ]]; then
start_timestamp=$(cat "${JOB_STATE_DIR}/${job_id}.start")
start_epoch=$(iso_to_epoch "$start_timestamp")
if [[ "$start_epoch" -gt 0 ]]; then
duration_seconds=$((end_epoch - start_epoch))
# Guard against negative values (clock skew)
if [[ "$duration_seconds" -lt 0 ]]; then
duration_seconds=0
fi
fi
else
log "WARNING: No start timestamp found for job ${job_id}"
fi

# Calculate queue time if GITHUB_RUN_CREATED_AT is available
# Queue time = time from workflow creation to job start
queue_time_seconds=0
if [[ -n "${GITHUB_RUN_CREATED_AT:-}" ]] && [[ -f "${JOB_STATE_DIR}/${job_id}.start" ]]; then
local created_epoch
created_epoch=$(iso_to_epoch "$GITHUB_RUN_CREATED_AT")
if [[ "$created_epoch" -gt 0 ]] && [[ "$start_epoch" -gt 0 ]]; then
queue_time_seconds=$((start_epoch - created_epoch))
if [[ "$queue_time_seconds" -lt 0 ]]; then
queue_time_seconds=0
fi
fi
fi

# Determine job status
status=$(determine_status "$job_id")

# Remove the preliminary "running" entry and append final entry
# Use a temp file for atomic update to avoid race conditions
local temp_log="${JOBS_LOG}.tmp.$$"
if [[ -f "$JOBS_LOG" ]]; then
# Remove matching running entry for this job_id
grep -v ",${job_id},running," "$JOBS_LOG" >"$temp_log" 2>/dev/null || true
mv "$temp_log" "$JOBS_LOG"
fi
Comment on lines +125 to +130

Severity: high (security)

This section has a critical security vulnerability due to insecure temporary file handling. The script creates a temporary file in the world-writable /tmp directory using a predictable name based on the process ID ($$). This can be exploited via symlink attacks to overwrite sensitive files, leading to denial of service or privilege escalation. Additionally, there's a race condition where grep -v and mv operations on jobs.log are not atomic, potentially causing data loss if multiple jobs complete concurrently. Use mktemp for secure temporary file creation and flock for atomic log file updates.

Suggested change
-    local temp_log="${JOBS_LOG}.tmp.$$"
-    if [[ -f "$JOBS_LOG" ]]; then
-        # Remove matching running entry for this job_id
-        grep -v ",${job_id},running," "$JOBS_LOG" >"$temp_log" 2>/dev/null || true
-        mv "$temp_log" "$JOBS_LOG"
-    fi
+    local temp_log
+    temp_log=$(mktemp "${JOBS_LOG}.XXXXXX")
+    if [[ -f "$JOBS_LOG" ]]; then
+        # Remove matching running entry for this job_id
+        grep -v ",${job_id},running," "$JOBS_LOG" >"$temp_log" 2>/dev/null || true
+        mv "$temp_log" "$JOBS_LOG"
+    fi

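Combining both recommendations from this comment (`mktemp` for the temporary file, `flock` for serialization), the read-modify-write could look like the sketch below. The lock-file path and the sample entry values are assumptions for illustration, not code from the PR.

```shell
#!/bin/bash
# Sketch: rewrite jobs.log under an flock-held lock, using an unpredictable
# mktemp name instead of a PID-based one. Values are illustrative.
JOBS_LOG="/tmp/jobs_demo.log"
job_id="12345_build"
timestamp="2024-01-01T00:10:00Z"

# Simulate the preliminary entry written by job-started.sh
echo "2024-01-01T00:00:00Z,${job_id},running,0,0" >>"$JOBS_LOG"

(
  flock 200                                  # blocks until no other hook holds the lock
  temp_log=$(mktemp "${JOBS_LOG}.XXXXXX")    # random suffix, 0600 permissions
  grep -v ",${job_id},running," "$JOBS_LOG" >"$temp_log" 2>/dev/null || true
  echo "${timestamp},${job_id},success,600,30" >>"$temp_log"
  mv "$temp_log" "$JOBS_LOG"                 # rename() is atomic within one filesystem
) 200>"${JOBS_LOG}.lock"

cat "$JOBS_LOG"
```

The same lock file would need to be taken by job-started.sh and metrics-collector.sh for the synchronization to be complete; locking only one writer protects nothing.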

# Append final completed entry
# Format: timestamp,job_id,status,duration_seconds,queue_time_seconds
echo "${timestamp},${job_id},${status},${duration_seconds},${queue_time_seconds}" >>"$JOBS_LOG"

log "Job recorded: status=${status} duration=${duration_seconds}s queue_time=${queue_time_seconds}s"

# Clean up state files for this job
rm -f "${JOB_STATE_DIR}/${job_id}.start" "${JOB_STATE_DIR}/${job_id}.result"
}

main "$@"
59 changes: 59 additions & 0 deletions docker/job-started.sh
@@ -0,0 +1,59 @@
#!/bin/bash
# job-started.sh - Runner hook script invoked before each job starts
# Called via ACTIONS_RUNNER_HOOK_JOB_STARTED environment variable
#
# Implementation: Phase 3, TASK-027, TASK-028
# Records job start event to /tmp/jobs.log for metrics collection
#
# The GitHub Actions runner (v2.300.0+) sets these env vars before calling this hook:
# GITHUB_JOB - Job name
# GITHUB_RUN_ID - Workflow run ID
# GITHUB_RUN_NUMBER - Workflow run number
# GITHUB_WORKFLOW - Workflow name
# GITHUB_REPOSITORY - Repository (owner/repo)

set -euo pipefail

# Configuration
JOBS_LOG="${JOBS_LOG:-/tmp/jobs.log}"
JOB_STATE_DIR="${JOB_STATE_DIR:-/tmp/job_state}"
HOOK_LOG="${HOOK_LOG:-/tmp/job-hooks.log}"

# Logging function
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] [job-started] $*" | tee -a "$HOOK_LOG"
}

# Derive a unique job identifier from available environment variables
get_job_id() {
local run_id="${GITHUB_RUN_ID:-0}"
local job_name="${GITHUB_JOB:-unknown}"
# Combine run_id and job_name for uniqueness within a workflow
echo "${run_id}_${job_name}"
}

# Main logic
main() {
local job_id
local timestamp

job_id=$(get_job_id)
timestamp=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

log "Job starting: id=${job_id} job=${GITHUB_JOB:-unknown} run_id=${GITHUB_RUN_ID:-0} workflow=${GITHUB_WORKFLOW:-unknown}"

# Create state directory for per-job tracking
mkdir -p "$JOB_STATE_DIR"

# Record start timestamp for duration calculation in job-completed.sh
echo "$timestamp" >"${JOB_STATE_DIR}/${job_id}.start"

Severity: high (security)

The script writes to a file in /tmp/job_state/ using a predictable name. Since /tmp is world-writable, an attacker can pre-create a symlink at this location to overwrite arbitrary files that the runner user has permission to write to.

To remediate this, use a directory that is not world-writable (e.g., /home/runner/job_state) or ensure the directory is created with restricted permissions (e.g., 700) and verify it is not a symlink before use.

Suggested change
-    echo "$timestamp" >"${JOB_STATE_DIR}/${job_id}.start"
+    # Ensure JOB_STATE_DIR is secure (e.g., /home/runner/job_state)
+    echo "$timestamp" >"${JOB_STATE_DIR}/${job_id}.start"
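One way to act on the remediation advice (restricted permissions plus a symlink check) is sketched below; the directory path is illustrative and the real scripts would use `JOB_STATE_DIR`.

```shell
#!/bin/bash
# Sketch: refuse a symlinked state dir and enforce owner-only permissions
# before writing any per-job state files into it.
STATE_DIR="/tmp/job_state_demo"

if [ -L "$STATE_DIR" ]; then
  echo "ERROR: ${STATE_DIR} is a symlink, refusing to use it" >&2
  exit 1
fi
mkdir -p "$STATE_DIR"
chmod 700 "$STATE_DIR"   # owner-only, even if the directory already existed

echo "2024-01-01T00:00:00Z" >"${STATE_DIR}/12345_build.start"
```

Moving the directory under the runner's home (e.g. `/home/runner/job_state`), as the comment suggests, removes the world-writable-parent problem entirely and is the simpler fix where the image layout allows it.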


# Write a preliminary entry to jobs.log (status=running, duration/queue_time TBD)
# Final entry with duration and status is written by job-completed.sh
# Format: timestamp,job_id,status,duration_seconds,queue_time_seconds
echo "${timestamp},${job_id},running,0,0" >>"$JOBS_LOG"
Comment on lines +51 to +54

Severity: critical

This script has a race condition when writing to jobs.log. While echo >> is often atomic for single writes, job-completed.sh performs a non-atomic read-modify-write operation on the same file. To prevent data corruption, all accesses to jobs.log should be synchronized using a lock.

I recommend using flock here to ensure that this write operation does not conflict with operations in job-completed.sh or metrics-collector.sh.

Suggested change
-    # Write a preliminary entry to jobs.log (status=running, duration/queue_time TBD)
-    # Final entry with duration and status is written by job-completed.sh
-    # Format: timestamp,job_id,status,duration_seconds,queue_time_seconds
-    echo "${timestamp},${job_id},running,0,0" >>"$JOBS_LOG"
+    # Use a lock file to ensure atomic updates to jobs.log
+    (
+        flock 200
+        # Write a preliminary entry to jobs.log (status=running, duration/queue_time TBD)
+        # Final entry with duration and status is written by job-completed.sh
+        # Format: timestamp,job_id,status,duration_seconds,queue_time_seconds
+        echo "${timestamp},${job_id},running,0,0" >>"$JOBS_LOG"
+    ) 200>"${JOBS_LOG}.lock"


log "Job start recorded: ${JOB_STATE_DIR}/${job_id}.start"
}

main "$@"