
engine.pid: verify process start-time identity to defend against PID reuse #12

Description

@pbean

Problem

engine.pid (written by runs.write_pid) stores only the bare numeric PID. Operating systems recycle PIDs, so if an engine crashes or is kill -9'd before the run is marked finished, the stale engine.pid is never deleted and the OS can later reassign that PID to an unrelated process. Then:

  • runs.engine_alive() (pid-only — no tmux fallback for a present pid) can false-positive, blocking resume/delete/archive with "still running".
  • runs.stop_run() → platform_util.terminate_pid(pid) can send SIGTERM to the recycled, unrelated process.

Blast radius is bounded (SIGTERM not SIGKILL; os.kill only succeeds on a same-UID process), and the conjunction (hard crash + PID wraparound + action on that same stale run) is low-probability, but the wrong-process termination is a real correctness hazard.
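To make the false positive concrete, here is a minimal sketch of what a pid-only liveness probe typically looks like on POSIX. It is not the actual runs.engine_alive() code; pid_only_alive is a hypothetical name used for illustration only. The probe can tell that some process owns the PID, but it has no way to tell whether that process is the engine that wrote engine.pid.

```python
import os


def pid_only_alive(pid: int) -> bool:
    """Hypothetical pid-only liveness probe (POSIX only, illustration)."""
    if pid <= 0:
        # 0 and negative values address process groups, never a single process.
        return False
    try:
        # Signal 0 performs the existence/permission check without
        # delivering anything to the target.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process currently has this PID
    except PermissionError:
        return True   # a process exists, but it belongs to another user
    # Some process has this PID. Whether it is the process that originally
    # wrote the PID file cannot be determined from the PID alone.
    return True
```

If the stored PID has been recycled, this returns True for the unrelated process, which is exactly the "still running" false positive described above.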

Surfaced by CodeRabbit on #11 (comment 3472621208). Pre-existing pattern — not introduced by that PR. The pid <= 0 guard from that review was addressed in 6492cdf; this issue tracks the remaining identity-verification work.

Proposed fix

Persist a per-process identity token alongside the PID and verify it before trusting liveness or terminating:

  • platform_util.pid_identity(pid) -> str | None — process start-time. Linux: /proc/<pid>/stat field 22 (bare /proc read, # portability:-acked + added to the portability-guard allowlist). macOS/Windows: psutil.Process(pid).create_time() (the existing optional non-linux extra). Mirrors the existing unity_teardown proc/psutil split.
  • write_pid → "<pid> <start_time>"; read_pid returns (pid, token) (ripples to engine_alive, stop_run, and tui/data.py, which reads engine.pid directly).
  • Verify pid_identity(pid) == stored_token in engine_alive / stop_run / data.liveness; a mismatch means the PID was recycled → treat as not-ours.
  • Legacy tolerance: existing engine.pid files are bare ints — absent token must degrade to today's behavior, not crash.
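The bullets above could be sketched roughly as follows. This is a Linux-only illustration under stated assumptions, not the proposed patch: pid_identity follows the signature named in this issue, while format_pid_file, parse_pid_file, and is_ours are hypothetical helpers standing in for the changes to write_pid, read_pid, and the callers. The psutil branch for macOS/Windows is omitted.

```python
from __future__ import annotations

from pathlib import Path


def pid_identity(pid: int) -> str | None:
    """Start-time token for pid, or None if it cannot be read (Linux sketch)."""
    try:
        raw = Path(f"/proc/{pid}/stat").read_bytes()
    except OSError:
        return None  # process gone, /proc unavailable, or not readable
    # Field 2 (comm) is parenthesised and may itself contain spaces or ')',
    # so split after the LAST ')' instead of splitting the whole line.
    _, sep, rest = raw.rpartition(b")")
    fields = rest.split()
    # `rest` starts at field 3 (state), so field 22 (starttime) is index 19.
    if not sep or len(fields) <= 19:
        return None
    return fields[19].decode("ascii", errors="replace")


def format_pid_file(pid: int, token: str | None) -> str:
    """Hypothetical helper: the "<pid> <start_time>" payload for engine.pid."""
    return f"{pid} {token}" if token else str(pid)


def parse_pid_file(text: str) -> tuple[int, str | None] | None:
    """Hypothetical helper: parse both the new and the legacy bare-int format."""
    parts = text.split()
    if not parts:
        return None
    try:
        pid = int(parts[0])
    except ValueError:
        return None
    if pid <= 0:
        return None
    token = parts[1] if len(parts) > 1 else None  # legacy file: no token
    return pid, token


def is_ours(pid: int, stored_token: str | None) -> bool | None:
    """True: identity matches. False: recycled or gone. None: cannot verify."""
    if stored_token is None:
        return None  # legacy file; caller falls back to today's pid-only path
    return pid_identity(pid) == stored_token
```

A caller would treat False as "not ours" (skip terminate_pid, report not running) and None as the legacy case. Two caveats worth settling in review: /proc starttime is measured in clock ticks since boot, so the token is only meaningful within a single boot, and a platform where pid_identity cannot produce a token at all would need handling distinct from a genuine mismatch.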

Affected files

src/automator/platform_util.py, src/automator/runs.py, src/automator/tui/data.py, tests/test_runs.py, tests/test_tui_data.py, tests/test_portability_guard.py (+ new pid_identity tests).
