Skip to content

fix: svcinit tolerates service crash in ibazel reload mode#73

Open
jaqx0r wants to merge 6 commits into
hermeticbuild:masterfrom
jaqx0r:fix/svcinit-crash-reload
Open

fix: svcinit tolerates service crash in ibazel reload mode#73
jaqx0r wants to merge 6 commits into
hermeticbuild:masterfrom
jaqx0r:fix/svcinit-crash-reload

Conversation

@jaqx0r

@jaqx0r jaqx0r commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Summary

When running under ibazel run --@rules_itest//:enable_per_service_reload, if a service crashes, svcinit panics on the next reload notification. Additionally, the existing drain loop silently discards crash errors — this PR adds logging to make them visible.

Changes

  • Log discarded crash errors (cmd/svcinit/main.go): The existing Drain: loop drains servicesErrCh non-blocking before reload. This PR adds log.Printf for each discarded error so crashes are visible in output.
  • StopWithSignal() tolerates dead processes (runner/service_instance.go): After calling killGroup, if the process raced to exit (isDone()), return nil instead of propagating the error. The initial killGroup call already attempted group cleanup.
  • Hot-reload fallback (runner/runner.go): If stdin.Write fails for a hot-reloadable service (broken pipe from crash), explicitly close the old stdin pipe, log the stop error, and fall back to a full stop+restart instead of propagating the error.

Fixes #72

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3187d60290

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread runner/service_instance.go
jaqx0r added 2 commits June 20, 2026 15:35
- Replace blocking `for range servicesErrCh` with non-blocking drain
  so reload proceeds even when no crash errors are pending
- Stop() returns nil for already-dead processes instead of propagating
  ESRCH from killGroup
- Hot-reload stdin.Write failure falls back to full restart instead
  of panicking

Fixes hermeticbuild#72
…ited

A crashed service may leave child processes alive in the same process
group (holding a port or stdout). Previously we returned nil immediately
when isDone() was true, skipping the killGroup call. Now we always
attempt the group kill and discard the error (ESRCH means the group is
already gone, which is fine).

Addresses review comment on hermeticbuild#73.
@jaqx0r jaqx0r force-pushed the fix/svcinit-crash-reload branch from 3187d60 to a8eaabf Compare June 20, 2026 06:22
jaqx0r added 3 commits June 20, 2026 16:36
- Log each discarded service error during pre-reload drain instead of
  silently dropping them
- Explicitly close old stdin pipe before stopping crashed service in
  hot-reload fallback
- Log StopWithSignal error in crash fallback instead of silently ignoring
Comment thread runner/runner.go Outdated
Comment thread runner/service_instance.go
- Move old instance capture above stdin.Write in hot-reload fallback
- Remove redundant killGroup call in isDone() fast-path; first killGroup
  call already handled group cleanup

@dzbarsky dzbarsky left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

svcinit panics on service crash during ibazel reload mode

2 participants