fix(engine): emit heartbeat on interval regardless of output growth #60
Open
mvanhorn wants to merge 3 commits into danshapiro:main
Decouples heartbeat emission from output-growth gating so active stages produce predictable liveness signals even during quiet periods. Adds appendProgressLivenessOnly to write events without resetting the stall watchdog timer, preserving correct stall detection while improving observability. Heartbeat events now include since_last_output_s for distinguishing "alive but quiet" from "alive and producing output".

Fixes danshapiro#51

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…i/ paths

- Add gpt-5.3-codex-spark to cliOnlyModelIDs map (fixes TestIsCLIOnlyModel)
- Replace root .ai/verify_errors.log and .ai/test-evidence/latest/ with run-scoped .ai/runs/$KILROY_RUN_ID/ paths in demo spec (fixes TestReferenceSurfaces_NoLegacyRootAIScratchPaths)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes #51
Summary
The `stage_heartbeat` emission was gated on stdout/stderr growth, so quiet-but-active phases (like `npm install` downloading) produced sparse or missing heartbeat events. This made healthy stages appear stalled in monitoring.

Changes
- `progress.go`: Added `appendProgressLivenessOnly()`, which writes events to `progress.ndjson` for observability without resetting the stall watchdog timer. This is the key to decoupling heartbeat observability from stall detection.
- `codergen_router.go` (CLI path, ~line 1162): The heartbeat goroutine now emits on every ticker interval. It uses `appendProgress` when output has grown (resets the stall timer) and `appendProgressLivenessOnly` when output is static (preserves stall detection). Added the `since_last_output_s` field.
- `codergen_router.go` (API path, ~line 330): Same pattern. It always emits a heartbeat and uses the liveness-only write when the event count hasn't changed. Added the `since_last_output_s` field.
- `codergen_heartbeat_test.go`: Added `TestRunWithConfig_HeartbeatEmitsDuringQuietPeriods`, which verifies that heartbeats emit during a 3-second quiet period and include the `since_last_output_s` field. Updated stall watchdog test comments.

Design
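The fix separates two concerns:

- Observability: a heartbeat is emitted on every ticker interval, whether or not output has grown.
- Stall detection: the stall watchdog timer is reset only when output has actually grown.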
Test plan
- `TestRunWithConfig_HeartbeatEmitsDuringQuietPeriods` - heartbeats during quiet periods with `since_last_output_s`
- `TestRunWithConfig_HeartbeatEmitsDuringCodergen` - existing test still passes
- `TestRunWithConfig_APIBackend_StallWatchdogFiresDespiteHeartbeatGoroutine` - watchdog still fires
- `TestRunWithConfig_CLIBackend_StallWatchdogFiresDespiteHeartbeatGoroutine` - watchdog still fires
- `TestRunWithConfig_HeartbeatStopsAfterProcessExit` - heartbeats stop after process exit

This contribution was developed with AI assistance (Claude Code).