fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959
Conversation
apm compile wrote literal '<!-- Build ID: __BUILD_ID__ -->' to disk on every target except the legacy --single-agents path. The substitution that constants.py promises lived only in compile/cli.py:506-524 and was gated by `if config.strategy == "distributed" and not single_agents`. PR microsoft#468 (which closed microsoft#467) added cross-platform sort determinism but never extended substitution to the distributed/claude/gemini paths; its verification ran on --single-agents only. Affected paths producing literal __BUILD_ID__ on main: - apm compile (default distributed): AGENTS.md + every sub-dir - apm compile -t claude: CLAUDE.md - apm compile -t gemini: GEMINI.md - apm compile --watch (single-file): AGENTS.md - compile_agents_md() public API: AGENTS.md Rather than repeat PR microsoft#468's sibling-by-sibling pattern (4 files, all needing the same call site), this introduces a single chokepoint: compilation/output_writer.py::CompiledOutputWriter All five compiled-output write sites route through writer.write(...). The writer extracts stabilize_build_id() (preserved verbatim from compile/cli.py), asserts no placeholder survives stabilization (defense-in-depth: future bypass raises RuntimeError instead of silently emitting __BUILD_ID__), creates parent dirs, and writes atomically. Direct Path.write_text / open(...).write on compiled output is a contract violation by design. The atomic write primitive was duplicated in commands/_helpers. Consolidated to utils/atomic_io.atomic_write_text; commands/_helpers keeps a one-line alias for backward compatibility with existing tests. Closes microsoft#957
APM Review Panel VerdictDisposition: APPROVE (clean refactor; no required pre-merge actions) Per-persona findingsPython Architect: This PR extracts compiled-output write logic from four scattered call sites in 1. OO / class diagram (mermaid)classDiagram
direction LR
class CompiledOutputWriter {
<<Facade>>
+write(path: Path, content: str) None
}
class stabilize_build_id {
<<Pure>>
+stabilize_build_id(content: str) str
}
class atomic_write_text {
<<IOBoundary>>
+atomic_write_text(path: Path, data: str) None
}
class AgentsCompiler {
+_compile_claude_md(config, primitives) None
+_compile_gemini_md(config, primitives) None
+_write_output_file(output_path, content) None
+_write_distributed_file(agents_path, content, config) None
}
class _helpers {
<<Module>>
+_atomic_write alias
}
CompiledOutputWriter ..> stabilize_build_id : calls
CompiledOutputWriter ..> atomic_write_text : calls
AgentsCompiler ..> CompiledOutputWriter : uses (lazy import per method)
_helpers ..> atomic_write_text : re-exports as alias
note for CompiledOutputWriter "Chokepoint: stabilize build_id\n+ mkdir + atomic rename"
class CompiledOutputWriter:::touched
class stabilize_build_id:::touched
class atomic_write_text:::touched
class AgentsCompiler:::touched
class _helpers:::touched
classDef touched fill:#fff3b0,stroke:#d47600
2. Execution flow diagram (mermaid)flowchart TD
A["compile CLI\ncommands/compile/cli.py"] --> B
C["AgentsCompiler._compile_claude_md\ncompilation/agents_compiler.py"] --> B
D["AgentsCompiler._compile_gemini_md\ncompilation/agents_compiler.py"] --> B
E["AgentsCompiler._write_output_file\ncompilation/agents_compiler.py"] --> B
F["AgentsCompiler._write_distributed_file\ncompilation/agents_compiler.py"] --> B
B["CompiledOutputWriter.write\ncompilation/output_writer.py"]
B --> G["stabilize_build_id(content)\ncompilation/build_id.py"]
G --> H{BUILD_ID_PLACEHOLDER\nin final?}
H -->|Yes| I["RuntimeError: stabilization\nbypassed -- loud fail"]
H -->|No| J["[FS] path.parent.mkdir\nparents=True, exist_ok=True"]
J --> K["[I/O] atomic_write_text\nutils/atomic_io.py"]
K --> L["[I/O] tempfile.mkstemp\nprefix='apm-atomic-', dir=path.parent"]
L --> M["[I/O] os.fdopen.write(data)"]
M --> N["[FS] os.replace(tmp_name, path)"]
N --> O["Success: target updated atomically"]
M -->|"OSError"| P["[FS] os.unlink(tmp_name)\n+ re-raise"]
Design patterns
Minor observation: the lazy CLI Logging Expert: No user-facing output paths are touched. The removed inline build_id computation in DevX UX Expert: Purely internal infrastructure refactor. The Supply Chain Security Expert: The PR improves the security posture in two ways:
No new network, credential, or path-traversal surface is introduced. The Auth Expert: Not activated -- the PR touches only compilation output writing infrastructure ( OSS Growth Hacker: No conversion surface is affected. This is a maintenance PR; no CHANGELOG entry, README, or docs update is warranted. Side-channel note to CEO: if atomic writes have been the root cause of intermittent "corrupted AGENTS.md" bug reports, that is worth a one-liner in the release notes ("All compiled outputs are now written atomically -- no partial files on crash") once confirmed by issue history. No CEO arbitrationAll five mandatory specialists agree: this is a well-scoped internal refactor with no user-facing impact and a measurable security improvement. No trade-offs require arbitration. The chokepoint pattern is the right call at this scale -- five call sites sharing identical cross-cutting concerns (build_id, mkdir, atomicity) is exactly the threshold where extraction pays for itself. The PR ships with comprehensive unit tests for both new modules, covering idempotency, determinism, defensive assertions, and atomic-failure recovery. Disposition is APPROVE with no required pre-merge actions. Required actions before mergeNone. Optional follow-ups
Note 🔒 Integrity filter blocked 2 itemsThe following items were blocked because they don't meet the GitHub integrity level.
To allow these resources, lower tools:
github:
min-integrity: approved # merged | approved | unapproved | none
|
…placeholder # Conflicts: # src/apm_cli/compilation/agents_compiler.py
There was a problem hiding this comment.
Pull request overview
Fixes apm compile Build ID stabilization so compiled outputs no longer persist the literal __BUILD_ID__ placeholder outside the legacy --single-agents path, by routing all write sites through a single writer chokepoint.
Changes:
- Introduces
stabilize_build_id()as a shared helper and applies it via a centralizedCompiledOutputWriter. - Adds a canonical
atomic_write_text()implementation and reuses it via_atomic_writealias. - Updates compilation write sites and expands unit test coverage for Build ID stabilization + atomic write behavior.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/compilation/build_id.py |
New helper to deterministically replace the Build ID placeholder. |
src/apm_cli/compilation/output_writer.py |
New writer chokepoint that stabilizes Build ID, asserts invariants, and writes atomically. |
src/apm_cli/utils/atomic_io.py |
New canonical atomic text write primitive used by writer and legacy helper alias. |
src/apm_cli/commands/_helpers.py |
Replaces old _atomic_write implementation with an alias to atomic_write_text. |
src/apm_cli/commands/compile/cli.py |
Removes inline Build ID hashing and writes via CompiledOutputWriter. |
src/apm_cli/compilation/agents_compiler.py |
Routes distributed/claude/gemini/single-file write sites through the writer. |
tests/unit/compilation/test_build_id.py |
Adds unit tests for stabilize_build_id() semantics and determinism. |
tests/unit/compilation/test_output_writer.py |
Adds unit tests for writer chokepoint behavior and atomic write semantics. |
tests/unit/compilation/test_agents_compiler_coverage.py |
Adjusts OSError-path coverage to reflect atomic rename failure mode. |
tests/unit/test_command_helpers.py |
Makes temp-file cleanup assertion prefix-agnostic. |
- _helpers.py: move atomic_write_text alias import to top of file (PEP 8 ordering) and drop now-unused tempfile import after the inline _atomic_write body was replaced with the alias. - test_output_writer.py: replace literal CJK title with \u escape sequences so the source file stays within the printable-ASCII policy enforced by .github/instructions/encoding.instructions.md. Runtime characters and the UTF-8 round-trip assertion are unchanged.
Description
Closes #957.
apm compilewrites literal<!-- Build ID: __BUILD_ID__ -->to disk on every target except--single-agents. The placeholder substitution thatconstants.pypromises ("a deterministic Build ID (content hash) is substituted post-generation") lives only incompile/cli.py:506-524and is gated byif config.strategy == "distributed" and not single_agents-- only the legacy single-file path reaches it.PR #468 (which closed #467) added cross-platform sort determinism but never extended substitution to distributed / claude / gemini formatters; its diff only touches
sorted(...)calls. The verification claim "Both now produce<!-- Build ID: e316dee8ef7f -->" matches the--single-agentspath only -- the defaultapm compileand every multi-target setup still write__BUILD_ID__literally onmain.Affected paths producing literal
__BUILD_ID__onmain(666925f):apm compile(default distributed)agents_compiler._write_distributed_fileAGENTS.md(root + every distributed sub-dir)apm compile -t claudeagents_compiler._compile_claude_md(inline)CLAUDE.mdapm compile -t geminiagents_compiler._compile_gemini_md(inline)GEMINI.mdapm compile --watch(single-file)agents_compiler._write_output_fileAGENTS.mdcompile_agents_md()(public API)agents_compiler._write_output_fileAGENTS.mdWhy a chokepoint, not a per-formatter patch
PR #468's structural pattern was "fix the same concern in 4 sibling files". The straightforward way to fix this issue would be to add a
stabilize_build_id()call at each of the five write sites -- and inherit the same fragility. Any future target re-exposes the bug.This PR introduces a single chokepoint:
All five compiled-output write sites route through
writer.write(path, content). The writer:stabilize_build_id(content)(extracted fromcompile/cli.py, algorithm preserved verbatim).BUILD_ID_PLACEHOLDERsurvives stabilization. A future code path that bypasses or breaks stabilization raisesRuntimeErrorrather than silently emitting__BUILD_ID__.Direct
Path.write_text/open(...).writeon compiled output is a contract violation by design -- adding a new target without using the writer fails loudly.Implementation
src/apm_cli/compilation/build_id.py(new, 41 lines)stabilize_build_id(content)-- pure, idempotent helper; algorithm preserved fromcompile/cli.py:506-524.src/apm_cli/compilation/output_writer.py(new, 49 lines)CompiledOutputWriterchokepoint with stabilize -> assert -> atomic-write contract. Error contract documented in module docstring (OSErrorpropagates to callers;RuntimeErroris intentionally NOT caught by callers'except OSErrorblocks).src/apm_cli/utils/atomic_io.py(new, 39 lines)atomic_write_text(path, data)-- single canonical implementation. The pre-existing_atomic_writeincommands/_helpers.pywas a near-duplicate; consolidating here avoids reintroducing the same sibling-duplication anti-pattern this PR is fixing.src/apm_cli/commands/_helpers.py_atomic_writereduced to a one-line alias ofatomic_write_text. Backward-compatible with existing tests and any external caller.src/apm_cli/commands/compile/cli.py(-22 lines)splitlines/hashlib.sha256/ line replacement); single-file path now callsCompiledOutputWriter().write(...).src/apm_cli/compilation/agents_compiler.py(+11 / -14)_write_distributed_file, claude inline, gemini inline,_write_output_file) route through the writer.Type of change
Testing
mainare unrelated to this change -- same set fails before and after.)New test coverage
tests/unit/compilation/test_build_id.py(9 cases): replace placeholder / idempotent / deterministic / no-placeholder pass-through / different content yields different hash / hash excludes placeholder line itself / preserves trailing newline / empty content / only-placeholder line.tests/unit/compilation/test_output_writer.py(6 cases): stabilizes before write / creates parent dirs / UTF-8 encoding / passes through clean content / raises on unresolvable placeholder (defense-in-depth) / atomic semantics -- pre-existing target untouched onos.replacefailure.Updated tests
tests/unit/compilation/test_agents_compiler_coverage.py::test_write_output_file_oserror_adds_error-- writer now auto-creates parent dirs, so the missing-parent failure mode is replaced with amonkeypatch os.replaceraise to exercise the sameOSErrorcodepath.tests/unit/test_command_helpers.py::test_cleans_up_temp_file_on_write_error-- leftover-file glob made prefix-agnostic so a future temp-file prefix change does not silent-pass.End-to-end verification
On a fixture with
apm.yml,src/main.py, and two instruction files (one withapplyTo: "**/*.py", one withapplyTo: "tests/**/*.py"):apm compileAGENTS.md(root) +src/AGENTS.md(distributed sub-dir)d659e6183d03/a91eb45e9bbfapm compile --single-agentsAGENTS.md3df8bca8558aapm compile -t claudeCLAUDE.md46a595669d48apm compile -t geminiGEMINI.mdb25ecb1630a5compile_agents_md()(public API)AGENTS.md1cb5828b4dd3Verified:
__BUILD_ID__literals remain in any output..instructions.mdchanges only the affected target's hash (drift detection works as documented in [BUG] apm compile produces different Build IDs on macOS vs Linux #467's user story).git stashof this PR's changes reproduces the original bug -- all five paths write__BUILD_ID__literally -- confirming the fix is necessary and sufficient.