Skip to content

fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959

Merged
danielmeppiel merged 4 commits intomicrosoft:mainfrom
edenfunf:fix/compile-build-id-placeholder
Apr 30, 2026
Merged

fix(compile): stabilize __BUILD_ID__ in all compiled outputs#959
danielmeppiel merged 4 commits intomicrosoft:mainfrom
edenfunf:fix/compile-build-id-placeholder

Conversation

@edenfunf
Copy link
Copy Markdown
Contributor

@edenfunf edenfunf commented Apr 26, 2026

Description

Closes #957.

apm compile writes literal <!-- Build ID: __BUILD_ID__ --> to disk on every target except --single-agents. The placeholder substitution that constants.py promises ("a deterministic Build ID (content hash) is substituted post-generation") lives only in compile/cli.py:506-524 and is gated by if config.strategy == "distributed" and not single_agents -- only the legacy single-file path reaches it.

PR #468 (which closed #467) added cross-platform sort determinism but never extended substitution to distributed / claude / gemini formatters; its diff only touches sorted(...) calls. The verification claim "Both now produce <!-- Build ID: e316dee8ef7f -->" matches the --single-agents path only -- the default apm compile and every multi-target setup still write __BUILD_ID__ literally on main.

Affected paths producing literal __BUILD_ID__ on main (666925f):

Path Caller Output
apm compile (default distributed) agents_compiler._write_distributed_file AGENTS.md (root + every distributed sub-dir)
apm compile -t claude agents_compiler._compile_claude_md (inline) CLAUDE.md
apm compile -t gemini agents_compiler._compile_gemini_md (inline) GEMINI.md
apm compile --watch (single-file) agents_compiler._write_output_file AGENTS.md
compile_agents_md() (public API) agents_compiler._write_output_file AGENTS.md

Why a chokepoint, not a per-formatter patch

PR #468's structural pattern was "fix the same concern in 4 sibling files". The straightforward way to fix this issue would be to add a stabilize_build_id() call at each of the five write sites -- and inherit the same fragility. Any future target re-exposes the bug.

This PR introduces a single chokepoint:

src/apm_cli/compilation/output_writer.py::CompiledOutputWriter

All five compiled-output write sites route through writer.write(path, content). The writer:

  1. Calls stabilize_build_id(content) (extracted from compile/cli.py, algorithm preserved verbatim).
  2. Defensively asserts no BUILD_ID_PLACEHOLDER survives stabilization. A future code path that bypasses or breaks stabilization raises RuntimeError rather than silently emitting __BUILD_ID__.
  3. Creates parent dirs and writes atomically (replace-on-rename, so a crash mid-write cannot corrupt a pre-existing target).

Direct Path.write_text / open(...).write on compiled output is a contract violation by design -- adding a new target without using the writer fails loudly.

Implementation

File Change
src/apm_cli/compilation/build_id.py (new, 41 lines) stabilize_build_id(content) -- pure, idempotent helper; algorithm preserved from compile/cli.py:506-524.
src/apm_cli/compilation/output_writer.py (new, 49 lines) CompiledOutputWriter chokepoint with stabilize -> assert -> atomic-write contract. Error contract documented in module docstring (OSError propagates to callers; RuntimeError is intentionally NOT caught by callers' except OSError blocks).
src/apm_cli/utils/atomic_io.py (new, 39 lines) atomic_write_text(path, data) -- single canonical implementation. The pre-existing _atomic_write in commands/_helpers.py was a near-duplicate; consolidating here avoids reintroducing the same sibling-duplication anti-pattern this PR is fixing.
src/apm_cli/commands/_helpers.py _atomic_write reduced to a one-line alias of atomic_write_text. Backward-compatible with existing tests and any external caller.
src/apm_cli/commands/compile/cli.py (-22 lines) Removed inline hash logic (splitlines / hashlib.sha256 / line replacement); single-file path now calls CompiledOutputWriter().write(...).
src/apm_cli/compilation/agents_compiler.py (+11 / -14) All four compiled-output write sites (_write_distributed_file, claude inline, gemini inline, _write_output_file) route through the writer.

Type of change

  • Bug fix
  • New feature
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally
  • All existing tests pass -- 5511 unit tests, 1 skipped, 1 warning. Compile-related integration tests: 24 passed, 1 skipped. (8 pre-existing integration failures on main are unrelated to this change -- same set fails before and after.)
  • Added tests for new functionality

New test coverage

  • tests/unit/compilation/test_build_id.py (9 cases): replace placeholder / idempotent / deterministic / no-placeholder pass-through / different content yields different hash / hash excludes placeholder line itself / preserves trailing newline / empty content / only-placeholder line.
  • tests/unit/compilation/test_output_writer.py (6 cases): stabilizes before write / creates parent dirs / UTF-8 encoding / passes through clean content / raises on unresolvable placeholder (defense-in-depth) / atomic semantics -- pre-existing target untouched on os.replace failure.

Updated tests

  • tests/unit/compilation/test_agents_compiler_coverage.py::test_write_output_file_oserror_adds_error -- writer now auto-creates parent dirs, so the missing-parent failure mode is replaced with a monkeypatch os.replace raise to exercise the same OSError codepath.
  • tests/unit/test_command_helpers.py::test_cleans_up_temp_file_on_write_error -- leftover-file glob made prefix-agnostic so a future temp-file prefix change does not silent-pass.

End-to-end verification

On a fixture with apm.yml, src/main.py, and two instruction files (one with applyTo: "**/*.py", one with applyTo: "tests/**/*.py"):

Target Output Build ID
apm compile AGENTS.md (root) + src/AGENTS.md (distributed sub-dir) d659e6183d03 / a91eb45e9bbf
apm compile --single-agents AGENTS.md 3df8bca8558a
apm compile -t claude CLAUDE.md 46a595669d48
apm compile -t gemini GEMINI.md b25ecb1630a5
compile_agents_md() (public API) AGENTS.md 1cb5828b4dd3

Verified:

  • Zero __BUILD_ID__ literals remain in any output.
  • Re-running the same compile reproduces the same hash (deterministic).
  • Modifying an .instructions.md changes only the affected target's hash (drift detection works as documented in [BUG] apm compile produces different Build IDs on macOS vs Linux #467's user story).
  • Reverting the modification reproduces the original hash byte-for-byte (byte-stable rebuild).
  • git stash of this PR's changes reproduces the original bug -- all five paths write __BUILD_ID__ literally -- confirming the fix is necessary and sufficient.

apm compile wrote literal '<!-- Build ID: __BUILD_ID__ -->' to disk on
every target except the legacy --single-agents path. The substitution
that constants.py promises lived only in compile/cli.py:506-524 and was
gated by `if config.strategy == "distributed" and not single_agents`.
PR microsoft#468 (which closed microsoft#467) added cross-platform sort determinism but
never extended substitution to the distributed/claude/gemini paths;
its verification ran on --single-agents only.

Affected paths producing literal __BUILD_ID__ on main:
- apm compile (default distributed): AGENTS.md + every sub-dir
- apm compile -t claude: CLAUDE.md
- apm compile -t gemini: GEMINI.md
- apm compile --watch (single-file): AGENTS.md
- compile_agents_md() public API: AGENTS.md

Rather than repeat PR microsoft#468's sibling-by-sibling pattern (4 files, all
needing the same call site), this introduces a single chokepoint:

  compilation/output_writer.py::CompiledOutputWriter

All five compiled-output write sites route through writer.write(...).
The writer extracts stabilize_build_id() (preserved verbatim from
compile/cli.py), asserts no placeholder survives stabilization
(defense-in-depth: future bypass raises RuntimeError instead of
silently emitting __BUILD_ID__), creates parent dirs, and writes
atomically. Direct Path.write_text / open(...).write on compiled
output is a contract violation by design.

The atomic write primitive was duplicated in commands/_helpers.
Consolidated to utils/atomic_io.atomic_write_text; commands/_helpers
keeps a one-line alias for backward compatibility with existing tests.

Closes microsoft#957
@danielmeppiel danielmeppiel added the panel-review Trigger the apm-review-panel gh-aw workflow label Apr 29, 2026
@github-actions
Copy link
Copy Markdown

APM Review Panel Verdict

Disposition: APPROVE (clean refactor; no required pre-merge actions)


Per-persona findings

Python Architect:

This PR extracts compiled-output write logic from four scattered call sites in AgentsCompiler and one inline block in compile/cli.py into two new modules (compilation/build_id.py, compilation/output_writer.py) plus one new utility (utils/atomic_io.py). The refactor satisfies the "abstract when 3+ call sites share the same logic" rule from the codebase's design philosophy.

1. OO / class diagram (mermaid)

classDiagram
    direction LR
    class CompiledOutputWriter {
        <<Facade>>
        +write(path: Path, content: str) None
    }
    class stabilize_build_id {
        <<Pure>>
        +stabilize_build_id(content: str) str
    }
    class atomic_write_text {
        <<IOBoundary>>
        +atomic_write_text(path: Path, data: str) None
    }
    class AgentsCompiler {
        +_compile_claude_md(config, primitives) None
        +_compile_gemini_md(config, primitives) None
        +_write_output_file(output_path, content) None
        +_write_distributed_file(agents_path, content, config) None
    }
    class _helpers {
        <<Module>>
        +_atomic_write alias
    }
    CompiledOutputWriter ..> stabilize_build_id : calls
    CompiledOutputWriter ..> atomic_write_text : calls
    AgentsCompiler ..> CompiledOutputWriter : uses (lazy import per method)
    _helpers ..> atomic_write_text : re-exports as alias
    note for CompiledOutputWriter "Chokepoint: stabilize build_id\n+ mkdir + atomic rename"
    class CompiledOutputWriter:::touched
    class stabilize_build_id:::touched
    class atomic_write_text:::touched
    class AgentsCompiler:::touched
    class _helpers:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading

2. Execution flow diagram (mermaid)

flowchart TD
    A["compile CLI\ncommands/compile/cli.py"] --> B
    C["AgentsCompiler._compile_claude_md\ncompilation/agents_compiler.py"] --> B
    D["AgentsCompiler._compile_gemini_md\ncompilation/agents_compiler.py"] --> B
    E["AgentsCompiler._write_output_file\ncompilation/agents_compiler.py"] --> B
    F["AgentsCompiler._write_distributed_file\ncompilation/agents_compiler.py"] --> B
    B["CompiledOutputWriter.write\ncompilation/output_writer.py"]
    B --> G["stabilize_build_id(content)\ncompilation/build_id.py"]
    G --> H{BUILD_ID_PLACEHOLDER\nin final?}
    H -->|Yes| I["RuntimeError: stabilization\nbypassed -- loud fail"]
    H -->|No| J["[FS] path.parent.mkdir\nparents=True, exist_ok=True"]
    J --> K["[I/O] atomic_write_text\nutils/atomic_io.py"]
    K --> L["[I/O] tempfile.mkstemp\nprefix='apm-atomic-', dir=path.parent"]
    L --> M["[I/O] os.fdopen.write(data)"]
    M --> N["[FS] os.replace(tmp_name, path)"]
    N --> O["Success: target updated atomically"]
    M -->|"OSError"| P["[FS] os.unlink(tmp_name)\n+ re-raise"]
Loading

Design patterns

  • Used in this PR: Facade / Chokepoint -- CompiledOutputWriter centralises build_id stabilisation, parent-mkdir, and atomic rename so every write site gets these cross-cutting concerns for free; Pure Function -- stabilize_build_id is referentially transparent (same input, same output, no side effects), making it trivially unit-testable as demonstrated by the 8-case test suite added in this PR.
  • Pragmatic suggestion: Consider storing one CompiledOutputWriter instance per compilation method (e.g. writer = CompiledOutputWriter() once at the top of _compile_claude_md) rather than constructing it inside nested loops (the PR already does this for _compile_claude_md and _compile_gemini_md; _write_output_file and _write_distributed_file each construct a fresh instance for every call). Since the class is stateless this is not a bug -- it is minor noise. A module-level singleton or a class-level constant _WRITER = CompiledOutputWriter() would silence it, but that optimization is out of scope for this PR.

Minor observation: the lazy from .output_writer import CompiledOutputWriter inside each method body in agents_compiler.py is consistent with the codebase's pattern for breaking circular imports. No action required -- just noting it is intentional, not an oversight.


CLI Logging Expert: No user-facing output paths are touched. The removed inline build_id computation in compile/cli.py was silent internal logic; moving it into CompiledOutputWriter does not change any message the user sees. No CommandLogger, _rich_*, or DiagnosticCollector call sites are modified. Clean.


DevX UX Expert: Purely internal infrastructure refactor. The apm compile command surface, flags, help text, and exit codes are unchanged. No CLI reference doc (docs/src/content/docs/reference/cli-commands.md) update is needed. The change is invisible to users and to CI pipelines consuming apm compile. No UX concerns.


Supply Chain Security Expert: The PR improves the security posture in two ways:

  1. Atomic writes are now universal. Before this PR, AgentsCompiler._compile_claude_md, _compile_gemini_md, _write_output_file, and _write_distributed_file used path.write_text() / open().write(), which can leave a half-written file if the process is interrupted. All four now route through atomic_write_text (same-filesystem rename via os.replace). A pre-existing target file is never corrupted by a failed write.

  2. Build ID is now injected consistently. Previously only the single-file AGENTS.md path in compile/cli.py stamped a Build ID; the other compilation targets wrote raw content with BUILD_ID_PLACEHOLDER surviving to disk. The chokepoint's defensive RuntimeError assertion ensures no future code path can silently bypass stabilisation.

No new network, credential, or path-traversal surface is introduced. The path.parent.mkdir() call uses exist_ok=True, which is safe. No security concerns -- this is a net positive.


Auth Expert: Not activated -- the PR touches only compilation output writing infrastructure (compilation/output_writer.py, compilation/build_id.py, utils/atomic_io.py, commands/_helpers.py, commands/compile/cli.py) with no changes to authentication behavior, token management, credential resolution, or host classification.


OSS Growth Hacker: No conversion surface is affected. This is a maintenance PR; no CHANGELOG entry, README, or docs update is warranted. Side-channel note to CEO: if atomic writes have been the root cause of intermittent "corrupted AGENTS.md" bug reports, that is worth a one-liner in the release notes ("All compiled outputs are now written atomically -- no partial files on crash") once confirmed by issue history. No WIP/growth-strategy.md update triggered.


CEO arbitration

All five mandatory specialists agree: this is a well-scoped internal refactor with no user-facing impact and a measurable security improvement. No trade-offs require arbitration. The chokepoint pattern is the right call at this scale -- five call sites sharing identical cross-cutting concerns (build_id, mkdir, atomicity) is exactly the threshold where extraction pays for itself. The PR ships with comprehensive unit tests for both new modules, covering idempotency, determinism, defensive assertions, and atomic-failure recovery. Disposition is APPROVE with no required pre-merge actions.


Required actions before merge

None.


Optional follow-ups

  • AgentsCompiler._write_output_file and _write_distributed_file each instantiate CompiledOutputWriter() on every call. Since the class is stateless, consider promoting it to a module-level or instance-level attribute to reduce per-call allocation noise -- low priority, purely cosmetic.
  • If any open issues cite "partial AGENTS.md after crash" or "Build ID missing from distributed output", close them as fixed by this PR and add a one-liner to the next release's CHANGELOG entry.

Note

🔒 Integrity filter blocked 2 items

The following items were blocked because they don't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review Panel for issue #959 · ● 760.7K ·

…placeholder

# Conflicts:
#	src/apm_cli/compilation/agents_compiler.py
Copilot AI review requested due to automatic review settings April 30, 2026 03:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes apm compile Build ID stabilization so compiled outputs no longer persist the literal __BUILD_ID__ placeholder outside the legacy --single-agents path, by routing all write sites through a single writer chokepoint.

Changes:

  • Introduces stabilize_build_id() as a shared helper and applies it via a centralized CompiledOutputWriter.
  • Adds a canonical atomic_write_text() implementation and reuses it via _atomic_write alias.
  • Updates compilation write sites and expands unit test coverage for Build ID stabilization + atomic write behavior.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/apm_cli/compilation/build_id.py New helper to deterministically replace the Build ID placeholder.
src/apm_cli/compilation/output_writer.py New writer chokepoint that stabilizes Build ID, asserts invariants, and writes atomically.
src/apm_cli/utils/atomic_io.py New canonical atomic text write primitive used by writer and legacy helper alias.
src/apm_cli/commands/_helpers.py Replaces old _atomic_write implementation with an alias to atomic_write_text.
src/apm_cli/commands/compile/cli.py Removes inline Build ID hashing and writes via CompiledOutputWriter.
src/apm_cli/compilation/agents_compiler.py Routes distributed/claude/gemini/single-file write sites through the writer.
tests/unit/compilation/test_build_id.py Adds unit tests for stabilize_build_id() semantics and determinism.
tests/unit/compilation/test_output_writer.py Adds unit tests for writer chokepoint behavior and atomic write semantics.
tests/unit/compilation/test_agents_compiler_coverage.py Adjusts OSError-path coverage to reflect atomic rename failure mode.
tests/unit/test_command_helpers.py Makes temp-file cleanup assertion prefix-agnostic.

Comment thread tests/unit/compilation/test_output_writer.py Outdated
Comment thread src/apm_cli/utils/atomic_io.py
Comment thread src/apm_cli/commands/_helpers.py Outdated
Comment thread src/apm_cli/commands/compile/cli.py
- _helpers.py: move atomic_write_text alias import to top of file
  (PEP 8 ordering) and drop now-unused tempfile import after the
  inline _atomic_write body was replaced with the alias.
- test_output_writer.py: replace literal CJK title with \u escape
  sequences so the source file stays within the printable-ASCII
  policy enforced by .github/instructions/encoding.instructions.md.
  Runtime characters and the UTF-8 round-trip assertion are
  unchanged.
@danielmeppiel danielmeppiel enabled auto-merge April 30, 2026 05:09
@danielmeppiel danielmeppiel added this pull request to the merge queue Apr 30, 2026
Merged via the queue into microsoft:main with commit 5e00cf4 Apr 30, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

panel-review Trigger the apm-review-panel gh-aw workflow

Projects

None yet

3 participants