debug: v0.31.1 compiler emits different YAML than committed *.lock.yml files from #861, breaking integrity check #866

@jamesadevine

Description

debug: v0.31.1 compiler emits different YAML than committed *.lock.yml files generated by PR #861, breaking ado-aw check

Summary

The scheduled Daily safe-output smoke: noop pipeline (ADO org msazuresphere, project AgentPlayground) fails in the "Verify pipeline integrity" step. The committed tests/safe-outputs/noop.lock.yml on main contains an Emit aw_info.json bash step that the publicly-released ado-aw v0.31.1 binary does not reproduce when re-compiling the same source. Both the on-disk lock and the regenerated lock claim version=0.31.1 in their headers, so the integrity check cannot reconcile them.

The same drift very likely affects all 26 lock files touched by #861 "regen ado-aw"; noop.lock.yml is just the first one to be exercised by a scheduled smoke run.

Failing build

Failing log excerpt (verbatim)

--- tests/safe-outputs/noop.lock.yml (on disk)
+++ tests/safe-outputs/noop.lock.yml (expected from source)
@@ -261,16 +261,6 @@
         displayName: "ado-aw"

       - bash: |
-          set -eo pipefail
-
-          mkdir -p "$(Agent.TempDirectory)/staging"
-          cat >"$(Agent.TempDirectory)/staging/aw_info.json" <<'AW_INFO_EOF'
-          {"agent_name":"Daily safe-output smoke: noop","build_definition_id":"$(System.DefinitionId)","build_id":"$(Build.BuildId)","compiler_version":"0.31.1","engine":"copilot","model":"gpt-5-mini","org":"","repo":"","schema":"ado-aw/aw_info/1","source":"tests/safe-outputs/noop.md","source_branch":"$(Build.SourceBranch)","source_version":"$(Build.SourceVersion)","target":"standalone"}
-          AW_INFO_EOF
-        displayName: "Emit aw_info.json"
-        condition: always()
-
-      - bash: |
           cat >> "/tmp/awf-tools/agent-prompt.md" << 'SAFEOUTPUTS_EOF'
           ---


Summary: 0 line(s) added, 10 line(s) removed

Error: Integrity check failed: generated pipeline for 'Daily safe-output smoke: noop' does not match tests/safe-outputs/noop.lock.yml. Re-run `ado-aw compile` to update the pipeline file.
##[error]Bash exited with code '1'.
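
The "0 line(s) added, 10 line(s) removed" summary is the shape one would expect when the on-disk lock carries a step that the regenerated lock lacks. A minimal sketch of that kind of comparison follows. It is not ado-aw's actual implementation; `summarize_drift` and the shortened sample lines are hypothetical.

```python
import difflib

def summarize_drift(on_disk, expected):
    """Count lines added/removed when going from on_disk to expected."""
    added = removed = 0
    matcher = difflib.SequenceMatcher(a=on_disk, b=expected, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines present only on disk
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines present only in the regenerated output
    return added, removed

# Shortened stand-ins for the two lock files (illustrative, not the real content).
on_disk = [
    "steps:",
    "  - bash: |",
    "      set -eo pipefail",
    '    displayName: "Emit aw_info.json"',
    "  - bash: |",
    "      echo agent",
]
expected = [
    "steps:",
    "  - bash: |",
    "      echo agent",
]

added, removed = summarize_drift(on_disk, expected)
print(f"Summary: {added} line(s) added, {removed} line(s) removed")
```

Any non-zero count in either direction would be enough for a check of this kind to fail.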

What I verified

I reproduced this locally on a clean checkout to rule out environmental noise:

  1. Source on main is unchanged from what the build saw. tests/safe-outputs/noop.md does not configure or reference anything that would conditionally enable / disable an aw_info.json emit.
  2. The publicly-released v0.31.1 compiler does not emit the aw_info.json step.
    > .\ado-aw.exe --version
    ado-aw 0.31.1
    > .\ado-aw.exe compile tests/safe-outputs/noop.md
    Generated standalone pipeline: tests/safe-outputs/noop.lock.yml
    > Select-String -Path tests\safe-outputs\noop.lock.yml -Pattern 'aw_info|Emit'
    (no matches)
    
  3. The committed noop.lock.yml on main does contain the step (lines around 261–270 per the diff). Both files carry the same header:
    # @ado-aw source="tests/safe-outputs/noop.md" version=0.31.1
    
  4. regen ado-aw #861 introduced the step. git log on tests/safe-outputs/noop.lock.yml:
    ea16d851  2026-06-05T13:21:52Z  regen ado-aw (#861)
    9597f35c  2026-05-18T13:43:54Z  chore: regenerate all ado-aw workflows with v0.30.1 (#619)
    
    regen ado-aw #861's patch on noop.lock.yml bumps the version pin 0.30.1 → 0.31.1 and adds the entire Emit aw_info.json block as a new step.
  5. The failing build started 10 minutes after regen ado-aw #861 merged, so the trigger is unambiguous: the regen committed YAML that the integrity check (running the public release) refuses to accept.
  6. --debug-pipeline is not a flag on the released v0.31.1 binary. I tested it; the binary suggests --debug (logging only) and rejects --debug-pipeline. So the drift cannot be explained by "regen ran with --debug-pipeline, integrity check ran without it" against the released artifact.
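
For anyone repeating step 2 without PowerShell, the Select-String scan can be approximated in a few lines of Python. This is a sketch only: `matching_lines` is a hypothetical helper, the excerpts are shortened stand-ins for the real lock files, and the match is case-insensitive to mirror Select-String's default behavior.

```python
import re

def matching_lines(text, pattern=r"aw_info|Emit"):
    """Return 1-indexed line numbers whose content matches the pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)]

# Illustrative excerpt of the committed lock (contains the extra step).
committed_excerpt = "\n".join([
    "- bash: |",
    "    set -eo pipefail",
    "    cat >\"$(Agent.TempDirectory)/staging/aw_info.json\" <<'AW_INFO_EOF'",
    "  displayName: \"Emit aw_info.json\"",
])

# Illustrative excerpt of what the released binary regenerates (no such step).
regenerated_excerpt = "\n".join([
    "- bash: |",
    "    cat >> \"/tmp/awf-tools/agent-prompt.md\" << 'SAFEOUTPUTS_EOF'",
])

print(matching_lines(committed_excerpt))    # lines that mention the step
print(matching_lines(regenerated_excerpt))  # empty list means "no matches"
```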

Likely root cause

The compiler binary that produced PR #861's lock files is not the same binary that the public v0.31.1 GitHub release ships. It was most likely built from a commit after the v0.31.1 tag: one that adds the aw_info.json emit logic but still carries the hard-coded version string 0.31.1.

This is effectively a versioning hole:

  • The lock-file header is the only thing ado-aw check uses to assert "this lock matches this compiler version."
  • If two different codegen behaviors both stamp version=0.31.1, the integrity check has no way to distinguish them, and CI regressions caused by an unreleased-but-tag-named build of the compiler land silently in main.
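
The hole can be illustrated with a small sketch. It assumes the header always has the form shown in step 3 above; `header_version` and `version_only_check` are hypothetical names, not part of ado-aw. A check that trusts the version stamp alone accepts two lock files whose bodies differ.

```python
import re

# Assumed header shape, taken from the line quoted in this report.
HEADER_RE = re.compile(
    r'^#\s*@ado-aw\s+source="(?P<source>[^"]*)"\s+version=(?P<version>\S+)',
    re.MULTILINE,
)

def header_version(lock_text):
    """Extract the version stamp from a lock file, or None if absent."""
    match = HEADER_RE.search(lock_text)
    return match.group("version") if match else None

def version_only_check(lock_a, lock_b):
    """Hypothetical check that compares nothing but the header version."""
    return header_version(lock_a) == header_version(lock_b)

header = '# @ado-aw source="tests/safe-outputs/noop.md" version=0.31.1\n'
committed = header + 'steps:\n  - displayName: "Emit aw_info.json"\n'
regenerated = header + "steps:\n"

print(version_only_check(committed, regenerated))  # stamps agree
print(committed == regenerated)                    # bodies do not
```

Both files report 0.31.1, so the stamp carries no information about which codegen behavior produced them.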

Impact

  • All 26 lock files touched by regen ado-aw #861 are likely in the same drift state. The noop smoke just happens to be on a daily schedule, so it was the first to surface the regression.
  • Every scheduled / manual run of those pipelines will fail at "Verify pipeline integrity" until either:
    • the lock files are regenerated with the published v0.31.1 binary, or
    • a new release (e.g. v0.31.2) is cut from the compiler revision that regen ado-aw #861 used, and the lock files' headers are bumped to match.

Suggested fixes (out of scope for this report; for maintainer triage)

  1. Bump the version pin whenever codegen changes. Even patch-level codegen changes should bump the version in the lock-file header so ado-aw check reliably catches drift.
  2. Embed the build SHA in the lock header (e.g. version=0.31.1+sha.abcdef0) so unreleased-but-tag-stamped builds are distinguishable.
  3. Gate the regen workflow on --version matching the latest published release, so PRs like regen ado-aw #861 can't be authored using a local build that disagrees with what the integrity check uses.
  4. Re-release (cut a v0.31.2 from the codegen revision that produced regen ado-aw #861) and regenerate all lock files against that public binary.
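
Fixes 2 and 3 could be combined roughly as sketched below. Everything here is an assumption for illustration: the `+sha.<hash>` suffix follows the example in fix 2, `split_version` and `regen_allowed` are hypothetical helpers, and the second hash is made up.

```python
def split_version(stamp):
    """Split 'X.Y.Z+build' into (release, build); build is None if absent."""
    release, _, build = stamp.partition("+")
    return release, (build or None)

def regen_allowed(local_stamp, published_stamp):
    """Allow a regen only when the local compiler is exactly the published build."""
    return local_stamp == published_stamp

published = "0.31.1+sha.abcdef0"  # stamp of the public release (example from fix 2)
local = "0.31.1+sha.1234567"      # hypothetical local build from a later commit

print(split_version(local))             # same release number, different build
print(regen_allowed(local, published))  # False: the regen would be blocked
```

The comparison has to use the full string. Under Semantic Versioning, build metadata after `+` is ignored for precedence, so a semver-aware "equal version" test would treat these two stamps as the same release.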

Environment

  • ADO org / project: msazuresphere / AgentPlayground
  • ADO pool: AZS-1ES-L-Playground-ubuntu-22.04
  • Local reproduction: Windows 11, ado-aw.exe 0.31.1 downloaded by our scripts/rebuild-pipelines.ps1 from https://github.com/githubnext/ado-aw/releases/download/v0.31.1/ado-aw-windows-x64.exe
  • Related upstream context: #861 (the regen PR that introduced the drift)

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)
