feat(deploy): record generating model(s) per cell in run manifest #1
Open
JoshuaBearup wants to merge 1 commit into
Adds a `generator_models` column to `runs/<id>/manifest.csv` and a `generation` block (token usage by model, from the existing `getUsageReport()`) to the per-cell local manifest, so a run records which model(s) actually produced each deployed cell.

Local manifest only, deliberately NOT baked into the container. RCE/LFI classes can read the baked manifest, and the generating model is metadata the model under test must not see. Mirrors how `solvabilityProof` is attached post-deploy.

Additive column appended at the end of the CSV, so existing `manifest.csv` consumers are unaffected.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
What
Records which model(s) generated each cell in a run, alongside the existing record of which model attacked it.
- Adds a `generator_models` column to `runs/<id>/manifest.csv` (e.g. `claude-haiku-4-5-20251001:3|claude-opus-4-8:1`, in model:call-count form, pipe-separated).
- Adds a `generation` block (the existing `getUsageReport()` output: tokens + cost by model) to the per-cell local manifest.
Why
A run already records the model under test (the attacker) but not which model produced each target. Since generation mixes models (Haiku by default, Opus for the higher-quality steps via `quality: true`), recording the generator mix per cell helps reproducibility and cost/quality analysis, and makes it explicit when a quality step silently ran on a different model than expected.
Design notes
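- Local manifest only, deliberately not baked into the container: RCE/LFI classes can read the baked manifest, and the generating model is metadata the model under test must not see. This mirrors how `solvabilityProof` is attached post-deploy (`generator/deploy.mjs`), so it never reaches the container.
- Additive: the column is appended at the end of `manifest.csv`, so existing consumers of that file are unaffected.
Testing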
Verified `node -c` on both changed files, header/row column-count consistency (14 to 14), and the `generator_models` formatting against a sample `byModel`. I did not run a full end-to-end deploy on this branch (no `ANTHROPIC_API_KEY` in my environment), but the change only persists `getUsageReport()` output that `deploy.mjs` already computes and logs via `logUsage()`, so there is no new generation behaviour.
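
For reviewers who want to repeat the formatting and column-count checks without an API key, here is a minimal sketch. It is not the code in this PR: the `calls` field on each `byModel` entry, the helper names `formatGeneratorModels` and `appendColumn`, and the two-column sample header are all assumptions made for illustration, and the real `getUsageReport()` output may be shaped differently.

```javascript
// Hypothetical sample of per-model usage data. The field name `calls` is an
// assumption; the real getUsageReport() output may use different names.
const sampleByModel = {
  "claude-haiku-4-5-20251001": { calls: 3 },
  "claude-opus-4-8": { calls: 1 },
};

// Format as model:call-count pairs joined by "|", keeping the object's
// insertion order. An empty or missing input yields an empty string.
function formatGeneratorModels(byModel) {
  return Object.entries(byModel ?? {})
    .map(([model, usage]) => `${model}:${usage.calls}`)
    .join("|");
}

// Append the new column at the end of both header and row, so the positions
// of all existing columns stay unchanged.
function appendColumn(header, row, name, value) {
  return { header: [...header, name], row: [...row, value] };
}

const formatted = formatGeneratorModels(sampleByModel);

// Hypothetical two-column manifest, used only to show the append.
const { header, row } = appendColumn(
  ["id", "status"],
  ["cell-1", "deployed"],
  "generator_models",
  formatted
);

console.log(formatted); // claude-haiku-4-5-20251001:3|claude-opus-4-8:1
console.log(header.length === row.length); // true
```

The formatted value contains no commas or quotes, so in this sketch it needs no CSV escaping. That would stop holding if a model identifier ever contained a comma.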