feat(output-assay): add witness scaffold, corpus, and report CLI #104
Open
Haserjian wants to merge 29 commits into
AgentMesh Lineage Check
Lineage coverage: 0/29 commits (0%)
Pull request overview
Adds an initial “Output Assay” local-only scaffold (draft validation → deterministic stamping → deterministic Guardian gating) plus a calibration fixture corpus and operator-facing report/CLI wiring, and threads the resulting receipts through the reviewer-packet e2e path.
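The three stages named above (draft validation, deterministic stamping, deterministic Guardian gating) can be pictured with a minimal sketch. This is not the code in the PR: the function names, the `AssayResult` shape, the use of SHA-256 for the stamp, and the `ok`/`accept` values are all assumptions made for illustration. Only the `extraction_failure` status, the `quarantine` decision, and the `observed_units` required-property error appear in the PR's test excerpt.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical: the only required key visible in the PR's test is 'observed_units'.
REQUIRED_DRAFT_KEYS = ("observed_units",)


@dataclass
class AssayResult:
    status: str
    artifact_hash: str
    decision: str
    errors: list = field(default_factory=list)


def validate_draft(draft: dict) -> list:
    """Stage 1: local draft validation. Returns a list of error strings."""
    return [
        f"(root): '{key}' is a required property"
        for key in REQUIRED_DRAFT_KEYS
        if key not in draft
    ]


def stamp(artifact_text: str) -> str:
    """Stage 2: deterministic stamping. Same input bytes give the same hash."""
    return hashlib.sha256(artifact_text.encode("utf-8")).hexdigest()


def gate(errors: list) -> str:
    """Stage 3: deterministic gating. Any validation error quarantines the run."""
    return "quarantine" if errors else "accept"  # 'accept' is an assumed label


def run_assay(artifact_text: str, draft: dict) -> AssayResult:
    errors = validate_draft(draft)
    status = "extraction_failure" if errors else "ok"  # 'ok' is an assumed label
    return AssayResult(status, stamp(artifact_text), gate(errors), errors)
```

Because every stage is a pure function of its inputs, two runs over the same artifact and draft produce identical results, which is what makes golden-file fixtures practical.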
Changes:
- Introduces Output Assay models/guardian/analyzer/report modules and a new `assay output-assay` CLI command.
- Adds Output Assay calibration fixtures (manifest + 20-fixture corpus) and contract tests to enforce fixture shape/completeness.
- Adds a small invariant-accounting v0 primitive (`evaluate_latency_budget`) with tests, and extends reviewer-packet e2e receipts to include Output Assay receipts.
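The PR names `evaluate_latency_budget` but this page does not show its signature or return type. As a rough picture of what a latency-budget invariant check might look like, here is a sketch under a deliberately different name; the parameters, the `LatencyBudgetVerdict` type, and the millisecond units are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyBudgetVerdict:
    observed_ms: float
    budget_ms: float
    within_budget: bool
    headroom_ms: float  # negative when the budget is exceeded


def evaluate_latency_budget_sketch(observed_ms: float, budget_ms: float) -> LatencyBudgetVerdict:
    """Hypothetical stand-in: compare an observed latency against a fixed budget."""
    if observed_ms < 0 or budget_ms <= 0:
        raise ValueError("observed_ms must be >= 0 and budget_ms must be > 0")
    headroom = budget_ms - observed_ms
    return LatencyBudgetVerdict(observed_ms, budget_ms, headroom >= 0, headroom)
```

Returning a structured verdict rather than a bare boolean keeps the observed value and the budget together, which is convenient if the result is later recorded in a receipt.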
Reviewed changes
Copilot reviewed 77 out of 77 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/specs/INVARIANT_ACCOUNTING_V0.md | Spec draft for invariant accounting v0 |
| docs/specs/OUTPUT_ASSAY_CALIBRATION_V0.md | Spec draft for Output Assay calibration v0 |
| docs/specs/OUTPUT_ASSAY_RECONCILIATION_V0_DRAFT.md | Spec draft for Output Assay noun/projection boundaries |
| src/assay/commands.py | Adds `assay output-assay` command |
| src/assay/invariants.py | Adds invariant accounting primitives |
| src/assay/output_assay/__init__.py | Exposes Output Assay public API |
| src/assay/output_assay/analyzer.py | Local draft validation + deterministic stamping pipeline |
| src/assay/output_assay/guardian.py | Deterministic local Guardian gating |
| src/assay/output_assay/models.py | Pydantic models for draft/run/failure |
| src/assay/output_assay/report.py | Operator report rendering + fail-on helpers |
| tests/assay/test_invariant_accounting.py | Tests for invariant accounting |
| tests/assay/test_output_assay_report.py | Tests report rendering + CLI wiring |
| tests/assay/test_output_assay_scaffold.py | Tests for analyzer scaffold behavior |
| tests/contracts/test_output_assay_calibration_fixtures.py | Contract tests for fixture corpus |
| tests/e2e/test_reviewer_packet_evidence_sprint.py | Adds Output Assay receipts into e2e pack |
| tests/fixtures/output_assay/README.md | Fixture corpus README |
| tests/fixtures/output_assay/manifest.json | Fixture corpus manifest |
| tests/fixtures/output_assay/business_artifacts/oa_301_status_update_mixed/artifact.md | Business fixture artifact |
| tests/fixtures/output_assay/business_artifacts/oa_301_status_update_mixed/expected_run.json | Business fixture golden run |
| tests/fixtures/output_assay/business_artifacts/oa_301_status_update_mixed/fixture.json | Business fixture metadata |
| tests/fixtures/output_assay/business_artifacts/oa_302_decision_memo_mixed/artifact.md | Business fixture artifact |
| tests/fixtures/output_assay/business_artifacts/oa_302_decision_memo_mixed/expected_run.json | Business fixture golden run |
| tests/fixtures/output_assay/business_artifacts/oa_302_decision_memo_mixed/fixture.json | Business fixture metadata |
| tests/fixtures/output_assay/business_artifacts/oa_303_business_insight_brief_mixed/artifact.md | Business fixture artifact |
| tests/fixtures/output_assay/business_artifacts/oa_303_business_insight_brief_mixed/expected_run.json | Business fixture golden run |
| tests/fixtures/output_assay/business_artifacts/oa_303_business_insight_brief_mixed/fixture.json | Business fixture metadata |
| tests/fixtures/output_assay/mixed_quality/oa_201_claim_like_example/artifact.md | Mixed-quality fixture artifact |
| tests/fixtures/output_assay/mixed_quality/oa_201_claim_like_example/expected_run.json | Mixed-quality fixture golden run |
| tests/fixtures/output_assay/mixed_quality/oa_201_claim_like_example/fixture.json | Mixed-quality fixture metadata |
| tests/fixtures/output_assay/mixed_quality/oa_202_partial_evidence_brief/artifact.md | Mixed-quality fixture artifact |
| tests/fixtures/output_assay/mixed_quality/oa_202_partial_evidence_brief/expected_run.json | Mixed-quality fixture golden run |
| tests/fixtures/output_assay/mixed_quality/oa_202_partial_evidence_brief/fixture.json | Mixed-quality fixture metadata |
| tests/fixtures/output_assay/mixed_quality/oa_203_evidence_quote_boundary/artifact.md | Mixed-quality fixture artifact |
| tests/fixtures/output_assay/mixed_quality/oa_203_evidence_quote_boundary/expected_run.json | Mixed-quality fixture golden run |
| tests/fixtures/output_assay/mixed_quality/oa_203_evidence_quote_boundary/fixture.json | Mixed-quality fixture metadata |
| tests/fixtures/output_assay/mixed_quality/oa_204_ambiguous_stance_argument/artifact.md | Mixed-quality fixture artifact |
| tests/fixtures/output_assay/mixed_quality/oa_204_ambiguous_stance_argument/expected_run.json | Mixed-quality fixture golden run |
| tests/fixtures/output_assay/mixed_quality/oa_204_ambiguous_stance_argument/fixture.json | Mixed-quality fixture metadata |
| tests/fixtures/output_assay/mixed_quality/oa_205_verify_status_boundary/artifact.md | Mixed-quality fixture artifact |
| tests/fixtures/output_assay/mixed_quality/oa_205_verify_status_boundary/expected_run.json | Mixed-quality fixture golden run |
| tests/fixtures/output_assay/mixed_quality/oa_205_verify_status_boundary/fixture.json | Mixed-quality fixture metadata |
| tests/fixtures/output_assay/negative_controls/oa_101_output_assay_duplicate_spec/artifact.md | Negative-control fixture artifact |
| tests/fixtures/output_assay/negative_controls/oa_101_output_assay_duplicate_spec/expected_run.json | Negative-control fixture golden run |
| tests/fixtures/output_assay/negative_controls/oa_101_output_assay_duplicate_spec/fixture.json | Negative-control fixture metadata |
| tests/fixtures/output_assay/negative_controls/oa_102_unanchorable_extraction/artifact.md | Negative-control fixture artifact |
| tests/fixtures/output_assay/negative_controls/oa_102_unanchorable_extraction/expected_run.json | Negative-control fixture golden run |
| tests/fixtures/output_assay/negative_controls/oa_102_unanchorable_extraction/fixture.json | Negative-control fixture metadata |
| tests/fixtures/output_assay/negative_controls/oa_103_prompt_injection_bait/artifact.md | Negative-control fixture artifact |
| tests/fixtures/output_assay/negative_controls/oa_103_prompt_injection_bait/expected_run.json | Negative-control fixture golden run |
| tests/fixtures/output_assay/negative_controls/oa_103_prompt_injection_bait/fixture.json | Negative-control fixture metadata |
| tests/fixtures/output_assay/negative_controls/oa_104_author_judgment_bait/artifact.md | Negative-control fixture artifact |
| tests/fixtures/output_assay/negative_controls/oa_104_author_judgment_bait/expected_run.json | Negative-control fixture golden run |
| tests/fixtures/output_assay/negative_controls/oa_104_author_judgment_bait/fixture.json | Negative-control fixture metadata |
| tests/fixtures/output_assay/negative_controls/oa_105_failed_extraction_refusal/artifact.md | Negative-control fixture artifact |
| tests/fixtures/output_assay/negative_controls/oa_105_failed_extraction_refusal/expected_run.json | Negative-control fixture golden run |
| tests/fixtures/output_assay/negative_controls/oa_105_failed_extraction_refusal/fixture.json | Negative-control fixture metadata |
| tests/fixtures/output_assay/non_claim_artifacts/oa_401_support_note/artifact.md | Non-claim fixture artifact |
| tests/fixtures/output_assay/non_claim_artifacts/oa_401_support_note/expected_run.json | Non-claim fixture golden run |
| tests/fixtures/output_assay/non_claim_artifacts/oa_401_support_note/fixture.json | Non-claim fixture metadata |
| tests/fixtures/output_assay/non_claim_artifacts/oa_402_brainstorm/artifact.md | Non-claim fixture artifact |
| tests/fixtures/output_assay/non_claim_artifacts/oa_402_brainstorm/expected_run.json | Non-claim fixture golden run |
| tests/fixtures/output_assay/non_claim_artifacts/oa_402_brainstorm/fixture.json | Non-claim fixture metadata |
| tests/fixtures/output_assay/positive_controls/oa_001_clear_claim/artifact.md | Positive-control fixture artifact |
| tests/fixtures/output_assay/positive_controls/oa_001_clear_claim/expected_run.json | Positive-control fixture golden run |
| tests/fixtures/output_assay/positive_controls/oa_001_clear_claim/fixture.json | Positive-control fixture metadata |
| tests/fixtures/output_assay/positive_controls/oa_002_kernel_constitutional_rules/artifact.md | Positive-control fixture artifact |
| tests/fixtures/output_assay/positive_controls/oa_002_kernel_constitutional_rules/expected_run.json | Positive-control fixture golden run |
| tests/fixtures/output_assay/positive_controls/oa_002_kernel_constitutional_rules/fixture.json | Positive-control fixture metadata |
| tests/fixtures/output_assay/positive_controls/oa_003_compiled_packet_claim_bindings/artifact.md | Positive-control fixture artifact |
| tests/fixtures/output_assay/positive_controls/oa_003_compiled_packet_claim_bindings/expected_run.json | Positive-control fixture golden run |
| tests/fixtures/output_assay/positive_controls/oa_003_compiled_packet_claim_bindings/fixture.json | Positive-control fixture metadata |
| tests/fixtures/output_assay/positive_controls/oa_004_observation_not_assertion_boundary/artifact.md | Positive-control fixture artifact |
| tests/fixtures/output_assay/positive_controls/oa_004_observation_not_assertion_boundary/expected_run.json | Positive-control fixture golden run |
| tests/fixtures/output_assay/positive_controls/oa_004_observation_not_assertion_boundary/fixture.json | Positive-control fixture metadata |
| tests/fixtures/output_assay/positive_controls/oa_005_verification_order_plan/artifact.md | Positive-control fixture artifact |
| tests/fixtures/output_assay/positive_controls/oa_005_verification_order_plan/expected_run.json | Positive-control fixture golden run |
| tests/fixtures/output_assay/positive_controls/oa_005_verification_order_plan/fixture.json | Positive-control fixture metadata |
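Every fixture directory in the table carries the same three files: `artifact.md`, `expected_run.json`, and `fixture.json`. A contract test for that shape could be sketched as below. This is an illustration of the idea, not the contents of `tests/contracts/test_output_assay_calibration_fixtures.py`; the helper name `find_incomplete_fixtures` is hypothetical, and the sketch assumes the `category/fixture_id/` layout seen in the listed paths.

```python
from pathlib import Path

# The three files each fixture directory contains, per the file table.
REQUIRED_FILES = ("artifact.md", "expected_run.json", "fixture.json")


def find_incomplete_fixtures(corpus_root) -> dict:
    """Map 'category/fixture_id' to the list of required files it is missing.

    Hypothetical helper: walks <root>/<category>/<fixture_id>/ and ignores
    plain files at the top level (for example manifest.json or README.md).
    """
    root = Path(corpus_root)
    incomplete = {}
    for category in sorted(p for p in root.iterdir() if p.is_dir()):
        for fixture_dir in sorted(p for p in category.iterdir() if p.is_dir()):
            missing = [
                name for name in REQUIRED_FILES
                if not (fixture_dir / name).is_file()
            ]
            if missing:
                incomplete[f"{category.name}/{fixture_dir.name}"] = missing
    return incomplete
```

A test could then assert that `find_incomplete_fixtures("tests/fixtures/output_assay")` returns an empty dict, so a fixture added without its golden run fails immediately.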
Comment on lines +131 to +151
    expected = (
        "# OUTPUT ASSAY REPORT\n"
        "\n"
        "Run:\n"
        " status: extraction_failure\n"
        f" artifact_hash: {result.artifact_hash}\n"
        f" failure_id: {result.failure_id}\n"
        " truth_verification: performed=false tier=internal_support_only\n"
        "\n"
        "Failure:\n"
        " stage: draft_validation\n"
        " failure_modes: schema_validation_failed\n"
        " summary: The system failed to produce a trustworthy Output Assay run artifact during local draft validation.\n"
        " errors:\n"
        " - (root): 'observed_units' is a required property\n"
        "\n"
        "Decision:\n"
        " quarantine (no trustworthy run artifact produced)"
    )

    assert report == expected
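For context on what an exact-string assertion like this pins down, here is a sketch of a renderer that would emit text in the same layout. It is not the code in `src/assay/output_assay/report.py`: the `FailureResult` type and `render_failure_report` name are hypothetical, the indentation follows the excerpt as shown, and joining multiple failure modes with `", "` is an assumption (the excerpt only shows a single mode).

```python
from dataclasses import dataclass


@dataclass
class FailureResult:  # hypothetical shape, inferred from the fields in the excerpt
    artifact_hash: str
    failure_id: str
    stage: str
    failure_modes: list
    summary: str
    errors: list


def render_failure_report(result: FailureResult) -> str:
    """Render a failure report line by line, with no trailing newline."""
    lines = [
        "# OUTPUT ASSAY REPORT",
        "",
        "Run:",
        " status: extraction_failure",
        f" artifact_hash: {result.artifact_hash}",
        f" failure_id: {result.failure_id}",
        " truth_verification: performed=false tier=internal_support_only",
        "",
        "Failure:",
        f" stage: {result.stage}",
        f" failure_modes: {', '.join(result.failure_modes)}",  # separator is assumed
        f" summary: {result.summary}",
        " errors:",
    ]
    lines += [f" - {error}" for error in result.errors]
    lines += ["", "Decision:", " quarantine (no trustworthy run artifact produced)"]
    return "\n".join(lines)
```

Building the report from a list of lines and joining once makes the absence of a trailing newline explicit, which matters when the test compares the whole string for equality.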