Fix scan fail-on bypass via repository-controlled suppressions by tachyon-beep · Pull Request #25 · foundryside-dev/wardline

tachyon-beep · 2026-06-05T08:36:29Z

Motivation

The CLI/MCP scan flow applied repository-supplied baseline/waiver/judged suppressions to emitted findings before evaluating the --fail-on gate, allowing an attacker-controlled repo entry to hide defects from CI gates.
The intent is to keep human-visible annotations (baselined/waived/judged) in scan output while ensuring CI gating only considers a trusted unsuppressed population.

Description

Introduce a separate unsuppressed gate population ScanResult.gate_findings that is derived from the raw (pre-suppression) findings so gates evaluate an unsuppressed view while emitted findings remain annotated (file: src/wardline/core/run.py).
Apply --new-since delta scoping to both the emitted findings and the separate gate population so operator scoping still scopes the gate but repository suppressions do not (file: src/wardline/core/run.py).
Change gate_decision to evaluate against the gate_findings population when present, falling back to result.findings otherwise (file: src/wardline/core/run.py).
Update unit/regression tests to assert that baselined/waived/judged annotations remain visible in output but no longer clear the --fail-on gate (files: tests/unit/core/test_run.py, tests/unit/cli/test_cli.py).

Testing

Ran static/style checks with uv run ruff check and Python compilation with python -m py_compile, both succeeded.
Executed focused unit tests including tests/unit/core/test_run.py::test_gate_decision_uses_unsuppressed_gate_population and tests/unit/core/test_suppression.py::test_gate_trips_only_on_active_defect_at_or_above_threshold, which passed.
Attempted broader test runs that exercise CLI/MCP parity (--extra scanner) but some runs were blocked by dependency downloads (PyYAML / click) due to external network/tunnel failures and thus could not complete; this is an environment issue and not a code regression.
Local test adjustments were added to assert the new gate semantics and the updated test subset ran green in the available environment.

Codex Task

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bd64010099

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-05T08:41:56Z

    findings = apply_suppressions(raw, baseline, waivers, today=date.today(), judged=judged)
+    # Keep a separate gate population that applies only operator-supplied scan
+    # scoping (for example --new-since), not repository-controlled suppressions.
+    gate_findings = list(raw)


Preserve line-less defect quarantine in gate population

When a rule or local pack emits a normal-file DEFECT with line_start=None, apply_suppressions intentionally replaces it with a WLN-ENGINE-LINELESS-DEFECT FACT so the invalid, collision-prone defect is not treated as an active finding. Building gate_findings directly from raw bypasses that tool-owned invariant, so scan --fail-on can now fail on a defect that is omitted from the emitted findings and was previously quarantined for fingerprint-safety reasons; only repository-controlled baseline/waiver/judged suppressions should be skipped here.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-05T08:41:56Z

    if fail_on is None:
        return GateDecision(tripped=False, fail_on=None, exit_class=0)
-    tripped = gate_trips(result.findings, fail_on)
+    gate_findings = result.gate_findings or result.findings


Surface explanations for gate-only suppressed defects

When the only above-threshold defect is baselined, waived, or judged, this switch makes gate.tripped true while the MCP response still emits that finding as suppressed; _scan only attaches inline explanations for SuppressionState.ACTIVE findings (src/wardline/mcp/server.py:160-164). Agents using scan with explain: true and fail_on can therefore receive a failing gate with no explained active defect to fix, so the gate-relevant population needs to be exposed/explained separately or the emitted finding needs a gate-relevant marker.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-05T08:41:56Z

    findings = apply_suppressions(raw, baseline, waivers, today=date.today(), judged=judged)
+    # Keep a separate gate population that applies only operator-supplied scan
+    # scoping (for example --new-since), not repository-controlled suppressions.
+    gate_findings = list(raw)


Honor waivers from explicit operator configs

This treats every waiver as untrusted by gating on the completely raw findings, but run_scan(..., config_path=...) and the CLI --config path can be an operator-supplied config outside the repository. In that CI setup, centrally managed waivers in the explicit config are no longer honored by --fail-on, even though the nearby config handling explicitly preserves operator-provided policy; only waivers loaded from the repository's default wardline.yaml should be excluded from the gate population.

Useful? React with 👍 / 👎.

…ail-on) (#28) Close a HIGH-severity CI-gate bypass. `wardline scan --fail-on` applied repository-controlled suppressions (`.wardline/baseline.yaml`, `wardline.yaml` waivers, `.wardline/judged.yaml`) to findings BEFORE evaluating the gate, so a malicious PR could commit a suppression keyed to its own new defect's fingerprint and clear the gate. All three sources are committed repo content and equally exploitable. Reproduced live (baselining the sole ERROR zeroed the gate). Secure-by-default model (combines #24 + #25): - `gate_decision` now evaluates a separate UNSUPPRESSED population (`ScanResult.gate_findings`). baseline/waiver/judged still ANNOTATE the emitted findings (`suppressed=…` stays visible) but no longer clear the gate. - The gate population is built with apply_suppressions over EMPTY baseline + waivers + judged, NOT `list(raw)`, so the lineless-DEFECT→non-gating-FACT downgrade is preserved (no spurious gate trips). - `--new-since <ref>` (operator-supplied, unforgeable) scopes BOTH the emitted and gate populations — the secure CI ratchet. - `--trust-suppressions` (CLI) / `trust_suppressions` (run_scan, MCP scan tool), default False, restores the local ratchet / judge DX for trusted checkouts (None sentinel → gate falls back to the suppressed findings). `run_judge` passes True so judge/triage/persist are unchanged. - `load_judged` now requires `verdict: FALSE_POSITIVE` (rejects a hand-edited TRUE_POSITIVE / missing verdict smuggled in as a silent suppression). BREAKING (noted in CHANGELOG, acceptable at 0.x): baseline-gated CI goes green→red on upgrade until `--new-since` or `--trust-suppressions` is added. Docs updated (suppression.md): the secure CI ratchet is `--new-since`. Combines and supersedes #24 (judged-only) and #25 (no escape hatch + a lineless-DEFECT gate bug). Full suite green (2394 passed), ruff + mypy clean. Co-authored-by: John Morrissey <john@wardline.dev> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tachyon-beep · 2026-06-05T11:57:10Z

Fixed ourselves in 16a4d005 (PR #28, merged to main) — your mechanism, two bugs fixed, plus an escape hatch.

The vulnerability is real and HIGH-severity, and your scope was right: all three repo-controlled suppression sources (baseline / waiver / judged) must stop clearing the --fail-on gate. #28 adopts your core design — a separate ScanResult.gate_findings unsuppressed population that gate_decision evaluates, with --new-since scoping both populations.

We wrote our own rather than merge this directly because of two issues:

Broke the documented baseline ratchet with no escape hatch. Gating over the full unsuppressed set means scan --fail-on flips green→red for every pre-existing baselined finding on upgrade — and suppression.md documents the baseline as 'the gate fires only on findings after the snapshot'. fix(scan): gate on the unsuppressed population by default (secure --fail-on) — supersedes #24, #25 #28 keeps the secure default but adds --trust-suppressions (default off) to restore the local ratchet for trusted checkouts, threads it through CLI + MCP + run_judge, documents --new-since as the secure CI ratchet, and ships a BREAKING CHANGELOG note (acceptable at 0.x). This was a product decision; the maintainer chose the full secure-by-default + opt-in model.
Lineless-DEFECT gate bug. gate_findings = list(raw) includes raw lineless DEFECTs that apply_suppressions deliberately downgrades to non-gating FACTs (fingerprint-collision safety), so the gate could trip where main wouldn't. fix(scan): gate on the unsuppressed population by default (secure --fail-on) — supersedes #24, #25 #28 builds the gate population as apply_suppressions(raw, Baseline(frozenset()), WaiverSet([]), judged=None) — zero suppression, but the same structural transforms — and adds a regression guard.

Also folded in #24's verdict: FALSE_POSITIVE hardening, and used a None sentinel for gate_findings (not a falsy-empty fallback) so a legitimately-empty gate population can't silently fall back to the suppressed set.

Verified: full suite 2406 passed, ruff + mypy clean; live repro shows a baselined ERROR trips the gate by default and clears under --trust-suppressions; existing gate-clears tests flipped to the secure default with --trust-suppressions variants. Thanks for the headline find — closing in favour of #28.

fix(scan): gate before repo suppressions

bd64010

tachyon-beep added codex aardvark labels Jun 5, 2026 — with ChatGPT Codex Connector

chatgpt-codex-connector Bot reviewed Jun 5, 2026

View reviewed changes

tachyon-beep mentioned this pull request Jun 5, 2026

fix(scan): gate on the unsuppressed population by default (secure --fail-on) — supersedes #24, #25 #28

Merged

tachyon-beep mentioned this pull request Jun 5, 2026

fix(scan): require explicit trust for .wardline/judged.yaml suppressions #24

Closed

tachyon-beep closed this Jun 5, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix scan fail-on bypass via repository-controlled suppressions#25

Fix scan fail-on bypass via repository-controlled suppressions#25
tachyon-beep wants to merge 1 commit into
mainfrom
codex/fix-vulnerability-bypassing-scan-fail-on-gate

tachyon-beep commented Jun 5, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 5, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 5, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 5, 2026

Uh oh!

tachyon-beep commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tachyon-beep commented Jun 5, 2026

Motivation

Description

Testing

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

tachyon-beep commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant