Skip to content

Refs #406: Harden submission gate API bounty parsing#478

Open
er1c-cartman wants to merge 3 commits into
ramimbo:mainfrom
er1c-cartman:fix-406-submission-gate-api-validation
Open

Refs #406: Harden submission gate API bounty parsing#478
er1c-cartman wants to merge 3 commits into
ramimbo:mainfrom
er1c-cartman:fix-406-submission-gate-api-validation

Conversation

@er1c-cartman
Copy link
Copy Markdown

@er1c-cartman er1c-cartman commented May 27, 2026

Summary

  • harden scripts/submission_quality_gate.py live API bounty parsing against malformed UTF-8, malformed rows, missing authoritative fields, and full API pages
  • require matching API bounty rows to provide valid issue_number, non-empty status, and integer-coercible awards_remaining before the gate treats API payability as verified
  • add focused regression coverage for bad UTF-8 fallback warnings, missing API fields, and the 200-row API safety cap

Refs #406.

Evidence

The submission quality gate is a preflight tool for bounty PRs. Its live path already warns when API bounty payability cannot be verified, but _load_api_bounties() could still let malformed API data through: invalid UTF-8 escaped the RuntimeError fallback boundary, missing status defaulted to open, missing awards_remaining could be surfaced as an incomplete authoritative row, and a full API page was not treated as a truncation risk. This patch keeps the existing fallback behavior while making the API boundary explicit and test-covered.

Live API sanity check: https://api.mrwk.ltclab.site/api/v1/bounties?status=open&limit=200 returned a list with 7 open bounty rows, including expected id, repo, issue_number, status, and awards_remaining fields.

Validation

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 27 passed
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest -q -> 418 passed
  • ~/.local/bin/uv run --extra dev ruff check . -> passed
  • ~/.local/bin/uv run --extra dev ruff format --check . -> 79 files already formatted
  • ~/.local/bin/uv run --extra dev mypy app scripts/submission_quality_gate.py -> success
  • git diff --check -> clean
  • changed-file secret scan -> no matches

Note: pytest plugin autoload is disabled because this workstation has an unrelated ROS Humble pytest plugin on the global path.

Summary by CodeRabbit

  • Bug Fixes

    • Stricter validation of bounty data from external APIs: malformed entries now fail validation instead of being silently ignored; boolean/missing fields are rejected and numeric fields normalized.
    • Enforced a safety cap on fetched bounties and fail-fast behavior for unexpected payload shapes.
    • Improved handling of invalid/invalidly-encoded API responses and honoring API-provided bounty state when present.
  • Tests

    • Added tests for malformed bounty objects, safety-cap behavior, and invalid UTF-8 / decoding edge cases.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 8b2859d5-912a-4d3e-b48a-efb8e65ce413

📥 Commits

Reviewing files that changed from the base of the PR and between aa16b14 and 2708989.

📒 Files selected for processing (2)
  • scripts/submission_quality_gate.py
  • tests/test_submission_quality_gate.py

📝 Walkthrough

Walkthrough

Hardens the API bounty loader: adds API_BOUNTY_SAFETY_CAP, sends explicit limit, enforces cap and payload-type checks, tightens per-row validation and awards_remaining normalization (raising RuntimeError on malformed rows), adds UnicodeDecodeError handling, and extends tests for UTF-8 and malformed-payload edge cases.

Changes

API Bounty Loader Hardening

Layer / File(s) Summary
Safety cap constant and request limit
scripts/submission_quality_gate.py
API_BOUNTY_SAFETY_CAP constant added; _load_api_bounties() passes limit=API_BOUNTY_SAFETY_CAP to the API and rejects responses whose length reaches or exceeds the cap (raises RuntimeError).
API response validation and normalization
scripts/submission_quality_gate.py
Per-row parsing tightened: rejects invalid/boolean issue_number, requires non-empty string status (used as state), rejects missing/None/boolean awards_remaining, and normalizes awards_remaining to int; malformed rows now raise RuntimeError rather than being skipped.
Attempts fetch error handling
scripts/submission_quality_gate.py
_load_api_attempts() now also catches UnicodeDecodeError when decoding API response bodies.
Live-context bounty state wiring
scripts/submission_quality_gate.py
_load_live_context() prefers api_bounty["state"] when present, falling back to the GitHub issue state.
Test coverage for API loader edge cases
tests/test_submission_quality_gate.py
Adds pytest import and tests: live-mode test for API-derived closed bounty state, tests simulating invalid UTF-8 in bounty list and attempts responses (assert load warnings and warn results), parameterized unit tests asserting _load_api_bounties raises RuntimeError for malformed bounty items, and a safety-cap enforcement test.

Possibly related PRs

  • ramimbo/mergework#385: Overlapping modifications to _load_api_bounties() bounty object validation and payload normalization paths.
  • ramimbo/mergework#325: Introduced the submission gate’s live API bounty logic and initial attempts handling that this PR hardens.
🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Refs #406: Harden submission gate API bounty parsing' directly names the main changed surface (submission_quality_gate.py API parsing) and references the issue.
Description check ✅ Passed The description includes all key sections: Summary of changes, Evidence with context and validation steps, and a detailed comment history showing iterative fixes with validation evidence.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Mergework Public Artifact Hygiene ✅ Passed PR modifies only code/tests, not README/docs. Existing docs properly describe MRWK as native ledger coin; no investment, price, or cash-out claims present.
Bounty Pr Focus ✅ Passed PR scope focused on bounty #406: validates API bounties strictly, enforces 200-item safety cap, prefers API state over GitHub state, handles UTF-8 errors, tests cover all hardening scenarios.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@GHX5T-SOL GHX5T-SOL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed PR #478 at current head 3efeac3745abde5b7ec7858ff568e3c18ecfbdff.

I would hold this because it closes the malformed UTF-8 fallback for the bounty-list API, but the same failure mode still exists in the active-attempts API path used by the live submission gate.

Evidence:

  • _load_api_bounties() now catches UnicodeDecodeError and converts bad API bytes into a RuntimeError, which _load_live_context() can turn into a warning.
  • _load_api_attempts() still decodes the attempts response with response.read().decode("utf-8") but only catches HTTPError, OSError, URLError, and json.JSONDecodeError.
  • _load_live_context() calls _load_api_attempts() inside an except RuntimeError block only. If /api/v1/bounties/{id}/attempts returns malformed UTF-8, the UnicodeDecodeError escapes and the gate crashes instead of marking active-attempt lookup unverified.

Minimal local repro on this head:

from scripts import submission_quality_gate as gate

class BadUtf8Response:
    def __enter__(self): return self
    def __exit__(self, *args): return None
    def read(self): return b"\xff\xfe"

gate.urlopen = lambda url, timeout: BadUtf8Response()
gate._load_api_attempts("https://api.example.test", 66)

Observed result:

UnicodeDecodeError
'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I also reproduced the full live-context shape by returning a valid bounty-list row for #406 and malformed UTF-8 for /api/v1/bounties/66/attempts; _load_live_context() raised the same UnicodeDecodeError. Expected behavior should match the rest of this PR: wrap attempts decode failures in RuntimeError so the caller can emit the existing active_attempts unavailable warning.

Validation run locally:

  • uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 27 passed
  • uv run --extra dev python -m pytest tests/test_pr_queue_health.py -q -> 9 passed
  • uv run --extra dev python -m pytest -q -> 418 passed
  • uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
  • uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
  • uv run --extra dev python -m mypy scripts/submission_quality_gate.py -> success
  • uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok
  • git diff --check origin/main...HEAD -> clean
  • gitleaks detect --no-banner --redact --source . --log-opts origin/main..HEAD --exit-code 1 -> no leaks found
  • Live gate smoke for a Refs #406 draft returned warn only because the current PR itself is a similar open PR; bounty payability, maintainer activity, and active-attempt checks passed.

No wallet material, private keys, private data, signatures, price claims, exchange claims, liquidity claims, or off-ramp promises were used.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: ddcd564c-b17f-4611-830d-a38f902f2e83

📥 Commits

Reviewing files that changed from the base of the PR and between d8532d4 and 3efeac3.

📒 Files selected for processing (2)
  • scripts/submission_quality_gate.py
  • tests/test_submission_quality_gate.py

Comment thread scripts/submission_quality_gate.py
@er1c-cartman
Copy link
Copy Markdown
Author

Thanks, agreed. I pushed aa16b14 to keep the active-attempts API on the same fallback boundary as the bounty-list API.

What changed:

  • _load_api_attempts() now wraps malformed UTF-8 responses in RuntimeError, so _load_live_context() records active_attempts_verified = False and emits the existing active attempts unavailable warning instead of crashing.
  • Added test_submission_quality_gate_live_context_warns_on_bad_attempts_utf8() covering the full live-context shape: valid bounty-list row, malformed /api/v1/bounties/{id}/attempts bytes, warning output, and no escaped UnicodeDecodeError.

Fresh validation:

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 28 passed
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest -q -> 419 passed
  • ~/.local/bin/uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
  • ~/.local/bin/uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
  • ~/.local/bin/uv run --extra dev mypy app scripts/submission_quality_gate.py -> success
  • git diff --check -> clean
  • changed-file secret scan -> no matches

Copy link
Copy Markdown
Contributor

@GHX5T-SOL GHX5T-SOL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed current head aa16b142fc316bd36e62288f0a0d9c60d91f5083 after the author follow-up.

The blocker from my previous review is fixed. _load_api_attempts() now catches UnicodeDecodeError and wraps it in RuntimeError, so _load_live_context() can mark active attempts as unverified and emit the existing warning instead of crashing. The added regression covers the full live-context path with a valid bounty-list row plus malformed /api/v1/bounties/{id}/attempts bytes.

Validation run locally on the updated head:

  • uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 28 passed
  • uv run --extra dev python -m pytest tests/test_pr_queue_health.py -q -> 9 passed
  • uv run --extra dev python -m pytest -q -> 419 passed
  • uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
  • uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
  • uv run --extra dev python -m mypy scripts/submission_quality_gate.py -> success
  • uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok
  • git diff --check origin/main...HEAD -> clean
  • gitleaks detect --no-banner --redact --source . --log-opts origin/main..HEAD --exit-code 1 -> no leaks found

GitHub quality check is successful. CodeRabbit was still processing the updated head at the time of this review, so I checked the changed surface locally. This looks ready from my review.

@jtc268
Copy link
Copy Markdown

jtc268 commented May 27, 2026

Reviewed current head aa16b142fc316bd36e62288f0a0d9c60d91f5083 as non-author jtc268.

I would hold this because PR #478 validates API status but _load_live_context() still writes the GitHub issue state into the evaluated bounty at scripts/submission_quality_gate.py:535. That means the newly validated API status is not actually used by _bounty_is_payable().

Concrete repro on this head: mock gh issue list --state all to return bounty #319 as CLOSED, while _load_api_bounties() returns the authoritative MergeWork API row {state: "open", awards_remaining: 1} for the same issue. _load_live_context() produces a bounty with state: "CLOSED", awards_remaining: 1, and payability_verified: true; evaluate_submission() then emits bounty_payable: fail with referenced bounty #319 is closed or exhausted.

Observed repro output:

{"active_attempts": [], "active_attempts_verified": true, "awards_remaining": 1, "id": 11, "maintainer_activity_verified": false, "max_maintainer_age_days": 14, "number": 319, "payability_verified": true, "state": "CLOSED", "title": "MRWK bounty: gate"}
[{"message": "referenced bounty #319 is closed or exhausted", "name": "bounty_payable", "status": "fail"}]

Expected fix: when an API bounty row is present, propagate api_bounty.get("state", issue.get("state")) into the live-context bounty object, with a regression that proves API-open / GitHub-closed evaluates from the API row instead of false-failing.

Validation: PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 28 passed; focused repro above run locally on the current PR head. No wallet material, private keys, private data, signatures, price claims, exchange claims, liquidity claims, or off-ramp promises were used.

@er1c-cartman
Copy link
Copy Markdown
Author

Follow-up for CodeRabbit's API status propagation warning is pushed in 27089896ed0c6d13f4be227820b0daf8b91ec926.

What changed:

  • _load_live_context() now carries the validated public API bounty state into the live context when API data is available, instead of falling back to the GitHub issue state for the same bounty row.
  • Added a regression where GitHub reports the bounty issue as OPEN, but the API row reports closed with awards_remaining > 0; the submission gate now fails it as not payable.

Validation:

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 29 passed
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest -q -> 420 passed
  • ~/.local/bin/uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
  • ~/.local/bin/uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
  • ~/.local/bin/uv run --extra dev mypy app scripts/submission_quality_gate.py -> passed
  • git diff --check -> passed
  • changed-file secret scan -> no matches

Copy link
Copy Markdown
Contributor

@GHX5T-SOL GHX5T-SOL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed current head 27089896ed0c6d13f4be227820b0daf8b91ec926.

This resolves the prior API-status propagation blocker. _load_live_context() now carries the validated MergeWork API bounty state into the evaluated bounty, so _bounty_is_payable() no longer falls back to the GitHub issue state when API bounty data is available. The new regression covers an API-closed bounty with the GitHub issue still open and confirms the gate fails it as not payable.

Validation run locally:

  • uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 29 passed in 1.14s.
  • uv run --extra dev python -m pytest -q -> 420 passed in 87.10s.
  • uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok.
  • uv run --extra dev python -m mypy app scripts/submission_quality_gate.py -> Success: no issues found in 31 source files.
  • uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> All checks passed!.
  • uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted.
  • git diff --check origin/main...HEAD -> clean.
  • git diff origin/main...HEAD | gitleaks stdin --no-banner --redact --exit-code 1 -> no leaks found.

Live readback: PR is open/non-draft, mergeable CLEAN, CI successful, and CodeRabbit reports no actionable comments on the current head.

Copy link
Copy Markdown
Contributor

@GHX5T-SOL GHX5T-SOL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed current head 27089896ed0c6d13f4be227820b0daf8b91ec926 after the API status propagation follow-up.

I do not see a blocker on this head. The earlier malformed-attempts UTF-8 issue remains fixed, and the new state propagation change makes _load_live_context() carry the validated API bounty state into the evaluated bounty row with api_bounty.get("state", issue.get("state")). That resolves the prior false-fail path where the API row says the bounty is open/payable while the GitHub issue state is closed.

Validation run locally on the reviewed head:

  • /home/kali/money/mergework/.venv/bin/python -m pytest tests/test_submission_quality_gate.py -q -> 29 passed
  • /home/kali/money/mergework/.venv/bin/python -m pytest tests/test_pr_queue_health.py -q -> 9 passed
  • /home/kali/money/mergework/.venv/bin/python -m pytest -q -> 420 passed
  • /home/kali/money/mergework/.venv/bin/python -m ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
  • /home/kali/money/mergework/.venv/bin/python -m ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
  • /home/kali/money/mergework/.venv/bin/python -m mypy scripts/submission_quality_gate.py -> success, no issues in 1 source file
  • /home/kali/money/mergework/.venv/bin/python scripts/docs_smoke.py -> docs smoke ok
  • git diff --check upstream/main...HEAD -> clean
  • gitleaks detect --no-banner --redact --source . --log-opts upstream/main..HEAD --exit-code 1 -> no leaks found
  • Local explicit API-open/GitHub-closed repro now preserves API state and does not emit the closed/exhausted false fail.

GitHub CI and CodeRabbit are successful on the reviewed head.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants