Refs #406: Harden submission gate API bounty parsing by er1c-cartman · Pull Request #478 · ramimbo/mergework

er1c-cartman · 2026-05-27T00:01:54Z

Summary

harden scripts/submission_quality_gate.py live API bounty parsing against malformed UTF-8, malformed rows, missing authoritative fields, and full API pages
require matching API bounty rows to provide valid issue_number, non-empty status, and integer-coercible awards_remaining before the gate treats API payability as verified
add focused regression coverage for bad UTF-8 fallback warnings, missing API fields, and the 200-row API safety cap

Refs #406.

Evidence

The submission quality gate is a preflight tool for bounty PRs. Its live path already warns when API bounty payability cannot be verified, but _load_api_bounties() could still let malformed API data through: invalid UTF-8 escaped the RuntimeError fallback boundary, missing status defaulted to open, missing awards_remaining could be surfaced as an incomplete authoritative row, and a full API page was not treated as a truncation risk. This patch keeps the existing fallback behavior while making the API boundary explicit and test-covered.

Live API sanity check: https://api.mrwk.ltclab.site/api/v1/bounties?status=open&limit=200 returned a list with 7 open bounty rows, including expected id, repo, issue_number, status, and awards_remaining fields.

Validation

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 27 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest -q -> 418 passed
~/.local/bin/uv run --extra dev ruff check . -> passed
~/.local/bin/uv run --extra dev ruff format --check . -> 79 files already formatted
~/.local/bin/uv run --extra dev mypy app scripts/submission_quality_gate.py -> success
git diff --check -> clean
changed-file secret scan -> no matches

Note: pytest plugin autoload is disabled because this workstation has an unrelated ROS Humble pytest plugin on the global path.

Summary by CodeRabbit

Bug Fixes
- Stricter validation of bounty data from external APIs: malformed entries now fail validation instead of being silently ignored; boolean/missing fields are rejected and numeric fields normalized.
- Enforced a safety cap on fetched bounties and fail-fast behavior for unexpected payload shapes.
- Improved handling of invalid/invalidly-encoded API responses and honoring API-provided bounty state when present.
Tests
- Added tests for malformed bounty objects, safety-cap behavior, and invalid UTF-8 / decoding edge cases.

coderabbitai · 2026-05-27T00:03:46Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 8b2859d5-912a-4d3e-b48a-efb8e65ce413

📥 Commits

Reviewing files that changed from the base of the PR and between aa16b14 and 2708989.

📒 Files selected for processing (2)

scripts/submission_quality_gate.py
tests/test_submission_quality_gate.py

📝 Walkthrough

Walkthrough

Hardens the API bounty loader: adds API_BOUNTY_SAFETY_CAP, sends explicit limit, enforces cap and payload-type checks, tightens per-row validation and awards_remaining normalization (raising RuntimeError on malformed rows), adds UnicodeDecodeError handling, and extends tests for UTF-8 and malformed-payload edge cases.

Changes

API Bounty Loader Hardening

Layer / File(s)	Summary
Safety cap constant and request limit `scripts/submission_quality_gate.py`	`API_BOUNTY_SAFETY_CAP` constant added; `_load_api_bounties()` passes `limit=API_BOUNTY_SAFETY_CAP` to the API and rejects responses whose length reaches or exceeds the cap (raises `RuntimeError`).
API response validation and normalization `scripts/submission_quality_gate.py`	Per-row parsing tightened: rejects invalid/boolean `issue_number`, requires non-empty string `status` (used as `state`), rejects missing/None/boolean `awards_remaining`, and normalizes `awards_remaining` to `int`; malformed rows now raise `RuntimeError` rather than being skipped.
Attempts fetch error handling `scripts/submission_quality_gate.py`	`_load_api_attempts()` now also catches `UnicodeDecodeError` when decoding API response bodies.
Live-context bounty state wiring `scripts/submission_quality_gate.py`	`_load_live_context()` prefers `api_bounty["state"]` when present, falling back to the GitHub issue `state`.
Test coverage for API loader edge cases `tests/test_submission_quality_gate.py`	Adds `pytest` import and tests: live-mode test for API-derived closed bounty state, tests simulating invalid UTF-8 in bounty list and attempts responses (assert load warnings and warn results), parameterized unit tests asserting `_load_api_bounties` raises `RuntimeError` for malformed bounty items, and a safety-cap enforcement test.

Possibly related PRs

ramimbo/mergework#385: Overlapping modifications to _load_api_bounties() bounty object validation and payload normalization paths.
ramimbo/mergework#325: Introduced the submission gate’s live API bounty logic and initial attempts handling that this PR hardens.

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Refs `#406`: Harden submission gate API bounty parsing' directly names the main changed surface (submission_quality_gate.py API parsing) and references the issue.
Description check	✅ Passed	The description includes all key sections: Summary of changes, Evidence with context and validation steps, and a detailed comment history showing iterative fixes with validation evidence.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Mergework Public Artifact Hygiene	✅ Passed	PR modifies only code/tests, not README/docs. Existing docs properly describe MRWK as native ledger coin; no investment, price, or cash-out claims present.
Bounty Pr Focus	✅ Passed	PR scope focused on bounty `#406`: validates API bounties strictly, enforces 200-item safety cap, prefers API state over GitHub state, handles UTF-8 errors, tests cover all hardening scenarios.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

GHX5T-SOL

Reviewed PR #478 at current head 3efeac3745abde5b7ec7858ff568e3c18ecfbdff.

I would hold this because it closes the malformed UTF-8 fallback for the bounty-list API, but the same failure mode still exists in the active-attempts API path used by the live submission gate.

Evidence:

_load_api_bounties() now catches UnicodeDecodeError and converts bad API bytes into a RuntimeError, which _load_live_context() can turn into a warning.
_load_api_attempts() still decodes the attempts response with response.read().decode("utf-8") but only catches HTTPError, OSError, URLError, and json.JSONDecodeError.
_load_live_context() calls _load_api_attempts() inside an except RuntimeError block only. If /api/v1/bounties/{id}/attempts returns malformed UTF-8, the UnicodeDecodeError escapes and the gate crashes instead of marking active-attempt lookup unverified.

Minimal local repro on this head:

from scripts import submission_quality_gate as gate

class BadUtf8Response:
    def __enter__(self): return self
    def __exit__(self, *args): return None
    def read(self): return b"\xff\xfe"

gate.urlopen = lambda url, timeout: BadUtf8Response()
gate._load_api_attempts("https://api.example.test", 66)

Observed result:

UnicodeDecodeError
'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I also reproduced the full live-context shape by returning a valid bounty-list row for #406 and malformed UTF-8 for /api/v1/bounties/66/attempts; _load_live_context() raised the same UnicodeDecodeError. Expected behavior should match the rest of this PR: wrap attempts decode failures in RuntimeError so the caller can emit the existing active_attempts unavailable warning.

Validation run locally:

uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 27 passed
uv run --extra dev python -m pytest tests/test_pr_queue_health.py -q -> 9 passed
uv run --extra dev python -m pytest -q -> 418 passed
uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
uv run --extra dev python -m mypy scripts/submission_quality_gate.py -> success
uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok
git diff --check origin/main...HEAD -> clean
gitleaks detect --no-banner --redact --source . --log-opts origin/main..HEAD --exit-code 1 -> no leaks found
Live gate smoke for a Refs #406 draft returned warn only because the current PR itself is a similar open PR; bounty payability, maintainer activity, and active-attempt checks passed.

No wallet material, private keys, private data, signatures, price claims, exchange claims, liquidity claims, or off-ramp promises were used.

coderabbitai

Actionable comments posted: 1

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: ddcd564c-b17f-4611-830d-a38f902f2e83

📥 Commits

Reviewing files that changed from the base of the PR and between d8532d4 and 3efeac3.

📒 Files selected for processing (2)

scripts/submission_quality_gate.py
tests/test_submission_quality_gate.py

er1c-cartman · 2026-05-27T00:10:21Z

Thanks, agreed. I pushed aa16b14 to keep the active-attempts API on the same fallback boundary as the bounty-list API.

What changed:

_load_api_attempts() now wraps malformed UTF-8 responses in RuntimeError, so _load_live_context() records active_attempts_verified = False and emits the existing active attempts unavailable warning instead of crashing.
Added test_submission_quality_gate_live_context_warns_on_bad_attempts_utf8() covering the full live-context shape: valid bounty-list row, malformed /api/v1/bounties/{id}/attempts bytes, warning output, and no escaped UnicodeDecodeError.

Fresh validation:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 28 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest -q -> 419 passed
~/.local/bin/uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
~/.local/bin/uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
~/.local/bin/uv run --extra dev mypy app scripts/submission_quality_gate.py -> success
git diff --check -> clean
changed-file secret scan -> no matches

GHX5T-SOL

Re-reviewed current head aa16b142fc316bd36e62288f0a0d9c60d91f5083 after the author follow-up.

The blocker from my previous review is fixed. _load_api_attempts() now catches UnicodeDecodeError and wraps it in RuntimeError, so _load_live_context() can mark active attempts as unverified and emit the existing warning instead of crashing. The added regression covers the full live-context path with a valid bounty-list row plus malformed /api/v1/bounties/{id}/attempts bytes.

Validation run locally on the updated head:

uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 28 passed
uv run --extra dev python -m pytest tests/test_pr_queue_health.py -q -> 9 passed
uv run --extra dev python -m pytest -q -> 419 passed
uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
uv run --extra dev python -m mypy scripts/submission_quality_gate.py -> success
uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok
git diff --check origin/main...HEAD -> clean
gitleaks detect --no-banner --redact --source . --log-opts origin/main..HEAD --exit-code 1 -> no leaks found

GitHub quality check is successful. CodeRabbit was still processing the updated head at the time of this review, so I checked the changed surface locally. This looks ready from my review.

jtc268 · 2026-05-27T00:14:25Z

Reviewed current head aa16b142fc316bd36e62288f0a0d9c60d91f5083 as non-author jtc268.

I would hold this because PR #478 validates API status but _load_live_context() still writes the GitHub issue state into the evaluated bounty at scripts/submission_quality_gate.py:535. That means the newly validated API status is not actually used by _bounty_is_payable().

Concrete repro on this head: mock gh issue list --state all to return bounty #319 as CLOSED, while _load_api_bounties() returns the authoritative MergeWork API row {state: "open", awards_remaining: 1} for the same issue. _load_live_context() produces a bounty with state: "CLOSED", awards_remaining: 1, and payability_verified: true; evaluate_submission() then emits bounty_payable: fail with referenced bounty #319 is closed or exhausted.

Observed repro output:

{"active_attempts": [], "active_attempts_verified": true, "awards_remaining": 1, "id": 11, "maintainer_activity_verified": false, "max_maintainer_age_days": 14, "number": 319, "payability_verified": true, "state": "CLOSED", "title": "MRWK bounty: gate"}
[{"message": "referenced bounty #319 is closed or exhausted", "name": "bounty_payable", "status": "fail"}]

Expected fix: when an API bounty row is present, propagate api_bounty.get("state", issue.get("state")) into the live-context bounty object, with a regression that proves API-open / GitHub-closed evaluates from the API row instead of false-failing.

Validation: PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 28 passed; focused repro above run locally on the current PR head. No wallet material, private keys, private data, signatures, price claims, exchange claims, liquidity claims, or off-ramp promises were used.

er1c-cartman · 2026-05-27T00:17:38Z

Follow-up for CodeRabbit's API status propagation warning is pushed in 27089896ed0c6d13f4be227820b0daf8b91ec926.

What changed:

_load_live_context() now carries the validated public API bounty state into the live context when API data is available, instead of falling back to the GitHub issue state for the same bounty row.
Added a regression where GitHub reports the bounty issue as OPEN, but the API row reports closed with awards_remaining > 0; the submission gate now fails it as not payable.

Validation:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 29 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ~/.local/bin/uv run --extra dev python -m pytest -q -> 420 passed
~/.local/bin/uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
~/.local/bin/uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
~/.local/bin/uv run --extra dev mypy app scripts/submission_quality_gate.py -> passed
git diff --check -> passed
changed-file secret scan -> no matches

GHX5T-SOL

Re-reviewed current head 27089896ed0c6d13f4be227820b0daf8b91ec926.

This resolves the prior API-status propagation blocker. _load_live_context() now carries the validated MergeWork API bounty state into the evaluated bounty, so _bounty_is_payable() no longer falls back to the GitHub issue state when API bounty data is available. The new regression covers an API-closed bounty with the GitHub issue still open and confirms the gate fails it as not payable.

Validation run locally:

uv run --extra dev python -m pytest tests/test_submission_quality_gate.py -q -> 29 passed in 1.14s.
uv run --extra dev python -m pytest -q -> 420 passed in 87.10s.
uv run --extra dev python scripts/docs_smoke.py -> docs smoke ok.
uv run --extra dev python -m mypy app scripts/submission_quality_gate.py -> Success: no issues found in 31 source files.
uv run --extra dev ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> All checks passed!.
uv run --extra dev ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted.
git diff --check origin/main...HEAD -> clean.
git diff origin/main...HEAD | gitleaks stdin --no-banner --redact --exit-code 1 -> no leaks found.

Live readback: PR is open/non-draft, mergeable CLEAN, CI successful, and CodeRabbit reports no actionable comments on the current head.

GHX5T-SOL

Re-reviewed current head 27089896ed0c6d13f4be227820b0daf8b91ec926 after the API status propagation follow-up.

I do not see a blocker on this head. The earlier malformed-attempts UTF-8 issue remains fixed, and the new state propagation change makes _load_live_context() carry the validated API bounty state into the evaluated bounty row with api_bounty.get("state", issue.get("state")). That resolves the prior false-fail path where the API row says the bounty is open/payable while the GitHub issue state is closed.

Validation run locally on the reviewed head:

/home/kali/money/mergework/.venv/bin/python -m pytest tests/test_submission_quality_gate.py -q -> 29 passed
/home/kali/money/mergework/.venv/bin/python -m pytest tests/test_pr_queue_health.py -q -> 9 passed
/home/kali/money/mergework/.venv/bin/python -m pytest -q -> 420 passed
/home/kali/money/mergework/.venv/bin/python -m ruff check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> passed
/home/kali/money/mergework/.venv/bin/python -m ruff format --check scripts/submission_quality_gate.py tests/test_submission_quality_gate.py -> 2 files already formatted
/home/kali/money/mergework/.venv/bin/python -m mypy scripts/submission_quality_gate.py -> success, no issues in 1 source file
/home/kali/money/mergework/.venv/bin/python scripts/docs_smoke.py -> docs smoke ok
git diff --check upstream/main...HEAD -> clean
gitleaks detect --no-banner --redact --source . --log-opts upstream/main..HEAD --exit-code 1 -> no leaks found
Local explicit API-open/GitHub-closed repro now preserves API state and does not emit the closed/exhausted false fail.

GitHub CI and CodeRabbit are successful on the reviewed head.

Harden submission gate API bounty parsing

3efeac3

er1c-cartman mentioned this pull request May 27, 2026

MRWK bounty: useful bug reports and small fixes, round 5 #406

Open

GHX5T-SOL suggested changes May 27, 2026

View reviewed changes

GHX5T-SOL mentioned this pull request May 27, 2026

MRWK bounty: review open MergeWork PRs with evidence, round 12 #447

Open

coderabbitai Bot reviewed May 27, 2026

View reviewed changes

Comment thread scripts/submission_quality_gate.py

Handle malformed attempts API bytes in submission gate

aa16b14

GHX5T-SOL approved these changes May 27, 2026

View reviewed changes

Use API bounty status in submission gate

2708989

GHX5T-SOL approved these changes May 27, 2026

View reviewed changes

adliebe mentioned this pull request May 27, 2026

Refs #406: Warn when live submission gate hits GitHub caps #493

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Refs #406: Harden submission gate API bounty parsing#478

Refs #406: Harden submission gate API bounty parsing#478
er1c-cartman wants to merge 3 commits into
ramimbo:mainfrom
er1c-cartman:fix-406-submission-gate-api-validation

er1c-cartman commented May 27, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 27, 2026 •

edited

Loading

Walkthrough

Changes

Possibly related PRs

Uh oh!

GHX5T-SOL left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

er1c-cartman commented May 27, 2026

Uh oh!

GHX5T-SOL left a comment

Uh oh!

jtc268 commented May 27, 2026

Uh oh!

er1c-cartman commented May 27, 2026

Uh oh!

GHX5T-SOL left a comment

Uh oh!

GHX5T-SOL left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

er1c-cartman commented May 27, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Evidence

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Possibly related PRs

Uh oh!

GHX5T-SOL left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

er1c-cartman commented May 27, 2026

Uh oh!

GHX5T-SOL left a comment

Choose a reason for hiding this comment

Uh oh!

jtc268 commented May 27, 2026

Uh oh!

er1c-cartman commented May 27, 2026

Uh oh!

GHX5T-SOL left a comment

Choose a reason for hiding this comment

Uh oh!

GHX5T-SOL left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

er1c-cartman commented May 27, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 27, 2026 •

edited

Loading