
test: add large PR review harness #385

Merged
rianjs merged 1 commit into main from issue-375-large-pr-context-stuffing-harness on Jun 24, 2026

Conversation

@rianjs rianjs commented Jun 24, 2026

Contributor

Closes #375.

Summary

  • add scripts/verify-large-pr-review.sh for large-PR no-post review regression checks
  • add a build-tagged JSON field helper for structured cr review --json extraction
  • validate artifact shape, source sentinel absence from model-facing artifacts, prompt/context artifact size limits, stderr LLM task breadcrumbs, and full durable-task reuse across two unchanged passes
  • include --self-test for credential-free positive and negative invariant checks
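The sentinel-absence and size-limit invariants above, together with the positive and negative self-test idea, can be sketched roughly as follows. This is a minimal Go illustration, not the code in scripts/verify-large-pr-review.sh: the `artifact` type, the `checkArtifacts` function, the sentinel string, and the byte limit are all hypothetical names and values chosen for the example.

```go
package main

import (
	"bytes"
	"fmt"
)

// artifact is a hypothetical stand-in for one model-facing file the harness
// inspects, such as a rendered prompt or a context bundle.
type artifact struct {
	Name string
	Body []byte
}

// checkArtifacts returns one message per violated invariant:
//   - no artifact may contain the source sentinel, and
//   - no artifact may exceed maxBytes.
//
// An empty result means every artifact passed both checks.
func checkArtifacts(arts []artifact, sentinel string, maxBytes int) []string {
	var violations []string
	for _, a := range arts {
		if bytes.Contains(a.Body, []byte(sentinel)) {
			violations = append(violations,
				fmt.Sprintf("%s: contains source sentinel", a.Name))
		}
		if len(a.Body) > maxBytes {
			violations = append(violations,
				fmt.Sprintf("%s: %d bytes exceeds limit %d", a.Name, len(a.Body), maxBytes))
		}
	}
	return violations
}

func main() {
	// Hypothetical marker planted in the source under review.
	const sentinel = "SENTINEL_DO_NOT_LEAK"
	// Hypothetical per-artifact size limit in bytes.
	const limit = 64

	clean := []artifact{{Name: "prompt.txt", Body: []byte("summarize the diff")}}
	leaky := []artifact{{Name: "context.txt", Body: []byte("raw file: SENTINEL_DO_NOT_LEAK")}}

	// Positive case: a clean artifact set must produce no violations.
	fmt.Println("clean violations:", len(checkArtifacts(clean, sentinel, limit)))
	// Negative case: the check must actually fire when the sentinel leaks.
	fmt.Println("leaky violations:", len(checkArtifacts(leaky, sentinel, limit)))
}
```

Running both a passing and a deliberately failing input is what makes a credential-free self-test useful: it shows the check can fail, not only that it can pass.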

Verification

  • rtk scripts/verify-large-pr-review.sh --help
  • rtk scripts/verify-large-pr-review.sh --self-test
  • rtk go test ./internal/pipeline ./internal/cmd/reviewcmd ./internal/cmd/benchmarkcmd -count=1
  • rtk make lint
  • rtk make build

Empirical Harness Note

The real harness mode is dry-run/no-post and isolates data/cache state while using the maintainer's normal cr config. I started one local real pass with CR_LARGE_PR_KEEP_TMP=1; it blocked during local runtime construction, before any PR fetch or LLM breadcrumbs appeared, so I terminated that harness process tree rather than treating it as a product signal. The script now has a per-pass timeout to make that failure mode explicit.
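The per-pass timeout idea can be illustrated with a short Go sketch. The harness itself is a shell script, so this is not its implementation; `runPass` is a hypothetical helper, and `sleep` merely stands in for a pass that blocks (assuming a Unix-like system).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// runPass runs one command with a deadline and reports whether the deadline
// was what ended it. exec.CommandContext kills only the direct child when the
// context expires; cleaning up a whole process tree would need OS-specific
// handling that this sketch does not attempt.
func runPass(timeout time.Duration, name string, args ...string) (timedOut bool, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	err = exec.CommandContext(ctx, name, args...).Run()
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		// Report the timeout explicitly instead of a bare "signal: killed".
		return true, fmt.Errorf("pass exceeded %s: %w", timeout, context.DeadlineExceeded)
	}
	return false, err
}

func main() {
	// A 5-second sleep against a 200ms budget simulates a blocked pass.
	timedOut, err := runPass(200*time.Millisecond, "sleep", "5")
	fmt.Println("timed out:", timedOut, "err:", err)
}
```

Separating "the pass timed out" from "the pass failed" keeps a hang in local setup from being read as a product regression.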

@rianjs rianjs force-pushed the issue-375-large-pr-context-stuffing-harness branch from d2ff84e to 74ba978 Compare June 24, 2026 22:40
@rianjs rianjs merged commit 8e39fd1 into main Jun 24, 2026
10 checks passed
@rianjs rianjs deleted the issue-375-large-pr-context-stuffing-harness branch June 24, 2026 22:50

Development

Successfully merging this pull request may close these issues.

Add large-PR context stuffing regression harness

1 participant