Skip to content

feat: add reusable thread lifecycle pipeline#359

Draft
rianjs wants to merge 18 commits into
mainfrom
feat/thread-lifecycle-architecture
Draft

feat: add reusable thread lifecycle pipeline#359
rianjs wants to merge 18 commits into
mainfrom
feat/thread-lifecycle-architecture

Conversation

@rianjs

@rianjs rianjs commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Decomposes inline discussion handling into reusable thread lifecycle building blocks for both full cr review and the new cr respond command.

  • adds shared thread context normalization, resolved-thread summary compaction, and CR-authored thread detection
  • adds shared thread analysis and planned-action projection so replies/resolutions flow through reviewplan, ledger, and outbox
  • adds shared stage model resolution and LLM-run persistence contracts with architecture docs and guardrails
  • wires full review to analyze inline thread lifecycle before reviewer agents and feed compact context into prompts
  • adds cr respond for reusable thread response planning and retryable posting semantics
  • preserves safe JSON unknown-field names in structured-output retry prompts, which the local dry run needed to recover from reviewer schema validation

Validation

  • rtk go test ./internal/threadrespond ./internal/cmd/reviewcmd
  • rtk go test ./internal/pipeline
  • rtk go test ./internal/cmd/...
  • rtk go test ./internal/llm
  • rtk go test ./... (2438 passed in 49 packages)
  • rtk go build ./cmd/cr
  • local build dry run: /tmp/cr-thread-lifecycle review https://github.com/open-cli-collective/codereview-cli/pull/359 --dry-run --max-agents 3 --allow-self-review --json --verbose

Dry-run Result

Run d09716b5-8129-4668-89f0-f8675ce768b4 completed with outcome: dry_run against head 5da53b22aaec.

  • selected reviewers: structure:repo-health, go:implementation-tests, policies:conventions
  • reviewer count respected --max-agents 3
  • planned actions stayed planned_only: 5 inline comments, 1 rollup comment, 1 submit-review action
  • thread lifecycle context: 0 discussion threads considered, 0 summarized, 0 resolved for this PR
  • wall time: 12m16s; estimated cost: ~$1.99

Notes

This PR intentionally keeps provider writes on the existing durable reviewplan/ledger/outbox path.

@rianjs rianjs changed the title [codex] add reusable thread lifecycle pipeline feat: add reusable thread lifecycle pipeline Jun 23, 2026
@rianjs rianjs force-pushed the feat/thread-lifecycle-architecture branch from 70ee83c to bbb4e15 Compare June 26, 2026 10:43
@rianjs

rianjs commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Findings

  • Major: the new dry-run rerun coverage proves --rerun gets a fresh RunID, but it never asserts a fresh artifact root. TestDryRunRerunBypassesIncompleteRunAttempt would still pass if the implementation reused the interrupted run’s artifact tree while allocating run-fresh, which is the exact regression the ticket calls out. See internal/pipeline/pipeline_test.go.
  • Minor: there is no CLI-level smoke test for cr review --dry-run --rerun; the only --rerun command assertion is on the live path. That leaves the flag plumbing for the dry-run recovery path unproven at the wrapper layer. See internal/cmd/reviewcmd/reviewcmd_test.go.
  • Minor: cr respond is only exercised through a fake runtime/responder in the command tests, so a wiring regression between the command surface and the new threadrespond pipeline could still slip through even if the lower-level package tests stay green. See internal/cmd/reviewcmd/reviewcmd_test.go.

Comment thread docs/architecture.md
artifacts, session persistence, progress breadcrumbs, and failure
classification.

The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The durable-path example is off by one important detail: task directories are created under an encoded task ID (llm-tasks/<encoded-task-id>/metadata.json), not the raw <task> string. The implementation uses path-component encoding before writing metadata, and docs/llm-task-artifacts.md already documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.

Reply inline to this comment.

if err != nil {
return preparedSelectionContext{}, err
}
if strings.TrimSpace(req.PostingIdentity.ID) != "" || strings.TrimSpace(req.PostingIdentity.Login) != "" {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectionOnly now silently drops normalized thread context whenever PostingIdentity is unset, even though SelectionRequest grew that field specifically to drive the new thread-lifecycle prompt input. That means existing callers can keep compiling while benchmarking or other selection-only flows quietly stop exercising the same selector inputs as cr review, so the changed behavior is no longer proven by the same code path. Make this contract explicit instead of degrading silently: either require PostingIdentity for selection-only runs that load inline threads, or add a focused test that covers the no-identity branch and update callers to pass the resolved posting identity so selection-only continues to match full-review behavior.

Reply inline to this comment.

Comment thread docs/architecture.md
New LLM-backed components must not call provider adapters or `internal/llm`
structured helpers directly. They should call `llmlifecycle` through explicit,
fakeable dependencies in unit tests and should return domain results rather
than posting comments or mutating provider state. This guardrail is enforced by

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. That test only fails on direct calls to llm.RunStructured*; it does not detect new components calling provider adapters directly. Reword the sentence to say the test enforces the internal/llm structured-helper boundary, or add that direct-adapter prohibition is a convention not fully covered by the guardrail test.

Reply inline to this comment.

cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repo’s command layout is organized as one top-level command package per internal/cmd/* directory, but this change introduces a new top-level respond command by folding it into reviewcmd (newRespondCommand, runRespond, ResponseRunner, extra rendering helpers, and tests) instead of giving it its own internal/cmd/respondcmd boundary. That makes the command tree less discoverable from the package map in docs/development.md, and it couples future respond-specific flags/runtime/output work to the already large reviewcmd package, which is the kind of ownership drift that gets harder to unwind after more command-specific behavior lands. Split respond into its own command package and keep only shared review/respond bootstrap helpers in a common seam so the repo’s internal/cmd/* structure continues to reflect the user-facing command surface.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 26, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 994639f56055
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 1
structure:repo-health 1
policies:conventions 0
documentation:docs 2
go:implementation-tests (1 finding)

Minor - internal/pipeline/pipeline.go:1052

SelectionOnly now silently drops normalized thread context whenever PostingIdentity is unset, even though SelectionRequest grew that field specifically to drive the new thread-lifecycle prompt input. That means existing callers can keep compiling while benchmarking or other selection-only flows quietly stop exercising the same selector inputs as cr review, so the changed behavior is no longer proven by the same code path. Make this contract explicit instead of degrading silently: either require PostingIdentity for selection-only runs that load inline threads, or add a focused test that covers the no-identity branch and update callers to pass the resolved posting identity so selection-only continues to match full-review behavior.

structure:repo-health (1 finding)

Major - internal/cmd/reviewcmd/reviewcmd.go:189

The repo’s command layout is organized as one top-level command package per internal/cmd/* directory, but this change introduces a new top-level respond command by folding it into reviewcmd (newRespondCommand, runRespond, ResponseRunner, extra rendering helpers, and tests) instead of giving it its own internal/cmd/respondcmd boundary. That makes the command tree less discoverable from the package map in docs/development.md, and it couples future respond-specific flags/runtime/output work to the already large reviewcmd package, which is the kind of ownership drift that gets harder to unwind after more command-specific behavior lands. Split respond into its own command package and keep only shared review/respond bootstrap helpers in a common seam so the repo’s internal/cmd/* structure continues to reflect the user-facing command surface.

documentation:docs (2 findings)

Minor - docs/architecture.md:16

The durable-path example is off by one important detail: task directories are created under an encoded task ID (llm-tasks/<encoded-task-id>/metadata.json), not the raw <task> string. The implementation uses path-component encoding before writing metadata, and docs/llm-task-artifacts.md already documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.

Minor - docs/architecture.md:25

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. That test only fails on direct calls to llm.RunStructured*; it does not detect new components calling provider adapters directly. Reword the sentence to say the test enforces the internal/llm structured-helper boundary, or add that direct-adapter prohibition is a convention not fully covered by the guardrail test.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/approvaloverride/approvaloverride.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/review/review.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/summary.go, internal/stagemodel/resolver_test.go Review scope was intentionally narrowed to the core thread-lifecycle, pipeline, ledger, model-resolution, and command-wiring changes.; Static review only; I did not execute the Go test suite in this read-only workbench.
structure:repo-health incomplete_skipped docs/development.md, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go internal/gateio/gateio.go, internal/llmlifecycle/lifecycle_test.go, internal/pipeline/pipeline_test.go Review scoped to the assigned files, with emphasis on long-term command/package boundaries and durable lifecycle contracts rather than exhaustive behavioral validation.
policies:conventions complete_broad docs/development.md, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go unavailable Shared source-of-truth convenience copies were not present at ../cli-common/docs or ../.github, so this review was limited to repo-local guidance and the assigned changed files.
documentation:docs complete_broad docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md unavailable Review scope was limited to the assigned documentation files, with repo/code reads only to verify factual claims and referenced paths/commands.; Workspace was read-only, so I could not execute the documented workflows end-to-end; findings are based on source inspection.

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 6m 13s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions, documentation:docs
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 6m 13s wall · 12m 17s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 24s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 5m 13s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 20s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 1m 33s
documentation:docs gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 38s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 6s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

@rianjs-bot

rianjs-bot Bot commented Jun 26, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 3ec58bdf8fc2
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 1
structure:repo-health 1
policies:conventions 1
documentation:docs 2
go:implementation-tests (1 finding)

Major - internal/pipeline/pipeline.go:924

findIncompleteDryRun now resumes any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full review runs from the new threadrespond dry-runs. threadrespond explicitly writes and checks a thread-response-run.json marker before resuming its own runs, so an interrupted cr respond --dry-run can be picked up by the next cr review --dry-run here, reusing the wrong artifact tree and run ID and mixing response-only state into the review pipeline. Filter review resumes to review-owned runs as well, for example by adding a complementary review marker or at least rejecting artifact roots that contain the response marker, and add a regression test that a pending response dry-run is not resumed by pipeline.DryRun.

structure:repo-health (1 finding)

Major - internal/pipeline/pipeline.go:30

SelectionRequest now carries PostingIdentity so selector-only runs can normalize thread lifecycle context, but validateSelectionOnly still treats that field as optional while prepareSelectionContext silently skips normalization when it is empty at lines 1052-1057. Existing callers such as benchmark select still construct SelectionRequest without a posting identity, so they now lose discussion context instead of failing, which makes selector benchmarks diverge from real cr review behavior and hides the contract change from future callers. Make PostingIdentity required for SelectionOnly, plumb it through every caller, and add a regression test that empty identities are rejected rather than dropping thread context.

policies:conventions (1 finding)

Minor - internal/cmd/reviewcmd/reviewcmd.go:189

This adds a new top-level cr respond command from inside reviewcmd, which drifts from the repo’s visible command layout convention: cmd/cr/main.go registers one internal/cmd/*cmd package per top-level command (configcmd, datacmd, reviewcmd, etc.). Keeping respond nested in reviewcmd makes the package name stop matching the command surface and sets an awkward precedent for future top-level commands. The smallest policy-aligned fix is to move the respond Cobra wiring into a new internal/cmd/respondcmd package and register it alongside the other top-level command packages, while leaving shared runtime helpers in a lower-level shared package if needed.

documentation:docs (2 findings)

Minor - docs/architecture.md:16

The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under llm-tasks/<encoded-task-id>/... via statepaths.Encode/llmlifecycle.encodePathComponent. Readers using this path to inspect artifacts will look in the wrong directory for task IDs containing :, /, or other encoded characters. Update the example to say llm-tasks/<encoded-task-id>/metadata.json (or explicitly call out that the task ID is percent-encoded in the directory name).

Minor - docs/architecture.md:25

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. The test only fails direct internal/llm structured-helper calls outside the allowed packages; it does not detect new components calling provider adapters directly. As written, the doc claims a broader guardrail than the repo has. Either narrow the sentence to the internal/llm helper restriction, or add a separate architecture test that also bans direct provider-adapter invocation outside the approved boundary.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/cmd/reviewcmd/reviewcmd.go, internal/ledger/ledger.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/threadrespond/threadrespond.go internal/approvaloverride/approvaloverride.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/review/review.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go Filesystem is read-only in this environment, so I inspected code and tests but did not run the Go test suite.; Scope intentionally limited to the assigned Go implementation files and nearby tests needed to validate behavior.
structure:repo-health complete_broad docs/development.md, internal/cmd/reviewcmd/reviewcmd.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Review scoped to the assigned files and structural risks; I did not audit the rest of the PR.
policies:conventions complete_broad docs/development.md, internal/approvaloverride/approvaloverride.go, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/reviewplan/summary.go unavailable Reviewed only the assigned files.; Shared CLI source-of-truth docs were not present in ../cli-common/docs, so findings are limited to conventions visible in this repo and review context.
documentation:docs complete_broad docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md unavailable Review limited to the three assigned documentation files and local repository context; no network-based verification of external source-of-truth links.
unassigned incomplete_unassigned unavailable internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/types.go, internal/gitprovider/types_test.go, internal/ledger/ledger_test.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox_test.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond_test.go changed files were not assigned to a selected reviewer

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 4m 25s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions, documentation:docs
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 4m 25s wall · 11m 52s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 19s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 19s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 58s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 1m 48s
documentation:docs gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 07s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 18s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

@@ -30,13 +30,17 @@ import (
"github.com/open-cli-collective/codereview-cli/internal/gitprovider"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: internal/pipeline/pipeline.go

SelectionRequest now carries PostingIdentity so selector-only runs can normalize thread lifecycle context, but validateSelectionOnly still treats that field as optional while prepareSelectionContext silently skips normalization when it is empty at lines 1052-1057. Existing callers such as benchmark select still construct SelectionRequest without a posting identity, so they now lose discussion context instead of failing, which makes selector benchmarks diverge from real cr review behavior and hides the contract change from future callers. Make PostingIdentity required for SelectionOnly, plumb it through every caller, and add a regression test that empty identities are rejected rather than dropping thread context.

Reply inline to this comment.

Comment thread docs/architecture.md
artifacts, session persistence, progress breadcrumbs, and failure
classification.

The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under llm-tasks/<encoded-task-id>/... via statepaths.Encode/llmlifecycle.encodePathComponent. Readers using this path to inspect artifacts will look in the wrong directory for task IDs containing :, /, or other encoded characters. Update the example to say llm-tasks/<encoded-task-id>/metadata.json (or explicitly call out that the task ID is percent-encoded in the directory name).

Reply inline to this comment.

cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a new top-level cr respond command from inside reviewcmd, which drifts from the repo’s visible command layout convention: cmd/cr/main.go registers one internal/cmd/*cmd package per top-level command (configcmd, datacmd, reviewcmd, etc.). Keeping respond nested in reviewcmd makes the package name stop matching the command surface and sets an awkward precedent for future top-level commands. The smallest policy-aligned fix is to move the respond Cobra wiring into a new internal/cmd/respondcmd package and register it alongside the other top-level command packages, while leaving shared runtime helpers in a lower-level shared package if needed.

Reply inline to this comment.

Comment thread docs/architecture.md
New LLM-backed components must not call provider adapters or `internal/llm`
structured helpers directly. They should call `llmlifecycle` through explicit,
fakeable dependencies in unit tests and should return domain results rather
than posting comments or mutating provider state. This guardrail is enforced by

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. The test only fails direct internal/llm structured-helper calls outside the allowed packages; it does not detect new components calling provider adapters directly. As written, the doc claims a broader guardrail than the repo has. Either narrow the sentence to the internal/llm helper restriction, or add a separate architecture test that also bans direct provider-adapter invocation outside the approved boundary.

Reply inline to this comment.

postingIdentity := postingKey(req.PostingIdentity)
var best ledger.Run
found := false
for _, run := range runs {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findIncompleteDryRun now resumes any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full review runs from the new threadrespond dry-runs. threadrespond explicitly writes and checks a thread-response-run.json marker before resuming its own runs, so an interrupted cr respond --dry-run can be picked up by the next cr review --dry-run here, reusing the wrong artifact tree and run ID and mixing response-only state into the review pipeline. Filter review resumes to review-owned runs as well, for example by adding a complementary review marker or at least rejecting artifact roots that contain the response marker, and add a regression test that a pending response dry-run is not resumed by pipeline.DryRun.

Reply inline to this comment.

@rianjs rianjs force-pushed the feat/thread-lifecycle-architecture branch from 3ec58bd to 7488454 Compare June 26, 2026 15:46
@rianjs rianjs force-pushed the feat/thread-lifecycle-architecture branch from 7488454 to 34d7a6d Compare June 26, 2026 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant