feat: add reusable thread lifecycle pipeline#359
Conversation
70ee83c to
bbb4e15
Compare
|
Findings
|
| artifacts, session persistence, progress breadcrumbs, and failure | ||
| classification. | ||
|
|
||
| The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers |
There was a problem hiding this comment.
The durable-path example is off by one important detail: task directories are created under an encoded task ID (llm-tasks/<encoded-task-id>/metadata.json), not the raw <task> string. The implementation uses path-component encoding before writing metadata, and docs/llm-task-artifacts.md already documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.
Reply inline to this comment.
| if err != nil { | ||
| return preparedSelectionContext{}, err | ||
| } | ||
| if strings.TrimSpace(req.PostingIdentity.ID) != "" || strings.TrimSpace(req.PostingIdentity.Login) != "" { |
There was a problem hiding this comment.
SelectionOnly now silently drops normalized thread context whenever PostingIdentity is unset, even though SelectionRequest grew that field specifically to drive the new thread-lifecycle prompt input. That means existing callers can keep compiling while benchmarking or other selection-only flows quietly stop exercising the same selector inputs as cr review, so the changed behavior is no longer proven by the same code path. Make this contract explicit instead of degrading silently: either require PostingIdentity for selection-only runs that load inline threads, or add a focused test that covers the no-identity branch and update callers to pass the resolved posting identity so selection-only continues to match full-review behavior.
Reply inline to this comment.
| New LLM-backed components must not call provider adapters or `internal/llm` | ||
| structured helpers directly. They should call `llmlifecycle` through explicit, | ||
| fakeable dependencies in unit tests and should return domain results rather | ||
| than posting comments or mutating provider state. This guardrail is enforced by |
There was a problem hiding this comment.
This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. That test only fails on direct calls to llm.RunStructured*; it does not detect new components calling provider adapters directly. Reword the sentence to say the test enforces the internal/llm structured-helper boundary, or add that direct-adapter prohibition is a convention not fully covered by the guardrail test.
Reply inline to this comment.
| cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author") | ||
| cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions") | ||
| rootCmd.AddCommand(cmd) | ||
| rootCmd.AddCommand(newRespondCommand(opts, factory)) |
There was a problem hiding this comment.
The repo’s command layout is organized as one top-level command package per internal/cmd/* directory, but this change introduces a new top-level respond command by folding it into reviewcmd (newRespondCommand, runRespond, ResponseRunner, extra rendering helpers, and tests) instead of giving it its own internal/cmd/respondcmd boundary. That makes the command tree less discoverable from the package map in docs/development.md, and it couples future respond-specific flags/runtime/output work to the already large reviewcmd package, which is the kind of ownership drift that gets harder to unwind after more command-specific behavior lands. Split respond into its own command package and keep only shared review/respond bootstrap helpers in a common seam so the repo’s internal/cmd/* structure continues to reflect the user-facing command surface.
Reply inline to this comment.
Automated PR ReviewReviewed commit: Summary
go:implementation-tests (1 finding)Minor -
|
| Reviewer | Status | Inspected | Skipped | Constraints |
|---|---|---|---|---|
| go:implementation-tests | incomplete_skipped | internal/approvaloverride/approvaloverride.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/review/review.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go | internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/summary.go, internal/stagemodel/resolver_test.go | Review scope was intentionally narrowed to the core thread-lifecycle, pipeline, ledger, model-resolution, and command-wiring changes.; Static review only; I did not execute the Go test suite in this read-only workbench. |
| structure:repo-health | incomplete_skipped | docs/development.md, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go | internal/gateio/gateio.go, internal/llmlifecycle/lifecycle_test.go, internal/pipeline/pipeline_test.go | Review scoped to the assigned files, with emphasis on long-term command/package boundaries and durable lifecycle contracts rather than exhaustive behavioral validation. |
| policies:conventions | complete_broad | docs/development.md, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go | unavailable | Shared source-of-truth convenience copies were not present at ../cli-common/docs or ../.github, so this review was limited to repo-local guidance and the assigned changed files. |
| documentation:docs | complete_broad | docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md | unavailable | Review scope was limited to the assigned documentation files, with repo/code reads only to verify factual claims and referenced paths/commands.; Workspace was read-only, so I could not execute the documented workflows end-to-end; findings are based on source inspection. |
0 PR discussion threads considered. 0 summarized; 0 resolved.
Completed in 6m 13s | unavailable | gpt-5.4 | cr dev
| Field | Value |
|---|---|
| Model | gpt-5.4 |
| Reviewers | go:implementation-tests, structure:repo-health, policies:conventions, documentation:docs |
| Engine | codex_cli · gpt-5.4 |
| Reviewed by | cr · rianjs-bot[bot] |
| Duration | 6m 13s wall · 12m 17s compute |
| Cost | unavailable |
| Tokens | unavailable |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost | Duration |
|---|---|---|---|---|---|---|---|
| orchestrator-selection | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 24s |
| go:implementation-tests | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 5m 13s |
| structure:repo-health | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 2m 20s |
| policies:conventions | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 1m 33s |
| documentation:docs | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 2m 38s |
| orchestrator-rollup | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 6s |
Automated PR ReviewReviewed commit: Summary
go:implementation-tests (1 finding)Major -
|
| Reviewer | Status | Inspected | Skipped | Constraints |
|---|---|---|---|---|
| go:implementation-tests | incomplete_skipped | internal/cmd/reviewcmd/reviewcmd.go, internal/ledger/ledger.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/threadrespond/threadrespond.go | internal/approvaloverride/approvaloverride.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/review/review.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go | Filesystem is read-only in this environment, so I inspected code and tests but did not run the Go test suite.; Scope intentionally limited to the assigned Go implementation files and nearby tests needed to validate behavior. |
| structure:repo-health | complete_broad | docs/development.md, internal/cmd/reviewcmd/reviewcmd.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go | unavailable | Review scoped to the assigned files and structural risks; I did not audit the rest of the PR. |
| policies:conventions | complete_broad | docs/development.md, internal/approvaloverride/approvaloverride.go, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/reviewplan/summary.go | unavailable | Reviewed only the assigned files.; Shared CLI source-of-truth docs were not present in ../cli-common/docs, so findings are limited to conventions visible in this repo and review context. |
| documentation:docs | complete_broad | docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md | unavailable | Review limited to the three assigned documentation files and local repository context; no network-based verification of external source-of-truth links. |
| unassigned | incomplete_unassigned | unavailable | internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/types.go, internal/gitprovider/types_test.go, internal/ledger/ledger_test.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox_test.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond_test.go | changed files were not assigned to a selected reviewer |
0 PR discussion threads considered. 0 summarized; 0 resolved.
Completed in 4m 25s | unavailable | gpt-5.4 | cr dev
| Field | Value |
|---|---|
| Model | gpt-5.4 |
| Reviewers | go:implementation-tests, structure:repo-health, policies:conventions, documentation:docs |
| Engine | codex_cli · gpt-5.4 |
| Reviewed by | cr · rianjs-bot[bot] |
| Duration | 4m 25s wall · 11m 52s compute |
| Cost | unavailable |
| Tokens | unavailable |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost | Duration |
|---|---|---|---|---|---|---|---|
| orchestrator-selection | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 19s |
| go:implementation-tests | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 3m 19s |
| structure:repo-health | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 2m 58s |
| policies:conventions | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 1m 48s |
| documentation:docs | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 3m 07s |
| orchestrator-rollup | gpt-5.4 | unavailable | unavailable | unavailable | unavailable | unavailable | 18s |
| @@ -30,13 +30,17 @@ import ( | |||
| "github.com/open-cli-collective/codereview-cli/internal/gitprovider" | |||
There was a problem hiding this comment.
File-level note: internal/pipeline/pipeline.go
SelectionRequest now carries PostingIdentity so selector-only runs can normalize thread lifecycle context, but validateSelectionOnly still treats that field as optional while prepareSelectionContext silently skips normalization when it is empty at lines 1052-1057. Existing callers such as benchmark select still construct SelectionRequest without a posting identity, so they now lose discussion context instead of failing, which makes selector benchmarks diverge from real cr review behavior and hides the contract change from future callers. Make PostingIdentity required for SelectionOnly, plumb it through every caller, and add a regression test that empty identities are rejected rather than dropping thread context.
Reply inline to this comment.
| artifacts, session persistence, progress breadcrumbs, and failure | ||
| classification. | ||
|
|
||
| The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers |
There was a problem hiding this comment.
The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under llm-tasks/<encoded-task-id>/... via statepaths.Encode/llmlifecycle.encodePathComponent. Readers using this path to inspect artifacts will look in the wrong directory for task IDs containing :, /, or other encoded characters. Update the example to say llm-tasks/<encoded-task-id>/metadata.json (or explicitly call out that the task ID is percent-encoded in the directory name).
Reply inline to this comment.
| cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author") | ||
| cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions") | ||
| rootCmd.AddCommand(cmd) | ||
| rootCmd.AddCommand(newRespondCommand(opts, factory)) |
There was a problem hiding this comment.
This adds a new top-level cr respond command from inside reviewcmd, which drifts from the repo’s visible command layout convention: cmd/cr/main.go registers one internal/cmd/*cmd package per top-level command (configcmd, datacmd, reviewcmd, etc.). Keeping respond nested in reviewcmd makes the package name stop matching the command surface and sets an awkward precedent for future top-level commands. The smallest policy-aligned fix is to move the respond Cobra wiring into a new internal/cmd/respondcmd package and register it alongside the other top-level command packages, while leaving shared runtime helpers in a lower-level shared package if needed.
Reply inline to this comment.
| New LLM-backed components must not call provider adapters or `internal/llm` | ||
| structured helpers directly. They should call `llmlifecycle` through explicit, | ||
| fakeable dependencies in unit tests and should return domain results rather | ||
| than posting comments or mutating provider state. This guardrail is enforced by |
There was a problem hiding this comment.
This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. The test only fails direct internal/llm structured-helper calls outside the allowed packages; it does not detect new components calling provider adapters directly. As written, the doc claims a broader guardrail than the repo has. Either narrow the sentence to the internal/llm helper restriction, or add a separate architecture test that also bans direct provider-adapter invocation outside the approved boundary.
Reply inline to this comment.
| postingIdentity := postingKey(req.PostingIdentity) | ||
| var best ledger.Run | ||
| found := false | ||
| for _, run := range runs { |
There was a problem hiding this comment.
findIncompleteDryRun now resumes any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full review runs from the new threadrespond dry-runs. threadrespond explicitly writes and checks a thread-response-run.json marker before resuming its own runs, so an interrupted cr respond --dry-run can be picked up by the next cr review --dry-run here, reusing the wrong artifact tree and run ID and mixing response-only state into the review pipeline. Filter review resumes to review-owned runs as well, for example by adding a complementary review marker or at least rejecting artifact roots that contain the response marker, and add a regression test that a pending response dry-run is not resumed by pipeline.DryRun.
Reply inline to this comment.
3ec58bd to
7488454
Compare
7488454 to
34d7a6d
Compare
Summary
Decomposes inline discussion handling into reusable thread lifecycle building blocks for both full
cr reviewand the newcr respondcommand.cr respondfor reusable thread response planning and retryable posting semanticsValidation
rtk go test ./internal/threadrespond ./internal/cmd/reviewcmdrtk go test ./internal/pipelinertk go test ./internal/cmd/...rtk go test ./internal/llmrtk go test ./...(2438 passed in 49 packages)rtk go build ./cmd/cr/tmp/cr-thread-lifecycle review https://github.com/open-cli-collective/codereview-cli/pull/359 --dry-run --max-agents 3 --allow-self-review --json --verboseDry-run Result
Run
d09716b5-8129-4668-89f0-f8675ce768b4completed withoutcome: dry_runagainst head5da53b22aaec.structure:repo-health,go:implementation-tests,policies:conventions--max-agents 3planned_only: 5 inline comments, 1 rollup comment, 1 submit-review actionNotes
This PR intentionally keeps provider writes on the existing durable reviewplan/ledger/outbox path.