-
Notifications
You must be signed in to change notification settings - Fork 0
feat: add reusable thread lifecycle pipeline #359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
a376644
57959a7
e000c98
bbb4e15
010b195
c63e75b
cd78dd8
b822314
3fb8254
e6399bf
bac48c1
b006629
66e98b3
994639f
331c2c5
34d7a6d
432cb9a
4d73a26
8da289a
05ba883
3826a3b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| # Architecture Decisions | ||
|
|
||
| This document records review-pipeline boundaries that are intended to stay | ||
| stable as the implementation evolves. | ||
|
|
||
| ## Durable LLM Execution Boundary | ||
|
|
||
| All production structured LLM actions must flow through | ||
| `internal/llmlifecycle`. Callers describe the task ID, phase, prompt input, | ||
| structured output contract, model/effort, artifact paths, and run/session | ||
| scope. The lifecycle runner owns provider invocation, structured-output retry, | ||
| provider-session resume, pre-flight reuse, task metadata, accepted-output | ||
| artifacts, session persistence, progress breadcrumbs, and failure | ||
| classification. | ||
|
|
||
| The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under Reply inline to this comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This source-of-truth doc records the durable metadata path as Reply inline to this comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This new architecture doc is being positioned as a durable source of truth, but it describes the commit-marker path as Reply inline to this comment. |
||
| must publish validated output or failed-attempt payloads first, persist the | ||
| ledger session row when the task is run-owned, and write metadata last. Resume | ||
| code must trust only the final metadata path, never temporary files or partial | ||
| payloads. | ||
|
|
||
| New LLM-backed components must not call provider adapters or `internal/llm` | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The guardrail claim here overstates what is actually enforced. Reply inline to this comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The new architecture doc overstates what the guardrail tests actually enforce. Reply inline to this comment. |
||
| structured helpers directly. They should call `llmlifecycle` through explicit, | ||
| fakeable dependencies in unit tests and should return domain results rather | ||
| than posting comments or mutating provider state. This guardrail is enforced by | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This overstates what Reply inline to this comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This overstates what Reply inline to this comment. |
||
| `internal/architecture/llm_lifecycle_test.go`. | ||
|
|
||
| Most lifecycle tasks are run-owned and must have a matching ledger session row | ||
| when a provider session is available. Caller-owned no-run tasks are allowed only | ||
| where no review run exists yet, such as `SelectionOnly` and the pre-run approval | ||
| override classifier. Those tasks may reuse artifact metadata without a ledger | ||
| session row, but they still use the same metadata schema and lifecycle runner. | ||
|
|
||
| ## Stage Model Resolution | ||
|
|
||
| Runtime model choice must be resolved through `internal/stagemodel`. Code that | ||
| executes an LLM stage must not hard-code model IDs and must not call | ||
| `config.ResolveModelTier` directly. | ||
|
|
||
| `stagemodel.ResolveStageModel` is the single runtime path from profile | ||
| preferences and command overrides to a concrete model and effort. The request | ||
| must include the named stage, requested tier, default effort, and any explicit | ||
| operator override. The resolver applies user profile `llm.model_map` values, | ||
| built-in provider defaults, and configured tier floors before returning the | ||
| concrete runtime choice. | ||
|
|
||
| This boundary exists so model catalog data, provider capabilities, token costs, | ||
| and profile-level tier floors can be added without touching individual review | ||
| stages. Runtime hard-coding bypasses user preference and is a bug. | ||
|
|
||
| Reviewer `agent.model_id` is an exact provider-specific model override. It must | ||
| still enter runtime execution through `stagemodel.ResolveStageModel` as a model | ||
| override rather than bypassing the resolver, but it intentionally bypasses the | ||
| tier map because the agent author selected a concrete model. | ||
|
|
||
| The direct `config.ResolveModelTier` exception is config inspection and the | ||
| resolver implementation itself. The architecture guardrail is enforced by | ||
| `internal/architecture/model_resolution_test.go`. | ||
|
|
||
| ## Git Provider Writes | ||
|
|
||
| Provider writes have one durable path: planned actions in the ledger followed | ||
| by outbox execution. Commands and domain analyzers should not post comments, | ||
| reply to review threads, resolve threads, submit reviews, or mutate provider | ||
| state directly. | ||
|
|
||
| This keeps markers, retries, reconciliation, idempotency, and resume behavior in | ||
| one place. New commands such as `cr respond` should produce planned thread | ||
| actions and let the reviewplan/ledger/outbox flow perform provider writes. | ||
|
|
||
| ## Inline Thread Lifecycle | ||
|
|
||
| Inline PR discussion threads are domain input, not provider-specific prompt | ||
| data. The intended decomposition is: | ||
|
|
||
| - `internal/threadcontext` normalizes `gitprovider.InlineThread`, detects | ||
| codereview-authored finding threads, detects latest human replies, strips | ||
| shared markers, collapses resolved threads to the latest sanitized comment, | ||
| and produces file-scoped reviewer context. | ||
| - `internal/threadanalysis` accepts normalized thread context and returns | ||
| reusable domain decisions: thread ID, decision, reply body, summary, resolve | ||
| flag, and rationale. | ||
|
|
||
| Resolved inline threads should not be reprocessed as full conversations on | ||
| every review. Their durable context is the latest sanitized comment on the | ||
| resolved thread, with marker metadata retained when the comment contains a | ||
| codereview thread-summary marker. Reviewer prompts should receive compact | ||
| file-scoped summaries so agents avoid re-raising issues that have already been | ||
| discussed and resolved. | ||
|
|
||
| `cr review` and `cr respond` should share the same normalization, filtering, | ||
| model resolution, LLM execution, and action-planning components. `cr respond` | ||
| is a command-shaped reuse of the thread-response portion of the review | ||
| pipeline, not a separate posting system. | ||
|
|
||
| ## Retention And Cleanup | ||
|
|
||
| Durable run-owned LLM tasks, thread-analysis results, and artifacts must be | ||
| owned by a run and must be safe to delete through the existing data lifecycle | ||
| commands. Database rows should reference `runs(run_id)` with cascade delete | ||
| semantics, and large artifacts should live under the run artifact directory. | ||
|
|
||
| When the retention window elapses, normal prune/GC commands should remove these | ||
| results along with the rest of the run. If a user deletes retained state, a | ||
| future review may need to spend time and tokens recreating it; that is the | ||
| expected tradeoff for user-controlled local data retention. | ||
|
|
||
| Caller-owned no-run task artifacts must live under the configured data root or | ||
| an explicit caller artifact root. If no run is eventually allocated for that | ||
| artifact root, the directory is treated as orphaned local data and must be safe | ||
| for `cr data purge` to remove. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The durable-path example is off by one important detail: task directories are created under an encoded task ID (
llm-tasks/<encoded-task-id>/metadata.json), not the raw<task>string. The implementation uses path-component encoding before writing metadata, anddocs/llm-task-artifacts.mdalready documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.Reply inline to this comment.