Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
a376644
feat: add reusable thread lifecycle response pipeline
rianjs Jun 23, 2026
57959a7
fix: preserve unknown field names in structured retry prompts
rianjs Jun 23, 2026
e000c98
fix: satisfy lint for thread lifecycle pipeline
rianjs Jun 23, 2026
bbb4e15
fix: harden thread response lifecycle
rianjs Jun 24, 2026
010b195
fix: make llm lifecycle tasks resumable
rianjs Jun 26, 2026
c63e75b
fix: resume thread response attempts
rianjs Jun 26, 2026
cd78dd8
test: cover thread response resume boundaries
rianjs Jun 26, 2026
b822314
test: assert resume persistence artifacts
rianjs Jun 26, 2026
3fb8254
fix: keep failed thread analysis resumable
rianjs Jun 26, 2026
e6399bf
test: cover stale thread resume inputs
rianjs Jun 26, 2026
bac48c1
test: cover dry-run rerun recovery
rianjs Jun 26, 2026
b006629
test: cover real lifecycle resume boundaries
rianjs Jun 26, 2026
66e98b3
fix: route review thread lifecycle through analysis
rianjs Jun 26, 2026
994639f
test: pin thread response planning path
rianjs Jun 26, 2026
331c2c5
fix: preserve checkout-readonly adapter capability
rianjs Jun 26, 2026
34d7a6d
fix: bundle github inline findings into submit review
rianjs Jun 26, 2026
432cb9a
fix: fail fast for non-opinionated github reviews
rianjs Jun 26, 2026
4d73a26
fix: wire respond progress breadcrumbs
rianjs Jun 26, 2026
8da289a
Restrict reviewer identities to PAT auth
rianjs Jun 26, 2026
05ba883
Split review runtime read and post providers
rianjs Jun 26, 2026
3826a3b
Recover init from legacy reviewer config
rianjs Jun 26, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 21 additions & 36 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -230,12 +230,6 @@ Host matching is case-insensitive after normalization, while namespace and repo
matching are case-sensitive after trimming whitespace. An explicit `--profile`
bypasses repository routing. Route targets still use the profile's configured
auth mode. Passing `--profile ""` is invalid.
For GitHub App reviewer entities, the review profile chooses whether `cr review`
discovers the installation from each PR owner/repo or uses one pinned
installation ID stored in profile config. Discovery is the normal choice for
profiles routed to multiple organizations or users; pinning is only appropriate
when every route for that profile uses the same installation.

Add or replace one credential later:

```bash
Expand All @@ -260,17 +254,16 @@ target user's credential store. The staging commands must run as the target OS
user who will run `cr`, not as root, SYSTEM, or an administrator account whose
keyring is different from the target user's keyring.

`reviewer_credentials` may use a PAT or GitHub App auth. A shared reviewer PAT
is an access secret even when distributed org-wide. A GitHub App private key is
also a deployment secret; `cr` signs short-lived JWTs locally, mints
installation tokens as needed, and never stores those minted tokens in
`config.yml`. Pre-stage reviewer credentials into each target user's credential
store, rotate or revoke them with the same `set-credential --overwrite` flow,
and do not store them in config files, agent sources, installers, logs, or
shell profiles.
`reviewer_credentials` must use PAT auth for opinionated GitHub reviews. A
shared reviewer PAT is an access secret even when distributed org-wide.
Pre-stage reviewer credentials into each target user's credential store, rotate
or revoke them with the same `set-credential --overwrite` flow, and do not
store them in config files, agent sources, installers, logs, or shell
profiles.

For GitHub App reviewer credentials, grant the app only the repository
permissions needed by the enabled review workflow:
GitHub App auth remains supported for repository access. When using GitHub App
repository access, grant the app only the repository permissions needed by the
enabled review workflow:

| GitHub App permission | Access | Used for |
|-----------------------|--------|----------|
Expand Down Expand Up @@ -303,10 +296,10 @@ profiles:
store: local-os
name: codereview/work
reviewer_credentials:
auth_mode: github_app
auth_mode: pat
credential:
store: work-1password
name: codereview/work-reviewer-app
name: codereview/work-reviewer
llm:
provider: anthropic
auth: api_key
Expand Down Expand Up @@ -409,18 +402,12 @@ When deploying a profile without the interactive wizard, run
then use `set-credential` only for secrets that you are intentionally staging
outside init. Example:

GitHub App reviewer credentials store the App ID and private key only. The
review profile stores installation routing: `discover_from_repository` for PR
context lookup, or `pinned` with one numeric installation ID when every route for
that profile shares the same installation.

```bash
cr --profile work init --non-interactive \
--git-host github.com \
--git-credential-ref codereview/work \
--reviewer-auth-mode github_app \
--reviewer-github-app-id "$GITHUB_APP_ID" \
--reviewer-credential-ref codereview/work-reviewer-app \
--reviewer-auth-mode pat \
--reviewer-credential-ref codereview/work-reviewer \
--llm-provider anthropic \
--llm-auth api_key \
--llm-adapter anthropic_api \
Expand All @@ -433,10 +420,10 @@ printf '%s' "$USER_GITHUB_TOKEN" | cr set-credential \
--stdin \
--overwrite

printf '%s' "$GITHUB_APP_PRIVATE_KEY" | cr set-credential \
printf '%s' "$REVIEWER_GITHUB_TOKEN" | cr set-credential \
--store local-os \
--name codereview/work-reviewer-app \
--key github_app_private_key \
--name codereview/work-reviewer \
--key git_token \
--stdin \
--overwrite

Expand Down Expand Up @@ -592,8 +579,8 @@ Credential key matrix:
|---------------|---------|---------------|---------------|---------------|-------------|
| `git.credential` | User Git host auth | `pat` | `git_token` | None | Supported |
| `reviewer_credentials.credential` | Reviewer Git host auth | `pat` | `git_token` | None | Supported; must use a distinct credential location from `git.credential` in the same profile |
| `git.github_app.app_id` / reviewer entity `github_app.app_id` | Git host auth | `github_app` | Config field, not a credential key | None | Non-secret GitHub App ID stored in config. Scripted init uses `--git-github-app-id` or `--reviewer-github-app-id` |
| `git.credential` / reviewer entity credential | Git host auth | `github_app` | `github_app_private_key` | None | Supported for GitHub. Reviewer GitHub App installation routing lives on `profiles.<name>.reviewer.github_app_installation`, not in the credential bundle |
| `git.github_app.app_id` | Git host auth | `github_app` | Config field, not a credential key | None | Non-secret GitHub App ID stored in config. Scripted init uses `--git-github-app-id`. |
| `git.credential` | Git host auth | `github_app` | `github_app_private_key` | None | Supported for GitHub repository access. |
| `git.credential` / `reviewer_credentials.credential` | Git host auth | `oauth_device` | None | None | Reserved; config recognizes the mode but v1 rejects it and does not accept future keys such as `git_oauth_access_token` or `git_oauth_refresh_token` |
| `llm.credential` | Anthropic direct API auth | `api_key` + `anthropic` | `anthropic_api_key` | None | Supported |
| `llm.credential` | OpenAI direct API auth | `api_key` + `openai` | `openai_api_key` | None | Supported |
Expand Down Expand Up @@ -754,7 +741,7 @@ Flags:
| `--git-token-stdin` | Read the Git token from stdin and write key `git_token`. |
| `--git-token-from-env <env>` | Read the Git token from an environment variable and write key `git_token`. |
| `--reviewer-credential-ref <name>` | Credential name for reviewer Git auth. Defaults to `codereview/<profile>-reviewer` when reviewer credentials are requested. |
| `--reviewer-auth-mode <mode>` | Reviewer Git auth mode, default `pat`. `github_app` writes config and prints follow-up credential commands. |
| `--reviewer-auth-mode <mode>` | Reviewer Git auth mode. Only `pat` is supported for reviewer identities. |
| `--reviewer-token-stdin` | Read the reviewer Git token from stdin and write key `git_token`; PAT reviewer auth only. |
| `--reviewer-token-from-env <env>` | Read the reviewer Git token from an environment variable and write key `git_token`; PAT reviewer auth only. |
| `--llm-provider <provider>` | LLM provider, default `anthropic`; also selects whether API-key ingress writes `anthropic_api_key` or `openai_api_key`. `pi` is adapter-managed and does not accept API-key ingress. |
Expand All @@ -774,10 +761,8 @@ Flags:
Only one stdin secret ingress flag may be used at a time. PAT reviewer
credentials use key `git_token` under their own credential name, so
`--reviewer-credential-ref` must differ from `--git-credential-ref`. GitHub App
reviewers store their non-secret App ID in config via `--reviewer-github-app-id`
and use `github_app_private_key` as the only credential-store key; installation
routing is profile config, not a credential key. `init` does not accept
reviewer token ingress for `--reviewer-auth-mode github_app`. LLM API-key ingress requires
reviewer identities are not supported because GitHub does not count their
approvals or request-changes toward PR state. LLM API-key ingress requires
`--llm-auth api_key`. `--overwrite` with API-key auth requires an LLM key
ingress flag. `--allow-self-review` is intentionally runtime-only on
`cr review`; `init` only stores the profile-level self-approval policy.
Expand Down
111 changes: 111 additions & 0 deletions docs/architecture.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
# Architecture Decisions

This document records review-pipeline boundaries that are intended to stay
stable as the implementation evolves.

## Durable LLM Execution Boundary

All production structured LLM actions must flow through
`internal/llmlifecycle`. Callers describe the task ID, phase, prompt input,
structured output contract, model/effort, artifact paths, and run/session
scope. The lifecycle runner owns provider invocation, structured-output retry,
provider-session resume, pre-flight reuse, task metadata, accepted-output
artifacts, session persistence, progress breadcrumbs, and failure
classification.

The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The durable-path example is off by one important detail: task directories are created under an encoded task ID (llm-tasks/<encoded-task-id>/metadata.json), not the raw <task> string. The implementation uses path-component encoding before writing metadata, and docs/llm-task-artifacts.md already documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.

Reply inline to this comment.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under llm-tasks/<encoded-task-id>/... via statepaths.Encode/llmlifecycle.encodePathComponent. Readers using this path to inspect artifacts will look in the wrong directory for task IDs containing :, /, or other encoded characters. Update the example to say llm-tasks/<encoded-task-id>/metadata.json (or explicitly call out that the task ID is percent-encoded in the directory name).

Reply inline to this comment.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This source-of-truth doc records the durable metadata path as llm-tasks/<task>/metadata.json, but the implementation keys task directories by an encoded task ID (llmlifecycle.Paths.TaskDir uses encodePathComponent). Anyone debugging artifacts or automating around this contract will look in the wrong place. Document the path in terms of the encoded task directory, or point readers at the helper that derives it instead of using the raw <task> placeholder.

Reply inline to this comment.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new architecture doc is being positioned as a durable source of truth, but it describes the commit-marker path as llm-tasks/<task>/metadata.json while the repo’s artifact contract and helper APIs use encoded task IDs (llm-tasks/<encoded-task-id>/...). That creates policy drift between docs/architecture.md and docs/llm-task-artifacts.md, and future code/tests may follow the wrong path shape. Update this line to use the encoded path form so the two repo-local contracts stay aligned.

Reply inline to this comment.

must publish validated output or failed-attempt payloads first, persist the
ledger session row when the task is run-owned, and write metadata last. Resume
code must trust only the final metadata path, never temporary files or partial
payloads.

New LLM-backed components must not call provider adapters or `internal/llm`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guardrail claim here overstates what is actually enforced. internal/architecture/llm_lifecycle_test.go blocks direct internal/llm structured-helper calls, but it does not prove that new components cannot bypass llmlifecycle through adapter-level entry points or other direct provider usage. Since this file is meant to be the durable boundary map, overstating an automated guarantee is misleading repo guidance. Either narrow the wording to the check that exists today, or add a matching architecture test for the adapter boundary before documenting it as enforced.

Reply inline to this comment.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new architecture doc overstates what the guardrail tests actually enforce. internal/architecture/llm_lifecycle_test.go only fails on direct llm.RunStructured* calls, and internal/architecture/model_resolution_test.go only fails on direct config.ResolveModelTier calls; neither test proves that provider adapters cannot be called directly or that all runtime model hard-coding paths are blocked. Because this file is now positioned as durable source-of-truth guidance, future contributors may rely on guarantees the repo does not actually check. Either narrow the wording to the concrete invariants the tests enforce today, or add matching architecture tests for the stronger claims.

Reply inline to this comment.

structured helpers directly. They should call `llmlifecycle` through explicit,
fakeable dependencies in unit tests and should return domain results rather
than posting comments or mutating provider state. This guardrail is enforced by

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. That test only fails on direct calls to llm.RunStructured*; it does not detect new components calling provider adapters directly. Reword the sentence to say the test enforces the internal/llm structured-helper boundary, or add that direct-adapter prohibition is a convention not fully covered by the guardrail test.

Reply inline to this comment.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. The test only fails direct internal/llm structured-helper calls outside the allowed packages; it does not detect new components calling provider adapters directly. As written, the doc claims a broader guardrail than the repo has. Either narrow the sentence to the internal/llm helper restriction, or add a separate architecture test that also bans direct provider-adapter invocation outside the approved boundary.

Reply inline to this comment.

`internal/architecture/llm_lifecycle_test.go`.

Most lifecycle tasks are run-owned and must have a matching ledger session row
when a provider session is available. Caller-owned no-run tasks are allowed only
where no review run exists yet, such as `SelectionOnly` and the pre-run approval
override classifier. Those tasks may reuse artifact metadata without a ledger
session row, but they still use the same metadata schema and lifecycle runner.

## Stage Model Resolution

Runtime model choice must be resolved through `internal/stagemodel`. Code that
executes an LLM stage must not hard-code model IDs and must not call
`config.ResolveModelTier` directly.

`stagemodel.ResolveStageModel` is the single runtime path from profile
preferences and command overrides to a concrete model and effort. The request
must include the named stage, requested tier, default effort, and any explicit
operator override. The resolver applies user profile `llm.model_map` values,
built-in provider defaults, and configured tier floors before returning the
concrete runtime choice.

This boundary exists so model catalog data, provider capabilities, token costs,
and profile-level tier floors can be added without touching individual review
stages. Runtime hard-coding bypasses user preference and is a bug.

Reviewer `agent.model_id` is an exact provider-specific model override. It must
still enter runtime execution through `stagemodel.ResolveStageModel` as a model
override rather than bypassing the resolver, but it intentionally bypasses the
tier map because the agent author selected a concrete model.

The direct `config.ResolveModelTier` exception is config inspection and the
resolver implementation itself. The architecture guardrail is enforced by
`internal/architecture/model_resolution_test.go`.

## Git Provider Writes

Provider writes have one durable path: planned actions in the ledger followed
by outbox execution. Commands and domain analyzers should not post comments,
reply to review threads, resolve threads, submit reviews, or mutate provider
state directly.

This keeps markers, retries, reconciliation, idempotency, and resume behavior in
one place. New commands such as `cr respond` should produce planned thread
actions and let the reviewplan/ledger/outbox flow perform provider writes.

## Inline Thread Lifecycle

Inline PR discussion threads are domain input, not provider-specific prompt
data. The intended decomposition is:

- `internal/threadcontext` normalizes `gitprovider.InlineThread`, detects
codereview-authored finding threads, detects latest human replies, strips
shared markers, collapses resolved threads to the latest sanitized comment,
and produces file-scoped reviewer context.
- `internal/threadanalysis` accepts normalized thread context and returns
reusable domain decisions: thread ID, decision, reply body, summary, resolve
flag, and rationale.

Resolved inline threads should not be reprocessed as full conversations on
every review. Their durable context is the latest sanitized comment on the
resolved thread, with marker metadata retained when the comment contains a
codereview thread-summary marker. Reviewer prompts should receive compact
file-scoped summaries so agents avoid re-raising issues that have already been
discussed and resolved.

`cr review` and `cr respond` should share the same normalization, filtering,
model resolution, LLM execution, and action-planning components. `cr respond`
is a command-shaped reuse of the thread-response portion of the review
pipeline, not a separate posting system.

## Retention And Cleanup

Durable run-owned LLM tasks, thread-analysis results, and artifacts must be
owned by a run and must be safe to delete through the existing data lifecycle
commands. Database rows should reference `runs(run_id)` with cascade delete
semantics, and large artifacts should live under the run artifact directory.

When the retention window elapses, normal prune/GC commands should remove these
results along with the rest of the run. If a user deletes retained state, a
future review may need to spend time and tokens recreating it; that is the
expected tradeoff for user-controlled local data retention.

Caller-owned no-run task artifacts must live under the configured data root or
an explicit caller artifact root. If no run is eventually allocated for that
artifact root, the directory is treated as orphaned local data and must be safe
for `cr data purge` to remove.
2 changes: 1 addition & 1 deletion docs/config-command-surface-design.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ The codebase already supports:
- `repository_profiles` in config schema
- repository-aware profile resolution in runtime paths used by `cr review` and
`cr agents`
- GitHub App reviewer auth
- PAT-backed reviewer auth only for counted GitHub reviews

The missing piece is command coverage. Today `cr config` exposes `show`,
`clear`, and `llm models`, but path/route/route-preview/agent-source
Expand Down
23 changes: 14 additions & 9 deletions docs/development.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,13 @@ session management, and local data lifecycle commands.
The current Go code is a Cobra command tree in `internal/cmd/*` with a thin
`cmd/cr` entrypoint, shared exit-code mapping in `internal/cmd/exitcode`, and
version plumbing in `internal/version`. Review orchestration is split across
`internal/pipeline`, `internal/reviewrun`, `internal/reviewplan`,
`internal/outbox`, `internal/gate`, and `internal/gateio`.
`internal/pipeline`, `internal/reviewrun`, `internal/threadrespond`,
`internal/reviewplan`, `internal/outbox`, `internal/gate`, and
`internal/gateio`.

Architecture guardrails for LLM execution, model resolution, Git provider
writes, inline thread lifecycle, and retention live in
[`docs/architecture.md`](architecture.md).

Within `internal/pipeline`, the public entry points are `DryRun`, `Live`, and
`SelectionOnly`. `DryRun` and `Live` execute the full review pipeline, while
Expand Down Expand Up @@ -80,17 +85,17 @@ make clean # remove build artifacts
state/config adapters in `internal/config`, `internal/ledger`, and
`internal/statepaths`, provider/LLM adapters in their owning packages, and
review posting/gating in `internal/outbox`, `internal/gate`, and
`internal/gateio`.
`internal/gateio`, and response-only inline discussion handling in
`internal/threadrespond`.

## Interactive Init Notes

`cr init` interactive mode keeps all writes draft-local until the user commits
staged changes. Reviewer setup may collect PAT or GitHub App reviewer secrets
inside the reviewer-entity flow, but those values are only written to the
credential store during **Commit staged changes and exit**. When the selected
LLM auth mode is `api_key`, the wizard saves the non-secret profile shape and
prints a follow-up `cr set-credential` command instead of collecting the API key
inline.
staged changes. Reviewer setup may collect PAT reviewer secrets inside the
reviewer-entity flow, but those values are only written to the credential store
during **Commit staged changes and exit**. When the selected LLM auth mode is
`api_key`, the wizard saves the non-secret profile shape and prints a follow-up
`cr set-credential` command instead of collecting the API key inline.

Interactive `git.host` edits now route through the repository-route stage when
the target profile already participates in `repository_profiles` routing. The
Expand Down
Loading
Loading