feat(respond): conversational replies on review-comment threads by piekstra · Pull Request #333 · open-cli-collective/codereview-cli

piekstra · 2026-06-17T13:00:49Z

Closes #326

Summary

Adds a new cr respond <PR> command that closes the conversational loop on cr's own open review-comment threads, so a finding can be acknowledged and resolved (or answered) without forcing a full re-review.

What it does:

Lists the PR's inline review threads and selects the ones that are cr-authored findings (first comment matches the posting identity and carries a codereview marker) that are still open and whose latest comment is a new human reply cr has not already answered. Threads where cr already had the last word are skipped, so repeated runs are idempotent without persisted per-thread state.
For each selected thread, a small-tier LLM classifier reads the original finding plus the full thread and returns one of:
- acknowledge_and_resolve - post a brief acknowledgement and resolve the thread (finding addressed/conceded),
- reply_only - post a contextual answer, keep the thread open (question / pushback),
- skip - take no action.
Posts replies via the existing ReplyToThread plumbing and resolves via ResolveThread, both already on the gitprovider seam.

Why it matters

Reviews feel interactive, stale open threads get closed, and authors get answers without a full re-review cycle - exactly the loop described in #326.

Design notes

New internal/threadreply engine package modeled directly on internal/approvaloverride: strict structured-output schema (schema_version=1), deterministic prompt builder, and a pure SelectCandidates thread-selection step. The package has full unit coverage (schema decode, prompt content, classifier behavior, candidate selection, marker stripping).
New internal/cmd/respondcmd command following the reviewcmd/mecmd idioms: injectable RuntimeFactory, profile/provider/identity resolution, domain error mapping, and a lazy LLM classifier so runs with no candidate threads never touch the LLM provider.
New view.RespondResult text/JSON renderer.
Flags: --dry-run / --no-post, --json, --no-resolve-threads. Respects profile resolve_threads: never. The classifier uses the profile small model tier (falls back to medium with a warning; rejects a profile with neither).
README cr respond reference section and a cmd/cr wiring smoke test added.

Scope / remaining work

This PR implements the engine + CLI-trigger slice (part 1 of the issue, plus a manual trigger). The menu-bar watcher trigger (part 2 - polling for new thread replies and invoking cr, e.g. cr-daemon) is a separate tool and is intentionally out of scope here; cr respond is the engine it would invoke. The classifier is stateless per run and relies on thread shape (cr did not post the latest comment) rather than a ledger, which keeps this self-contained; wiring it into the durable ledger/outbox path (like cr review) is a possible follow-up if posts need the same retry/idempotency guarantees.

Testing

All commands run from the repo root:

go build ./... - pass
golangci-lint run - pass (0 issues)
go test ./... - pass (entire suite green)
go mod tidy + git diff --exit-code go.mod go.sum - clean
Manual: go run ./cmd/cr respond --help and go run ./cmd/cr --help render the new command correctly.

New tests:

internal/threadreply - schema decode, decision parsing, prompt content, classifier decision/reply + skip-clears-reply + empty-reply rejection, candidate selection (open/resolved/no-reply/cr-had-last-word/human-only), same-login-without-marker exclusion, marker stripping.
internal/cmd/respondcmd - live post+resolve, dry-run no-post, --no-resolve-threads, skip decision, --json output, no-candidates skips classifier, missing-PR usage error.
internal/view - respond renderer covered by the command JSON test.

Note: gofmt -l flags the pre-existing internal/cmd/configcmd/configcmd.go (unchanged by this PR); all files this PR touches are gofmt-clean and golangci-lint reports 0 issues.

Add a `cr respond <PR>` command that closes the conversational loop on cr's own open review-comment threads. It lists the PR's inline review threads, selects the ones cr authored (its own findings, matched by posting identity plus a codereview marker) that have a newer unaddressed human reply, and asks a small-tier LLM classifier whether to acknowledge and resolve, reply only, or skip. Replies post in-thread via the existing ReplyToThread plumbing and the thread is resolved via ResolveThread when the finding is addressed. The engine lives in a new internal/threadreply package modeled on internal/approvaloverride: a strict structured-output schema, a deterministic prompt builder, and a pure thread-selection step that keeps repeated runs idempotent without persisted per-thread state. The command supports --dry-run/--no-post, --json, and --no-resolve-threads, and respects the profile resolve_threads policy. This is the engine + CLI-trigger slice of the feature. The menu-bar watcher trigger (polling for new thread replies and invoking cr) is a separate tool and is intentionally out of scope. Closes #326

piekstra-dev · 2026-06-17T13:39:07Z

Automated PR Review

Reviewed commit: 14b63012fadc
Profile: reviewer - Posting as: piekstra-dev

Summary

Reviewer	Findings
documentation:docs	0
go:implementation-tests	1
policies:conventions	1
structure:repo-health	1

go:implementation-tests (1 finding)

Minor - `internal/threadreply/responder.go:52`

buildRequest applies marker.SanitizeModelContent to comment bodies before stripMarkers, but uses the raw c.Body (marker-stripped only) for OriginalFinding:
body := marker.SanitizeModelContent(c.Body)   // sanitized
if i == 0 {
    original = stripMarkers(c.Body)             // NOT sanitized
}
comments = append(comments, Comment{
    Body: stripMarkers(body),                   // sanitized
})
The LLM prompt therefore sees two representations of the first comment: OriginalFinding (raw minus markers) and Comments[0].Body (sanitized minus markers). If SanitizeModelContent transforms any content beyond the HTML marker itself—e.g., trims prompt-injection patterns or normalises whitespace—the model receives inconsistent context for the same comment.

Fix: derive original from the already-sanitized body variable rather than c.Body:
body := marker.SanitizeModelContent(c.Body)
if i == 0 {
    original = stripMarkers(body)   // consistent with Comments[0].Body
}

policies:conventions (1 finding)

Minor - `internal/cmd/respondcmd/respondcmd.go:81`

--no-post is registered as a visible flag whose help text explicitly says "Alias for --dry-run". Exposing two sibling flags for the same behavior leaves users without a clear canonical option. Open CLI Collective conventions favour a single authoritative flag; if --no-post must be kept for ergonomic or compatibility reasons it should be marked hidden (cmd.Flags().MarkHidden("no-post")) so only --dry-run appears in --help, with a note in respondLong that --no-post is accepted as a legacy alias. As-is, the respondLong doc also lists both names side-by-side (--dry-run (or --no-post)), which compounds the ambiguity.

structure:repo-health (1 finding)

Minor - `internal/cmd/respondcmd/respondcmd.go:348`

newAdapter in respondcmd is a full dispatch over every known LLM adapter type (ClaudeCLI, CodexCLI, PiRPC, AnthropicAPI, OpenAIAPI). If reviewcmd or any other command contains an equivalent function, adding a new adapter type requires parallel edits across all copies. The invariant for this repo is that shared LLM adapter construction belongs in a single location (likely internal/llm or a shared internal/cmd helper), not replicated per-command.

Fix: extract newAdapter to a shared package (e.g., internal/llm or internal/cmd/cmdllm) and have both reviewcmd and respondcmd import it. The existing newAdapterForRuntime var indirection already provides the test-seam, so the shared function slots in without touching test structure.

0 PR discussion threads considered. 0 summarized; 0 resolved.

Completed in 5m 28s | ~$0.35 (est.) | claude-sonnet-4-6 | cr 0.4.161

Field	Value
Model	claude-sonnet-4-6
Reviewers	documentation:docs, go:implementation-tests, policies:conventions, structure:repo-health
Engine	claude_cli · claude-sonnet-4-6
Reviewed by	cr · piekstra-dev
Duration	5m 28s wall · 5m 20s compute
Cost	~$0.35 (est.)
Tokens	21 in / 6.8k out

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost	Duration
orchestrator-selection	claude-sonnet-4-6	3	124	3.3k	4.8k	~$0.02 (est.)	26s
documentation:docs	claude-sonnet-4-6	4	1.2k	7.2k	12.4k	~$0.07 (est.)	26s
go:implementation-tests	claude-sonnet-4-6	3	124	3.3k	4.8k	~$0.02 (est.)	1m 54s
policies:conventions	claude-sonnet-4-6	3	124	3.3k	4.8k	~$0.02 (est.)	50s
structure:repo-health	claude-sonnet-4-6	4	3.9k	7.7k	23.2k	~$0.15 (est.)	1m 13s
orchestrator-rollup	claude-sonnet-4-6	4	1.3k	18.0k	13.3k	~$0.08 (est.)	29s

piekstra-dev

Automated PR review completed with outcome: approved.

- threadreply: strip cr markers before sanitizing in buildRequest so the first comment's Body and OriginalFinding derive from one consistent representation and no live/escaped marker leaks to the model - threadreply: drop dead duplicate adapter.Queue in the empty-reply test - respond: hide --no-post (keep it as a working legacy alias for --dry-run) and stop advertising both flags side-by-side in the long help - llm: extract shared NewAdapterFromConfig and have reviewcmd/respondcmd delegate to it, removing the duplicated adapter dispatch

piekstra-dev · 2026-06-17T14:32:38Z

Automated PR Review

Reviewed commit: b92a5a9ea955
Profile: reviewer - Posting as: piekstra-dev

Summary

Reviewer	Findings
documentation:docs	0
go:implementation-tests	1
policies:conventions	1
structure:repo-health	1

go:implementation-tests (1 finding)

Minor - `internal/cmd/respondcmd/respondcmd_test.go`

The deprecated --no-post alias (if flags.noPost { flags.dryRun = true } in runRespond) has no test. All dry-run behavioral coverage goes through --dry-run directly. If the alias guard were accidentally removed, callers relying on --no-post would get live posts instead of a dry-run—a silent safety regression. Add a focused test mirroring TestRespondDryRunDoesNotPost but invoking --no-post, asserting RecordedThreadReplies is empty and the output contains "Mode: dry-run".

policies:conventions (1 finding)

Minor - `internal/cmd/respondcmd/respondcmd.go:76`

--no-post is introduced already hidden and described as a "Deprecated alias for --dry-run" on a brand-new command that has never shipped. Pre-deprecating a flag before any users exist creates permanent maintenance debt with no migration benefit. The Open CLI Collective convention for new commands is to ship only the canonical flag (--dry-run) and add a compatibility alias later only if a prior published surface requires it. Remove --no-post and its MarkHidden call; if a cross-command alias is ever required, add it at that point with a real deprecation notice.

structure:repo-health (1 finding)

Minor - `internal/threadreply/responder.go:95`

stripMarkers hardcodes  as local string literals inside the threadreply package, duplicating format knowledge that is owned by the marker package. The marker package is already imported here (for marker.SanitizeModelContent and marker.FindActions), so it is the authoritative owner of the marker format. If the HTML-comment envelope ever changes, stripMarkers will silently fail to strip old markers or misidentify new ones, leaking raw marker text into model context. Fix: expose a StripMarkers(body string) string function from the marker package and replace the local implementation with a call to it, keeping all marker-format knowledge in one place.

4 PR discussion threads considered. 4 summarized; 4 resolved.

Completed in 5m 27s | ~$0.75 (est.) | claude-sonnet-4-6 | cr 0.4.161

Field	Value
Model	claude-sonnet-4-6
Reviewers	documentation:docs, go:implementation-tests, policies:conventions, structure:repo-health
Engine	claude_cli · claude-sonnet-4-6
Reviewed by	cr · piekstra-dev
Duration	5m 27s wall · 5m 19s compute
Cost	~$0.75 (est.)
Tokens	23 in / 12.9k out

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost	Duration
orchestrator-selection	claude-sonnet-4-6	4	3.0k	8.3k	38.5k	~$0.19 (est.)	57s
documentation:docs	claude-sonnet-4-6	4	1.3k	7.2k	12.4k	~$0.07 (est.)	28s
go:implementation-tests	claude-sonnet-4-6	5	5.7k	41.3k	36.0k	~$0.23 (est.)	1m 51s
policies:conventions	claude-sonnet-4-6	3	130	3.3k	4.8k	~$0.02 (est.)	42s
structure:repo-health	claude-sonnet-4-6	4	2.5k	7.2k	24.4k	~$0.13 (est.)	51s
orchestrator-rollup	claude-sonnet-4-6	3	123	3.9k	26.9k	~$0.10 (est.)	28s

piekstra-dev

Automated PR review completed with outcome: approved.

piekstra marked this pull request as ready for review June 17, 2026 13:04

piekstra requested a review from piekstra-dev June 17, 2026 13:04

piekstra-dev reviewed Jun 17, 2026

View reviewed changes

Comment thread internal/cmd/respondcmd/respondcmd.go

piekstra-dev reviewed Jun 17, 2026

View reviewed changes

Comment thread internal/threadreply/threadreply_test.go

piekstra-dev reviewed Jun 17, 2026

View reviewed changes

Comment thread internal/threadreply/responder.go Outdated

piekstra-dev reviewed Jun 17, 2026

View reviewed changes

Comment thread internal/cmd/respondcmd/respondcmd.go

piekstra-dev previously approved these changes Jun 17, 2026

View reviewed changes

piekstra requested a review from rianjs June 17, 2026 13:42

piekstra dismissed piekstra-dev’s stale review via b92a5a9 June 17, 2026 13:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(respond): conversational replies on review-comment threads#333

feat(respond): conversational replies on review-comment threads#333
piekstra wants to merge 2 commits into
mainfrom
feat/conversational-review-replies

piekstra commented Jun 17, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

piekstra-dev commented Jun 17, 2026

Minor - `internal/threadreply/responder.go:52`

Minor - `internal/cmd/respondcmd/respondcmd.go:81`

Minor - `internal/cmd/respondcmd/respondcmd.go:348`

Uh oh!

piekstra-dev left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

piekstra-dev commented Jun 17, 2026

Minor - `internal/cmd/respondcmd/respondcmd_test.go`

Minor - `internal/cmd/respondcmd/respondcmd.go:76`

Minor - `internal/threadreply/responder.go:95`

Uh oh!

piekstra-dev left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

piekstra commented Jun 17, 2026

Summary

Why it matters

Design notes

Scope / remaining work

Testing

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

piekstra-dev commented Jun 17, 2026

Automated PR Review

Summary

Minor - internal/threadreply/responder.go:52

Minor - internal/cmd/respondcmd/respondcmd.go:81

Minor - internal/cmd/respondcmd/respondcmd.go:348

Uh oh!

piekstra-dev left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

piekstra-dev commented Jun 17, 2026

Automated PR Review

Summary

Minor - internal/cmd/respondcmd/respondcmd_test.go

Minor - internal/cmd/respondcmd/respondcmd.go:76

Minor - internal/threadreply/responder.go:95

Uh oh!

piekstra-dev left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Minor - `internal/threadreply/responder.go:52`

Minor - `internal/cmd/respondcmd/respondcmd.go:81`

Minor - `internal/cmd/respondcmd/respondcmd.go:348`

Minor - `internal/cmd/respondcmd/respondcmd_test.go`

Minor - `internal/cmd/respondcmd/respondcmd.go:76`

Minor - `internal/threadreply/responder.go:95`