feat(memory): add --summary mode + volume fan-out signal to recall by gupsammy · Pull Request #77 · gupsammy/Claudest

gupsammy · 2026-06-02T07:20:51Z

What

Adds a --summary retrieval mode and a mechanical volume-based fan-out signal to the recall-conversations skill, so broad/retrospective queries stop flooding the main context.

Why

A taxonomy of 61 real historical recall invocations showed two populations:

~72% are continuation-restore / specific-lookup — small targeted answers; fan-out would only add latency.
~30% are broad-retro-synthesis / gap-antipattern / decision-extraction — high blast radius, heavy consume.

Investigating the heavy cluster surfaced the load-bearing fact: the recall scripts emit full untruncated transcripts (formatting.py renders every m.content). The precomputed context_summary column exists (94% populated, written at import time by summarizer.py) but recall never read it. So recent_chats.py --limit 20 --verbose (the run-retro recipe) was pouring ~20 full transcripts into context.

How — two tiers, gated by measured volume (not session count)

--summary (tier 1): emit the precomputed context_summary instead of full content. Single-pass, free at query time, skips the per-message query. Measured ~3x reduction (309,798 → 85,885 chars on a 20-session retro).
Fan out (tier 2): only when even summaries overflow. The scripts emit a two-tier signal via a shared cli_common.volume_flags() helper — summary_suggested (large full pull → use --summary) and fanout_suggested (large even after --summary → fan out), plus an INFO: stderr nudge in markdown mode. SKILL.md step 4 documents per-project general-purpose fan-out (shard by project, never by arbitrary count — count splits sever a decision/antipattern thread).

Triggering keys on lens type + measured output volume, never raw session count — continuation-restore and specific-lookup (the 72%) never escalate.

Changes

scripts/memory_lib/cli_common.py — --summary flag, FANOUT_SUGGEST_CHARS, volume_signal(), volume_flags() (single source of truth for both scripts' JSON meta), emit_volume_signal().
scripts/recent_chats.py / search_conversations.py — summary routing + JSON meta via volume_flags().
scripts/memory_lib/formatting.py — summary branch in format_markdown_session (early return; placeholder for empty summary).
SKILL.md — gated step-4 escalation + Agent in allowed-tools.
references/tool-reference.md — --summary rows + signal docs.

Testing

New tests/claude-memory/test_summary_mode.py — 30 tests: volume_signal boundaries (49999/50000/50001), formatter summary branch, summary-mode routing in both scripts (via the in-memory memory_db fixture), and the JSON-meta truth table called through the shared volume_flags() helper (guards production, not a copy).
Fixed 2 stale footer assertions in test_context_injection.py (the footer now emits the canonical claude-memory:recall-conversations name).
Full tests/claude-memory suite: 448 passed, 0 failed.

Notes

Version auto-bumped 0.8.104 → 0.8.105 by the auto-version pre-commit hook.
Adding Agent to allowed-tools widens the skill's tool surface; it is gated behind the rare tier-2 path so the common 72% of calls are unaffected.

recall scripts emit full transcripts, flooding context on broad/retro queries. Add a --summary flag (recent_chats + search_conversations) that emits the precomputed context_summary column instead of full message content (~3x smaller, single-pass, skips the per-message query). Add a mechanical two-tier volume signal via a shared cli_common.volume_flags() helper: summary_suggested (large full pull -> use --summary) and fanout_suggested (large even after --summary -> fan out by project), plus an INFO stderr nudge in markdown mode. The orchestrator escalates by measured output volume, not raw session count. SKILL.md gains a gated step-4 escalation (per-project general-purpose fan-out) and Agent in allowed-tools; tool-reference.md documents the flag and signals. Tests: add tests/claude-memory/test_summary_mode.py (30 tests covering volume_signal boundaries, the formatter summary branch, summary-mode routing in both scripts, and the JSON meta truth table via the shared helper). Fix 2 stale footer assertions in test_context_injection.py (footer now uses the canonical skill name).

claude · 2026-06-02T07:21:08Z

Claude finished @gupsammy's task in 5m 12s —— View job

PR Review: feat(memory): add --summary mode + volume fan-out signal to recall

Tasks

Gather context and understand the request
Fetch PR diff and read changed files
Review Python scripts against stdlib/quality criteria
Review SKILL.md and tool-reference.md changes
Review test coverage
Post comprehensive review findings

What looks good

Python rules compliance — All changed .py files open with from __future__ import annotations. No third-party imports anywhere. No RETURNING clause in SQL. open_db_or_exit retains WAL mode + busy_timeout = 5000. Dynamic IN-clause placeholders use ",".join("?" * len(...)). FTS sanitization is untouched. ✓

Single source of truth — volume_flags() and volume_signal() live in cli_common.py and both scripts import them rather than duplicating the logic. The test truth-table calls volume_flags() directly, so it guards production derivation rather than a copy. Smart design choice. ✓

Summary branch in format_markdown_session — The session.get("summary") is not None check correctly distinguishes summary mode (key present, possibly "") from full mode (key absent). The early return correctly skips the full message rendering block. The empty-summary placeholder is clear. ✓

Volume threshold is mechanical — Gated on measured character count, not session count. The "continuation-restore" and "specific-lookup" paths (the 72%) will never trigger escalation regardless of how many sessions they touch. This is the right call. ✓

Mutual exclusivity — summary_suggested and fanout_suggested are mutually exclusive by construction (is_large and not summary_mode vs is_large and summary_mode). Pinned explicitly by test_flags_are_mutually_exclusive_when_large. ✓

Version sync — plugin.json and marketplace.json both bumped to 0.8.105. README badges updated. ✓

Test coverage — 30 tests covering boundary conditions (49999/50000/50001), both modes, NULL DB values, JSON round-trip serialisability, and the full flag truth table. test_context_injection.py stale-footer fixes are correct (the hook now emits claude-memory:recall-conversations). ✓

One issue to fix before merge

b.context_summary lacks the same PRAGMA guard as b.tool_counts in recent_chats.py

The existing has_tool_counts PRAGMA check guards b.tool_counts from being selected on old DBs that predate the column. b.context_summary was added in the same _migrate_columns batch (lines 385–386 of db.py), but this PR hardcodes it into the static SQL string:

# recent_chats.py — after this PR
tool_counts_col = ", b.tool_counts" if has_tool_counts else ""
sql = f"""
    SELECT ..., b.id as branch_db_id, b.context_summary{tool_counts_col}
    ...
"""

If has_tool_counts is False (unmigrated DB from before the _migrate_columns batch), the query will fail with no such column: b.context_summary, rather than the graceful fallback the has_tool_counts pattern was designed to provide. This is only a concern for a DB that was created before both columns were introduced and has never been opened by the import hook — low probability but real.

The least-invasive fix is to gate context_summary on the same flag, since both columns are added together:

# recent_chats.py — suggested fix
context_summary_col = ", b.context_summary" if has_tool_counts else ""
tool_counts_col = ", b.tool_counts" if has_tool_counts else ""
sql = f"""
    SELECT ..., b.id as branch_db_id{context_summary_col}{tool_counts_col}
    ...
"""

Then in the unpacking:

if has_tool_counts:
    (..., branch_db_id, context_summary, tool_counts_json) = session
else:
    (..., branch_db_id) = session
    context_summary = None
    tool_counts_json = None

The same fix applies to both SQL paths in search_conversations.py (lines 76–79 and 101–104). Fix this →

Suggestions (non-blocking)

total_messages: 0 in JSON summary mode — format_json_sessions computes total_messages = sum(len(s.get("messages", [])) for s in sessions). In summary mode the messages key is absent, so this always emits 0. JSON consumers could mistake it for "session had no messages" rather than "messages weren't fetched." A null or omitting the field in summary mode would be less misleading, though it's a breaking change to the JSON schema if anything downstream consumes it. At minimum, the field could be documented in tool-reference.md.

SKILL.md step 4 verbosity — The new step runs ~200 words across two sub-bullets. The CLAUDE.md guidance is "terse and operational." The fanout guidance is useful but could drop the worked-out reasoning ("count-based splits sever a decision/antipattern thread") and keep just the actionable imperative. The prose rationale belongs in the PR description (where it already lives, well).

--summary --verbose interaction — Works correctly (verbose metadata renders before the summary block, full messages skipped). Not documented in tool-reference.md or the EXAMPLES block, so the combination may surprise users. A one-line note under the --summary row would help.

Verdict

The core design is sound and the implementation is clean. The PRAGMA guard asymmetry is the only change I'd call required before merge — everything else is polishing. 30-test suite with 448 passing makes this easy to ship once that guard is addressed.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8fdf9fe307

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-02T07:23:29Z

               b.files_modified, b.commits, s.git_branch,
               p.name as project, p.path as project_path,
-               b.id as branch_db_id{tool_counts_col}
+               b.id as branch_db_id, b.context_summary{tool_counts_col}


Guard summary column reads on older databases

For an existing conversations.db that predates the context_summary migration and has not yet been opened by the setup/sync path, even a normal recent_chats.py run without --summary now fails with no such column: b.context_summary because this SELECT always reads the new column. open_db_or_exit() only opens the DB and does not run _migrate_columns, while the previous tool_counts read stayed compatible by probing PRAGMA table_info; please either migrate before querying or gate/fallback this column read, and apply the same pattern in search_conversations.py.

Useful? React with 👍 / 👎.

Both PR reviewers (Codex P2 + Claude) flagged that the new SELECT added b.context_summary unconditionally, so a normal recall on a DB predating the context_summary migration (not yet opened by import/sync, which runs _migrate_columns) would crash with 'no such column'. The sibling tool_counts column was already PRAGMA-guarded. Probe context_summary independently via PRAGMA table_info in both scripts (more robust than coupling to has_tool_counts) and use index-based row unpacking to handle any column-presence combination. Full mode is unaffected on old DBs; summary mode falls back to ''. Adds 3 regression tests (ALTER TABLE DROP COLUMN simulates a pre-migration DB).

gupsammy · 2026-06-02T07:34:59Z

Addressed the one review finding (raised by both Codex P2 and Claude) in 2570a47.

Issue: recent_chats.py / search_conversations.py added b.context_summary to the SELECT unconditionally, so a normal recall on a DB predating the context_summary migration (not yet opened by the import/sync path, which runs _migrate_columns) would fail with no such column — even without --summary.

Fix: probe context_summary independently via PRAGMA table_info(branches) in both scripts (rather than coupling to the has_tool_counts flag, so the two columns can't drift) and use index-based row unpacking to handle any column-presence combination. Full mode is unaffected on old DBs; summary mode falls back to "". Added 3 regression tests that ALTER TABLE ... DROP COLUMN to simulate a pre-migration DB. Full suite: 451 passed.

@codex review

claude · 2026-06-02T07:35:15Z

Claude finished @gupsammy's task in 1m 40s —— View job

Claude finished reviewing PR #77

Tasks

Gather context and understand the PR
Fetch PR diff and read changed files (especially fix in 2570a47)
Verify PRAGMA guard fix in recent_chats.py and search_conversations.py
Review new regression tests for pre-migration DB simulation
Post comprehensive review findings

Fix Verification: `guard context_summary read for unmigrated DBs` (`2570a47`)

The fix correctly resolves the blocker flagged in both prior reviews. Here's the detailed breakdown:

recent_chats.py — correct and robust

The fix probes context_summary independently (rather than coupling to has_tool_counts), so the two optional columns can evolve separately without risk of divergence:

has_context_summary = "context_summary" in branch_columns  # line 70
context_summary_col = ", b.context_summary" if has_context_summary else ""  # line 75

Index-based unpacking correctly tracks column positions across all four combinations:

`has_context_summary`	`has_tool_counts`	col at tool_counts read	Result
True	True	12	session[12] → `tool_counts` ✓
False	True	11	session[11] → `tool_counts` ✓
True	False	—	`None` ✓
False	False	—	`None` ✓

search_conversations.py — correct and simpler

search_conversations.py never reads tool_counts, so there's only one optional column to handle. The pattern is clean:

(_session_id, uuid, ..., branch_db_id) = session[:9]  # line 130–131
context_summary = session[9] if has_context_summary else None  # line 132

The PRAGMA probe is placed before the FTS/LIKE branch selection, so the guard covers both code paths without duplication. ✓

Regression tests — complete

TestMissingContextSummaryColumn (3 tests) uses ALTER TABLE branches DROP COLUMN context_summary to simulate an unmigrated DB, which is the correct technique — it drops the column after seeding so the structural absence is genuine, not just a NULL value. Covers:

test_recent_full_mode_survives_missing_column — normal recall returns full messages, no crash ✓
test_recent_summary_mode_falls_back_to_empty — --summary on pre-migration DB yields "" ✓
test_search_summary_mode_survives_missing_column — LIKE search with --summary degrades gracefully ✓

FTS path isn't tested separately, but the PRAGMA probe happens before the FTS/LIKE branch, so coverage is sufficient.

Status of prior non-blocking suggestions

These three remain unaddressed from the initial review — all still non-blocking:

total_messages: 0 in JSON summary mode (formatting.py:142) — sum(len(s.get("messages", [])) ...) always yields 0 in summary mode since the messages key is absent. A null or omitting the field would be less ambiguous for JSON consumers.
SKILL.md step 4 verbosity — the two fan-out sub-bullets still run ~200 words; CLAUDE.md asks for terse operational prose. The rationale ("count-based splits sever a decision thread") belongs in the PR description rather than the skill file.
--summary --verbose interaction undocumented — the combination works correctly but isn't mentioned in tool-reference.md or the EXAMPLES block.

Verdict

The required fix is correctly implemented and thoroughly tested. The independent has_context_summary probe (rather than reusing has_tool_counts) is actually a stronger approach than the one originally suggested — it makes each column's availability explicit and prevents future coupling bugs. Suite at 451 passed, 0 failed.

This PR is ready to merge. The three non-blocking suggestions above can be addressed in a follow-up if desired.

chatgpt-codex-connector · 2026-06-02T07:38:47Z

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…summary --verbose Addresses two non-blocking review suggestions. format_json_sessions reported total_messages: 0 in summary mode (sessions carry 'summary', not 'messages') — misleading for JSON consumers; now reports total_summaries in summary mode and total_messages in full mode. tool-reference.md documents that --summary composes with --verbose. Adds 2 counter tests. Suite: 453 passed. (Left SKILL.md step-4 prose intact — rule+reasoning is deliberate per CLAUDE.md authoring principle.)

claude · 2026-06-02T09:00:05Z

Claude finished @gupsammy's task in 1m 56s —— View job

Tasks

Gather context and understand the PR
Fetch PR diff and read changed files (including latest commit c643004)
Review the new commit addressing non-blocking suggestions
Post comprehensive review findings

Review: Latest commit `c643004`

This commit addresses two of the three non-blocking suggestions from the prior review. Here's the detailed breakdown:

format_json_sessions — correct and clean

The total_messages: 0 mislead is fixed by detecting retrieval mode from the session shape rather than a separate flag, which is the right call — the format_json_sessions function has no summary_mode argument and adding one would widen its signature unnecessarily.

if any("summary" in s for s in sessions):
    output["total_summaries"] = sum(1 for s in sessions if "summary" in s)
else:
    output["total_messages"] = sum(len(s.get("messages", [])) for s in sessions)

The key-existence heuristic is reliable because "summary" is only populated by the summary-mode code path; full mode never sets this key. sessions = [] correctly falls to the else branch and emits total_messages: 0. ✓

The inline comment explains why the counter is mode-dependent — a non-obvious invariant that belongs exactly here. ✓

tool-reference.md — accurate and concise

--summary composes with --verbose: you still get the files-modified / commits / tool-counts metadata header, only the conversation body is replaced by the precomputed summary. In JSON, summary mode reports total_summaries instead of total_messages.

Verified against format_markdown_session (line 98–118 verbose block runs unconditionally, summary early-return at line 120 follows). The doc matches the code. ✓

Two new tests — sufficient

TestJsonSessionCounters covers the full/summary split cleanly. A mixed-session list (some with "summary", some without) isn't tested, but never occurs in practice — both scripts apply --summary uniformly. No gap. ✓

Version bump — 0.8.106 → 0.8.107, synced in both plugin.json and marketplace.json. ✓

Status of all prior findings

Finding	Status
PRAGMA guard asymmetry (`b.context_summary` without `has_context_summary` probe)	✅ Fixed in `2570a47` — independent probe, index-based unpacking, 3 regression tests
`total_messages: 0` in JSON summary mode	✅ Fixed in `c643004` — now emits `total_summaries` in summary mode
`--summary --verbose` interaction undocumented	✅ Fixed in `c643004` — one-liner added to tool-reference.md
SKILL.md step-4 verbosity	⚠️ Deliberately left intact (commit message: "rule+reasoning is deliberate per CLAUDE.md authoring principle") — non-blocking

The one remaining item is a judgment call the author made consciously. The fan-out sub-bullets contain operational reasoning the agent needs to choose correctly between shard-by-project and shard-by-count — it can be argued to earn its token budget. Non-blocking. ✓

Verdict

All required changes from prior reviews are correctly implemented. Suite at 453 passed, 0 failed. This PR is ready to merge.

chatgpt-codex-connector Bot reviewed Jun 2, 2026

View reviewed changes

gupsammy merged commit e99c982 into main Jun 2, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(memory): add --summary mode + volume fan-out signal to recall#77

feat(memory): add --summary mode + volume fan-out signal to recall#77
gupsammy merged 3 commits into
mainfrom
feat/recall-summary-fanout

gupsammy commented Jun 2, 2026

Uh oh!

claude Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Uh oh!

gupsammy commented Jun 2, 2026

Uh oh!

claude Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot commented Jun 2, 2026

Uh oh!

claude Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

gupsammy commented Jun 2, 2026

What

Why

How — two tiers, gated by measured volume (not session count)

Changes

Testing

Notes

Uh oh!

claude Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: feat(memory): add --summary mode + volume fan-out signal to recall

What looks good

One issue to fix before merge

Suggestions (non-blocking)

Verdict

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gupsammy commented Jun 2, 2026

Uh oh!

claude Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Tasks

Fix Verification: guard context_summary read for unmigrated DBs (2570a47)

Status of prior non-blocking suggestions

Verdict

Uh oh!

chatgpt-codex-connector Bot commented Jun 2, 2026

Uh oh!

claude Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Tasks

Review: Latest commit c643004

Status of all prior findings

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 2, 2026 •

edited

Loading

claude Bot commented Jun 2, 2026 •

edited

Loading

Fix Verification: `guard context_summary read for unmigrated DBs` (`2570a47`)

claude Bot commented Jun 2, 2026 •

edited

Loading

Review: Latest commit `c643004`