Read Cursor state.vscdb values only for matching keys by tony · Pull Request #45 · tony/agentgrep

tony · 2026-06-05T02:27:01Z

Summary

- Reads Cursor IDE state.vscdb tables in two stages: a key-only scan with SQL-side key-token filtering (LIKE … COLLATE NOCASE), then indexed value fetches for the matching keys, so large unrelated BLOBs are never materialized.
- Speeds up searches over multi-gigabyte Cursor IDE databases, where the previous full SELECT key, value scan pulled gigabytes of editor data through memory just to discard it. On real Cursor schemas the key scan rides the covering key index; index-less databases degrade to a plain scan with identical results, including duplicate-key rows.
- Adds SQL-trace tests covering the two-stage statement shapes (both tables, case-insensitive and duplicate keys) and a regression fixture seeding many large irrelevant blobs to prove value reads stay keyed to matching rows.
- Records the fix in CHANGES for the unreleased version.
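
For readers who want the shape of the two-stage read without opening the diff, here is a minimal sketch of the idea. It is not the code in this PR: the function name iter_matching_rows, the STATE_TABLES set, and the example token are assumptions made for illustration, and tokens containing % or _ would act as LIKE wildcards in this simplified form.

```python
import sqlite3
from collections.abc import Iterator, Sequence

# Table names come from the PR description; everything else is illustrative.
STATE_TABLES = frozenset({"ItemTable", "cursorDiskKV"})


def iter_matching_rows(
    conn: sqlite3.Connection,
    table: str,
    tokens: Sequence[str],
) -> Iterator[tuple[str, object]]:
    """Yield (key, value) for keys containing any token, ignoring case."""
    if table not in STATE_TABLES:
        raise ValueError(f"unexpected table: {table!r}")
    if not tokens:
        return

    # Stage 1: key-only scan. The value column is not selected, so no BLOB
    # is copied into Python for rows the filter rejects.
    where = " OR ".join("key LIKE ? COLLATE NOCASE" for _ in tokens)
    patterns = [f"%{token}%" for token in tokens]
    keys = [
        key
        for (key,) in conn.execute(f"SELECT key FROM {table} WHERE {where}", patterns)
        if key is not None
    ]

    # Stage 2: fetch values per distinct key, in first-seen scan order.
    # dict.fromkeys deduplicates while keeping insertion order; the keyed
    # query still returns every row when an index-less table repeats a key.
    for key in dict.fromkeys(keys):
        for (value,) in conn.execute(
            f"SELECT value FROM {table} WHERE key = ?", (key,)
        ):
            yield key, value
```

Calling list(iter_matching_rows(conn, "ItemTable", ["aiservice"])) on an open connection would return only rows whose key contains that token in any letter case; the token here is a placeholder, not a value taken from CURSOR_STATE_TOKENS.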

Test Plan

rm -rf docs/_build; uv run ruff check . --fix --show-fixes; uv run ruff format .; uv run ty check; uv run py.test --reruns 0 -vvv; just build-docs;

why: parse_cursor_state_db read every key/value row from ItemTable and cursorDiskKV before checking whether the key could hold chat or prompt history. On large state.vscdb databases that materializes gigabytes of irrelevant BLOB values just to discard them, dominating search time. A key-only first pass rides the covering key index, so non-matching BLOB pages are never read, and values are fetched only for keys that can hold chat or prompt history.

what:
- Split iter_key_value_rows into a key-only scan with SQL-side key-token filtering (LIKE ... COLLATE NOCASE) and per-key indexed value fetches, deduplicating keys while preserving scan order and still yielding every row for repeated keys in index-less databases.
- Pass CURSOR_STATE_TOKENS from parse_cursor_state_db and drop its Python-side key filter.
- Cover the two-stage SQL trace shapes (both tables, case-insensitive and duplicate keys) and add a regression fixture of many large irrelevant blobs proving value reads stay keyed to matching rows.
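
The SQL-trace check described above could look roughly like the following. This is a sketch under stated assumptions, not the tests added in this PR: read_matching and traced_read are hypothetical helpers, the table schema and key names are invented for the fixture, and the only library feature relied on is sqlite3's Connection.set_trace_callback, which reports each statement the connection runs.

```python
import sqlite3


def read_matching(conn: sqlite3.Connection, token: str) -> list[tuple[str, object]]:
    """Hypothetical two-stage reader: key-only scan, then keyed value fetches."""
    keys = [
        key
        for (key,) in conn.execute(
            "SELECT key FROM ItemTable WHERE key LIKE ? COLLATE NOCASE",
            (f"%{token}%",),
        )
    ]
    rows: list[tuple[str, object]] = []
    for key in dict.fromkeys(keys):  # dedupe, keep scan order
        rows.extend(
            (key, value)
            for (value,) in conn.execute(
                "SELECT value FROM ItemTable WHERE key = ?", (key,)
            )
        )
    return rows


def traced_read(token: str, irrelevant: int = 50) -> tuple[list, list[str]]:
    """Seed large irrelevant blobs, then record the SQL issued by one read."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema for the fixture: a unique key column gives SQLite an index.
    conn.execute(
        "CREATE TABLE ItemTable (key TEXT UNIQUE ON CONFLICT REPLACE, value BLOB)"
    )
    blob = b"\0" * 65536  # 64 KiB of noise per irrelevant row
    conn.executemany(
        "INSERT INTO ItemTable VALUES (?, ?)",
        [(f"workbench.noise.{i}", blob) for i in range(irrelevant)],
    )
    conn.executemany(
        "INSERT INTO ItemTable VALUES (?, ?)",
        [("aiService.prompts", b"[]"), ("AISERVICE.generations", b"[]")],
    )
    conn.commit()

    # Start tracing only after seeding so the INSERTs are not recorded.
    statements: list[str] = []
    conn.set_trace_callback(statements.append)
    rows = read_matching(conn, token)
    conn.set_trace_callback(None)
    conn.close()
    return rows, statements
```

A test built on this would assert that the trace contains one statement starting with SELECT key and exactly one SELECT value statement per matching key, regardless of how many irrelevant rows were seeded. Matching on the statement prefix keeps the assertion stable whether or not the traced text has bound parameters expanded.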

why: Record the Cursor IDE state.vscdb fix for the unreleased version so readers with multi-gigabyte Cursor databases know searches stop loading unrelated editor data.

what:
- Add a Fixes deliverable under the unreleased 0.1.0a17 section describing the key-first read of chat and prompt entries.

tony · 2026-06-05T02:41:05Z

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.

tony temporarily deployed to docs June 5, 2026 02:27 — with GitHub Actions Inactive

tony force-pushed the fix-cursor-state-two-stage branch from 7fcbb31 to 989ea18 Compare June 5, 2026 02:30

tony temporarily deployed to docs June 5, 2026 02:30 — with GitHub Actions Inactive

tony merged commit 0c90e11 into master Jun 5, 2026
3 checks passed

tony merged 2 commits into master from fix-cursor-state-two-stage
