Releases · daniloaguiarbr/sqlite-graphrag

11 Jun 19:14

daniloaguiarbr

v1.0.79

ec5a245

v1.0.79 — Fast parallel batched LLM embedding (G42), dim adoption (G43), adaptive batch (G44)

Removed

Daemon infrastructure fully removed: src/daemon.rs (1120 lines), src/commands/daemon.rs (79 lines), tests/daemon_integration.rs (316 lines) deleted. DaemonOpts struct and --autostart-daemon flag removed from all command args. All crate::daemon::embed_*_or_local calls replaced with direct crate::embedder::embed_*_local wrappers. CLI is now 100% one-shot with zero IPC. 8 daemon constants removed from src/constants.rs. Net removal: ~764 lines.
Legacy local-model features fully removed (ahead of the v1.1.0 schedule): the embedding-legacy, ner-legacy and full Cargo features are gone, together with the optional fastembed, ort, ndarray, tokenizers and hf-hub dependencies and src/extraction_gliner.rs. EmbeddingBackend is now a permanent stub returning a clear migration error; extract_graph_auto lost its GLiNER delegation path; calculate_safe_concurrency budgets heavy commands with LLM_WORKER_RSS_MB (350) instead of the obsolete 1100 MB ONNX constant (EMBEDDING_LOAD_EXPECTED_RSS_MB deleted). The CI matrix shrinks to default + llm-only. Every build is LLM-only; there is no local-model path.

Deprecated

GLiNER-era flags are formal no-ops with explicit warnings: --gliner-variant (on remember and ingest) and ingest --mode gliner now emit a tracing::warn! deprecation notice when used; --enable-ner performs URL-regex extraction only. All help strings rewritten to stop promising the removed GLiNER pipeline (model variants, sizes, thresholds); SQLITE_GRAPHRAG_GLINER_VARIANT/_MODEL/_THRESHOLD remain accepted for compatibility but have no effect.

Fixed — G42: slow, serialized, fragile LLM embedding pipeline

S1 — configurable embedding dimensionality (default 64): single source of truth in constants.rs (DEFAULT_EMBEDDING_DIM + embedding_dim()); precedence --embedding-dim flag > SQLITE_GRAPHRAG_EMBEDDING_DIM env > schema_meta.dim of the opened database > 64. Existing 384-dim databases keep working unchanged. ZERO schema change (the dim key and columns already existed). Basis: MRL, arXiv 2205.13147 — output per vector drops from ~3072 to ~512 tokens (~6x)
S2 — batched LLM calls: embed_batch_async embeds N numbered texts per call with the {items:[{i,v}]} schema; chunks batch at 8, entity names at 25 (calibration bases at dim 64; dim-adaptive since G44) — 39 subprocess spawns collapse into 4-5
S3 — real parallelism: Arc<Semaphore> + acquire_owned + JoinSet + join_next/is_panic bounded fan-out in embedder.rs; the global Mutex now guards ONLY the config clone (the old flush_group held it across 30-60s of network I/O, forcing effective parallelism 1); results stream through a BOUNDED mpsc channel (backpressure + incremental delivery); permits = min(--llm-parallelism, cpus, ram*0.5/350MB, 32); new --llm-parallelism flag on remember (default 4), ingest (default 2, multiplies with --ingest-parallelism) and edit
S4 — schema tempfile RAII: codex --output-schema files are NamedTempFiles with randomised names created once per process (no per-call write+delete, no PID-path races); the orphan reaper now also removes stale codex-home-{pid} dirs whose PID is gone
S5 — claude model env override: SQLITE_GRAPHRAG_CLAUDE_EMBED_MODEL (symmetric to the codex var); zero hardcoded models without override
S6 — empty CLAUDE_CONFIG_DIR by default on the embedding path: honours SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR, else uses a managed ~/.local/state/sqlite-graphrag/claude-empty-config (mode 0700, copies .credentials.json when present). The MCP-isolation flags are silently ignored upstream (anthropics/claude-code#10787), and a full ~/.claude costs ~223k tokens per call; with the empty directory, per-call latency drops from ~40-50s to ~10-15s
S7 — actionable codex headless error: request_user_input failures now explain the cause and remediation instead of an opaque exit 11
S8 — panic-free signal handler: first signal uses best-effort writeln! (BrokenPipe ignored); second signal exits 130 with ZERO I/O — eliminates the SIGABRT on orphaned processes (panic = "abort" + closed stderr pipe)
S9 — canonical one-shot re-embed: enrich --operation re-embed --limit N --resume documented as the official path; new edit --force-reembed regenerates an embedding without changing the body; removed the BROKEN pre-warm recipe (edit --description "<same>" never re-embedded) from MIGRATION/HOW_TO_USE docs
C5 — no silent dimension normalisation: normalise_dim (truncate/zero-pad) replaced by validate_dim, which errors on divergent vectors; the batch parser validates index coverage and per-item dimensionality
Every LLM subprocess now uses kill_on_drop(true) plus an explicit tokio::time::timeout (SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS, default 300s); a process-wide multi-thread runtime replaces the per-call current-thread runtime
New concurrency tests: peak never exceeds permits (AtomicUsize), panicking task returns its permit via RAII and surfaces is_panic, cancellation terminates the fan-out quickly, divergent dim fails the fan-out
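
The S1 precedence chain and the C5 strict dimension check can be sketched as two small pure functions. This is a minimal illustration, not the code in constants.rs or embedder.rs: the signature of `resolve_embedding_dim`, its parameter names, and the error type of `validate_dim` are assumptions made for the example.

```rust
/// Compiled default named in the notes (DEFAULT_EMBEDDING_DIM).
const DEFAULT_EMBEDDING_DIM: usize = 64;

/// Hypothetical resolver; the parameters are assumptions for illustration.
/// Precedence: --embedding-dim flag > env var > schema_meta.dim > default.
fn resolve_embedding_dim(
    flag: Option<usize>,
    env: Option<usize>,
    schema_meta_dim: Option<usize>,
) -> usize {
    flag.or(env)
        .or(schema_meta_dim)
        .unwrap_or(DEFAULT_EMBEDDING_DIM)
}

/// Hypothetical stand-in for validate_dim: a divergent vector is an error,
/// it is never truncated or zero-padded.
fn validate_dim(vector: &[f32], expected: usize) -> Result<(), String> {
    if vector.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "embedding has {} dimensions, expected {}",
            vector.len(),
            expected
        ))
    }
}
```

With no flag and no environment variable, a database that recorded dim 384 resolves to 384, which is why existing 384-dim databases keep working; a virgin database falls through to 64.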

Fixed — G43: dimensionality adoption did not cover the main commands

Dim adoption on every connection open: the G42/S1 sync (schema_meta.dim → active dim) only ran inside ensure_db_ready, which remember / edit / recall / hybrid-search never call — those commands silently used the compiled default (64) against pre-v1.0.79 384-dim databases, writing mixed-dim embeddings that cosine-score 0.0 against each other (vector recall went blind to the old corpus). open_rw AND open_ro now adopt the recorded database dim (best-effort, env override still wins); 4 regression tests cover rw/ro adoption, env precedence and virgin databases
init no longer stamps dim=384: the hardcoded INSERT OR REPLACE ... ('dim', '384') stamped NEW databases with a dim that contradicts the active default; replaced by INSERT OR IGNORE with the active dim (preserves the recorded dim on re-init of an existing database)
rename-entity no longer records dim=384 and a removed model name: the duplicated INSERT (hardcoded 384 + multilingual-e5-small) was replaced by the canonical upsert_entity_vec writer (real vector length, CLI version as model)
Test mocks speak both embedding shapes: tests/mock-llm/{claude,codex} returned a fixed 384-dim single-shape vector, so the ENTIRE slow-tests integration suite failed since G42/S1+S2 (the gate never runs on CI, hiding it); the mocks now return 64-dim vectors and answer the {items:[{i,v}]} batch schema; the 2 obsolete daemon tests became regression guards for the daemon removal; .config/nextest.toml no longer filters on the deleted daemon_integration binary — --features slow-tests integration suite back to green (69/69 on the integration binary)

Fixed — G44: embedding batch size did not scale with the dimensionality

Dim-adaptive batch size: the G42/S2 batches were FIXED (8 chunks / 25 entity names per LLM call), calibrated for the dim-64 default (~512 / ~1600 floats per response); on legacy 384-dim databases the same chunk batch asked for ~3072 floats — measured in production: claude returned 3 of 8 items (caught by the G42/C5 coverage check) and codex timed out at 300s, failing remember twice. The batch size now adapts as clamp(base×64/dim, 1, base) (embedder.rs::adaptive_batch_for_dim): dim 64 keeps 8/25, dim 384 uses 1/4 — constant float budget per call, no SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS workaround needed; 6 regression tests cover the formula and the env-dim wrappers
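
The formula clamp(base×64/dim, 1, base) can be checked with a few lines of integer arithmetic. The sketch below assumes integer division and a two-argument signature; only the formula and the function name come from the notes above, and the zero guards are an addition for the example.

```rust
/// Dimensionality the base batch sizes were calibrated for, per the notes.
const CALIBRATION_DIM: usize = 64;

/// Sketch of clamp(base * 64 / dim, 1, base) with integer division.
/// The signature is an assumption; the real adaptive_batch_for_dim may differ.
fn adaptive_batch_for_dim(base: usize, dim: usize) -> usize {
    // Guard against zero so the division and the clamp bounds stay valid.
    let base = base.max(1);
    let dim = dim.max(1);
    (base * CALIBRATION_DIM / dim).clamp(1, base)
}
```

For the chunk base of 8 this gives 8 at dim 64 and 1 at dim 384 (8×64/384 rounds down to 1); for the entity-name base of 25 it gives 25 and 4. Dimensions below 64 never raise the batch above its base because of the upper clamp.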

Full changelog: https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/CHANGELOG.md


09 Jun 15:04

daniloaguiarbr

v1.0.77

e188561

v1.0.77 — G40 fix: applied_on = NULL blocks all migrations

Critical Bugfix

v1.0.76 migrate --rehash inserted rows into refinery_schema_history without the applied_on field, leaving it NULL. The refinery-core 0.9.1 rusqlite driver reads this field as String (NOT NULL), crashing with InvalidColumnType(Null at index: 2) on any subsequent migration. All migrations were blocked (exit 20).

How to Upgrade

cargo install sqlite-graphrag --version 1.0.77 --force
sqlite-graphrag migrate

No manual SQL intervention needed — v1.0.77 automatically detects and fixes NULL rows.

Changes

Fixed

run_rehash INSERT now always includes applied_on with RFC3339 timestamp via chrono::Utc::now()
sanitize_null_applied_on helper runs UPDATE on NULL rows before any migration runner call
remove_vec_virtual_tables_without_module cleans orphan vec0 virtual tables via PRAGMA writable_schema when vec0 module is absent (LLM-only build)
debug-schema no longer crashes on databases with applied_on = NULL — field changed from String to Option<String>
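
The shape of the fix can be illustrated with an in-memory model: the timestamp column becomes optional, and a repair pass fills the gaps and reports how many rows it touched. This is only a sketch under assumptions. The real helper runs an SQL UPDATE against refinery_schema_history; the `HistoryRow` struct and the slice-based signature here are hypothetical.

```rust
/// Hypothetical in-memory mirror of a refinery_schema_history row.
#[derive(Debug, Clone, PartialEq)]
struct HistoryRow {
    version: i32,
    name: String,
    /// A NULL column maps to None instead of failing a String read.
    applied_on: Option<String>,
}

/// Fills every missing applied_on with `now` (an RFC 3339 timestamp supplied
/// by the caller) and returns the number of repaired rows, which corresponds
/// to the null_rows_fixed field in the JSON response.
fn sanitize_null_applied_on(rows: &mut [HistoryRow], now: &str) -> usize {
    let mut fixed = 0;
    for row in rows.iter_mut().filter(|r| r.applied_on.is_none()) {
        row.applied_on = Some(now.to_string());
        fixed += 1;
    }
    fixed
}
```

Running the pass a second time repairs nothing and returns 0, so it is safe to call before every migration run.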

Added

null_rows_fixed field in migrate --rehash JSON response
null_rows_fixed and vec_tables_removed_via_writable_schema fields in migrate --to-llm-only JSON response
4 unit tests + 2 integration tests covering the fix
ADR-0027 documenting the G40 root cause and resolution (EN + PT-BR)

Documentation

CHANGELOG, MIGRATION, TESTING, AGENTS, COOKBOOK, DOCUMENTATION_FRAMEWORK updated
3 JSON schemas updated (migrate-rehash, migrate-to-llm-only, debug-schema)


04 Jun 00:24

github-actions

v1.0.68

4b7d4bf

v1.0.68

[1.0.68] - 2026-06-03

Fixed

cargo install sqlite-graphrag broke on Windows with error[E0308]: mismatched types in src/terminal.rs:29 because HANDLE in windows-sys >= 0.59 is *mut c_void (was isize in 0.48/0.52). Replaced handle != 0 && handle as isize != -1 with the type-safe idiom !handle.is_null() && handle != INVALID_HANDLE_VALUE. Also pinned windows-sys to =0.59.0 exact and added CI job windows-build-check that runs cargo check --target x86_64-pc-windows-msvc on every push (G29).
enrich and ingest --mode claude-code|codex could be invoked in parallel against the same namespace and saturate the host (root cause of the 2026-06-03 276-load-average incident). Added lock::acquire_job_singleton per (job_type, namespace) and a new AppError::JobSingletonLocked { job_type, namespace } exit-75 error. A second concurrent invocation now fails fast instead of stacking 4 × N workers × 10 MCP processes (G28-B).
claude_runner::build_claude_command now respects SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR — when set to an existing empty directory, the subprocess is spawned with CLAUDE_CONFIG_DIR=<that dir>, suppressing user-scoped MCP servers and the 8-10-process fan-out they cause. We deliberately do not pass --strict-mcp-config / --mcp-config '{}' because [anthropics/claude-code#10787] documents that Claude Code CLI ignores both flags. CLAUDE_CONFIG_DIR is the only mechanism upstream actually honours (G28-A).
retry module gains a CircuitBreaker helper (with AttemptOutcome::{Success,Transient,HardFailure} and tests) that enrich --retry-failed can use to abort persistent-failure loops. Transient / rate-limited errors do NOT count toward the threshold, so a provider that recovers is not penalised (G28-D).
3 pre-existing test failures in src/commands/{history,list,read}.rs that leaked SQLITE_GRAPHRAG_DISPLAY_TZ between parallel test threads and asserted hardcoded 1970-01-01T00:00:00 strings now parse the ISO output via chrono::DateTime::parse_from_rfc3339 and compare timestamp() against DateTime::UNIX_EPOCH for timezone-agnostic assertions. The full test suite is now green on every timezone (UTC, America/Sao_Paulo, Europe/Berlin, etc.) without per-test setup of the env var.
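
The type-safe HANDLE check from the G29 fix can be reproduced without the windows-sys crate. The sketch below declares local stand-ins for `HANDLE` and `INVALID_HANDLE_VALUE` so that it compiles on any platform; the helper name `is_usable_handle` is hypothetical.

```rust
use std::ffi::c_void;

/// Local stand-ins for the windows-sys definitions. In windows-sys >= 0.59
/// HANDLE is a raw pointer, so comparing it with the integer 0 no longer
/// type-checks.
type Handle = *mut c_void;
const INVALID_HANDLE_VALUE: Handle = -1isize as Handle;

/// Hypothetical helper: a handle is usable when it is neither null nor the
/// INVALID_HANDLE_VALUE sentinel.
fn is_usable_handle(handle: Handle) -> bool {
    !handle.is_null() && handle != INVALID_HANDLE_VALUE
}
```

Because the check goes through `is_null()` and a pointer comparison, it does not depend on `HANDLE` being an integer type.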

Added

retry::CircuitBreaker (struct + record / is_open / reset) — opt-in helper for bounded retry loops. Rate-limited and timeout errors are explicitly excluded from the failure count.
lock::acquire_job_singleton(job_type, namespace, wait_seconds) — process-wide singleton for heavy commands.
constants::JOB_SINGLETON_POLL_INTERVAL_MS = 1000 — backing interval for the singleton polling loop.
errors::AppError::JobSingletonLocked { job_type, namespace } — exit 75, classified as retryable and with localised PT-BR message.
CI job windows-build-check runs cargo check --target x86_64-pc-windows-msvc --lib --all-features to catch Windows regressions before publish.
tests/terminal_compile_windows.rs — regression test that the public terminal::init_console and should_use_ansi stay callable; on Windows it also references the type-safe HANDLE check.
lock::tests — 3 unit tests covering singleton namespace sanitisation, second-invocation blocking, and per-namespace isolation.
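
A breaker with the record / is_open / reset surface described above might look like the following. This is a sketch, not the retry module itself: the fields, the constructor, and the rule that a success clears the failure count are assumptions. The part taken from the notes is that transient outcomes never count toward the threshold.

```rust
/// Outcome variants named in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttemptOutcome {
    Success,
    Transient,
    HardFailure,
}

/// Opens after `threshold` hard failures without an intervening success.
#[derive(Debug)]
struct CircuitBreaker {
    threshold: u32,
    hard_failures: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { threshold, hard_failures: 0 }
    }

    fn record(&mut self, outcome: AttemptOutcome) {
        match outcome {
            // Assumption: a success clears the streak.
            AttemptOutcome::Success => self.hard_failures = 0,
            // Rate limits and timeouts are excluded from the count.
            AttemptOutcome::Transient => {}
            AttemptOutcome::HardFailure => self.hard_failures += 1,
        }
    }

    fn is_open(&self) -> bool {
        self.hard_failures >= self.threshold
    }

    fn reset(&mut self) {
        self.hard_failures = 0;
    }
}
```

A retry loop would call `record` after each attempt and stop as soon as `is_open` returns true, so a provider that only rate-limits is retried indefinitely while a persistently failing one is abandoned.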

Changed

enrich emits a tracing::warn! (visible with -v) when llm_parallelism > 4 recommending combining with SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR to keep subprocess fan-out manageable (G28-D, non-breaking).
Cargo.toml: windows-sys pinned to =0.59.0 exact (was range 0.59).


29 May 21:16

github-actions

v1.0.66

453ec50

v1.0.66

[1.0.66] - 2026-05-29

Fixed

BUG-01 CRITICAL: reclassify-relation crash — removed updated_at = unixepoch() from 3 SQL UPDATE statements referencing non-existent column in relationships table
BUG-02 HIGH: link --create-missing now normalizes entity names to kebab-case in both storage and JSON response (created_entities array)
BUG-04 MEDIUM: deep-research word-pair decomposition for 3+ word queries without conjunctions — queries like "authentication JWT tokens" now generate multiple sub-queries
BUG-05 LOW: remember --body-file defensive UTF-8 handling — invalid byte sequences replaced with U+FFFD instead of process abort
BUG-06 HIGH: link now updates weight of existing relationships and reports actual DB weight in JSON response (previously returned requested weight while keeping old value)
HIGH-01 CRITICAL: deep-research evidence chains fixed — BFS seeds limited to top-5 memories by score, preventing seed flooding that made all entities seeds with no room for BFS expansion
HIGH-01b: deep-research --graph-min-score default lowered from 0.2 to 0.05 to avoid discarding valid results in small databases; warns when RRF fusion returns 0 despite KNN/FTS hits
HIGH-04: link --max-entity-degree warning now uses emit_progress (always visible on stderr) instead of tracing::warn (requires -v)
HIGH-08: deep-research source classification now reports hybrid when both KNN and FTS matched, instead of always knn
HIGH-12: remember and ingest now use max_relationships_per_memory() function (reads SQLITE_GRAPHRAG_MAX_RELATIONS_PER_MEMORY env var) instead of hardcoded constant; remember --graph-stdin truncates with warning instead of rejecting
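
The defensive UTF-8 handling in BUG-05 matches the behaviour of the standard library's lossy decoder, which replaces each invalid byte sequence with U+FFFD. The sketch below shows that behaviour; whether remember --body-file uses exactly this call is an assumption, and both function names are hypothetical.

```rust
use std::borrow::Cow;
use std::fs;
use std::io;
use std::path::Path;

/// Decodes bytes without failing on invalid UTF-8.
/// Valid input is borrowed as-is; invalid sequences become U+FFFD.
fn decode_body_lossy(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

/// Hypothetical file reader built on top of the lossy decoder.
fn read_body_file(path: &Path) -> io::Result<String> {
    let bytes = fs::read(path)?;
    Ok(decode_body_lossy(&bytes).into_owned())
}
```

Only I/O failures surface as errors; a file with stray bytes still yields a usable body with replacement characters marking the damage.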

Added

edit --type flag to change memory type without re-creating (HIGH-10)
deep-research --mode reserved field (none default; claude-code/codex planned for v1.1.0) (HIGH-06)
deep-research --max-cost-usd reserved field for future LLM cost tracking (HIGH-09)
deep-research graph_context field in JSON response with entities and relationships from result memories (MEDIUM-01b)
deep-research 7 tracing::debug! calls in execute_sub_query() for diagnostics with -vv (HIGH-07)
graph --format json now includes entities alias field alongside nodes for LLM agent compatibility (HIGH-05)
list --json now includes memories alias field alongside items for LLM agent compatibility (HIGH-05)
graph entities --json now includes description field per entity (HIGH-11)
health --json now includes vec_memories_missing and vec_memories_orphaned counts (MEDIUM-09)
history --diff first version now reports baseline changes: {added_chars: N, removed_chars: 0} instead of null (MEDIUM-02)
Entity type validation suggests mapping when memory types are used as entity types: reference→concept, document→file, user→person (HIGH-10c)
remember after_long_help documents positional arg limitation and entity_type vs memory_type taxonomy (HIGH-10b)
debug-schema command renamed from __debug_schema for discoverability (HIGH-03, still hidden from --help)
fuzz/ directory with cargo-fuzz targets for graph-stdin JSON and name validation (LOW-01)
mutants.toml configuration for cargo-mutants (LOW-02)
CI coverage job with 75% threshold enforcement (LOW-03)

Changed

deep-research --graph-min-score default: 0.2 → 0.05

Data Migration (recommended after upgrade)

Run reclassify-relation --from-relation applies-to --to-relation applies_to --batch --yes (and similarly for depends-on, tracked-in) to normalize legacy kebab-case relations to snake_case (HIGH-13)
Run normalize-entities --yes to merge mixed-case entity duplicates (HIGH-13)


28 May 23:02

github-actions

v1.0.65

a2db26a

v1.0.65

[1.0.65] - 2026-05-28

Added

reclassify-relation command — bulk or single reclassification of relationship types with UPDATE OR IGNORE + DELETE duplicate merging, --dry-run, --filter-source-type/--filter-target-type (GAP-13)
normalize-entities command — normalizes existing entity names to lowercase kebab-case and auto-merges near-duplicate collisions, with --dry-run/--yes (GAP-15)
enrich command — LLM-augmented graph quality via --mode claude-code|codex, scan→judge→persist pipeline, 12 operations (memory-bindings, entity-descriptions, body-enrich and more), --dry-run previews without spawning the LLM, queue DB with resume/retry (GAP-14, GAP-18)
health now reports top_relation, top_relation_ratio, applies_to_ratio, and relation_concentration_warning when one relation exceeds 40% of edges (GAP-13)
deep-research flags --rrf-k, --graph-decay, --graph-min-score, and --max-neighbors-per-hop
--max-entity-degree warning on link and remember to flag super-hub growth (GAP-17)
JSON schemas deep-research, reclassify-relation, normalize-entities, and enrich-{phase,item-event,summary}, plus contract_36..39 and schema_36..39 tests — restores 100% schema/contract coverage (GAP-01, GAP-02, GAP-03, GAP-04)
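
The relation-concentration fields that health reports can be derived from a simple frequency count. The following is a sketch under assumptions: the struct mirrors three of the JSON keys named above, the input is a plain slice of relation names rather than a database query, and the tie-break by name is an addition for determinism.

```rust
use std::collections::HashMap;

/// Warning threshold from the notes: one relation above 40% of edges.
const CONCENTRATION_THRESHOLD: f64 = 0.40;

/// Hypothetical summary; field names follow the JSON keys in the notes.
#[derive(Debug, PartialEq)]
struct RelationStats {
    top_relation: String,
    top_relation_ratio: f64,
    relation_concentration_warning: bool,
}

/// Returns None for a graph without edges.
fn relation_stats(edge_relations: &[&str]) -> Option<RelationStats> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for rel in edge_relations {
        *counts.entry(*rel).or_insert(0) += 1;
    }
    // Highest count wins; ties go to the alphabetically first name.
    let (top, count) = counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))?;
    let ratio = count as f64 / edge_relations.len() as f64;
    Some(RelationStats {
        top_relation: top.to_string(),
        top_relation_ratio: ratio,
        relation_concentration_warning: ratio > CONCENTRATION_THRESHOLD,
    })
}
```

A graph where three of five edges are applies_to yields a ratio of 0.6 and raises the warning, which is the situation the reclassify-relation command is meant to address.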

Fixed

GAP-07 CRITICAL: deep-research now computes a separate embedding per sub-query — decomposition was cosmetic because all sub-queries shared the original query embedding for KNN, returning identical results (also resolves GAP-10 centroid collapse and GAP-12 partial decomposition)
GAP-08 CRITICAL: deep-research now fuses KNN, FTS5, and graph pools via Reciprocal Rank Fusion (new shared storage::fusion) instead of assigning FTS results a hardcoded score of 0.5
GAP-11: deep-research graph-pool scoring incorporates seed score, hop decay, and edge weight, fused via RRF with a minimum-score filter
GAP-09 HIGH: deep-research evidence chains are now directed seed→target paths (from, to, path, total_weight) filtered by discovered entities, instead of a flat global dump of the top-20 relationships
GAP-15 HIGH: entity names are normalized to lowercase kebab-case on every write AND read path (find_entity_id, rename-entity, reclassify-relation, prune-ner, enrich) — validation runs on the raw name first so short ALL_CAPS NER noise is still rejected, then the normalized form is stored and looked up
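
Reciprocal Rank Fusion, which GAP-08 adopts for merging the KNN, FTS5, and graph pools, scores each item by summing 1 / (k + rank) over every list it appears in. The sketch below implements that standard formula; the function name and signature are assumptions and do not describe the API of storage::fusion.

```rust
use std::collections::HashMap;

/// Fuses ranked id lists (best first) with Reciprocal Rank Fusion.
/// Each list contributes 1 / (k + rank) per item, with rank starting at 1.
fn rrf_fuse(ranked_lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for list in ranked_lists {
        for (index, id) in list.iter().enumerate() {
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (index + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest score first; ties fall back to id order for determinism.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because only ranks enter the formula, an FTS hit no longer needs an arbitrary score such as 0.5: an item that ranks well in several pools rises above one that tops a single pool. The constant k corresponds to the --rrf-k flag; 60 is a common choice in the literature, not a value confirmed by these notes.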

Changed

GAP-17: graph traversal accepts an optional per-hop neighbor cap (top-K by weight); default behavior is unchanged
hybrid-search RRF fusion extracted into the shared storage::fusion module (no behavior change)
GAP-16: docs clarify that relations are accepted in kebab-case or snake_case and always stored and emitted as snake_case
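
The accept-either, store-snake_case rule for relations reduces to a one-line transformation. This is a minimal sketch: the hyphen-to-underscore step follows from the notes, while the trimming and lowercasing are assumptions, and the function is a stand-in rather than the project's own implementation.

```rust
/// Hypothetical relation normaliser: kebab-case or snake_case in,
/// snake_case out. Trimming and lowercasing are assumptions.
fn normalize_relation(raw: &str) -> String {
    raw.trim().to_lowercase().replace('-', "_")
}
```

Input that is already snake_case passes through unchanged, so the function can be applied on every write path without a prior format check.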


28 May 02:31

github-actions

v1.0.64

a46b03f

v1.0.64

[1.0.64] - 2026-05-28

Fixed

BUG-1 HIGH: ingest --mode claude-code now disables hooks via --settings '{"hooks":{}}' for OAuth users and detects terminal_reason: "max_turns" — prevents Stop hooks from consuming extraction turns (was failing 65% of files for users with hooks configured)
BUG-2 HIGH: ingest --mode claude-code now detects OAuth via apiKeySource from Claude Code init JSON and omits misleading cost_usd from NDJSON output — --max-cost-usd budget cap is ignored with warning for subscription users who are not billed per API call
BUG-3 HIGH: ingest --mode claude-code and --mode codex now validate body size BEFORE sending to LLM subprocess — files exceeding 512 KB body cap are skipped with actionable warning instead of wasting LLM tokens on extraction that will be discarded
rename and rename-entity now reject same-name renames with exit 1 (Validation) — prevents version inflation, unnecessary FTS5 sync, and wasted re-embedding

Added

deep-research command for parallel multi-hop GraphRAG research via heuristic query decomposition (up to 7 sub-queries), bounded fan-out with tokio::task::JoinSet and Arc<Semaphore>, 3-hop graph traversal, evidence chain assembly, and per-sub-query timeout — defaults calibrated against NovelHopQA, StepChain, HopRAG, and GraphRAG-Bench benchmarks (k=20, max-hops=3, max-sub-queries=7)


27 May 21:31

github-actions

v1.0.63

643fff0

v1.0.63

[1.0.63] - 2026-05-27

Fixed

BUG-1 HIGH: restore no longer reverts memory name to version's original — preserves current name after rename, eliminates UNIQUE constraint crash (exit 10) when old name is occupied
BUG-2 HIGH: ingest --mode claude-code and --mode codex now normalize relation strings via normalize_relation() before canonical check and DB insertion — eliminates false non-canonical relation warnings for kebab-case canonical values (depends-on → depends_on) and prevents mixed-format DB inconsistency
FINDING-1: edit now re-generates vector embedding when body changes — recall and hybrid-search return accurate similarity scores after edit (parity with restore which already re-embeds)

Added

AUTHENTICATION section in ingest --help documenting OAuth-first principle for both --mode claude-code and --mode codex
Auth failure detection: actionable tracing::warn! when Claude Code or Codex CLI authentication fails during ingest


23 May 09:55

github-actions

v1.0.62

9ea0069

v1.0.62

[1.0.62] - 2026-05-23

Fixed

G01 CRITICAL: ingest --mode claude-code now computes and persists vector embeddings — recall and hybrid-search find claude-code ingested memories (was creating memories with zero vec_memories/vec_chunks entries)
G02: validate_claude_version() now compares against MIN_CLAUDE_VERSION (2.1.0) — rejects incompatible Claude Code versions with actionable error
G03: env_clear() whitelist for claude -p subprocess now includes Windows-critical variables (LOCALAPPDATA, APPDATA, USERPROFILE, SystemRoot, COMSPEC, PATHEXT) via #[cfg(windows)]
G04: skipped counter in claude-code ingest summary now counts pre-existing done entries in queue DB instead of always reporting 0
G05: files exceeding 10MB stdin limit are rejected with specific error before spawning claude -p, preventing wasted API credits
G06: memory names from Claude extraction are normalized via derive_kebab_name() — prevents non-kebab-case names from entering the database
G07: invalid entity names from Claude extraction now emit tracing::warn! instead of being silently discarded
G08: claude-code queue database (.ingest-queue.sqlite) now uses WAL journal mode for crash resilience
G09: WAL checkpoint runs after claude-code ingest processing loop completes
G10: EXTRACTION_SCHEMA now includes additionalProperties: false at root, entity, and relationship levels — compatible with both Claude Code and Codex structured output
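
The minimum-version gate in G02 comes down to a numeric comparison of version components. The sketch below assumes the version is the first whitespace-separated token of the CLI output and has exactly the form MAJOR.MINOR.PATCH; the real output format and the body of validate_claude_version() are not given in these notes.

```rust
/// Minimum supported Claude Code version, per the notes.
const MIN_CLAUDE_VERSION: (u32, u32, u32) = (2, 1, 0);

/// Parses "MAJOR.MINOR.PATCH" from the first whitespace-separated token.
/// Pre-release suffixes are not handled in this sketch.
fn parse_version(text: &str) -> Option<(u32, u32, u32)> {
    let token = text.split_whitespace().next()?;
    let mut parts = token.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    Some((major, minor, patch))
}

/// Hypothetical validator returning an actionable message on failure.
fn validate_claude_version(text: &str) -> Result<(), String> {
    match parse_version(text) {
        // Tuples compare lexicographically: major, then minor, then patch.
        Some(found) if found >= MIN_CLAUDE_VERSION => Ok(()),
        Some((major, minor, patch)) => Err(format!(
            "Claude Code {}.{}.{} is older than the required {}.{}.{}; please upgrade",
            major, minor, patch,
            MIN_CLAUDE_VERSION.0, MIN_CLAUDE_VERSION.1, MIN_CLAUDE_VERSION.2
        )),
        None => Err(format!("could not parse a version from {:?}", text)),
    }
}
```

Comparing parsed integers rather than strings matters here: 2.10.0 is newer than 2.1.0 numerically, even though it sorts before 2.2.0 as text.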

Added

ingest --mode codex for LLM-curated entity/relationship extraction via locally installed OpenAI Codex CLI (codex exec --json)
New ingest flags: --codex-binary, --codex-model, --codex-timeout for Codex CLI configuration
IngestMode::Codex variant — users can choose between --mode claude-code (Anthropic) and --mode codex (OpenAI) per ingest
JSONL parser for Codex CLI output with "last agent_message wins" pattern (verified against Paperclip production adapter)
Token usage tracking for Codex ingest (input_tokens, output_tokens) — cost_usd unavailable from Codex CLI
Full embedding pipeline for Codex-ingested memories (chunking, vec_memories, vec_chunks, vec_entities)
7 unit tests for Codex JSONL parser and schema validation


23 May 04:50

github-actions

v1.0.61

c155fbd

v1.0.61

[1.0.61] - 2026-05-23

Fixed

B00 CRITICAL: ingest --mode claude-code now uses --dangerously-skip-permissions instead of --bare — fixes OAuth authentication failure for Pro/Max subscription users
B00a: --max-turns increased from 1 to 3 — Claude needs >1 turn for structured extraction
B07a: memory source field changed from "claude-code" to "agent" — fixes CHECK constraint violation on insert
B01: --resume flag now resets stuck processing files to pending for re-processing
B02: --retry-failed flag now resets failed files to pending for retry
B03: --dry-run now works with --mode claude-code — emits preview events without spawning Claude
B04: subprocess timeout via wait-timeout crate — kills claude -p after --claude-timeout seconds (default 300)
B05: error messages from claude -p now parsed from stdout JSON instead of empty stderr
B06: re-ingesting same directory updates existing memories instead of UNIQUE constraint failure
B07: cold-start --json-schema failure automatically retried once (workaround for Claude Code Issue #23265)
B08: claude -p subprocess now runs with env_clear() + selective environment injection (security hardening)
B10: fallback parsing of result field when structured_output absent (workaround for Claude Code Issue #18536)
B11: FileEvent index field now uses consistent 0-based indexing across success and failure paths
B12: invalid entity_type from Claude now emits tracing::warn! instead of silent discard
B13: non-canonical relationship types now validated via warn_if_non_canonical() before insertion
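
The queue transitions behind B01 and B02 can be modelled as a small state reset. The sketch below works on an in-memory slice instead of the .ingest-queue.sqlite database, and the `FileStatus` enum and `requeue` function are hypothetical; only the status names and the two transitions come from the notes.

```rust
/// Per-file status values mentioned in the notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileStatus {
    Pending,
    Processing,
    Done,
    Failed,
}

/// Resets stuck or failed entries to Pending and returns how many changed.
/// `resume` covers files left in Processing by an interrupted run;
/// `retry_failed` covers files that ended in Failed.
fn requeue(statuses: &mut [FileStatus], resume: bool, retry_failed: bool) -> usize {
    let mut reset = 0;
    for status in statuses.iter_mut() {
        let should_reset = match *status {
            FileStatus::Processing => resume,
            FileStatus::Failed => retry_failed,
            FileStatus::Pending | FileStatus::Done => false,
        };
        if should_reset {
            *status = FileStatus::Pending;
            reset += 1;
        }
    }
    reset
}
```

Entries that are already Done are never touched, so resuming a partially completed ingest does not repeat finished work.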

Added

--claude-timeout flag for ingest --mode claude-code (default: 300 seconds per file)

Changed

ingest --mode claude-code uses --bare when ANTHROPIC_API_KEY is set (faster startup, no plugins), --dangerously-skip-permissions for OAuth users


23 May 01:46

github-actions

v1.0.60

cc935a7

v1.0.60

[1.0.60] - 2026-05-23

Added

ingest --mode claude-code for LLM-curated entity/relationship extraction via locally installed Claude Code CLI (claude -p headless with --json-schema)
New ingest flags: --mode, --claude-binary, --claude-model, --resume, --retry-failed, --keep-queue, --queue-db, --rate-limit-wait, --max-cost-usd
IngestMode enum: none (default body-only), gliner (NER), claude-code (LLM-curated)
Queue DB (.ingest-queue.sqlite) for resumable claude-code ingestion with per-file tracking
memory-entities-reverse.schema.json for --entity reverse lookup response validation
contract_33b_memory_entities_reverse and schema_33b_memory_entities_reverse tests
delete-entity and merge-entities recipes in COOKBOOK.md (EN/PT)
cleanup-orphans and prune-relations entries in INTEGRATIONS.md (EN/PT)
Ingest modes documentation in llms.txt, llms-full.txt, llms.pt-BR.txt, AGENTS.md, SKILL.md (EN/PT)

Fixed

D1: test_exit_01_validation_invalid_name — changed "x" to "___" (1-char names are valid memory names)
D2-D3: i18n bilingual tests — changed "---" to "___" ("---" is a Clap flag separator, not a value)
D4: test_ingest_fail_fast_aborts_on_first_error — use unreadable files (chmod 000) instead of /proc path; filter error envelope in NDJSON; #[cfg(unix)]
D5: prd_name_double_underscore_rejected — changed "---" to "___"
D6: init_creates_11_migrations_v001_to_v011 — fixed vec literal from [1..9] to [1..11] matching actual 11 migrations
D7: readme_en_bash_examples_all_run — added #[cfg_attr(windows, ignore)] for bash-only tests

