Releases: daniloaguiarbr/sqlite-graphrag
v1.0.79 — Fast parallel batched LLM embedding (G42), dim adoption (G43), adaptive batch (G44)
Removed
- Daemon infrastructure fully removed: src/daemon.rs (1120 lines), src/commands/daemon.rs (79 lines) and tests/daemon_integration.rs (316 lines) deleted. The DaemonOpts struct and the --autostart-daemon flag are removed from all command args. All crate::daemon::embed_*_or_local calls are replaced with direct crate::embedder::embed_*_local wrappers. The CLI is now 100% one-shot with zero IPC. 8 daemon constants removed from src/constants.rs. Net removal: ~764 lines.
- Legacy local-model features fully removed (ahead of the v1.1.0 schedule): the embedding-legacy, ner-legacy and full Cargo features are gone, together with the optional fastembed, ort, ndarray, tokenizers and hf-hub dependencies and src/extraction_gliner.rs. EmbeddingBackend is now a permanent stub returning a clear migration error; extract_graph_auto lost its GLiNER delegation path; calculate_safe_concurrency budgets heavy commands with LLM_WORKER_RSS_MB (350) instead of the obsolete 1100 MB ONNX constant (EMBEDDING_LOAD_EXPECTED_RSS_MB deleted). The CI matrix shrinks to default + llm-only. Every build is LLM-only; there is no local-model path.
Deprecated
- GLiNER-era flags are formal no-ops with explicit warnings: --gliner-variant (on remember and ingest) and ingest --mode gliner now emit a tracing::warn! deprecation notice when used; --enable-ner performs URL-regex extraction only. All help strings were rewritten to stop promising the removed GLiNER pipeline (model variants, sizes, thresholds); SQLITE_GRAPHRAG_GLINER_VARIANT / _MODEL / _THRESHOLD remain accepted for compatibility but have no effect.
Fixed — G42: slow, serialized, fragile LLM embedding pipeline
- S1 - configurable embedding dimensionality (default 64): single source of truth in constants.rs (DEFAULT_EMBEDDING_DIM + embedding_dim()); precedence is --embedding-dim flag > SQLITE_GRAPHRAG_EMBEDDING_DIM env > schema_meta.dim of the opened database > 64. Existing 384-dim databases keep working unchanged. ZERO schema change (the dim key and columns already existed). Basis: MRL, arXiv 2205.13147; output per vector drops from ~3072 to ~512 tokens (~6x).
- S2 - batched LLM calls: embed_batch_async embeds N numbered texts per call with the {items:[{i,v}]} schema; chunks batch at 8, entity names at 25 (calibration bases at dim 64; dim-adaptive since G44); 39 subprocess spawns collapse into 4-5.
- S3 - real parallelism: Arc<Semaphore> + acquire_owned + JoinSet + join_next / is_panic bounded fan-out in embedder.rs; the global Mutex now guards ONLY the config clone (the old flush_group held it across 30-60s of network I/O, forcing effective parallelism 1); results stream through a BOUNDED mpsc channel (backpressure + incremental delivery); permits = min(--llm-parallelism, cpus, ram*0.5/350MB, 32); new --llm-parallelism flag on remember (default 4), ingest (default 2, multiplies with --ingest-parallelism) and edit.
- S4 - schema tempfile RAII: codex --output-schema files are NamedTempFiles with randomised names created once per process (no per-call write+delete, no PID-path races); the orphan reaper now also removes stale codex-home-{pid} dirs whose PID is gone.
- S5 - claude model env override: SQLITE_GRAPHRAG_CLAUDE_EMBED_MODEL (symmetric to the codex var); zero hardcoded models without override.
- S6 - empty CLAUDE_CONFIG_DIR by default on the embedding path: honours SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR, else uses a managed ~/.local/state/sqlite-graphrag/claude-empty-config (mode 0700, copies .credentials.json when present); the MCP-isolation flags are silently ignored upstream (anthropics/claude-code#10787) and a full ~/.claude cost ~223k tokens per call (~40-50s → ~10-15s).
- S7 - actionable codex headless error: request_user_input failures now explain the cause and remediation instead of an opaque exit 11.
- S8 - panic-free signal handler: the first signal uses best-effort writeln! (BrokenPipe ignored); the second signal exits 130 with ZERO I/O, which eliminates the SIGABRT on orphaned processes (panic = "abort" + closed stderr pipe).
- S9 - canonical one-shot re-embed: enrich --operation re-embed --limit N --resume documented as the official path; new edit --force-reembed regenerates an embedding without changing the body; removed the BROKEN pre-warm recipe (edit --description "<same>" never re-embedded) from MIGRATION/HOW_TO_USE docs.
- C5 - no silent dimension normalisation: normalise_dim (truncate/zero-pad) replaced by validate_dim, which errors on divergent vectors; the batch parser validates index coverage and per-item dimensionality.
- Every LLM subprocess now uses kill_on_drop(true) plus an explicit tokio::time::timeout (SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS, default 300s); a process-wide multi-thread runtime replaces the per-call current-thread runtime.
- New concurrency tests: peak never exceeds permits (AtomicUsize), panicking task returns its permit via RAII and surfaces is_panic, cancellation terminates the fan-out quickly, divergent dim fails the fan-out.
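
The S1 precedence chain and the S3 permit formula can be sketched in a few lines of Rust. This is an illustration only: resolve_embedding_dim and compute_permits are hypothetical helper names, their signatures are assumptions, and the floor of one permit is an assumption that the notes do not state. The constants 64, 350 and 32 come from the notes above.

```rust
/// Compiled-in default dimensionality, per S1.
const DEFAULT_EMBEDDING_DIM: usize = 64;
/// RSS budget per LLM worker in MB, per the release notes.
const LLM_WORKER_RSS_MB: u64 = 350;
/// Hard ceiling on permits, per S3.
const MAX_PERMITS: usize = 32;

/// Hypothetical helper: the first source that is present wins, in S1 order
/// (flag > env > dim recorded in the database > compiled default).
fn resolve_embedding_dim(flag: Option<usize>, env: Option<usize>, db: Option<usize>) -> usize {
    flag.or(env).or(db).unwrap_or(DEFAULT_EMBEDDING_DIM)
}

/// Hypothetical helper: permits = min(requested, cpus, ram*0.5/350MB, 32).
/// The floor of 1 is an assumption so that a small host can still make progress.
fn compute_permits(requested: usize, cpus: usize, ram_mb: u64) -> usize {
    let by_ram = (ram_mb / 2 / LLM_WORKER_RSS_MB) as usize;
    requested.min(cpus).min(by_ram).min(MAX_PERMITS).max(1)
}
```

With these assumptions, a 16 GB host with 8 CPUs and --llm-parallelism 4 gets 4 permits, because the RAM term (8192 / 350 = 23) is not the binding limit.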
Fixed — G43: dimensionality adoption did not cover the main commands
- Dim adoption on every connection open: the G42/S1 sync (schema_meta.dim → active dim) only ran inside ensure_db_ready, which remember / edit / recall / hybrid-search never call. Those commands silently used the compiled default (64) against pre-v1.0.79 384-dim databases, writing mixed-dim embeddings that cosine-score 0.0 against each other (vector recall went blind to the old corpus). open_rw AND open_ro now adopt the recorded database dim (best-effort, env override still wins); 4 regression tests cover rw/ro adoption, env precedence and virgin databases.
- init no longer stamps dim=384: the hardcoded INSERT OR REPLACE ... ('dim', '384') stamped NEW databases with a dim that contradicts the active default; replaced by INSERT OR IGNORE with the active dim (preserves the recorded dim on re-init of an existing database).
- rename-entity no longer records dim=384 and a removed model name: the duplicated INSERT (hardcoded 384 + multilingual-e5-small) was replaced by the canonical upsert_entity_vec writer (real vector length, CLI version as model).
- Test mocks speak both embedding shapes: tests/mock-llm/{claude,codex} returned a fixed 384-dim single-shape vector, so the ENTIRE slow-tests integration suite failed since G42/S1+S2 (the gate never runs on CI, hiding it); the mocks now return 64-dim vectors and answer the {items:[{i,v}]} batch schema; the 2 obsolete daemon tests became regression guards for the daemon removal; .config/nextest.toml no longer filters on the deleted daemon_integration binary. The --features slow-tests integration suite is back to green (69/69 on the integration binary).
Fixed — G44: embedding batch size did not scale with the dimensionality
- Dim-adaptive batch size: the G42/S2 batches were FIXED (8 chunks / 25 entity names per LLM call), calibrated for the dim-64 default (~512 / ~1600 floats per response); on legacy 384-dim databases the same chunk batch asked for ~3072 floats. Measured in production: claude returned 3 of 8 items (caught by the G42/C5 coverage check) and codex timed out at 300s, failing remember twice. The batch size now adapts as clamp(base×64/dim, 1, base) (embedder.rs::adaptive_batch_for_dim): dim 64 keeps 8/25, dim 384 uses 1/4. This gives a constant float budget per call, with no SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS workaround needed; 6 regression tests cover the formula and the env-dim wrappers.
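
The formula clamp(base×64/dim, 1, base) is easy to check by hand with a small sketch. The function below reuses the name adaptive_batch_for_dim from the notes, but its signature, the use of integer (floor) division and the zero guards are assumptions made for illustration.

```rust
/// Sketch of clamp(base * 64 / dim, 1, base) with integer division.
/// The signature and the zero guards are assumptions, not the crate's actual code.
fn adaptive_batch_for_dim(base: usize, dim: usize) -> usize {
    // Guard against a zero base or dim so the division and clamp are well defined.
    let base = base.max(1);
    let dim = dim.max(1);
    (base * 64 / dim).clamp(1, base)
}
```

Under floor division this reproduces the figures quoted above: bases 8 and 25 stay at 8 and 25 for dim 64, and become 1 and 4 for dim 384 (8×64/384 = 1.33, 25×64/384 = 4.17).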
Full changelog: https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/CHANGELOG.md
v1.0.77 — G40 fix: applied_on = NULL blocks all migrations
Critical Bugfix
v1.0.76 migrate --rehash inserted rows into refinery_schema_history without the applied_on field, leaving it NULL. The refinery-core 0.9.1 rusqlite driver reads this field as String (NOT NULL), crashing with InvalidColumnType(Null at index: 2) on any subsequent migration. All migrations were blocked (exit 20).
How to Upgrade
cargo install sqlite-graphrag --version 1.0.77 --force
sqlite-graphrag migrate
No manual SQL intervention needed: v1.0.77 automatically detects and fixes NULL rows.
Changes
Fixed
- run_rehash INSERT now always includes applied_on with an RFC3339 timestamp via chrono::Utc::now()
- sanitize_null_applied_on helper runs UPDATE on NULL rows before any migration runner call
- remove_vec_virtual_tables_without_module cleans orphan vec0 virtual tables via PRAGMA writable_schema when the vec0 module is absent (LLM-only build)
- debug-schema no longer crashes on databases with applied_on = NULL; the field changed from String to Option<String>
Added
- null_rows_fixed field in migrate --rehash JSON response
- null_rows_fixed and vec_tables_removed_via_writable_schema fields in migrate --to-llm-only JSON response
- 4 unit tests + 2 integration tests covering the fix
- ADR-0027 documenting the G40 root cause and resolution (EN + PT-BR)
Documentation
- CHANGELOG, MIGRATION, TESTING, AGENTS, COOKBOOK, DOCUMENTATION_FRAMEWORK updated
- 3 JSON schemas updated (migrate-rehash, migrate-to-llm-only, debug-schema)
v1.0.68
[1.0.68] - 2026-06-03
Fixed
- cargo install sqlite-graphrag broke on Windows with error[E0308]: mismatched types in src/terminal.rs:29 because HANDLE in windows-sys >= 0.59 is *mut c_void (was isize in 0.48/0.52). Replaced handle != 0 && handle as isize != -1 with the type-safe idiom !handle.is_null() && handle != INVALID_HANDLE_VALUE. Also pinned windows-sys to =0.59.0 exact and added CI job windows-build-check that runs cargo check --target x86_64-pc-windows-msvc on every push (G29).
- enrich and ingest --mode claude-code|codex could be invoked in parallel against the same namespace and saturate the host (root cause of the 2026-06-03 276-load-average incident). Added lock::acquire_job_singleton per (job_type, namespace) and a new AppError::JobSingletonLocked { job_type, namespace } exit-75 error. A second concurrent invocation now fails fast instead of stacking 4 × N workers × 10 MCP processes (G28-B).
- claude_runner::build_claude_command now respects SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR: when set to an existing empty directory, the subprocess is spawned with CLAUDE_CONFIG_DIR=<that dir>, suppressing user-scoped MCP servers and the 8-10-process fan-out they cause. We deliberately do not pass --strict-mcp-config / --mcp-config '{}' because anthropics/claude-code#10787 documents that Claude Code CLI ignores both flags. CLAUDE_CONFIG_DIR is the only mechanism upstream actually honours (G28-A).
- The retry module gains a CircuitBreaker helper (with AttemptOutcome::{Success,Transient,HardFailure} and tests) that enrich --retry-failed can use to abort persistent-failure loops. Transient / rate-limited errors do NOT count toward the threshold, so a provider that recovers is not penalised (G28-D).
- Fixed 3 pre-existing test failures in src/commands/{history,list,read}.rs. The tests leaked SQLITE_GRAPHRAG_DISPLAY_TZ between parallel test threads and asserted hardcoded 1970-01-01T00:00:00 strings; they now parse the ISO output via chrono::DateTime::parse_from_rfc3339 and compare timestamp() against DateTime::UNIX_EPOCH for timezone-agnostic assertions. The full test suite is now green on every timezone (UTC, America/Sao_Paulo, Europe/Berlin, etc.) without per-test setup of the env var.
Added
- retry::CircuitBreaker (struct + record / is_open / reset): opt-in helper for bounded retry loops. Rate-limited and timeout errors are explicitly excluded from the failure count.
- lock::acquire_job_singleton(job_type, namespace, wait_seconds): process-wide singleton for heavy commands.
- constants::JOB_SINGLETON_POLL_INTERVAL_MS = 1000: backing interval for the singleton polling loop.
- errors::AppError::JobSingletonLocked { job_type, namespace }: exit 75, classified as retryable and with localised PT-BR message.
- CI job windows-build-check runs cargo check --target x86_64-pc-windows-msvc --lib --all-features to catch Windows regressions before publish.
- tests/terminal_compile_windows.rs: regression test that the public terminal::init_console and should_use_ansi stay callable; on Windows it also references the type-safe HANDLE check.
- lock::tests: 3 unit tests covering singleton namespace sanitisation, second-invocation blocking, and per-namespace isolation.
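
To make the CircuitBreaker behaviour concrete, here is a minimal sketch of a breaker whose threshold counts only hard failures. The names CircuitBreaker, AttemptOutcome, record, is_open and reset come from the notes; the fields, the constructor and the choice to clear the count on success are assumptions, and the real retry module may differ.

```rust
/// Outcome of one attempt, mirroring the variants named in the notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttemptOutcome {
    Success,
    Transient,
    HardFailure,
}

/// Sketch of a breaker that opens after `threshold` hard failures.
/// Field names and the constructor are assumptions for illustration.
struct CircuitBreaker {
    threshold: u32,
    hard_failures: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { threshold, hard_failures: 0 }
    }

    fn record(&mut self, outcome: AttemptOutcome) {
        match outcome {
            // Assumption: a success clears the failure streak.
            AttemptOutcome::Success => self.hard_failures = 0,
            // Transient (rate-limit, timeout) errors never count toward the threshold.
            AttemptOutcome::Transient => {}
            AttemptOutcome::HardFailure => {
                self.hard_failures = self.hard_failures.saturating_add(1)
            }
        }
    }

    fn is_open(&self) -> bool {
        self.hard_failures >= self.threshold
    }

    fn reset(&mut self) {
        self.hard_failures = 0;
    }
}
```

A retry loop would call record after each attempt and stop as soon as is_open returns true, so any number of transient errors leaves the breaker closed.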
Changed
- enrich emits a tracing::warn! (visible with -v) when llm_parallelism > 4, recommending combining with SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR to keep subprocess fan-out manageable (G28-D, non-breaking).
- Cargo.toml: windows-sys pinned to =0.59.0 exact (was range 0.59).
v1.0.66
[1.0.66] - 2026-05-29
Fixed
- BUG-01 CRITICAL: reclassify-relation crash fixed; removed updated_at = unixepoch() from 3 SQL UPDATE statements referencing a non-existent column in the relationships table
- BUG-02 HIGH: link --create-missing now normalizes entity names to kebab-case in both storage and JSON response (created_entities array)
- BUG-04 MEDIUM: deep-research word-pair decomposition for 3+ word queries without conjunctions; queries like "authentication JWT tokens" now generate multiple sub-queries
- BUG-05 LOW: remember --body-file defensive UTF-8 handling; invalid byte sequences are replaced with U+FFFD instead of aborting the process
- BUG-06 HIGH: link now updates the weight of existing relationships and reports the actual DB weight in the JSON response (previously returned the requested weight while keeping the old value)
- HIGH-01 CRITICAL: deep-research evidence chains fixed; BFS seeds limited to top-5 memories by score, preventing seed flooding that made all entities seeds with no room for BFS expansion
- HIGH-01b: deep-research --graph-min-score default lowered from 0.2 to 0.05 to avoid discarding valid results in small databases; warns when RRF fusion returns 0 despite KNN/FTS hits
- HIGH-04: link --max-entity-degree warning now uses emit_progress (always visible on stderr) instead of tracing::warn (requires -v)
- HIGH-08: deep-research source classification now reports hybrid when both KNN and FTS matched, instead of always knn
- HIGH-12: remember and ingest now use the max_relationships_per_memory() function (reads the SQLITE_GRAPHRAG_MAX_RELATIONS_PER_MEMORY env var) instead of a hardcoded constant; remember --graph-stdin truncates with a warning instead of rejecting
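
The BUG-05 behaviour (invalid bytes become U+FFFD instead of aborting) matches what the Rust standard library's lossy decoder does. The sketch below shows that technique; decode_body_lossy is a hypothetical name, and whether the crate uses this exact call is an assumption.

```rust
/// Hypothetical helper: decode raw file bytes, replacing each invalid UTF-8
/// sequence with U+FFFD (the replacement character) instead of failing.
fn decode_body_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Valid input round-trips unchanged, so a caller can compare the result with the input length or scan for U+FFFD when it wants to warn about replaced bytes.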
Added
- edit --type flag to change memory type without re-creating (HIGH-10)
- deep-research --mode reserved field (none default; claude-code / codex planned for v1.1.0) (HIGH-06)
- deep-research --max-cost-usd reserved field for future LLM cost tracking (HIGH-09)
- deep-research graph_context field in JSON response with entities and relationships from result memories (MEDIUM-01b)
- deep-research: 7 tracing::debug! calls in execute_sub_query() for diagnostics with -vv (HIGH-07)
- graph --format json now includes entities alias field alongside nodes for LLM agent compatibility (HIGH-05)
- list --json now includes memories alias field alongside items for LLM agent compatibility (HIGH-05)
- graph entities --json now includes description field per entity (HIGH-11)
- health --json now includes vec_memories_missing and vec_memories_orphaned counts (MEDIUM-09)
- history --diff first version now reports baseline changes: {added_chars: N, removed_chars: 0} instead of null (MEDIUM-02)
- Entity type validation suggests mapping when memory types are used as entity types: reference→concept, document→file, user→person (HIGH-10c)
- remember after_long_help documents positional arg limitation and entity_type vs memory_type taxonomy (HIGH-10b)
- debug-schema command renamed from __debug_schema for discoverability (HIGH-03, still hidden from --help)
- fuzz/ directory with cargo-fuzz targets for graph-stdin JSON and name validation (LOW-01)
- mutants.toml configuration for cargo-mutants (LOW-02)
- CI coverage job with 75% threshold enforcement (LOW-03)
Changed
- deep-research --graph-min-score default: 0.2 → 0.05
Data Migration (recommended after upgrade)
- Run reclassify-relation --from-relation applies-to --to-relation applies_to --batch --yes (and similarly for depends-on, tracked-in) to normalize legacy kebab-case relations to snake_case (HIGH-13)
- Run normalize-entities --yes to merge mixed-case entity duplicates (HIGH-13)
v1.0.65
[1.0.65] - 2026-05-28
Added
- reclassify-relation command: bulk or single reclassification of relationship types with UPDATE OR IGNORE + DELETE duplicate merging, --dry-run, --filter-source-type / --filter-target-type (GAP-13)
- normalize-entities command: normalizes existing entity names to lowercase kebab-case and auto-merges near-duplicate collisions, with --dry-run / --yes (GAP-15)
- enrich command: LLM-augmented graph quality via --mode claude-code|codex, scan→judge→persist pipeline, 12 operations (memory-bindings, entity-descriptions, body-enrich and more), --dry-run previews without spawning the LLM, queue DB with resume/retry (GAP-14, GAP-18)
- health now reports top_relation, top_relation_ratio, applies_to_ratio, and relation_concentration_warning when one relation exceeds 40% of edges (GAP-13)
- deep-research flags --rrf-k, --graph-decay, --graph-min-score, and --max-neighbors-per-hop
- --max-entity-degree warning on link and remember to flag super-hub growth (GAP-17)
- JSON schemas deep-research, reclassify-relation, normalize-entities, and enrich-{phase,item-event,summary}, plus contract_36..39 and schema_36..39 tests; restores 100% schema/contract coverage (GAP-01, GAP-02, GAP-03, GAP-04)
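
As a rough illustration of what "lowercase kebab-case" means for entity names, the sketch below lowercases alphanumeric characters and collapses every run of other characters into a single hyphen. The function name to_kebab_case and these exact rules are assumptions; the crate's own normalizer may treat edge cases (camelCase, Unicode, digits) differently.

```rust
/// Hypothetical normalizer: lowercase alphanumerics, collapse any run of
/// other characters into one '-', and drop leading/trailing separators.
/// It does not split camelCase words.
fn to_kebab_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut pending_dash = false;
    for ch in name.chars() {
        if ch.is_alphanumeric() {
            // Emit a separator only between two alphanumeric runs.
            if pending_dash && !out.is_empty() {
                out.push('-');
            }
            pending_dash = false;
            out.extend(ch.to_lowercase());
        } else {
            pending_dash = true;
        }
    }
    out
}
```

Under these rules "Auth Service" and "auth_service" both map to "auth-service", which is the kind of collision that a merge step would then have to resolve.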
Fixed
- GAP-07 CRITICAL: deep-research now computes a separate embedding per sub-query; decomposition was cosmetic because all sub-queries shared the original query embedding for KNN, returning identical results (also resolves GAP-10 centroid collapse and GAP-12 partial decomposition)
- GAP-08 CRITICAL: deep-research now fuses KNN, FTS5, and graph pools via Reciprocal Rank Fusion (new shared storage::fusion) instead of assigning FTS results a hardcoded score of 0.5
- GAP-11: deep-research graph-pool scoring incorporates seed score, hop decay, and edge weight, fused via RRF with a minimum-score filter
- GAP-09 HIGH: deep-research evidence chains are now directed seed→target paths (from, to, path, total_weight) filtered by discovered entities, instead of a flat global dump of the top-20 relationships
- GAP-15 HIGH: entity names are normalized to lowercase kebab-case on every write AND read path (find_entity_id, rename-entity, reclassify-relation, prune-ner, enrich); validation runs on the raw name first so short ALL_CAPS NER noise is still rejected, then the normalized form is stored and looked up
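
Reciprocal Rank Fusion, as referenced in GAP-08, scores each item by summing 1/(k + rank) over every ranked list in which it appears. The sketch below is a generic version of that formula; rrf_fuse is a hypothetical name, the tie-break by id is an assumption, and it says nothing about how storage::fusion is actually written.

```rust
use std::collections::HashMap;

/// Generic RRF sketch: each ranking lists ids best-first; an item at 1-based
/// rank r in a list contributes 1 / (k + r). Returns ids sorted by fused score,
/// highest first, with ties broken by id (an assumption for determinism).
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap()
            .then_with(|| a.0.cmp(&b.0))
    });
    fused
}
```

For example, with k = 60 (a value commonly used for RRF, assumed here rather than taken from the notes), an item that appears near the top of the KNN, FTS and graph lists outranks an item that tops only one of them, with no need to compare raw scores across pools.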
Changed
- GAP-17: graph traversal accepts an optional per-hop neighbor cap (top-K by weight); default behavior is unchanged
- hybrid-search RRF fusion extracted into the shared storage::fusion module (no behavior change)
- GAP-16: docs clarify that relations are accepted in kebab-case or snake_case and always stored and emitted as snake_case
v1.0.64
[1.0.64] - 2026-05-28
Fixed
- BUG-1 HIGH: ingest --mode claude-code now disables hooks via --settings '{"hooks":{}}' for OAuth users and detects terminal_reason: "max_turns"; this prevents Stop hooks from consuming extraction turns (was failing 65% of files for users with hooks configured)
- BUG-2 HIGH: ingest --mode claude-code now detects OAuth via apiKeySource from Claude Code init JSON and omits the misleading cost_usd from NDJSON output; the --max-cost-usd budget cap is ignored with a warning for subscription users who are not billed per API call
- BUG-3 HIGH: ingest --mode claude-code and --mode codex now validate body size BEFORE sending to the LLM subprocess; files exceeding the 512 KB body cap are skipped with an actionable warning instead of wasting LLM tokens on extraction that will be discarded
- rename and rename-entity now reject same-name renames with exit 1 (Validation); this prevents version inflation, unnecessary FTS5 sync, and wasted re-embedding
Added
- deep-research command for parallel multi-hop GraphRAG research via heuristic query decomposition (up to 7 sub-queries), bounded fan-out with tokio::task::JoinSet and Arc<Semaphore>, 3-hop graph traversal, evidence chain assembly, and per-sub-query timeout; defaults calibrated against NovelHopQA, StepChain, HopRAG, and GraphRAG-Bench benchmarks (k=20, max-hops=3, max-sub-queries=7)
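
The bounded fan-out idea can be shown without an async runtime. The sketch below deliberately swaps tokio's JoinSet and Semaphore for plain std threads and a small Mutex + Condvar counting semaphore, so it is an analogue of the pattern and not the crate's code; Semaphore, Permit and run_bounded are hypothetical names. It also records the peak number of concurrently running tasks with an AtomicUsize.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal counting semaphore (std-only stand-in for tokio's Semaphore).
struct Semaphore {
    free: Mutex<usize>,
    cv: Condvar,
}

/// RAII permit: returns its slot to the semaphore when dropped.
struct Permit {
    sem: Arc<Semaphore>,
}

impl Semaphore {
    fn new(permits: usize) -> Arc<Self> {
        Arc::new(Self { free: Mutex::new(permits), cv: Condvar::new() })
    }

    fn acquire(self: &Arc<Self>) -> Permit {
        let mut free = self.free.lock().unwrap();
        // Block until at least one slot is available.
        while *free == 0 {
            free = self.cv.wait(free).unwrap();
        }
        *free -= 1;
        Permit { sem: Arc::clone(self) }
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        *self.sem.free.lock().unwrap() += 1;
        self.sem.cv.notify_one();
    }
}

/// Runs `tasks` simulated jobs with at most `permits` in flight; returns the observed peak.
fn run_bounded(tasks: usize, permits: usize) -> usize {
    let sem = Semaphore::new(permits);
    let active = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let (sem, active, peak) = (Arc::clone(&sem), Arc::clone(&active), Arc::clone(&peak));
            thread::spawn(move || {
                let _permit = sem.acquire(); // held until the end of the closure
                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // stand-in for one sub-query
                active.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Because the active counter is decremented before the permit is dropped, the recorded peak can never exceed the permit count, which is the invariant a "peak never exceeds permits" test would assert.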
v1.0.63
[1.0.63] - 2026-05-27
Fixed
- BUG-1 HIGH: restore no longer reverts the memory name to the version's original; it preserves the current name after rename, eliminating the UNIQUE constraint crash (exit 10) when the old name is occupied
- BUG-2 HIGH: ingest --mode claude-code and --mode codex now normalize relation strings via normalize_relation() before the canonical check and DB insertion; this eliminates false non-canonical relation warnings for kebab-case canonical values (depends-on → depends_on) and prevents mixed-format DB inconsistency
- FINDING-1: edit now re-generates the vector embedding when the body changes; recall and hybrid-search return accurate similarity scores after edit (parity with restore, which already re-embeds)
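
The depends-on → depends_on mapping in BUG-2 can be pictured with a one-line sketch. The name normalize_relation comes from the notes, but this body (trim, lowercase, hyphen to underscore) is only an assumption about its behaviour.

```rust
/// Assumed behaviour: trim, lowercase, and map kebab-case to snake_case.
fn normalize_relation(raw: &str) -> String {
    raw.trim().to_lowercase().replace('-', "_")
}
```

Running the canonical check on the normalized form is what lets both depends-on and depends_on pass without a warning while only one spelling reaches the database.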
Added
- AUTHENTICATION section in ingest --help documenting the OAuth-first principle for both --mode claude-code and --mode codex
- Auth failure detection: actionable tracing::warn! when Claude Code or Codex CLI authentication fails during ingest
v1.0.62
[1.0.62] - 2026-05-23
Fixed
- G01 CRITICAL: ingest --mode claude-code now computes and persists vector embeddings; recall and hybrid-search find claude-code ingested memories (was creating memories with zero vec_memories / vec_chunks entries)
- G02: validate_claude_version() now compares against MIN_CLAUDE_VERSION (2.1.0) and rejects incompatible Claude Code versions with an actionable error
- G03: env_clear() whitelist for the claude -p subprocess now includes Windows-critical variables (LOCALAPPDATA, APPDATA, USERPROFILE, SystemRoot, COMSPEC, PATHEXT) via #[cfg(windows)]
- G04: skipped counter in the claude-code ingest summary now counts pre-existing done entries in the queue DB instead of always reporting 0
- G05: files exceeding the 10MB stdin limit are rejected with a specific error before spawning claude -p, preventing wasted API credits
- G06: memory names from Claude extraction are normalized via derive_kebab_name(), which prevents non-kebab-case names from entering the database
- G07: invalid entity names from Claude extraction now emit tracing::warn! instead of being silently discarded
- G08: claude-code queue database (.ingest-queue.sqlite) now uses WAL journal mode for crash resilience
- G09: WAL checkpoint runs after the claude-code ingest processing loop completes
- G10: EXTRACTION_SCHEMA now includes additionalProperties: false at root, entity, and relationship levels; compatible with both Claude Code and Codex structured output
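
A minimum-version gate like G02 is usually a numeric comparison of (major, minor, patch) tuples. The sketch below assumes the CLI reports a string whose first digit-leading token is a plain x.y.z version; that output format, the parse_version helper and the error strings are all assumptions, and only the names validate_claude_version and MIN_CLAUDE_VERSION (2.1.0) come from the notes.

```rust
/// Minimum supported version, per G02.
const MIN_CLAUDE_VERSION: (u64, u64, u64) = (2, 1, 0);

/// Hypothetical parser: finds the first token starting with a digit and reads
/// it as major.minor.patch. Pre-release suffixes are not handled and yield None.
fn parse_version(text: &str) -> Option<(u64, u64, u64)> {
    let token = text
        .split_whitespace()
        .find(|t| t.starts_with(|c: char| c.is_ascii_digit()))?;
    let mut parts = token.split('.').map(|p| p.parse::<u64>().ok());
    Some((parts.next()??, parts.next()??, parts.next()??))
}

/// Sketch of the gate: tuples compare numerically, so 2.10.0 is newer than 2.1.0.
fn validate_claude_version(text: &str) -> Result<(), String> {
    match parse_version(text) {
        Some(v) if v >= MIN_CLAUDE_VERSION => Ok(()),
        Some((a, b, c)) => Err(format!(
            "version {a}.{b}.{c} is older than the required 2.1.0; please upgrade"
        )),
        None => Err(format!("could not parse a version from {text:?}")),
    }
}
```

Comparing tuples instead of strings avoids the classic mistake where "2.10.0" sorts before "2.9.0" lexicographically.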
Added
- ingest --mode codex for LLM-curated entity/relationship extraction via locally installed OpenAI Codex CLI (codex exec --json)
- New ingest flags: --codex-binary, --codex-model, --codex-timeout for Codex CLI configuration
- IngestMode::Codex variant: users can choose between --mode claude-code (Anthropic) and --mode codex (OpenAI) per ingest
- JSONL parser for Codex CLI output with "last agent_message wins" pattern (verified against Paperclip production adapter)
- Token usage tracking for Codex ingest (input_tokens, output_tokens); cost_usd unavailable from Codex CLI
- Full embedding pipeline for Codex-ingested memories (chunking, vec_memories, vec_chunks, vec_entities)
- 7 unit tests for Codex JSONL parser and schema validation
v1.0.61
[1.0.61] - 2026-05-23
Fixed
- B00 CRITICAL: ingest --mode claude-code now uses --dangerously-skip-permissions instead of --bare; fixes OAuth authentication failure for Pro/Max subscription users
- B00a: --max-turns increased from 1 to 3; Claude needs >1 turn for structured extraction
- B07a: memory source field changed from "claude-code" to "agent"; fixes CHECK constraint violation on insert
- B01: --resume flag now resets stuck processing files to pending for re-processing
- B02: --retry-failed flag now resets failed files to pending for retry
- B03: --dry-run now works with --mode claude-code; emits preview events without spawning Claude
- B04: subprocess timeout via the wait-timeout crate; kills claude -p after --claude-timeout seconds (default 300)
- B05: error messages from claude -p are now parsed from stdout JSON instead of empty stderr
- B06: re-ingesting the same directory updates existing memories instead of failing with a UNIQUE constraint error
- B07: cold-start --json-schema failure automatically retried once (workaround for Claude Code Issue #23265)
- B08: claude -p subprocess now runs with env_clear() + selective environment injection (security hardening)
- B10: fallback parsing of the result field when structured_output is absent (workaround for Claude Code Issue #18536)
- B11: FileEvent index field now uses consistent 0-based indexing across success and failure paths
- B12: invalid entity_type from Claude now emits tracing::warn! instead of silent discard
- B13: non-canonical relationship types now validated via warn_if_non_canonical() before insertion
Added
- --claude-timeout flag for ingest --mode claude-code (default: 300 seconds per file)
Changed
- ingest --mode claude-code uses --bare when ANTHROPIC_API_KEY is set (faster startup, no plugins), --dangerously-skip-permissions for OAuth users
v1.0.60
[1.0.60] - 2026-05-23
Added
- ingest --mode claude-code for LLM-curated entity/relationship extraction via locally installed Claude Code CLI (claude -p headless with --json-schema)
- New ingest flags: --mode, --claude-binary, --claude-model, --resume, --retry-failed, --keep-queue, --queue-db, --rate-limit-wait, --max-cost-usd
- IngestMode enum: none (default body-only), gliner (NER), claude-code (LLM-curated)
- Queue DB (.ingest-queue.sqlite) for resumable claude-code ingestion with per-file tracking
- memory-entities-reverse.schema.json for --entity reverse lookup response validation
- contract_33b_memory_entities_reverse and schema_33b_memory_entities_reverse tests
- delete-entity and merge-entities recipes in COOKBOOK.md (EN/PT)
- cleanup-orphans and prune-relations entries in INTEGRATIONS.md (EN/PT)
- Ingest modes documentation in llms.txt, llms-full.txt, llms.pt-BR.txt, AGENTS.md, SKILL.md (EN/PT)
Fixed
- D1: test_exit_01_validation_invalid_name: changed "x" to "___" (1-char names are valid memory names)
- D2-D3: i18n bilingual tests: changed "---" to "___" ("---" is a Clap flag separator, not a value)
- D4: test_ingest_fail_fast_aborts_on_first_error: use unreadable files (chmod 000) instead of a /proc path; filter error envelope in NDJSON; #[cfg(unix)]
- D5: prd_name_double_underscore_rejected: changed "---" to "___"
- D6: init_creates_11_migrations_v001_to_v011: fixed vec literal from [1..9] to [1..11], matching the actual 11 migrations
- D7: readme_en_bash_examples_all_run: added #[cfg_attr(windows, ignore)] for bash-only tests