chore(release): v1.0.68 — Process Lifecycle Governance and Windows Compile Fix #1

Merged
daniloaguiarbr merged 5 commits into main from release/v1.0.68 on Jun 4, 2026

@daniloaguiarbr
Owner

Summary

This release ships the following CRITICAL/HIGH fixes documented in gaps.md, plus a set of test fixes:

  • G28-A — MCP server isolation via SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR. The 2026-06-03 276-load-average incident was traced to 4 concurrent enrich invocations, each running 2 workers that loaded about 10 user-scoped MCP servers, which stacked up to ~192 processes. The original mitigation plan (--strict-mcp-config and --mcp-config '{}') was invalidated by anthropics/claude-code#10787 ("[BUG] Claude CLI Ignores --mcp-config and --strict-mcp-config Flags"): Claude Code CLI ignores both flags. CLAUDE_CONFIG_DIR is the only mechanism upstream actually honours.
  • G28-B — Per-namespace job singleton via lock::acquire_job_singleton(JobType, namespace, wait_seconds) integrated into enrich, ingest --mode claude-code, and ingest --mode codex. A second concurrent invocation fails fast with AppError::JobSingletonLocked (exit 75, retryable).
  • G28-D — retry::CircuitBreaker with AttemptOutcome::{Success, Transient, HardFailure} classification. Rate-limited and timeout errors are explicitly excluded from the failure count.
  • G29 — cargo install sqlite-graphrag on Windows now succeeds. v1.0.66 and v1.0.67 broke with error[E0308]: mismatched types in src/terminal.rs:29 because HANDLE in windows-sys >= 0.59 is *mut c_void (was isize in 0.48/0.52). windows-sys is pinned to =0.59.0 exact.
  • Test Fixes — 3 pre-existing test failures in src/commands/{history,list,read}.rs that leaked SQLITE_GRAPHRAG_DISPLAY_TZ between parallel test threads. Tests now parse ISO via chrono::DateTime::parse_from_rfc3339 and compare timestamp() against DateTime::UNIX_EPOCH for timezone-agnostic assertions.

Commit Breakdown

The release is organised into 5 conventional commits (no Co-authored-by trailer, no AI-agent signatures):

  1. fix(terminal): use HANDLE.is_null() + INVALID_HANDLE_VALUE for Windows compile (G29) — 6 files, +116/-16
  2. feat(claude): isolate MCP servers via SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR (G28-A) — 2 files, +41/-0
  3. feat(lock+retry): per-namespace job singleton + CircuitBreaker (G28-B + G28-D) — 8 files, +377/-8
  4. ci(workflows): add windows-build-check job to catch Windows regressions — 1 file, +16/-0
  5. chore(release): v1.0.68 with ADRs, GitHub templates, and full documentation audit — 38 files, +1687/-5707

Validation (all 8 gates pass)

  • cargo fmt --all --check — clean
  • cargo clippy --all-targets --all-features -- -D warnings — 0 warnings
  • cargo test --all-features — 692 passed, 0 failed, 3 ignored (pre-existing)
  • RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features — 0 warnings
  • cargo audit — 2 allowed warnings (transitive ravif/image/fastembed RUSTSEC-2024-0436 and tokenizers RUSTSEC-2025-0119; both are upstream issues with no fix available)
  • cargo deny check advisories licenses bans sources — all ok
  • cargo publish --dry-run --allow-dirty — Packaged 250 files, 3.4MiB (904.0KiB compressed), no errors
  • cargo package --list — no .env, no .pem, no .key, no credentials

New Files

  • docs/decisions/adr-008-process-lifecycle-singleton.md (G28-B)
  • docs/decisions/adr-009-windows-sys-handle-pinning.md (G29)
  • docs/decisions/adr-010-mcp-isolation-claude-config-dir.md (G28-A)
  • .github/ISSUE_TEMPLATE/bug_report.md
  • .github/ISSUE_TEMPLATE/feature_request.md
  • .github/ISSUE_TEMPLATE/config.yml (forces template selection, 4 contact links)
  • .github/PULL_REQUEST_TEMPLATE.md (5 checklists: Validation, Documentation, Commit Hygiene, Test Coverage, Risk Assessment)
  • tests/terminal_compile_windows.rs (2 regression tests)

Documentation

24 markdown files updated across EN + pt-BR. All 12 bilingual pairs now carry explicit cross-references. The DOCUMENTATION_FRAMEWORK.md checklist is marked 100% complete. Three historical gaps (README cross-ref, INTEGRATIONS cross-ref, GitHub templates) are marked as STATUS LEGADO (legacy status).

Risk Assessment

  • G29 is a required fix for Windows users: v1.0.66 and v1.0.67 fail to compile on Windows, so v1.0.68 is the first version those users can install or upgrade to. The fix is minimal and type-safe; the =0.59.0 pin prevents future cargo-update surprises.
  • G28-B is backward compatible for users who never invoked 2+ heavy commands concurrently. The new exit-75 JobSingletonLocked is a clear actionable error with the existing --wait-job-singleton affordance.
  • G28-A is opt-in via the new env var. Default behaviour unchanged.
  • G28-D is non-breaking: enrich's parallelism warning is a tracing::warn! at -v verbosity; no behaviour change for default verbosity.
  • Test Fixes have no production impact: only changes how test assertions are written.

Post-merge

Maintainer will:

  1. Tag the merge commit as v1.0.68
  2. git push origin v1.0.68 to trigger release.yml for cross-platform binaries
  3. cargo publish --allow-dirty to publish to crates.io

Checklist

  • Validation: cargo fmt, clippy, test, doc — all pass
  • No Co-authored-by: Claude/Codex/GPT/Copilot trailers
  • All 12 bilingual pairs have cross-references
  • New ADRs explain each architectural decision
  • GitHub templates formalize contribution flow
  • No secrets, no .env, no credentials in the tarball
  • MSRV 1.88 unchanged
  • windows-sys pinned to =0.59.0 exact

…s compile (G29)

cargo install sqlite-graphrag broke on Windows with error[E0308]: mismatched
types in src/terminal.rs:29 because HANDLE in windows-sys >= 0.59 is
*mut c_void (was isize in 0.48/0.52). The comparison handle != 0 && handle
as isize != -1 only worked for the old isize representation.

Replaced with the type-safe idiom !handle.is_null() && handle !=
INVALID_HANDLE_VALUE and imported HANDLE and INVALID_HANDLE_VALUE from
windows_sys::Win32::Foundation. Pinned windows-sys to =0.59.0 exact in
Cargo.toml to lock the type contract.
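
A minimal sketch of the validity check described above, written so it compiles on any platform. The `Handle` alias and the local `INVALID_HANDLE_VALUE` constant are stand-ins that mimic the pointer-shaped definitions the commit message attributes to windows-sys 0.59. The function name `is_usable_handle` is hypothetical and not taken from src/terminal.rs.

```rust
use std::ffi::c_void;

/// Local stand-in for windows_sys::Win32::Foundation::HANDLE,
/// assuming the pointer representation (*mut c_void) described above.
type Handle = *mut c_void;

/// Local stand-in for INVALID_HANDLE_VALUE: the all-ones pointer (-1).
const INVALID_HANDLE_VALUE: Handle = -1isize as Handle;

/// Hypothetical helper: true when the handle is neither null nor the
/// invalid sentinel. With a pointer-typed handle, `handle != 0` no longer
/// type-checks, which is the E0308 failure the fix addresses.
pub fn is_usable_handle(handle: Handle) -> bool {
    !handle.is_null() && handle != INVALID_HANDLE_VALUE
}
```

Because the check goes through `is_null()` and a typed constant rather than integer literals, it expresses the intent without assuming that the handle is an integer.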

Fixes the 2026-06-03 cargo install failure reported on v1.0.66 and
v1.0.67. v1.0.68 is the first release since v1.0.65 that compiles on
Windows.

Also fixes 3 pre-existing test failures in src/commands/{history,list,read}.rs
that leaked SQLITE_GRAPHRAG_DISPLAY_TZ between parallel test threads and
asserted hardcoded 1970-01-01T00:00:00 strings. Tests now parse the ISO
output via chrono::DateTime::parse_from_rfc3339 and compare timestamp()
against DateTime::UNIX_EPOCH for timezone-agnostic assertions.

Adds tests/terminal_compile_windows.rs that runs on every platform to
confirm terminal::init_console and should_use_ansi stay callable from
outside the crate. The dedicated CI job windows-build-check (separate
commit) runs the full cross-platform type check via
cargo check --target x86_64-pc-windows-msvc.
…NFIG_DIR (G28-A)

The 2026-06-03 276-load-average incident had two multiplication axes:
--llm-parallelism 2 spawning 2 concurrent claude -p subprocesses per
enrich invocation, each loading 8-10 user-scoped MCP servers. Combined
with 2 sibling enrich invocations, this produced 4 processes x 10
servers = 40+ MCP subprocesses per enrich batch.

The original mitigation plan was to pass --strict-mcp-config or
--mcp-config '{}' to suppress the user-scoped MCP fleet. Both flags
are silently ignored by Claude Code CLI per anthropics/claude-code
#10787 (confirmed via DuckDuckGo search and issue thread).

Replaced with the only mechanism upstream Claude Code actually honours:
the CLAUDE_CONFIG_DIR environment variable. When
SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR is set to an existing empty
directory, the subprocess is spawned with CLAUDE_CONFIG_DIR=<that dir>,
masking the user's MCP servers. If the path is missing or not a
directory, a single tracing::warn! is emitted and the subprocess
continues without the override (degraded but non-failing).

The CLI never auto-creates the directory (user owns the lifecycle) and
never deletes it.
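
The override logic described above can be sketched roughly as follows. This is an illustration under assumptions, not the crate's actual code: the function names `resolve_override`, `build_claude_command`, and `command_from_environment` are hypothetical, and `eprintln!` stands in for the single `tracing::warn!` so the example needs only the standard library.

```rust
use std::env;
use std::path::PathBuf;
use std::process::Command;

/// Env var named in the commit message.
const OVERRIDE_VAR: &str = "SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR";

/// Hypothetical helper: accept the override only when the path exists
/// and is a directory. The directory is never created or deleted here.
pub fn resolve_override(raw: Option<String>) -> Option<PathBuf> {
    let path = PathBuf::from(raw?);
    if path.is_dir() {
        Some(path)
    } else {
        // Stand-in for the single tracing::warn! mentioned above.
        eprintln!(
            "warning: {} points to {}, which is not a directory; continuing without override",
            OVERRIDE_VAR,
            path.display()
        );
        None
    }
}

/// Hypothetical helper: build the `claude -p` subprocess command and set
/// CLAUDE_CONFIG_DIR only when a valid override directory was supplied.
pub fn build_claude_command(raw: Option<String>) -> Command {
    let mut cmd = Command::new("claude");
    cmd.arg("-p");
    if let Some(dir) = resolve_override(raw) {
        cmd.env("CLAUDE_CONFIG_DIR", dir);
    }
    cmd
}

/// Reads the override from the process environment.
pub fn command_from_environment() -> Command {
    build_claude_command(env::var(OVERRIDE_VAR).ok())
}
```

Passing the raw value in as a parameter keeps the decision testable without mutating the process environment, which matters given the parallel-test env leak mentioned elsewhere in this release.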

Reduces subprocess fan-out from ~192 to ~8 per enrich invocation when
combined with --llm-parallelism 2. The PT-BR warning string is added
to src/i18n.rs for localized diagnostic output.
… + G28-D)

The 2026-06-03 process-proliferation incident revealed that the
shared-process semaphore (max 4 slots across all CLI commands) allowed
4 concurrent enrich invocations on the same database to stack
4 x N workers x 10 MCP servers = ~192 processes. Two layers of
mitigation land in this commit.

G28-B adds a per-namespace job singleton via
lock::acquire_job_singleton(JobType, namespace, wait_seconds) that
runs before any work in enrich, ingest --mode claude-code, and
ingest --mode codex. A second concurrent invocation against the
same (job_type, namespace) tuple fails fast with
AppError::JobSingletonLocked { job_type, namespace } (exit 75,
classified as retryable). The lock file is stored under
~/.local/share/sqlite-graphrag/job-singleton-{tag}-{namespace}.lock
and polled every JOB_SINGLETON_POLL_INTERVAL_MS=1000.

Three unit tests cover the new lock behaviour:
- job_singleton_path_sanitises_namespace
- job_singleton_blocks_second_invocation_same_namespace
- job_singleton_allows_different_namespaces
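
To make the singleton behaviour concrete, here is a small std-only sketch. It is an assumption-heavy illustration: the commit message does not say how the lock is held, so this version uses exclusive file creation (`create_new`) rather than whatever mechanism `lock::acquire_job_singleton` really uses. The names `JobLock`, `sanitise`, `lock_path`, and `acquire` are hypothetical.

```rust
use std::fs::{self, OpenOptions};
use std::io::ErrorKind;
use std::path::{Path, PathBuf};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Mirrors JOB_SINGLETON_POLL_INTERVAL_MS=1000 from the commit message.
const POLL_INTERVAL_MS: u64 = 1000;

/// Guard that removes the lock file when dropped.
pub struct JobLock {
    path: PathBuf,
}

impl Drop for JobLock {
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.path);
    }
}

/// Replace characters that are unsafe in a file name with '_'.
pub fn sanitise(namespace: &str) -> String {
    namespace
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' || c == '-' { c } else { '_' })
        .collect()
}

/// Build job-singleton-{tag}-{namespace}.lock under `dir`.
pub fn lock_path(dir: &Path, tag: &str, namespace: &str) -> PathBuf {
    dir.join(format!("job-singleton-{}-{}.lock", tag, sanitise(namespace)))
}

/// Try to take the lock, polling until `wait_seconds` has elapsed.
/// Returns None if the lock stays held (or on any other I/O error).
pub fn acquire(dir: &Path, tag: &str, namespace: &str, wait_seconds: u64) -> Option<JobLock> {
    let path = lock_path(dir, tag, namespace);
    let deadline = Instant::now() + Duration::from_secs(wait_seconds);
    loop {
        match OpenOptions::new().write(true).create_new(true).open(&path) {
            Ok(_) => return Some(JobLock { path }),
            Err(e) if e.kind() == ErrorKind::AlreadyExists => {
                if Instant::now() >= deadline {
                    return None; // fail fast; the caller maps this to exit 75
                }
                sleep(Duration::from_millis(POLL_INTERVAL_MS));
            }
            Err(_) => return None,
        }
    }
}
```

With `wait_seconds = 0` the second caller returns immediately, matching the fail-fast behaviour described above. One caveat of this simplified approach is that a crashed process leaves a stale lock file behind. An OS-level advisory lock would avoid that.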

G28-D adds retry::CircuitBreaker with AttemptOutcome::{Success,
Transient, HardFailure} classification. Rate-limited and timeout
errors are explicitly excluded from the failure count via
AttemptOutcome::Transient, so a provider that recovers is not
penalised. After threshold consecutive HardFailure hits,
record() returns true and the caller should abort.

Three unit tests in src/retry.rs::circuit_breaker_tests:
- opens_after_threshold_consecutive_hard_failures
- ignores_transient_errors
- success_resets_consecutive_failures
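
The counting rules above can be sketched in a few lines. The type and variant names come from the commit message, but the fields, the `new` constructor, and the exact semantics are assumptions. In particular, the sketch assumes that Transient leaves the consecutive-failure counter unchanged rather than resetting it.

```rust
/// Classification of a single attempt, as named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttemptOutcome {
    Success,
    Transient,
    HardFailure,
}

/// Sketch of a consecutive-failure circuit breaker (field names assumed).
pub struct CircuitBreaker {
    threshold: u32,
    consecutive_failures: u32,
}

impl CircuitBreaker {
    pub fn new(threshold: u32) -> Self {
        Self { threshold, consecutive_failures: 0 }
    }

    /// Record an outcome. Returns true once `threshold` consecutive
    /// HardFailure outcomes have been seen, meaning the caller should abort.
    pub fn record(&mut self, outcome: AttemptOutcome) -> bool {
        match outcome {
            // A success clears the streak.
            AttemptOutcome::Success => self.consecutive_failures = 0,
            // Rate limits and timeouts are not counted (assumed: no reset either).
            AttemptOutcome::Transient => {}
            AttemptOutcome::HardFailure => self.consecutive_failures += 1,
        }
        self.consecutive_failures >= self.threshold
    }
}
```

A caller would typically classify each provider error into one of the three outcomes and stop issuing requests as soon as `record` returns true.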

enrich emits a tracing::warn! (visible with -v) when --llm-parallelism
exceeds 4, recommending the SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR
combination from G28-A to keep subprocess fan-out manageable.

The error-envelope message template for code 75 now has two distinct
shapes (job singleton vs slot saturation), both routed to the same
exit code. Agents parse job_type and namespace from the message field
via a regex like job '(\w+)'.*namespace '(\w+)'. See
docs/schemas/error-envelope.schema.json for the contract.
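
For agents that prefer not to pull in a regex engine, the extraction can be approximated with plain string searches. This swaps the regex for manual parsing, so it is an approximation of `job '(\w+)'.*namespace '(\w+)'` rather than an exact equivalent. The helper names and the sample message in the usage note are hypothetical; the real message template is defined by the schema file referenced above.

```rust
/// Find `key '<token>'` in `text` and return the token plus the remainder
/// of the text after the closing quote. The token must look like \w+.
fn take_quoted<'a>(text: &'a str, key: &str) -> Option<(&'a str, &'a str)> {
    let marker = format!("{key} '");
    let start = text.find(&marker)? + marker.len();
    let rest = &text[start..];
    let end = rest.find('\'')?;
    let token = &rest[..end];
    let is_word = !token.is_empty()
        && token.chars().all(|c| c.is_alphanumeric() || c == '_');
    if is_word {
        Some((token, &rest[end + 1..]))
    } else {
        None
    }
}

/// Extract (job_type, namespace) from a code-75 job-singleton message.
/// Returns None for messages that do not name both fields in that order,
/// which is how the slot-saturation shape would be told apart here.
pub fn parse_job_singleton(message: &str) -> Option<(&str, &str)> {
    let (job_type, rest) = take_quoted(message, "job")?;
    let (namespace, _) = take_quoted(rest, "namespace")?;
    Some((job_type, namespace))
}
```

For an illustrative message such as `job 'enrich' already running for namespace 'default'`, the function yields `("enrich", "default")`. Like the regex, it will not match a namespace containing characters outside the word class, such as a hyphen.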

ci(workflows): add windows-build-check job to catch Windows regressions

The clippy and test jobs run on ubuntu-latest, macos-latest, and
windows-latest, but neither runs
cargo check --target x86_64-pc-windows-msvc. This allowed the
v1.0.66/v1.0.67 HANDLE type mismatch in src/terminal.rs:29 to escape CI
entirely until a real Windows user reported the cargo install failure.

Adds a dedicated windows-build-check job that runs on ubuntu-latest
with the windows-msvc target installed via rustup target add. The job
runs cargo check --target x86_64-pc-windows-msvc --lib --all-features
which is type-check only and does not need the MSVC linker. The job
completes in under 5 minutes on the standard GitHub runner pool.

If a bump of a Windows FFI crate (windows-sys, winapi, windows, or any
transitive FFI dep) breaks the Windows type check, this job fails before
the offending change reaches crates.io. It pairs with the new
tests/terminal_compile_windows.rs regression test, which runs on every
platform and acts as the local pre-publish sanity probe.
…tation audit

Bumps version from 1.0.67 to 1.0.68 and refreshes Cargo.lock to reflect
the windows-sys pin plus the 30 transitive dep updates picked up by
cargo update during the v1.0.68 work.

Documentation changes span 24 markdown files plus 3 ADRs and 4 GitHub
templates, organized as follows.

Three new ADRs document the architectural decisions:
- docs/decisions/adr-008-process-lifecycle-singleton.md (G28-B)
- docs/decisions/adr-009-windows-sys-handle-pinning.md (G29)
- docs/decisions/adr-010-mcp-isolation-claude-config-dir.md (G28-A)

Four new GitHub collaboration files formalize the contribution flow:
- .github/ISSUE_TEMPLATE/bug_report.md
- .github/ISSUE_TEMPLATE/feature_request.md
- .github/ISSUE_TEMPLATE/config.yml (forces template selection,
  provides 4 contact links to documentation, discussions, security
  advisories, and CHANGELOG)
- .github/PULL_REQUEST_TEMPLATE.md (Validation, Documentation, Commit
  Hygiene, Test Coverage, and Risk Assessment checklists)

User-facing documentation is updated in EN + pt-BR:
- README.md and README.pt-BR.md: cross-reference between the two,
  v1.0.68 bullet in 'Version Highlights', Windows G29 warning in
  Quick Start
- CHANGELOG.md and CHANGELOG.pt-BR.md: full v1.0.68 entry covering
  Fixed, Added, and Changed; [Unreleased] explanatory note; duplicate
  v1.0.67 heading from prior release removed
- CONTRIBUTING.md and CONTRIBUTING.pt-BR.md: new Recent Releases
  section summarizing v1.0.68; new Mandatory Pre-Push Checklist with
  11 items including Conventional Commits gate and the
  no-Co-authored-by-for-AI-agents gate
- INTEGRATIONS.md and INTEGRATIONS.pt-BR.md: New Commands and Flags
  since v1.0.68 section; cross-references between the two
- llms.txt, llms.pt-BR.txt, llms-full.txt: What Changed in v1.0.68
  section; cross-references
- docs/AGENTS.md and docs/AGENTS.pt-BR.md: New in v1.0.68 section
  with Process Proliferation Fixes, Windows Build Fix, and Test Fixes
- docs/COOKBOOK.md and docs/COOKBOOK.pt-BR.md: new recipe 'How To
  Cap Process Proliferation on Claude Code Runs (G28)'
- docs/CROSS_PLATFORM.md and docs/CROSS_PLATFORM.pt-BR.md: new
  section 'HANDLE Type and the windows-sys 0.59 Boundary (G29)'
- docs/DOCUMENTATION_FRAMEWORK.md: 3 historical gaps marked as
  STATUS LEGADO; pre-release checklist marked 100% complete; file
  count updated from 19 to 18 MD + 2 template pairs
- docs/HOW_TO_USE.md and docs/HOW_TO_USE.pt-BR.md: new section
  'Capping process proliferation on Claude Code runs (G28)'
- docs/MIGRATION.md and docs/MIGRATION.pt-BR.md: new section
  'v1.0.68 — 2 CRITICAL fixes'; cross-references
- docs/TESTING.md and docs/TESTING.pt-BR.md: new section
  'v1.0.68 Regression Tests' with 4 sub-sections
- docs/schemas/README.md: new section 'Error Envelope Changes in
  v1.0.68 (G28-B)' documenting the two code-75 message templates
- docs/schemas/error-envelope.schema.json: description expanded to
  document the new code-75 Template A (job singleton) and the
  backward-compatible Template B (slot saturation)
- skill/sqlite-graphrag-en/SKILL.md and
  skill/sqlite-graphrag-pt/SKILL.md: New in v1.0.68 section with
  5 sub-sections (G28-B, G28-A, G28-D, G29, Test Fixes); Exit Codes
  updated to mention the dual code-75 template; Error JSON Contract
  updated; cross-references between the two
- gaps.md: full v1.0.68 resolution entry in the Historico de
  Revisoes, covering D1 through D43

.gitignore adds docs_prd/ to the per-user artifact list alongside
docs_rules/, MEMORY.md, and ralph-loop.local.md.
@daniloaguiarbr daniloaguiarbr merged commit 4b7d4bf into main Jun 4, 2026
@daniloaguiarbr daniloaguiarbr deleted the release/v1.0.68 branch June 4, 2026 00:06