feat(mapper): add simplicio.context-pack/v1 producer and hash-keyed cache (closes #115) #117
Merged
…ache

Closes #115. Adopts the token-efficient LLM workflow contracts pinned in simplicio-runtime#70. The mapper is now the canonical producer of compact context for LLM planners (not whole files, not whole repos), and ships a file-backed cache that lets callers reuse summaries for unchanged content without re-summarizing them.

Adds:

- `simplicio_mapper/context_pack.py`:
  - `build_context_pack(root, targets, *, project_map, symbol_index, call_graph)` returns a `simplicio.context-pack/v1` envelope with repo metadata (`mapper_schema`, `root_hash`), per-file entries (path, language, `snapshot_hash`, line_count, compact flag, selected `ranges[]` with `range_hash` + snippet, defined `symbols`, resolved `callers` / `imports` from the call graph, related `tests`), `dependencies` carried from project-map, `recent_changes`, and an overall `pack_hash` for caching.
  - `needs_broader_context` is set with a concrete reason when targets are missing/unreadable, ranges are out-of-bounds, or any of `project-map.json` / `symbol-index.json` / `call-graph.json` is absent. Compact context is never claimed safe when anchors, hashes, or symbol coverage are missing.
  - Files above `COMPACT_LINE_THRESHOLD` (2000 lines) omit snippets and set `compact=True` while still carrying their hashes.
- `simplicio_mapper/context_cache.py`:
  - `ContextCache(path)`: file-backed `simplicio.context-cache/v1` JSON dict keyed by any caller-chosen hash string. Supports `get`, `set`, `clear`, `__contains__`, `__len__`. Writes are persisted immediately; the file is JSON with a `schema` header so external tools can verify it.
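
To make the cache contract described above concrete, here is a minimal sketch of a hash-keyed, file-backed cache with the same method surface. `SketchContextCache` is a hypothetical stand-in, not the merged `ContextCache`; in particular, ignoring a file whose `schema` header does not match is an assumption of this sketch.

```python
import json
import os
import tempfile

CONTEXT_CACHE_SCHEMA = "simplicio.context-cache/v1"


class SketchContextCache:
    """Hypothetical stand-in for ContextCache; not the merged implementation."""

    def __init__(self, path):
        self.path = path
        self._entries = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                payload = json.load(handle)
            # Assumption: a file with a foreign schema header is ignored.
            if payload.get("schema") == CONTEXT_CACHE_SCHEMA:
                self._entries = dict(payload.get("entries", {}))

    def get(self, key, default=None):
        return self._entries.get(key, default)

    def set(self, key, value):
        self._entries[key] = value
        self._persist()  # every write is flushed to disk immediately

    def clear(self):
        self._entries = {}
        self._persist()

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)

    def _persist(self):
        payload = {"schema": CONTEXT_CACHE_SCHEMA, "entries": self._entries}
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True, indent=2)
            handle.write("\n")


# Usage: a second instance on the same path sees the stored summary.
cache_path = os.path.join(tempfile.mkdtemp(), "context-cache.json")
cache = SketchContextCache(cache_path)
cache.set("abc123", {"summary": "parses config"})
reopened = SketchContextCache(cache_path)
```

Because the key is whatever hash the caller chooses, using a file's `snapshot_hash` as the key would mean an edited file simply misses the cache.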

Tests in `tests/python/test_context_pack.py` (16 cases) cover envelope shape, multi-language fixtures (TS / Python / JSON / MD), range extraction and per-range hash emission, unstable-range and missing-target broader-context fallback, missing upstream artifacts, deterministic `pack_hash` across repeated runs, large-file compact mode, and the cache lifecycle (hit / miss / invalidation / persistence / clear / schema). Fixtures live under `tests/fixtures/ctx-pack-host/` so downstream consumers can copy them.

Documents the two schemas in `SIMPLICIO_INTEGRATION.md` next to the Mechanical Edit Contract from #110 and the Native Runtime Contract from #95.

https://claude.ai/code/session_01JdmemqddwFnvbceWyuDE8m
Pull request overview
Adds a new mapper-side producer for the simplicio.context-pack/v1 contract plus a JSON-backed simplicio.context-cache/v1, alongside tests, fixtures, and integration docs to support token-efficient planning workflows (issue #115).
Changes:
- Introduces `build_context_pack(...)` to emit compact per-target file context with hashes, ranges, symbols, callers/imports, and safety signaling (`needs_broader_context`).
- Introduces `ContextCache` to persist summaries keyed by content hashes across runs/processes.
- Adds Python unit tests + multi-language fixtures and documents the new contracts in `SIMPLICIO_INTEGRATION.md`.
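
The range and safety-signaling behavior summarized above can be sketched as follows. This is an illustration under stated assumptions, not the code in `context_pack.py`: the helper `select_range`, the 1-based inclusive range convention, and the `start` / `end` field names are all hypothetical. Only `range_hash`, the snippet, and the 2000-line threshold come from the PR text.

```python
import hashlib

COMPACT_LINE_THRESHOLD = 2000  # value stated in the PR description


def sha256_text(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_range(lines, start, end):
    """Return (entry, reason). Assumes 1-based inclusive line ranges."""
    if start < 1 or end > len(lines) or start > end:
        # The caller would surface this as the needs_broader_context reason.
        return None, f"range {start}-{end} out of bounds for {len(lines)} lines"
    snippet = "\n".join(lines[start - 1:end])
    entry = {"start": start, "end": end, "range_hash": sha256_text(snippet)}
    # Files above the threshold keep their hashes but omit the snippet text.
    if len(lines) <= COMPACT_LINE_THRESHOLD:
        entry["snippet"] = snippet
    return entry, None


lines = ["a = 1", "b = 2", "c = a + b"]
ok, reason_ok = select_range(lines, 2, 3)
bad, reason_bad = select_range(lines, 2, 9)
big, _ = select_range(["x"] * (COMPACT_LINE_THRESHOLD + 1), 1, 1)
```

Here `ok` carries both a snippet and a hash, `bad` is `None` with a concrete reason, and `big` carries a hash but no snippet.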
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `simplicio_mapper/context_pack.py` | New context-pack builder (hashes, ranges, artifact loading, safety flag) |
| `simplicio_mapper/context_cache.py` | New hash-keyed JSON cache for storing summaries |
| `tests/python/test_context_pack.py` | Unit tests covering pack shape, determinism, range handling, and cache behavior |
| `tests/fixtures/ctx-pack-host/sample.ts` | TypeScript fixture for language detection/snippet emission tests |
| `tests/fixtures/ctx-pack-host/sample.py` | Python fixture for language detection/snippet emission tests |
| `tests/fixtures/ctx-pack-host/sample.md` | Markdown fixture for language detection/snippet emission tests |
| `tests/fixtures/ctx-pack-host/sample.json` | JSON fixture for language detection/snippet emission tests |
| `SIMPLICIO_INTEGRATION.md` | New documentation section for context packs and hash-based cache |
Comment on lines +35 to +51

    _LANGUAGE_BY_EXT = {
        ".ts": "typescript",
        ".tsx": "typescript",
        ".js": "javascript",
        ".jsx": "javascript",
        ".mjs": "javascript",
        ".cjs": "javascript",
        ".py": "python",
        ".md": "markdown",
        ".json": "json",
        ".yaml": "yaml",
        ".yml": "yaml",
        ".toml": "toml",
        ".go": "go",
        ".rs": "rust",
        ".cs": "csharp",
    }
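
A lookup against a table like the one above might look like the sketch below. The function name `detect_language`, the lowercase normalization, and the `"unknown"` fallback are assumptions for illustration; the diff excerpt does not show how the mapper handles unlisted extensions.

```python
import os

# Subset of the mapping under review, copied from the diff excerpt.
LANGUAGE_BY_EXT = {
    ".ts": "typescript",
    ".py": "python",
    ".md": "markdown",
    ".json": "json",
}


def detect_language(path, default="unknown"):
    # Lowercase the extension so "README.MD" and "readme.md" agree.
    ext = os.path.splitext(path)[1].lower()
    return LANGUAGE_BY_EXT.get(ext, default)
```

Files without an extension, such as `Makefile`, fall through to the default rather than raising.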
Comment on lines +217 to +224

    digest = hashlib.sha256()
    root_hash = _sha256_text(abs_root)
    digest.update(root_hash.encode("utf-8"))
    for entry in files_out:
        digest.update(entry["snapshot_hash"].encode("utf-8"))
        for selected in entry["ranges"]:
            digest.update(selected["range_hash"].encode("utf-8"))
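
The excerpt stops before the digest is finalized. The sketch below wraps the same loop in a standalone function to show two properties that follow from it: the hash is stable across repeated runs, and it changes with both the absolute root path and the order of the file entries. The function name `compute_pack_hash`, the `hexdigest()` finalization, and the body of `sha256_text` are assumptions.

```python
import hashlib


def sha256_text(text):
    # Assumed behavior of the diff's _sha256_text helper.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compute_pack_hash(abs_root, files_out):
    digest = hashlib.sha256()
    digest.update(sha256_text(abs_root).encode("utf-8"))
    for entry in files_out:
        digest.update(entry["snapshot_hash"].encode("utf-8"))
        for selected in entry["ranges"]:
            digest.update(selected["range_hash"].encode("utf-8"))
    return digest.hexdigest()


files = [
    {"snapshot_hash": sha256_text("file a"),
     "ranges": [{"range_hash": sha256_text("a:1-2")}]},
    {"snapshot_hash": sha256_text("file b"), "ranges": []},
]
base = compute_pack_hash("/repo", files)
same = base == compute_pack_hash("/repo", files)
moved = base == compute_pack_hash("/other/checkout", files)
reordered = base == compute_pack_hash("/repo", files[::-1])
```

In this sketch `same` is true while `moved` and `reordered` are false, so a caller that wants cache hits across checkouts or across differently ordered target lists would need to normalize those inputs first.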
Comment on lines +234 to +238

    "recent_changes": (
        project_map.get("recent_changes")
        or project_map.get("changed_files")
        or []
    ),
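
One detail of the `or` chain above is easy to miss: `or` skips any falsy value, so an explicitly empty `recent_changes` list falls through to `changed_files`. The sketch below contrasts that with a key-presence check. Both function names are hypothetical, and the strict variant is an illustrative alternative, not something proposed in the PR.

```python
def pick_recent_changes(project_map):
    # Same expression as the diff: falsy values are skipped, not just missing keys.
    return (
        project_map.get("recent_changes")
        or project_map.get("changed_files")
        or []
    )


def pick_recent_changes_strict(project_map):
    # Hypothetical alternative: fall back only when the key is absent.
    for key in ("recent_changes", "changed_files"):
        if key in project_map:
            return project_map[key]
    return []


sample = {"recent_changes": [], "changed_files": ["a.py"]}
loose = pick_recent_changes(sample)
strict = pick_recent_changes_strict(sample)
```

Whether the fall-through is desirable depends on whether an empty `recent_changes` is meant to say "nothing changed" or "not recorded".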
Comment on lines +67 to +77

    def _persist(self) -> None:
        directory = os.path.dirname(self.path)
        if directory:
            os.makedirs(directory, exist_ok=True)
        payload = {
            "schema": CONTEXT_CACHE_SCHEMA,
            "entries": self._entries,
        }
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True, indent=2)
            handle.write("\n")
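
Opening the target with mode `"w"` truncates it before the new JSON is written, so a crash mid-write could leave a partial file. A common alternative, shown here as a sketch and not as what the PR implements, is to write to a temporary file in the same directory and swap it in with `os.replace`. The function name `persist_atomically` is hypothetical.

```python
import json
import os
import tempfile


def persist_atomically(path, payload):
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # A sibling temp file keeps os.replace on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True, indent=2)
            handle.write("\n")
        os.replace(tmp_path, path)  # readers see the old or the new file
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temp file behind on failure
        raise


target = os.path.join(tempfile.mkdtemp(), "cache", "context-cache.json")
persist_atomically(
    target,
    {"schema": "simplicio.context-cache/v1", "entries": {"k": 1}},
)
with open(target, encoding="utf-8") as handle:
    loaded = json.load(handle)
```

This does not address concurrent writers from separate processes; that would need locking on top.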
Comment on lines +380 to +384

    "repo": {
      "mapper_schema": "simplicio.mapper-index/v1",
      "root_hash": "<sha256 of the absolute root path>"
    },
    "pack_hash": "<sha256 over all snapshot+range hashes>",
Summary

Closes #115. Implements the mapper side as the producer of the `simplicio.context-pack/v1` and `simplicio.context-cache/v1` contracts (canonical in simplicio-runtime#70).

Complements #110 (mechanical-edit): together, the mapper can deliver compact context plus stable edit anchors without forcing the LLM to read whole files.

Changes

- `simplicio_mapper/context_pack.py` (new): `build_context_pack(root, targets, *, project_map, symbol_index, call_graph)`.
  - Emits per-file entries (`snapshot_hash`, language, line_count, compact flag, `ranges[]` with `range_hash` + snippet, `symbols`, `callers`, `imports`, `tests`), `dependencies`, `recent_changes`, `pack_hash`, `needs_broader_context` + reason.
  - Loads `.simplicio/project-map.json`, `symbol-index.json`, and `call-graph.json` automatically; also accepts pre-loaded artifacts.
  - `COMPACT_LINE_THRESHOLD=2000`: large files omit snippets but keep their hashes.
- `simplicio_mapper/context_cache.py` (new): `ContextCache(path)`, JSON-backed, schema-validated, with `get`/`set`/`clear`/`__contains__`/`__len__`.
- `tests/python/test_context_pack.py`: 16 tests covering shape, multi-language (TS/Py/JSON/MD), range extraction, unstable-range fallback, missing-target fallback, missing-upstream fallback, deterministic pack_hash, large-file compact mode, cache hit/miss/invalidation/persistence/clear/schema.
- `tests/fixtures/ctx-pack-host/`: sample.ts, sample.py, sample.json, sample.md.
- `SIMPLICIO_INTEGRATION.md`: new section "Context Packs and Hash-Based Cache" between Mechanical Edit (Adopt mechanical edit contract v1 for mapper context and file snapshots #110) and Native Runtime (Expose stable contracts for unified native Simplicio runtime #95).

Safety rule

→ `needs_broader_context=True` with a concrete reason whenever data is missing.

Validation

- `unittest tests.python.test_context_pack`
- `ruff check`
- `npm run lint`

Refs #115, #110, simplicio-runtime#70.
https://claude.ai/code/session_01JdmemqddwFnvbceWyuDE8m
Generated by Claude Code