feat(α-8 Phase B): C_rag-ontology sector cell + pipeline_context typed-filter overlay #689

Merged. Hashevolution merged 1 commit into main from feat/v0.4-alpha8-phase-b-wiring on Jun 2, 2026.

@Hashevolution
Owner

Summary

α-8 Phase B — wires the Phase A typed filter module into the production pipeline and adds the matrix runner cell for A/B measurement.

Design memo: `docs/design/v0.4-alpha-8-ontology-typed-filter.md` §2.1 + §4.

Changes

| File | Change |
| --- | --- |
| `scripts/qvt_ablation_matrix.py` | New cell `C_rag-ontology` between `C_rag-graph` and `C_rag-full`; `C_rag-graph` now sets `JAMES_DISABLE_TYPED_FILTER=1` explicitly (pre-α-8 baseline) |
| `core/reasoning/pipeline_context.py` | `build_unified_context` prepends typed entity summary BEFORE graph context (byte-additive); reads query from `loop_state["expanded_query"]` |
| `tests/test_pipeline_typed_filter_overlay.py` | NEW: 5 integration tests (flag polarity + R1-R2 acceptance) |
| `tests/test_qvt_matrix_sector_cells.py` | Updated: six → seven cells; new `C_rag-graph` + `C_rag-ontology` flag invariant tests |

Default behavior post-merge

`JAMES_DISABLE_TYPED_FILTER` unset → typed filter ACTIVE in /query/:

  • LLM sees typed summary [ENTITIES BY TYPE] BEFORE the existing graph context block
  • R1-R5 evidence-of-absence rows emitted for query-relevant types with no entities
  • gemma4 grounding training should detect "Date is empty → null query" the way it did pre-α-7

Defensive: any exception in typed filter falls through with empty prefix → byte-identical to pre-α-8. Engine logs the error.
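The flag check and the defensive fall-through described above can be sketched roughly as follows. This is an illustration, not the code in `core/reasoning/pipeline_context.py`: the function names come from this PR, but their signatures, the exact flag parsing (`== "1"`), the `entities` key on `loop_state`, and the stand-in filter body are all assumptions.

```python
import logging
import os

logger = logging.getLogger("pipeline_context_sketch")  # hypothetical logger name


def is_typed_filter_disabled():
    # Assumption: only the literal value "1" disables the filter.
    return os.environ.get("JAMES_DISABLE_TYPED_FILTER") == "1"


def apply_typed_filter(query, entities_by_type):
    # Hypothetical stand-in for the Phase A module; the real signature may differ.
    rows = [f"[{t}]: {', '.join(names)}" for t, names in entities_by_type.items()]
    return "\n".join(["[ENTITIES BY TYPE]", *rows]) + "\n"


def build_unified_context(loop_state, graph_context, typed_filter=apply_typed_filter):
    """Prepend the typed summary; never alter the existing graph block."""
    if is_typed_filter_disabled():
        return graph_context  # byte-identical to the pre-α-8 path
    try:
        prefix = typed_filter(loop_state["expanded_query"],
                              loop_state.get("entities", {}))
    except Exception:
        # Fall through with an empty prefix and record the error.
        logger.exception("typed filter failed; using graph context only")
        prefix = ""
    return prefix + graph_context
```

Because the prefix is only ever concatenated in front of `graph_context`, both the disabled path and the error path return the original block unchanged, which is what makes the change byte-additive.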

Verification

```
$ python -m pytest tests/test_pipeline_typed_filter_overlay.py
5 passed in 0.05s

$ python -m pytest tests/ -k "pipeline or graph_typed_filter or ontology or sector or qvt_matrix"
137 passed, 3350 deselected

$ python -m ruff check core/reasoning/pipeline_context.py scripts/qvt_ablation_matrix.py tests/test_pipeline_typed_filter_overlay.py tests/test_qvt_matrix_sector_cells.py
All checks passed!
```

Bench numbers — DEFERRED to Phase C

Phase B is the measurement instrument enablement, not the measurement. Phase C (next PR) runs N=3 baseline at M_M + 5-tier remeasurement. Quality Delta Card values land with Phase C closure.

Per design memo §1.3 honest framing tier:

  • ⭐⭐ partial if graded Δ ≥ +0.030
  • ⭐ operational if Δ ≤ noise

Out of scope (Phase C-D)

  • Phase C: N=3 baseline + 5-tier remeasurement (cells × tiers × n=3)
  • Phase D: closure analysis + `docs/ARCHITECTURE.md §5.7` typed filter section + acceptance test (multihop_rag null query + Ali poison_01/04 from Phase 4 sweep)

`Quality delta: exempt (label: code — A/B oracle infrastructure; measurement deferred to Phase C per design memo §4)`

🤖 Generated with Claude Code

…d-filter overlay

α-8 implementation Phase B — wires the Phase A typed filter module into
the production pipeline and adds the matrix runner cell for A/B
measurement against C_rag-graph (post-α-7 baseline reference). Design
memo: docs/design/v0.4-alpha-8-ontology-typed-filter.md §2.1 + §4.

## Sector cell registry (scripts/qvt_ablation_matrix.py)

- New cell `C_rag-ontology` between C_rag-graph and C_rag-full —
  C_rag-graph + typed filter (R1-R5 evidence-of-absence preservation).
- C_rag-graph now sets `JAMES_DISABLE_TYPED_FILTER=1` explicitly so it
  remains the pre-α-8 graph baseline (Δ = C_rag-ontology − C_rag-graph
  measures the typed filter contribution in isolation).
- C_rag-full intentionally does NOT set the disable flag → typed
  filter ON in the full stack (production default once α-8 ships).
- Label dict updated with new entry.
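The flag invariants listed above could be modelled along these lines. The registry shape (cell name mapped to environment overrides) and the helper names are hypothetical; only the three cell names, their ordering, and the flag come from this PR, and the other four standard cells are omitted.

```python
DISABLE_FLAG = "JAMES_DISABLE_TYPED_FILTER"

# Hypothetical registry shape: cell name -> env overrides for that run.
# Insertion order mirrors the PR: C_rag-ontology sits between graph and full.
SECTOR_CELLS = {
    "C_rag-graph": {DISABLE_FLAG: "1"},  # pre-α-8 graph baseline
    "C_rag-ontology": {},                # graph + typed filter
    "C_rag-full": {},                    # full stack, typed filter left ON
}


def typed_filter_active(cell):
    # The filter is on unless the cell explicitly sets the disable flag.
    return SECTOR_CELLS[cell].get(DISABLE_FLAG) != "1"


def typed_filter_delta(scores):
    # Δ = C_rag-ontology − C_rag-graph isolates the typed filter contribution.
    return scores["C_rag-ontology"] - scores["C_rag-graph"]
```

Setting the flag explicitly on the baseline cell, rather than relying on the default, means the baseline stays fixed even though the production default flips to "filter ON".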

## Pipeline overlay (core/reasoning/pipeline_context.py)

- Imports `apply_typed_filter` + `is_typed_filter_disabled` from
  α-8 Phase A module.
- `build_unified_context` now prepends typed entity summary BEFORE the
  existing `build_graph_context_str` output (byte-additive — original
  block preserved verbatim).
- Reads query from `loop_state["expanded_query"]`.
- Defensive: exception in typed filter falls through with empty prefix
  (= byte-identical to pre-α-8), engine._log records the error.
- When `JAMES_DISABLE_TYPED_FILTER=1` the prefix is skipped (= C_rag-graph
  byte-identical pre-α-8 path).

## Tests

- `tests/test_pipeline_typed_filter_overlay.py` — 5 new integration tests:
  * Flag disabled → no typed prefix (byte-identical)
  * Flag enabled → typed prefix prepended (additive)
  * Temporal query emits `[Date]: (none found in graph for this query)`
  * Person query emits `[Person]: Alice` row
  * Order check: typed prefix BEFORE existing graph block
- `tests/test_qvt_matrix_sector_cells.py` — updated:
  * Six → seven standard sector cells (α-8 adds C_rag-ontology)
  * New test: C_rag-graph sets JAMES_DISABLE_TYPED_FILTER=1
  * New test: C_rag-ontology omits JAMES_DISABLE_TYPED_FILTER (filter ON)
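The two row formats asserted by the overlay tests above (`[Date]: (none found in graph for this query)` and `[Person]: Alice`) can be illustrated with a minimal renderer. This is a sketch under assumptions: `render_typed_summary` is a hypothetical name, the header is assumed to sit on its own line, and the list of query-relevant types is taken as an input because the R1-R5 rules that select them are not spelled out in this PR.

```python
ABSENT = "(none found in graph for this query)"


def render_typed_summary(relevant_types, entities_by_type):
    """Render one row per query-relevant type (hypothetical helper).

    Types with no entities still get a row, so the absence itself is
    visible to the LLM instead of being silently dropped.
    """
    lines = ["[ENTITIES BY TYPE]"]
    for type_name in relevant_types:
        names = entities_by_type.get(type_name, [])
        value = ", ".join(names) if names else ABSENT
        lines.append(f"[{type_name}]: {value}")
    return "\n".join(lines)


summary = render_typed_summary(["Date", "Person"], {"Person": ["Alice"]})
```

Here `summary` contains an explicit evidence-of-absence row for `Date` alongside the populated `Person` row.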

## Verification

- 137/137 tests pass (pipeline + ontology + sector + typed_filter)
- ruff F-class clean

## Default behavior

Default (no JAMES_DISABLE_TYPED_FILTER env var): typed filter ACTIVE.
This means after merge:
- Production /query/ path emits typed entity summary BEFORE graph
  context — additive change, original block preserved.
- LLM sees R1-R5 evidence-of-absence rows for query-relevant types
  with no entities surfaced.
- gemma4 grounding training should now detect "Date is empty → null
  query" the way it did pre-α-7 with 41-161 entity surface.

## Bench numbers — DEFERRED to Phase C

Phase C (next PR) runs the actual measurement: N=3 baseline at M_M +
5-tier remeasurement. The Quality Delta Card values will land with
Phase C closure, NOT Phase B. Per design memo §1.3 honest framing:
- ⭐⭐ partial if graded Δ ≥ +0.030
- ⭐ operational if Δ ≤ noise
- 5-axis cross-tab + per-question-type required (memo §4)

The C_rag-ontology cell registration in this PR is the **measurement
instrument enablement**, not the measurement itself. Phase C runs the
matrix, Phase D writes closure.

Quality delta: exempt (label: code — A/B oracle infrastructure;
measurement deferred to Phase C per design memo §4)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@Hashevolution merged commit 6d6698e into main on Jun 2, 2026
4 checks passed
@Hashevolution deleted the feat/v0.4-alpha8-phase-b-wiring branch on June 2, 2026 09:53
@github-actions (bot) locked and limited conversation to collaborators on Jun 2, 2026