Commit 4fec2a4

Polish: @needs_heal pragma fixes 3 known test failures + ROADMAP.md
## --test honors a @needs_heal file-level pragma

test_heal_pass.omc exercises the heal pass itself: its tests EXPECT the heal pass to rewrite typo'd identifiers / arity mismatches before execution. Without heal running, the typo_correction, arity_pad, and arity_truncate tests surfaced as runtime errors ("Undefined function: add_two_thigns").

run_named_fn now scans the first 40 lines for a `# @needs_heal` comment; when present, it temporarily sets OMC_HEAL=1 (and OMC_HEAL_QUIET=1 if it is not already set) around execute_program, then restores the previous OMC_HEAL value (OMC_HEAL_QUIET is left set). The test file gets the pragma at the top of its header comment.

Result: 1076/1076 OMC tests pass (was 1073/1076). All Rust tests still pass (213/0).

## ROADMAP.md

README has referenced ROADMAP.md for a while, but only ROADMAP.json existed (stale, from 2026-05-15, before the substrate-attention stack landed).

Creates a fresh ROADMAP.md keyed to the chapter releases:
- Current chapter: v0.2-ergonomics
- In flight: v0.3-symbolic-prediction (substrate-indexed completion)
- Beyond v0.3: substrate-attention follow-ups, transformerless LLM, JIT expansion, tooling polish
- Done table linking each completed chapter to its release page

ROADMAP.json preserved for archaeology.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 863f179 commit 4fec2a4

3 files changed

Lines changed: 113 additions & 4 deletions

ROADMAP.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# OMC Roadmap
+
+Current chapter: **v0.2-ergonomics** (shipped 2026-05-17).
+Next chapter: **v0.3-symbolic-prediction** (in flight).
+
+See [CHANGELOG.md](CHANGELOG.md) and [GitHub Releases](https://github.com/RandomCoder-lab/OMC/releases) for the chapter-by-chapter history of how OMC got here. This file describes what's on the path going forward.
+
+---
+
+## v0.3-symbolic-prediction (in flight)
+
+**Substrate-indexed code completion: given a partial OMC prefix, return ranked provenance-tracked continuations from a content-addressed corpus.**
+
+The synthesis of two earlier threads — substrate codec (symbolic context) and Prometheus (text prediction) — into a single primitive that LLM agents (and humans) can use to navigate "what could come next here?" while writing OMC. Branching is first-class: each result is a viable continuation with a substrate-distance score and a pointer back to the source function it came from.
+
+### Architecture
+
+- `omnimcode-core/src/predict.rs`: `CodeCorpus`, `PrefixTrie`, `predict_continuations`.
+- Builtins: `omc_corpus_build(paths)` → handle, `omc_predict(prefix_source, corpus_handle, top_k)` → ranked dict.
+- CLI subcommand: `omc --predict --files DIR --prefix "fn ..." --top-k 5 --json`.
+- Win condition: prefix `fn prom_linear_` against the Prometheus corpus returns `prom_linear_new`, `prom_linear_forward`, `prom_linear_params` ranked by substrate distance, with provenance pointers to the source files.
+
+### Phases
+
+1. Symbol-stream encoding wrapper over the existing `tokenizer::encode` — already produces `Vec<i64>` symbol IDs; just expose a clean ingestion API.
+2. `CodeCorpus` builder: parse each file in a path list, extract top-level fns via `extract_top_level_fns`, build entries `{fn_name, source, symbol_stream, canonical_hash, attractor}`.
+3. `PrefixTrie` over symbol streams: insert each stream once, query a prefix to get matching corpus indices in O(prefix length).
+4. `predict_continuations(corpus, trie, prefix_source, top_k)` — tokenize prefix, query trie, rank surviving matches by `(longest prefix match, smallest substrate distance)`.
+5. Rust tests + OMC tests against the lib/ corpus.
+6. CLI demo + writeup as `experiments/symbolic_prediction/FINDING.md`.
+7. Tag as `v0.3-symbolic-prediction` with chapter release notes.
+
+### Deferred (post-v0.3)
+
+- **Prometheus rerank pass** — once the trie-based candidate list is solid, train a small Prometheus model on the corpus and rerank top-k by token-stream probability.
+- **MCP tool surface** — expose `predict_omc_continuation(prefix, top_k)` as an MCP tool so LLM clients can query during code generation.
+- **Streaming queries** — incremental updates as the prefix grows token-by-token.
+- **Cross-corpus blending** — query multiple corpora (project, stdlib, registry) with weighted ranking.
+
+---
+
+## Beyond v0.3 (rough)
+
+### Substrate-attention follow-ups
+
+- Substrate-modulated Q projection. Q hasn't been swapped yet; the V resample recipe (post-projection modulation) may generalize.
+- Substrate FF: dampen off-attractor activations in the feed-forward residual.
+- Substrate LayerNorm: substrate-distance-weighted variance computation.
+- Larger-scale validation: every substrate-attention claim was made at TinyShakespeare scale (1.1MB). Need to verify the stack holds at 10-100MB corpora.
+
+### Transformerless LLM
+
+The substrate-attention components stack to −8.94% inside one block. The path forward is a top-to-bottom harmonic-only architecture trained competitively. Open: how to handle non-integer-coherent quantities at this scale (the substrate metric only applies to integer-valued quantities, per the rule derived from the HBit-gate falsification).
+
+### JIT path expansion
+
+- AVX-512 widening — blocked on array-processing OMC fns to fill the wider lanes.
+- JIT for float-returning harmonic primitives — `returns_float` dispatch flag mirroring `returns_array_int`.
+- JIT for dict ops — currently pure tree-walk for string-keyed data; the L1 array-of-hashed-int rewrite avoided this for hot paths.
+
+### Tooling polish
+
+- Improved formatter (`--fmt`) — preserve comments, configurable line width.
+- LSP improvements: completion (uses the v0.3 predict engine), hover with substrate signature.
+- VS Code extension: snippet library, inline hint UI for the heal pass.
+
+---
+
+## Done (linked to chapter releases)
+
+| Chapter | Key shipped items |
+|---|---|
+| [v0.2-ergonomics](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.2-ergonomics) | `+=` / `-=` / `*=` / `/=` / `%=`, `len`/`range`/`getenv`/`to_hex`/`parse_int`, negative array indexing, did-you-mean, traced errors, 11 heal classes |
+| [v0.1-substrate-attention](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.1-substrate-attention) | Substrate-K + S-MOD softmax + substrate-V resample → −8.94% val on TinyShakespeare |
+| [v0.0.6-prometheus](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.0.6-prometheus) | Tape autograd, AdamW, Embedding, LayerNorm, multi-block transformer, first substrate-K wins |
+| [v0.0.5-codec-kernel-protocol](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.0.5-codec-kernel-protocol) | Substrate codec, `omc-kernel`, `omc-grep`, OMC-PROTOCOL v1, substrate-aware tokenizer |
+| [v0.0.4-jit-and-dual-band](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.0.4-jit-and-dual-band) | LLVM JIT, dual-band SSE2 codegen, harmony-gated branch elision, array support |
+| [v0.0.3-substrate-and-stdlib](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.0.3-substrate-and-stdlib) | Heal pass, substrate-routed search family, stdlib expansion, `--check` / `--fmt` |
+| [v0.0.2-language-core](https://github.com/RandomCoder-lab/OMC/releases/tag/v0.0.2-language-core) | Parser, two-engine interpreter, HInt, bytecode VM, self-hosting fixpoint |
+| V0.0.1 | Genesis: circuit evolution engine, FFI, Unity/Unreal bindings |
+
+`ROADMAP.json` is preserved for archaeology — it captured the state through v0.0.4. This file supersedes it as the canonical forward plan.
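
Reviewer sketch for phases 3 and 4 of the roadmap: a prefix trie over symbol streams whose query cost is one map lookup per prefix symbol, followed by a distance-based ranking. This is a minimal illustration, not the contents of `predict.rs`. The `Vec<i64>` stream type comes from the roadmap; the struct layout, the method names, storing corpus indices at every node, and the `u64` stand-in for substrate distance are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical trie node: children keyed by symbol ID, plus the indices of
/// every corpus entry whose stream passes through this node.
#[derive(Default)]
struct TrieNode {
    children: HashMap<i64, TrieNode>,
    entries: Vec<usize>,
}

/// Hypothetical prefix trie over symbol streams (not the real `PrefixTrie`).
#[derive(Default)]
struct PrefixTrie {
    root: TrieNode,
}

impl PrefixTrie {
    /// Insert one stream, recording `index` at every node along its path.
    fn insert(&mut self, stream: &[i64], index: usize) {
        let mut node = &mut self.root;
        node.entries.push(index);
        for &sym in stream {
            node = node.children.entry(sym).or_default();
            node.entries.push(index);
        }
    }

    /// Walk the prefix; the node reached lists all matching corpus indices.
    fn query(&self, prefix: &[i64]) -> &[usize] {
        let mut node = &self.root;
        for sym in prefix {
            match node.children.get(sym) {
                Some(next) => node = next,
                None => return &[],
            }
        }
        &node.entries
    }
}

/// Rank matches by ascending distance and keep the first `top_k`.
/// `distances[i]` is a stand-in for the substrate distance of entry `i`.
fn rank_by_distance(matches: &[usize], distances: &[u64], top_k: usize) -> Vec<usize> {
    let mut ranked = matches.to_vec();
    ranked.sort_by_key(|&i| distances[i]);
    ranked.truncate(top_k);
    ranked
}

fn main() {
    let mut trie = PrefixTrie::default();
    trie.insert(&[1, 2, 3], 0);
    trie.insert(&[1, 2, 4], 1);
    trie.insert(&[5, 6], 2);

    // Entries 0 and 1 share the prefix [1, 2]; entry 1 has the smaller distance.
    let hits = trie.query(&[1, 2]);
    let ranked = rank_by_distance(hits, &[9, 1, 5], 5);
    println!("{:?}", ranked); // prints [1, 0]
}
```

Storing indices at every node trades memory for a query that does no subtree walk; a leaner variant would store indices only at stream ends and collect them on demand.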

examples/tests/test_heal_pass.omc

Lines changed: 6 additions & 3 deletions
@@ -1,8 +1,11 @@
+# @needs_heal
+#
 # Tests for the self-healing compiler.
 #
-# These tests run UNDER the heal pass (the runner enables OMC_HEAL
-# automatically when `--test` is invoked with healed-aware mode), so
-# the SUT is the heal pass itself transforming the test fn AST.
+# The @needs_heal pragma above tells `--test` to enable OMC_HEAL for
+# this file — the SUT IS the heal pass transforming the test fn AST.
+# Without heal, tests for typo/arity correction would surface as
+# runtime errors because the typo'd identifiers wouldn't resolve.
 #
 # Each test asserts that the heal observed a specific rewrite by
 # checking the RUNTIME behavior of the healed code rather than the
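
Reviewer note on how the pragma is matched. The check added to `run_named_fn` in this commit accepts any `#` comment line among the first 40 lines that contains `@needs_heal` anywhere, so the prose mention on new line 5 of this header would enable heal even if line 1 were removed. The sketch below reproduces that check next to a stricter variant. Both function names are hypothetical, and whether the stricter rule is wanted is an assumption rather than something the commit states.

```rust
/// Mirrors the check in this commit: any `#` comment line among the first
/// 40 lines that mentions `@needs_heal` anywhere counts.
fn needs_heal_loose(source: &str) -> bool {
    source
        .lines()
        .take(40)
        .any(|l| l.trim_start().starts_with('#') && l.contains("@needs_heal"))
}

/// Hypothetical stricter variant: the comment body must be exactly the pragma.
fn needs_heal_strict(source: &str) -> bool {
    source
        .lines()
        .take(40)
        .filter_map(|l| l.trim_start().strip_prefix('#'))
        .any(|body| body.trim() == "@needs_heal")
}

fn main() {
    // Placeholder file contents, used only to exercise the two matchers.
    let pragma = "# @needs_heal\n#\n# Tests for the self-healing compiler.\n";
    let prose_only = "# The @needs_heal pragma is described elsewhere.\n";
    let plain = "# Tests for something else.\n";

    assert!(needs_heal_loose(pragma) && needs_heal_strict(pragma));
    assert!(needs_heal_loose(prose_only)); // a prose mention is enough here
    assert!(!needs_heal_strict(prose_only)); // but not for the strict form
    assert!(!needs_heal_loose(plain) && !needs_heal_strict(plain));
    println!("ok");
}
```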

omnimcode-cli/src/main.rs

Lines changed: 25 additions & 1 deletion
@@ -714,7 +714,31 @@ fn run_named_fn(source: &str, name: &str) -> Result<(), String> {
     // execute_program path runs it after the rest of the file
     // (including all other fn defs the test might depend on).
     let augmented = format!("{}\n{}();\n", source, name);
-    execute_program(&augmented)
+    // File-level pragma `# @needs_heal` (in a comment line near the top
+    // of the file) auto-enables OMC_HEAL for the duration of this test
+    // run. Used by test_heal_pass.omc, which exercises the heal pass
+    // directly — its tests EXPECT the heal pass to fire and rewrite
+    // typos/arity mismatches before execution. Honors a per-invocation
+    // env-var save/restore so it doesn't bleed across tests.
+    let needs_heal = source.lines()
+        .take(40)
+        .any(|l| l.trim_start().starts_with('#') && l.contains("@needs_heal"));
+    let saved = std::env::var("OMC_HEAL").ok();
+    if needs_heal {
+        std::env::set_var("OMC_HEAL", "1");
+        // Also silence the diagnostic dump per-test so output stays clean.
+        if std::env::var("OMC_HEAL_QUIET").is_err() {
+            std::env::set_var("OMC_HEAL_QUIET", "1");
+        }
+    }
+    let result = execute_program(&augmented);
+    if needs_heal {
+        match saved {
+            Some(v) => std::env::set_var("OMC_HEAL", v),
+            None => std::env::remove_var("OMC_HEAL"),
+        }
+    }
+    result
 }
 
 /// `--audit FILE`: run FILE under both engines (tree-walk + VM)
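
Reviewer note on the save/restore above. As written, `OMC_HEAL` is restored only on the normal return path, and `OMC_HEAL_QUIET` stays set once the runner has set it. One possible alternative is a drop guard that restores each variable on every exit path. The sketch below is an illustration under assumptions: `EnvGuard`, `run_with_heal`, and `execute_program_stub` are hypothetical names, and it assumes no other thread reads or writes the environment while a test runs.

```rust
use std::env;
use std::ffi::OsString;

/// Hypothetical RAII guard: sets an env var, restores the prior state on drop.
struct EnvGuard {
    key: &'static str,
    saved: Option<OsString>,
}

impl EnvGuard {
    // `set_var` must be called in an `unsafe` block in the 2024 edition;
    // the allow keeps older toolchains quiet about the extra block.
    #[allow(unused_unsafe)]
    fn set(key: &'static str, value: &str) -> Self {
        let saved = env::var_os(key);
        unsafe { env::set_var(key, value) };
        EnvGuard { key, saved }
    }
}

impl Drop for EnvGuard {
    #[allow(unused_unsafe)]
    fn drop(&mut self) {
        match self.saved.take() {
            Some(v) => unsafe { env::set_var(self.key, v) },
            None => unsafe { env::remove_var(self.key) },
        }
    }
}

/// Stand-in for the real `execute_program`; reports whether heal is enabled.
fn execute_program_stub() -> Result<bool, String> {
    Ok(matches!(env::var("OMC_HEAL").as_deref(), Ok("1")))
}

fn run_with_heal(needs_heal: bool) -> Result<bool, String> {
    // Both guards live until this function returns, so the variables are
    // restored on early returns and during unwinding as well.
    let _heal = needs_heal.then(|| EnvGuard::set("OMC_HEAL", "1"));
    // Like the commit, only set OMC_HEAL_QUIET when the user has not set it.
    let _quiet = (needs_heal && env::var_os("OMC_HEAL_QUIET").is_none())
        .then(|| EnvGuard::set("OMC_HEAL_QUIET", "1"));
    execute_program_stub()
}

fn main() {
    let before = env::var_os("OMC_HEAL");
    let healed = run_with_heal(true).unwrap();
    assert!(healed);
    // The guard has put back whatever was there before the call.
    assert_eq!(env::var_os("OMC_HEAL"), before);
    println!("ok");
}
```

Process-wide environment mutation is still shared state, so a guard like this helps only if tests that use it run one at a time.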
