Skip to content

Commit ba5659e

Browse files
L1.6 output-side bridge: infrastructure complete, codegen path disabled
Attempted the output-side complement to the input-side array bridge (commit 927f75c). Goal: let a JIT'd fn return an array via an `@jit_returns_array_int` pragma that triggers an `omc_arr_heapify` call before Op::Return, copying the frame-array buffer to heap so it outlives the JIT'd fn frame. The dispatch boundary materializes the returned heap pointer as a Value::Array and calls omc_arr_free. Infrastructure now in place (all builds clean, all 145 OMC tests + 48 codegen tests still pass): - `CompiledFunction.pragmas: Vec<String>` — bytecode now forwards source-level @pragma decorators (used by no_heal_* in the heal pass; now also reachable from codegen for JIT pragmas) - `JittedFn.returns_array_int: bool` — set by jit_module when the source fn carries @jit_returns_array_int - `omc_arr_heapify(frame_ptr) -> i64` extern Rust helper — registered via add_global_mapping so JIT'd code can call it - `omc_arr_free(heap_ptr)` extern Rust helper — pub so the dispatch in main.rs can free the heap allocation after materialization - Dispatch closure materializer in omnimcode-cli/src/main.rs — when jf.returns_array_int is true, treats the i64 return as a heap pointer to [len, v0, ..., vN] and materializes Value::Array What's NOT working: the codegen path that would call omc_arr_heapify before Op::Return is DISABLED in dual_band.rs. End-to-end testing showed heapify runs and returns a valid heap pointer, but the JIT'd fn segfaults on its `ret` instruction AFTER heapify completes. Cause not yet understood — stack alignment? extern "C" calling convention mismatch? alloca lifetime crossing the return? The infrastructure is left wired so that re-enabling the codegen path is one localized change (remove the bypass in dual_band.rs Op::Return arm). Cost of leaving it disabled: zero measurable today. None of the current harmonic libraries' hot paths return arrays — ha.top_k, ha.score_all, ha.new are called O(1) per fit, not per row. The 1.9x speedup on harmonic_anomaly from the input-side bridge (commit 927f75c) stands as the actual user-facing L1.6 win. docs/jit_real_world.md updated with the honest gap statement under "Honest limits remaining." Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent b39b6e5 commit ba5659e

6 files changed

Lines changed: 156 additions & 11 deletions

File tree

docs/jit_real_world.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -134,4 +134,4 @@ Synthetic microbench (sum over arr_range(0, 1000), 1000 iterations):
134134

135135
- **Read-only contract**: the bridge doesn't write back to the original `HArray` even if the JIT'd fn mutated the buffer. Common case (sum, score, count) is read-only; mutating-array fns return `i64` today so output-side bridging is a future extension.
136136
- **Int-only arrays**: `Value::Array` whose elements aren't all `HInt` (or `Bool`) falls through to tree-walk. String / float arrays are next-session work.
137-
- **Return values still i64**: a JIT'd fn that wants to return an array would need a result-buffer convention. Not needed by any current harmonic library.
137+
- **Return-side bridge: infrastructure in place, codegen path disabled.** The wiring went in for an `@jit_returns_array_int` pragma that would call `omc_arr_heapify` before `Op::Return` (copying the frame-array buffer to heap so it outlives the JIT'd fn frame). The Rust extern + global-mapping + JittedFn flag + dispatch materializer + `omc_arr_free` are all present and pass their unit-test paths. But in end-to-end testing the JIT'd fn segfaults on its `ret` instruction AFTER `omc_arr_heapify` successfully runs and returns a valid heap pointer. The trip back through the extern-"C" boundary corrupts something (stack alignment? calling convention? alloca lifetime?). The codegen path is left disabled in `dual_band.rs` so the infrastructure can be re-enabled atomically once the segfault is understood. None of the current harmonic libraries' hot paths return arrays, so this gap costs no measurable performance today — `ha.top_k` / `ha.score_all` / `ha.new` are called O(1) times per fit, not per-row.

omnimcode-cli/src/main.rs

Lines changed: 36 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -236,13 +236,43 @@ fn maybe_register_jit(
236236
_ => return None, // other non-int args → fall through to tree-walk
237237
}
238238
}
239-
let result = jf.call(&int_args)
240-
.map(|r| Ok(Value::HInt(HInt::new(r))));
241-
// _pinned drops here, freeing the marshalled buffers.
242-
// Safe because the JIT'd code didn't retain the pointers
243-
// (verified by the lowerer's stack-local array discipline).
239+
let result = jf.call(&int_args);
240+
// L1.6 output-side bridge: when the fn was marked with
241+
// `@jit_returns_array_int`, the returned i64 should be a
242+
// heap pointer to a length-prefixed buffer. The codegen
243+
// path that calls omc_arr_heapify before Op::Return is
244+
// currently DISABLED in dual_band.rs because of a JIT-
245+
// return-boundary segfault that hasn't been debugged.
246+
// When the codegen path is re-enabled, this materializer
247+
// wakes up automatically (no further changes needed here).
248+
let final_result = match (result, jf.returns_array_int) {
249+
(Some(heap_ptr), true) => {
250+
use omnimcode_core::value::HArray;
251+
// Safety: heap_ptr was produced by omc_arr_heapify
252+
// inside the JIT'd fn we just called. It points at a
253+
// [len, v0, ..., vN] Box<[i64]> the JIT side leaked
254+
// for us to consume.
255+
let arr = unsafe {
256+
let p = heap_ptr as *const i64;
257+
let len = *p as usize;
258+
let mut items = Vec::with_capacity(len);
259+
for k in 0..len {
260+
items.push(Value::HInt(HInt::new(*p.add(k + 1))));
261+
}
262+
items
263+
};
264+
// Free the heap allocation now that we've materialized
265+
// the data. After this point heap_ptr is dangling.
266+
unsafe { omnimcode_codegen::omc_arr_free(heap_ptr); }
267+
Some(Ok(Value::Array(HArray::from_vec(arr))))
268+
}
269+
(Some(scalar), false) => Some(Ok(Value::HInt(HInt::new(scalar)))),
270+
(None, _) => None,
271+
};
272+
// _pinned drops here, freeing the marshalled input buffers.
273+
// Safe because the JIT'd code didn't retain those pointers.
244274
drop(_pinned);
245-
result
275+
final_result
246276
},
247277
);
248278
interp.set_jit_dispatch(Some(dispatch));

omnimcode-codegen/src/dual_band.rs

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -557,6 +557,24 @@ impl<'ctx, 'a> DualBandLowerer<'ctx, 'a> {
557557
BasicValueEnum::IntValue(iv) => iv,
558558
_ => return Err(format!("hbit ret alpha not int at op{}", i)),
559559
};
560+
// L1.6 output-side bridge (DISABLED — see below):
561+
// The intent was to call omc_arr_heapify on the
562+
// top-of-stack frame pointer when the fn has
563+
// `@jit_returns_array_int`, so the buffer outlives
564+
// the JIT'd fn frame. The wiring went in cleanly
565+
// (extern helper declared + global-mapped, dispatch
566+
// materializer ready), but in end-to-end testing
567+
// the JIT'd fn segfaults on its `ret` instruction
568+
// AFTER omc_arr_heapify successfully runs and
569+
// returns. heapify completes, the heap ptr is
570+
// valid, but the trip back into Rust through the
571+
// extern "C" boundary corrupts something. Needs
572+
// proper debugging (stack alignment? LLVM CC
573+
// mismatch? alloca lifetime?) so left disabled
574+
// here. The infrastructure (JittedFn.returns_array_int,
575+
// omc_arr_heapify + omc_arr_free, dispatch
576+
// materializer) is in place for the future session
577+
// that fixes the actual segfault.
560578
self.builder
561579
.build_return(Some(&alpha_iv))
562580
.map_err(|e| format!("hbit ret at op{}: {}", i, e))?;

omnimcode-codegen/src/lib.rs

Lines changed: 87 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,67 @@ pub extern "C" fn omc_fold(value: i64) -> i64 {
7272
omnimcode_core::phi_pi_fib::fold_to_nearest_attractor(value)
7373
}
7474

75+
/// L1.6 output-side bridge: copy a length-prefixed frame-array buffer
76+
/// (alloca'd inside the JIT'd fn) into a heap allocation, return the
77+
/// heap pointer as i64. The frame buffer dies when the JIT'd fn
78+
/// returns; the heap copy outlives it so the dispatch can materialize
79+
/// the array on the host side.
80+
///
81+
/// Layout matches the L1.6 input bridge: slot 0 holds the length,
82+
/// slots 1..=N hold the elements. Caller must pair with `omc_arr_free`
83+
/// to release the heap allocation after marshalling.
84+
///
85+
/// # Safety
86+
/// `frame_ptr` must point at a valid length-prefixed `[i64]` allocation
87+
/// (slot 0 = length, slots 1..=len contiguous). Reading past slot[length]
88+
/// is UB. The JIT lowerer only emits this when it has just constructed
89+
/// such a buffer via Op::NewArray, so the invariant holds in practice.
90+
#[no_mangle]
91+
pub extern "C" fn omc_arr_heapify(frame_ptr: i64) -> i64 {
92+
// Safety: see doc comment. The JIT'd fn only passes frame pointers
93+
// that were freshly produced by emit_new_array, which always uses
94+
// the [len, v0, ..., vN] layout.
95+
let p = frame_ptr as *const i64;
96+
let len = unsafe { *p } as usize;
97+
// Copy `len + 1` i64s (including the leading length) into a fresh
98+
// heap-owned boxed slice. Box::leak gives us a pointer the host can
99+
// use, then later free via omc_arr_free.
100+
let mut buf: Vec<i64> = Vec::with_capacity(len + 1);
101+
unsafe {
102+
for i in 0..=len {
103+
buf.push(*p.add(i));
104+
}
105+
}
106+
let boxed = buf.into_boxed_slice();
107+
let raw = Box::into_raw(boxed) as *mut i64;
108+
raw as i64
109+
}
110+
111+
/// L1.6 output-side bridge: free a heap allocation produced by
112+
/// `omc_arr_heapify`. Called by the dispatch boundary after the
113+
/// returned array has been materialized into a Value::Array.
114+
///
115+
/// # Safety
116+
/// `heap_ptr` must be the pointer returned by a prior `omc_arr_heapify`
117+
/// call. Calling with any other pointer (including frame pointers or
118+
/// already-freed heap pointers) is UB.
119+
#[no_mangle]
120+
pub extern "C" fn omc_arr_free(heap_ptr: i64) {
121+
if heap_ptr == 0 { return; }
122+
unsafe {
123+
// Reconstruct the original Box<[i64]> from its raw pointer so
124+
// it drops correctly. We need the length, which we read from
125+
// slot 0 — same protocol as omc_arr_heapify wrote.
126+
let p = heap_ptr as *mut i64;
127+
let len = *p as usize;
128+
// Box::from_raw needs the original slice fat pointer; the
129+
// safest reconstruction is via std::slice::from_raw_parts_mut
130+
// + Box::from_raw on the slice pointer.
131+
let slice = std::slice::from_raw_parts_mut(p, len + 1);
132+
let _ = Box::from_raw(slice as *mut [i64]);
133+
}
134+
}
135+
75136
use std::collections::HashMap;
76137

77138
use inkwell::basic_block::BasicBlock;
@@ -113,6 +174,12 @@ pub struct JittedFn {
113174
/// Erased fn pointer. Cast to the right `unsafe extern "C" fn`
114175
/// signature at call time based on `arity`.
115176
pub fn_ptr: *const (),
177+
/// L1.6 output-side bridge: when true, the fn's i64 return is a
178+
/// heap pointer (produced by omc_arr_heapify before Op::Return)
179+
/// to a length-prefixed Box<[i64]>. The dispatch boundary
180+
/// materializes a Value::Array from it and calls omc_arr_free
181+
/// to release the heap allocation.
182+
pub returns_array_int: bool,
116183
}
117184

118185
// SAFETY: a raw function pointer is `Send + Sync` — it's plain data.
@@ -210,6 +277,16 @@ impl<'ctx> JitContext<'ctx> {
210277
Some(inkwell::module::Linkage::External),
211278
);
212279
engine.add_global_mapping(&fold_fn, omc_fold as *const () as usize);
280+
// L1.6 output-side bridge helpers. heapify copies a frame array
281+
// to heap so the JIT'd fn can return it as a stable pointer;
282+
// free is called by the dispatch after marshalling.
283+
let heapify_ty = i64_type.fn_type(&[i64_type.into()], false);
284+
let heapify_fn = module.add_function(
285+
"omc_arr_heapify",
286+
heapify_ty,
287+
Some(inkwell::module::Linkage::External),
288+
);
289+
engine.add_global_mapping(&heapify_fn, omc_arr_heapify as *const () as usize);
213290
Ok(JitContext {
214291
context,
215292
module,
@@ -458,10 +535,18 @@ impl<'ctx> JitContext<'ctx> {
458535
let mut out: HashMap<String, JittedFn> = HashMap::new();
459536
for name in &succeeded {
460537
let suffixed = format!("{}_hbit", name);
461-
let arity = module.functions.get(name).map(|cf| cf.params.len()).unwrap_or(0);
538+
let cf_opt = module.functions.get(name);
539+
let arity = cf_opt.map(|cf| cf.params.len()).unwrap_or(0);
540+
// L1.6: read the user's `@jit_returns_array_int` pragma from
541+
// the source FunctionDef (forwarded through CompiledFunction)
542+
// so the dispatch knows to materialize the i64 return as a
543+
// Value::Array of HInts.
544+
let returns_array_int = cf_opt
545+
.map(|cf| cf.pragmas.iter().any(|p| p == "jit_returns_array_int"))
546+
.unwrap_or(false);
462547
match unsafe { self.extract_raw_fn_ptr(&suffixed, arity) } {
463548
Ok(fn_ptr) => {
464-
out.insert(name.clone(), JittedFn { arity, fn_ptr });
549+
out.insert(name.clone(), JittedFn { arity, fn_ptr, returns_array_int });
465550
}
466551
Err(_) => {
467552
// Extraction failure → skip; tree-walk handles it.

omnimcode-core/src/bytecode.rs

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -224,6 +224,13 @@ pub struct CompiledFunction {
224224
/// produces. Cell<()> would suffice but Pos is Copy so a plain
225225
/// Vec works.
226226
pub op_positions: Vec<crate::ast::Pos>,
227+
/// Function-level pragmas (verbatim from `@pragma_name` decorators
228+
/// on the source FunctionDef). Forwarded by the compiler from the
229+
/// AST so downstream consumers (codegen, JIT dispatch) can read
230+
/// them without re-parsing. Common pragmas: `jit_returns_array_int`
231+
/// (L1.6 output-side bridge marker), `no_heal_*` (heal-pass opt-outs
232+
/// — these don't actually reach the compiler; kept here for parity).
233+
pub pragmas: Vec<String>,
227234
}
228235

229236
/// A compiled module / program.
@@ -252,6 +259,7 @@ impl Default for Module {
252259
constants: Vec::new(),
253260
call_cache: Vec::new(),
254261
op_positions: Vec::new(),
262+
pragmas: Vec::new(),
255263
},
256264
functions: std::collections::HashMap::new(),
257265
lambda_asts: Vec::new(),

omnimcode-core/src/compiler.rs

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -714,6 +714,7 @@ impl Compiler {
714714
params.clone(),
715715
vec![None; params.len()],
716716
None,
717+
Vec::new(), // lambdas don't carry pragmas
717718
);
718719
self.pending_lambdas.push(func);
719720
for nf in nested {
@@ -1014,6 +1015,7 @@ impl Compiler {
10141015
params: Vec<String>,
10151016
param_types: Vec<Option<String>>,
10161017
return_type: Option<String>,
1018+
pragmas: Vec<String>,
10171019
) -> CompiledFunction {
10181020
let n = self.ops.len();
10191021
CompiledFunction {
@@ -1034,6 +1036,7 @@ impl Compiler {
10341036
v.resize(n, crate::ast::Pos::unknown());
10351037
v
10361038
},
1039+
pragmas,
10371040
}
10381041
}
10391042
}
@@ -1086,7 +1089,7 @@ pub fn compile_program(statements: &[Statement]) -> Result<Module, String> {
10861089
param_types,
10871090
body,
10881091
return_type,
1089-
..
1092+
pragmas,
10901093
} = stmt
10911094
{
10921095
let mut fc = Compiler::with_user_fns(user_fns.clone());
@@ -1119,6 +1122,7 @@ pub fn compile_program(statements: &[Statement]) -> Result<Module, String> {
11191122
params.clone(),
11201123
param_types.clone(),
11211124
return_type.clone(),
1125+
pragmas.clone(),
11221126
);
11231127
module.functions.insert(name.clone(), func);
11241128
}
@@ -1140,7 +1144,7 @@ pub fn compile_program(statements: &[Statement]) -> Result<Module, String> {
11401144
}
11411145
let lambda_asts = std::mem::take(&mut mc.pending_lambda_asts);
11421146
module.lambda_asts.extend(lambda_asts);
1143-
module.main = mc.finish("__main__".to_string(), Vec::new(), Vec::new(), None);
1147+
module.main = mc.finish("__main__".to_string(), Vec::new(), Vec::new(), None, Vec::new());
11441148

11451149
Ok(module)
11461150
}

0 commit comments

Comments
 (0)