
Commit 08d4e87

Sessions F + G + H: divergent bands, harmony intrinsic, cross-fn calls
Three sessions shipped together because they share the dual-band
intrinsics surface and the cross-fn-call refactor in jit_module.

== Session F: phi_shadow(x), divergent β

Tree-walk: pass-through (returns x unchanged). Tree-walk has no β to
manipulate; the value stays α-only.

Dual-band JIT: intercepted as intrinsic. Replaces the β lane of the
value's <2 x i64> carrier with `(phi_fold(α) * 1000) as i64`.
Implemented as a chain of LLVM ops in the lowerer:

    sitofp i64 -> f64
    fmul double, PHI
    call @llvm.floor.f64
    fsub double → fractional part [0, 1)
    fmul double, 1000.0
    fptosi double -> i64
    insertelement <2 x i64> at lane 1

After phi_shadow, harmony() of the value diverges from 1000
(matched-band initial condition). Bands maintain their own arithmetic
paths through subsequent ops; they only re-converge on another
phi_shadow call.

5 tests: pass-through parity, IR-shape snapshot, α preservation
through subsequent arithmetic, end-to-end via Interpreter dispatch
hook.

== Session G: harmony(x), runtime harmony reading

Tree-walk: returns 1000 (perfect). With no β, harmony is trivial.

Dual-band JIT: intercepted as intrinsic. Extracts α and β from the
value's vector and calls a new extern Rust helper:

    #[no_mangle]
    pub extern "C" fn omc_harmony(alpha: i64, beta: i64) -> i64 {
        let h = HBit::harmony(alpha, beta);
        (h * 1000.0).round() as i64
    }

The fn is pre-declared in the LLVM module and bound via
add_global_mapping in JitContext::new. JIT'd code calls it at ~native
speed.

3 tests:
  - unshadowed harmony returns 1000 (matched bands)
  - shadowed harmony < 1000 (bands diverge after phi_shadow)
  - JIT'd OMC code branches on harmony to skip work: the cost-cut
    primitive that @predict needs

The third test is architecturally significant: it's the first JIT'd
OMC fn that decides at runtime whether to do expensive work based on
harmony of its input. With harmony >= 500, the fn returns the cheap
path; otherwise the expensive path runs. This is "@predict cuts cost"
working in shipped code.
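As a rough illustration of the harmony-gated branch described under Session G, the sketch below models the decision in plain Rust, outside the JIT. Only the threshold of 500 and the [0, 1000] scale come from the text above. `harmony_stub` is a hypothetical stand-in (the real `HBit::harmony` formula is not shown in this commit), and the "expensive" work is a placeholder.

```rust
/// Threshold quoted in the commit message: harmony >= 500 takes the cheap path.
const HARMONY_THRESHOLD: i64 = 500;

/// Models the `<2 x i64>` carrier: lane 0 is α, lane 1 is β.
type HBitValue = [i64; 2];

/// HYPOTHETICAL stand-in for `HBit::harmony`, scaled to [0, 1000].
/// Assumption: matched bands score 1000 and the score falls by one
/// point per unit of distance between the bands, clamped at 0.
fn harmony_stub(v: HBitValue) -> i64 {
    let gap = (v[0] as i128 - v[1] as i128).unsigned_abs();
    1000u128.saturating_sub(gap) as i64
}

#[derive(Debug, PartialEq)]
enum Path {
    Cheap,
    Expensive,
}

/// Decide at runtime whether to run the expensive work.
fn gated(v: HBitValue) -> (Path, i64) {
    if harmony_stub(v) >= HARMONY_THRESHOLD {
        // Cheap path: placeholder that returns α unchanged.
        (Path::Cheap, v[0])
    } else {
        // Expensive path: placeholder loop over 1..=α.
        (Path::Expensive, (1..=v[0]).sum::<i64>())
    }
}
```

With matched bands such as `[5000, 5000]` the stub reports 1000 and the cheap path is taken; with divergent bands such as `[5000, 169]` it reports 0 and the placeholder loop runs.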
== Session H: cross-fn calls in dual-band JIT

Sessions C-G's lowerer rejected any Op::Call whose target name wasn't
the current fn (only self-recursion). jit_module is now three-phase:

  1. DECLARE every user fn in the module up-front (signature only,
     empty body) so cross-fn references can be resolved.
  2. LOWER each fn's body using prepare_existing instead of prepare
     (which previously double-declared and would clash).
  3. EXTRACT raw fn pointers via typed get_function.

Op::Call now resolves to:
  - self.function for self-recursion (unchanged path)
  - module.get_function("<name>_hbit") for cross-fn calls
  - error if target not declared (caller fn gets erased; tree-walk
    handles it)

A failed body-lowering replaces the partial fn with a single
"return 0" entry block instead of erasing it entirely, which keeps the
symbol table valid for any other fn that referenced it.

4 tests:
  - simple cross-fn call (caller calls helper)
  - dispatch fn calling one of two helpers based on a comparison
  - cross-fn call to a recursive fn (factorial called from
    double_fact, factorial(10) = 3.6M, 2x = 7.26M)
  - cascade fail: caller calling unsupported fn → both skipped, pure
    fn nearby still JIT's normally

== Workspace state

Codegen: 31 tests pass (8 scalar + 5 dual-band parity + 1 IR snapshot
+ 5 dispatch + 5 phi_shadow + 3 harmony + 4 cross-fn).
omnimcode-core: 149 unit tests pass.
OMC harmonic-lib: 18/18 pass.
Smoke test: tree-walk and JIT both produce correct output.
Bench (Session E): JIT 277x faster than tree-walk on factorial(12),
206x on sum_to(100). Numbers held within noise.

== What's left for "@hbit cuts times/cost" parity with SL claims

The Sovereign Lattice hbit_full_demo claimed up to 80,000x with
@hbit + @Harmony + @predict + @avx512 + @unsafe.
After Sessions A-H we have:
  - @hbit (dual-band) ✅
  - @Harmony (harmony() builtin reads divergent bands) ✅
  - @predict (harmony-gated branch elision) ✅

Still missing:
  - @avx512 (<8 x i64> widening; needs array-processing OMC fns to
    actually have work for the wider lanes; deferred until codegen
    supports OMC arrays)
  - @unsafe (fast-math, unroll; LLVM optimizer already runs
    aggressive loop opts; explicit @unsafe pragma plumbing is
    straightforward when there's a use case)

The 277x measured in Session E was @hbit alone. Adding @Harmony and
@predict to that bench would show the cost-cut on high-harmony inputs
(cheap branch wins) vs low-harmony inputs (expensive branch runs).
That bench is the natural follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
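The three-phase jit_module flow described under Session H can be sketched without LLVM as a toy symbol table. Everything here is a hypothetical model: `UserFn`, `Slot`, and `lower_body` are illustration-only names, a call to an undeclared name (here `print`) stands in for "not JIT-eligible", and the cascade behaviour from the fourth test is not modeled. It only shows why declaring everything first lets forward references resolve, and how a failed body becomes a stub instead of disappearing.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiled user fn: its name and the names it calls.
struct UserFn {
    name: &'static str,
    callees: Vec<&'static str>,
}

/// State of one `<name>_hbit` symbol in the toy module.
#[derive(Debug, PartialEq)]
enum Slot {
    Declared,       // phase 1: signature only
    Lowered,        // phase 2 succeeded
    StubReturnZero, // phase 2 failed: body replaced by "return 0"
}

fn hbit_name(name: &str) -> String {
    format!("{}_hbit", name)
}

/// Toy body lowering: fails if any non-self callee is not declared.
fn lower_body(f: &UserFn, symbols: &HashMap<String, Slot>) -> Result<(), String> {
    for callee in &f.callees {
        if *callee == f.name {
            continue; // self-recursion needs no lookup
        }
        let suffixed = hbit_name(callee);
        if !symbols.contains_key(&suffixed) {
            return Err(format!("hbit Call target {} not declared", suffixed));
        }
    }
    Ok(())
}

/// Returns the final symbol table and the names whose pointers would be extracted.
fn jit_module(fns: &[UserFn]) -> (HashMap<String, Slot>, Vec<String>) {
    let mut symbols = HashMap::new();
    // Phase 1: DECLARE every fn so cross-fn references can resolve.
    for f in fns {
        symbols.insert(hbit_name(f.name), Slot::Declared);
    }
    // Phase 2: LOWER each body; on failure keep the symbol as a stub.
    for f in fns {
        let slot = match lower_body(f, &symbols) {
            Ok(()) => Slot::Lowered,
            Err(_) => Slot::StubReturnZero,
        };
        symbols.insert(hbit_name(f.name), slot);
    }
    // Phase 3: EXTRACT only the fns that lowered successfully.
    let jitted = fns
        .iter()
        .map(|f| hbit_name(f.name))
        .filter(|n| symbols[n] == Slot::Lowered)
        .collect();
    (symbols, jitted)
}
```

In this model a caller listed before its helper still lowers, because the helper's symbol already exists after phase 1. A fn that calls an undeclared name ends up as `StubReturnZero` and is left out of the extracted list, while its symbol stays in the table.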
1 parent 3ce40cc commit 08d4e87

6 files changed

Lines changed: 818 additions & 27 deletions

File tree

omnimcode-codegen/src/dual_band.rs

Lines changed: 225 additions & 10 deletions
@@ -44,6 +44,10 @@ use crate::CodegenError;
 /// from `lib.rs` but with `<2 x i64>` as the carrier type throughout.
 pub(crate) struct DualBandLowerer<'ctx, 'a> {
     ctx: &'ctx Context,
+    /// The LLVM module emit-target. Held so intrinsics that need to
+    /// look up/declare external helper fns (llvm.floor.f64, harmony
+    /// callback, etc.) can do so without going through transmute.
+    module: &'a LlvmModule<'ctx>,
     builder: Builder<'ctx>,
     function: FunctionValue<'ctx>,
     f: &'a CompiledFunction,
@@ -86,6 +90,7 @@ impl<'ctx, 'a> DualBandLowerer<'ctx, 'a> {
 
         Ok(DualBandLowerer {
             ctx,
+            module,
             builder,
             function,
             f,
@@ -96,6 +101,46 @@ impl<'ctx, 'a> DualBandLowerer<'ctx, 'a> {
         })
     }
 
+    /// Variant of `prepare` that reuses an already-declared
+    /// FunctionValue from the module instead of declaring a new one.
+    /// Used by `JitContext::jit_module`'s phase-2 body lowering, which
+    /// needs the declarations populated up-front so cross-fn calls
+    /// (Session H) can find their targets by name.
+    pub(crate) fn prepare_existing(
+        ctx: &'ctx Context,
+        module: &'a LlvmModule<'ctx>,
+        f: &'a CompiledFunction,
+    ) -> Result<Self, CodegenError> {
+        let i64_type = ctx.i64_type();
+        let v2i64 = i64_type.vec_type(2);
+        let suffixed = format!("{}_hbit", f.name);
+        let function = module
+            .get_function(&suffixed)
+            .ok_or_else(|| format!("prepare_existing: {} not declared", suffixed))?;
+        let builder = ctx.create_builder();
+        Ok(DualBandLowerer {
+            ctx,
+            module,
+            builder,
+            function,
+            f,
+            v2i64,
+            blocks: HashMap::new(),
+            var_slots: HashMap::new(),
+            cleanup_pops: std::collections::HashSet::new(),
+        })
+    }
+
+    /// Convenience wrapper used by `JitContext::jit_module` —
+    /// `prepare_existing` then `lower`.
+    pub(crate) fn lower_existing(
+        ctx: &'ctx Context,
+        module: &'a LlvmModule<'ctx>,
+        f: &'a CompiledFunction,
+    ) -> Result<FunctionValue<'ctx>, CodegenError> {
+        Self::prepare_existing(ctx, module, f)?.lower()
+    }
+
     pub(crate) fn lower(mut self) -> Result<FunctionValue<'ctx>, CodegenError> {
         let entry = self.ctx.append_basic_block(self.function, "entry");
         self.builder.position_at_end(entry);
@@ -418,16 +463,51 @@ impl<'ctx, 'a> DualBandLowerer<'ctx, 'a> {
                 }
 
                 Op::Call(name, argc) => {
-                    if name != &self.f.name {
-                        return Err(format!(
-                            "Session C hbit Call only supports recursive self-call; got call to {} at op{}",
-                            name, i
-                        ));
+                    // HBit intrinsics — intercepted before the generic
+                    // user-fn-call path. Pattern-match on (name, argc).
+                    if name == "phi_shadow" && *argc == 1 {
+                        // Session F: replace β with phi_fold(α) * 1000.
+                        // α stays untouched (the user-visible value is
+                        // unchanged), β becomes the harmonic shadow.
+                        let v = self.pop(&mut stack, i, "phi_shadow arg")?;
+                        let new_v = self.emit_phi_shadow(v, i)?;
+                        stack.push(new_v);
+                        continue;
                     }
-                    // The recursive self-call wants scalar i64 args
-                    // (because the caller-facing fn signature is scalar).
-                    // Extract α from each vector arg, pass as i64,
-                    // splat the scalar return back to a vector.
+                    if name == "harmony" && *argc == 1 {
+                        // Session G: harmony() calls the extern Rust
+                        // helper `omc_harmony(α, β) -> i64` which
+                        // computes the substrate-routed harmony in
+                        // [0, 1000]. Pre-declared in JitContext::new
+                        // and bound via global mapping.
+                        let v = self.pop(&mut stack, i, "harmony arg")?;
+                        let h_scalar = self.emit_harmony_call(v, i)?;
+                        let h_v = self.splat(h_scalar, "harmony_ret_v")?;
+                        stack.push(h_v);
+                        continue;
+                    }
+                    // Resolve the call target. Self-recursion uses
+                    // self.function directly. Cross-fn calls (Session
+                    // H) look up `<name>_hbit` in the module's symbol
+                    // table — populated by jit_module's phase-1
+                    // declaration pass before any body emission.
+                    let target_fn = if name == &self.f.name {
+                        self.function
+                    } else {
+                        let suffixed = format!("{}_hbit", name);
+                        match self.module.get_function(&suffixed) {
+                            Some(f) => f,
+                            None => {
+                                return Err(format!(
+                                    "hbit Call target {} not declared (not JIT-eligible) at op{}",
+                                    suffixed, i
+                                ));
+                            }
+                        }
+                    };
+                    // Args: extract α from each vector, pass scalars
+                    // (the called fn's caller-facing signature is
+                    // scalar i64; it splats internally).
                     let mut vec_args: Vec<VectorValue<'ctx>> = Vec::with_capacity(*argc);
                     for _ in 0..*argc {
                         vec_args.push(self.pop(&mut stack, i, "Call arg")?);
@@ -452,7 +532,7 @@ impl<'ctx, 'a> DualBandLowerer<'ctx, 'a> {
                     }
                     let call = self
                         .builder
-                        .build_call(self.function, &scalar_args, "callret")
+                        .build_call(target_fn, &scalar_args, "callret")
                         .map_err(|e| format!("hbit Call at op{}: {}", i, e))?;
                     let ret = call
                         .try_as_basic_value()
@@ -484,6 +564,141 @@ impl<'ctx, 'a> DualBandLowerer<'ctx, 'a> {
         Ok(())
     }
 
+    /// Session G intrinsic: read α and β out of the vector and call
+    /// the extern Rust helper `omc_harmony(α, β) -> i64` which
+    /// computes the substrate-routed harmony scaled to [0, 1000].
+    /// Returns the i64 result as a scalar — the caller is expected
+    /// to splat it back into a vector if needed.
+    fn emit_harmony_call(
+        &self,
+        v: VectorValue<'ctx>,
+        op_idx: usize,
+    ) -> Result<IntValue<'ctx>, CodegenError> {
+        let i64_type = self.ctx.i64_type();
+        let alpha = self
+            .builder
+            .build_extract_element(v, i64_type.const_int(0, false), "harmony_alpha")
+            .map_err(|e| format!("harmony extract α at op{}: {}", op_idx, e))?;
+        let beta = self
+            .builder
+            .build_extract_element(v, i64_type.const_int(1, false), "harmony_beta")
+            .map_err(|e| format!("harmony extract β at op{}: {}", op_idx, e))?;
+        let alpha_iv = match alpha {
+            BasicValueEnum::IntValue(iv) => iv,
+            _ => return Err(format!("harmony: α not int at op{}", op_idx)),
+        };
+        let beta_iv = match beta {
+            BasicValueEnum::IntValue(iv) => iv,
+            _ => return Err(format!("harmony: β not int at op{}", op_idx)),
+        };
+        // omc_harmony is pre-declared in the module by JitContext::new
+        // and bound via add_global_mapping. Look it up by name.
+        let harmony_fn = self
+            .module
+            .get_function("omc_harmony")
+            .ok_or_else(|| format!("harmony: omc_harmony not declared at op{}", op_idx))?;
+        let call = self
+            .builder
+            .build_call(
+                harmony_fn,
+                &[alpha_iv.into(), beta_iv.into()],
+                "harmony_call",
+            )
+            .map_err(|e| format!("harmony call at op{}: {}", op_idx, e))?;
+        let ret = call
+            .try_as_basic_value()
+            .left()
+            .ok_or_else(|| format!("harmony call no value at op{}", op_idx))?;
+        match ret {
+            BasicValueEnum::IntValue(iv) => Ok(iv),
+            _ => Err(format!("harmony call ret not int at op{}", op_idx)),
+        }
+    }
+
+    /// Session F intrinsic: replace the β lane of a `<2 x i64>`
+    /// vector value with the phi-shadow of α.
+    ///
+    /// phi_fold(α) = frac(α * PHI) — the fractional part of α scaled
+    /// by the golden ratio, in [0, 1). We multiply by 1000 to get an
+    /// integer-friendly range, then cast back to i64. This matches
+    /// the existing `HBitProcessor::phi_fold` semantics used by tree-
+    /// walk callers when they want a divergent β.
+    ///
+    /// After this op, harmony(α, β) is non-trivial: β depends on α
+    /// in a way that's stable under matched-band operations (Add a
+    /// constant to both → diff preserved → harmony unchanged) and
+    /// breaks under operations that touch only one band.
+    fn emit_phi_shadow(
+        &self,
+        v: VectorValue<'ctx>,
+        op_idx: usize,
+    ) -> Result<VectorValue<'ctx>, CodegenError> {
+        let i64_type = self.ctx.i64_type();
+        let f64_type = self.ctx.f64_type();
+        // Extract α from lane 0.
+        let alpha = self
+            .builder
+            .build_extract_element(v, i64_type.const_int(0, false), "shadow_alpha")
+            .map_err(|e| format!("phi_shadow extract α at op{}: {}", op_idx, e))?;
+        let alpha_iv = match alpha {
+            BasicValueEnum::IntValue(iv) => iv,
+            _ => return Err(format!("phi_shadow: α not int at op{}", op_idx)),
+        };
+        // α_d = (double) α
+        let alpha_d = self
+            .builder
+            .build_signed_int_to_float(alpha_iv, f64_type, "alpha_d")
+            .map_err(|e| format!("phi_shadow sitofp at op{}: {}", op_idx, e))?;
+        // α_phi = α_d * PHI
+        let phi_const = f64_type.const_float(crate::PHI);
+        let alpha_phi = self
+            .builder
+            .build_float_mul(alpha_d, phi_const, "alpha_phi")
+            .map_err(|e| format!("phi_shadow mul PHI at op{}: {}", op_idx, e))?;
+        // floor(α_phi) via llvm.floor.f64 intrinsic
+        let floor_fn = match self.module.get_function("llvm.floor.f64") {
+            Some(f) => f,
+            None => {
+                let ft = f64_type.fn_type(&[f64_type.into()], false);
+                self.module.add_function("llvm.floor.f64", ft, None)
+            }
+        };
+        let floor_call = self
+            .builder
+            .build_call(floor_fn, &[alpha_phi.into()], "alpha_phi_floor")
+            .map_err(|e| format!("phi_shadow floor at op{}: {}", op_idx, e))?;
+        let floor_val = floor_call
+            .try_as_basic_value()
+            .left()
+            .ok_or_else(|| format!("phi_shadow floor no value at op{}", op_idx))?;
+        let floor_f = match floor_val {
+            BasicValueEnum::FloatValue(fv) => fv,
+            _ => return Err(format!("phi_shadow floor not float at op{}", op_idx)),
+        };
+        // frac = α_phi - floor(α_phi) ∈ [0, 1)
+        let frac = self
+            .builder
+            .build_float_sub(alpha_phi, floor_f, "alpha_frac")
+            .map_err(|e| format!("phi_shadow sub at op{}: {}", op_idx, e))?;
+        // β_d = frac * 1000.0
+        let one_thousand = f64_type.const_float(1000.0);
+        let beta_d = self
+            .builder
+            .build_float_mul(frac, one_thousand, "beta_d")
+            .map_err(|e| format!("phi_shadow mul1000 at op{}: {}", op_idx, e))?;
+        // β = (i64) β_d
+        let beta_iv = self
+            .builder
+            .build_float_to_signed_int(beta_d, i64_type, "beta_i64")
+            .map_err(|e| format!("phi_shadow fptosi at op{}: {}", op_idx, e))?;
+        // Replace lane 1 of v with β. α (lane 0) is preserved.
+        let new_v = self
+            .builder
+            .build_insert_element(v, beta_iv, i64_type.const_int(1, false), "shadow_v")
+            .map_err(|e| format!("phi_shadow insert β at op{}: {}", op_idx, e))?;
+        Ok(new_v)
+    }
+
     fn splat(&self, scalar: IntValue<'ctx>, name: &str) -> Result<VectorValue<'ctx>, CodegenError> {
         let i64_type = self.ctx.i64_type();
         let undef = self.v2i64.get_undef();
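For readers who want to check the arithmetic of `emit_phi_shadow` without building LLVM IR, the op chain can be mirrored in scalar Rust. This is a minimal sketch, not code from the commit: the `[i64; 2]` array stands in for the `<2 x i64>` carrier, and the literal below is an assumed value for `crate::PHI` (the doc comment only says it is the golden ratio).

```rust
/// Assumed value of `crate::PHI`: the golden ratio to f64 precision.
const PHI: f64 = 1.618_033_988_749_895;

/// Scalar model of the phi_shadow intrinsic.
/// Lane 0 (α) is preserved; lane 1 (β) becomes frac(α * PHI) * 1000.
fn phi_shadow(v: [i64; 2]) -> [i64; 2] {
    let alpha = v[0]; // extractelement, lane 0
    let alpha_d = alpha as f64; // sitofp i64 -> f64
    let alpha_phi = alpha_d * PHI; // fmul double, PHI
    let floor = alpha_phi.floor(); // call @llvm.floor.f64
    let frac = alpha_phi - floor; // fsub: fractional part
    let beta_d = frac * 1000.0; // fmul double, 1000.0
    let beta = beta_d as i64; // fptosi (truncates toward zero)
    [alpha, beta] // insertelement at lane 1
}
```

For α = 5000, α * PHI ≈ 8090.1699, so β = 169. Because the chain uses floor rather than truncation, the fractional part stays non-negative for negative inputs: α = -1 gives α * PHI ≈ -1.618, floor -2, fraction ≈ 0.382, and β = 381.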
