Continuous-time Neural Field Theory (NFT) — classification through gain-modulated oscillator dynamics on spherical topology, not feedforward computation or attention.
Oscillators on a Fibonacci sphere classify via thalamic gain modulation. No attention mechanism, no value projection — just volume control over which neurons are amplified. 95% MNIST accuracy with 47K parameters.
- Background
- Key Results
- Architecture
- Installation
- Quick Start
- Project Structure
- Experiments
- Visualization
- Related Research
- Current Status
- Roadmap
- Contributing
- License
Standard neural networks compute by propagating activations through layers. Transformers add attention — weighted mixing of representations. The brain does neither: it computes through oscillatory dynamics where the thalamus modulates gain (volume) of cortical populations.
NeuralField implements this principle:
- Oscillators on an S² manifold provide a dynamic substrate (not the computation itself)
- Thalamic gain modulation selects which neurons are amplified — like a mixing board, not attention
- Population rate readout reads the weighted firing rates — like how motor cortex actually works
- CfC closed-form dynamics — no integration loops, analytically solved ODE
The key insight: information processing happens through gain control, not information mixing. Transformers mix representations (Q·K → V weighted sum). Our model just adjusts volume per neuron. This is simpler, more parameter-efficient, and neuroscience-grounded.
**MNIST (single frame):**

| Model | Params | Test Acc |
|---|---|---|
| Frozen CNN → Linear | 500 | 88.2% |
| Frozen CNN → Oscillator + Thalamic | 47K | 95.0 ± 0.6% |
| Frozen CNN → MLP | 7.7K | 94.8% |
| CNN → MLP (end-to-end, reference) | ~155K | 99.1% |
**Sequential MNIST:**

| Model | Params | Test Acc |
|---|---|---|
| Oscillator + Slow Decay (bias=-5) | 36K | 96.55% |
| LSTM(128) | 82K | 98.70% |
| GRU(128) | 62K | 98.57% |
| Oscillator (default bias=0) | 36K | 9.6% |
Key finding: a single initialization change (`interp_bias=-5` → σ(-5) ≈ 0.007 per-step update rate) takes the oscillator from complete memory failure (9.6%) to near-LSTM performance (96.55%) with less than half the parameters. The architecture works for temporal data — the bottleneck was the interpolation rate, not the design.
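The arithmetic behind this is easy to check. A back-of-envelope sketch in plain Python, treating the per-step update rate as t = σ(interp_bias) and ignoring the g·λ·Δt/τ contribution:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def retention(interp_bias, steps=28):
    """Fraction of the initial state surviving after `steps` CfC updates,
    if each step keeps (1 - t) of the old state with t = sigmoid(bias)."""
    t = sigmoid(interp_bias)
    return (1.0 - t) ** steps

print(retention(0))    # bias=0: t=0.5 → ~3.7e-09, memory destroyed
print(retention(-5))   # bias=-5: t≈0.0067 → ~0.83, most of the state survives
```

Over a 28-step sequence, the default bias wipes out the state entirely, while `bias=-5` preserves roughly 83% of it — which is the whole gap between 9.6% and 96.55%.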
```
[CNN Backbone] (frozen, pretrained 99%)
      │
      ▼ 49 features
[Input Projection] nn.Linear(49, 649)  ← learned sensory→cortical mapping
      │
      ▼ 649 cortical nodes
┌─────────────────────────────────────────────────┐
│               S² OSCILLATOR FIELD               │
│                                                 │
│  Sensory (49) ─── Motor (100) ─── PFC (500)     │
│                   10 sectors                    │
│                                                 │
│  CfC dynamics (no loop):                        │
│    t = σ(g·λ·Δt/τ + bias)                       │
│    x_new = x_old·(1-t) + x_ss·t                 │
│                                                 │
│  F_sync = Σ K_ij (x_j - x_i)                    │
└─────────────┬───────────────────────────────────┘
              │ node states (xr, xi)
              ▼
┌─────────────────────────────────────────────────┐
│        THALAMIC ROUTER (gain modulation)        │
│                                                 │
│  8 nuclei (multi-head, NOT attention):          │
│    Q = regional_summary → [B, K, d]             │
│    K = node_state       → [B, N, K, d]          │
│    gain = σ(Q·K - TRN_inhibition + CT_feedback) │
│                                                 │
│  No V projection. No weighted sum.              │
│  Just per-node gain ∈ (0, 1) — volume control.  │
└─────────────┬───────────────────────────────────┘
              │ gain [B, N]
              ▼
[Population Rate Readout]
  rate = √(xr² + xi²) × gain
  logits = Linear(rate) → 10 classes
```
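Read top to bottom, the pipeline above can be sketched in a few lines of PyTorch. This is an illustrative sketch, not the repository's code: the sizes, the scalar CfC constants, and the placeholder steady-state and gain tensors are all assumptions; the real modules live in `neuralfield/`.

```python
import torch

torch.manual_seed(0)
B, N = 4, 649                           # batch, cortical nodes (illustrative)
xr, xi = torch.randn(B, N), torch.randn(B, N)   # oscillator state (real, imag)

# CfC closed-form step: interpolate toward a steady state, no solver loop.
g, lam, dt, tau, bias = 1.0, 1.0, 0.1, 1.0, -5.0
t = torch.sigmoid(torch.tensor(g * lam * dt / tau + bias))
x_ss = torch.zeros(B, N)                # steady state (placeholder here)
xr = xr * (1 - t) + x_ss * t            # the xi channel updates the same way

# Thalamic gain: a per-node volume in (0, 1); the real router computes
# σ(Q·K - TRN_inhibition + CT_feedback), random values stand in here.
gain = torch.sigmoid(torch.randn(B, N))

# Population rate readout: gated firing rates into a linear classifier.
rate = torch.sqrt(xr ** 2 + xi ** 2) * gain
logits = torch.nn.Linear(N, 10)(rate)   # [B, 10]
```

Note that the only "routing" is the elementwise multiply by `gain` — there is no step anywhere that mixes one node's state into another's representation.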
```bash
git clone https://github.com/neomakes/neural-field.git
cd neural-field
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install seaborn  # for visualization
```

Quick start — MLP baseline (oscillator bypassed):

```bash
python3 scripts/train.py bypass_oscillator=true training.epochs=15 \
    logging.wandb_mode=disabled
```

Full model — frozen CNN backbone + thalamic readout:

```bash
python3 scripts/train.py \
    backbone.enabled=true backbone.freeze=true \
    backbone.pretrained=checkpoints/mlp_best.pt \
    cell.num_sensory=49 readout.mode=thalamic \
    training.energy_weight=0.0 training.confidence_weight=0.0 \
    logging.wandb_mode=disabled
```

Visualize the oscillator dynamics:

```bash
python3 notebooks/visualize_dynamics.py
# → outputs/visualizations/*.png
```

```
neural-field/
├── neuralfield/                 # Main package (v2)
│   ├── cell.py                  # OscSeaCell — CfC closed-form RNN cell
│   ├── rnn.py                   # OscSeaRNN — sequence wrapper
│   ├── micro.py                 # MicroLayer — sigmoid interpolation dynamics
│   ├── macro.py                 # MacroLayer — legacy ambassador gating
│   ├── router.py                # ThalamicRouter — multi-head gain modulation
│   ├── backbone.py              # CNNBackbone, CNNMLPBaseline
│   ├── loss.py                  # Free energy + training loss
│   └── grid.py                  # Fibonacci sphere topology + sectors
├── scripts/
│   └── train.py                 # Hydra training entry point
├── notebooks/
│   └── visualize_dynamics.py    # Oscillator dynamics visualization
├── config/
│   ├── default.yaml             # Full Hydra config
│   └── ablation/                # Experiment override configs
├── legacy/                      # Stage 5 Phase 1 code (reproducibility)
├── doc/
│   ├── nft_theory.md            # v2.0 theoretical specification
│   └── validation_plan.md       # Original experiment design
├── CLAUDE.md
├── TASKS.md
├── requirements.txt
└── LICENSE                      # MIT
```
- Step 1 — sigmoid interpolation: replaced the vanishing-gradient `exp(-λΔt/τ)` decay with `σ(λΔt/τ + bias)` sigmoid interpolation. Result: 10% → 87.6%.
- Step 2 — thalamic readout: ThalamicRouter with multi-head gain modulation + population rate readout. Result: 87.6% → 94.1%.
- Parameter-matched ablation: Thalamic (47K) = 95.0% vs MLP (206K) = 93.9%. Architecture effect confirmed — not just more parameters.
- Step 3 — sequential chunks: CNN features split into 7 chunks, fed as a sequence. Result: 46.6% — CfC memory retention insufficient for temporal accumulation.
- Failure analysis: the oscillator saturates at the magnitude clamp (10.0) within 3 steps, and CfC sigmoid interpolation (t ≈ 0.5) forgets 50% of the previous state each step. After 28 steps, 0.5^28 ≈ 0 — Sequential MNIST is impossible without a memory fix.
Three approaches under investigation:
- Slow decay init (bias=-5 → t≈0.007 per step) — validated at 96.55% on SeqMNIST
- Gated memory (LSTM-style cell state)
- Coupling memory (stronger F_sync preserves collective patterns)
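The coupling route leans on the synchronization force already present in the oscillator field, F_sync = Σ K_ij (x_j − x_i). A toy sketch of that force and its effect (sizes, the symmetric random coupling, and the Euler step are illustrative assumptions, not the repository's code):

```python
import torch

torch.manual_seed(0)
N = 8                                   # toy field size (illustrative)
A = torch.rand(N, N) * 0.1
K = (A + A.T) / 2                       # symmetric coupling matrix K_ij

x = torch.randn(N)                      # node states

# F_sync_i = Σ_j K_ij (x_j - x_i): each node is pulled toward the nodes
# it is coupled to; a uniform field feels zero force.
F_sync = (K * (x.unsqueeze(0) - x.unsqueeze(1))).sum(dim=1)

# One Euler step of the coupling alone drives the field toward synchrony:
x_next = x + 0.1 * F_sync               # variance across nodes shrinks
```

Strengthening `K` makes this pull harder, so a collective pattern resists the per-step CfC forgetting longer — the intuition behind "coupling memory".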
The `notebooks/visualize_dynamics.py` script generates:
- Oscillator Magnitude Heatmap — Sensory/Motor/PFC activity over steps
- Thalamic Gain Pattern — which neurons are amplified/suppressed
- Phase State Evolution — angular state per neuron
- Regional Summary — mean magnitude curves per cortical region
- Motor Sector Rates — per-class gated firing rates (correct class highlighted)
| Work | Relation to NFT |
|---|---|
| Kuramoto Model (1975) | Phase synchronization — NFT's oscillator substrate |
| Free Energy Principle (Friston, 2010) | Predictive processing — NFT's energy functional |
| NCP/CfC (Lechner et al., 2020) | Continuous-time RNN — NFT's sigmoid interpolation derived from CfC |
| Continuous Thought Machine (Sakana AI, 2025) | Internal ticks + synchronization — NFT adds physics + gain modulation |
| Neural ODE (Chen et al., 2018) | Continuous-depth — NFT's CfC is analytical closed-form, no ODE solver |
Active research — v2 architecture validated. Gain modulation > attention for this task.
| Component | Status |
|---|---|
| `neuralfield/` package (7 modules) | Complete |
| CfC closed-form dynamics | Complete |
| ThalamicRouter (gain modulation) | Complete |
| MPS/CUDA support | Complete |
| Frozen CNN backbone | Complete |
| Step 1-3 experiments | Complete |
| Dynamics visualization | Complete |
| Saturation fix | Next |
| Temporal data (Speech Commands) | Planned |
| Paper | Planned |
- v2 Architecture — CfC + ThalamicRouter + real tensors + MPS
- MNIST Single-Frame — 95.0% (≈ MLP baseline, oscillator adds minimal value here)
- Memory Retention Fix — `interp_bias=-5` solves CfC memory decay (96.55% SeqMNIST)
- Sequential MNIST — 28 rows × 28 pixels, true temporal test
- Speech Commands — 40 freq × 100 frames, real-world temporal
- Paper — NeurIPS 2026 / ICML 2027
See CONTRIBUTING.md for guidelines.
MIT — Copyright (c) 2026 NeoMakes