

NeuralField

License: MIT · Python 3.12+ · PyTorch 2.x · Status: Research


Continuous-time Neural Field Theory (NFT) — classification through gain-modulated oscillator dynamics on spherical topology, not feedforward computation or attention.

Oscillators on a Fibonacci sphere classify via thalamic gain modulation. No attention mechanism, no value projection — just volume control over which neurons are amplified. 95% MNIST accuracy with 47K parameters.





Background

Standard neural networks compute by propagating activations through layers. Transformers add attention — weighted mixing of representations. The brain does neither: it computes through oscillatory dynamics where the thalamus modulates gain (volume) of cortical populations.

NeuralField implements this principle:

  • Oscillators on an S² manifold provide a dynamic substrate (not the computation itself)
  • Thalamic gain modulation selects which neurons are amplified — like a mixing board, not attention
  • Population rate readout reads the weighted firing rates — like how motor cortex actually works
  • CfC closed-form dynamics — no integration loops, analytically solved ODE

The key insight: information processing happens through gain control, not information mixing. Transformers mix representations (Q·K → V weighted sum). Our model just adjusts volume per neuron. This is simpler, more parameter-efficient, and neuroscience-grounded.
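The contrast can be sketched in a few lines of PyTorch (a minimal illustration — tensor sizes and variable names are illustrative, not the model's actual configuration):

```python
import torch

B, N, d = 2, 649, 16           # batch, cortical nodes, head dim (illustrative)
q = torch.randn(B, 1, d)       # regional summary query
k = torch.randn(B, N, d)       # per-node keys
v = torch.randn(B, N, d)       # value vectors (attention path only)

# Attention: scores become mixing weights and representations are blended.
scores = (q @ k.transpose(1, 2)) / d ** 0.5   # [B, 1, N]
mixed = torch.softmax(scores, dim=-1) @ v     # [B, 1, d] — a weighted sum of values

# Gain modulation: scores become per-node volume knobs; nothing is mixed.
gain = torch.sigmoid((q * k).sum(-1))         # [B, N], each entry in (0, 1)
state = torch.randn(B, N)                     # stand-in for oscillator magnitudes
amplified = state * gain                      # [B, N] — each node keeps its own signal
```

The attention path needs a value projection and produces one blended vector; the gain path only rescales each node's own activity, which is what "volume control" means here.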


Key Results

MNIST Single-Frame (same frozen CNN → 49 features)

| Model | Params | Test Acc |
|---|---|---|
| Frozen CNN → Linear | 500 | 88.2% |
| Frozen CNN → Oscillator + Thalamic | 47K | 95.0 ± 0.6% |
| Frozen CNN → MLP | 7.7K | 94.8% |
| CNN → MLP (end-to-end, reference) | ~155K | 99.1% |

Sequential MNIST T=28 (row-by-row temporal input)

| Model | Params | Test Acc |
|---|---|---|
| Oscillator + Slow Decay (bias=-5) | 36K | 96.55% |
| LSTM(128) | 82K | 98.70% |
| GRU(128) | 62K | 98.57% |
| Oscillator (default bias=0) | 36K | 9.6% |

Key finding: A single initialization change (interp_bias = -5, giving a per-step update rate of σ(-5) ≈ 0.007) transforms the oscillator from complete memory failure (9.6%) to near-LSTM performance (96.55%) with less than half the parameters. The architecture works for temporal data — the bottleneck was the interpolation rate, not the architecture.
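The arithmetic behind this finding can be checked directly (a standalone sketch; `retained_fraction` is an illustrative helper, not project code):

```python
import math

def retained_fraction(bias: float, steps: int) -> float:
    """Fraction of the initial state surviving `steps` CfC updates when the
    per-step update rate is t = sigmoid(bias) and retention is (1 - t)."""
    t = 1.0 / (1.0 + math.exp(-bias))
    return (1.0 - t) ** steps

# Default bias = 0: t = 0.5, so essentially nothing survives 28 rows.
print(retained_fraction(0.0, 28))    # ≈ 3.7e-09
# Slow-decay bias = -5: t ≈ 0.007, so most of the state survives.
print(retained_fraction(-5.0, 28))   # ≈ 0.83
```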


Architecture

[CNN Backbone]  (frozen, pretrained 99%)
       │
       ▼  49 features
[Input Projection]  nn.Linear(49, 649)  ← learned sensory→cortical mapping
       │
       ▼  649 cortical nodes
┌─────────────────────────────────────────────────┐
│              S² OSCILLATOR FIELD                 │
│                                                   │
│  Sensory (49) ─── Motor (100) ─── PFC (500)     │
│                   10 sectors                      │
│                                                   │
│  CfC dynamics (no loop):                         │
│    t = σ(g·λ·Δt/τ + bias)                       │
│    x_new = x_old·(1-t) + x_ss·t                 │
│                                                   │
│  F_sync = Σ K_ij (x_j - x_i)                    │
└─────────────┬───────────────────────────────────┘
              │ node states (xr, xi)
              ▼
┌─────────────────────────────────────────────────┐
│          THALAMIC ROUTER (gain modulation)        │
│                                                   │
│  8 nuclei (multi-head, NOT attention):           │
│    Q = regional_summary → [B, K, d]              │
│    K = node_state → [B, N, K, d]                 │
│    gain = σ(Q·K - TRN_inhibition + CT_feedback)  │
│                                                   │
│  No V projection. No weighted sum.               │
│  Just per-node gain ∈ (0, 1) — volume control.   │
└─────────────┬───────────────────────────────────┘
              │ gain [B, N]
              ▼
[Population Rate Readout]
  rate = √(xr² + xi²) × gain
  logits = Linear(rate) → 10 classes
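The diagram's update and readout equations translate to a few lines of PyTorch (a minimal sketch; the function names and the scalar treatment of g, λ, Δt, τ are illustrative assumptions, not the package's actual API):

```python
import torch

def cfc_step(x_old, x_ss, g, lam, dt, tau, bias):
    """Closed-form CfC update from the diagram: sigmoid-interpolate
    between the old state and the steady state — no integration loop."""
    t = torch.sigmoid(torch.as_tensor(g * lam * dt / tau + bias))
    return x_old * (1 - t) + x_ss * t

def population_readout(xr, xi, gain, linear):
    """Gated population rate: oscillator magnitude times thalamic gain."""
    rate = torch.sqrt(xr ** 2 + xi ** 2) * gain
    return linear(rate)

N = 649                                   # cortical nodes, as in the diagram
linear = torch.nn.Linear(N, 10)           # 10 MNIST classes
xr, xi = torch.randn(2, N), torch.randn(2, N)
xr = cfc_step(xr, torch.zeros_like(xr), g=1.0, lam=1.0, dt=0.1, tau=1.0, bias=-5.0)
gain = torch.sigmoid(torch.randn(2, N))   # stand-in for the thalamic router
logits = population_readout(xr, xi, gain, linear)   # [2, 10]
```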

Installation

git clone https://github.com/neomakes/neural-field.git
cd neural-field

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install seaborn  # for visualization

Quick Start

Train MLP Baseline (produces the CNN weights used for freezing)

python3 scripts/train.py bypass_oscillator=true training.epochs=15 \
  logging.wandb_mode=disabled

Train Thalamic Router (best model)

python3 scripts/train.py \
  backbone.enabled=true backbone.freeze=true \
  backbone.pretrained=checkpoints/mlp_best.pt \
  cell.num_sensory=49 readout.mode=thalamic \
  training.energy_weight=0.0 training.confidence_weight=0.0 \
  logging.wandb_mode=disabled

Visualize Oscillator Dynamics

python3 notebooks/visualize_dynamics.py
# → outputs/visualizations/*.png

Project Structure

neural-field/
├── neuralfield/                 # Main package (v2)
│   ├── cell.py                  # OscSeaCell — CfC closed-form RNN cell
│   ├── rnn.py                   # OscSeaRNN — sequence wrapper
│   ├── micro.py                 # MicroLayer — sigmoid interpolation dynamics
│   ├── macro.py                 # MacroLayer — legacy ambassador gating
│   ├── router.py                # ThalamicRouter — multi-head gain modulation
│   ├── backbone.py              # CNNBackbone, CNNMLPBaseline
│   ├── loss.py                  # Free energy + training loss
│   └── grid.py                  # Fibonacci sphere topology + sectors
├── scripts/
│   └── train.py                 # Hydra training entry point
├── notebooks/
│   └── visualize_dynamics.py    # Oscillator dynamics visualization
├── config/
│   ├── default.yaml             # Full Hydra config
│   └── ablation/                # Experiment override configs
├── legacy/                      # Stage 5 Phase 1 code (reproducibility)
├── doc/
│   ├── nft_theory.md            # v2.0 theoretical specification
│   └── validation_plan.md       # Original experiment design
├── CLAUDE.md
├── TASKS.md
├── requirements.txt
└── LICENSE                      # MIT

Experiments

Step 1: Gradient Repair (exp→sigmoid)

Replaced the exp(-λΔt/τ) gate, whose gradient vanishes, with sigmoid interpolation σ(λΔt/τ + bias). Result: 10% → 87.6%.
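The repair is visible in a small gradient check (a sketch; the bias value is chosen to illustrate recentering, not taken from the project config):

```python
import torch

x = torch.tensor([8.0], requires_grad=True)   # λΔt/τ, deliberately large

# Old gate: exp(-λΔt/τ). Value and gradient both collapse toward zero
# for large arguments, so the gate stops learning.
torch.exp(-x).backward()
grad_old = x.grad.item()                      # magnitude ≈ 3.4e-4

x.grad = None
# Repaired gate: σ(λΔt/τ + bias). The bias can recenter the argument,
# keeping the gate — and its gradient σ(1-σ) — in a responsive range.
bias = torch.tensor([-8.0])
torch.sigmoid(x + bias).backward()
grad_new = x.grad.item()                      # σ'(0) = 0.25
```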

Step 2: Thalamic Gain Modulation

ThalamicRouter with multi-head gain modulation + population rate readout. Result: 87.6% → 94.1%.

Step 2.5: Parameter Control

Thalamic (47K) = 95.0% vs MLP (206K) = 93.9%. Architecture effect confirmed — not just more parameters.

Step 3: Sequence Processing (T=7)

CNN features split into 7 chunks, fed as sequence. Result: 46.6% — CfC memory retention insufficient for temporal accumulation.

Visualization Finding

Oscillator saturates at the magnitude clamp (10.0) within 3 steps. CfC sigmoid interpolation (t ≈ 0.5) forgets 50% of the previous state each step; after 28 steps only 0.5^28 ≈ 3.7×10⁻⁹ of the original state survives — Sequential MNIST is impossible without a memory fix.

Open Problem: Memory Retention

Three approaches under investigation:

  • Slow decay init (bias=-5 → t≈0.007 per step)
  • Gated memory (LSTM-style cell state)
  • Coupling memory (stronger F_sync preserves collective patterns)
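The first approach can be sketched as an initialization choice (a hypothetical module — `interp_bias` follows the name used above, everything else is illustrative):

```python
import torch
import torch.nn as nn

class InterpGate(nn.Module):
    """Interpolation gate with slow-decay initialization: starting the
    bias at -5 makes the initial update rate σ(-5) ≈ 0.007 per step."""
    def __init__(self, n_nodes: int, interp_bias: float = -5.0):
        super().__init__()
        self.bias = nn.Parameter(torch.full((n_nodes,), interp_bias))

    def forward(self, drive: torch.Tensor) -> torch.Tensor:
        # Per-step update rate t; (1 - t) of the old state is retained.
        return torch.sigmoid(drive + self.bias)

gate = InterpGate(649)
t = gate(torch.zeros(1, 649))   # ≈ 0.007 everywhere at init
```

Because the bias is a learnable parameter, the network can still speed up updating where the task demands it; the initialization only biases it toward retention.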

Visualization

The notebooks/visualize_dynamics.py script generates:

  1. Oscillator Magnitude Heatmap — Sensory/Motor/PFC activity over steps
  2. Thalamic Gain Pattern — which neurons are amplified/suppressed
  3. Phase State Evolution — angular state per neuron
  4. Regional Summary — mean magnitude curves per cortical region
  5. Motor Sector Rates — per-class gated firing rates (correct class highlighted)

Related Research

| Work | Relation to NFT |
|---|---|
| Kuramoto Model (1975) | Phase synchronization — NFT's oscillator substrate |
| Free Energy Principle (Friston, 2010) | Predictive processing — NFT's energy functional |
| NCP/CfC (Lechner et al., 2020) | Continuous-time RNN — NFT's sigmoid interpolation derived from CfC |
| Continuous Thought Machine (Sakana AI, 2025) | Internal ticks + synchronization — NFT adds physics + gain modulation |
| Neural ODE (Chen et al., 2018) | Continuous-depth — NFT's CfC is analytical closed-form, no ODE solver |

Current Status

Active research — v2 architecture validated. Gain modulation > attention for this task.

| Component | Status |
|---|---|
| neuralfield/ package (7 modules) | Complete |
| CfC closed-form dynamics | Complete |
| ThalamicRouter (gain modulation) | Complete |
| MPS/CUDA support | Complete |
| Frozen CNN backbone | Complete |
| Step 1-3 experiments | Complete |
| Dynamics visualization | Complete |
| Saturation fix | Next |
| Temporal data (Speech Commands) | Planned |
| Paper | Planned |

Roadmap

  • v2 Architecture — CfC + ThalamicRouter + real tensors + MPS
  • MNIST Single-Frame — 95.0% (≈ MLP baseline, oscillator adds minimal value here)
  • Memory Retention Fix — interp_bias=-5 solves CfC memory decay (96.55% SeqMNIST)
  • Sequential MNIST — 28 rows × 28 pixels, true temporal test
  • Speech Commands — 40 freq × 100 frames, real-world temporal
  • Paper — NeurIPS 2026 / ICML 2027

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT — Copyright (c) 2026 NeoMakes

