Recync: Safety-Constrained Control for Coherence Dynamics in Transformer-Based Systems

Python 3.10+ | License: Apache 2.0 | Tests

What is Recync?

Large language models fail in characteristic ways: they loop on the same phrase (attention locking), hallucinate (semantic drift), or lose track of context mid-sentence (structural fragmentation). Recync is a control framework that detects these failure modes in real time via internal state monitoring and corrects them through principled intervention.

The framework operates at two levels:

  • Token-level intervention (Recync v3): Per-step hidden-state monitoring with adaptive sampling control. Achieves robust detection (Cohen's d > 1.3) but modest intervention effects (d = +0.211) due to structural limits in discrete-step control.
  • Response-level intervention (Recync v4): Checkpoint restart at pre-crisis points with seed perturbation. Achieves medium-to-large intervention effects (d = +0.494 to +1.020, all p < 0.0001) with zero iatrogenic harm across 5 model architectures (117M--1.5B parameters).

Key Results

Response-Level Checkpoint Restart (15 experiments, N = 2,070 experimental units)

| Model | Params | PIR Cohen's d | 95% CI | Iatrogenic | Crisis-Free |
|---|---|---|---|---|---|
| GPT-2 Small | 117M | +0.494 | [+0.354, +0.687] | 0/137 (0%) | 43.1% |
| Pythia-160M | 160M | +0.958 | [+0.762, +1.185] | 0/177 (0%) | 21.5% |
| GPT-2 Medium | 355M | +0.796 | [+0.591, +1.028] | 0/44 (0%) | 59.1% |
| Qwen2-1.5B | 1,544M | +1.020 | [+0.866, +1.193] | 0/183 (0%) | 14.2% |
| TinyLlama-1.1B | 1,100M | +1.40 | -- | 0% | -- |
  • Zero-tuning transfer: Identical detection/intervention parameters across all 5 models -- no per-model calibration
  • Zero iatrogenic events: The protocol never makes things worse (0% across all experiments)
  • Scale reversal: GPT-2 Medium -- most resistant to token-level control -- becomes a strong responder under response-level restart (d = +0.796)
  • Billion-scale validation: Largest effect (d = +1.020) on Qwen2-1.5B (GQA, RoPE, SwiGLU architecture)
  • Length-invariant: Effect sizes maintained at T=300; multi-restart (2R) improves crisis-free rate by +10.6pp
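The effect sizes quoted above are Cohen's d values. As a reference for how such a number is computed, here is the standard pooled-standard-deviation form; this is one common convention, and the papers may use a paired-sample variant instead:

```python
import numpy as np

def cohens_d(treated, control):
    """Cohen's d with pooled standard deviation.

    Positive values mean the treated (intervention) group scores
    higher on the coherence metric than the control group.
    """
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    n1, n2 = len(treated), len(control)
    # Pool the two sample variances, weighted by degrees of freedom
    pooled_var = ((n1 - 1) * treated.var(ddof=1) +
                  (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
    return (treated.mean() - control.mean()) / np.sqrt(pooled_var)
```

By convention, |d| around 0.2 is a small effect, 0.5 medium, and 0.8 large, which is why the response-level results (d = +0.494 to +1.020) are described as medium-to-large.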

Token-Level Intervention (69 experiments, ~15,000 paired runs)

  • Detection: Cohen's d = 1.83 (fragmentation), d = 1.36 (hallucination)
  • Intervention: d = +0.211 (p = 0.037) on GPT-2 Small after extensive optimization
  • Structural limits identified: harm threshold, attractor switching, model-specific calibration

Paper Summaries

This repository contains two companion papers that tell a single story: how far can we push real-time coherence control in Transformers?

Paper 1 -- Token-Level Intervention (paper/token_level/)

K. Sato, "From Monitoring to Intervention: Control-Theoretic Coherence Management in Transformers and the Limits of Discrete Safety Enforcement," 2026.

Recync v3 introduces the theoretical foundations: a 3D order-parameter space Z(t) = [lambda, lambda_sem, z] extracted from Transformer internals via a non-invasive Phi-mapping. The dynamics are governed by a Ginzburg-Landau potential with provable stability, and safety is enforced through stochastic Control Barrier Functions solved via quadratic programming. Over 69 experiments across six phases, the paper systematically discovers that monitoring is solved (d > 1.3 across failure modes and architectures) but token-level intervention hits structural walls: a harm threshold in intervention frequency, semantic attractor switches during hidden-state steering, and model-specific calibration that breaks across architectures. The final configuration -- severity-adaptive temperature with a recovery gate -- achieves the first significant positive result (d = +0.211, p = 0.037), but the gap between detection and intervention motivates a fundamentally different approach.

Paper 2 -- Response-Level Intervention (paper/response_level/)

K. Sato, "Beyond Micro-Control: Response-Level Checkpoint Restart for Safe Coherence Recovery in Transformers," 2026.

Recync v4 resolves the detection-intervention asymmetry by moving from token to response granularity. When a crisis is detected, instead of nudging individual tokens, the protocol rewinds to 3 tokens before crisis onset and regenerates with a different random seed. This simple strategy exploits the model's own representational capacity to find alternative coherent trajectories. The result is a complete scale reversal: GPT-2 Medium, which was the most resistant model under token-level control, becomes a strong responder (d = +0.796, crisis-free rate 59.1%). Effect sizes increase with model size rather than decreasing, zero-tuning transfer holds across all tested architectures (117M to 1.5B), and zero iatrogenic harm is maintained across all 2,070 experimental units. The paper explicitly tests and rejects adaptive complexity in favor of a single fixed protocol.

How It Works

Detection

Cosine similarity between consecutive last-layer hidden states (cosim) serves as a real-time coherence proxy. A crisis is detected when k=3 consecutive cosim values fall below a relative threshold (mean - 0.6*std over a 20-step baseline window). This signal transfers across all tested architectures (GPT-2, GPTNeoX, Qwen2, Llama) without modification.
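The detection rule described above can be sketched in a few lines. This is a minimal illustration of the stated parameters (k=3 consecutive low values, threshold of mean - 0.6*std over a 20-step baseline window), not the repository's actual implementation:

```python
import numpy as np

def detect_crisis(cosims, window=20, k=3, factor=0.6):
    """Return the step index of crisis onset, or None if no crisis.

    cosims: one cosine-similarity value per generation step, computed
    between consecutive last-layer hidden states.
    """
    below = 0
    for t in range(window, len(cosims)):
        # Relative threshold from the trailing baseline window
        baseline = np.asarray(cosims[t - window:t])
        threshold = baseline.mean() - factor * baseline.std()
        below = below + 1 if cosims[t] < threshold else 0
        if below >= k:
            return t - k + 1  # onset = first of the k low steps
    return None
```

Because the threshold is relative to each model's own recent baseline rather than an absolute cutoff, the same rule can transfer across architectures without recalibration.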

Token-Level Intervention (Recync v3)

Per-step intervention during generation: when a crisis is detected, the framework adjusts sampling parameters (temperature, top-p) based on crisis severity, with a recovery gate that skips intervention when the model is self-recovering. 69 experiments systematically characterize the structural limits of this approach -- including harm thresholds in intervention frequency and semantic attractor switches -- culminating in a modest but significant effect (d = +0.211, p = 0.037).
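A schematic of the severity-adaptive adjustment with a recovery gate might look like the following. The schedule below (linear in severity) is illustrative only; the papers' calibrated values and functional form are not given here:

```python
def adapt_sampling(severity, base_temp=1.0, base_top_p=0.9,
                   recovering=False):
    """Map crisis severity in [0, 1] to sampling parameters.

    recovering: recovery gate -- if the cosim signal is already
    trending back up, skip intervention and keep natural sampling.
    """
    if recovering or severity <= 0.0:
        return base_temp, base_top_p
    # Raise temperature to break out of repetitive attractors;
    # tighten top-p slightly to avoid incoherent tails (illustrative)
    temperature = base_temp + 0.5 * severity
    top_p = max(0.7, base_top_p - 0.1 * severity)
    return temperature, top_p
```

The harm-threshold finding means a loop like this must intervene sparingly: adjusting on too many steps degrades output even when each individual adjustment is mild.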

Response-Level Restart (Recync v4)

Generate --> Monitor cosim --> Detect crisis --> Rewind to onset-3 --> Regenerate with new seed

When a crisis is detected, the protocol rewinds to 3 tokens before crisis onset and regenerates with a different random seed (seed + 10000) at natural temperature (T=1.0, top_p=0.9). Rather than fighting the model's dynamics at token granularity, this exploits the model's own representational capacity to find alternative coherent trajectories. The same detection parameters and restart rule work identically across all tested models (117M--1.5B).
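The restart loop above can be sketched as a small driver. The `generate` hook here is an assumed interface (continue from a token prefix under a given seed and report any detected crisis onset), not the repository's actual API; the rewind distance and seed offset follow the stated protocol:

```python
def restart_on_crisis(generate, prompt, seed, max_steps=200,
                      rewind=3, seed_offset=10000, max_restarts=1):
    """Response-level checkpoint restart (illustrative driver).

    generate(prompt, prefix_tokens, seed, max_steps) is an assumed
    hook returning (tokens, crisis_onset) where crisis_onset is a
    step index or None.
    """
    tokens, onset = generate(prompt, [], seed, max_steps)
    restarts = 0
    while onset is not None and restarts < max_restarts:
        # Rewind to 3 tokens before crisis onset and perturb the seed
        checkpoint = tokens[:max(0, onset - rewind)]
        seed += seed_offset
        tokens, onset = generate(prompt, checkpoint, seed, max_steps)
        restarts += 1
    return tokens, onset is None  # (final tokens, crisis-free flag)
```

Setting `max_restarts=2` corresponds to the multi-restart (2R) variant, which the results above credit with a +10.6pp improvement in crisis-free rate.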

Repository Structure

Recync/
├── experiments/
│   ├── gpt2_integration/              # Token-level experiments (Exp 01--69)
│   │   ├── README.md                  # Experiment guide
│   │   ├── PAPER_RESULTS.md           # Full results report
│   │   └── *.py                       # 72 experiment scripts + result files
│   └── response_level/                # Response-level experiments (Exp 01--14)
│       ├── README.md                  # Experiment guide
│       ├── PAPER_RESULTS.md           # Full results report
│       └── *.py                       # 15 experiment scripts + result files
├── paper/
│   ├── token_level/                   # Paper 1: token-level intervention
│   │   ├── from_monitoring_to_intervention.tex/.md/.pdf
│   │   └── figures/
│   └── response_level/                # Paper 2: response-level intervention
│       ├── beyond_micro_control.tex/.md/.pdf
│       └── figures/
├── pyproject.toml
└── requirements.txt

Quick Start

git clone https://github.com/metaSATOKEN/Recync_framework.git
cd Recync_framework

# Install dependencies
pip install torch transformers scipy numpy

# Run a response-level experiment
python experiments/response_level/03_restart_diff_replication.py

# Results saved as timestamped JSON
ls experiments/response_level/*.json

Environment

| Item | Spec |
|---|---|
| Machine | MacBook Pro (Apple Silicon, arm64) |
| RAM | 16 GB |
| GPU | Apple MPS (Metal Performance Shaders) |
| Python | 3.13.5 |
| PyTorch | 2.10.0 |
| Transformers | 5.0.0 |

Citation

Paper 1 -- Token-Level Intervention

@article{sato2026token,
  title   = {From Monitoring to Intervention: Control-Theoretic Coherence
             Management in Transformers and the Limits of Discrete Safety
             Enforcement},
  author  = {Sato, Kentaro},
  year    = {2026},
  doi     = {10.5281/zenodo.19148449},
  url     = {https://doi.org/10.5281/zenodo.19148449}
}

Paper 2 -- Response-Level Checkpoint Restart

@article{sato2026response,
  title   = {Beyond Micro-Control: Response-Level Checkpoint Restart
             for Safe Coherence Recovery in Transformers},
  author  = {Sato, Kentaro},
  year    = {2026},
  doi     = {10.5281/zenodo.19148721},
  url     = {https://doi.org/10.5281/zenodo.19148721}
}

License

  • Code: Apache License 2.0 -- see LICENSE
  • Papers: Creative Commons Attribution 4.0 (CC BY 4.0) -- see paper/LICENSE

About

Runtime detection and control of LLM coherence failures (looping, hallucination, context loss). No fine-tuning. Zero iatrogenic harm. 69 experiments across 5 architectures.
