feat: add AdaLN-Zero conditioning as alternative to FiLM #24
Open
tashapais wants to merge 1 commit into
Conversation
Adds `AdaLNZeroNorm` to `norms.py`: a pre-norm module with a zero-initialized MLP that produces `(scale, shift, gate)`. The gate starts at zero, so residual paths are the identity at initialization, stabilizing early training (DiT-style). A minimal sketch follows this description.

Each sublayer (`SpatialAttention`, `TemporalAttention`, `SwiGLUFFN`, `MoESwiGLUFFN`) gains a `use_adaln_zero` flag. When enabled, the forward path switches from post-residual FiLM norm to pre-norm + gated residual: `x + gate * sublayer(adaln(x, conditioning))`. `use_adaln_zero=False` by default, preserving all existing behavior. Wired through `STTransformer`, all three model classes, training scripts, and configs.
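A minimal sketch of what such a module could look like, assuming a parameter-free `LayerNorm` and a single zero-initialized `Linear` after a `SiLU`; the actual MLP shape and conditioning dimension in `norms.py` may differ.

```python
import torch
import torch.nn as nn

class AdaLNZeroNorm(nn.Module):
    """Pre-norm whose (scale, shift, gate) come from a zero-initialized MLP."""

    def __init__(self, dim: int, cond_dim: int):
        super().__init__()
        # Parameter-free norm: all modulation comes from the conditioning.
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.mlp = nn.Sequential(nn.SiLU(), nn.Linear(cond_dim, 3 * dim))
        # Zero init => scale = shift = gate = 0 on the first forward pass,
        # so x + gate * sublayer(...) reduces to the identity (DiT-style).
        nn.init.zeros_(self.mlp[-1].weight)
        nn.init.zeros_(self.mlp[-1].bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor):
        scale, shift, gate = self.mlp(cond).chunk(3, dim=-1)
        # The caller applies `gate` to its residual branch.
        return self.norm(x) * (1 + scale) + shift, gate
```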
Summary
- Adds `AdaLNZeroNorm` to `models/norms.py`: a pre-norm module that produces `(scale, shift, gate)` from a zero-initialized MLP. The gate starts at zero so all residual paths are identity at initialization, matching the DiT paper's stabilization trick.
- Threads `use_adaln_zero` through `SpatialAttention`, `TemporalAttention`, `SwiGLUFFN`, `MoESwiGLUFFN`, `STTransformerBlock`, `STTransformer`, and all three model classes (`VideoTokenizer`, `LatentActionModel`, `DynamicsModel`).
- When enabled, each sublayer switches from post-residual FiLM norm `norm(x + sublayer(x), cond)` to pre-norm + gated residual `x + gate * sublayer(adaln(x, cond))` (see the sketch after this list).
- `use_adaln_zero=False` by default, preserving all existing behavior and checkpoints.
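A hedged sketch of how a sublayer's forward might branch on the flag; the attention/FFN internals and the FiLM norm's exact signature are not shown in this PR text, so the names `self.adaln`, `self.sublayer`, and `self.norm` are placeholders.

```python
def forward(self, x, cond):
    if self.use_adaln_zero:
        # New path: pre-norm + gated residual, identity at init (gate == 0).
        h, gate = self.adaln(x, cond)
        return x + gate * self.sublayer(h)
    # Legacy path: post-residual FiLM norm, unchanged from main.
    return self.norm(x + self.sublayer(x), cond)
```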
Test plan

- Set `use_adaln_zero: true` in `training.yaml` and verify loss decreases normally
- Verify `use_adaln_zero: false` (default) produces identical results to main
- Run inference (`use_adaln_zero: true` in `inference.yaml`)
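One quick way to check the identity-at-init property from the summary: with freshly initialized AdaLN-Zero weights, a gated residual block should return its input unchanged. This assumes the `AdaLNZeroNorm` sketch above; the `Linear` stands in for any attention/FFN sublayer.

```python
import torch

def test_identity_at_init():
    torch.manual_seed(0)
    dim, cond_dim = 64, 32
    norm = AdaLNZeroNorm(dim, cond_dim)   # from the sketch above
    sublayer = torch.nn.Linear(dim, dim)  # stand-in for attention/FFN
    x = torch.randn(2, 10, dim)
    cond = torch.randn(2, 10, cond_dim)
    h, gate = norm(x, cond)
    out = x + gate * sublayer(h)
    # gate == 0 at init, so the residual branch contributes nothing.
    assert torch.allclose(out, x)
```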