diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/README.md b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/README.md
new file mode 100644
index 0000000000..c954a4486e
--- /dev/null
+++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/README.md
@@ -0,0 +1,52 @@
+# Mockingbird
+
+10k-vocab CaseOps body on the SOTA architecture, derived from PR1855.
+
+This is a **non-record** submission. It does not beat the current leader. It is filed as evidence of the SP10240 CaseOps lane on the same compression / phased-TTT machinery as PR1855, for comparison with the SP8192 lane.
+
+## Results
+
+| Seed | val_bpb (quantized_ttt_phased) | Steps | Total submission size |
+|------|--------------------------------|-------|------------------------|
+| 42 | 1.06204667 | 5,264 | 15,816,988 B |
+| 0 | 1.06226648 | 5,231 | 15,818,783 B |
+| 1 | 1.06299064 | 5,221 | 15,810,544 B |
+| **mean** | **1.06243460** | | **15,818,783 B (max)** |
+
+Hardware: 8×H100 SXM · 600 s wallclock · `bytes_code`: 163,036 (uncompressed) / 41,220 (compressed)
+
+## Architecture
+
+11L · dim 512 · `mlp_mult=3.75` · loop_start=3, loop_end=5, `enable_looping_at=0.45`
+
+- **Vocab/data**: SP10240 CaseOps lossless-caps tokenizer (10,240 tokens), FineWeb 10B sidecar with byte-level loss accounting
+- **Quantization**: per-group, embed int7, matrix int6, LQER asymmetric rank-4
+- **Eval**: PR1855 phased LoRA TTT with `prefix_docs=2500`, `phases=3`, `chunk=48`
+- **Compression**: pergroup
+- **Train budget**: 600 s wallclock, hard 16 MB artifact cap
+
+## Lineage
+
+This is the SP10240 sister of PR #1855 (`510d03e0fc355406c9fd06f92d23b8c5aedea7fb`), which used the same CaseOps + LQER + phased-TTT machinery on SP8192 and reported a 3-seed mean of 1.06107587 post-phased-TTT.
+
+The architecture is held fixed; only the tokenizer / vocab dimension changes (8192 → 10240). The 10k vocab consumes more bytes in the embedding table, so the body is shrunk to MLP3.75 (vs the SP8192 record's wider body) to stay under the 16 MB cap. `enable_looping_at=0.45` matches the same family.
+
+## Seeds
+
+The three runs used identical code and hyperparameters; only the random seed changed. The committed `train_gpt.py` is the seed-42 run (the strongest of the three). Seeds 0 and 1 differ only in `Hyperparameters.seed = N` (line 479 of `train_gpt.py`) and the bookkeeping fields `TEST_ID` / `TEST_DATE` / `RUN_KIND` / blurb (lines 433–446). The training body is byte-identical.
+
+Seed choice (`42`, `0`, `1`) reflects the seed-repeat batch we ran on this lane; this submission does not use the protocol's `444 / 300` convention because these specific runs were not re-executed at those seeds.
+
+## Reproduce
+
+```bash
+# From repo root, with flash-attention/hopper on PYTHONPATH
+SKIP_GPTQ=1 SEED=42 torchrun --standalone --nproc_per_node=8 \
+  records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_gpt.py
+```
+
+For seeds 0 and 1, change line 479 (`Hyperparameters.seed = 42`) to `0` or `1` respectively; the `SEED` env var in the command above is overridden inside the file, so editing line 479 is what actually selects the seed.
+
+## Artifacts
+
+Per-seed compressed artifacts (`final_model.int6.ptz`) and SHA256 hashes are recorded in `submission.json`. Each artifact is under the 16 MB cap (max 15.82 MB).
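+
+As a quick cross-check, the sketch below recomputes the 3-seed mean / population σ and re-verifies the per-seed hashes recorded in `submission.json`. The `<seed>/final_model.int6.ptz` layout is an assumption (this directory records hashes, not the artifacts themselves); point `artifact` at wherever your rerun wrote the file.
+
+```python
+# Hedged verification sketch: recompute summary stats and artifact hashes
+# from submission.json. Only the JSON fields come from this submission;
+# the on-disk artifact layout below is hypothetical.
+import hashlib, json, statistics
+from pathlib import Path
+
+sub = json.loads(Path("submission.json").read_text())
+results = sub["seed_results"]
+
+bpbs = [r["val_bpb_exact"] for r in results.values()]
+print(f"mean val_bpb   = {statistics.mean(bpbs):.8f}")    # expect 1.06243460
+print(f"pstdev val_bpb = {statistics.pstdev(bpbs):.5f}")  # expect 0.00040
+
+CAP = 16 * 2**20  # 16 MiB; adjust if the track counts 10**6-byte MB
+for seed, r in results.items():
+    assert r["bytes_total"] <= CAP, f"seed {seed} over the byte cap"
+    artifact = Path(seed) / "final_model.int6.ptz"  # hypothetical path
+    if artifact.exists():
+        digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
+        status = "OK" if digest == r["compressed_artifact_sha256"] else "MISMATCH"
+        print(f"seed {seed}: sha256 {status}")
+```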
diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/submission.json b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/submission.json new file mode 100644 index 0000000000..7335aadf20 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/submission.json @@ -0,0 +1,54 @@ +{ + "author": "Frosty40", + "github_id": "newjordan", + "name": "Mockingbird", + "blurb": "10k-vocab CaseOps body on the SOTA architecture: SP10240 lossless-caps tokenizer with PR1855 LQER asym rank-4 + per-group + phased-TTT compression, 11L dim512 mlp_mult=3.75 with looping enabled at 0.45 of training.", + "date": "2026-05-01T00:00:00Z", + "track": "10min_16mb", + "record": false, + "val_bpb": 1.0624, + "val_bpb_exact": 1.06243460, + "val_bpb_std": 0.00040, + "seeds": [42, 0, 1], + "seed_results": { + "42": { + "val_bpb": 1.0620, + "val_bpb_exact": 1.06204667, + "val_loss_exact": 2.38207881, + "steps": 5264, + "train_time_ms": 599612, + "eval_time_ms": 446734, + "bytes_total": 15816988, + "compressed_artifact_bytes": 15775768, + "compressed_artifact_sha256": "68f570ab2cccfa31ecc7064e68eada5fa83cc969c5267a6c74bfd4fe8d5835f9" + }, + "0": { + "val_bpb": 1.0623, + "val_bpb_exact": 1.06226648, + "val_loss_exact": 2.38257182, + "steps": 5231, + "train_time_ms": 599626, + "eval_time_ms": 517991, + "bytes_total": 15818783, + "compressed_artifact_bytes": 15777612, + "compressed_artifact_sha256": "892f585d130801de2116aa3bfcd67aafc337119e1484c1b0f3a54d8e51bb6614" + }, + "1": { + "val_bpb": 1.0630, + "val_bpb_exact": 1.06299064, + "val_loss_exact": 2.38419604, + "steps": 5221, + "train_time_ms": 599587, + "eval_time_ms": 510971, + "bytes_total": 15810544, + "compressed_artifact_bytes": 15769440, + "compressed_artifact_sha256": "46f1f5e5bb1dc67a29c1c88934910832855a0593a008da652a656da413ff2d23" + } + }, + "bytes_total": 15818783, + "bytes_code": 163036, + "bytes_code_compressed": 41220, + "hardware": "8xH100 SXM", + "wallclock_train_s": 600, + "derives_from_pr": "openai/parameter-golf#1855" +} diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/README.md b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/README.md new file mode 100644 index 0000000000..a8f69727a7 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/README.md @@ -0,0 +1,94 @@ +# 10k vocab tooling — reviewer verification + +Everything needed to inspect and reproduce the SP10240 CaseOps tokenization stack that mockingbird trained on. This subdir is appendix material — `train_gpt.py` and the seed logs in the parent directory remain the canonical submission. + +## Layout + +``` +tokenization_10kvocab/ +├── README.md (this file) +├── tokenizer/ +│ ├── fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model ← actual tokenizer mockingbird used +│ ├── fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.vocab +│ ├── fineweb_10240_bpe.model ← base SP10240 BPE (no CaseOps reserves) for reference +│ ├── fineweb_10240_bpe.vocab +│ └── tokenizer_specs_sp10240.json ← BPE training spec (skip_docs, vocab_size, etc.) 
+├── build/
+│   ├── run_sp10240_build.sh                  ← one-command rebuild from FineWeb 10B docs
+│   ├── run_sp10240_upload.sh                 ← HF upload helper (used to publish the dataset)
+│   └── sp10240_build.log                     ← BPE training log from the actual build
+├── caseops/
+│   ├── lossless_caps.py                      ← CaseOps codec module (4 reserved operators)
+│   ├── prepare_sp10240_caseops_data.py       ← end-to-end CaseOps tokenizer + dataset prep
+│   ├── build_sp10240_caseops_local.sh        ← local rebuild driver
+│   ├── upload_sp10240_caseops_to_hf.sh       ← HF upload driver
+│   ├── download_sp10240_first80_from_hf.sh   ← partial-shard download (first 80)
+│   ├── download_sp10240_full124_from_hf.sh   ← full 124-shard download
+│   └── stream_pr1855_caseops_to_pod.sh       ← pod streaming helper used in this lane
+└── notes/
+    ├── 2026-04-30_10k_caseops_hf_lane.md     ← derivation note: how this lane was built
+    └── 2026-04-30_claude_sp10240_bytefit_plan.md ← the byte-fit reasoning (why MLP3.75, not MLP4)
+```
+
+## Tokenizer
+
+**Vocab size:** 10,240
+**Variant:** SP10240 lossless-caps CaseOps with 4 reserved operator codepoints
+
+The CaseOps-active tokenizer is `fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model`. It is derived from the same trainer spec that PR #1855 used for SP8192:
+
+- BPE, byte fallback enabled
+- split-digits enabled
+- `nmt_nfkc` normalization
+- no dummy prefix
+- pad / bos / eos / unk ids = 0 / 1 / 2 / 3
+- hard vocab limit disabled
+- reserved ids: U+E001=4, U+E002=5, U+E003=6, U+E004=7 (the four CaseOps operators)
+- training corpus: FineWeb 10B docs `[50000, end)`; val docs `[0, 50000)` are excluded via `tokenizer_skip_docs=50000` in `tokenizer_specs_sp10240.json`
+
+The standard `fineweb_10240_bpe.model` is included alongside for reference: it is the same BPE training run **without** the CaseOps reserved operators (those four codepoints map to `<unk>`, id 3). Diffing the two vocabs shows the embedding-table cost of reserving the four ops.
+
+## CaseOps codec
+
+`caseops/lossless_caps.py` is the encode/decode module. The four operators are inserted at preprocessing time to record case information losslessly, so the BPE doesn't need to allocate vocab to capitalization. At eval time, decode reverses the operators to reconstruct the original text. (A simplified sketch of the idea appears below, after the rebuild instructions.)
+
+The `prepare_sp10240_caseops_data.py` script trains the CaseOps tokenizer when no compatible model is found, then tokenizes FineWeb 10B end-to-end into the dataset shards. It is the single source of truth for how mockingbird's training data was produced.
+
+## Dataset
+
+The full preprocessed dataset (124 train shards + 1 val shard, ~5 GB) is published publicly on Hugging Face; it is too large to commit to git:
+
+**https://huggingface.co/datasets/Frosty40/10k_golfer**
+
+Reviewers can pull it with either of:
+
+```bash
+bash caseops/download_sp10240_full124_from_hf.sh   # all shards
+bash caseops/download_sp10240_first80_from_hf.sh   # first 80 only; enough to repro the run
+```
+
+Both scripts use the standard HF CLI and require `huggingface_hub>=1.8.0`.
+
+## Reproducing the tokenizer + dataset from scratch
+
+If you don't trust the HF artifacts and want to rebuild:
+
+```bash
+# 1. Build the standard SP10240 BPE tokenizer (no CaseOps)
+bash build/run_sp10240_build.sh
+
+# 2. Re-train the lossless-caps CaseOps variant + tokenize FineWeb 10B end-to-end
+bash caseops/build_sp10240_caseops_local.sh
+```
+
+Output lands at `data/datasets/fineweb10B_sp10240_caseops/...`, matching the paths the training scripts expect.
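+
+## CaseOps codec, illustrated
+
+For intuition, here is a minimal sketch of the lossless-caps idea referenced above. It is **not** the committed codec: `caseops/lossless_caps.py` reserves four operators and defines its own semantics, while this sketch assumes a simplified two-operator scheme purely to show why the round trip is lossless.
+
+```python
+# Simplified lossless-caps sketch (an assumption; see caseops/lossless_caps.py
+# for the real four-operator codec). Two reserved codepoints:
+#   CAP -- the next letter was uppercase
+#   UPW -- the entire next word was uppercase
+# Inputs are assumed not to contain the reserved codepoints themselves.
+import re
+
+CAP = "\uE001"
+UPW = "\uE002"
+WORD = re.compile(r"[A-Za-z]+")
+
+def encode(text: str) -> str:
+    out, i = [], 0
+    for m in WORD.finditer(text):
+        out.append(text[i:m.start()])
+        w = m.group()
+        if len(w) > 1 and w.isupper():
+            out.append(UPW + w.lower())  # SWAT -> UPW + swat
+        else:
+            out.append("".join(CAP + c.lower() if c.isupper() else c for c in w))
+        i = m.end()
+    out.append(text[i:])
+    return "".join(out)
+
+def decode(text: str) -> str:
+    out, i = [], 0
+    while i < len(text):
+        c = text[i]
+        if c == UPW:  # uppercase the whole next word
+            m = WORD.match(text, i + 1)
+            out.append(m.group().upper())
+            i = m.end()
+        elif c == CAP:  # uppercase the next letter
+            out.append(text[i + 1].upper())
+            i += 2
+        else:
+            out.append(c)
+            i += 1
+    return "".join(out)
+
+sample = "SWAT teams met Walkera Vitus at McDonald's."
+assert decode(encode(sample)) == sample  # lossless round trip
+print(encode(sample))
+```
+
+Encoded this way, the BPE only ever sees lowercase word forms plus operator codepoints, so a single vocab entry covers `the`, `The`, and `THE`; that is the vocab the four reserved ids buy back.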
+ +## Why this is in a non-record PR + +A non-record submission is the right venue to land the 10k vocab tooling: it gives reviewers full access to the tokenizer, the CaseOps codec, the build/upload scripts, and the derivation notes — even though mockingbird's BPB does not beat PR #1855. The same machinery applied to SP8192 would produce a near-record SmearGate-class run; we're documenting the SP10240 cost on otherwise-identical compression / phased-TTT machinery. + +## Provenance + +- Tokenizer file `fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model` size 401,915 B; the byte-identical copy used by all three mockingbird seeds is at `legs/2026-05-01_pr1855_sp10240_caseops_mlp375_late045_seed{0,1}_8x/tokenizers/` and `evidence/pod_pulls/8x_10320714983_20260501_sp10240_mlp375_late045_clean_submission_candidate/...` on the source repo. +- CaseOps module `lossless_caps.py` is the seed-42 lane copy; seeds 0 and 1 used byte-identical copies of the same module. +- Build log `sp10240_build.log` is the actual SentencePiece trainer output from the build that produced the standard SP10240 BPE. diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/run_sp10240_build.sh b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/run_sp10240_build.sh new file mode 100644 index 0000000000..f3cb98994f --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/run_sp10240_build.sh @@ -0,0 +1,30 @@ +#!/usr/bin/env bash +set -uo pipefail + +DATA_DIR=/home/frosty40/parameter-golf-lab/data +SPEC=/home/frosty40/parameter-golf-lab/tokenizer_specs_sp10240.json +LOG=/home/frosty40/parameter-golf-lab/sp10240_build.log +PY=/home/frosty40/miniconda3/bin/python3 +TS=$(date +%Y%m%d_%H%M%S) + +MAN_BAK="$DATA_DIR/manifest.json.bak.before_sp10240_$TS" +TCE_BAK="$DATA_DIR/tokenizer_config.export.json.bak.before_sp10240_$TS" + +cp "$DATA_DIR/manifest.json" "$MAN_BAK" +cp "$DATA_DIR/tokenizer_config.export.json" "$TCE_BAK" +echo "[wrapper] backed up index files to $MAN_BAK / $TCE_BAK" | tee -a "$LOG" + +cleanup() { + rc=$? + echo "[wrapper] build exited rc=$rc; restoring index files from backup" | tee -a "$LOG" + cp "$MAN_BAK" "$DATA_DIR/manifest.json" + cp "$TCE_BAK" "$DATA_DIR/tokenizer_config.export.json" + echo "[wrapper] index files restored. New tokenizer/dataset (if built) remain in place." 
| tee -a "$LOG" +} +trap cleanup EXIT + +echo "[wrapper] starting sp10240 build at $TS" | tee -a "$LOG" +"$PY" "$DATA_DIR/download_hf_docs_and_tokenize.py" \ + --output-root "$DATA_DIR" \ + --tokenizer-config "$SPEC" \ + --skip-byte 2>&1 | tee -a "$LOG" diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/run_sp10240_upload.sh b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/run_sp10240_upload.sh new file mode 100644 index 0000000000..ad153755fa --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/run_sp10240_upload.sh @@ -0,0 +1,52 @@ +#!/usr/bin/env bash +set -uo pipefail + +DATA_DIR=/home/frosty40/parameter-golf-lab/data +TOK_MODEL="$DATA_DIR/tokenizers/fineweb_10240_bpe.model" +TOK_VOCAB="$DATA_DIR/tokenizers/fineweb_10240_bpe.vocab" +DATASET_DIR="$DATA_DIR/datasets/fineweb10B_sp10240" +LOG=/home/frosty40/parameter-golf-lab/sp10240_upload.log +HF=/home/frosty40/miniconda3/bin/hf +REPO=Frosty40/10k_golfer +BUILD_PID=2018905 + +echo "[upload] watcher started $(date -Iseconds)" | tee -a "$LOG" +echo "[upload] waiting for build PID $BUILD_PID to exit, then for outputs to materialize" | tee -a "$LOG" + +# Wait for the build process to exit +while kill -0 "$BUILD_PID" 2>/dev/null; do + sleep 30 +done +echo "[upload] build PID $BUILD_PID exited at $(date -Iseconds)" | tee -a "$LOG" + +# Wait for outputs to be visible (script may flush after exit) +for i in $(seq 1 20); do + if [[ -s "$TOK_MODEL" && -s "$TOK_VOCAB" && -d "$DATASET_DIR" ]]; then + shard_count=$(ls "$DATASET_DIR"/fineweb_train_*.bin 2>/dev/null | wc -l) + val_count=$(ls "$DATASET_DIR"/fineweb_val_*.bin 2>/dev/null | wc -l) + if [[ "$shard_count" -gt 0 && "$val_count" -gt 0 ]]; then + echo "[upload] outputs ready: tokenizer + $shard_count train shards + $val_count val shards" | tee -a "$LOG" + break + fi + fi + echo "[upload] outputs not yet visible (try $i/20), sleeping 15s" | tee -a "$LOG" + sleep 15 +done + +if [[ ! -s "$TOK_MODEL" || ! -d "$DATASET_DIR" ]]; then + echo "[upload] FATAL: outputs not present after build exit. Check sp10240_build.log." | tee -a "$LOG" + exit 1 +fi + +echo "[upload] creating repo $REPO (public, dataset)" | tee -a "$LOG" +"$HF" repo create "$REPO" --repo-type dataset 2>&1 | tee -a "$LOG" || \ + echo "[upload] repo create returned nonzero (likely already exists), continuing" | tee -a "$LOG" + +echo "[upload] uploading tokenizer files" | tee -a "$LOG" +"$HF" upload "$REPO" "$TOK_MODEL" "fineweb_10240_bpe.model" --repo-type dataset 2>&1 | tee -a "$LOG" +"$HF" upload "$REPO" "$TOK_VOCAB" "fineweb_10240_bpe.vocab" --repo-type dataset 2>&1 | tee -a "$LOG" + +echo "[upload] uploading dataset shards from $DATASET_DIR (large folder)" | tee -a "$LOG" +"$HF" upload-large-folder "$REPO" "$DATASET_DIR" --repo-type dataset 2>&1 | tee -a "$LOG" + +echo "[upload] DONE at $(date -Iseconds). 
Repo: https://huggingface.co/datasets/$REPO" | tee -a "$LOG" diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/sp10240_build.log b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/sp10240_build.log new file mode 100644 index 0000000000..d7459a08b6 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/build/sp10240_build.log @@ -0,0 +1,1720 @@ +[wrapper] backed up index files to /home/frosty40/parameter-golf-lab/data/manifest.json.bak.before_sp10240_20260429_110811 / /home/frosty40/parameter-golf-lab/data/tokenizer_config.export.json.bak.before_sp10240_20260429_110811 +[wrapper] starting sp10240 build at 20260429_110811 +sentencepiece_trainer.cc(78) LOG(INFO) Starts training with : +trainer_spec { + input_format: + model_prefix: /home/frosty40/parameter-golf-lab/data/tokenizers/fineweb_10240_bpe + model_type: BPE + vocab_size: 10240 + self_test_sample_size: 0 + character_coverage: 0.999 + input_sentence_size: 0 + shuffle_input_sentence: 1 + seed_sentencepiece_size: 1000000 + shrinking_factor: 0.75 + max_sentence_length: 4192 + num_threads: 16 + num_sub_iterations: 2 + max_sentencepiece_length: 16 + split_by_unicode_script: 1 + split_by_number: 1 + split_by_whitespace: 1 + split_digits: 1 + pretokenization_delimiter: + treat_whitespace_as_suffix: 0 + allow_whitespace_only_pieces: 0 + required_chars: + byte_fallback: 1 + vocabulary_output_piece_score: 1 + train_extremely_large_corpus: 0 + seed_sentencepieces_file: + hard_vocab_limit: 0 + use_all_vocab: 0 + unk_id: 3 + bos_id: 1 + eos_id: 2 + pad_id: 0 + unk_piece: + bos_piece: + eos_piece: + pad_piece: + unk_surface: ⁇ + enable_differential_privacy: 0 + differential_privacy_noise_level: 0 + differential_privacy_clipping_threshold: 0 +} +normalizer_spec { + name: nmt_nfkc + add_dummy_prefix: 0 + remove_extra_whitespaces: 1 + escape_whitespaces: 1 + normalization_rule_tsv: +} +denormalizer_spec {} +trainer_interface.cc(382) LOG(WARNING) Found too long line (6213 > 4192). +trainer_interface.cc(384) LOG(WARNING) Too long lines are skipped in the training. +trainer_interface.cc(385) LOG(WARNING) The maximum length can be changed with --max_sentence_length= flag. +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: ▂▃▅▇▓ ▓<[o]>▓ ▓▇▅▃▂ +Circuit bent Chimp & Kitty birthday cards +The Truckquencer! – 2 circuit bent tonka trucks, each connected to a sequencer reving, beeping, starting, reversing in harmony!!?! +Mods – pitch control, loop switch, mangle switch, body contact distortion, jack socket out, trigger in. +Circuit bent korg Monotron duo sequencer & unmodded boss dr-5 fed into 1 of my robot destruction fx boxes for a quick test…. +Robot destruction Mic check!! +Based on a ring modulator ic, pitch intervals up/down, robot sound & lfo circuit. +▇▅▃▂▓ ░[◣o◢]░ ▓▂▃▅▇ +trainer_interface.cc(148) LOG(INFO) Loaded 1000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 2000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Walkera Vitus – Behind the Scenes +Walkera Vitus tests log. A rare peek on “a day in the life at a drone factory” and footage from recent tests. First a tablet view of the AR games on a 3-axis gimbal. Next a test on the optical sensors and active track. +Walkera Vitus 320, third release of the augmented reality flying game console. Designed as a convenient compact folding aerial filming 4K drone, with auto return home using dual GPS and an added gaming function. 
Vitus pairs with mobile devices for virtual games. Just find an open field, start the quadcopter, this will be your track or battlefield! In racing mode, set up a virtual track and begin racing. In battle mode, fighter pilot is engaged, locate your enemies, shooting down enemy drones. +“A day in the life at a drone factory”, a rare look inside a drone factory that I thought would be interesting to share, archives from 2014-16. I like how everyone is going about their work, just candid shots, unscripted. +Walkera Warehouse: www.ucdrone.com +▇ ▆ ▅ ▄▐ ►DjLeo Rocks Z House◄▐ ▃ ▄ ▅ ▆ +…..|̲̅̅●̲̅̅|̲̅̅=̲̅̅|̲̅̅●̲̅̅| ιllιlı Subscribe For More ιllιlı |̲̅̅●̲̅̅|̲̅̅=̲̅̅|̲̅̅●̲̅̅|….. +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Rosalba Va - Rosalba Vagge +Report this image +Email to a friend +Add to list +Toggle Worksafe Mode: +Oct 30, 2007 +YOUR FACE MY ART +Overpower the sun +Animal modelling with Pretty Lady +Time to Fall Back to Autumn +Dean's List with Animals +Modelling with Animals +Shots I Want to recreate +Try B4 I die +Killer Photos I would like to try +Ruben Vasquez's list of killer photos +something about it... +Renaissance Modeling, Style, Photography +Anything Like This & You Can Count Me In! +When it ALL comes together = Really GREAT SHOTS +NST - Fashion! +Some fashion favs... +Shots i want to do +OFF THE CHAIN +killer/wanna try photos +MY FAVORITE, CONCEPTS, POSES, ETC. +Amazing and Inspiring +Awesomeness with animals +I wanna do this +Animals as Props +Emery Productions Creative Images +Helsings list of great photos +♥♥ Captivating Coastal Captures & Striking Sunsets ♥♥ +Concepts I'd love to try!!! +<3...Models and animals,,,,<3 +i love this idea! +badass people with badass dogs :) +Shots I gotta try +what i would like to have in my port +"Los animales son buenos amigos, no hacen preguntas y tampoco critican" +Damn! Wish I thought of this +Now thats Awesome! +▁▂▃▅▆▇█ Next Shoot Plan █▇▆▅▃▂▁ +Concept Ideas For Future Shoots +Model with dogs +Models and animals +KELLIE'S LIST OF KILLER SHOTS +Ashliia Jhanei's Favorite Photos on MM +Shoots I'd like to do +i am Telling a Story !!! +The use of pets +LOVE TO RE-CREATE +Jonathan Hargrove's list of killer photos +Sphive's list of killer photos +Yannis Photography's list of killer photos +This has been listed so many times it's almost pointless to add but SO AMAZING that I list it anyway +Joanna Lynae's list of killer photos +Concepts I'd like to do! +SF- like this lighting +Kia's Favourite Pix +Love the Makeup! i wonder how it would look on me +Shots and poses i want to do +Photographers i would love to work with, or just have to work with +Shoots I would love to do - with animals +Awesome Animal Shots +angelina_polska's list of killer photos +14 photo Animals and Models +Eleya Maureen's killer photo list +Concepts and ideas I would love to try +JUST 222222222 VICIOUS +MY FAVORITE COLOR +Favorite Nature/Animal Shots +AR-Photos's list of killer photos +Shaman Dreams list of killer photos. +Haute Styling and Drobes of War! +Mike Stalnaker's list of killer photos +Favorite Pose Shots +Killer Photos I'd Like to ReCreate +I LOVE THOSE!!!! +Concepts, poses, or shots to recreate!! +LOVE the background! +Great Glamour Shots +Nicole's list of killer photos +Gorgeous Captures with Beautiful Skies.. :) +J'aime la mode! +photos that make me go WOW!!!! 
+I love those shots +Tony Lipari's list of killer photos +some more fav's +Vancouver Models' Best Photo (female) +Goldstock's list of killer photos +Work From "The Masters Of Photo & Leader's Of The Power Of Image A-List: +Beautiful Light and Colour +Farrahs List Of Tasteful Photos +Inspiring pictures ! +I WANT THIS SHOT! +View All Comments +February 20, 2012 8:40pm +This photo is amazing!!! +November 11, 2010 6:21am +You are beautiful !!!!!! +October 01, 2010 4:57pm +June 07, 2010 11:37pm +I love this shot! +March 27, 2010 6:53pm +:-) Love! You look awesome! And so do the dogs! :-) +February 10, 2010 12:27pm +Fantastic image! My favourite breed of dogs and you're pretty well bred too! ;-) +February 10, 2010 11:34am +Love everything about this! Good job! +November 28, 2009 4:57pm +D A V E C +October 26, 2009 11:00pm +This is really great. Love it! +October 16, 2009 10:09am +this is an amazing picture +Jmay - Model +October 11, 2009 5:50pm +gorgeous...love everything about this! +i am studios +September 30, 2009 2:29pm +Love this !!! +September 29, 2009 8:32am +Beyond excellent.... onall levels! +September 19, 2009 7:31am +September 05, 2009 4:29pm +August 15, 2009 11:47pm +One of the best shots I've seen in a while, so cool. The colors, edit, dogs, location.. outfit.. pose.. hair... beautiful +June 28, 2009 12:15pm +June 22, 2009 5:21am +June 21, 2009 1:36pm +This is very good work. Best of luck to you. Steve +June 19, 2009 2:10am +View All Comments +Rosalba Va has set a password in order to view this album. +Password is incorrect. If you would like to view this album, please contact Rosalba Va. +trainer_interface.cc(148) LOG(INFO) Loaded 3000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 4000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 5000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Daughter & mOmPosted: Wed. Jan. 9 05:07:25 2019 +▃▅▆█▒▓ Mom and daughter ( Toghather ) looking for older guy with nice cock.Looking for anything from oral to hookup or whatever you want. ▒▓█▆▅▃▂ +HIT mE HerE >>>> [email protected] "3x" as the subject line +i did not get Ur msg.are u real? +- Location: [email protected] +- Age: 24 +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: 2 Back to school is a 100% After Effects template (CS4 or higher). It’s time to go back to school. We are proud to introduce you to Adobe After Effects template. The project template comes with 25 placeholders for your photos / videos and 16 holders for texts plus logo, and the lower third of the bumper. Just import your media, type your text and you will be ready to render! The project can be used for a wide range of marketing campaigns, especially for schools participating in promotional, advertising and commercials. Student after parties, etc. Best of all, this project is easy to use for beginners and pros, just email your pictures to the designated privilege customize the color scheme to fit your needs, add text with a few simple clicks and display it in the desired resolution. Be creative and do not be afraid to experiment with new quick and simple system of color! +After Effects template contains: +Openere – 1:10 sec +Bumper – 0:06 sec. +Titer – 0:10 sec +8 x Transfers – 0:04 sec +! Come up with a new background and updates! +22 + doodle silhouettes in AI & PSD and illustrations included +15 photos / videos and 16 text placeholders +All are easily replaced (100% After Effects). Drag and drop content and visualization! 
Text strip, media files (photo or video) and change the background color +No external plugins required +3 options resolution pre-defined, ready to assist you in this! 1080p, 720p, SDDV Widescreen +The sound effects are included +Visit website and download free project after effect : click Here or Here +DEMO CLIP : +Like facebook : https://www.facebook.com/fdownprojectae +┊ ┊ ┊ ★ +┊ ┊ ☆ +▂ ▃ ▅ ▆ █ www.freeprojectae.com █ ▆ ▅ ▃ ▂ +☞ Link download 1: http://adf.ly/5786924/back-2-school +☞ Link download 2: http://adf.ly/5786924/back-2-school2 +☞ Link download 3: http://adf.ly/5786924/back-2-school3 +☞ Link download 4: +Pass winrar : freeprojectae.com +trainer_interface.cc(148) LOG(INFO) Loaded 6000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 7000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: I updated my jekyll-co2 plugin (first written about here) that shows the change in atmospheric CO₂ at the Mauna Loa observatory in Hawaii. It used to just show three years of data as numbers, but now it shows 20 years in a little text (Unicode) sparkline. In text, it’s this: ▁▁▂▂▂▂▃▃▃▄▄▄▅▅▅▅▆▆▆▇▇. +Graphically, it’s this: +If you hover over a year it shows the exact number in a tooltip. The preventable but unstopping increase is what I want to show, though, and in plain (Unicode) text I think that comes across even without obvious numbers. +I got the mechanism from a nice hack in this gist. I’d like to do more years in a prettier sparkline, but that would require some outside graphical library, and I don’t feel like getting into that right now. +trainer_interface.cc(148) LOG(INFO) Loaded 8000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 9000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: You are using an outdated browser. Please +upgrade your browser to improve your ReverbNation experience. +I want to interview you! Please click "Become a fan" so I can interview you! The questionnaire get's send automatically to your mail inbox after you became a fan! Check my profile page for more informations! Looking really forward to hear your story and feature you! +Booking Opening Acts for the Northern Hype & Blake Selby Midwest TOUR. Visit NorthernHype.net for all dates & submission info. +That's what's up much respect ~ +Great music. Im Dame Stacks the Producer CEO of Ben&Frank Productions, and im on a networking mission. We are looking for new artists to listen to and you got some official content. Fan back and come check out my new Website for some of the hottest Beats underground and Free Promotion. And to get beats today contact us on facebook or twitter. #Respect and we also offer free downloads on our Site as well. http://www.benfrankproductions.com/the-beat-vault.html +Artist send your hottest song to email@example.com For a chance to get on the #GrandHustleUnsignHype Mixtape!!! #GH #GN +LET COLAB www.reverbnation.com/califiedmusic +TIGHT MUSIC FAN BACK +Yoooo...ay man I like dat 'Hol Up' +Thank you for supporting my music. I appreciate it. +Whaddup Louie...thnx 4 coming thru, u got some nice music (nice flow), keep up the grind aight. +dropping by your page to say hello, showing support & wishing u the best in 2012!! <3 +LOUIE MF YOUNG +I HEARD THAT KUSH HIT LIKE LEBRON +TRACK AND THAT SONG IS +CLEANER THAN A BAR OF SOAP +KEEP MAKING MUSIC +KEEP IT MOVING +LIKE THE HANDS ON A CLOCK +▂ ▃ ▅ ▆ █ G3KBEATS.COM █ ▆ ▅ ▃ ▂: +© 2015 eMinor Incorporated +All trademarks are the property of the respective owners. 
ReverbNation is not affiliated with the trademark owners. +Not listening to anything? +Try one of the ReverbNation Channels +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Address: 6 supa square view +Account name: ooxxmichelxxoo +Last seen: 11February2019 +SWAT - ProCop - SAI - ZIP - ALT - SAFD - raceTECH - BRINKS +▂▃▄▅▆▇█▓▒░Special Weapons and Tactics░▒▓█▇▆▅▄▃▂ +Add a screenshot that clearly shows the user's last login (/seen username) +User has logged in today, request denied. +trainer_interface.cc(148) LOG(INFO) Loaded 10000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 11000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 12000000 lines +trainer_interface.cc(125) LOG(WARNING) Too many sentences are loaded! (12426626), which may slow down training. +trainer_interface.cc(127) LOG(WARNING) Consider using --input_sentence_size= and --shuffle_input_sentence=true. +trainer_interface.cc(130) LOG(WARNING) They allow to randomly sample sentences from the entire corpus. +trainer_interface.cc(411) LOG(INFO) Loaded all 12426626 sentences +trainer_interface.cc(418) LOG(INFO) Skipped 2942174 too long sentences. +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x00> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x01> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x02> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x03> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x04> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x05> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x06> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x07> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x08> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x09> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x0A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x0B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x0C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x0D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x0E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x0F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x10> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x11> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x12> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x13> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x14> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x15> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x16> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x17> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x18> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x19> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x1A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x1B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x1C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x1D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x1E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x1F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x20> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x21> +trainer_interface.cc(427) 
LOG(INFO) Adding meta_piece: <0x22> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x23> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x24> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x25> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x26> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x27> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x28> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x29> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x2A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x2B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x2C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x2D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x2E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x2F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x30> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x31> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x32> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x33> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x34> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x35> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x36> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x37> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x38> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x39> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x3A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x3B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x3C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x3D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x3E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x3F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x40> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x41> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x42> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x43> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x44> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x45> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x46> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x47> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x48> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x49> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x4A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x4B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x4C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x4D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x4E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x4F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x50> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x51> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x52> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x53> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x54> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x55> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x56> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x57> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x58> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x59> +trainer_interface.cc(427) LOG(INFO) Adding 
meta_piece: <0x5A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x5B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x5C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x5D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x5E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x5F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x60> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x61> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x62> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x63> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x64> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x65> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x66> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x67> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x68> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x69> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x6A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x6B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x6C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x6D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x6E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x6F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x70> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x71> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x72> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x73> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x74> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x75> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x76> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x77> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x78> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x79> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x7A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x7B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x7C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x7D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x7E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x7F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x80> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x81> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x82> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x83> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x84> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x85> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x86> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x87> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x88> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x89> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x8A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x8B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x8C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x8D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x8E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x8F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x90> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x91> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x92> 
+trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x93> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x94> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x95> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x96> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x97> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x98> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x99> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x9A> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x9B> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x9C> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x9D> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x9E> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x9F> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA0> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA1> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA2> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA3> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA4> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA5> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA6> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA7> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA8> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xA9> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xAA> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xAB> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xAC> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xAD> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xAE> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xAF> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB0> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB1> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB2> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB3> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB4> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB5> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB6> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB7> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB8> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xB9> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xBA> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xBB> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xBC> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xBD> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xBE> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xBF> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC0> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC1> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC2> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC3> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC4> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC5> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC6> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC7> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC8> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xC9> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xCA> +trainer_interface.cc(427) 
LOG(INFO) Adding meta_piece: <0xCB> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xCC> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xCD> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xCE> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xCF> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD0> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD1> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD2> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD3> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD4> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD5> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD6> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD7> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD8> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xD9> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xDA> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xDB> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xDC> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xDD> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xDE> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xDF> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE0> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE1> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE2> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE3> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE4> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE5> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE6> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE7> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE8> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xE9> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xEA> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xEB> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xEC> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xED> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xEE> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xEF> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF0> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF1> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF2> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF3> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF4> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF5> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF6> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF7> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF8> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xF9> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xFA> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xFB> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xFC> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xFD> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xFE> +trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0xFF> +trainer_interface.cc(432) LOG(INFO) Normalizing sentences... +trainer_interface.cc(541) LOG(INFO) all chars count=20477807321 +trainer_interface.cc(552) LOG(INFO) Done: 99.9034% characters are covered. 
+trainer_interface.cc(562) LOG(INFO) Alphabet size=85 +trainer_interface.cc(563) LOG(INFO) Final character coverage=0.999034 +trainer_interface.cc(594) LOG(INFO) Done! preprocessed 12426626 sentences. +trainer_interface.cc(600) LOG(INFO) Tokenizing input sentences with whitespace: 12426626 +[wrapper] build exited rc=0; restoring index files from backup +[wrapper] index files restored. New tokenizer/dataset (if built) remain in place. +[wrapper] backed up index files to /home/frosty40/parameter-golf-lab/data/manifest.json.bak.before_sp10240_20260429_121536 / /home/frosty40/parameter-golf-lab/data/tokenizer_config.export.json.bak.before_sp10240_20260429_121536 +[wrapper] starting sp10240 build at 20260429_121536 +sentencepiece_trainer.cc(78) LOG(INFO) Starts training with : +trainer_spec { + input_format: + model_prefix: /home/frosty40/parameter-golf-lab/data/tokenizers/fineweb_10240_bpe + model_type: BPE + vocab_size: 10240 + self_test_sample_size: 0 + character_coverage: 0.999 + input_sentence_size: 0 + shuffle_input_sentence: 1 + seed_sentencepiece_size: 1000000 + shrinking_factor: 0.75 + max_sentence_length: 4192 + num_threads: 16 + num_sub_iterations: 2 + max_sentencepiece_length: 16 + split_by_unicode_script: 1 + split_by_number: 1 + split_by_whitespace: 1 + split_digits: 1 + pretokenization_delimiter: + treat_whitespace_as_suffix: 0 + allow_whitespace_only_pieces: 0 + required_chars: + byte_fallback: 1 + vocabulary_output_piece_score: 1 + train_extremely_large_corpus: 0 + seed_sentencepieces_file: + hard_vocab_limit: 0 + use_all_vocab: 0 + unk_id: 3 + bos_id: 1 + eos_id: 2 + pad_id: 0 + unk_piece: + bos_piece: + eos_piece: + pad_piece: + unk_surface: ⁇ + enable_differential_privacy: 0 + differential_privacy_noise_level: 0 + differential_privacy_clipping_threshold: 0 +} +normalizer_spec { + name: nmt_nfkc + add_dummy_prefix: 0 + remove_extra_whitespaces: 1 + escape_whitespaces: 1 + normalization_rule_tsv: +} +denormalizer_spec {} +trainer_interface.cc(382) LOG(WARNING) Found too long line (4245 > 4192). +trainer_interface.cc(384) LOG(WARNING) Too long lines are skipped in the training. +trainer_interface.cc(385) LOG(WARNING) The maximum length can be changed with --max_sentence_length= flag. +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: ▂▃▅▇▓ ▓<[o]>▓ ▓▇▅▃▂ +Circuit bent Chimp & Kitty birthday cards +The Truckquencer! – 2 circuit bent tonka trucks, each connected to a sequencer reving, beeping, starting, reversing in harmony!!?! +Mods – pitch control, loop switch, mangle switch, body contact distortion, jack socket out, trigger in. +Circuit bent korg Monotron duo sequencer & unmodded boss dr-5 fed into 1 of my robot destruction fx boxes for a quick test…. +Robot destruction Mic check!! +Based on a ring modulator ic, pitch intervals up/down, robot sound & lfo circuit. +▇▅▃▂▓ ░[◣o◢]░ ▓▂▃▅▇ +trainer_interface.cc(148) LOG(INFO) Loaded 1000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 2000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Walkera Vitus – Behind the Scenes +Walkera Vitus tests log. A rare peek on “a day in the life at a drone factory” and footage from recent tests. First a tablet view of the AR games on a 3-axis gimbal. Next a test on the optical sensors and active track. +Walkera Vitus 320, third release of the augmented reality flying game console. Designed as a convenient compact folding aerial filming 4K drone, with auto return home using dual GPS and an added gaming function. 
Vitus pairs with mobile devices for virtual games. Just find an open field, start the quadcopter, this will be your track or battlefield! In racing mode, set up a virtual track and begin racing. In battle mode, fighter pilot is engaged, locate your enemies, shooting down enemy drones. +“A day in the life at a drone factory”, a rare look inside a drone factory that I thought would be interesting to share, archives from 2014-16. I like how everyone is going about their work, just candid shots, unscripted. +Walkera Warehouse: www.ucdrone.com +▇ ▆ ▅ ▄▐ ►DjLeo Rocks Z House◄▐ ▃ ▄ ▅ ▆ +…..|̲̅̅●̲̅̅|̲̅̅=̲̅̅|̲̅̅●̲̅̅| ιllιlı Subscribe For More ιllιlı |̲̅̅●̲̅̅|̲̅̅=̲̅̅|̲̅̅●̲̅̅|….. +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Rosalba Va - Rosalba Vagge +Report this image +Email to a friend +Add to list +Toggle Worksafe Mode: +Oct 30, 2007 +YOUR FACE MY ART +Overpower the sun +Animal modelling with Pretty Lady +Time to Fall Back to Autumn +Dean's List with Animals +Modelling with Animals +Shots I Want to recreate +Try B4 I die +Killer Photos I would like to try +Ruben Vasquez's list of killer photos +something about it... +Renaissance Modeling, Style, Photography +Anything Like This & You Can Count Me In! +When it ALL comes together = Really GREAT SHOTS +NST - Fashion! +Some fashion favs... +Shots i want to do +OFF THE CHAIN +killer/wanna try photos +MY FAVORITE, CONCEPTS, POSES, ETC. +Amazing and Inspiring +Awesomeness with animals +I wanna do this +Animals as Props +Emery Productions Creative Images +Helsings list of great photos +♥♥ Captivating Coastal Captures & Striking Sunsets ♥♥ +Concepts I'd love to try!!! +<3...Models and animals,,,,<3 +i love this idea! +badass people with badass dogs :) +Shots I gotta try +what i would like to have in my port +"Los animales son buenos amigos, no hacen preguntas y tampoco critican" +Damn! Wish I thought of this +Now thats Awesome! +▁▂▃▅▆▇█ Next Shoot Plan █▇▆▅▃▂▁ +Concept Ideas For Future Shoots +Model with dogs +Models and animals +KELLIE'S LIST OF KILLER SHOTS +Ashliia Jhanei's Favorite Photos on MM +Shoots I'd like to do +i am Telling a Story !!! +The use of pets +LOVE TO RE-CREATE +Jonathan Hargrove's list of killer photos +Sphive's list of killer photos +Yannis Photography's list of killer photos +This has been listed so many times it's almost pointless to add but SO AMAZING that I list it anyway +Joanna Lynae's list of killer photos +Concepts I'd like to do! +SF- like this lighting +Kia's Favourite Pix +Love the Makeup! i wonder how it would look on me +Shots and poses i want to do +Photographers i would love to work with, or just have to work with +Shoots I would love to do - with animals +Awesome Animal Shots +angelina_polska's list of killer photos +14 photo Animals and Models +Eleya Maureen's killer photo list +Concepts and ideas I would love to try +JUST 222222222 VICIOUS +MY FAVORITE COLOR +Favorite Nature/Animal Shots +AR-Photos's list of killer photos +Shaman Dreams list of killer photos. +Haute Styling and Drobes of War! +Mike Stalnaker's list of killer photos +Favorite Pose Shots +Killer Photos I'd Like to ReCreate +I LOVE THOSE!!!! +Concepts, poses, or shots to recreate!! +LOVE the background! +Great Glamour Shots +Nicole's list of killer photos +Gorgeous Captures with Beautiful Skies.. :) +J'aime la mode! +photos that make me go WOW!!!! 
+I love those shots +Tony Lipari's list of killer photos +some more fav's +Vancouver Models' Best Photo (female) +Goldstock's list of killer photos +Work From "The Masters Of Photo & Leader's Of The Power Of Image A-List: +Beautiful Light and Colour +Farrahs List Of Tasteful Photos +Inspiring pictures ! +I WANT THIS SHOT! +View All Comments +February 20, 2012 8:40pm +This photo is amazing!!! +November 11, 2010 6:21am +You are beautiful !!!!!! +October 01, 2010 4:57pm +June 07, 2010 11:37pm +I love this shot! +March 27, 2010 6:53pm +:-) Love! You look awesome! And so do the dogs! :-) +February 10, 2010 12:27pm +Fantastic image! My favourite breed of dogs and you're pretty well bred too! ;-) +February 10, 2010 11:34am +Love everything about this! Good job! +November 28, 2009 4:57pm +D A V E C +October 26, 2009 11:00pm +This is really great. Love it! +October 16, 2009 10:09am +this is an amazing picture +Jmay - Model +October 11, 2009 5:50pm +gorgeous...love everything about this! +i am studios +September 30, 2009 2:29pm +Love this !!! +September 29, 2009 8:32am +Beyond excellent.... onall levels! +September 19, 2009 7:31am +September 05, 2009 4:29pm +August 15, 2009 11:47pm +One of the best shots I've seen in a while, so cool. The colors, edit, dogs, location.. outfit.. pose.. hair... beautiful +June 28, 2009 12:15pm +June 22, 2009 5:21am +June 21, 2009 1:36pm +This is very good work. Best of luck to you. Steve +June 19, 2009 2:10am +View All Comments +Rosalba Va has set a password in order to view this album. +Password is incorrect. If you would like to view this album, please contact Rosalba Va. +trainer_interface.cc(148) LOG(INFO) Loaded 3000000 lines +trainer_interface.cc(148) LOG(INFO) Loaded 4000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Daughter & mOmPosted: Wed. Jan. 9 05:07:25 2019 +▃▅▆█▒▓ Mom and daughter ( Toghather ) looking for older guy with nice cock.Looking for anything from oral to hookup or whatever you want. ▒▓█▆▅▃▂ +HIT mE HerE >>>> [email protected] "3x" as the subject line +i did not get Ur msg.are u real? +- Location: [email protected] +- Age: 24 +trainer_interface.cc(148) LOG(INFO) Loaded 5000000 lines +trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: 2 Back to school is a 100% After Effects template (CS4 or higher). It’s time to go back to school. We are proud to introduce you to Adobe After Effects template. The project template comes with 25 placeholders for your photos / videos and 16 holders for texts plus logo, and the lower third of the bumper. Just import your media, type your text and you will be ready to render! The project can be used for a wide range of marketing campaigns, especially for schools participating in promotional, advertising and commercials. Student after parties, etc. Best of all, this project is easy to use for beginners and pros, just email your pictures to the designated privilege customize the color scheme to fit your needs, add text with a few simple clicks and display it in the desired resolution. Be creative and do not be afraid to experiment with new quick and simple system of color! +After Effects template contains: +Openere – 1:10 sec +Bumper – 0:06 sec. +Titer – 0:10 sec +8 x Transfers – 0:04 sec +! Come up with a new background and updates! +22 + doodle silhouettes in AI & PSD and illustrations included +15 photos / videos and 16 text placeholders +All are easily replaced (100% After Effects). Drag and drop content and visualization! 
+trainer_interface.cc(148) LOG(INFO) Loaded 6000000 lines
+trainer_interface.cc(148) LOG(INFO) Loaded 7000000 lines
+trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: I updated my jekyll-co2 plugin that shows the change in atmospheric CO₂ at the Mauna Loa observatory in Hawaii. [blog post with a Unicode sparkline elided]
+trainer_interface.cc(148) LOG(INFO) Loaded 8000000 lines
+trainer_interface.cc(148) LOG(INFO) Loaded 9000000 lines
+trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: You are using an outdated browser. [ReverbNation page chrome and comment wall elided]
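+
+For orientation, a run like this is typically launched through the SentencePiece Python API. A hedged sketch follows; the paths and values are placeholders for illustration, and the authoritative settings live in tokenizer_specs_sp10240.json:
+
+```python
+# Sketch only: illustrative flag values, not the committed spec.
+import sentencepiece as spm
+
+spm.SentencePieceTrainer.train(
+    input="fineweb_sample.txt",        # placeholder corpus path
+    model_prefix="fineweb_10240_bpe",
+    model_type="bpe",
+    vocab_size=10240,
+    character_coverage=0.9990,         # log below reports 0.999034 final coverage
+    byte_fallback=True,                # matches the 256 <0xNN> pieces added below
+    # input_sentence_size=12_000_000,  # the trainer's WARNING below suggests
+    # shuffle_input_sentence=True,     # these two; this run loaded everything
+)
+```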
+trainer_interface.cc(393) LOG(INFO) Reserved chars are found. Skipped: Address: 6 supa square view [game-community post with block-art banner elided]
+trainer_interface.cc(148) LOG(INFO) Loaded 10000000 lines
+trainer_interface.cc(148) LOG(INFO) Loaded 11000000 lines
+trainer_interface.cc(148) LOG(INFO) Loaded 12000000 lines
+trainer_interface.cc(125) LOG(WARNING) Too many sentences are loaded! (12386083), which may slow down training.
+trainer_interface.cc(127) LOG(WARNING) Consider using --input_sentence_size=<size> and --shuffle_input_sentence=true.
+trainer_interface.cc(130) LOG(WARNING) They allow to randomly sample sentences from the entire corpus.
+trainer_interface.cc(411) LOG(INFO) Loaded all 12386083 sentences
+trainer_interface.cc(418) LOG(INFO) Skipped 2932717 too long sentences.
+trainer_interface.cc(427) LOG(INFO) Adding meta_piece: [four special pieces; their angle-bracket names were stripped as markup during extraction]
+trainer_interface.cc(427) LOG(INFO) Adding meta_piece: <0x00> … <0xFF> [256 byte-fallback pieces, one log line each, elided]
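+
+With byte fallback enabled, the 256 <0xNN> pieces just added are what any character outside the kept alphabet decomposes into; decoding simply concatenates the raw bytes back. A self-contained sketch:
+
+```python
+# Sketch: reassembling byte-fallback pieces into text.
+def decode_byte_pieces(pieces: list[str]) -> str:
+    buf = bytearray(int(p[3:5], 16) for p in pieces)  # "<0xE2>" -> 0xE2
+    return buf.decode("utf-8")
+
+# E.g. if "€" (UTF-8 bytes E2 82 AC) fell outside the alphabet:
+print(decode_byte_pieces(["<0xE2>", "<0x82>", "<0xAC>"]))  # -> €
+```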
+trainer_interface.cc(432) LOG(INFO) Normalizing sentences...
+trainer_interface.cc(541) LOG(INFO) all chars count=20411767669
+trainer_interface.cc(552) LOG(INFO) Done: 99.9034% characters are covered.
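+
+The "all chars count" and 99.9034% figures come from the character_coverage cutoff: the most frequent characters are kept as the base alphabet (size 85, reported next) until the requested fraction of all character occurrences is covered, and everything rarer is left to byte fallback. A sketch of that semantics, assuming a simple frequency cutoff:
+
+```python
+# Sketch: frequency-cutoff view of character_coverage (assumed semantics).
+from collections import Counter
+
+def alphabet_for_coverage(corpus, coverage: float):
+    freq = Counter(ch for line in corpus for ch in line)
+    total = sum(freq.values())
+    kept, covered = [], 0
+    for ch, n in freq.most_common():
+        if covered / total >= coverage:
+            break  # rarer chars fall back to <0xNN> byte pieces
+        kept.append(ch)
+        covered += n
+    return kept, covered / total
+
+alpha, cov = alphabet_for_coverage(["hello world"], coverage=0.999)
+print(len(alpha), f"{cov:.6f}")
+```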
+trainer_interface.cc(562) LOG(INFO) Alphabet size=85
+trainer_interface.cc(563) LOG(INFO) Final character coverage=0.999034
+trainer_interface.cc(594) LOG(INFO) Done! preprocessed 12386083 sentences.
+trainer_interface.cc(600) LOG(INFO) Tokenizing input sentences with whitespace: 12386083
+trainer_interface.cc(611) LOG(INFO) Done! 37312115
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=407894956 min_freq=164114
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=123487622 size=20 all=5168 active=2820 piece=en
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=71043179 size=40 all=7308 active=4960 piece=ion
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=43362112 size=60 all=9591 active=7243 piece=▁C
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=32459932 size=80 all=12976 active=10628 piece=ver
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=23248197 size=100 all=16194 active=13846 piece=ul
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=23179296 min_freq=1990397
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=19808235 size=120 all=19569 active=4129 piece=pp
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=17028521 size=140 all=23497 active=8057 piece=▁se
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=14611725 size=160 all=27986 active=12546 piece=um
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=12936501 size=180 all=32159 active=16719 piece=nt
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=10742723 size=200 all=37347 active=21907 piece=ial
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=10590671 min_freq=842839
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=9513785 size=220 all=43109 active=7294 piece=▁do
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=8546571 size=240 all=48293 active=12478 piece=pt
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=7862672 size=260 all=54140 active=18325 piece=ood
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=7131750 size=280 all=60372 active=24557 piece=ike
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=6639898 size=300 all=66049 active=30234 piece=▁man
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=6614546 min_freq=376893
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=6292754 size=320 all=72671 active=9759 piece=▁Ch
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=5883450 size=340 all=78349 active=15437 piece=ven
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=5288910 size=360 all=83860 active=20948 piece=▁over
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=4841305 size=380 all=89648 active=26736 piece=▁kn
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=4584668 size=400 all=95845 active=32933 piece=▁acc
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=4542217 min_freq=210048
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=4284524 size=420 all=101055 active=9872 piece=▁into
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=3916445 size=440 all=107192 active=16009 piece=amp
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=3721144 size=460 all=113501 active=22318 piece=▁know
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=3514497 size=480 all=118423 active=27240 piece=▁most
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=3327533 size=500 all=124011 active=32828 piece=com
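+
+Each "Added" line above is one greedy BPE step: the most frequent adjacent symbol pair is promoted to a new piece, and "Updating active symbols" appears to be the trainer refreshing its bounded candidate queue (hence the max_freq/min_freq pair). A toy merge loop in the same spirit, not SentencePiece's optimized implementation:
+
+```python
+# Toy BPE: promote the most frequent adjacent pair, re-segment, repeat.
+from collections import Counter
+
+def bpe_merges(words: Counter, n_merges: int) -> None:
+    vocab = {w: list(w) for w in words}  # word -> current symbol sequence
+    for _ in range(n_merges):
+        pairs = Counter()
+        for w, syms in vocab.items():
+            for a, b in zip(syms, syms[1:]):
+                pairs[(a, b)] += words[w]
+        if not pairs:
+            break
+        (a, b), freq = pairs.most_common(1)[0]
+        print(f"Added: freq={freq} piece={a + b}")
+        for w, syms in vocab.items():    # apply the merge everywhere
+            out, i = [], 0
+            while i < len(syms):
+                if i + 1 < len(syms) and (syms[i], syms[i + 1]) == (a, b):
+                    out.append(a + b)
+                    i += 2
+                else:
+                    out.append(syms[i])
+                    i += 1
+            vocab[w] = out
+
+bpe_merges(Counter({"low": 5, "lower": 2, "newest": 6, "widest": 3}), 4)
+```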
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=3325972 min_freq=135964
+[merge log for sizes 520–6980 elided: one "Added" line per 20 merges plus periodic "Updating active symbols" lines; the every-1000-merges checkpoints are retained below]
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=1465937 size=1000 all=273792 active=42099 piece=▁iss
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=639847 size=2000 all=548749 active=53324 piece=▁repl
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=387198 size=3000 all=786179 active=66125 piece=bit
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=265973 size=4000 all=1012951 active=68062 piece=osing
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=193831 size=5000 all=1231185 active=81547 piece=▁chat
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=148748 size=6000 all=1451681 active=88634 piece=▁subscrib
+bpe_model_trainer.cc(268) LOG(INFO) Added: freq=118510 size=7000 all=1672986 active=105881 piece=AA
+bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols.
max_freq=118500 min_freq=1181 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=117981 size=7020 all=1675687 active=85771 piece=▁objects +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=117414 size=7040 all=1680464 active=90548 piece=▁visited +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=116851 size=7060 all=1684697 active=94781 piece=▁ceremony +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=116332 size=7080 all=1688028 active=98112 piece=▁recommendations +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=115897 size=7100 all=1691782 active=101866 piece=▁gallery +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=115882 min_freq=1158 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=115461 size=7120 all=1693717 active=86435 piece=▁mand +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=114992 size=7140 all=1697969 active=90687 piece=▁hidden +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=114626 size=7160 all=1706007 active=98725 piece=missions +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=114091 size=7180 all=1712313 active=105031 piece=▁legislation +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=113564 size=7200 all=1715880 active=108598 piece=▁communications +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=113536 min_freq=1129 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=113178 size=7220 all=1721195 active=91092 piece=▁Emer +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=112719 size=7240 all=1727233 active=97130 piece=ingu +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=112313 size=7260 all=1732481 active=102378 piece=odge +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=111785 size=7280 all=1736257 active=106154 piece=▁orange +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=111288 size=7300 all=1738197 active=108094 piece=▁Cro +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=111265 min_freq=1102 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=110733 size=7320 all=1743616 active=92064 piece=▁gym +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=110358 size=7340 all=1745435 active=93883 piece=▁singer +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=109749 size=7360 all=1750949 active=99397 piece=▁vegetables +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=109373 size=7380 all=1756180 active=104628 piece=▁meets +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=108810 size=7400 all=1758436 active=106884 piece=▁reducing +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=108747 min_freq=1077 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=108429 size=7420 all=1762071 active=91546 piece=▁Bell +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=108028 size=7440 all=1765486 active=94961 piece=▁DI +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=107541 size=7460 all=1770871 active=100346 piece=▁performances +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=107136 size=7480 all=1774611 active=104086 piece=▁seats +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=106700 size=7500 all=1780068 active=109543 piece=)|| +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. 
max_freq=106700 min_freq=1052 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=106332 size=7520 all=1782913 active=91783 piece=▁Following +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=105917 size=7540 all=1785210 active=94080 piece=▁Annual +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=105591 size=7560 all=1787651 active=96521 piece=being +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=105338 size=7580 all=1792857 active=101727 piece=▁Partners +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=104806 size=7600 all=1794947 active=103817 piece=▁Shipping +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=104798 min_freq=1033 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=104375 size=7620 all=1799023 active=93737 piece=▁Golden +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=104055 size=7640 all=1801614 active=96328 piece=▁singles +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=103683 size=7660 all=1805274 active=99988 piece=▁vice +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=103237 size=7680 all=1807297 active=102011 piece=▁designers +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=102984 size=7700 all=1811005 active=105719 piece=▁defined +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=102955 min_freq=1014 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=102611 size=7720 all=1814243 active=93773 piece=▁Dark +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=102303 size=7740 all=1818708 active=98238 piece=▁officially +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=101914 size=7760 all=1824210 active=103740 piece=▁unus +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=101405 size=7780 all=1826587 active=106117 piece=▁maintaining +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=101035 size=7800 all=1828336 active=107866 piece=jo +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=101030 min_freq=996 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=100687 size=7820 all=1832755 active=94693 piece=▁licensed +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=100419 size=7840 all=1836195 active=98133 piece=▁colours +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=100140 size=7860 all=1841288 active=103226 piece=▁tap +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=99668 size=7880 all=1843191 active=105129 piece=ilty +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=99239 size=7900 all=1846734 active=108672 piece=▁undert +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=99212 min_freq=977 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=98746 size=7920 all=1852742 active=98263 piece=etch +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=98422 size=7940 all=1854894 active=100415 piece=▁Lic +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=98116 size=7960 all=1859706 active=105227 piece=▁regardless +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=97870 size=7980 all=1863291 active=108812 piece=gent +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=97497 size=8000 all=1868707 active=114228 piece=▁Has +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. 
max_freq=97462 min_freq=955 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=97157 size=8020 all=1872935 active=97375 piece=▁stations +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=96850 size=8040 all=1879003 active=103443 piece=▁Resources +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=96521 size=8060 all=1881506 active=105946 piece=▁warranty +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=96140 size=8080 all=1883530 active=107970 piece=▁factory +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=95850 size=8100 all=1885126 active=109566 piece=EST +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=95811 min_freq=941 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=95440 size=8120 all=1889237 active=97976 piece=inity +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=95123 size=8140 all=1896170 active=104909 piece=▁Edition +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=94856 size=8160 all=1900515 active=109254 piece=▁minister +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=94404 size=8180 all=1903047 active=111786 piece=▁puts +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=94189 size=8200 all=1906962 active=115701 piece=▁electricity +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=94170 min_freq=920 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=93756 size=8220 all=1911838 active=100206 piece=▁Hills +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=93265 size=8240 all=1914886 active=103254 piece=uminum +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=92896 size=8260 all=1917719 active=106087 piece=▁Father +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=92568 size=8280 all=1920014 active=108382 piece=▁vill +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=92217 size=8300 all=1924724 active=113092 piece=▁personnel +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=92196 min_freq=905 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=91876 size=8320 all=1927558 active=99053 piece=▁pig +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=91506 size=8340 all=1931383 active=102878 piece=acular +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=91197 size=8360 all=1934648 active=106143 piece=▁Ever +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=90795 size=8380 all=1941269 active=112764 piece=▁lake +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=90510 size=8400 all=1944623 active=116118 piece=▁Sport +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=90476 min_freq=888 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=90094 size=8420 all=1949359 active=101590 piece=▁languages +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=89718 size=8440 all=1953491 active=105722 piece=▁thinks +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=89405 size=8460 all=1956518 active=108749 piece=▁scores +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=89176 size=8480 all=1962182 active=114413 piece=▁repeated +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=88732 size=8500 all=1966120 active=118351 piece=▁complicated +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. 
max_freq=88611 min_freq=869 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=88436 size=8520 all=1970077 active=102249 piece=▁struck +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=88225 size=8540 all=1972952 active=105124 piece=▁Var +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=87956 size=8560 all=1975213 active=107385 piece=▁newest +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=87630 size=8580 all=1978127 active=110299 piece=▁Championship +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=87246 size=8600 all=1982549 active=114721 piece=▁Reviews +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=87223 min_freq=855 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=86913 size=8620 all=1985768 active=102246 piece=enities +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=86622 size=8640 all=1991032 active=107510 piece=▁Donald +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=86258 size=8660 all=1994679 active=111157 piece=▁southern +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=86076 size=8680 all=1997179 active=113657 piece=▁frustr +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=85775 size=8700 all=1999512 active=115990 piece=▁prayer +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=85766 min_freq=842 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=85564 size=8720 all=2002081 active=102510 piece=▁gap +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=85396 size=8740 all=2006452 active=106881 piece=▁granted +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=85191 size=8760 all=2009628 active=110057 piece=ji +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=84949 size=8780 all=2016980 active=117409 piece=▁Large +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=84712 size=8800 all=2021289 active=121718 piece=▁Style +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=84711 min_freq=825 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=84165 size=8820 all=2024695 active=104276 piece=igious +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=83790 size=8840 all=2031413 active=110994 piece=▁attitude +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=83383 size=8860 all=2033911 active=113492 piece=▁Barn +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=83009 size=8880 all=2038908 active=118489 piece=▁sed +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=82767 size=8900 all=2041675 active=121256 piece=▁containing +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=82742 min_freq=810 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=82454 size=8920 all=2043201 active=103603 piece=▁telephone +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=82213 size=8940 all=2050959 active=111361 piece=▁subsid +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=81858 size=8960 all=2053959 active=114361 piece=▁floors +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=81461 size=8980 all=2058285 active=118687 piece=aga +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=81104 size=9000 all=2061468 active=121870 piece=▁beans +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. 
max_freq=81087 min_freq=796 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=80899 size=9020 all=2064988 active=106561 piece=▁sear +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=80681 size=9040 all=2069157 active=110730 piece=oli +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=80416 size=9060 all=2073607 active=115180 piece=▁explan +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=80237 size=9080 all=2076283 active=117856 piece=SW +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=79999 size=9100 all=2079322 active=120895 piece=utor +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=79975 min_freq=782 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=79758 size=9120 all=2085307 active=109574 piece=▁importantly +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=79509 size=9140 all=2091968 active=116235 piece=▁letting +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=79298 size=9160 all=2095552 active=119819 piece=▁dess +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=79030 size=9180 all=2099098 active=123365 piece=▁por +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=78706 size=9200 all=2103411 active=127678 piece=ILL +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=78696 min_freq=766 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=78512 size=9220 all=2107701 active=109136 piece=▁hide +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=78214 size=9240 all=2112855 active=114290 piece=▁delighted +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=78007 size=9260 all=2118394 active=119829 piece=▁auction +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=77823 size=9280 all=2120416 active=121851 piece=▁therm +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=77661 size=9300 all=2122965 active=124400 piece=View +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=77661 min_freq=752 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=77421 size=9320 all=2128585 active=109256 piece=▁survive +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=77140 size=9340 all=2130533 active=111204 piece=▁coaches +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=76943 size=9360 all=2134520 active=115191 piece=▁pip +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=76678 size=9380 all=2138376 active=119047 piece=mouth +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=76441 size=9400 all=2145055 active=125726 piece=ussion +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=76430 min_freq=738 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=76203 size=9420 all=2151497 active=113523 piece=▁Commercial +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=75956 size=9440 all=2153880 active=115906 piece=▁championship +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=75648 size=9460 all=2160717 active=122743 piece=▁Harris +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=75351 size=9480 all=2166263 active=128289 piece=▁captured +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=75121 size=9500 all=2168439 active=130465 piece=gener +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. 
max_freq=75105 min_freq=723 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=74907 size=9520 all=2170525 active=110002 piece=▁SO +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=74691 size=9540 all=2173676 active=113153 piece=▁Bab +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=74485 size=9560 all=2176375 active=115852 piece=enna +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=74214 size=9580 all=2180709 active=120186 piece=▁sleeping +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=74030 size=9600 all=2184379 active=123856 piece=▁races +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=74028 min_freq=714 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=73826 size=9620 all=2189551 active=114371 piece=▁compan +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=73559 size=9640 all=2194963 active=119783 piece=▁Democrats +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=73369 size=9660 all=2199417 active=124237 piece=▁existence +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=73206 size=9680 all=2204598 active=129418 piece=oper +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=73053 size=9700 all=2207422 active=132242 piece=▁celebrating +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=73033 min_freq=699 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=72874 size=9720 all=2210775 active=113720 piece=▁chairs +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=72638 size=9740 all=2216452 active=119397 piece=attan +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=72447 size=9760 all=2220547 active=123492 piece=▁engineers +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=72304 size=9780 all=2227532 active=130477 piece=▁Kentucky +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=72065 size=9800 all=2231080 active=134025 piece=▁latter +bpe_model_trainer.cc(159) LOG(INFO) Updating active symbols. max_freq=72049 min_freq=685 +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=71874 size=9820 all=2236461 active=116910 piece=▁vulnerable +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=71685 size=9840 all=2238799 active=119248 piece=▁spell +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=71419 size=9860 all=2242179 active=122628 piece=▁Tai +bpe_model_trainer.cc(268) LOG(INFO) Added: freq=71176 size=9880 all=2245004 active=125453 piece=▁toler +trainer_interface.cc(689) LOG(INFO) Saving model: /home/frosty40/parameter-golf-lab/data/tokenizers/fineweb_10240_bpe.model +trainer_interface.cc(701) LOG(INFO) Saving vocabs: /home/frosty40/parameter-golf-lab/data/tokenizers/fineweb_10240_bpe.vocab +Exporting dataset: fineweb10B_sp10240 +fineweb10B_sp10240: 3200000/15368808 docs +fineweb10B_sp10240: 6400000/15368808 docs +fineweb10B_sp10240: 9600000/15368808 docs +fineweb10B_sp10240: 12800000/15368808 docs +Done. Manifest: /home/frosty40/parameter-golf-lab/data/manifest.json +[wrapper] build exited rc=0; restoring index files from backup +[wrapper] index files restored. New tokenizer/dataset (if built) remain in place. 
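For reviewers, a minimal sanity-check sketch of the artifacts the log above reports saving. Assumptions: `sentencepiece` and `numpy` are installed, paths are adjusted to your checkout's data root, and the shard header is the 256×int32 `[magic, version, token_count]` layout suggested by `SHARD_MAGIC = 20240520` / `SHARD_VERSION = 1` in `prepare_sp10240_caseops_data.py` below — this layout is an assumption, not something the log states.

```python
# Hedged verification sketch; not part of the committed submission.
import numpy as np
import sentencepiece as spm

# Base SP10240 BPE model saved by the trainer above (adjust path prefix).
sp = spm.SentencePieceProcessor(model_file="data/tokenizers/fineweb_10240_bpe.model")
assert sp.vocab_size() == 10240  # expected per the SP10240 spec

# Peek at the first exported train shard (assumed header layout).
header = np.fromfile(
    "data/datasets/fineweb10B_sp10240/fineweb_train_000000.bin", dtype="<i4", count=3
)
assert header[0] == 20240520 and header[1] == 1  # magic, version
print("tokens in shard 0:", int(header[2]))
```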
diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/build_sp10240_caseops_local.sh b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/build_sp10240_caseops_local.sh new file mode 100755 index 0000000000..26f2947fb7 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/build_sp10240_caseops_local.sh @@ -0,0 +1,33 @@ +#!/usr/bin/env bash +set -euo pipefail + +cd /home/frosty40/sota_rascal + +PYTHON_BIN="${PYTHON_BIN:-python3}" +DOCS_PATH="${DOCS_PATH:-/home/frosty40/parameter-golf-lab/data/docs_selected.jsonl}" +OUT_ROOT="${OUT_ROOT:-/home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets}" +MAX_TRAIN_SHARDS="${MAX_TRAIN_SHARDS:-80}" +VAL_DOCS="${VAL_DOCS:-50000}" +SHARD_TOKENS="${SHARD_TOKENS:-10000000}" +TOKENIZER_SKIP_DOCS="${TOKENIZER_SKIP_DOCS:-50000}" + +args=( + scripts/prepare_sp10240_caseops_data.py + --docs "${DOCS_PATH}" + --out "${OUT_ROOT}" + --train-tokenizer + --val-docs "${VAL_DOCS}" + --max-train-shards "${MAX_TRAIN_SHARDS}" + --shard-tokens "${SHARD_TOKENS}" + --tokenizer-skip-docs "${TOKENIZER_SKIP_DOCS}" +) + +if [[ -n "${TOKENIZER_TRAIN_DOCS:-}" ]]; then + args+=(--tokenizer-train-docs "${TOKENIZER_TRAIN_DOCS}") +fi + +echo "[sp10240-caseops] start $(date -Iseconds)" +echo "[sp10240-caseops] docs=${DOCS_PATH}" +echo "[sp10240-caseops] out=${OUT_ROOT}" +echo "[sp10240-caseops] max_train_shards=${MAX_TRAIN_SHARDS} val_docs=${VAL_DOCS} shard_tokens=${SHARD_TOKENS}" +exec "${PYTHON_BIN}" "${args[@]}" diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/download_sp10240_first80_from_hf.sh b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/download_sp10240_first80_from_hf.sh new file mode 100755 index 0000000000..9617013de0 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/download_sp10240_first80_from_hf.sh @@ -0,0 +1,39 @@ +#!/usr/bin/env bash +set -euo pipefail + +BASE_URL="https://huggingface.co/datasets/Frosty40/10k_golfer/resolve/main" +DATA_ROOT="${DATA_ROOT:-/workspace/SOTA_FINAL/data}" +DATASET_DIR="$DATA_ROOT/datasets/fineweb10B_sp10240_first80" +TOKENIZER_DIR="$DATA_ROOT/tokenizers" + +mkdir -p "$DATASET_DIR" "$TOKENIZER_DIR" + +download_file() { + local name="$1" + local dest="$2" + if [[ -s "$dest" ]]; then + echo "exists $dest" + return + fi + echo "download $name -> $dest" + curl -fL --retry 8 --retry-all-errors --connect-timeout 20 -C - \ + -o "$dest.part" "$BASE_URL/$name?download=true" + mv "$dest.part" "$dest" +} + +download_file "fineweb_10240_bpe.model" "$TOKENIZER_DIR/fineweb_10240_bpe.model" +download_file "fineweb_10240_bpe.vocab" "$TOKENIZER_DIR/fineweb_10240_bpe.vocab" + +for idx in $(seq 0 79); do + shard="$(printf 'fineweb_train_%06d.bin' "$idx")" + download_file "$shard" "$DATASET_DIR/$shard" +done + +download_file "fineweb_val_000000.bin" "$DATASET_DIR/fineweb_val_000000.bin" + +train_count="$(find "$DATASET_DIR" -maxdepth 1 -name 'fineweb_train_*.bin' | wc -l)" +val_count="$(find "$DATASET_DIR" -maxdepth 1 -name 'fineweb_val_*.bin' | wc -l)" +echo "train_count=$train_count val_count=$val_count" +test "$train_count" -eq 80 +test "$val_count" -eq 1 +test -f "$TOKENIZER_DIR/fineweb_10240_bpe.model" diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/download_sp10240_full124_from_hf.sh 
b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/download_sp10240_full124_from_hf.sh new file mode 100755 index 0000000000..1885ee8d4c --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/download_sp10240_full124_from_hf.sh @@ -0,0 +1,46 @@ +#!/usr/bin/env bash +set -euo pipefail + +BASE_URL="https://huggingface.co/datasets/Frosty40/10k_golfer/resolve/main" +DATA_ROOT="${DATA_ROOT:-/workspace/SOTA_FINAL/data}" +DATASET_DIR="$DATA_ROOT/datasets/fineweb10B_sp10240" +TOKENIZER_DIR="$DATA_ROOT/tokenizers" +FIRST80_DIR="$DATA_ROOT/datasets/fineweb10B_sp10240_first80" + +mkdir -p "$DATASET_DIR" "$TOKENIZER_DIR" + +download_file() { + local name="$1" + local dest="$2" + if [[ -s "$dest" ]]; then + echo "exists $dest" + return + fi + echo "download $name -> $dest" + curl -fL --retry 8 --retry-all-errors --connect-timeout 20 -C - \ + -o "$dest.part" "$BASE_URL/$name?download=true" + mv "$dest.part" "$dest" +} + +download_file "fineweb_10240_bpe.model" "$TOKENIZER_DIR/fineweb_10240_bpe.model" +download_file "fineweb_10240_bpe.vocab" "$TOKENIZER_DIR/fineweb_10240_bpe.vocab" + +for idx in $(seq 0 123); do + shard="$(printf 'fineweb_train_%06d.bin' "$idx")" + if [[ -s "$FIRST80_DIR/$shard" && ! -e "$DATASET_DIR/$shard" ]]; then + mv "$FIRST80_DIR/$shard" "$DATASET_DIR/$shard" + fi + download_file "$shard" "$DATASET_DIR/$shard" +done + +if [[ -s "$FIRST80_DIR/fineweb_val_000000.bin" && ! -e "$DATASET_DIR/fineweb_val_000000.bin" ]]; then + mv "$FIRST80_DIR/fineweb_val_000000.bin" "$DATASET_DIR/fineweb_val_000000.bin" +fi +download_file "fineweb_val_000000.bin" "$DATASET_DIR/fineweb_val_000000.bin" + +train_count="$(find "$DATASET_DIR" -maxdepth 1 -name 'fineweb_train_*.bin' | wc -l)" +val_count="$(find "$DATASET_DIR" -maxdepth 1 -name 'fineweb_val_*.bin' | wc -l)" +echo "train_count=$train_count val_count=$val_count" +test "$train_count" -eq 124 +test "$val_count" -eq 1 +test -f "$TOKENIZER_DIR/fineweb_10240_bpe.model" diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/lossless_caps.py b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/lossless_caps.py new file mode 100644 index 0000000000..98e472f824 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/lossless_caps.py @@ -0,0 +1,833 @@ +"""Lossless capitalization pre-encoding helpers. + +This module provides a narrow, reversible transform that only touches +ASCII capital letters `A-Z`. Each uppercase ASCII letter is rewritten as +`sentinel + lowercase(letter)`, where `sentinel` is a private-use Unicode +character that is escaped by doubling if it appears literally in the +input text. + +Example with the default sentinel `\\uE000`: + + "The NASA Launch" -> "\\uE000the \\uE000n\\uE000a\\uE000s\\uE000a \\uE000launch" + +The transform is intentionally simple for v1: + +- lowercase ASCII letters are unchanged +- uppercase ASCII letters become sentinel + lowercase letter +- non-ASCII characters are left untouched +- literal sentinel characters are escaped as sentinel + sentinel + +This makes the transform exactly invertible while allowing a downstream +tokenizer to reuse lowercase subwords across case variants.
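A minimal doctest-style round-trip check (the names are the functions defined in this module; the property exercised is the exact invertibility stated above):

    >>> text = "The NASA Launch"
    >>> decode_lossless_caps_v1(encode_lossless_caps_v1(text)) == text
    True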
+""" + +from __future__ import annotations + +import json +from pathlib import Path +from typing import Callable, Iterable + +LOSSLESS_CAPS_V1 = "lossless_caps_v1" +LOSSLESS_CAPS_V2 = "lossless_caps_v2" +LOSSLESS_CAPS_V3 = "lossless_caps_v3" +LOSSLESS_CAPS_V4 = "lossless_caps_v4" +LOSSLESS_CAPS_V5 = "lossless_caps_v5" +LOSSLESS_CAPS_V6 = "lossless_caps_v6" +LOSSLESS_CAPS_V7 = "lossless_caps_v7" +LOSSLESS_CAPS_CASEOPS_V1 = "lossless_caps_caseops_v1" +IDENTITY = "identity" +DEFAULT_SENTINEL = "\uE000" +DEFAULT_V2_TITLE = "\uE001" +DEFAULT_V2_ALLCAPS = "\uE002" +DEFAULT_V2_CAPNEXT = "\uE003" +DEFAULT_V2_ESC = "\uE004" +DEFAULT_V5_TITLE_MIN_LEN = 7 +DEFAULT_V6_ALLCAPS_MIN_LEN = 3 +DEFAULT_V7_ALLCAPS_MIN_LEN = 4 + + +class LosslessCapsError(ValueError): + """Raised when a transformed string is malformed.""" + + +def _is_ascii_upper(ch: str) -> bool: + return "A" <= ch <= "Z" + + +def _is_ascii_lower(ch: str) -> bool: + return "a" <= ch <= "z" + + +def _is_ascii_alpha(ch: str) -> bool: + return _is_ascii_lower(ch) or _is_ascii_upper(ch) + + +def _validate_distinct_single_chars(*chars: str) -> None: + if any(len(ch) != 1 for ch in chars): + raise ValueError("all control characters must be exactly one character") + if len(set(chars)) != len(chars): + raise ValueError("control characters must be distinct") + + +def encode_lossless_caps_v1(text: str, *, sentinel: str = DEFAULT_SENTINEL) -> str: + """Encode ASCII capitals reversibly using a one-character sentinel.""" + if len(sentinel) != 1: + raise ValueError("sentinel must be exactly one character") + out: list[str] = [] + for ch in text: + if ch == sentinel: + out.append(sentinel) + out.append(sentinel) + elif _is_ascii_upper(ch): + out.append(sentinel) + out.append(ch.lower()) + else: + out.append(ch) + return "".join(out) + + +def decode_lossless_caps_v1(text: str, *, sentinel: str = DEFAULT_SENTINEL) -> str: + """Decode the `lossless_caps_v1` transform back to the original text.""" + if len(sentinel) != 1: + raise ValueError("sentinel must be exactly one character") + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch != sentinel: + out.append(ch) + i += 1 + continue + if i + 1 >= n: + raise LosslessCapsError("dangling capitalization sentinel at end of string") + nxt = text[i + 1] + if nxt == sentinel: + out.append(sentinel) + elif _is_ascii_lower(nxt): + out.append(nxt.upper()) + else: + raise LosslessCapsError( + f"invalid sentinel escape sequence {sentinel + nxt!r}; " + "expected doubled sentinel or sentinel + lowercase ASCII letter" + ) + i += 2 + return "".join(out) + + +def encode_lossless_caps_v2( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + capnext: str = DEFAULT_V2_CAPNEXT, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Encode ASCII word capitalization with cheap word-level markers. 
+ + Rules over maximal ASCII alphabetic runs: + - lowercase words stay unchanged + - TitleCase words become `title + lowercase(word)` + - ALLCAPS words become `allcaps + lowercase(word)` + - mixed-case words use: + - optional `title` when the first letter is uppercase + - `capnext + lowercase(letter)` for subsequent uppercase letters + - literal control characters are escaped as `esc + literal` + """ + _validate_distinct_single_chars(title, allcaps, capnext, esc) + controls = {title, allcaps, capnext, esc} + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch in controls: + out.append(esc) + out.append(ch) + i += 1 + continue + if not _is_ascii_alpha(ch): + out.append(ch) + i += 1 + continue + + j = i + 1 + while j < n and _is_ascii_alpha(text[j]): + j += 1 + word = text[i:j] + lower_word = word.lower() + + if word.islower(): + out.append(word) + elif len(word) >= 2 and word.isupper(): + out.append(allcaps) + out.append(lower_word) + elif _is_ascii_upper(word[0]) and word[1:].islower(): + out.append(title) + out.append(lower_word) + else: + if _is_ascii_upper(word[0]): + out.append(title) + out.append(lower_word[0]) + for orig_ch, lower_ch in zip(word[1:], lower_word[1:], strict=True): + if _is_ascii_upper(orig_ch): + out.append(capnext) + out.append(lower_ch) + i = j + return "".join(out) + + +def decode_lossless_caps_v2( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + capnext: str = DEFAULT_V2_CAPNEXT, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v2` transform back to the original text.""" + _validate_distinct_single_chars(title, allcaps, capnext, esc) + out: list[str] = [] + pending_escape = False + pending_word_mode: str | None = None + active_allcaps = False + pending_capnext = False + in_ascii_word = False + + for ch in text: + if pending_escape: + if pending_word_mode is not None and not _is_ascii_alpha(ch): + raise LosslessCapsError("escaped control char cannot satisfy pending word capitalization mode") + out.append(ch) + pending_escape = False + if _is_ascii_alpha(ch): + in_ascii_word = True + else: + in_ascii_word = False + active_allcaps = False + continue + + if ch == esc: + pending_escape = True + continue + if ch == title: + if pending_word_mode is not None or in_ascii_word or pending_capnext: + raise LosslessCapsError("invalid title marker placement") + pending_word_mode = "title" + continue + if ch == allcaps: + if pending_word_mode is not None or in_ascii_word or pending_capnext: + raise LosslessCapsError("invalid allcaps marker placement") + pending_word_mode = "allcaps" + continue + if ch == capnext: + if pending_capnext: + raise LosslessCapsError("duplicate capnext marker") + pending_capnext = True + continue + + if _is_ascii_alpha(ch): + at_word_start = not in_ascii_word + if at_word_start: + if pending_word_mode == "allcaps": + out.append(ch.upper()) + active_allcaps = True + elif pending_word_mode == "title": + out.append(ch.upper()) + elif pending_capnext: + out.append(ch.upper()) + else: + out.append(ch) + pending_word_mode = None + pending_capnext = False + in_ascii_word = True + continue + + if pending_word_mode is not None: + raise LosslessCapsError("word capitalization marker leaked into the middle of a word") + if active_allcaps: + out.append(ch.upper()) + elif pending_capnext: + out.append(ch.upper()) + else: + out.append(ch) + pending_capnext = False + continue + + if pending_word_mode is not None or pending_capnext: + raise LosslessCapsError("capitalization 
marker not followed by an ASCII letter") + out.append(ch) + in_ascii_word = False + active_allcaps = False + + if pending_escape: + raise LosslessCapsError("dangling escape marker at end of string") + if pending_word_mode is not None or pending_capnext: + raise LosslessCapsError("dangling capitalization marker at end of string") + return "".join(out) + + +def encode_lossless_caps_v3( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Encode only common word-level capitalization patterns. + + Rules over maximal ASCII alphabetic runs: + - lowercase words stay unchanged + - TitleCase words become `title + lowercase(word)` + - ALLCAPS words become `allcaps + lowercase(word)` + - all other mixed-case words are left unchanged + - literal control characters are escaped as `esc + literal` + """ + _validate_distinct_single_chars(title, allcaps, esc) + controls = {title, allcaps, esc} + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch in controls: + out.append(esc) + out.append(ch) + i += 1 + continue + if not _is_ascii_alpha(ch): + out.append(ch) + i += 1 + continue + + j = i + 1 + while j < n and _is_ascii_alpha(text[j]): + j += 1 + word = text[i:j] + + if word.islower(): + out.append(word) + elif len(word) >= 2 and word.isupper(): + out.append(allcaps) + out.append(word.lower()) + elif _is_ascii_upper(word[0]) and word[1:].islower(): + out.append(title) + out.append(word.lower()) + else: + out.append(word) + i = j + return "".join(out) + + +def decode_lossless_caps_v3( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v3` transform back to the original text.""" + _validate_distinct_single_chars(title, allcaps, esc) + out: list[str] = [] + pending_escape = False + pending_word_mode: str | None = None + active_allcaps = False + in_ascii_word = False + + for ch in text: + if pending_escape: + if pending_word_mode is not None and not _is_ascii_alpha(ch): + raise LosslessCapsError("escaped control char cannot satisfy pending word capitalization mode") + out.append(ch) + pending_escape = False + if _is_ascii_alpha(ch): + in_ascii_word = True + else: + in_ascii_word = False + active_allcaps = False + continue + + if ch == esc: + pending_escape = True + continue + if ch == title: + if pending_word_mode is not None or in_ascii_word: + raise LosslessCapsError("invalid title marker placement") + pending_word_mode = "title" + continue + if ch == allcaps: + if pending_word_mode is not None or in_ascii_word: + raise LosslessCapsError("invalid allcaps marker placement") + pending_word_mode = "allcaps" + continue + + if _is_ascii_alpha(ch): + at_word_start = not in_ascii_word + if at_word_start: + if pending_word_mode == "allcaps": + out.append(ch.upper()) + active_allcaps = True + elif pending_word_mode == "title": + out.append(ch.upper()) + else: + out.append(ch) + pending_word_mode = None + in_ascii_word = True + continue + + if pending_word_mode is not None: + raise LosslessCapsError("word capitalization marker leaked into the middle of a word") + out.append(ch.upper() if active_allcaps else ch) + continue + + if pending_word_mode is not None: + raise LosslessCapsError("capitalization marker not followed by an ASCII letter") + out.append(ch) + in_ascii_word = False + active_allcaps = False + + if pending_escape: + raise LosslessCapsError("dangling escape marker at end of string") + if 
pending_word_mode is not None: + raise LosslessCapsError("dangling capitalization marker at end of string") + return "".join(out) + + +def encode_lossless_caps_v4( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Encode only ALLCAPS ASCII words, leaving all other case untouched.""" + _validate_distinct_single_chars(allcaps, esc) + controls = {allcaps, esc} + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch in controls: + out.append(esc) + out.append(ch) + i += 1 + continue + if not _is_ascii_alpha(ch): + out.append(ch) + i += 1 + continue + j = i + 1 + while j < n and _is_ascii_alpha(text[j]): + j += 1 + word = text[i:j] + if len(word) >= 2 and word.isupper(): + out.append(allcaps) + out.append(word.lower()) + else: + out.append(word) + i = j + return "".join(out) + + +def decode_lossless_caps_v4( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v4` transform back to the original text.""" + _validate_distinct_single_chars(allcaps, esc) + out: list[str] = [] + pending_escape = False + pending_allcaps = False + in_ascii_word = False + active_allcaps = False + + for ch in text: + if pending_escape: + if pending_allcaps and not _is_ascii_alpha(ch): + raise LosslessCapsError("escaped control char cannot satisfy pending allcaps mode") + out.append(ch) + pending_escape = False + if _is_ascii_alpha(ch): + in_ascii_word = True + else: + in_ascii_word = False + active_allcaps = False + continue + + if ch == esc: + pending_escape = True + continue + if ch == allcaps: + if pending_allcaps or in_ascii_word: + raise LosslessCapsError("invalid allcaps marker placement") + pending_allcaps = True + continue + + if _is_ascii_alpha(ch): + if not in_ascii_word: + active_allcaps = pending_allcaps + pending_allcaps = False + in_ascii_word = True + out.append(ch.upper() if active_allcaps else ch) + continue + + if pending_allcaps: + raise LosslessCapsError("allcaps marker not followed by an ASCII letter") + out.append(ch) + in_ascii_word = False + active_allcaps = False + + if pending_escape: + raise LosslessCapsError("dangling escape marker at end of string") + if pending_allcaps: + raise LosslessCapsError("dangling allcaps marker at end of string") + return "".join(out) + + +def encode_lossless_caps_v5( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, + title_min_len: int = DEFAULT_V5_TITLE_MIN_LEN, +) -> str: + """Encode ALLCAPS words and only sufficiently long TitleCase words.""" + _validate_distinct_single_chars(title, allcaps, esc) + controls = {title, allcaps, esc} + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch in controls: + out.append(esc) + out.append(ch) + i += 1 + continue + if not _is_ascii_alpha(ch): + out.append(ch) + i += 1 + continue + j = i + 1 + while j < n and _is_ascii_alpha(text[j]): + j += 1 + word = text[i:j] + if len(word) >= 2 and word.isupper(): + out.append(allcaps) + out.append(word.lower()) + elif len(word) >= title_min_len and _is_ascii_upper(word[0]) and word[1:].islower(): + out.append(title) + out.append(word.lower()) + else: + out.append(word) + i = j + return "".join(out) + + +def decode_lossless_caps_v5( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v5` transform back to the original text.""" + return 
decode_lossless_caps_v3(text, title=title, allcaps=allcaps, esc=esc) + + +def encode_lossless_caps_v6( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, + allcaps_min_len: int = DEFAULT_V6_ALLCAPS_MIN_LEN, +) -> str: + """Encode only ALLCAPS words with length >= allcaps_min_len.""" + _validate_distinct_single_chars(allcaps, esc) + controls = {allcaps, esc} + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch in controls: + out.append(esc) + out.append(ch) + i += 1 + continue + if not _is_ascii_alpha(ch): + out.append(ch) + i += 1 + continue + j = i + 1 + while j < n and _is_ascii_alpha(text[j]): + j += 1 + word = text[i:j] + if len(word) >= allcaps_min_len and word.isupper(): + out.append(allcaps) + out.append(word.lower()) + else: + out.append(word) + i = j + return "".join(out) + + +def decode_lossless_caps_v6( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v6` transform back to the original text.""" + return decode_lossless_caps_v4(text, allcaps=allcaps, esc=esc) + + +def encode_lossless_caps_v7( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, + allcaps_min_len: int = DEFAULT_V7_ALLCAPS_MIN_LEN, +) -> str: + """Encode only ALLCAPS words with length >= 4.""" + return encode_lossless_caps_v6( + text, + allcaps=allcaps, + esc=esc, + allcaps_min_len=allcaps_min_len, + ) + + +def decode_lossless_caps_v7( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v7` transform back to the original text.""" + return decode_lossless_caps_v6(text, allcaps=allcaps, esc=esc) + + +def get_text_transform(name: str | None) -> Callable[[str], str]: + """Return the forward text transform for the given config name.""" + normalized = IDENTITY if name in {None, "", IDENTITY} else str(name) + if normalized == IDENTITY: + return lambda text: text + if normalized == LOSSLESS_CAPS_V1: + return encode_lossless_caps_v1 + if normalized == LOSSLESS_CAPS_V2: + return encode_lossless_caps_v2 + if normalized == LOSSLESS_CAPS_V3: + return encode_lossless_caps_v3 + if normalized == LOSSLESS_CAPS_V4: + return encode_lossless_caps_v4 + if normalized == LOSSLESS_CAPS_V5: + return encode_lossless_caps_v5 + if normalized == LOSSLESS_CAPS_V6: + return encode_lossless_caps_v6 + if normalized == LOSSLESS_CAPS_V7: + return encode_lossless_caps_v7 + if normalized == LOSSLESS_CAPS_CASEOPS_V1: + return encode_lossless_caps_v2 + raise ValueError(f"unsupported text_transform={name!r}") + + +def get_text_inverse_transform(name: str | None) -> Callable[[str], str]: + """Return the inverse transform for the given config name.""" + normalized = IDENTITY if name in {None, "", IDENTITY} else str(name) + if normalized == IDENTITY: + return lambda text: text + if normalized == LOSSLESS_CAPS_V1: + return decode_lossless_caps_v1 + if normalized == LOSSLESS_CAPS_V2: + return decode_lossless_caps_v2 + if normalized == LOSSLESS_CAPS_V3: + return decode_lossless_caps_v3 + if normalized == LOSSLESS_CAPS_V4: + return decode_lossless_caps_v4 + if normalized == LOSSLESS_CAPS_V5: + return decode_lossless_caps_v5 + if normalized == LOSSLESS_CAPS_V6: + return decode_lossless_caps_v6 + if normalized == LOSSLESS_CAPS_V7: + return decode_lossless_caps_v7 + if normalized == LOSSLESS_CAPS_CASEOPS_V1: + return decode_lossless_caps_v2 + raise ValueError(f"unsupported text_transform={name!r}") + + +def 
normalize_text_transform_name(name: str | None) -> str: + """Normalize empty/None transform names to the identity transform.""" + return IDENTITY if name in {None, "", IDENTITY} else str(name) + + +def get_text_transform_control_symbols(name: str | None) -> list[str]: + """Return reserved control symbols used by a transform, if any.""" + normalized = normalize_text_transform_name(name) + if normalized == IDENTITY: + return [] + if normalized == LOSSLESS_CAPS_V1: + return [DEFAULT_SENTINEL] + if normalized == LOSSLESS_CAPS_V2: + return [DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_CAPNEXT, DEFAULT_V2_ESC] + if normalized == LOSSLESS_CAPS_CASEOPS_V1: + return [DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_CAPNEXT, DEFAULT_V2_ESC] + if normalized in {LOSSLESS_CAPS_V3, LOSSLESS_CAPS_V5}: + return [DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_ESC] + if normalized in {LOSSLESS_CAPS_V4, LOSSLESS_CAPS_V6, LOSSLESS_CAPS_V7}: + return [DEFAULT_V2_ALLCAPS, DEFAULT_V2_ESC] + raise ValueError(f"unsupported text_transform={name!r}") + + +def infer_text_transform_from_manifest(tokenizer_path: str | Path) -> str: + """Best-effort lookup of a tokenizer's text transform from a local manifest.""" + tokenizer_path = Path(tokenizer_path).expanduser().resolve() + manifest_candidates = [ + tokenizer_path.parent.parent / "manifest.json", + tokenizer_path.parent / "manifest.json", + ] + for manifest_path in manifest_candidates: + if not manifest_path.is_file(): + continue + try: + payload = json.loads(manifest_path.read_text(encoding="utf-8")) + except (OSError, json.JSONDecodeError): + continue + tokenizers = payload.get("tokenizers") + if not isinstance(tokenizers, list): + continue + for tokenizer_meta in tokenizers: + if not isinstance(tokenizer_meta, dict): + continue + model_path = tokenizer_meta.get("model_path") or tokenizer_meta.get("path") + if not model_path: + continue + candidate = (manifest_path.parent / str(model_path)).resolve() + if candidate == tokenizer_path: + return normalize_text_transform_name(tokenizer_meta.get("text_transform")) + return IDENTITY + + +def surface_piece_original_byte_counts( + surfaces: Iterable[str], + *, + text_transform_name: str | None = None, + sentinel: str = DEFAULT_SENTINEL, +) -> list[int]: + """Return exact original UTF-8 byte counts contributed by each surface piece. + + `surfaces` must be the exact decoded text fragments emitted by SentencePiece + in order, e.g. `piece.surface` from `encode_as_immutable_proto`. 
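A hedged doctest-style sketch: under the identity transform each surface costs its raw UTF-8 length, and under `lossless_caps_v1` the sentinel itself is free while each marked letter costs its single original byte:

    >>> surface_piece_original_byte_counts(["Hel", "lo"], text_transform_name=None)
    [3, 2]
    >>> surface_piece_original_byte_counts(
    ...     [chr(0xE000) + "hel", "lo"], text_transform_name="lossless_caps_v1")
    [3, 2]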
+ """ + normalized = normalize_text_transform_name(text_transform_name) + if normalized == IDENTITY: + return [len(surface.encode("utf-8")) for surface in surfaces] + if normalized == LOSSLESS_CAPS_V1: + if len(sentinel) != 1: + raise ValueError("sentinel must be exactly one character") + sentinel_bytes = len(sentinel.encode("utf-8")) + pending_sentinel = False + counts: list[int] = [] + for surface in surfaces: + piece_bytes = 0 + for ch in surface: + if pending_sentinel: + if ch == sentinel: + piece_bytes += sentinel_bytes + elif _is_ascii_lower(ch): + piece_bytes += 1 + else: + raise LosslessCapsError( + f"invalid continuation {ch!r} after capitalization sentinel" + ) + pending_sentinel = False + continue + if ch == sentinel: + pending_sentinel = True + else: + piece_bytes += len(ch.encode("utf-8")) + counts.append(piece_bytes) + if pending_sentinel: + raise LosslessCapsError("dangling capitalization sentinel across piece boundary") + return counts + if normalized not in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_V3, LOSSLESS_CAPS_V4, LOSSLESS_CAPS_V5, LOSSLESS_CAPS_V6, LOSSLESS_CAPS_V7, LOSSLESS_CAPS_CASEOPS_V1}: + raise ValueError(f"unsupported text_transform={text_transform_name!r}") + + title = DEFAULT_V2_TITLE + allcaps = DEFAULT_V2_ALLCAPS + capnext = DEFAULT_V2_CAPNEXT + esc = DEFAULT_V2_ESC + if normalized in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_CASEOPS_V1}: + _validate_distinct_single_chars(title, allcaps, capnext, esc) + elif normalized in {LOSSLESS_CAPS_V4, LOSSLESS_CAPS_V6, LOSSLESS_CAPS_V7}: + _validate_distinct_single_chars(allcaps, esc) + else: + _validate_distinct_single_chars(title, allcaps, esc) + pending_escape = False + pending_word_mode: str | None = None + active_allcaps = False + pending_capnext = False + in_ascii_word = False + counts: list[int] = [] + for surface in surfaces: + piece_bytes = 0 + for ch in surface: + if pending_escape: + if pending_word_mode is not None and not _is_ascii_alpha(ch): + raise LosslessCapsError("escaped control char cannot satisfy pending word capitalization mode") + piece_bytes += len(ch.encode("utf-8")) + pending_escape = False + if _is_ascii_alpha(ch): + in_ascii_word = True + else: + in_ascii_word = False + active_allcaps = False + continue + if ch == esc: + pending_escape = True + continue + if normalized in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_V3, LOSSLESS_CAPS_V5, LOSSLESS_CAPS_CASEOPS_V1} and ch == title: + if pending_word_mode is not None or in_ascii_word or pending_capnext: + raise LosslessCapsError("invalid title marker placement") + pending_word_mode = "title" + continue + if ch == allcaps: + if pending_word_mode is not None or in_ascii_word or pending_capnext: + raise LosslessCapsError("invalid allcaps marker placement") + pending_word_mode = "allcaps" + continue + if normalized in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_CASEOPS_V1} and ch == capnext: + if pending_capnext: + raise LosslessCapsError("duplicate capnext marker") + pending_capnext = True + continue + + if _is_ascii_alpha(ch): + at_word_start = not in_ascii_word + if at_word_start: + piece_bytes += 1 + active_allcaps = pending_word_mode == "allcaps" + pending_word_mode = None + pending_capnext = False + in_ascii_word = True + continue + if pending_word_mode is not None: + raise LosslessCapsError("word capitalization marker leaked into the middle of a word") + piece_bytes += 1 + pending_capnext = False + continue + + if pending_word_mode is not None or pending_capnext: + raise LosslessCapsError("capitalization marker not followed by an ASCII letter") + piece_bytes += 
len(ch.encode("utf-8")) + in_ascii_word = False + active_allcaps = False + counts.append(piece_bytes) + if pending_escape: + raise LosslessCapsError("dangling escape marker across piece boundary") + if pending_word_mode is not None or pending_capnext: + raise LosslessCapsError("dangling capitalization marker across piece boundary") + return counts diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/prepare_sp10240_caseops_data.py b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/prepare_sp10240_caseops_data.py new file mode 100755 index 0000000000..d657dd7228 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/prepare_sp10240_caseops_data.py @@ -0,0 +1,508 @@ +#!/usr/bin/env python3 +"""Build SP10240 CaseOps tokenizer/shards with validation byte sidecars. + +This is a local data-build lane. It does not launch training. + +Defaults follow the PR1855/PR1797 CaseOps data shape: +- lossless_caps_caseops_v1 transform from the PR1855 source lane +- reserved SentencePiece user symbols U+E001..U+E004 +- uint16 header-prefixed shards +- BOS per document +- validation byte sidecars named fineweb_val_bytes_*.bin +- 10,000,000 tokens per shard unless overridden +""" + +from __future__ import annotations + +import argparse +import hashlib +import importlib.util +import json +import pathlib +import sys +import time +from array import array +from typing import Any, Callable, Iterable + +import numpy as np +import sentencepiece as spm + + +REPO_ROOT = pathlib.Path(__file__).resolve().parents[1] +PR1855_SOURCE_DIR = REPO_ROOT / "legs" / "2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x" + +SHARD_MAGIC = 20240520 +SHARD_VERSION = 1 +DEFAULT_SHARD_TOKENS = 10_000_000 +DEFAULT_VAL_DOCS = 50_000 +BOS_ID = 1 +VOCAB_SIZE = 10_240 +TOKENIZER_BASENAME = "fineweb_10240_bpe_lossless_caps_caseops_v1_reserved" +DATASET_NAME = "fineweb10B_sp10240_lossless_caps_caseops_v1_reserved" +CASEOPS_SYMBOLS = [chr(0xE001), chr(0xE002), chr(0xE003), chr(0xE004)] + + +def _sha256(path: pathlib.Path) -> str: + h = hashlib.sha256() + with path.open("rb") as fh: + for chunk in iter(lambda: fh.read(1024 * 1024), b""): + h.update(chunk) + return h.hexdigest() + + +def _load_json_if_exists(path: pathlib.Path) -> dict[str, Any] | None: + if not path.is_file(): + return None + with path.open("r", encoding="utf-8") as fh: + payload = json.load(fh) + if not isinstance(payload, dict): + raise ValueError(f"expected JSON object: {path}") + return payload + + +def _load_lossless_caps(source_dir: pathlib.Path): + module_path = source_dir / "lossless_caps.py" + if not module_path.is_file(): + raise FileNotFoundError(module_path) + spec = importlib.util.spec_from_file_location("caseops_lossless_caps", module_path) + if spec is None or spec.loader is None: + raise RuntimeError(f"cannot load {module_path}") + module = importlib.util.module_from_spec(spec) + spec.loader.exec_module(module) + return module, module_path + + +def _write_shard(out_path: pathlib.Path, values: array) -> None: + if not values: + return + out_path.parent.mkdir(parents=True, exist_ok=True) + arr = np.asarray(values, dtype=" Iterable[str]: + with docs_path.open("r", encoding="utf-8") as fh: + for line in fh: + line = line.strip() + if not line: + continue + obj = json.loads(line) + yield obj["text"] if isinstance(obj, dict) else obj + + +def _iter_caseops_training_text( + docs_path: pathlib.Path, + transform: Callable[[str], str], + *, + 
skip_docs: int, + max_docs: int | None, +) -> Iterable[str]: + yielded = 0 + for doc_index, text in enumerate(_iter_docs(docs_path)): + if doc_index < skip_docs: + continue + text = text.replace("\x00", " ").strip() + if not text: + continue + yield transform(text) + yielded += 1 + if max_docs is not None and yielded >= max_docs: + return + + +def _train_caseops_tokenizer( + *, + docs_path: pathlib.Path, + model_prefix: pathlib.Path, + transform: Callable[[str], str], + skip_docs: int, + max_docs: int | None, +) -> None: + model_path = model_prefix.with_suffix(".model") + vocab_path = model_prefix.with_suffix(".vocab") + if model_path.exists() or vocab_path.exists(): + raise FileExistsError(f"refusing to overwrite existing tokenizer artifacts at {model_prefix}.*") + model_prefix.parent.mkdir(parents=True, exist_ok=True) + print( + json.dumps( + { + "event": "train_tokenizer_start", + "model_prefix": str(model_prefix), + "vocab_size": VOCAB_SIZE, + "tokenizer_skip_docs": skip_docs, + "tokenizer_train_docs": max_docs, + "user_defined_symbols_hex": [hex(ord(s)) for s in CASEOPS_SYMBOLS], + }, + sort_keys=True, + ), + flush=True, + ) + spm.SentencePieceTrainer.train( + sentence_iterator=_iter_caseops_training_text( + docs_path, + transform, + skip_docs=skip_docs, + max_docs=max_docs, + ), + model_prefix=str(model_prefix), + model_type="bpe", + vocab_size=VOCAB_SIZE, + character_coverage=0.999, + byte_fallback=True, + split_digits=True, + normalization_rule_name="nmt_nfkc", + add_dummy_prefix=False, + pad_id=0, + bos_id=1, + eos_id=2, + unk_id=3, + hard_vocab_limit=False, + user_defined_symbols=CASEOPS_SYMBOLS, + ) + + +def _validate_tokenizer(sp: spm.SentencePieceProcessor, sp_path: pathlib.Path) -> None: + if int(sp.vocab_size()) != VOCAB_SIZE: + raise ValueError(f"{sp_path} vocab_size={sp.vocab_size()} != {VOCAB_SIZE}") + for offset, symbol in enumerate(CASEOPS_SYMBOLS, start=4): + token_id = int(sp.piece_to_id(symbol)) + if token_id != offset: + raise ValueError( + f"{sp_path} does not reserve CaseOps symbol {hex(ord(symbol))} at id {offset}; got {token_id}" + ) + + +def _encode_val_doc( + sp: spm.SentencePieceProcessor, + transformed_text: str, + *, + byte_counter: Callable[..., list[int]], + transform_name: str, +) -> tuple[list[int], list[int]]: + proto = sp.encode_as_immutable_proto(transformed_text) + token_ids = [BOS_ID] + token_ids.extend(int(piece.id) for piece in proto.pieces) + byte_counts = [0] + byte_counts.extend( + int(v) + for v in byte_counter( + (piece.surface for piece in proto.pieces), + text_transform_name=transform_name, + ) + ) + if len(token_ids) != len(byte_counts): + raise ValueError(f"token/byte sidecar length mismatch: {len(token_ids)} != {len(byte_counts)}") + too_large = [v for v in byte_counts if v > 0xFFFF] + if too_large: + raise ValueError(f"byte sidecar value exceeds uint16: {too_large[0]}") + return token_ids, byte_counts + + +def _append_uint16(buf: array, values: Iterable[int]) -> None: + for value in values: + if value < 0 or value > 0xFFFF: + raise ValueError(f"value outside uint16 range: {value}") + buf.append(int(value)) + + +def _flush_full_shards( + *, + buf: array, + out_dir: pathlib.Path, + prefix: str, + shard_index: int, + shard_tokens: int, +) -> int: + while len(buf) >= shard_tokens: + shard = array("H", buf[:shard_tokens]) + _write_shard(out_dir / f"{prefix}_{shard_index:06d}.bin", shard) + del buf[:shard_tokens] + shard_index += 1 + return shard_index + + +def _flush_tail(*, buf: array, out_dir: pathlib.Path, prefix: str, shard_index: int) 
-> int: + if buf: + _write_shard(out_dir / f"{prefix}_{shard_index:06d}.bin", buf) + del buf[:] + shard_index += 1 + return shard_index + + +def _fail_if_existing_outputs(dataset_dir: pathlib.Path) -> None: + patterns = ("fineweb_train_*.bin", "fineweb_val_*.bin", "fineweb_val_bytes_*.bin") + existing: list[pathlib.Path] = [] + for pattern in patterns: + existing.extend(sorted(dataset_dir.glob(pattern))) + if existing: + sample = ", ".join(str(p) for p in existing[:5]) + raise FileExistsError(f"refusing to overwrite existing shard outputs in {dataset_dir}; sample: {sample}") + + +def _write_manifest(path: pathlib.Path, payload: dict[str, Any]) -> None: + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n", encoding="utf-8") + + +def build_shards( + *, + docs_path: pathlib.Path, + out_root: pathlib.Path, + sp_path: pathlib.Path, + source_dir: pathlib.Path, + val_docs: int, + max_train_shards: int, + shard_tokens: int, + tokenizer_skip_docs: int, + tokenizer_train_docs: int | None, + trained_tokenizer: bool, +) -> pathlib.Path: + caps, caps_path = _load_lossless_caps(source_dir) + sp = spm.SentencePieceProcessor(model_file=str(sp_path)) + _validate_tokenizer(sp, sp_path) + + dataset_dir = out_root / "datasets" / DATASET_NAME + _fail_if_existing_outputs(dataset_dir) + tokenizers_dir = out_root / "tokenizers" + tokenizers_dir.mkdir(parents=True, exist_ok=True) + + docs_sidecar = _load_json_if_exists(docs_path.with_name(f"{docs_path.stem}.source_manifest.json")) + started_at = time.strftime("%Y-%m-%dT%H:%M:%S%z") + manifest: dict[str, Any] = { + "label": "new_experiment", + "description": "SP10240 CaseOps tokenizer plus PR1855-style validation byte sidecars.", + "started_at": started_at, + "docs_path": str(docs_path), + "docs_sidecar": docs_sidecar, + "lossless_caps_path": str(caps_path), + "lossless_caps_sha256": _sha256(caps_path), + "output_root": str(out_root), + "dataset_name": DATASET_NAME, + "dataset_path": str(dataset_dir), + "tokenizer_model": str(sp_path), + "tokenizer_vocab": str(sp_path.with_suffix(".vocab")), + "tokenizer_model_sha256": _sha256(sp_path), + "tokenizer_vocab_sha256": _sha256(sp_path.with_suffix(".vocab")) if sp_path.with_suffix(".vocab").is_file() else None, + "tokenizer_trained_in_this_run": trained_tokenizer, + "tokenizer_training_spec": { + "vocab_size": VOCAB_SIZE, + "model_type": "bpe", + "character_coverage": 0.999, + "byte_fallback": True, + "split_digits": True, + "normalization_rule_name": "nmt_nfkc", + "add_dummy_prefix": False, + "pad_id": 0, + "bos_id": 1, + "eos_id": 2, + "unk_id": 3, + "hard_vocab_limit": False, + "user_defined_symbols_hex": [hex(ord(s)) for s in CASEOPS_SYMBOLS], + "tokenizer_skip_docs": tokenizer_skip_docs, + "tokenizer_train_docs": tokenizer_train_docs, + }, + "shard_spec": { + "magic": SHARD_MAGIC, + "version": SHARD_VERSION, + "dtype": "uint16", + "shard_tokens": shard_tokens, + "val_docs": val_docs, + "max_train_shards": max_train_shards, + "bos_id": BOS_ID, + "byte_sidecars": "validation_only", + }, + "stats": { + "docs_total": 0, + "docs_val": 0, + "docs_train": 0, + "tokens_val": 0, + "tokens_train": 0, + "bytes_sidecar_tokens_val": 0, + "files_val": 0, + "files_val_bytes": 0, + "files_train": 0, + }, + } + _write_manifest(out_root / "caseops_manifest.in_progress.json", manifest) + + val_buf_tokens = array("H") + val_buf_bytes = array("H") + train_buf = array("H") + val_written = 0 + val_bytes_written = 0 + train_written = 0 + val_tail_flushed = False + + for 
text in _iter_docs(docs_path): + doc_index = int(manifest["stats"]["docs_total"]) + transformed = caps.encode_lossless_caps_v2(text) + if doc_index < val_docs: + token_ids, byte_counts = _encode_val_doc( + sp, + transformed, + byte_counter=caps.surface_piece_original_byte_counts, + transform_name=caps.LOSSLESS_CAPS_CASEOPS_V1, + ) + _append_uint16(val_buf_tokens, token_ids) + _append_uint16(val_buf_bytes, byte_counts) + manifest["stats"]["docs_val"] += 1 + manifest["stats"]["tokens_val"] += len(token_ids) + manifest["stats"]["bytes_sidecar_tokens_val"] += len(byte_counts) + new_val_written = _flush_full_shards( + buf=val_buf_tokens, + out_dir=dataset_dir, + prefix="fineweb_val", + shard_index=val_written, + shard_tokens=shard_tokens, + ) + new_val_bytes_written = _flush_full_shards( + buf=val_buf_bytes, + out_dir=dataset_dir, + prefix="fineweb_val_bytes", + shard_index=val_bytes_written, + shard_tokens=shard_tokens, + ) + val_written = new_val_written + val_bytes_written = new_val_bytes_written + else: + if not val_tail_flushed: + val_written = _flush_tail( + buf=val_buf_tokens, + out_dir=dataset_dir, + prefix="fineweb_val", + shard_index=val_written, + ) + val_bytes_written = _flush_tail( + buf=val_buf_bytes, + out_dir=dataset_dir, + prefix="fineweb_val_bytes", + shard_index=val_bytes_written, + ) + val_tail_flushed = True + token_ids = [BOS_ID] + token_ids.extend(int(v) for v in sp.encode(transformed, out_type=int)) + _append_uint16(train_buf, token_ids) + manifest["stats"]["docs_train"] += 1 + manifest["stats"]["tokens_train"] += len(token_ids) + while len(train_buf) >= shard_tokens: + shard = array("H", train_buf[:shard_tokens]) + _write_shard(dataset_dir / f"fineweb_train_{train_written:06d}.bin", shard) + del train_buf[:shard_tokens] + train_written += 1 + if max_train_shards and train_written >= max_train_shards: + manifest["stats"]["docs_total"] = doc_index + 1 + manifest["stats"]["files_val"] = val_written + manifest["stats"]["files_val_bytes"] = val_bytes_written + manifest["stats"]["files_train"] = train_written + manifest["stopped_reason"] = "max_train_shards" + manifest["completed_at"] = time.strftime("%Y-%m-%dT%H:%M:%S%z") + _write_manifest(out_root / "caseops_manifest.json", manifest) + print(json.dumps({"event": "done", "stats": manifest["stats"]}, sort_keys=True), flush=True) + return out_root / "caseops_manifest.json" + + manifest["stats"]["docs_total"] = doc_index + 1 + if manifest["stats"]["docs_total"] % 10_000 == 0: + manifest["stats"]["files_val"] = val_written + manifest["stats"]["files_val_bytes"] = val_bytes_written + manifest["stats"]["files_train"] = train_written + print(json.dumps({"event": "progress", "stats": manifest["stats"]}, sort_keys=True), flush=True) + + if not val_tail_flushed: + val_written = _flush_tail( + buf=val_buf_tokens, + out_dir=dataset_dir, + prefix="fineweb_val", + shard_index=val_written, + ) + val_bytes_written = _flush_tail( + buf=val_buf_bytes, + out_dir=dataset_dir, + prefix="fineweb_val_bytes", + shard_index=val_bytes_written, + ) + if train_buf: + _write_shard(dataset_dir / f"fineweb_train_{train_written:06d}.bin", train_buf) + train_written += 1 + + manifest["stats"]["files_val"] = val_written + manifest["stats"]["files_val_bytes"] = val_bytes_written + manifest["stats"]["files_train"] = train_written + manifest["stopped_reason"] = "eof" + manifest["completed_at"] = time.strftime("%Y-%m-%dT%H:%M:%S%z") + _write_manifest(out_root / "caseops_manifest.json", manifest) + print(json.dumps({"event": "done", "stats": manifest["stats"]}, 
sort_keys=True), flush=True) + return out_root / "caseops_manifest.json" + + +def build_parser() -> argparse.ArgumentParser: + ap = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter) + ap.add_argument("--docs", required=True, type=pathlib.Path, help="Path to docs_selected.jsonl") + ap.add_argument("--out", required=True, type=pathlib.Path, help="Output root containing tokenizers/ and datasets/") + ap.add_argument("--source-dir", type=pathlib.Path, default=PR1855_SOURCE_DIR, help="Directory containing lossless_caps.py") + ap.add_argument("--sp", type=pathlib.Path, default=None, help="CaseOps SP10240 model path. Defaults under --out/tokenizers.") + ap.add_argument("--train-tokenizer", action="store_true", help="Train the SP10240 CaseOps tokenizer if --sp is missing") + ap.add_argument("--tokenizer-skip-docs", type=int, default=DEFAULT_VAL_DOCS) + ap.add_argument("--tokenizer-train-docs", type=int, default=None, help="Optional count after --tokenizer-skip-docs") + ap.add_argument("--val-docs", type=int, default=DEFAULT_VAL_DOCS) + ap.add_argument("--max-train-shards", type=int, default=0, help="0 means run until EOF") + ap.add_argument("--shard-tokens", type=int, default=DEFAULT_SHARD_TOKENS) + return ap + + +def main() -> None: + args = build_parser().parse_args() + docs_path = args.docs.expanduser().resolve() + out_root = args.out.expanduser().resolve() + source_dir = args.source_dir.expanduser().resolve() + sp_path = ( + args.sp.expanduser().resolve() + if args.sp is not None + else out_root / "tokenizers" / f"{TOKENIZER_BASENAME}.model" + ) + + if not docs_path.is_file(): + raise FileNotFoundError(docs_path) + if args.val_docs < 0: + raise ValueError("--val-docs must be non-negative") + if args.max_train_shards < 0: + raise ValueError("--max-train-shards must be non-negative") + if args.shard_tokens <= 0: + raise ValueError("--shard-tokens must be positive") + + caps, _ = _load_lossless_caps(source_dir) + trained_tokenizer = False + if not sp_path.is_file(): + if not args.train_tokenizer: + raise FileNotFoundError(f"{sp_path}; rerun with --train-tokenizer to create it") + _train_caseops_tokenizer( + docs_path=docs_path, + model_prefix=sp_path.with_suffix(""), + transform=caps.encode_lossless_caps_v2, + skip_docs=args.tokenizer_skip_docs, + max_docs=args.tokenizer_train_docs, + ) + trained_tokenizer = True + + manifest_path = build_shards( + docs_path=docs_path, + out_root=out_root, + sp_path=sp_path, + source_dir=source_dir, + val_docs=args.val_docs, + max_train_shards=args.max_train_shards, + shard_tokens=args.shard_tokens, + tokenizer_skip_docs=args.tokenizer_skip_docs, + tokenizer_train_docs=args.tokenizer_train_docs, + trained_tokenizer=trained_tokenizer, + ) + print(f"manifest: {manifest_path}", flush=True) + + +if __name__ == "__main__": + main() diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/stream_pr1855_caseops_to_pod.sh b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/stream_pr1855_caseops_to_pod.sh new file mode 100755 index 0000000000..0cde7d0615 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/stream_pr1855_caseops_to_pod.sh @@ -0,0 +1,29 @@ +#!/usr/bin/env bash +set -euo pipefail + +DOCS=/home/frosty40/parameter-golf-lab/data/docs_selected.jsonl +KEY=/home/frosty40/.ssh/id_ed25519_apollo +HOST=root@206.125.32.60 +PORT=56335 +KNOWN=/tmp/codex_vast_known_hosts + +test -f "$DOCS" + +ssh \ 
+ -o StrictHostKeyChecking=no \ + -o UserKnownHostsFile="$KNOWN" \ + -o ServerAliveInterval=15 \ + -o ServerAliveCountMax=8 \ + -i "$KEY" \ + -p "$PORT" \ + "$HOST" \ + 'cd /workspace/sota_rascal/legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x && + mkdir -p /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp8192_caseops/datasets/tokenizers && + cp -f tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp8192_caseops/datasets/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model && + /venv/main/bin/python3 stream_prepare_caseops_data.py \ + --docs - \ + --out /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp8192_caseops/datasets \ + --sp /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp8192_caseops/datasets/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model \ + --val-docs 50000 \ + --max-train-shards 80' \ + < "$DOCS" diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/upload_sp10240_caseops_to_hf.sh b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/upload_sp10240_caseops_to_hf.sh new file mode 100755 index 0000000000..49a140095b --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/caseops/upload_sp10240_caseops_to_hf.sh @@ -0,0 +1,55 @@ +#!/usr/bin/env bash +set -euo pipefail + +REPO="${REPO:-Frosty40/10k_caseops_golfer}" +OUT_ROOT="${OUT_ROOT:-/home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets}" +HF_BIN="${HF_BIN:-$(command -v hf)}" +ACTION="${1:-check}" + +TOKENIZER_MODEL="${OUT_ROOT}/tokenizers/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model" +TOKENIZER_VOCAB="${OUT_ROOT}/tokenizers/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.vocab" +DATASET_DIR="${OUT_ROOT}/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved" +MANIFEST="${OUT_ROOT}/caseops_manifest.json" + +if [[ -z "${HF_BIN}" || ! -x "${HF_BIN}" ]]; then + echo "FATAL: hf CLI not found. Set HF_BIN=/path/to/hf." >&2 + exit 1 +fi + +"${HF_BIN}" auth whoami >/dev/null + +missing=0 +for path in "${TOKENIZER_MODEL}" "${TOKENIZER_VOCAB}" "${DATASET_DIR}" "${MANIFEST}"; do + if [[ ! -e "${path}" ]]; then + echo "MISSING: ${path}" >&2 + missing=1 + fi +done + +train_count=$(find "${DATASET_DIR}" -maxdepth 1 -name 'fineweb_train_*.bin' 2>/dev/null | wc -l || true) +val_count=$(find "${DATASET_DIR}" -maxdepth 1 -name 'fineweb_val_*.bin' 2>/dev/null | wc -l || true) +val_bytes_count=$(find "${DATASET_DIR}" -maxdepth 1 -name 'fineweb_val_bytes_*.bin' 2>/dev/null | wc -l || true) + +echo "repo=${REPO}" +echo "out_root=${OUT_ROOT}" +echo "train_shards=${train_count} val_shards=${val_count} val_byte_sidecars=${val_bytes_count}" + +if [[ "${missing}" -ne 0 || "${train_count}" -eq 0 || "${val_count}" -eq 0 || "${val_bytes_count}" -eq 0 ]]; then + echo "Dataset is not upload-ready yet." >&2 + exit 2 +fi + +if [[ "${ACTION}" == "check" ]]; then + echo "Upload-ready. 
To upload:"
+  echo "  REPO=${REPO} OUT_ROOT=${OUT_ROOT} HF_BIN=${HF_BIN} $0 upload"
+  exit 0
+fi
+
+if [[ "${ACTION}" != "upload" ]]; then
+  echo "Usage: $0 [check|upload]" >&2
+  exit 64
+fi
+
+"${HF_BIN}" repo create "${REPO}" --repo-type dataset --exist-ok
+"${HF_BIN}" upload-large-folder "${REPO}" "${OUT_ROOT}" --repo-type dataset
+echo "DONE: https://huggingface.co/datasets/${REPO}"
diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/notes/2026-04-30_10k_caseops_hf_lane.md b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/notes/2026-04-30_10k_caseops_hf_lane.md
new file mode 100644
index 0000000000..6af3822574
--- /dev/null
+++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/notes/2026-04-30_10k_caseops_hf_lane.md
@@ -0,0 +1,149 @@
+# SP10240 CaseOps HF Lane
+
+Status timestamp: 2026-04-29T23:59:53-05:00 local host time.
+
+Scope: local dataset/tokenizer build and Hugging Face upload prep only. No
+training was launched, no remote `/workspace` files were touched, and active
+pod processes were left alone.
+
+## Tokenizer Finding
+
+No true SP10240 CaseOps tokenizer was found locally.
+
+Verified standard SP10240 tokenizer copies:
+
+- `/home/frosty40/parameter-golf-lab/data/tokenizers/fineweb_10240_bpe.model`
+- `/home/frosty40/parameter-golf-lab/data/tokenizers/fineweb_10240_bpe.vocab`
+- `/home/frosty40/sota_rascal/pod_pulls/8x_35002131_20260429_sp10240_mlp375_promising_20260429_202105/fineweb_10240_bpe.model`
+- `/home/frosty40/sota_rascal/pod_pulls/8x_35002131_20260429_sp10240_mlp375_promising_20260429_202105/fineweb_10240_bpe.vocab`
+
+Those standard SP10240 models map the CaseOps operator codepoints
+U+E001..U+E004 to `<unk>` (id 3), so they are not CaseOps tokenizers.
+
+Verified PR1855/PR1797 CaseOps tokenizer:
+
+- `legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model`
+- vocab size: 8192
+- reserved CaseOps operator ids: U+E001=4, U+E002=5, U+E003=6, U+E004=7
+- embedded SentencePiece trainer settings: BPE, byte fallback enabled,
+  split digits enabled, `nmt_nfkc`, no dummy prefix, pad/bos/eos/unk ids
+  0/1/2/3, hard vocab limit disabled.
+
+Missing artifact: I did not find the original command/log that trained the
+8192 CaseOps tokenizer. The SP10240 CaseOps tokenizer is therefore a new lane
+derived from the PR1855 model's embedded trainer spec plus the
+`tokenizer_skip_docs=50000` condition recorded for the standard SP10240 lane.
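+
+For an independent spot check, the sketch below distinguishes a reserved model
+from a standard one. It is illustrative only (not part of the submission); it
+uses the same `sentencepiece` calls that `prepare_sp10240_caseops_data.py`
+relies on, and assumes it is run next to the two `.model` files:
+
+```python
+# Sketch: tell a CaseOps-reserved SP model apart from a standard one.
+# A standard model maps all four operator codepoints to <unk> id 3;
+# the reserved CaseOps model should return [4, 5, 6, 7].
+import sentencepiece as spm
+
+CASEOPS = [chr(cp) for cp in (0xE001, 0xE002, 0xE003, 0xE004)]
+
+def caseops_ids(model_path: str) -> list[int]:
+    sp = spm.SentencePieceProcessor(model_file=model_path)
+    return [int(sp.piece_to_id(sym)) for sym in CASEOPS]
+
+print(caseops_ids("fineweb_10240_bpe.model"))
+print(caseops_ids("fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model"))
+```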
+ +## Added Local Scripts + +- `scripts/prepare_sp10240_caseops_data.py` + - trains `fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model` when + `--train-tokenizer` is set + - reserves U+E001..U+E004 and validates ids 4..7 + - applies PR1855 `lossless_caps_caseops_v1` + - writes uint16 `fineweb_train_*.bin` and `fineweb_val_*.bin` + - writes validation byte sidecars `fineweb_val_bytes_*.bin` + - supports `--max-train-shards` + - fails closed if target shard files already exist + +- `scripts/build_sp10240_caseops_local.sh` + - default docs: `/home/frosty40/parameter-golf-lab/data/docs_selected.jsonl` + - default output root: + `/home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets` + - default `MAX_TRAIN_SHARDS=80`, `VAL_DOCS=50000`, + `SHARD_TOKENS=10000000`, `TOKENIZER_SKIP_DOCS=50000` + +- `scripts/upload_sp10240_caseops_to_hf.sh` + - default repo: `Frosty40/10k_caseops_golfer` + - `check` mode validates local outputs and HF auth + - `upload` mode creates the dataset repo if needed and runs + `hf upload-large-folder` + +## Smoke Test + +Command: + +```bash +python3 scripts/prepare_sp10240_caseops_data.py \ + --docs /home/frosty40/parameter-golf-lab/data/docs_selected.jsonl \ + --out /home/frosty40/sota_rascal/data/smoke_sp10240_caseops_20260429_235907 \ + --train-tokenizer \ + --tokenizer-skip-docs 50 \ + --tokenizer-train-docs 2000 \ + --val-docs 20 \ + --max-train-shards 1 \ + --shard-tokens 2000 +``` + +Smoke result: + +- tokenizer vocab size: 10240 +- reserved CaseOps ids: `[4, 5, 6, 7]` +- train shards: 1 +- val shards: 12 +- val byte sidecars: 12 +- manifest: + `/home/frosty40/sota_rascal/data/smoke_sp10240_caseops_20260429_235907/caseops_manifest.json` + +## Real Build + +Started: + +```bash +tmux new-session -d -s sp10240_caseops_build_20260429_235938 \ + 'cd /home/frosty40/sota_rascal && PYTHONUNBUFFERED=1 scripts/build_sp10240_caseops_local.sh > notes/runtime_logs/sp10240_caseops_build_20260429_235938.log 2>&1' +``` + +Live status at start verification: + +- tmux session: `sp10240_caseops_build_20260429_235938` +- bash PID: `1942873` +- python PID: `1942875` +- log: `notes/runtime_logs/sp10240_caseops_build_20260429_235938.log` +- output root: + `/home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets` +- current stage: SentencePiece tokenizer training +- command in python process: + `python3 scripts/prepare_sp10240_caseops_data.py --docs /home/frosty40/parameter-golf-lab/data/docs_selected.jsonl --out /home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets --train-tokenizer --val-docs 50000 --max-train-shards 80 --shard-tokens 10000000 --tokenizer-skip-docs 50000` + +Expected completion artifact: + +- `/home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/caseops_manifest.json` + +## Hugging Face Upload + +HF CLI: + +- binary: `/home/frosty40/.local/bin/hf` +- version: 1.6.0 +- auth check: logged in as `Frosty40`; token value was not printed. + +Current upload state: not ready until the real build writes tokenizer, shards, +and `caseops_manifest.json`. 
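+
+Once the build has written shards, one val shard and its byte sidecar can be
+spot-checked directly, independent of the `check` script. This is a sketch: it
+assumes the 256 x int32 header (magic, version, token count) written by
+`_write_shard` in `prepare_sp10240_caseops_data.py`, and the path is just the
+composed dataset dir under the output root:
+
+```python
+# Sketch: verify a val shard and its byte sidecar stay parallel.
+import numpy as np
+
+ROOT = ("/home/frosty40/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops"
+        "/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved")
+
+def load_shard(path: str) -> np.ndarray:
+    header = np.fromfile(path, dtype="<i4", count=256)
+    assert header[0] == 20240520 and header[1] == 1  # SHARD_MAGIC / SHARD_VERSION
+    tokens = np.fromfile(path, dtype="<u2", offset=256 * 4)
+    assert len(tokens) == header[2]
+    return tokens
+
+toks = load_shard(f"{ROOT}/fineweb_val_000000.bin")
+byts = load_shard(f"{ROOT}/fineweb_val_bytes_000000.bin")
+assert toks.shape == byts.shape          # one sidecar entry per token
+assert int(byts[toks == 1].max()) == 0   # BOS rows carry zero original bytes
+print(f"val tokens={len(toks)} original bytes={int(byts.sum())}")
+```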
+ +Check command: + +```bash +scripts/upload_sp10240_caseops_to_hf.sh check +``` + +Upload command after check passes: + +```bash +REPO=Frosty40/10k_caseops_golfer scripts/upload_sp10240_caseops_to_hf.sh upload +``` + +Alternate repo name if preferred: + +```bash +REPO=Frosty40/10k_golfer_caseops scripts/upload_sp10240_caseops_to_hf.sh upload +``` + +## Blockers / Caveats + +- The exact original 8192 CaseOps tokenizer training command/log was not found. + The new 10k tokenizer spec is explicit and reproducible, but it is a new + SP10240 CaseOps lane, not an exact reproduction of the missing 8192 training + command. +- Upload should not start until `caseops_manifest.json` exists and + `scripts/upload_sp10240_caseops_to_hf.sh check` exits 0. diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/notes/2026-04-30_claude_sp10240_bytefit_plan.md b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/notes/2026-04-30_claude_sp10240_bytefit_plan.md new file mode 100644 index 0000000000..5fc31da54b --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/notes/2026-04-30_claude_sp10240_bytefit_plan.md @@ -0,0 +1,91 @@ +# Claude SP10240 Byte-Fit Plan - 2026-04-30 + +Snapshot: `2026-04-30T18:35Z` + +## Accepted Advice + +Claude's strongest point is correct: stop prioritizing the queued 10L +`mlp425_late050`. The useful move is to keep 11L and fit between the over-cap +MLP4 body and the under-cap MLP3.75 body. + +## Byte Math Correction + +The proposed `3.85-3.90` range is directionally right but probably too high: + +```text +MLP4.0 over cap: about +450KB +MLP3.75 under cap: about -182KB +gap across 0.25 MLP: about 632KB +byte-fit point from MLP3.75: about 0.072 MLP +``` + +That places the safe target near `MLP3.8125`, not raw `3.85-3.90`. + +Also, raw `MLP3.85` failed on the quads with: + +```text +AssertionError: strides must be 16-byte aligned +``` + +So prepared candidates use aligned hidden dims: + +- `MLP3.8125`: hidden dim `1952` +- `MLP3.84375`: hidden dim `1968` + +## Prepared 8x Runners + +Safe byte-fit: + +```bash +cd /workspace/sota_rascal/legs/2026-04-30_pr1855_sp10240_caseops_mlp38125_late050_8x +./launch_8x.sh +tail -f logs/pr1855_sp10240_caseops_mlp38125_late050_8x_seed444.txt +``` + +Edge byte-fit: + +```bash +cd /workspace/sota_rascal/legs/2026-04-30_pr1855_sp10240_caseops_mlp384375_late050_8x +./launch_8x.sh +tail -f logs/pr1855_sp10240_caseops_mlp384375_late050_8x_seed444.txt +``` + +H1 CaseOps hot-loop precision: + +```bash +cd /workspace/sota_rascal/legs/2026-04-30_pr1855_sp10240_caseops_mlp4_late050_h1_hotloop_8x +./launch_8x.sh +tail -f logs/pr1855_sp10240_caseops_mlp4_late050_h1_hotloop_8x_seed444.txt +``` + +## H1 Port + +H1 is now ported onto the CaseOps code path: + +- body: SP10240 CaseOps, 11L, MLP4, late050 +- global `matrix_bits=5` +- hot loop attention blocks `3,4,5` use int6 +- embed int7, pergroup, LQER asym rank4/top3 kept + +This is a quant-side test, not the first body-quality run. Run after the +byte-fit body shows enough neural/size signal. 
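+
+For reference, the byte-fit interpolation from the Byte Math Correction
+section above, as a sketch. The KB figures are that section's estimates, and
+reading the stride assert as "hidden dim must be a whole multiple of 16
+elements" is an assumption:
+
+```python
+# Sketch: interpolate the 16 MB byte-fit point and filter for aligned dims.
+DIM = 512
+OVER_KB, UNDER_KB = 450.0, 182.0       # MLP4.0 overshoot / MLP3.75 headroom
+GAP_KB = OVER_KB + UNDER_KB            # ~632 KB across 0.25 of mlp_mult
+fit = 3.75 + 0.25 * UNDER_KB / GAP_KB  # ~3.822, i.e. about +0.072 from 3.75
+
+def aligned(mult: float) -> bool:
+    hidden = DIM * mult
+    return hidden.is_integer() and int(hidden) % 16 == 0
+
+for mult in (3.85, 3.8125, 3.84375):
+    print(mult, DIM * mult, aligned(mult))  # 3.85 -> 1971.2: not even integral
+```
+
+`MLP3.8125` (hidden 1952) is the largest aligned multiplier below the ~3.822
+estimate, hence the "safe" runner; `MLP3.84375` (hidden 1968) sits just above
+it, hence the "edge" one.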
+
+## Quads Status During Prep
+
+Quads are hot:
+
+- four H100s at `99-100%` utilization
+- about `45GB` used per GPU
+- active lanes around step `3000`
+- no validation readout yet
+- the old raw `MLP3.85` lane failed the alignment assert and is not active
+
+Active useful quads lanes:
+
+- `caseops6_mlp375_late050_loopoff40_1x`
+- `caseops6_mlp375_late050_loopsmooth_1x`
+- `caseops6_mlp375_late050_smooth_loopoff40_1x`
+- `caseops6_mlp375_late045_loopoff40_1x`
+
+At step ~3000 the train losses are too close together to cut any lane. Wait
+for validation or the wallclock stop.
diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe.model b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe.model
new file mode 100644
index 0000000000..9f3b398647
Binary files /dev/null and b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe.model differ
diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe.vocab b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe.vocab
new file mode 100644
index 0000000000..8f5612340d
--- /dev/null
+++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe.vocab
@@ -0,0 +1,10240 @@
+<pad> 0
+<s> 0
+</s> 0
+<unk> 0
+<0x00> 0
+<0x01> 0
+<0x02> 0
+<0x03> 0
+<0x04> 0
+<0x05> 0
+<0x06> 0
+<0x07> 0
+<0x08> 0
+<0x09> 0
+<0x0A> 0
+<0x0B> 0
+<0x0C> 0
+<0x0D> 0
+<0x0E> 0
+<0x0F> 0
+<0x10> 0
+<0x11> 0
+<0x12> 0
+<0x13> 0
+<0x14> 0
+<0x15> 0
+<0x16> 0
+<0x17> 0
+<0x18> 0
+<0x19> 0
+<0x1A> 0
+<0x1B> 0
+<0x1C> 0
+<0x1D> 0
+<0x1E> 0
+<0x1F> 0
+<0x20> 0
+<0x21> 0
+<0x22> 0
+<0x23> 0
+<0x24> 0
+<0x25> 0
+<0x26> 0
+<0x27> 0
+<0x28> 0
+<0x29> 0
+<0x2A> 0
+<0x2B> 0
+<0x2C> 0
+<0x2D> 0
+<0x2E> 0
+<0x2F> 0
+<0x30> 0
+<0x31> 0
+<0x32> 0
+<0x33> 0
+<0x34> 0
+<0x35> 0
+<0x36> 0
+<0x37> 0
+<0x38> 0
+<0x39> 0
+<0x3A> 0
+<0x3B> 0
+<0x3C> 0
+<0x3D> 0
+<0x3E> 0
+<0x3F> 0
+<0x40> 0
+<0x41> 0
+<0x42> 0
+<0x43> 0
+<0x44> 0
+<0x45> 0
+<0x46> 0
+<0x47> 0
+<0x48> 0
+<0x49> 0
+<0x4A> 0
+<0x4B> 0
+<0x4C> 0
+<0x4D> 0
+<0x4E> 0
+<0x4F> 0
+<0x50> 0
+<0x51> 0
+<0x52> 0
+<0x53> 0
+<0x54> 0
+<0x55> 0
+<0x56> 0
+<0x57> 0
+<0x58> 0
+<0x59> 0
+<0x5A> 0
+<0x5B> 0
+<0x5C> 0
+<0x5D> 0
+<0x5E> 0
+<0x5F> 0
+<0x60> 0
+<0x61> 0
+<0x62> 0
+<0x63> 0
+<0x64> 0
+<0x65> 0
+<0x66> 0
+<0x67> 0
+<0x68> 0
+<0x69> 0
+<0x6A> 0
+<0x6B> 0
+<0x6C> 0
+<0x6D> 0
+<0x6E> 0
+<0x6F> 0
+<0x70> 0
+<0x71> 0
+<0x72> 0
+<0x73> 0
+<0x74> 0
+<0x75> 0
+<0x76> 0
+<0x77> 0
+<0x78> 0
+<0x79> 0
+<0x7A> 0
+<0x7B> 0
+<0x7C> 0
+<0x7D> 0
+<0x7E> 0
+<0x7F> 0
+<0x80> 0
+<0x81> 0
+<0x82> 0
+<0x83> 0
+<0x84> 0
+<0x85> 0
+<0x86> 0
+<0x87> 0
+<0x88> 0
+<0x89> 0
+<0x8A> 0
+<0x8B> 0
+<0x8C> 0
+<0x8D> 0
+<0x8E> 0
+<0x8F> 0
+<0x90> 0
+<0x91> 0
+<0x92> 0
+<0x93> 0
+<0x94> 0
+<0x95> 0
+<0x96> 0
+<0x97> 0
+<0x98> 0
+<0x99> 0
+<0x9A> 0
+<0x9B> 0
+<0x9C> 0
+<0x9D> 0
+<0x9E> 0
+<0x9F> 0
+<0xA0> 0
+<0xA1> 0
+<0xA2> 0
+<0xA3> 0
+<0xA4> 0
+<0xA5> 0
+<0xA6> 0
+<0xA7> 0
+<0xA8> 0
+<0xA9> 0
+<0xAA> 0
+<0xAB> 0
+<0xAC> 0
+<0xAD> 0
+<0xAE> 0
+<0xAF> 0
+<0xB0> 0
+<0xB1> 0
+<0xB2> 0
+<0xB3> 0
+<0xB4> 0
+<0xB5> 0
+<0xB6> 0
+<0xB7> 0
+<0xB8> 0
+<0xB9> 0
+<0xBA> 0
+<0xBB> 0
+<0xBC> 0
+<0xBD> 0
+<0xBE> 0
+<0xBF> 0
+<0xC0> 0
+<0xC1> 0
+<0xC2> 0
+<0xC3> 0
+<0xC4> 0
+<0xC5> 0
+<0xC6> 0
+<0xC7> 0
+<0xC8> 0
+<0xC9> 0
+<0xCA> 0
+<0xCB> 0
+<0xCC> 0
+<0xCD> 0
+<0xCE> 0
+<0xCF> 0
+<0xD0> 0
+<0xD1> 0
+<0xD2> 0
+<0xD3> 0
+<0xD4> 0 +<0xD5> 0 +<0xD6> 0 +<0xD7> 0 +<0xD8> 0 +<0xD9> 0 +<0xDA> 0 +<0xDB> 0 +<0xDC> 0 +<0xDD> 0 +<0xDE> 0 +<0xDF> 0 +<0xE0> 0 +<0xE1> 0 +<0xE2> 0 +<0xE3> 0 +<0xE4> 0 +<0xE5> 0 +<0xE6> 0 +<0xE7> 0 +<0xE8> 0 +<0xE9> 0 +<0xEA> 0 +<0xEB> 0 +<0xEC> 0 +<0xED> 0 +<0xEE> 0 +<0xEF> 0 +<0xF0> 0 +<0xF1> 0 +<0xF2> 0 +<0xF3> 0 +<0xF4> 0 +<0xF5> 0 +<0xF6> 0 +<0xF7> 0 +<0xF8> 0 +<0xF9> 0 +<0xFA> 0 +<0xFB> 0 +<0xFC> 0 +<0xFD> 0 +<0xFE> 0 +<0xFF> 0 +▁t -0 +▁a -1 +in -2 +he -3 +re -4 +on -5 +er -6 +▁the -7 +▁s -8 +▁w -9 +or -10 +at -11 +nd -12 +ou -13 +▁c -14 +it -15 +es -16 +▁f -17 +is -18 +en -19 +ing -20 +▁b -21 +▁p -22 +▁o -23 +an -24 +ed -25 +al -26 +▁to -27 +▁m -28 +ar -29 +▁and -30 +▁in -31 +▁of -32 +▁d -33 +le -34 +ic -35 +as -36 +om -37 +▁h -38 +ion -39 +▁th -40 +il -41 +▁T -42 +ent -43 +▁l -44 +ve -45 +▁y -46 +ro -47 +st -48 +▁I -49 +▁e -50 +▁re -51 +▁n -52 +▁S -53 +▁g -54 +et -55 +ct -56 +▁A -57 +▁you -58 +▁C -59 +ly -60 +▁for -61 +id -62 +▁is -63 +ay -64 +▁on -65 +▁be -66 +ot -67 +ow -68 +ol -69 +am -70 +ce -71 +ig -72 +us -73 +ad -74 +im -75 +▁M -76 +ch -77 +el -78 +ver -79 +ith -80 +ut -81 +▁st -82 +ation -83 +ur -84 +▁P -85 +▁with -86 +▁that -87 +ir -88 +▁B -89 +▁W -90 +▁The -91 +▁it -92 +▁he -93 +ra -94 +ill -95 +ers -96 +▁al -97 +un -98 +ul -99 +▁an -100 +▁D -101 +▁H -102 +▁F -103 +out -104 +▁pro -105 +▁as -106 +▁wh -107 +▁are -108 +ke -109 +se -110 +ter -111 +▁we -112 +if -113 +▁ha -114 +ge -115 +oo -116 +▁R -117 +our -118 +pp -119 +ck -120 +ate -121 +ess -122 +▁at -123 +▁con -124 +▁com -125 +▁or -126 +▁L -127 +est -128 +her -129 +ore -130 +ment -131 +▁fr -132 +ab -133 +igh -134 +▁- -135 +▁ne -136 +▁N -137 +ort -138 +▁se -139 +▁G -140 +▁your -141 +ld -142 +▁E -143 +ist -144 +ri -145 +op -146 +▁( -147 +▁ex -148 +ity -149 +ure -150 +▁O -151 +em -152 +▁v -153 +qu -154 +ant -155 +art -156 +ive -157 +ust -158 +um -159 +▁was -160 +▁have -161 +pe -162 +▁from -163 +▁this -164 +▁de -165 +▁r -166 +▁sh -167 +th -168 +ain -169 +ies -170 +▁can -171 +up -172 +▁will -173 +▁ch -174 +and -175 +▁by -176 +os -177 +ight -178 +nt -179 +ie -180 +▁us -181 +ome -182 +all -183 +ard -184 +▁not -185 +ud -186 +res -187 +▁le -188 +▁J -189 +ast -190 +▁pl -191 +ost -192 +▁su -193 +▁ab -194 +iv -195 +ear -196 +▁wor -197 +ide -198 +ial -199 +rou -200 +▁all -201 +gh -202 +od -203 +oc -204 +ak -205 +te -206 +ine -207 +ould -208 +▁j -209 +red -210 +ag -211 +▁has -212 +.. 
-213 +ice -214 +▁Th -215 +ell -216 +▁U -217 +age -218 +▁do -219 +▁k -220 +ack -221 +fe -222 +ook -223 +ac -224 +▁ad -225 +per -226 +▁In -227 +ip -228 +▁comp -229 +ake -230 +▁out -231 +ions -232 +ally -233 +▁up -234 +are -235 +▁but -236 +▁me -237 +▁whe -238 +pt -239 +lo -240 +able -241 +ry -242 +▁our -243 +▁“ -244 +one -245 +ind -246 +▁en -247 +▁more -248 +ail -249 +ite -250 +ther -251 +▁their -252 +▁Y -253 +ich -254 +▁so -255 +very -256 +ime -257 +cc -258 +ood -259 +ated -260 +ong -261 +▁K -262 +▁my -263 +▁sa -264 +for -265 +iz -266 +ame -267 +ber -268 +▁they -269 +▁St -270 +▁te -271 +so -272 +ous -273 +▁one -274 +ans -275 +act -276 +▁about -277 +ll -278 +ike -279 +du -280 +▁cont -281 +ase -282 +og -283 +▁V -284 +▁im -285 +ick -286 +▁cl -287 +ia -288 +ance -289 +▁work -290 +▁inc -291 +ign -292 +▁un -293 +ire -294 +ree -295 +▁off -296 +▁fe -297 +▁who -298 +▁man -299 +ue -300 +ace -301 +ach -302 +reat -303 +ub -304 +▁It -305 +ction -306 +▁go -307 +ne -308 +▁app -309 +▁year -310 +▁new -311 +ep -312 +ult -313 +ib -314 +ap -315 +▁his -316 +ays -317 +erv -318 +▁Ch -319 +▁We -320 +▁res -321 +und -322 +▁" -323 +▁sp -324 +ass -325 +ark -326 +ations -327 +ff -328 +▁qu -329 +ary -330 +▁per -331 +▁also -332 +ile -333 +▁which -334 +▁int -335 +▁time -336 +ove -337 +form -338 +ven -339 +ount -340 +▁get -341 +▁tr -342 +own -343 +▁like -344 +▁some -345 +▁other -346 +ond -347 +ents -348 +ings -349 +vel -350 +▁any -351 +ical -352 +ence -353 +▁part -354 +av -355 +▁been -356 +▁dis -357 +▁This -358 +▁over -359 +ition -360 +ress -361 +pl -362 +ors -363 +▁rec -364 +▁them -365 +▁He -366 +▁ar -367 +▁sc -368 +ild -369 +▁pe -370 +port -371 +ink -372 +low -373 +▁ag -374 +▁ro -375 +▁her -376 +▁when -377 +ound -378 +▁kn -379 +ord -380 +mer -381 +int -382 +▁need -383 +ish -384 +▁pr -385 +irst -386 +ens -387 +ough -388 +▁said -389 +ru -390 +▁pre -391 +▁spe -392 +▁just -393 +wn -394 +ren -395 +▁what -396 +▁there -397 +▁if -398 +▁acc -399 +▁than -400 +▁its -401 +ov -402 +▁Re -403 +day -404 +vers -405 +▁would -406 +ater -407 +fter -408 +▁had -409 +ade -410 +ning -411 +lud -412 +▁hel -413 +▁– -414 +▁were -415 +▁am -416 +old -417 +rough -418 +▁into -419 +▁des -420 +ory -421 +ople -422 +itt -423 +ang -424 +▁help -425 +▁tw -426 +▁how -427 +use -428 +lic -429 +ool -430 +▁bec -431 +▁add -432 +anc -433 +▁first -434 +ose -435 +▁make -436 +▁comm -437 +ons -438 +amp -439 +ob -440 +hed -441 +▁prov -442 +▁Wh -443 +▁tra -444 +... 
-445 +ft -446 +▁look -447 +▁You -448 +▁includ -449 +ual -450 +▁people -451 +les -452 +▁serv -453 +gr -454 +▁col -455 +ian -456 +ments -457 +ful -458 +▁know -459 +▁produ -460 +ates -461 +iew -462 +▁Ne -463 +▁em -464 +rent -465 +ious -466 +tern -467 +▁she -468 +round -469 +ek -470 +▁every -471 +▁through -472 +▁may -473 +ating -474 +▁no -475 +▁only -476 +pport -477 +▁back -478 +▁most -479 +ect -480 +▁bu -481 +▁want -482 +ict -483 +ices -484 +▁As -485 +▁If -486 +▁well -487 +ities -488 +▁ind -489 +we -490 +▁bet -491 +▁ph -492 +ise -493 +▁use -494 +▁two -495 +▁co -496 +xt -497 +ont -498 +com -499 +▁act -500 +▁und -501 +ph -502 +iness -503 +lect -504 +iss -505 +oy -506 +▁after -507 +▁Se -508 +ife -509 +ause -510 +▁play -511 +fect -512 +▁| -513 +oth -514 +▁& -515 +ily -516 +row -517 +ork -518 +enc -519 +▁exper -520 +ject -521 +▁cons -522 +hen -523 +cial -524 +urn -525 +ert -526 +▁years -527 +als -528 +▁these -529 +ank -530 +ting -531 +▁$ -532 +▁Com -533 +aw -534 +▁bus -535 +▁An -536 +▁Un -537 +▁stud -538 +any -539 +bs -540 +ange -541 +▁For -542 +vent -543 +ures -544 +▁good -545 +ational -546 +aking -547 +▁see -548 +▁ke -549 +ased -550 +ific -551 +▁Pro -552 +▁now -553 +fore -554 +▁under -555 +▁very -556 +▁many -557 +▁reg -558 +▁sm -559 +ward -560 +hing -561 +▁imp -562 +get -563 +oint -564 +▁dif -565 +▁ra -566 +▁way -567 +erson -568 +ience -569 +▁start -570 +ts -571 +pect -572 +▁fin -573 +▁great -574 +▁And -575 +yst -576 +uring -577 +▁De -578 +▁rel -579 +formation -580 +▁gu -581 +ility -582 +ible -583 +▁rem -584 +▁could -585 +oss -586 +hip -587 +▁dec -588 +uch -589 +▁even -590 +▁inv -591 +). -592 +ty -593 +ics -594 +rit -595 +ract -596 +▁own -597 +▁sec -598 +cess -599 +velop -600 +▁day -601 +▁where -602 +▁show -603 +ident -604 +elf -605 +hes -606 +alth -607 +▁high -608 +its -609 +▁loc -610 +air -611 +▁find -612 +olog -613 +▁ac -614 +ull -615 +nds -616 +▁Al -617 +▁don -618 +▁ass -619 +▁home -620 +▁should -621 +line -622 +ath -623 +▁ent -624 +▁best -625 +▁here -626 +▁down -627 +lease -628 +▁then -629 +▁Sh -630 +ied -631 +ble -632 +ular -633 +|| -634 +▁right -635 +The -636 +arch -637 +▁set -638 +chool -639 +ited -640 +▁car -641 +▁av -642 +▁read -643 +▁New -644 +▁mon -645 +gan -646 +▁min -647 +▁take -648 +▁business -649 +erm -650 +▁fam -651 +▁ins -652 +ner -653 +ix -654 +▁inst -655 +▁fl -656 +ys -657 +▁design -658 +▁att -659 +ystem -660 +▁br -661 +alk -662 +▁too -663 +.” -664 +▁che -665 +▁bl -666 +io -667 +▁long -668 +ative -669 +▁much -670 +▁information -671 +▁Be -672 +▁made -673 +▁last -674 +ollow -675 +ason -676 +other -677 +ues -678 +gram -679 +arket -680 +▁product -681 +omet -682 +▁because -683 +ock -684 +ax -685 +▁Fr -686 +), -687 +rib -688 +▁week -689 +▁call -690 +▁did -691 +▁before -692 +▁think -693 +▁Cl -694 +▁team -695 +▁world -696 +atch -697 +me -698 +▁cre -699 +ale -700 +pen -701 +oun -702 +▁again -703 +▁sur -704 +ower -705 +▁Ad -706 +▁vis -707 +ient -708 +▁But -709 +chn -710 +pr -711 +az -712 +ustom -713 +land -714 +▁requ -715 +▁art -716 +▁develop -717 +▁being -718 +▁diffe -719 +rest -720 +▁pres -721 +way -722 +▁person -723 +ng -724 +ener -725 +▁such -726 +▁inte -727 +▁Le -728 +▁mem -729 +▁disc -730 +▁him -731 +ces -732 +▁support -733 +▁life -734 +arn -735 +ug -736 +ving -737 +ced -738 +ouse -739 +unity -740 +ave -741 +ince -742 +irect -743 +▁med -744 +▁Ar -745 +▁does -746 +▁while -747 +▁those -748 +ins -749 +▁provid -750 +ash -751 +arm -752 +view -753 +▁sim -754 +ivers -755 +ros -756 +▁lead -757 +▁sk -758 +akes -759 +ality -760 +▁pol -761 +▁mod -762 +▁end -763 +▁used -764 +▁cur -765 +ives 
-766 +▁around -767 +ric -768 +led -769 +ier -770 +▁free -771 +ailable -772 +ually -773 +▁each -774 +▁care -775 +▁comple -776 +▁follow -777 +ional -778 +ublic -779 +▁det -780 +▁On -781 +ple -782 +read -783 +der -784 +▁ret -785 +ize -786 +▁trans -787 +ather -788 +▁love -789 +▁There -790 +ages -791 +▁post -792 +ines -793 +▁child -794 +▁system -795 +ars -796 +▁bo -797 +ene -798 +roup -799 +▁eas -800 +▁book -801 +▁num -802 +▁ed -803 +▁How -804 +▁ser -805 +,” -806 +imes -807 +▁Te -808 +▁really -809 +▁count -810 +ets -811 +▁gr -812 +▁str -813 +▁program -814 +▁custom -815 +ton -816 +▁top -817 +▁run -818 +▁del -819 +au -820 +▁All -821 +iet -822 +▁cour -823 +ffect -824 +▁found -825 +▁So -826 +▁place -827 +▁list -828 +ness -829 +ved -830 +iel -831 +▁form -832 +▁month -833 +▁prof -834 +▁char -835 +ah -836 +▁feel -837 +▁To -838 +ute -839 +▁available -840 +▁going -841 +▁inter -842 +ittle -843 +▁They -844 +▁sign -845 +▁sub -846 +gg -847 +▁market -848 +man -849 +ature -850 +ames -851 +▁fun -852 +▁cle -853 +▁still -854 +cept -855 +▁Pl -856 +ways -857 +▁somet -858 +▁different -859 +▁aut -860 +▁both -861 +▁three -862 +▁few -863 +orn -864 +▁health -865 +▁though -866 +▁Ex -867 +ital -868 +ired -869 +▁pur -870 +ering -871 +▁rep -872 +▁adv -873 +▁exp -874 +▁techn -875 +▁happ -876 +▁open -877 +▁lot -878 +▁report -879 +▁company -880 +ata -881 +ween -882 +▁keep -883 +meric -884 +▁Sc -885 +orth -886 +▁plan -887 +▁hand -888 +ining -889 +bers -890 +iqu -891 +▁She -892 +tt -893 +ants -894 +be -895 +▁ext -896 +▁lar -897 +▁game -898 +▁sol -899 +▁point -900 +▁Q -901 +ross -902 +ology -903 +▁say -904 +ves -905 +atur -906 +▁met -907 +▁import -908 +▁process -909 +▁fil -910 +▁frie -911 +▁including -912 +▁family -913 +▁ev -914 +▁using -915 +▁same -916 +work -917 +▁project -918 +ized -919 +uc -920 +oot -921 +▁school -922 +▁between -923 +▁What -924 +ik -925 +ling -926 +▁little -927 +ution -928 +ott -929 +att -930 +▁experience -931 +▁during -932 +." 
-933 +less -934 +▁state -935 +iving -936 +▁Col -937 +▁i -938 +▁next -939 +uss -940 +els -941 +▁service -942 +aint -943 +▁real -944 +ody -945 +oh -946 +▁build -947 +▁allow -948 +ms -949 +reen -950 +▁opt -951 +▁water -952 +ished -953 +▁things -954 +▁come -955 +▁contin -956 +thing -957 +▁Americ -958 +▁var -959 +▁Ph -960 +▁dri -961 +ists -962 +uck -963 +ever -964 +ern -965 +ield -966 +▁cent -967 +arly -968 +over -969 +rand -970 +▁small -971 +▁rece -972 +▁organ -973 +▁appro -974 +▁rest -975 +gy -976 +▁big -977 +self -978 +▁Ind -979 +▁ref -980 +ex -981 +▁always -982 +▁mus -983 +▁better -984 +▁sure -985 +▁With -986 +▁interest -987 +▁win -988 +aut -989 +loy -990 +▁full -991 +▁pat -992 +▁poss -993 +▁pass -994 +ery -995 +illion -996 +▁online -997 +▁pri -998 +▁iss -999 +▁ty -1000 +▁put -1001 +ined -1002 +cent -1003 +ware -1004 +▁When -1005 +▁result -1006 +▁gener -1007 +▁since -1008 +▁Bl -1009 +▁ve -1010 +ps -1011 +▁try -1012 +▁direct -1013 +▁quest -1014 +iversity -1015 +▁mov -1016 +▁stand -1017 +▁partic -1018 +▁days -1019 +▁perform -1020 +▁group -1021 +▁val -1022 +ok -1023 +▁pay -1024 +▁ide -1025 +▁head -1026 +▁special -1027 +▁bel -1028 +▁Tr -1029 +▁today -1030 +▁Chr -1031 +▁something -1032 +▁class -1033 +▁provide -1034 +ients -1035 +ours -1036 +▁tri -1037 +▁services -1038 +▁second -1039 +▁ann -1040 +▁Our -1041 +ared -1042 +▁Con -1043 +ccess -1044 +▁resp -1045 +joy -1046 +▁phot -1047 +▁Is -1048 +▁conf -1049 +ploy -1050 +▁Or -1051 +▁dist -1052 +▁hard -1053 +▁without -1054 +pping -1055 +con -1056 +▁Sp -1057 +▁number -1058 +ER -1059 +▁Z -1060 +▁bro -1061 +▁sl -1062 +▁def -1063 +▁cor -1064 +▁must -1065 +oney -1066 +▁blo -1067 +▁another -1068 +ision -1069 +▁vide -1070 +stand -1071 +eng -1072 +▁current -1073 +cl -1074 +outh -1075 +▁give -1076 +▁wom -1077 +▁old -1078 +aj -1079 +ically -1080 +▁access -1081 +▁webs -1082 +▁able -1083 +ards -1084 +▁important -1085 +ior -1086 +iver -1087 +▁cr -1088 +," -1089 +ately -1090 +ium -1091 +▁— -1092 +▁cost -1093 +sh -1094 +▁grow -1095 +▁ask -1096 +ope -1097 +ral -1098 +▁meet -1099 +▁fact -1100 +▁invest -1101 +▁At -1102 +▁area -1103 +ruct -1104 +▁Cent -1105 +▁public -1106 +▁got -1107 +raph -1108 +▁Res -1109 +▁wr -1110 +▁bre -1111 +▁soc -1112 +ote -1113 +▁visit -1114 +▁proble -1115 +ered -1116 +▁light -1117 +▁incre -1118 +▁US -1119 +ample -1120 +▁working -1121 +ems -1122 +▁ob -1123 +ense -1124 +▁data -1125 +ann -1126 +▁unt -1127 +rence -1128 +pped -1129 +br -1130 +▁level -1131 +▁proper -1132 +▁looking -1133 +▁never -1134 +▁sal -1135 +▁might -1136 +inal -1137 +▁No -1138 +ats -1139 +ffic -1140 +▁order -1141 +ential -1142 +ember -1143 +▁effect -1144 +ley -1145 +▁event -1146 +▁fac -1147 +▁students -1148 +▁food -1149 +▁rese -1150 +▁local -1151 +▁Man -1152 +ency -1153 +▁four -1154 +▁Comm -1155 +▁eng -1156 +▁profess -1157 +ird -1158 +▁let -1159 +▁That -1160 +▁offer -1161 +ission -1162 +▁inf -1163 +▁enjoy -1164 +ww -1165 +▁site -1166 +▁Pr -1167 +▁spec -1168 +▁season -1169 +▁addition -1170 +▁check -1171 +ertain -1172 +▁within -1173 +▁children -1174 +gin -1175 +▁oper -1176 +▁pos -1177 +▁test -1178 +ording -1179 +▁making -1180 +▁My -1181 +▁view -1182 +lection -1183 +▁room -1184 +▁sit -1185 +▁prom -1186 +▁power -1187 +ories -1188 +ney -1189 +▁expl -1190 +here -1191 +▁ca -1192 +load -1193 +ently -1194 +▁products -1195 +rol -1196 +▁past -1197 +▁night -1198 +▁community -1199 +▁pop -1200 +▁Mar -1201 +▁sing -1202 +▁against -1203 +let -1204 +ream -1205 +tend -1206 +▁until -1207 +ases -1208 +▁less -1209 +▁' -1210 +utes -1211 +▁el -1212 +ains -1213 +agement -1214 +▁est -1215 +med -1216 +ids 
-1217 +▁email -1218 +ieve -1219 +▁job -1220 +iron -1221 +ised -1222 +ator -1223 +▁quality -1224 +ivid -1225 +ina -1226 +▁May -1227 +▁intern -1228 +▁indust -1229 +to -1230 +ills -1231 +▁gl -1232 +▁website -1233 +▁prote -1234 +▁impro -1235 +▁law -1236 +ode -1237 +ks -1238 +orm -1239 +▁equ -1240 +▁App -1241 +▁turn -1242 +ified -1243 +enn -1244 +urs -1245 +co -1246 +ged -1247 +▁Br -1248 +IN -1249 +▁away -1250 +icle -1251 +▁air -1252 +▁Fe -1253 +▁contact -1254 +▁creat -1255 +▁toget -1256 +We -1257 +▁together -1258 +bo -1259 +▁University -1260 +istr -1261 +ique -1262 +pend -1263 +aring -1264 +▁supp -1265 +▁learn -1266 +▁success -1267 +▁pract -1268 +▁Co -1269 +ury -1270 +▁dr -1271 +▁complete -1272 +▁Can -1273 +▁leg -1274 +▁applic -1275 +iday -1276 +▁expect -1277 +▁needs -1278 +▁include -1279 +por -1280 +▁Christ -1281 +iety -1282 +ocus -1283 +atter -1284 +ider -1285 +▁Cont -1286 +▁. -1287 +▁detail -1288 +▁large -1289 +▁easy -1290 +▁la -1291 +▁Car -1292 +ability -1293 +ret -1294 +▁One -1295 +▁along -1296 +oci -1297 +irl -1298 +▁course -1299 +▁says -1300 +▁change -1301 +▁news -1302 +arent -1303 +aster -1304 +room -1305 +▁present -1306 +ger -1307 +▁offic -1308 +vern -1309 +▁name -1310 +▁chang -1311 +ism -1312 +hor -1313 +▁conc -1314 +yle -1315 +ym -1316 +atures -1317 +▁beaut -1318 +▁Am -1319 +▁Do -1320 +▁activ -1321 +pos -1322 +▁cap -1323 +part -1324 +lish -1325 +ump -1326 +ising -1327 +ries -1328 +▁members -1329 +▁Me -1330 +▁money -1331 +enef -1332 +▁Ste -1333 +min -1334 +iting -1335 +▁employ -1336 +rap -1337 +▁video -1338 +▁bas -1339 +▁times -1340 +the -1341 +▁Eng -1342 +▁talk -1343 +ify -1344 +▁buy -1345 +ec -1346 +augh -1347 +▁beh -1348 +▁music -1349 +itions -1350 +▁Ro -1351 +▁These -1352 +▁fav -1353 +▁house -1354 +▁pa -1355 +une -1356 +ift -1357 +nect -1358 +▁opport -1359 +▁dem -1360 +▁sw -1361 +side -1362 +▁/ -1363 +ane -1364 +▁hist -1365 +▁why -1366 +Th -1367 +▁En -1368 +▁dra -1369 +ably -1370 +▁cond -1371 +▁ce -1372 +▁case -1373 +▁please -1374 +▁treat -1375 +by -1376 +mber -1377 +ron -1378 +veral -1379 +ots -1380 +▁perfect -1381 +aff -1382 +rie -1383 +aterial -1384 +pecial -1385 +▁live -1386 +ready -1387 +fort -1388 +ten -1389 +▁govern -1390 +▁account -1391 +▁dev -1392 +▁short -1393 +ention -1394 +ization -1395 +▁thing -1396 +▁create -1397 +▁following -1398 +▁Che -1399 +▁story -1400 +▁clo -1401 +ON -1402 +book -1403 +▁left -1404 +▁const -1405 +ived -1406 +viron -1407 +▁review -1408 +▁below -1409 +▁trad -1410 +▁understand -1411 +▁hum -1412 +▁million -1413 +son -1414 +!! 
-1415 +▁side -1416 +itive -1417 +▁having -1418 +▁Your -1419 +alf -1420 +ored -1421 +▁After -1422 +▁hot -1423 +ohn -1424 +ows -1425 +sc -1426 +▁page -1427 +etwork -1428 +▁Med -1429 +▁based -1430 +▁Fl -1431 +▁focus -1432 +of -1433 +▁makes -1434 +▁word -1435 +AT -1436 +▁research -1437 +RE -1438 +▁move -1439 +▁across -1440 +▁writ -1441 +▁camp -1442 +▁personal -1443 +ienc -1444 +ances -1445 +▁line -1446 +▁link -1447 +▁kind -1448 +▁possible -1449 +▁cou -1450 +rop -1451 +▁ever -1452 +▁mar -1453 +▁pot -1454 +uture -1455 +ividual -1456 +▁getting -1457 +▁comes -1458 +▁already -1459 +uly -1460 +▁benef -1461 +▁elect -1462 +ajor -1463 +▁educ -1464 +vious -1465 +▁record -1466 +ured -1467 +uper -1468 +osp -1469 +▁country -1470 +▁become -1471 +▁Rep -1472 +▁soft -1473 +ination -1474 +oice -1475 +orts -1476 +▁often -1477 +▁share -1478 +▁friends -1479 +▁several -1480 +ush -1481 +▁done -1482 +▁Ass -1483 +iven -1484 +ister -1485 +▁social -1486 +▁Count -1487 +▁es -1488 +duct -1489 +▁pack -1490 +▁bit -1491 +wards -1492 +▁fund -1493 +ead -1494 +iam -1495 +▁enough -1496 +▁quick -1497 +▁mil -1498 +▁tre -1499 +ones -1500 +▁minutes -1501 +uro -1502 +▁Please -1503 +conom -1504 +fer -1505 +▁bring -1506 +▁Inst -1507 +inc -1508 +▁women -1509 +uff -1510 +▁development -1511 +▁vers -1512 +▁Serv -1513 +▁hours -1514 +▁body -1515 +▁Des -1516 +▁mult -1517 +unch -1518 +app -1519 +oose -1520 +ips -1521 +▁tell -1522 +ides -1523 +iful -1524 +▁John -1525 +vironment -1526 +▁return -1527 +▁purch -1528 +mend -1529 +aim -1530 +▁: -1531 +▁cut -1532 +▁men -1533 +ners -1534 +▁city -1535 +▁lo -1536 +arl -1537 +ape -1538 +reet -1539 +▁Intern -1540 +▁deal -1541 +▁X -1542 +oon -1543 +▁individual -1544 +AN -1545 +▁exc -1546 +▁won -1547 +ST -1548 +▁ens -1549 +▁young -1550 +ateg -1551 +ted -1552 +▁Here -1553 +▁compet -1554 +▁hold -1555 +▁material -1556 +ograph -1557 +▁sum -1558 +▁Comp -1559 +▁... 
-1560 +▁others -1561 +▁jo -1562 +utions -1563 +yn -1564 +▁started -1565 +▁Tw -1566 +▁industry -1567 +▁called -1568 +▁months -1569 +▁mom -1570 +▁term -1571 +▁non -1572 +▁orig -1573 +idd -1574 +ights -1575 +▁didn -1576 +ript -1577 +▁land -1578 +ai -1579 +ee -1580 +nder -1581 +▁Gu -1582 +▁walk -1583 +▁clean -1584 +▁future -1585 +▁rele -1586 +▁American -1587 +▁However -1588 +▁pie -1589 +▁City -1590 +., -1591 +▁far -1592 +▁commun -1593 +lished -1594 +▁po -1595 +ched -1596 +▁doing -1597 +▁major -1598 +ained -1599 +▁control -1600 +▁space -1601 +fact -1602 +ource -1603 +urity -1604 +ball -1605 +arr -1606 +osed -1607 +▁wa -1608 +▁low -1609 +ges -1610 +▁cover -1611 +▁Ab -1612 +▁store -1613 +anies -1614 +lement -1615 +ference -1616 +ford -1617 +▁occ -1618 +▁games -1619 +▁means -1620 +AR -1621 +lege -1622 +▁Not -1623 +▁mind -1624 +▁offers -1625 +oring -1626 +▁Tra -1627 +▁yet -1628 +▁bra -1629 +▁Dr -1630 +▁came -1631 +▁five -1632 +▁percent -1633 +▁chall -1634 +▁comb -1635 +▁Min -1636 +▁invol -1637 +▁took -1638 +▁doesn -1639 +sel -1640 +▁lim -1641 +orld -1642 +▁fore -1643 +ilities -1644 +▁customers -1645 +▁* -1646 +▁features -1647 +bal -1648 +▁least -1649 +▁State -1650 +▁strong -1651 +▁step -1652 +▁price -1653 +ches -1654 +▁heart -1655 +▁God -1656 +▁Ke -1657 +urther -1658 +▁range -1659 +▁specific -1660 +▁main -1661 +▁More -1662 +most -1663 +▁require -1664 +▁close -1665 +▁School -1666 +▁once -1667 +▁key -1668 +▁pict -1669 +sw -1670 +err -1671 +▁upd -1672 +ler -1673 +ilt -1674 +ither -1675 +▁mean -1676 +▁Bo -1677 +▁ey -1678 +▁early -1679 +▁cra -1680 +▁Jan -1681 +▁Now -1682 +▁tool -1683 +▁stay -1684 +▁discuss -1685 +▁government -1686 +illed -1687 +aces -1688 +af -1689 +▁series -1690 +▁tem -1691 +ources -1692 +▁hig -1693 +▁priv -1694 +▁Bro -1695 +▁ste -1696 +▁technology -1697 +pro -1698 +▁install -1699 +cle -1700 +▁charact -1701 +▁Im -1702 +atural -1703 +▁typ -1704 +▁Ed -1705 +▁United -1706 +▁redu -1707 +▁beautiful -1708 +atic -1709 +▁By -1710 +▁ago -1711 +▁begin -1712 +aken -1713 +▁went -1714 +// -1715 +▁announ -1716 +org -1717 +▁thought -1718 +▁Pe -1719 +▁pick -1720 +▁told -1721 +▁hope -1722 +ancial -1723 +▁appear -1724 +isk -1725 +It -1726 +resent -1727 +▁anal -1728 +▁happen -1729 +anks -1730 +rew -1731 +▁Gr -1732 +▁Em -1733 +irm -1734 +▁break -1735 +▁wind -1736 +ille -1737 +▁questions -1738 +resh -1739 +OR -1740 +▁York -1741 +▁x -1742 +▁Qu -1743 +come -1744 +▁Pre -1745 +▁content -1746 +▁certain -1747 +▁Add -1748 +oll -1749 +▁everything -1750 +▁prep -1751 +ourn -1752 +:// -1753 +hers -1754 +▁sn -1755 +ians -1756 +irt -1757 +gle -1758 +▁companies -1759 +▁field -1760 +▁travel -1761 +ony -1762 +▁Cal -1763 +▁enc -1764 +▁recom -1765 +▁single -1766 +▁known -1767 +▁added -1768 +▁favor -1769 +▁media -1770 +cell -1771 +▁-- -1772 +▁building -1773 +arning -1774 +▁manag -1775 +▁Park -1776 +aps -1777 +▁search -1778 +▁environment -1779 +▁friend -1780 +▁actually -1781 +aur -1782 +▁address -1783 +ief -1784 +▁tot -1785 +▁ener -1786 +de -1787 +▁mess -1788 +▁study -1789 +eral -1790 +▁vol -1791 +▁press -1792 +▁tax -1793 +▁problem -1794 +play -1795 +isc -1796 +▁later -1797 +▁connect -1798 +ino -1799 +▁works -1800 +ests -1801 +▁Sm -1802 +▁girl -1803 +icy -1804 +▁improve -1805 +gest -1806 +acy -1807 +ibr -1808 +▁taking -1809 +ew -1810 +▁South -1811 +▁maint -1812 +▁ident -1813 +▁sound -1814 +ental -1815 +▁pub -1816 +lebr -1817 +ural -1818 +year -1819 +▁Su -1820 +ided -1821 +▁track -1822 +▁training -1823 +▁results -1824 +▁watch -1825 +ster -1826 +▁staff -1827 +▁card -1828 +abor -1829 +▁wond -1830 +▁North -1831 +▁face -1832 
[vocab listing continues: entries −1833 through roughly −6907 of the SP10240 BPE vocab, one tab-separated `piece<TAB>score` pair per line in the shipped file, with the score column descending in merge order and `▁` marking a word-initial piece; the full id-ordered table ships as `tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.vocab` (and `tokenizer/fineweb_10240_bpe.vocab` for the base model)]
-6908 +▁Mir -6909 +▁planet -6910 +bles -6911 +resents -6912 +Ad -6913 +▁technique -6914 +cknow -6915 +▁enjoying -6916 +▁listing -6917 +esides -6918 +▁guidelines -6919 +▁concert -6920 +rowse -6921 +▁directed -6922 +▁vast -6923 +▁dent -6924 +▁injured -6925 +arters -6926 +▁hosted -6927 +▁interface -6928 +▁execut -6929 +▁ast -6930 +▁Conf -6931 +▁LA -6932 +▁Rod -6933 +▁authors -6934 +▁spark -6935 +▁garage -6936 +▁hospit -6937 +uration -6938 +▁memories -6939 +rich -6940 +▁contrast -6941 +▁aside -6942 +▁equipped -6943 +▁volunteers -6944 +▁Ron -6945 +ardens -6946 +sey -6947 +▁Ur -6948 +▁normally -6949 +ppy -6950 +▁estimated -6951 +▁promise -6952 +▁firms -6953 +▁:) -6954 +▁dreams -6955 +▁Happy -6956 +▁Republican -6957 +▁Pow -6958 +▁pin -6959 +▁trig -6960 +onym -6961 +▁Jac -6962 +▁warn -6963 +▁trick -6964 +hot -6965 +▁phase -6966 +▁rice -6967 +▁depress -6968 +▁Remember -6969 +▁urban -6970 +▁Being -6971 +By -6972 +▁Quality -6973 +▁illness -6974 +iger -6975 +▁agents -6976 +▁Justice -6977 +▁prove -6978 +ba -6979 +▁consistent -6980 +▁acid -6981 +▁dust -6982 +oty -6983 +▁spoke -6984 +▁Airport -6985 +▁Houston -6986 +▁organis -6987 +holders -6988 +▁pitch -6989 +▁pleasure -6990 +▁Bed -6991 +▁matches -6992 +▁arms -6993 +▁Medicine -6994 +aints -6995 +ults -6996 +%. -6997 +Bl -6998 +AA -6999 +▁Ide -7000 +▁Talk -7001 +▁Conc -7002 +▁portion -7003 +▁index -7004 +▁Line -7005 +▁chances -7006 +asant -7007 +ogether -7008 +▁Brazil -7009 +▁fasc -7010 +▁Fact -7011 +▁lapt -7012 +.' -7013 +icit -7014 +▁newly -7015 +▁Personal -7016 +▁dynamic -7017 +▁chose -7018 +▁objects -7019 +ensity -7020 +▁Carl -7021 +▁breath -7022 +▁finance -7023 +rm -7024 +▁Arizona -7025 +▁Asian -7026 +▁refund -7027 +▁Living -7028 +▁proof -7029 +eling -7030 +▁Prom -7031 +SC -7032 +▁Standard -7033 +▁seed -7034 +▁passing -7035 +▁continuing -7036 +But -7037 +▁Officer -7038 +▁visited -7039 +▁Give -7040 +▁drinking -7041 +▁represents -7042 +site -7043 +ership -7044 +▁iPad -7045 +cket -7046 +▁formed -7047 +▁mile -7048 +inois -7049 +pack -7050 +▁ultimate -7051 +▁storm -7052 +alle -7053 +▁Mill -7054 +▁border -7055 +▁roles -7056 +▁Estate -7057 +▁Brad -7058 +▁ceremony -7059 +▁forever -7060 +▁discussed -7061 +▁superv -7062 +▁MO -7063 +annels -7064 +▁Cru -7065 +iking -7066 +▁Las -7067 +▁approval -7068 +amber -7069 +▁Welcome -7070 +▁zone -7071 +▁id -7072 +▁Season -7073 +▁Army -7074 +▁Student -7075 +▁suc -7076 +she -7077 +▁gaming -7078 +▁recommendations -7079 +▁stim -7080 +▁dealing -7081 +▁exposure -7082 +adel -7083 +▁sending -7084 +ultural -7085 +stal -7086 +▁Oak -7087 +▁stake -7088 +▁Iran -7089 +▁Therefore -7090 +▁evol -7091 +▁phones -7092 +MC -7093 +anes -7094 +▁Kevin -7095 +▁Sav -7096 +▁teasp -7097 +▁capabilities -7098 +▁gallery -7099 +▁division -7100 +▁Webs -7101 +whel -7102 +uclear -7103 +amsung -7104 +Americ -7105 +▁boxes -7106 +▁downtown -7107 +▁saving -7108 +▁presents -7109 +▁holidays -7110 +▁collected -7111 +▁lawyer -7112 +respond -7113 +▁possibility -7114 +▁fairly -7115 +▁Again -7116 +▁pra -7117 +▁implementation -7118 +▁mand -7119 +▁vulner -7120 +iki -7121 +ainless -7122 +▁susp -7123 +▁ensuring -7124 +ja -7125 +▁hat -7126 +GA -7127 +▁permanent -7128 +aper -7129 +▁Choose -7130 +▁attractive -7131 +▁pharm -7132 +▁cookies -7133 +▁constit -7134 +▁smell -7135 +▁flash -7136 +▁Administration -7137 +▁industries -7138 +▁hidden -7139 +▁Site -7140 +▁tub -7141 +▁suggestions -7142 +ih -7143 +aste -7144 +▁scheme -7145 +▁porn -7146 +bro -7147 +▁trib -7148 +▁Experience -7149 +▁finds -7150 +▁Natural -7151 +lers -7152 +izer -7153 +wear -7154 +▁Brian -7155 +ione -7156 
+▁recognize -7157 +urse -7158 +missions -7159 +▁instrument -7160 +▁Express -7161 +RS -7162 +▁facts -7163 +▁Kenn -7164 +▁Ju -7165 +phy -7166 +▁heads -7167 +▁theory -7168 +▁vari -7169 +pot -7170 +▁priority -7171 +▁acknow -7172 +▁mainly -7173 +zes -7174 +lessly -7175 +▁($ -7176 +▁Meanwhile -7177 +Sc -7178 +▁legislation -7179 +▁Ros -7180 +▁Clin -7181 +▁Isl -7182 +▁Case -7183 +ffered -7184 +▁reader -7185 +rible -7186 +▁bodies -7187 +FA -7188 +▁butt -7189 +▁categories -7190 +▁Chall -7191 +▁liber -7192 +▁posting -7193 +▁mut -7194 +▁realized -7195 +▁Hollywood -7196 +anned -7197 +page -7198 +▁communications -7199 +▁Software -7200 +inson -7201 +▁solve -7202 +▁Vers -7203 +▁Own -7204 +▁Ba -7205 +▁personally -7206 +▁Secretary -7207 +▁bench -7208 +▁upgrade -7209 +▁garlic -7210 +allas -7211 +▁bars -7212 +▁Dun -7213 +da -7214 +▁Queen -7215 +boy -7216 +phones -7217 +▁bridge -7218 +▁Emer -7219 +Book -7220 +EA -7221 +▁incredibly -7222 +▁Stay -7223 +then -7224 +▁USB -7225 +▁ancient -7226 +▁Policy -7227 +▁Learning -7228 +CT -7229 +▁Create -7230 +▁tradition -7231 +▁reform -7232 +esy -7233 +▁|| -7234 +▁permission -7235 +▁Bang -7236 +stra -7237 +▁hole -7238 +ingu -7239 +▁tiss -7240 +▁Anal -7241 +osc -7242 +▁Prime -7243 +▁generate -7244 +▁Yet -7245 +ounce -7246 +odd -7247 +anny -7248 +▁Cand -7249 +▁rum -7250 +▁packages -7251 +▁CN -7252 +▁exec -7253 +▁copyright -7254 +▁calendar -7255 +tw -7256 +ials -7257 +▁handling -7258 +odge -7259 +▁substant -7260 +▁travell -7261 +▁pace -7262 +▁basketball -7263 +▁Hold -7264 +▁east -7265 +▁magic -7266 +parent -7267 +▁debate -7268 +▁claimed -7269 +▁raw -7270 +▁Additionally -7271 +OO -7272 +▁Level -7273 +▁victims -7274 +iti -7275 +That -7276 +▁clar -7277 +▁celebration -7278 +▁orange -7279 +▁programming -7280 +▁walked -7281 +▁doctors -7282 +▁Jr -7283 +▁MD -7284 +▁achieved -7285 +ulpt -7286 +▁fest -7287 +HA -7288 +▁giant -7289 +▁absor -7290 +▁Toronto -7291 +▁purchasing -7292 +▁forth -7293 +▁cotton -7294 +▁habit -7295 +onna -7296 +▁replaced -7297 +▁prospect -7298 +▁Cro -7299 +▁Stan -7300 +▁Film -7301 +▁bare -7302 +uls -7303 +burgh -7304 +▁explains -7305 +▁fifth -7306 +▁tooth -7307 +▁Illinois -7308 +▁desired -7309 +CD -7310 +level -7311 +▁Studies -7312 +zing -7313 +isa -7314 +▁king -7315 +▁manufacturers -7316 +▁Tool -7317 +▁titles -7318 +▁gym -7319 +▁spots -7320 +▁Dar -7321 +▁saved -7322 +▁seasons -7323 +▁Auto -7324 +▁marked -7325 +▁somewhere -7326 +▁insight -7327 +season -7328 +▁Consult -7329 +▁proposal -7330 +▁cuts -7331 +▁marks -7332 +▁hotels -7333 +▁initiative -7334 +▁feelings -7335 +uster -7336 +▁venue -7337 +▁slowly -7338 +▁singer -7339 +▁suffering -7340 +▁specialist -7341 +RL -7342 +▁Produ -7343 +ila -7344 +▁Catholic -7345 +▁expressed -7346 +▁NFL -7347 +▁Story -7348 +▁Capital -7349 +▁Irish -7350 +▁compat -7351 +▁requests -7352 +▁drinks -7353 +▁Material -7354 +imize -7355 +App -7356 +▁architecture -7357 +iot -7358 +▁vegetables -7359 +▁Sep -7360 +▁Agency -7361 +acon -7362 +igate -7363 +▁Save -7364 +aters -7365 +esh -7366 +aron -7367 +▁buyers -7368 +▁Joseph -7369 +▁merch -7370 +▁volunteer -7371 +▁impossible -7372 +▁gay -7373 +▁exceptional -7374 +▁Liber -7375 +▁stuck -7376 +▁Table -7377 +stream -7378 +▁meets -7379 +▁enables -7380 +▁swimming -7381 +▁combine -7382 +inton -7383 +▁broke -7384 +▁murder -7385 +bridge -7386 +▁publication -7387 +▁announcement -7388 +▁destroy -7389 +▁extension -7390 +▁ultimately -7391 +▁tie -7392 +ylvan -7393 +▁causing -7394 +▁enem -7395 +▁consultation -7396 +VER -7397 +▁encouraged -7398 +▁reducing -7399 +▁Mess -7400 +▁Pakistan -7401 +regon -7402 
+▁accomplish -7403 +▁err -7404 +▁muscle -7405 +ologist -7406 +nesota -7407 +▁split -7408 +▁packaging -7409 +▁yard -7410 +▁Pu -7411 +▁Mix -7412 +▁surprising -7413 +▁lets -7414 +▁publ -7415 +ickets -7416 +aid -7417 +▁Short -7418 +▁Bell -7419 +▁magn -7420 +▁Vegas -7421 +▁Map -7422 +▁actor -7423 +▁Autom -7424 +▁Would -7425 +▁printing -7426 +▁engaged -7427 +▁rig -7428 +▁enterprise -7429 +▁pit -7430 +▁heav -7431 +▁describe -7432 +lements -7433 +▁Camer -7434 +▁massage -7435 +▁pricing -7436 +run -7437 +apore -7438 +▁DI -7439 +▁electrical -7440 +des -7441 +bel -7442 +aska -7443 +▁Motor -7444 +▁noise -7445 +▁Location -7446 +▁widely -7447 +▁preparation -7448 +▁mood -7449 +▁Kids -7450 +▁seeds -7451 +ifer -7452 +▁reasonable -7453 +▁Pen -7454 +▁talked -7455 +▁blocks -7456 +▁covering -7457 +▁enroll -7458 +▁performances -7459 +▁Labor -7460 +▁Spain -7461 +ns -7462 +▁breaking -7463 +▁pill -7464 +▁expansion -7465 +▁recognition -7466 +bell -7467 +▁framework -7468 +olis -7469 +▁wins -7470 +▁default -7471 +▁Recent -7472 +▁overwhel -7473 +eah -7474 +▁genuine -7475 +▁remark -7476 +▁traveling -7477 +▁Forest -7478 +▁seats -7479 +▁blank -7480 +rage -7481 +▁classroom -7482 +RC -7483 +wan -7484 +▁agric -7485 +▁knock -7486 +uct -7487 +cons -7488 +inator -7489 +▁concrete -7490 +▁interactive -7491 +▁CB -7492 +▁Theatre -7493 +▁neighb -7494 +▁Ira -7495 +▁Ess -7496 +▁pand -7497 +iler -7498 +)|| -7499 +▁Adam -7500 +▁unw -7501 +▁Gallery -7502 +▁Studio -7503 +▁compr -7504 +▁Pin -7505 +▁birds -7506 +▁formal -7507 +▁Force -7508 +▁dishes -7509 +wich -7510 +▁Band -7511 +▁Ice -7512 +▁resistance -7513 +▁Memorial -7514 +▁writers -7515 +▁franch -7516 +▁Rs -7517 +▁Toy -7518 +▁Following -7519 +▁gall -7520 +▁empty -7521 +▁spray -7522 +gypt -7523 +▁brilliant -7524 +▁consists -7525 +ulum -7526 +▁constant -7527 +▁staying -7528 +▁increasingly -7529 +▁scenes -7530 +proof -7531 +▁compliance -7532 +▁Square -7533 +▁incorpor -7534 +▁acting -7535 +▁resulting -7536 +▁Mrs -7537 +EP -7538 +▁Annual -7539 +▁duty -7540 +▁Davis -7541 +▁suggests -7542 +▁pic -7543 +▁dad -7544 +▁recover -7545 +▁managers -7546 +nda -7547 +▁experiment -7548 +▁Fred -7549 +ludes -7550 +▁Member -7551 +▁basically -7552 +▁Treat -7553 +▁spiritual -7554 +axy -7555 +ateful -7556 +ding -7557 +▁Things -7558 +being -7559 +▁professor -7560 +▁Mah -7561 +▁Paper -7562 +ifies -7563 +▁Diego -7564 +▁anyway -7565 +▁mere -7566 +▁nights -7567 +▁bow -7568 +▁Spirit -7569 +child -7570 +books -7571 +▁graphics -7572 +▁Eric -7573 +leep -7574 +▁Dam -7575 +▁FL -7576 +otted -7577 +▁lists -7578 +▁Partners -7579 +▁slot -7580 +▁forecast -7581 +▁Jord -7582 +▁slic -7583 +▁Solutions -7584 +▁deck -7585 +▁pride -7586 +’, -7587 +▁scan -7588 +▁Samsung -7589 +abetes -7590 +▁Roman -7591 +▁prize -7592 +▁authority -7593 +▁Interest -7594 +rated -7595 +▁producing -7596 +▁Ly -7597 +ilton -7598 +▁Shipping -7599 +alo -7600 +irus -7601 +▁centers -7602 +una -7603 +▁packed -7604 +▁clicking -7605 +▁Model -7606 +▁Gro -7607 +▁Seattle -7608 +▁wireless -7609 +erate -7610 +alse -7611 +▁everywhere -7612 +▁Books -7613 +▁aims -7614 +▁legend -7615 +▁enthusi -7616 +acle -7617 +ghan -7618 +▁Golden -7619 +▁expenses -7620 +▁whenever -7621 +▁Minnesota -7622 +▁Pur -7623 +vas -7624 +ashes -7625 +▁Age -7626 +▁indeed -7627 +▁Limited -7628 +▁healing -7629 +utional -7630 +▁closing -7631 +▁talented -7632 +▁Cover -7633 +▁interpret -7634 +▁succeed -7635 +▁inner -7636 +inding -7637 +▁anniversary -7638 +▁singles -7639 +▁involves -7640 +rome -7641 +▁Lew -7642 +making -7643 +▁Swed -7644 +▁unex -7645 +▁Sound -7646 +ls -7647 +heast -7648 +▁connections 
-7649 +▁riding -7650 +▁pocket -7651 +▁channels -7652 +pends -7653 +▁obtained -7654 +▁GM -7655 +▁narr -7656 +▁founder -7657 +▁OK -7658 +▁vice -7659 +▁displayed -7660 +▁Magazine -7661 +▁Dream -7662 +ylvania -7663 +▁Perhaps -7664 +▁Customer -7665 +▁bunch -7666 +▁Total -7667 +▁assum -7668 +▁opens -7669 +▁delivering -7670 +greg -7671 +▁Month -7672 +▁Collection -7673 +▁Dallas -7674 +▁struggle -7675 +▁Bad -7676 +▁lemon -7677 +▁trips -7678 +▁designers -7679 +Press -7680 +ureau -7681 +▁Based -7682 +▁attrib -7683 +▁Steel -7684 +stein -7685 +▁Working -7686 +▁acts -7687 +▁differences -7688 +▁driven -7689 +lder -7690 +▁ending -7691 +▁Station -7692 +▁Pict -7693 +▁CP -7694 +abeth -7695 +nders -7696 +ronics -7697 +▁Mother -7698 +▁defined -7699 +▁complim -7700 +▁watched -7701 +▁improvements -7702 +coin -7703 +▁Cloud -7704 +▁mob -7705 +▁primarily -7706 +▁Action -7707 +bits -7708 +▁vintage -7709 +▁CL -7710 +▁loving -7711 +sters -7712 +▁boss -7713 +▁gender -7714 +▁Oregon -7715 +▁introduction -7716 +▁guaranteed -7717 +ambling -7718 +▁Dark -7719 +▁booking -7720 +▁returning -7721 +oom -7722 +▁Rub -7723 +▁Sym -7724 +▁SS -7725 +▁Rand -7726 +▁fits -7727 +▁sensitive -7728 +▁Eastern -7729 +▁shouldn -7730 +▁podcast -7731 +Fr -7732 +▁Everyone -7733 +▁apparently -7734 +▁politics -7735 +▁Anth -7736 +▁Base -7737 +▁precise -7738 +▁officially -7739 +pool -7740 +owa -7741 +oned -7742 +issions -7743 +▁Common -7744 +rive -7745 +▁Products -7746 +▁rug -7747 +▁Bru -7748 +▁alive -7749 +▁headed -7750 +AB -7751 +▁chopped -7752 +▁Return -7753 +su -7754 +iders -7755 +▁Miller -7756 +▁Spec -7757 +▁fing -7758 +▁unus -7759 +▁Jay -7760 +▁Blog -7761 +▁Change -7762 +▁narrow -7763 +▁protest -7764 +▁coat -7765 +▁highlights -7766 +▁trim -7767 +▁potentially -7768 +AND -7769 +▁honey -7770 +▁recre -7771 +▁shell -7772 +ailing -7773 +▁Transport -7774 +▁Austin -7775 +▁authentic -7776 +▁percentage -7777 +▁filling -7778 +▁maintaining -7779 +▁Capt -7780 +▁tape -7781 +▁lin -7782 +▁analyst -7783 +▁Cry -7784 +▁retirement -7785 +▁speaker -7786 +pson -7787 +▁crash -7788 +▁casual -7789 +atics -7790 +riers -7791 +▁Among -7792 +▁personality -7793 +▁assistant -7794 +▁Corporation -7795 +▁charity -7796 +▁acquis -7797 +▁scientists -7798 +jo -7799 +wart -7800 +▁Kingdom -7801 +▁resident -7802 +inent -7803 +▁Guard -7804 +lose -7805 +scribe -7806 +▁falling -7807 +▁plot -7808 +raid -7809 +▁DO -7810 +▁elev -7811 +pection -7812 +iac -7813 +▁opinions -7814 +onut -7815 +▁Iraq -7816 +▁bills -7817 +▁aircraft -7818 +▁licensed -7819 +▁Josh -7820 +ali -7821 +▁CR -7822 +▁Barb -7823 +▁strike -7824 +▁heading -7825 +▁naturally -7826 +ydney -7827 +acher -7828 +▁Dead -7829 +raction -7830 +▁consumption -7831 +▁renov -7832 +▁Nic -7833 +▁Sarah -7834 +▁tired -7835 +▁carrying -7836 +arliam -7837 +▁gentle -7838 +▁colours -7839 +▁Jewish -7840 +Cont -7841 +▁correspond -7842 +▁Egypt -7843 +▁obviously -7844 +▁preparing -7845 +▁functional -7846 +▁involving -7847 +asted -7848 +ilipp -7849 +▁Ap -7850 +▁Stephen -7851 +▁suffered -7852 +▁iOS -7853 +▁smartphone -7854 +▁NC -7855 +▁oz -7856 +▁Ven -7857 +teen -7858 +▁tap -7859 +▁stronger -7860 +▁intent -7861 +▁winners -7862 +osophy -7863 +▁controls -7864 +▁Singapore -7865 +athy -7866 +▁instant -7867 +▁CON -7868 +▁satisfaction -7869 +ockey -7870 +oices -7871 +▁communicate -7872 +▁integration -7873 +▁Golf -7874 +▁aggress -7875 +▁Jason -7876 +▁attending -7877 +▁colleagues -7878 +ilty -7879 +hour -7880 +▁statements -7881 +phia -7882 +▁lect -7883 +▁Jam -7884 +▁racing -7885 +ador -7886 +▁Dave -7887 +tics -7888 +anchester -7889 +▁rising -7890 +▁finger -7891 
+▁component -7892 +epend -7893 +▁minimal -7894 +▁gained -7895 +▁ski -7896 +▁tracking -7897 +uma -7898 +▁undert -7899 +▁Weight -7900 +▁Atlanta -7901 +▁Website -7902 +iley -7903 +terior -7904 +▁Creek -7905 +▁behalf -7906 +▁reduction -7907 +▁trailer -7908 +▁Past -7909 +arliament -7910 +▁bid -7911 +vard -7912 +Man -7913 +unes -7914 +tery -7915 +rovers -7916 +icted -7917 +lee -7918 +etch -7919 +▁checking -7920 +▁principles -7921 +▁Vice -7922 +▁superior -7923 +▁signing -7924 +▁dollar -7925 +▁focuses -7926 +▁tone -7927 +▁ED -7928 +▁displ -7929 +▁illegal -7930 +▁preferred -7931 +▁findings -7932 +▁Friends -7933 +▁deposit -7934 +inally -7935 +sylvania -7936 +avor -7937 +▁Bit -7938 +▁Lic -7939 +adelphia -7940 +lace -7941 +ocation -7942 +▁producer -7943 +▁Rog -7944 +▁Britain -7945 +▁Korea -7946 +ucks -7947 +ression -7948 +▁batt -7949 +▁intelligence -7950 +▁heating -7951 +▁enforcement -7952 +bow -7953 +intend -7954 +fall -7955 +aka -7956 +▁clubs -7957 +▁Image -7958 +▁regardless -7959 +inals -7960 +▁Pick -7961 +▁humans -7962 +amel -7963 +▁emails -7964 +▁divor -7965 +▁assault -7966 +brid -7967 +limited -7968 +▁Tony -7969 +▁Kansas -7970 +▁removal -7971 +▁participating -7972 +▁Columbia -7973 +haust -7974 +vi -7975 +▁; -7976 +▁newspaper -7977 +iously -7978 +gent -7979 +▁till -7980 +▁Marc -7981 +---- -7982 +▁array -7983 +▁Uk -7984 +▁Bridge -7985 +htt -7986 +zzle -7987 +▁anymore -7988 +makers -7989 +aine -7990 +▁flower -7991 +▁Winter -7992 +▁yield -7993 +▁Below -7994 +▁acceler -7995 +▁lifetime -7996 +cker -7997 +▁lesson -7998 +▁Has -7999 +ints -8000 +▁roads -8001 +UM -8002 +▁Hi -8003 +uten -8004 +▁usage -8005 +roke -8006 +▁compare -8007 +olit -8008 +▁loose -8009 +▁ham -8010 +▁Hun -8011 +▁entering -8012 +▁jewelry -8013 +▁Quick -8014 +▁Sem -8015 +▁Eliz -8016 +▁faced -8017 +▁Flow -8018 +▁stations -8019 +pir -8020 +ventional -8021 +▁passionate -8022 +▁divid -8023 +▁stable -8024 +▁conclus -8025 +▁Village -8026 +AV -8027 +ho -8028 +▁CC -8029 +▁ocean -8030 +ran -8031 +▁intense -8032 +giving -8033 +tail -8034 +▁Front -8035 +▁expression -8036 +mers -8037 +etary -8038 +▁Resources -8039 +▁Lady -8040 +▁introduce -8041 +▁poet -8042 +▁drama -8043 +▁recip -8044 +▁Break -8045 +▁rural -8046 +▁diseases -8047 +▁pandemic -8048 +onic -8049 +▁occurred -8050 +▁somewhat -8051 +swe -8052 +▁Kitchen -8053 +orous -8054 +▁string -8055 +pending -8056 +▁NEW -8057 +▁exception -8058 +▁warranty -8059 +aza -8060 +▁lie -8061 +▁Henry -8062 +▁Tenn -8063 +▁olive -8064 +▁careful -8065 +▁sharp -8066 +▁strange -8067 +▁workshops -8068 +▁Excell -8069 +▁oppon -8070 +▁damaged -8071 +▁viewing -8072 +▁BB -8073 +▁compete -8074 +▁strict -8075 +▁EN -8076 +▁baseball -8077 +▁Was -8078 +▁factory -8079 +oked -8080 +▁elegant -8081 +▁literally -8082 +▁solo -8083 +▁computers -8084 +itled -8085 +▁whilst -8086 +▁creativity -8087 +▁visible -8088 +winning -8089 +▁screens -8090 +▁Bol -8091 +▁era -8092 +▁association -8093 +▁referred -8094 +▁photographs -8095 +▁shoulder -8096 +▁EX -8097 +▁Sale -8098 +EST -8099 +oween -8100 +▁Lar -8101 +▁Due -8102 +▁mac -8103 +▁exceed -8104 +▁amounts -8105 +Up -8106 +▁Baby -8107 +▁attach -8108 +▁Without -8109 +▁DJ -8110 +▁visits -8111 +▁Impro -8112 +▁conflict -8113 +liers -8114 +▁Wilson -8115 +▁flying -8116 +▁depends -8117 +▁Ministry -8118 +inity -8119 +▁je -8120 +▁stead -8121 +▁SM -8122 +EG -8123 +▁limits -8124 +▁lady -8125 +walk -8126 +▁withd -8127 +▁twist -8128 +▁fraud -8129 +▁Process -8130 +▁SEO -8131 +▁Body -8132 +cycle -8133 +▁AB -8134 +▁Deb -8135 +▁smoke -8136 +that -8137 +acc -8138 +▁Edition -8139 +▁Steven -8140 +from -8141 
+▁Creative -8142 +▁Sug -8143 +▁Pennsylvania -8144 +▁engaging -8145 +▁Grow -8146 +▁artwork -8147 +▁chips -8148 +▁phr -8149 +▁coaching -8150 +▁drawn -8151 +▁signature -8152 +riculum -8153 +▁flexibility -8154 +Do -8155 +▁samples -8156 +▁developer -8157 +▁resort -8158 +▁minister -8159 +▁bact -8160 +▁Sydney -8161 +▁newsletter -8162 +achelor -8163 +icide -8164 +usive -8165 +▁Stone -8166 +▁Jordan -8167 +▁Bry -8168 +▁Single -8169 +▁golden -8170 +umer -8171 +▁speakers -8172 +▁Money -8173 +▁regions -8174 +▁Forum -8175 +▁Must -8176 +▁invite -8177 +▁accuracy -8178 +▁puts -8179 +▁setup -8180 +▁! -8181 +▁Perfect -8182 +pton -8183 +▁persons -8184 +awa -8185 +lier -8186 +yes -8187 +▁accused -8188 +▁fault -8189 +▁NAS -8190 +▁Fast -8191 +▁Mu -8192 +▁consideration -8193 +▁Tru -8194 +▁decent -8195 +unicip -8196 +▁Eth -8197 +▁Holy -8198 +▁electricity -8199 +▁anxiety -8200 +▁Bern -8201 +▁weren -8202 +▁Sche -8203 +▁sees -8204 +▁shirt -8205 +▁Kelly -8206 +obe -8207 +▁bold -8208 +▁participation -8209 +▁discl -8210 +Te -8211 +ubs -8212 +▁Staff -8213 +atile -8214 +▁farmers -8215 +▁focusing -8216 +▁Bush -8217 +▁wondering -8218 +▁Hills -8219 +▁Supreme -8220 +▁Ave -8221 +▁temperatures -8222 +PD -8223 +▁necessarily -8224 +month -8225 +▁Greg -8226 +anted -8227 +▁grateful -8228 +▁subsequ -8229 +▁virus -8230 +▁cabin -8231 +▁interviews -8232 +▁contributions -8233 +found -8234 +anguage -8235 +▁insights -8236 +▁Manchester -8237 +▁Philadelphia -8238 +uminum -8239 +mates -8240 +▁yoga -8241 +onsin -8242 +She -8243 +▁applying -8244 +▁boards -8245 +▁Histor -8246 +..... -8247 +▁fundra -8248 +▁loyal -8249 +▁rely -8250 +▁NO -8251 +▁lip -8252 +▁afraid -8253 +▁sculpt -8254 +isconsin -8255 +▁Harry -8256 +otive -8257 +▁HERE -8258 +▁Father -8259 +▁texture -8260 +mble -8261 +▁reaction -8262 +▁tor -8263 +▁Kind -8264 +▁seam -8265 +▁childhood -8266 +▁controlled -8267 +ifting -8268 +▁Details -8269 +▁Private -8270 +▁structures -8271 +▁stages -8272 +style -8273 +▁recon -8274 +▁raising -8275 +▁hung -8276 +▁cameras -8277 +▁compensation -8278 +▁vill -8279 +▁Hay -8280 +▁phen -8281 +▁lies -8282 +Me -8283 +▁brush -8284 +▁wanting -8285 +▁wake -8286 +▁Nut -8287 +bourne -8288 +▁blind -8289 +▁Reading -8290 +▁Scotland -8291 +▁promoting -8292 +▁ear -8293 +athered -8294 +acious -8295 +▁conserv -8296 +TER -8297 +▁weapons -8298 +▁personnel -8299 +▁checked -8300 +oken -8301 +oted -8302 +▁unknown -8303 +iary -8304 +▁struggling -8305 +▁Eag -8306 +▁cul -8307 +▁generated -8308 +▁writes -8309 +claim -8310 +onder -8311 +▁synt -8312 +▁zero -8313 +▁param -8314 +▁Flo -8315 +▁Comb -8316 +quis -8317 +▁clock -8318 +▁pig -8319 +ista -8320 +▁sour -8321 +▁Him -8322 +step -8323 +▁Tel -8324 +▁tweet -8325 +▁Everything -8326 +▁panels -8327 +▁Corn -8328 +including -8329 +▁arrival -8330 +▁Comments -8331 +▁Install -8332 +▁reveal -8333 +▁tied -8334 +▁commonly -8335 +▁Explore -8336 +▁yours -8337 +▁Trade -8338 +acular -8339 +CO -8340 +▁Fab -8341 +▁signal -8342 +▁mold -8343 +▁graphic -8344 +▁semi -8345 +▁Modern -8346 +▁myst -8347 +▁convenience -8348 +▁champion -8349 +▁recruit -8350 +▁teaspoon -8351 +▁Civil -8352 +▁territ -8353 +▁promotion -8354 +▁Matthew -8355 +high -8356 +▁vess -8357 +date -8358 +▁Ever -8359 +▁ly -8360 +▁Attorney -8361 +▁Core -8362 +aire -8363 +▁opposite -8364 +▁laptop -8365 +gon -8366 +▁chest -8367 +▁wave -8368 +▁Exp -8369 +▁controvers -8370 +▁Appro -8371 +▁Regional -8372 +Sp -8373 +▁comparison -8374 +▁warning -8375 +chers -8376 +wr -8377 +▁entrance -8378 +▁lake -8379 +▁exposed -8380 +▁Iowa -8381 +elled -8382 +▁harder -8383 +▁CS -8384 +ulating -8385 +▁PH -8386 
+▁borrow -8387 +▁Protection -8388 +▁elected -8389 +▁Muslim -8390 +obody -8391 +▁Arm -8392 +▁exploring -8393 +▁SE -8394 +▁Dim -8395 +mun -8396 +oof -8397 +▁criteria -8398 +▁Sport -8399 +sole -8400 +▁codes -8401 +▁wooden -8402 +▁donation -8403 +▁Victoria -8404 +▁gotten -8405 +▁remained -8406 +▁flag -8407 +▁Poly -8408 +'. -8409 +▁scoring -8410 +vironments -8411 +oda -8412 +▁escape -8413 +▁laid -8414 +▁Prim -8415 +edge -8416 +och -8417 +stract -8418 +▁languages -8419 +▁vine -8420 +sex -8421 +▁wal -8422 +eb -8423 +▁nuclear -8424 +▁asset -8425 +▁subscription -8426 +▁outcome -8427 +▁hanging -8428 +ellect -8429 +▁Bh -8430 +▁Patrick -8431 +ervations -8432 +▁Ont -8433 +▁Kong -8434 +▁Pitt -8435 +▁greatly -8436 +▁definition -8437 +▁Ox -8438 +▁thinks -8439 +▁diversity -8440 +sea -8441 +sized -8442 +▁Orange -8443 +▁sexy -8444 +gor -8445 +▁addresses -8446 +▁toile -8447 +▁painted -8448 +▁reaching -8449 +apping -8450 +▁Hawai -8451 +▁Philipp -8452 +week -8453 +▁Editor -8454 +▁unexpected -8455 +▁isol -8456 +▁pets -8457 +▁Elizabeth -8458 +▁scores -8459 +itution -8460 +war -8461 +hol -8462 +▁Published -8463 +▁repairs -8464 +roit -8465 +▁Description -8466 +▁Classic -8467 +▁Sex -8468 +▁union -8469 +angers -8470 +▁Prior -8471 +▁Features -8472 +▁shipped -8473 +▁Innov -8474 +▁sections -8475 +power -8476 +▁Psych -8477 +▁uncom -8478 +▁repeated -8479 +▁clip -8480 +▁Ah -8481 +words -8482 +▁admitted -8483 +▁durable -8484 +▁outcomes -8485 +▁export -8486 +▁demands -8487 +olec -8488 +ira -8489 +IGH -8490 +nce -8491 +oen -8492 +▁functionality -8493 +▁Wisconsin -8494 +▁Bowl -8495 +▁everybody -8496 +▁tsp -8497 +ivalent -8498 +▁complicated -8499 +char -8500 +▁sin -8501 +EW -8502 +▁plane -8503 +▁Lyn -8504 +screen -8505 +▁employers -8506 +▁Youth -8507 +▁fiber -8508 +▁ALL -8509 +▁tutorial -8510 +▁Cool -8511 +▁Assistant -8512 +▁UC -8513 +▁safely -8514 +▁concepts -8515 +▁twenty -8516 +rets -8517 +▁Manufact -8518 +▁struck -8519 +▁employer -8520 +▁representative -8521 +▁alarm -8522 +▁requirement -8523 +▁dough -8524 +▁toys -8525 +▁obst -8526 +▁networking -8527 +▁Hur -8528 +▁plain -8529 +▁contribution -8530 +▁Bat -8531 +▁tum -8532 +▁Bow -8533 +▁Gi -8534 +▁Collect -8535 +▁certificate -8536 +rug -8537 +▁Ten -8538 +▁Var -8539 +▁studying -8540 +▁revolution -8541 +▁Advis -8542 +▁Additional -8543 +▁Simply -8544 +▁evaluation -8545 +▁deploy -8546 +▁jail -8547 +erning -8548 +▁instruction -8549 +eping -8550 +▁Chapter -8551 +▁cooked -8552 +heim -8553 +▁Rick -8554 +▁completion -8555 +▁bands -8556 +▁Latin -8557 +▁Industry -8558 +▁newest -8559 +minute -8560 +▁beef -8561 +▁Exchange -8562 +▁matching -8563 +▁calm -8564 +ruption -8565 +▁tradem -8566 +▁temporary -8567 +arring -8568 +dale -8569 +ieties -8570 +rog -8571 +▁Blu -8572 +▁cher -8573 +▁possess -8574 +▁strongly -8575 +▁Indiana -8576 +▁Nav -8577 +▁Complete -8578 +▁Championship -8579 +▁guilty -8580 +▁initially -8581 +▁Halloween -8582 +▁bass -8583 +aver -8584 +▁deeply -8585 +▁hip -8586 +▁waters -8587 +▁arr -8588 +▁holes -8589 +▁maintained -8590 +▁Kar -8591 +▁Send -8592 +▁organisation -8593 +utter -8594 +▁Employ -8595 +lude -8596 +oz -8597 +pense -8598 +▁Reviews -8599 +▁screw -8600 +▁Glass -8601 +▁romantic -8602 +▁reception -8603 +▁Entertain -8604 +▁residence -8605 +▁hearts -8606 +▁releases -8607 +ocal -8608 +▁pizza -8609 +%, -8610 +▁Host -8611 +▁Method -8612 +▁hes -8613 +▁Currently -8614 +▁Cancer -8615 +▁requested -8616 +▁actress -8617 +ko -8618 +enities -8619 +Get -8620 +xic -8621 +erving -8622 +oir -8623 +▁salad -8624 +▁exhaust -8625 +▁voters -8626 +▁blogs -8627 +Tr -8628 +▁supplement -8629 +▁Viet 
-8630 +▁tablet -8631 +vals -8632 +▁Circ -8633 +▁satisfied -8634 +▁Independ -8635 +▁singing -8636 +▁subjects -8637 +▁approaches -8638 +▁Donald -8639 +▁haz -8640 +ounge -8641 +▁Finance -8642 +▁Days -8643 +▁coal -8644 +▁loud -8645 +▁sought -8646 +▁sorry -8647 +▁Brother -8648 +iche -8649 +IST -8650 +▁represented -8651 +▁Ak -8652 +▁Tam -8653 +▁boot -8654 +▁Nation -8655 +▁swing -8656 +▁Update -8657 +▁Content -8658 +▁southern -8659 +▁Questions -8660 +▁carpet -8661 +▁franchise -8662 +▁prime -8663 +▁Six -8664 +▁kil -8665 +▁bone -8666 +▁Ban -8667 +oore -8668 +▁Drag -8669 +▁Anton -8670 +▁athletes -8671 +▁III -8672 +▁Five -8673 +▁insert -8674 +▁convert -8675 +▁reject -8676 +▁slee -8677 +▁Sir -8678 +▁frustr -8679 +▁infection -8680 +▁Bud -8681 +▁automatic -8682 +▁Hong -8683 +▁stored -8684 +▁environments -8685 +▁cancell -8686 +▁flights -8687 +▁overse -8688 +▁Planning -8689 +▁Bath -8690 +▁tact -8691 +▁bull -8692 +▁Jess -8693 +oln -8694 +▁errors -8695 +▁inventory -8696 +▁Du -8697 +▁desktop -8698 +▁prayer -8699 +▁chef -8700 +▁hopefully -8701 +charge -8702 +▁sink -8703 +▁orient -8704 +▁Sony -8705 +▁dual -8706 +▁Round -8707 +▁investments -8708 +▁falls -8709 +▁unusual -8710 +▁comedy -8711 +agen -8712 +▁donations -8713 +izon -8714 +▁Wars -8715 +▁delivers -8716 +▁Self -8717 +▁BM -8718 +▁gap -8719 +▁Study -8720 +itis -8721 +▁pregnant -8722 +ificial -8723 +▁discussions -8724 +▁stamp -8725 +▁nose -8726 +▁accommodation -8727 +▁Ocean -8728 +link -8729 +▁ft -8730 +urious -8731 +▁initiatives -8732 +▁McG -8733 +▁false -8734 +▁stylish -8735 +▁transactions -8736 +aret -8737 +▁TR -8738 +▁granted -8739 +▁rapidly -8740 +ormal -8741 +▁Iss -8742 +▁transaction -8743 +▁Restaur -8744 +▁strengthen -8745 +▁Lewis -8746 +▁react -8747 +game -8748 +▁Univers -8749 +▁ladies -8750 +▁lens -8751 +▁worn -8752 +▁pregnancy -8753 +▁indoor -8754 +▁Tal -8755 +ART -8756 +▁experim -8757 +▁crypt -8758 +ji -8759 +▁directions -8760 +▁hiring -8761 +▁dispos -8762 +▁ET -8763 +oa -8764 +▁certification -8765 +oons -8766 +▁ensures -8767 +imp -8768 +▁finishing -8769 +Qu -8770 +▁Sciences -8771 +ingham -8772 +▁largely -8773 +▁Property -8774 +hire -8775 +▁resume -8776 +▁edit -8777 +via -8778 +▁Large -8779 +amental -8780 +Every -8781 +▁Exam -8782 +▁extraord -8783 +olly -8784 +▁deeper -8785 +▁therap -8786 +▁Major -8787 +▁Electric -8788 +▁acquired -8789 +▁Ren -8790 +▁hyp -8791 +lah -8792 +▁Benef -8793 +otes -8794 +▁surve -8795 +▁Saint -8796 +▁trusted -8797 +df -8798 +▁Style -8799 +▁Greek -8800 +▁catalog -8801 +▁cats -8802 +etts -8803 +▁restrictions -8804 +▁Prince -8805 +▁Fresh -8806 +▁ranging -8807 +▁Performance -8808 +▁branch -8809 +▁globe -8810 +rape -8811 +HL -8812 +▁Application -8813 +▁Less -8814 +▁aimed -8815 +uties -8816 +▁collections -8817 +ds -8818 +igious -8819 +piece -8820 +▁Michel -8821 +▁emerging -8822 +▁hitting -8823 +▁mild -8824 +▁WW -8825 +▁Democratic -8826 +▁underst -8827 +▁Delivery -8828 +▁Computer -8829 +▁correctly -8830 +Ne -8831 +ensus -8832 +▁makeup -8833 +world -8834 +▁HP -8835 +▁expanded -8836 +aug -8837 +elle -8838 +▁attitude -8839 +▁Architect -8840 +▁literature -8841 +orship -8842 +▁targeted -8843 +TE -8844 +▁workplace -8845 +▁Early -8846 +▁Comment -8847 +onymous -8848 +orses -8849 +▁finest -8850 +▁mistake -8851 +▁Course -8852 +▁gate -8853 +ffective -8854 +▁diamond -8855 +▁Op -8856 +▁contained -8857 +Hz -8858 +▁Barn -8859 +▁fabulous -8860 +ALL -8861 +agers -8862 +wall -8863 +▁Snow -8864 +▁Parent -8865 +isdom -8866 +▁Ross -8867 +▁rib -8868 +▁surge -8869 +▁tissue -8870 +▁Such -8871 +▁BBC -8872 +American -8873 +doors -8874 +▁alt -8875 +▁COM 
-8876 +IVE -8877 +▁tube -8878 +▁sed -8879 +▁hell -8880 +ceeds -8881 +▁homeown -8882 +▁tons -8883 +▁contracts -8884 +▁experiencing -8885 +▁Tro -8886 +▁Environmental -8887 +▁sne -8888 +achus -8889 +▁skilled -8890 +▁offensive -8891 +▁mask -8892 +nie -8893 +▁estimate -8894 +uana -8895 +▁weigh -8896 +bound -8897 +iffe -8898 +▁containing -8899 +▁appointed -8900 +▁sophist -8901 +▁Challenge -8902 +▁rolling -8903 +▁Section -8904 +UST -8905 +swered -8906 +▁refrig -8907 +▁Hom -8908 +▁Laure -8909 +▁bub -8910 +▁SA -8911 +▁themes -8912 +▁engines -8913 +▁Premier -8914 +▁demonstrate -8915 +▁suffer -8916 +▁sufficient -8917 +▁Entertainment -8918 +▁telephone -8919 +itate -8920 +tes -8921 +essee -8922 +▁weird -8923 +▁Moon -8924 +othe -8925 +▁editing -8926 +Free -8927 +▁container -8928 +rus -8929 +covery -8930 +▁Advent -8931 +viously -8932 +uters -8933 +han -8934 +▁personalized -8935 +▁stayed -8936 +PL -8937 +▁Os -8938 +▁subsid -8939 +▁equally -8940 +▁Text -8941 +UD -8942 +pay -8943 +▁Song -8944 +▁Celebr -8945 +▁assume -8946 +▁ministry -8947 +▁copies -8948 +▁workout -8949 +▁segment -8950 +▁Construction -8951 +▁formula -8952 +▁Integr -8953 +▁math -8954 +incoln -8955 +▁defensive -8956 +graduate -8957 +▁aged -8958 +▁floors -8959 +orneys -8960 +▁Side -8961 +▁photographer -8962 +core -8963 +▁responded -8964 +▁Server -8965 +▁roots -8966 +raine -8967 +▁Cooper -8968 +▁voted -8969 +aya -8970 +▁drives -8971 +▁Buff -8972 +▁sentence -8973 +ela -8974 +▁tours -8975 +▁Anderson -8976 +▁couples -8977 +▁recru -8978 +aga -8979 +▁Advanced -8980 +▁Events -8981 +▁Dur -8982 +▁BC -8983 +▁drain -8984 +▁utility -8985 +orms -8986 +▁breaks -8987 +▁Compet -8988 +▁outs -8989 +▁shapes -8990 +▁’ -8991 +▁indicate -8992 +▁purchases -8993 +▁Channel -8994 +Last -8995 +▁pilot -8996 +▁stood -8997 +▁viewed -8998 +▁beans -8999 +▁aest -9000 +cut -9001 +▁layout -9002 +▁inspect -9003 +▁nursing -9004 +▁Double -9005 +achusetts -9006 +icking -9007 +▁advantages -9008 +▁wise -9009 +▁stability -9010 +▁admit -9011 +▁Galaxy -9012 +▁WordPress -9013 +gi -9014 +▁wrap -9015 +▁Linux -9016 +otte -9017 +▁muscles -9018 +▁sear -9019 +||$ -9020 +▁Bring -9021 +▁streaming -9022 +▁Veter -9023 +From -9024 +▁ranked -9025 +inar -9026 +▁Lock -9027 +gree -9028 +▁Les -9029 +▁Allen -9030 +▁Laur -9031 +▁buff -9032 +▁powered -9033 +▁hosts -9034 +▁Tickets -9035 +▁Inn -9036 +ouses -9037 +▁campaigns -9038 +oli -9039 +▁displays -9040 +▁Iron -9041 +▁Along -9042 +▁Sant -9043 +▁outfit -9044 +home -9045 +▁Mind -9046 +▁Hair -9047 +▁religion -9048 +iour -9049 +▁align -9050 +▁keys -9051 +▁Ath -9052 +▁conscious -9053 +▁excite -9054 +▁amend -9055 +▁Shel -9056 +bered -9057 +▁leaf -9058 +▁explan -9059 +▁Detroit -9060 +▁grace -9061 +▁Grant -9062 +▁nutrition -9063 +▁Anthony -9064 +▁belt -9065 +▁Near -9066 +▁login -9067 +▁Mail -9068 +▁lowest -9069 +hell -9070 +▁emotions -9071 +▁File -9072 +▁equivalent -9073 +iscal -9074 +▁Gre -9075 +▁onion -9076 +▁studied -9077 +▁Football -9078 +SW -9079 +▁respectively -9080 +▁killing -9081 +▁IM -9082 +▁cam -9083 +▁Overall -9084 +▁Sweet -9085 +▁alert -9086 +▁Thus -9087 +▁Heat -9088 +▁accounting -9089 +▁neither -9090 +zed -9091 +▁peak -9092 +▁Final -9093 +inyl -9094 +▁productivity -9095 +▁Albert -9096 +looking -9097 +▁equity -9098 +utor -9099 +▁FR -9100 +▁fewer -9101 +▁statistics -9102 +▁Kir -9103 +▁evil -9104 +ESS -9105 +▁Marsh -9106 +cor -9107 +▁SD -9108 +▁organisations -9109 +▁Categ -9110 +▁Maryland -9111 +OG -9112 +ouri -9113 +▁Tree -9114 +▁junior -9115 +▁neut -9116 +▁god -9117 +rat -9118 +▁importantly -9119 +▁institution -9120 +▁Mini -9121 +ellite -9122 
+▁encounter -9123 +▁Mayor -9124 +Post -9125 +▁Ontario -9126 +▁Nig -9127 +▁Girls -9128 +▁Title -9129 +▁objective -9130 +lined -9131 +iatric -9132 +athon -9133 +▁respective -9134 +fire -9135 +opher -9136 +ervice -9137 +▁Denver -9138 +▁letting -9139 +cery -9140 +umni -9141 +undry -9142 +▁deserve -9143 +▁Records -9144 +▁locally -9145 +loved -9146 +ighth -9147 +▁Address -9148 +▁parks -9149 +Ab -9150 +▁Clinton -9151 +ECT -9152 +▁Register -9153 +▁comprom -9154 +▁Roll -9155 +▁URL -9156 +▁veteran -9157 +!!!! -9158 +▁dess -9159 +wa -9160 +▁compliment -9161 +▁Valent -9162 +▁babies -9163 +▁layers -9164 +▁characteristics -9165 +▁Index -9166 +▁Article -9167 +▁operated -9168 +email -9169 +▁earnings -9170 +▁Clark -9171 +▁banking -9172 +roduction -9173 +▁adopted -9174 +▁combat -9175 +▁maps -9176 +▁beneficial -9177 +KE -9178 +▁por -9179 +onda -9180 +▁suppliers -9181 +▁Always -9182 +criptions -9183 +▁Hu -9184 +▁entrepreneurs -9185 +▁Release -9186 +▁ru -9187 +▁votes -9188 +▁stocks -9189 +▁Polit -9190 +▁mistakes -9191 +▁Sure -9192 +▁losses -9193 +▁Coach -9194 +▁balls -9195 +fa -9196 +▁worried -9197 +py -9198 +ILL -9199 +▁Trail -9200 +▁possibilities -9201 +▁Birth -9202 +dri -9203 +▁Inv -9204 +pany -9205 +▁resid -9206 +▁patch -9207 +▁rubber -9208 +▁diabetes -9209 +▁Hart -9210 +ousing -9211 +▁Chem -9212 +▁Trad -9213 +lahoma -9214 +▁Craft -9215 +oding -9216 +FF -9217 +assion -9218 +▁hide -9219 +▁complement -9220 +▁AI -9221 +▁pursue -9222 +terest -9223 +▁spir -9224 +mate -9225 +▁Az -9226 +▁inspire -9227 +▁Rain -9228 +▁expanding -9229 +▁attempts -9230 +era -9231 +▁walks -9232 +▁Brew -9233 +▁refres -9234 +▁dies -9235 +ogen -9236 +▁graduated -9237 +oto -9238 +▁delighted -9239 +▁pul -9240 +overs -9241 +▁ownership -9242 +▁Speed -9243 +▁continu -9244 +BM -9245 +▁venture -9246 +▁acquisition -9247 +olen -9248 +▁intention -9249 +▁shortly -9250 +Pr -9251 +▁aer -9252 +▁prohib -9253 +▁Olympic -9254 +▁Massachusetts -9255 +grade -9256 +▁brothers -9257 +QL -9258 +▁auction -9259 +▁laser -9260 +▁abilities -9261 +abama -9262 +▁consequences -9263 +▁tender -9264 +▁vendors -9265 +▁aband -9266 +▁sevent -9267 +▁Da -9268 +▁picking -9269 +▁checks -9270 +▁spectacular -9271 +bie -9272 +ijuana -9273 +▁bomb -9274 +▁depression -9275 +cked -9276 +▁Prep -9277 +▁hunt -9278 +▁therm -9279 +dep -9280 +itches -9281 +▁stainless -9282 +angle -9283 +', -9284 +▁oral -9285 +▁distributed -9286 +▁crown -9287 +▁unif -9288 +▁ske -9289 +▁celebrated -9290 +▁implemented -9291 +▁Future -9292 +▁squad -9293 +▁occurs -9294 +▁Mort -9295 +▁bol -9296 +▁Schools -9297 +ceived -9298 +View -9299 +▁Ult -9300 +▁reviewed -9301 +▁secondary -9302 +▁suite -9303 +▁exercises -9304 +▁gathering -9305 +▁Gift -9306 +ii -9307 +▁Jane -9308 +▁pointed -9309 +▁consistently -9310 +berries -9311 +▁entitled -9312 +▁representatives -9313 +▁northern -9314 +▁reportedly -9315 +incess -9316 +▁vibrant -9317 +▁inspection -9318 +▁survive -9319 +▁chronic -9320 +overty -9321 +▁Pers -9322 +▁lawyers -9323 +▁strip -9324 +▁gig -9325 +▁Tell -9326 +▁Years -9327 +▁passes -9328 +▁Rober -9329 +▁instantly -9330 +▁Unt -9331 +▁Remove -9332 +through -9333 +▁retired -9334 +▁cater -9335 +unate -9336 +▁shr -9337 +▁vert -9338 +▁coaches -9339 +▁bes -9340 +▁enhanced -9341 +▁retailers -9342 +▁disaster -9343 +▁appreciated -9344 +store -9345 +▁deadline -9346 +shop -9347 +▁Stadium -9348 +▁Machine -9349 +▁Corp -9350 +▁investing -9351 +▁Bird -9352 +▁Tips -9353 +ortion -9354 +▁hydro -9355 +▁argument -9356 +▁Gun -9357 +▁gre -9358 +▁pip -9359 +▁transmission -9360 +aha -9361 +▁Wales -9362 +▁Anyone -9363 +▁brew -9364 +urities -9365 
+▁Tennessee -9366 +▁fiction -9367 +oust -9368 +▁Simon -9369 +▁curriculum -9370 +▁Better -9371 +▁ratio -9372 +gov -9373 +▁mountains -9374 +▁reply -9375 +abil -9376 +▁advertise -9377 +olar -9378 +mouth -9379 +▁universities -9380 +tical -9381 +▁sacr -9382 +▁soup -9383 +build -9384 +▁overnight -9385 +▁Sab -9386 +▁Beh -9387 +▁iconic -9388 +▁knee -9389 +mo -9390 +▁Unit -9391 +inst -9392 +▁tackle -9393 +mont -9394 +▁hyper -9395 +▁privile -9396 +▁thousand -9397 +▁suddenly -9398 +ussion -9399 +idal -9400 +▁Pear -9401 +▁Lincoln -9402 +ami -9403 +▁CT -9404 +erk -9405 +▁Wire -9406 +▁shed -9407 +▁restore -9408 +encing -9409 +Pe -9410 +▁impressed -9411 +▁concerning -9412 +amics -9413 +itty -9414 +▁followers -9415 +iah -9416 +▁amongst -9417 +▁cow -9418 +▁Commercial -9419 +ouver -9420 +▁Relations -9421 +▁Communications -9422 +sect -9423 +▁documentation -9424 +▁answered -9425 +▁philosophy -9426 +After -9427 +▁western -9428 +▁HR -9429 +▁offense -9430 +▁beautifully -9431 +▁decline -9432 +▁repeat -9433 +▁abroad -9434 +▁fundamental -9435 +▁circle -9436 +airy -9437 +▁extent -9438 +▁championship -9439 +owned -9440 +osit -9441 +▁heavily -9442 +▁explos -9443 +▁stom -9444 +eters -9445 +▁compatible -9446 +▁servers -9447 +FT -9448 +▁Jean -9449 +fish -9450 +▁Governor -9451 +Des -9452 +▁heritage -9453 +▁contributed -9454 +▁PL -9455 +apolis -9456 +Not -9457 +▁credits -9458 +▁Harris -9459 +▁ink -9460 +▁passengers -9461 +▁wildlife -9462 +▁interaction -9463 +amon -9464 +hyth -9465 +sis -9466 +▁quotes -9467 +inth -9468 +▁Wine -9469 +▁describes -9470 +rection -9471 +ugs -9472 +Ind -9473 +▁Kit -9474 +▁phenomen -9475 +▁Gas -9476 +▁Nature -9477 +▁producers -9478 +▁captured -9479 +▁Portland -9480 +▁Roy -9481 +▁beds -9482 +▁wheels -9483 +▁adequ -9484 +▁Springs -9485 +▁IC -9486 +▁gathered -9487 +pon -9488 +ellectual -9489 +gged -9490 +gorith -9491 +▁User -9492 +▁perm -9493 +▁semin -9494 +▁Delhi -9495 +▁handy -9496 +▁Turkey -9497 +▁discounts -9498 +gener -9499 +▁Burn -9500 +▁fruits -9501 +ustration -9502 +▁minds -9503 +▁currency -9504 +▁operational -9505 +▁Rail -9506 +▁salary -9507 +▁sponsored -9508 +▁dramatic -9509 +▁Making -9510 +▁applicable -9511 +▁Ast -9512 +▁assigned -9513 +▁Afghan -9514 +▁meters -9515 +▁Morgan -9516 +semble -9517 +▁Fashion -9518 +▁SO -9519 +▁generations -9520 +abling -9521 +▁Several -9522 +▁Updated -9523 +quarters -9524 +▁customized -9525 +▁excellence -9526 +▁overcome -9527 +▁periods -9528 +ICE -9529 +enth -9530 +▁mature -9531 +▁happiness -9532 +▁targets -9533 +ima -9534 +▁bedrooms -9535 +▁rated -9536 +▁Cart -9537 +▁Vit -9538 +▁Bab -9539 +▁Justin -9540 +▁portra -9541 +▁moisture -9542 +▁indicated -9543 +▁NBA -9544 +▁Represent -9545 +▁Altern -9546 +▁cann -9547 +▁chairman -9548 +▁Rap -9549 +▁continuous -9550 +▁Judge -9551 +FO -9552 +▁compact -9553 +▁Oklahoma -9554 +▁empower -9555 +▁template -9556 +▁divorce -9557 +iere -9558 +enna -9559 +▁Safe -9560 +▁representing -9561 +▁Results -9562 +▁dancing -9563 +▁Doctor -9564 +▁vanilla -9565 +ancouver -9566 +▁Cow -9567 +▁scratch -9568 +▁conversations -9569 +▁hospitals -9570 +▁alter -9571 +▁Alabama -9572 +olds -9573 +▁moon -9574 +Res -9575 +offs -9576 +▁accommodate -9577 +▁stack -9578 +▁sleeping -9579 +▁Tan -9580 +▁medication -9581 +▁Authority -9582 +▁meaningful -9583 +▁Dress -9584 +racy -9585 +▁sectors -9586 +▁actively -9587 +▁Easter -9588 +▁scar -9589 +▁lasting -9590 +▁harvest -9591 +▁Rh -9592 +▁observed -9593 +▁apple -9594 +▁Lat -9595 +rot -9596 +olid -9597 +▁terr -9598 +▁races -9599 +ibl -9600 +▁Lam -9601 +▁drum -9602 +uition -9603 +▁Balt -9604 +atal -9605 +▁Getting 
-9606 +ITY -9607 +iate -9608 +▁rehab -9609 +▁okay -9610 +orse -9611 +▁gambling -9612 +▁reserve -9613 +▁hub -9614 +▁voting -9615 +▁elections -9616 +May -9617 +▁nail -9618 +▁compan -9619 +▁excitement -9620 +▁impression -9621 +▁Coc -9622 +eman -9623 +▁cust -9624 +▁Salt -9625 +▁Dor -9626 +▁entries -9627 +works -9628 +rix -9629 +▁potatoes -9630 +▁wider -9631 +▁Chel -9632 +▁comic -9633 +▁shade -9634 +agan -9635 +itzer -9636 +▁ware -9637 +▁entertaining -9638 +▁Democrats -9639 +▁evolution -9640 +ropical -9641 +▁rescue -9642 +▁straw -9643 +▁bell -9644 +this -9645 +▁protocol -9646 +▁Cher -9647 +▁lun -9648 +▁supporters -9649 +▁smoking -9650 +ika -9651 +▁alike -9652 +▁Deep -9653 +▁pushed -9654 +▁Move -9655 +▁coconut -9656 +▁Affairs -9657 +igen -9658 +▁existence -9659 +▁Cash -9660 +▁Atlantic -9661 +oenix -9662 +▁Rout -9663 +aware -9664 +▁chip -9665 +▁hun -9666 +▁remem -9667 +ige -9668 +▁eliminate -9669 +DC -9670 +▁Andy -9671 +town -9672 +▁pushing -9673 +▁responses -9674 +▁Chairman -9675 +aux -9676 +▁pour -9677 +▁opposition -9678 +oper -9679 +▁Plant -9680 +▁heaven -9681 +▁stops -9682 +▁inquir -9683 +▁mirror -9684 +Welcome -9685 +▁developments -9686 +▁Navy -9687 +iameter -9688 +▁Utah -9689 +▁publishing -9690 +▁Holiday -9691 +opping -9692 +▁frozen -9693 +▁Imp -9694 +?! -9695 +▁Aqu -9696 +▁amenities -9697 +▁sheets -9698 +▁celebrating -9699 +▁substantial -9700 +▁Malays -9701 +▁objectives -9702 +▁generous -9703 +▁withdraw -9704 +▁accord -9705 +▁fired -9706 +▁Py -9707 +▁chemicals -9708 +was -9709 +▁viewers -9710 +▁Agr -9711 +▁Moore -9712 +▁Ped -9713 +▁financing -9714 +DP -9715 +▁processed -9716 +▁Budd -9717 +▁flaw -9718 +▁chairs -9719 +▁thoroughly -9720 +▁wound -9721 +▁completing -9722 +▁asks -9723 +Ed -9724 +vo -9725 +▁ceiling -9726 +▁Lane -9727 +▁relative -9728 +apped -9729 +▁robust -9730 +▁Virt -9731 +▁actors -9732 +▁Bron -9733 +▁regulatory -9734 +▁hunting -9735 +▁recall -9736 +boards -9737 +▁gal -9738 +attan -9739 +mart -9740 +▁surrounded -9741 +uild -9742 +children -9743 +▁physician -9744 +▁Besides -9745 +▁accompan -9746 +BO -9747 +▁instruments -9748 +▁lawn -9749 +rer -9750 +▁buttons -9751 +▁preview -9752 +▁extraordinary -9753 +▁Version -9754 +▁precious -9755 +▁Melbourne -9756 +▁protecting -9757 +▁diagnosis -9758 +▁engineers -9759 +▁emotion -9760 +▁folder -9761 +prof -9762 +▁relaxing -9763 +▁virtually -9764 +▁spokesman -9765 +ridge -9766 +papers -9767 +▁habits -9768 +Rep -9769 +▁fib -9770 +▁specialists -9771 +▁Works -9772 +maker -9773 +ez -9774 +▁courts -9775 +osa -9776 +▁Nothing -9777 +unicipal -9778 +▁Kentucky -9779 +▁Cub -9780 +otion -9781 +▁Mand -9782 +▁padd -9783 +▁caps -9784 +▁soccer -9785 +▁penalty -9786 +▁encouraging -9787 +elson -9788 +mann -9789 +▁oils -9790 +▁hal -9791 +ocument -9792 +▁Highway -9793 +▁fingers -9794 +▁exclusively -9795 +▁cooperation -9796 +elve -9797 +▁” -9798 +▁latter -9799 +▁Gall -9800 +▁prev -9801 +▁TX -9802 +▁Casino -9803 +▁Bureau -9804 +▁desper -9805 +organ -9806 +▁seal -9807 +▁proved -9808 +▁Mission -9809 +http -9810 +▁balanced -9811 +Us -9812 +▁theater -9813 +bean -9814 +▁designing -9815 +▁Original -9816 +▁evaluate -9817 +▁buyer -9818 +▁vulnerable -9819 +,' -9820 +▁libr -9821 +omed -9822 +▁surely -9823 +▁Missouri -9824 +▁sixth -9825 +▁backup -9826 +▁scope -9827 +▁ward -9828 +▁resulted -9829 +▁prevention -9830 +▁complaint -9831 +▁Answ -9832 +▁Legal -9833 +▁qualify -9834 +asty -9835 +▁module -9836 +break -9837 +▁dil -9838 +▁spell -9839 +▁fur -9840 +▁Jacob -9841 +▁Meeting -9842 +▁subscribe -9843 +▁Phill -9844 +▁seller -9845 +▁cabinet -9846 +▁snap -9847 +▁Dance -9848 +These 
-9849 +▁DNA -9850 +UC -9851 +iology -9852 +▁Economic -9853 +atern -9854 +▁da -9855 +▁bug -9856 +▁Kate -9857 +liances -9858 +▁Tai -9859 +▁tremend -9860 +PP -9861 +▁discovery -9862 +%) -9863 +▁listings -9864 +▁Chap -9865 +▁integrity -9866 +▁showcase -9867 +ACK -9868 +▁rarely -9869 +laim -9870 +isms -9871 +▁cinem -9872 +▁dismiss -9873 +eness -9874 +▁discrim -9875 +▁refe -9876 +▁resolve -9877 +▁Certified -9878 +▁toler -9879 +▁USD -9880 +?) -9881 +▁Gary -9882 +▁exch -9883 +▁musicians -9884 +nut -9885 +▁tennis -9886 +▁flavors -9887 +▁Moh -9888 +▁guides -9889 +upid -9890 +▁suspend -9891 +▁contents -9892 +▁divided -9893 +▁consulting -9894 +▁ -9895 +e -9896 +t -9897 +a -9898 +o -9899 +i -9900 +n -9901 +s -9902 +r -9903 +h -9904 +l -9905 +d -9906 +c -9907 +u -9908 +m -9909 +p -9910 +g -9911 +f -9912 +y -9913 +w -9914 +b -9915 +. -9916 +v -9917 +, -9918 +k -9919 +T -9920 +S -9921 +I -9922 +A -9923 +- -9924 +C -9925 +0 -9926 +M -9927 +1 -9928 +P -9929 +x -9930 +B -9931 +2 -9932 +W -9933 +D -9934 +R -9935 +E -9936 +H -9937 +F -9938 +’ -9939 +L -9940 +N -9941 +O -9942 +: -9943 +' -9944 +G -9945 +j -9946 +) -9947 +( -9948 +z -9949 +3 -9950 +5 -9951 +q -9952 +4 -9953 +U -9954 +" -9955 +9 -9956 +J -9957 +8 -9958 +6 -9959 +V -9960 +Y -9961 +K -9962 +| -9963 +7 -9964 +! -9965 +/ -9966 +“ -9967 +” -9968 +? -9969 +– -9970 +; -9971 +& -9972 +$ -9973 +Q -9974 +% -9975 +— -9976 +X -9977 +Z -9978 +* -9979 diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model new file mode 100644 index 0000000000..4ca84a37bc Binary files /dev/null and b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model differ diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.vocab b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.vocab new file mode 100644 index 0000000000..8981f56d2c --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.vocab @@ -0,0 +1,10240 @@ + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 +<0x00> 0 +<0x01> 0 +<0x02> 0 +<0x03> 0 +<0x04> 0 +<0x05> 0 +<0x06> 0 +<0x07> 0 +<0x08> 0 +<0x09> 0 +<0x0A> 0 +<0x0B> 0 +<0x0C> 0 +<0x0D> 0 +<0x0E> 0 +<0x0F> 0 +<0x10> 0 +<0x11> 0 +<0x12> 0 +<0x13> 0 +<0x14> 0 +<0x15> 0 +<0x16> 0 +<0x17> 0 +<0x18> 0 +<0x19> 0 +<0x1A> 0 +<0x1B> 0 +<0x1C> 0 +<0x1D> 0 +<0x1E> 0 +<0x1F> 0 +<0x20> 0 +<0x21> 0 +<0x22> 0 +<0x23> 0 +<0x24> 0 +<0x25> 0 +<0x26> 0 +<0x27> 0 +<0x28> 0 +<0x29> 0 +<0x2A> 0 +<0x2B> 0 +<0x2C> 0 +<0x2D> 0 +<0x2E> 0 +<0x2F> 0 +<0x30> 0 +<0x31> 0 +<0x32> 0 +<0x33> 0 +<0x34> 0 +<0x35> 0 +<0x36> 0 +<0x37> 0 +<0x38> 0 +<0x39> 0 +<0x3A> 0 +<0x3B> 0 +<0x3C> 0 +<0x3D> 0 +<0x3E> 0 +<0x3F> 0 +<0x40> 0 +<0x41> 0 +<0x42> 0 +<0x43> 0 +<0x44> 0 +<0x45> 0 +<0x46> 0 +<0x47> 0 +<0x48> 0 +<0x49> 0 +<0x4A> 0 +<0x4B> 0 +<0x4C> 0 +<0x4D> 0 +<0x4E> 0 +<0x4F> 0 +<0x50> 0 +<0x51> 0 +<0x52> 0 +<0x53> 0 +<0x54> 0 +<0x55> 0 +<0x56> 0 +<0x57> 0 +<0x58> 0 +<0x59> 0 +<0x5A> 0 +<0x5B> 0 +<0x5C> 0 +<0x5D> 0 +<0x5E> 0 +<0x5F> 0 +<0x60> 0 +<0x61> 0 +<0x62> 0 +<0x63> 0 +<0x64> 0 +<0x65> 0 +<0x66> 0 +<0x67> 
[… `.vocab` listing elided here for readability; the original diff adds one `piece<TAB>score` pair per line. This span continues the byte-fallback block (`<0x68>` through `<0xFF>`, all at score `0`) and then lists the learned BPE pieces scored by negative merge rank, starting `▁t	-0`, `▁a	-1`, `in	-2`, … and running down through `memorial	-5169`, with the listing continuing past the end of this excerpt …]
-5170 +▁chapter -5171 +▁compare -5172 +▁culture -5173 +▁cutting -5174 +▁explain -5175 +▁imagine -5176 +▁involve -5177 +▁learned -5178 +▁leather -5179 +▁markets -5180 +▁objects -5181 +▁offices -5182 +▁ordered -5183 +▁publish -5184 +▁seating -5185 +singapore -5186 +▁behavior -5187 +▁numerous -5188 +▁patients -5189 +▁regional -5190 +▁register -5191 +▁shooting -5192 +▁celebrate -5193 +▁concerned -5194 +▁emergency -5195 +▁featuring -5196 +▁guarantee -5197 +▁perfectly -5198 +▁purchased -5199 +▁accessible -5200 +▁appearance -5201 +▁commitment -5202 +▁components -5203 +▁discovered -5204 +▁healthcare -5205 +▁techniques -5206 +▁immediately -5207 +▁conversation -5208 +?" -5209 +ez -5210 +fu -5211 +—— -5212 +abc -5213 +abs -5214 +aks -5215 +atl -5216 +fac -5217 +hal -5218 +hus -5219 +kim -5220 +map -5221 +mid -5222 +mos -5223 +oga -5224 +raf -5225 +tro -5226 +▁aw -5227 +▁du -5228 +acle -5229 +bles -5230 +cape -5231 +dist -5232 +gent -5233 +hope -5234 +inem -5235 +ndom -5236 +path -5237 +rodr -5238 +rush -5239 +ryst -5240 +sher -5241 +talk -5242 +tell -5243 +unge -5244 +wipe -5245 +▁bub -5246 +▁pet -5247 +▁swe -5248 +ategy -5249 +cross -5250 +enjoy -5251 +iguez -5252 +innov -5253 +iters -5254 +orrow -5255 +poses -5256 +rings -5257 +sarah -5258 +trust -5259 +ushes -5260 +vegas -5261 +would -5262 +▁ages -5263 +▁arms -5264 +▁bear -5265 +▁frag -5266 +▁ment -5267 +▁mode -5268 +▁narr -5269 +▁push -5270 +▁rent -5271 +▁repe -5272 +▁sand -5273 +▁seed -5274 +▁shap -5275 +▁tech -5276 +▁toys -5277 +▁wash -5278 +choose -5279 +compet -5280 +crease -5281 +iamond -5282 +kansas -5283 +lauren -5284 +letter -5285 +minute -5286 +mobile -5287 +orough -5288 +return -5289 +single -5290 +ushing -5291 +▁draft -5292 +▁feels -5293 +▁holds -5294 +▁label -5295 +▁motiv -5296 +▁plate -5297 +▁raise -5298 +▁rapid -5299 +▁reven -5300 +▁shots -5301 +▁songs -5302 +▁steps -5303 +▁taxes -5304 +▁tight -5305 +▁worry -5306 +climate -5307 +collect -5308 +comfort -5309 +faction -5310 +ologies -5311 +station -5312 +studies -5313 +subscri -5314 +updated -5315 +windows -5316 +working -5317 +written -5318 +▁anyway -5319 +▁arrang -5320 +▁breath -5321 +▁caused -5322 +▁disapp -5323 +▁exceed -5324 +▁hoping -5325 +▁houses -5326 +▁jacket -5327 +▁leaves -5328 +▁notice -5329 +▁random -5330 +▁sector -5331 +▁spirit -5332 +▁square -5333 +▁stream -5334 +▁topics -5335 +▁tweets -5336 +▁victim -5337 +▁volume -5338 +▁weekly -5339 +carolina -5340 +ideswipe -5341 +iversary -5342 +minister -5343 +quarters -5344 +security -5345 +▁dropped -5346 +▁extract -5347 +▁herself -5348 +▁illegal -5349 +▁journey -5350 +▁leaders -5351 +▁prepare -5352 +▁primary -5353 +▁replace -5354 +▁sitting -5355 +▁storage -5356 +▁trailer -5357 +▁updated -5358 +▁updates -5359 +ferencing -5360 +hollywood -5361 +rodriguez -5362 +▁anywhere -5363 +▁attorney -5364 +▁distinct -5365 +▁facility -5366 +▁proposed -5367 +▁purposes -5368 +▁teachers -5369 +▁transfer -5370 +▁vacation -5371 +▁broadcast -5372 +▁confirmed -5373 +▁determine -5374 +▁encourage -5375 +▁exclusive -5376 +▁expertise -5377 +▁procedure -5378 +▁radiation -5379 +corporation -5380 +▁associated -5381 +▁enforcement -5382 +▁investigation -5383 +.| -5384 +uz -5385 +ano -5386 +asp -5387 +ban -5388 +bug -5389 +far -5390 +gie -5391 +hoo -5392 +nfl -5393 +oil -5394 +omy -5395 +tay -5396 +tit -5397 +asks -5398 +beck -5399 +cath -5400 +ctic -5401 +edia -5402 +ette -5403 +flat -5404 +hatt -5405 +ifan -5406 +lace -5407 +leon -5408 +lord -5409 +nels -5410 +oles -5411 +onym -5412 +osen -5413 +pens -5414 +pier -5415 +plit -5416 +prom -5417 +rink -5418 
+sony -5419 +stay -5420 +trom -5421 +yard -5422 +▁bat -5423 +▁ben -5424 +▁phr -5425 +▁pil -5426 +▁rom -5427 +▁squ -5428 +▁tap -5429 +ashed -5430 +being -5431 +clean -5432 +could -5433 +crete -5434 +cynth -5435 +diego -5436 +eless -5437 +estin -5438 +happy -5439 +hemat -5440 +italy -5441 +known -5442 +leges -5443 +level -5444 +major -5445 +mayor -5446 +ograp -5447 +oices -5448 +opera -5449 +prise -5450 +radio -5451 +ridge -5452 +route -5453 +still -5454 +total -5455 +udden -5456 +upper -5457 +uries -5458 +yanke -5459 +▁boys -5460 +▁cups -5461 +▁desk -5462 +▁glow -5463 +▁hang -5464 +▁inch -5465 +▁none -5466 +▁oven -5467 +▁rose -5468 +▁scen -5469 +▁scre -5470 +▁soph -5471 +▁spin -5472 +▁tail -5473 +▁tast -5474 +▁ware -5475 +advent -5476 +democr -5477 +energy -5478 +hattan -5479 +having -5480 +normal -5481 +taylor -5482 +update -5483 +wilson -5484 +▁analy -5485 +▁avail -5486 +▁blend -5487 +▁crazy -5488 +▁grown -5489 +▁honor -5490 +▁keeps -5491 +▁minor -5492 +▁names -5493 +▁panel -5494 +▁proof -5495 +▁radio -5496 +▁rings -5497 +▁swing -5498 +▁theme -5499 +▁trial -5500 +▁valid -5501 +▁wheel -5502 +▁worst -5503 +▁yards -5504 +athered -5505 +closure -5506 +cynthia -5507 +despite -5508 +fection -5509 +hanifan -5510 +isation -5511 +looking -5512 +project -5513 +quality -5514 +respond -5515 +▁agreed -5516 +▁blocks -5517 +▁cooper -5518 +▁driven -5519 +▁favour -5520 +▁fellow -5521 +▁govern -5522 +▁inform -5523 +▁injury -5524 +▁losing -5525 +▁mental -5526 +▁motion -5527 +▁pocket -5528 +▁seemed -5529 +▁spokes -5530 +▁statue -5531 +▁strike -5532 +▁tissue -5533 +▁upgrad -5534 +▁values -5535 +complete -5536 +football -5537 +intellig -5538 +position -5539 +standard -5540 +▁becomes -5541 +▁capture -5542 +▁clearly -5543 +▁confirm -5544 +▁flights -5545 +▁massage -5546 +▁maximum -5547 +▁playoff -5548 +▁premium -5549 +▁session -5550 +▁totally -5551 +▁turning -5552 +▁accounts -5553 +▁becoming -5554 +▁bringing -5555 +▁capacity -5556 +▁coverage -5557 +▁greatest -5558 +▁negative -5559 +▁recorded -5560 +▁sentence -5561 +▁separate -5562 +▁wildlife -5563 +elementary -5564 +physiology -5565 +▁collected -5566 +▁component -5567 +▁defensive -5568 +▁delicious -5569 +▁recognize -5570 +▁strategic -5571 +▁introduced -5572 +▁leadership -5573 +▁networking -5574 +▁previously -5575 +▁alternative -5576 +▁description -5577 +▁instructions -5578 +▁organizations -5579 +▁professionals -5580 +▁responsibility -5581 +cm -5582 +dc -5583 +ki -5584 +kr -5585 +azz -5586 +cop -5587 +eed -5588 +fay -5589 +jac -5590 +kat -5591 +pha -5592 +sab -5593 +sep -5594 +tes -5595 +uts -5596 +▁ly -5597 +..." 
-5598 +ampa -5599 +atab -5600 +balt -5601 +bean -5602 +beat -5603 +bird -5604 +chat -5605 +clin -5606 +data -5607 +heat -5608 +html -5609 +icit -5610 +inks -5611 +kent -5612 +mult -5613 +odge -5614 +oted -5615 +otes -5616 +pict -5617 +pper -5618 +rial -5619 +rics -5620 +song -5621 +tien -5622 +tube -5623 +utch -5624 +▁den -5625 +▁fis -5626 +▁flu -5627 +▁hip -5628 +▁kne -5629 +▁mur -5630 +▁odd -5631 +▁pal -5632 +▁pin -5633 +▁wra -5634 +adult -5635 +apers -5636 +cohol -5637 +eries -5638 +estic -5639 +ether -5640 +files -5641 +glass -5642 +iable -5643 +iders -5644 +imore -5645 +iples -5646 +marsh -5647 +oking -5648 +ospit -5649 +patch -5650 +rants -5651 +staff -5652 +sters -5653 +tampa -5654 +ulous -5655 +wards -5656 +worth -5657 +young -5658 +▁arts -5659 +▁buck -5660 +▁deer -5661 +▁dogs -5662 +▁edge -5663 +▁eleg -5664 +▁fear -5665 +▁hero -5666 +▁iron -5667 +▁kept -5668 +▁rank -5669 +▁rise -5670 +▁task -5671 +▁youn -5672 +ducing -5673 +econom -5674 +events -5675 +integr -5676 +ivered -5677 +iverse -5678 +ixture -5679 +mexico -5680 +orient -5681 +pected -5682 +profit -5683 +spaper -5684 +square -5685 +sylvan -5686 +wanted -5687 +▁agent -5688 +▁boxes -5689 +▁dream -5690 +▁drink -5691 +▁fraud -5692 +▁pitch -5693 +▁plant -5694 +▁recip -5695 +▁skill -5696 +▁teach -5697 +▁truck -5698 +account -5699 +african -5700 +britain -5701 +fairfax -5702 +italian -5703 +therapy -5704 +tribune -5705 +website -5706 +▁afraid -5707 +▁awards -5708 +▁battle -5709 +▁circum -5710 +▁conven -5711 +▁cotton -5712 +▁dating -5713 +▁earned -5714 +▁excess -5715 +▁forgot -5716 +▁freeze -5717 +▁illust -5718 +▁joined -5719 +▁league -5720 +▁licens -5721 +▁manage -5722 +▁perman -5723 +▁powder -5724 +▁racing -5725 +anzanite -5726 +download -5727 +hospital -5728 +medicine -5729 +response -5730 +▁arrived -5731 +▁classes -5732 +▁enjoyed -5733 +▁factors -5734 +▁fingers -5735 +▁focused -5736 +▁journal -5737 +▁license -5738 +▁offense -5739 +▁opinion -5740 +▁pleased -5741 +▁possess -5742 +▁seeking -5743 +▁therapy -5744 +▁treated -5745 +▁upgrade -5746 +▁virtual -5747 +▁willing -5748 +baltimore -5749 +manhattan -5750 +valentine -5751 +▁accepted -5752 +▁aircraft -5753 +▁aluminum -5754 +▁appeared -5755 +▁incident -5756 +▁managing -5757 +▁password -5758 +▁possibly -5759 +▁pregnant -5760 +▁properly -5761 +▁retailers -5762 +▁seriously -5763 +▁surprised -5764 +▁basketball -5765 +▁innovation -5766 +▁permission -5767 +▁philosophy -5768 +▁processing -5769 +▁television -5770 +▁university -5771 +professional -5772 +thanksgiving -5773 +▁advertising -5774 +▁celebration -5775 +▁engineering -5776 +▁established -5777 +▁experienced -5778 +▁independent -5779 +▁integration -5780 +▁photographs -5781 +▁publication -5782 +▁researchers -5783 +▁unemployment -5784 +communications -5785 +▁approximately -5786 +▁automatically -5787 +gb -5788 +gi -5789 +ja -5790 +sv -5791 +vp -5792 +wi -5793 +amm -5794 +awn -5795 +bes -5796 +cod -5797 +dam -5798 +ero -5799 +fla -5800 +gov -5801 +ini -5802 +jim -5803 +kar -5804 +llc -5805 +mov -5806 +pth -5807 +sym -5808 +tax -5809 +tic -5810 +ura -5811 +abet -5812 +aide -5813 +alty -5814 +aron -5815 +circ -5816 +disc -5817 +draw -5818 +ella -5819 +erce -5820 +erts -5821 +flet -5822 +fold -5823 +gall -5824 +hour -5825 +hous -5826 +ilty -5827 +imin -5828 +jenn -5829 +jess -5830 +mate -5831 +mith -5832 +ntic -5833 +obby -5834 +pool -5835 +prem -5836 +rict -5837 +rust -5838 +sche -5839 +such -5840 +term -5841 +tony -5842 +wrap -5843 +▁ast -5844 +▁ath -5845 +▁avi -5846 +▁bug -5847 +▁dom -5848 +▁dut -5849 +▁eth -5850 +▁ink -5851 +▁les -5852 
+▁lit -5853 +▁nur -5854 +▁ske -5855 +▁tab -5856 +▁tut -5857 +▁wel -5858 +aints -5859 +ament -5860 +athon -5861 +coast -5862 +gling -5863 +heart -5864 +icher -5865 +image -5866 +ishes -5867 +kevin -5868 +liter -5869 +olved -5870 +oming -5871 +poker -5872 +quick -5873 +sands -5874 +sandy -5875 +scale -5876 +sorry -5877 +troms -5878 +▁bare -5879 +▁bowl -5880 +▁burn -5881 +▁ends -5882 +▁fest -5883 +▁guid -5884 +▁keys -5885 +▁legs -5886 +▁mine -5887 +▁rule -5888 +▁scri -5889 +compan -5890 +father -5891 +igible -5892 +indust -5893 +intage -5894 +little -5895 +oregon -5896 +phones -5897 +stract -5898 +thread -5899 +▁brief -5900 +▁chief -5901 +▁conce -5902 +▁crack -5903 +▁fifth -5904 +▁filed -5905 +▁gradu -5906 +▁grass -5907 +▁hopes -5908 +▁judge -5909 +▁knock -5910 +▁locks -5911 +▁negot -5912 +▁puppy -5913 +▁repro -5914 +▁simpl -5915 +▁spark -5916 +▁trick -5917 +buffalo -5918 +control -5919 +edition -5920 +funeral -5921 +illiant -5922 +jackson -5923 +madison -5924 +oration -5925 +section -5926 +thentic -5927 +without -5928 +▁bacter -5929 +▁baking -5930 +▁bigger -5931 +▁chosen -5932 +▁closer -5933 +▁compar -5934 +▁repair -5935 +▁scheme -5936 +▁shirts -5937 +▁showed -5938 +▁struck -5939 +▁sudden -5940 +▁taxpay -5941 +▁unless -5942 +adelaide -5943 +explorer -5944 +location -5945 +magazine -5946 +medicare -5947 +register -5948 +robinson -5949 +▁actions -5950 +▁answers -5951 +▁command -5952 +▁coordin -5953 +▁couples -5954 +▁discrim -5955 +▁diverse -5956 +▁examine -5957 +▁falling -5958 +▁illness -5959 +▁reduced -5960 +▁testing -5961 +▁walking -5962 +alexander -5963 +chocolate -5964 +sideswipe -5965 +▁affected -5966 +▁civilian -5967 +▁designer -5968 +▁downtown -5969 +▁electric -5970 +▁intellig -5971 +▁memories -5972 +▁revealed -5973 +▁shopping -5974 +▁thorough -5975 +▁tomatoes -5976 +▁websites -5977 +▁apartment -5978 +▁architect -5979 +▁buildings -5980 +▁carefully -5981 +▁computers -5982 +▁immediate -5983 +▁mentioned -5984 +▁organized -5985 +▁tradition -5986 +▁absolutely -5987 +▁correspond -5988 +▁eventually -5989 +▁generation -5990 +▁operations -5991 +▁situations -5992 +▁communities -5993 +▁partnership -5994 +▁restaurants -5995 +▁participants -5996 +▁satisfaction -5997 +▁contributions -5998 +'. 
-5999 +); -6000 +mp -6001 +nc -6002 +nr -6003 +xz -6004 +arc -6005 +aro -6006 +ato -6007 +axy -6008 +bly -6009 +cel -6010 +coy -6011 +cue -6012 +dar -6013 +dur -6014 +hay -6015 +iat -6016 +ili -6017 +izz -6018 +jon -6019 +kal -6020 +lan -6021 +lat -6022 +ltd -6023 +mah -6024 +nda -6025 +pes -6026 +phi -6027 +pit -6028 +rab -6029 +ref -6030 +rot -6031 +sea -6032 +sue -6033 +swe -6034 +tem -6035 +url -6036 +wer -6037 +▁id -6038 +adhd -6039 +anim -6040 +bali -6041 +base -6042 +cain -6043 +chef -6044 +cles -6045 +code -6046 +conf -6047 +dark -6048 +farm -6049 +feed -6050 +gend -6051 +half -6052 +hell -6053 +icip -6054 +illy -6055 +ilst -6056 +isms -6057 +mers -6058 +navy -6059 +olec -6060 +owed -6061 +pret -6062 +priv -6063 +rena -6064 +sand -6065 +▁bid -6066 +▁cos -6067 +▁die -6068 +▁god -6069 +▁ran -6070 +▁raw -6071 +▁rob -6072 +▁sch -6073 +▁tie -6074 +▁zip -6075 +annie -6076 +break -6077 +camer -6078 +endar -6079 +focus -6080 +ideas -6081 +itude -6082 +jones -6083 +lying -6084 +shoot -6085 +sized -6086 +tours -6087 +ucial -6088 +until -6089 +usion -6090 +uster -6091 +write -6092 +yours -6093 +▁cere -6094 +▁emph -6095 +▁evac -6096 +▁flag -6097 +▁flex -6098 +▁glad -6099 +▁grew -6100 +▁hasn -6101 +▁hunt -6102 +▁lift -6103 +▁meal -6104 +▁pist -6105 +▁plot -6106 +▁rang -6107 +▁rear -6108 +▁scan -6109 +▁solo -6110 +▁soul -6111 +▁vent -6112 +▁vill -6113 +academ -6114 +around -6115 +astern -6116 +austin -6117 +finder -6118 +future -6119 +iously -6120 +living -6121 +mother -6122 +museum -6123 +nelson -6124 +othing -6125 +photos -6126 +romney -6127 +ronics -6128 +winter -6129 +xzibit -6130 +▁abuse -6131 +▁assem -6132 +▁avoid -6133 +▁balls -6134 +▁bunch -6135 +▁doubt -6136 +▁films -6137 +▁jewel -6138 +▁liber -6139 +▁liked -6140 +▁loves -6141 +▁meals -6142 +▁merch -6143 +▁molec -6144 +▁noted -6145 +▁risks -6146 +▁river -6147 +▁sides -6148 +▁spoke -6149 +▁tasks -6150 +affairs -6151 +atively -6152 +avenger -6153 +charlie -6154 +clusion -6155 +content -6156 +details -6157 +enities -6158 +itative -6159 +justice -6160 +ointing -6161 +perfect -6162 +ulating -6163 +winning -6164 +▁buying -6165 +▁contem -6166 +▁fields -6167 +▁forces -6168 +▁guilty -6169 +▁hospit -6170 +▁laptop -6171 +▁legend -6172 +▁linked -6173 +▁marked -6174 +▁output -6175 +▁paying -6176 +▁plugin -6177 +▁rating -6178 +▁riding -6179 +▁thinks -6180 +▁titles -6181 +▁united -6182 +▁videos -6183 +▁window -6184 +advanced -6185 +catholic -6186 +michigan -6187 +reserved -6188 +software -6189 +sylvania -6190 +▁alcohol -6191 +▁balance -6192 +▁brushes -6193 +▁capable -6194 +▁coaches -6195 +▁conclud -6196 +▁engaged -6197 +▁flowers -6198 +▁founder -6199 +▁hearing -6200 +▁lessons -6201 +▁matters -6202 +▁missing -6203 +▁musical -6204 +▁noticed -6205 +▁ongoing -6206 +▁patient -6207 +▁privacy -6208 +▁stadium -6209 +▁trouble -6210 +▁younger -6211 +assistant -6212 +corporate -6213 +emergency -6214 +professor -6215 +riculture -6216 +▁admitted -6217 +▁audience -6218 +▁believes -6219 +▁carrying -6220 +▁directed -6221 +▁explains -6222 +▁improved -6223 +▁matching -6224 +▁moisture -6225 +▁occasion -6226 +▁releases -6227 +▁requests -6228 +▁speakers -6229 +▁strategy -6230 +▁supports -6231 +▁valuable -6232 +astructure -6233 +collection -6234 +innovation -6235 +▁committed -6236 +▁continued -6237 +▁delivered -6238 +▁developer -6239 +▁displayed -6240 +▁extension -6241 +▁happening -6242 +▁otherwise -6243 +▁supported -6244 +▁technique -6245 +▁temporary -6246 +▁volunteer -6247 +▁workshops -6248 +environment -6249 +▁electronic -6250 +▁employment -6251 +▁resolution -6252 
+▁scientific -6253 +▁appointment -6254 +▁experiences -6255 +▁perspective -6256 +▁photography -6257 +▁regulations -6258 +▁capabilities -6259 +▁improvements -6260 +▁installation -6261 +▁institutions -6262 +▁manufacturer -6263 +▁presentation -6264 +▁registration -6265 +▁communications -6266 +bn -6267 +cc -6268 +fs -6269 +jr -6270 +xy -6271 +yl -6272 +,'' -6273 +apj -6274 +dge -6275 +dll -6276 +eve -6277 +fan -6278 +fat -6279 +gun -6280 +guy -6281 +had -6282 +hou -6283 +ila -6284 +iot -6285 +iti -6286 +jav -6287 +lib -6288 +pas -6289 +phy -6290 +pie -6291 +tel -6292 +yer -6293 +▁:) -6294 +alog -6295 +anna -6296 +anya -6297 +aret -6298 +arms -6299 +avia -6300 +bull -6301 +flow -6302 +fred -6303 +iere -6304 +iest -6305 +ifer -6306 +info -6307 +ippi -6308 +ivan -6309 +izon -6310 +mand -6311 +matt -6312 +mony -6313 +okay -6314 +poss -6315 +rait -6316 +roit -6317 +rosc -6318 +shot -6319 +shut -6320 +size -6321 +sold -6322 +spot -6323 +ucci -6324 +user -6325 +utah -6326 +vere -6327 +wash -6328 +wear -6329 +▁hat -6330 +▁nav -6331 +▁ped -6332 +▁sam -6333 +▁sea -6334 +▁tou -6335 +▁tun -6336 +▁voc -6337 +allen -6338 +among -6339 +ateur -6340 +bound -6341 +bruce -6342 +chall -6343 +folio -6344 +grant -6345 +guard -6346 +ician -6347 +idays -6348 +inals -6349 +index -6350 +jason -6351 +lando -6352 +lects -6353 +locks -6354 +motor -6355 +myers -6356 +somet -6357 +sport -6358 +stars -6359 +▁arth -6360 +▁bars -6361 +▁busy -6362 +▁coun -6363 +▁cuts -6364 +▁ease -6365 +▁hits -6366 +▁mill -6367 +▁nerv -6368 +▁pump -6369 +▁till -6370 +▁whom -6371 +arabic -6372 +charge -6373 +comput -6374 +contin -6375 +course -6376 +credit -6377 +eicher -6378 +execut -6379 +garden -6380 +hoover -6381 +ilting -6382 +intend -6383 +motion -6384 +number -6385 +pierre -6386 +pocket -6387 +proper -6388 +ternal -6389 +▁achie -6390 +▁beach -6391 +▁bills -6392 +▁coast -6393 +▁crash -6394 +▁deput -6395 +▁elbow -6396 +▁finds -6397 +▁flood -6398 +▁hedge -6399 +▁kills -6400 +▁lungs -6401 +▁owned -6402 +▁param -6403 +▁parks -6404 +▁posts -6405 +▁quilt -6406 +▁rated -6407 +▁smoke -6408 +▁sport -6409 +▁stuck -6410 +▁theft -6411 +▁ultra -6412 +▁upper -6413 +atabase -6414 +ception -6415 +certain -6416 +contest -6417 +highway -6418 +instead -6419 +iration -6420 +issippi -6421 +mediate -6422 +orlando -6423 +process -6424 +stances -6425 +weather -6426 +▁appeal -6427 +▁brands -6428 +▁bridge -6429 +▁butter -6430 +▁comedy -6431 +▁compos -6432 +▁covers -6433 +▁deaths -6434 +▁decade -6435 +▁emphas -6436 +▁enthus -6437 +▁faster -6438 +▁flavor -6439 +▁formed -6440 +▁height -6441 +▁lovely -6442 +▁murder -6443 +▁orders -6444 +▁rising -6445 +▁sevent -6446 +▁tested -6447 +mountain -6448 +official -6449 +presents -6450 +specific -6451 +▁accused -6452 +▁analyst -6453 +▁awarded -6454 +▁bedroom -6455 +▁chicken -6456 +▁curious -6457 +▁drawing -6458 +▁dynamic -6459 +▁follows -6460 +▁forever -6461 +▁formula -6462 +▁instant -6463 +▁joining -6464 +▁killing -6465 +▁mailing -6466 +▁mixture -6467 +▁organic -6468 +▁outdoor -6469 +▁posting -6470 +▁profits -6471 +▁returns -6472 +▁skilled -6473 +▁summary -6474 +▁tonight -6475 +▁trained -6476 +▁typical -6477 +currently -6478 +executive -6479 +including -6480 +questions -6481 +subscribe -6482 +▁accompan -6483 +▁champion -6484 +▁cleaning -6485 +▁concerns -6486 +▁eligible -6487 +▁entering -6488 +▁exposure -6489 +▁extended -6490 +▁fighting -6491 +▁identity -6492 +▁lifetime -6493 +▁narrator -6494 +▁painting -6495 +▁proposal -6496 +▁standing -6497 +▁teaspoon -6498 +protection -6499 +▁adventure -6500 +▁agreement -6501 +▁conducted 
-6502 +▁confident -6503 +▁copyright -6504 +▁expressed -6505 +▁favourite -6506 +▁hopefully -6507 +▁improving -6508 +▁influence -6509 +▁interests -6510 +▁newspaper -6511 +▁operation -6512 +▁producing -6513 +▁regularly -6514 +▁religious -6515 +▁somewhere -6516 +▁spokesman -6517 +▁touchdown -6518 +▁transform -6519 +▁worldwide -6520 +engineering -6521 +mississippi -6522 +▁apparently -6523 +▁candidates -6524 +▁constantly -6525 +▁mechanical -6526 +▁reportedly -6527 +▁possibility -6528 +▁surrounding -6529 +unfortunately -6530 +▁headquarters -6531 +▁circumstances -6532 +aza -6533 +bah -6534 +ceo -6535 +cru -6536 +eat -6537 +eds -6538 +esh -6539 +esy -6540 +hat -6541 +hiv -6542 +ipl -6543 +ktm -6544 +kyo -6545 +lik -6546 +nel -6547 +nob -6548 +ops -6549 +pci -6550 +rix -6551 +tag -6552 +tle -6553 +tog -6554 +uzz -6555 +won -6556 +yah -6557 +zip -6558 +abin -6559 +amic -6560 +anes -6561 +ardi -6562 +arth -6563 +atre -6564 +baby -6565 +bene -6566 +bern -6567 +bits -6568 +cash -6569 +cass -6570 +dall -6571 +days -6572 +easy -6573 +eliz -6574 +ells -6575 +elly -6576 +eter -6577 +etry -6578 +five -6579 +gard -6580 +gord -6581 +http -6582 +irty -6583 +jake -6584 +kids -6585 +kins -6586 +lane -6587 +mini -6588 +must -6589 +oria -6590 +prep -6591 +race -6592 +rams -6593 +rell -6594 +shaw -6595 +todd -6596 +tree -6597 +wine -6598 +▁cum -6599 +▁flo -6600 +▁gro -6601 +▁lie -6602 +▁lux -6603 +▁msa -6604 +▁pic -6605 +▁rac -6606 +▁rot -6607 +▁row -6608 +▁rub -6609 +▁sed -6610 +▁sew -6611 +▁tag -6612 +▁uns -6613 +asian -6614 +below -6615 +bible -6616 +chang -6617 +cycle -6618 +gross -6619 +guest -6620 +icate -6621 +idden -6622 +intel -6623 +intro -6624 +larry -6625 +magic -6626 +mally -6627 +mouth -6628 +owned -6629 +pages -6630 +poons -6631 +rible -6632 +rical -6633 +rison -6634 +roots -6635 +shore -6636 +short -6637 +steel -6638 +touch -6639 +▁adds -6640 +▁batt -6641 +▁blow -6642 +▁boss -6643 +▁bust -6644 +▁butt -6645 +▁dont -6646 +▁dram -6647 +▁dual -6648 +▁fits -6649 +▁furn -6650 +▁hall -6651 +▁hurt -6652 +▁loan -6653 +▁pigs -6654 +▁poly -6655 +▁pork -6656 +▁rend -6657 +▁seat -6658 +▁sequ -6659 +▁sexy -6660 +▁soil -6661 +▁spee -6662 +▁stim -6663 +▁surf -6664 +▁thus -6665 +advert -6666 +alanya -6667 +anyone -6668 +bedded -6669 +behold -6670 +browse -6671 +chrome -6672 +create -6673 +dallas -6674 +doctor -6675 +editor -6676 +edward -6677 +ellant -6678 +golden -6679 +imming -6680 +itable -6681 +junior -6682 +ockets -6683 +player -6684 +rating -6685 +secret -6686 +senior -6687 +spirit -6688 +stroke -6689 +studio -6690 +target -6691 +ulated -6692 +▁aside -6693 +▁bread -6694 +▁cable -6695 +▁chain -6696 +▁crime -6697 +▁humor -6698 +▁input -6699 +▁lived -6700 +▁lucky -6701 +▁north -6702 +▁prove -6703 +▁remov -6704 +▁stamp -6705 +▁veget -6706 +▁venue -6707 +alities -6708 +atlanta -6709 +batavia -6710 +coupons -6711 +detroit -6712 +diamond -6713 +emption -6714 +freedom -6715 +houston -6716 +iversal -6717 +journal -6718 +meeting -6719 +missour -6720 +patrick -6721 +profile -6722 +stadium -6723 +verizon -6724 +wyoming -6725 +yankees -6726 +▁aspect -6727 +▁assets -6728 +▁crisis -6729 +▁depart -6730 +▁drinks -6731 +▁eating -6732 +▁exclud -6733 +▁golden -6734 +▁guitar -6735 +▁latter -6736 +▁lesson -6737 +▁manner -6738 +▁msaexp -6739 +▁orient -6740 +▁picked -6741 +▁plants -6742 +▁purple -6743 +▁serves -6744 +▁shower -6745 +▁signal -6746 +▁struct -6747 +▁superv -6748 +▁taught -6749 +▁volunt -6750 +▁yellow -6751 +archives -6752 +comments -6753 +commerce -6754 +fletcher -6755 +learning -6756 +northern -6757 +olympics -6758 
+southern -6759 +▁amounts -6760 +▁authors -6761 +▁battery -6762 +▁desired -6763 +▁doctors -6764 +▁elected -6765 +▁elegant -6766 +▁enables -6767 +▁filling -6768 +▁freedom -6769 +▁hitting -6770 +▁housing -6771 +▁invited -6772 +▁minimum -6773 +▁moments -6774 +▁printed -6775 +▁proceed -6776 +▁reasons -6777 +▁recruit -6778 +▁texture -6779 +▁tourism -6780 +▁viewing -6781 +▁watched -6782 +▁writers -6783 +companies -6784 +▁argument -6785 +▁attached -6786 +▁aviation -6787 +▁bacteria -6788 +▁category -6789 +▁certific -6790 +▁clinical -6791 +▁consumer -6792 +▁election -6793 +▁episodes -6794 +▁generous -6795 +▁graduate -6796 +▁historic -6797 +▁hundreds -6798 +▁intended -6799 +▁ministry -6800 +▁necklace -6801 +▁northern -6802 +▁officers -6803 +▁packages -6804 +▁patterns -6805 +▁presents -6806 +▁profiles -6807 +▁recovery -6808 +▁restrict -6809 +▁reviewed -6810 +▁southern -6811 +▁supposed -6812 +▁surprise -6813 +▁survival -6814 +▁thousand -6815 +▁tutorial -6816 +australian -6817 +unintellig -6818 +▁allegedly -6819 +▁authentic -6820 +▁brilliant -6821 +▁committee -6822 +▁directory -6823 +▁everybody -6824 +▁increases -6825 +▁investors -6826 +▁ourselves -6827 +▁practical -6828 +▁proposals -6829 +▁architects -6830 +▁frequently -6831 +▁impressive -6832 +▁incredible -6833 +▁membership -6834 +▁represents -6835 +championship -6836 +construction -6837 +pennsylvania -6838 +▁discussions -6839 +▁interactive -6840 +▁entertaining -6841 +▁expectations -6842 +▁specifically -6843 +▁technologies -6844 +▁temperatures -6845 +unintelligible -6846 +▁relationships -6847 +hh -6848 +hz -6849 +mt -6850 +oa -6851 +pi -6852 +ami -6853 +avi -6854 +blu -6855 +doo -6856 +dry -6857 +era -6858 +few -6859 +fis -6860 +fit -6861 +hey -6862 +hig -6863 +hub -6864 +loo -6865 +mix -6866 +mun -6867 +oft -6868 +oks -6869 +spa -6870 +tha -6871 +tol -6872 +uis -6873 +umn -6874 +una -6875 +unk -6876 +utr -6877 +wee -6878 +▁na -6879 +▁|| -6880 +amaz -6881 +aska -6882 +asts -6883 +aved -6884 +avil -6885 +bott -6886 +bron -6887 +bush -6888 +carm -6889 +dave -6890 +dead -6891 +emet -6892 +eper -6893 +gate -6894 +golf -6895 +iran -6896 +izer -6897 +jere -6898 +kong -6899 +lete -6900 +lomb -6901 +lons -6902 +meet -6903 +mess -6904 +oked -6905 +onna -6906 +ores -6907 +orth -6908 +rawn -6909 +sams -6910 +save -6911 +ucks -6912 +viet -6913 +▁anx -6914 +▁cas -6915 +▁gal -6916 +▁hal -6917 +▁hur -6918 +▁obl -6919 +▁pix -6920 +▁pod -6921 +▁psy -6922 +▁riv -6923 +▁sac -6924 +▁tea -6925 +▁tex -6926 +▁tip -6927 +▁vag -6928 +▁vir -6929 +▁wal -6930 +abeth -6931 +alpha -6932 +anton -6933 +arian -6934 +attoo -6935 +bassy -6936 +brace -6937 +cemet -6938 +claim -6939 +creat -6940 +duces -6941 +ester -6942 +favor -6943 +girls -6944 +grass -6945 +gucci -6946 +iency -6947 +itage -6948 +judge -6949 +koner -6950 +krist -6951 +lined -6952 +maine -6953 +notes -6954 +ongue -6955 +plant -6956 +plays -6957 +posit -6958 +posts -6959 +prene -6960 +ution -6961 +▁argu -6962 +▁bail -6963 +▁bike -6964 +▁boat -6965 +▁cats -6966 +▁clip -6967 +▁dent -6968 +▁dess -6969 +▁heav -6970 +▁peak -6971 +▁tall -6972 +▁tear -6973 +▁tied -6974 +▁tons -6975 +▁whit -6976 +albion -6977 +angers -6978 +asting -6979 +bourne -6980 +celebr -6981 +clevel -6982 +cooper -6983 +denver -6984 +double -6985 +former -6986 +friger -6987 +histor -6988 +icking -6989 +inance -6990 +itched -6991 +kshire -6992 +lisher -6993 +marine -6994 +minnes -6995 +mistry -6996 +orders -6997 +points -6998 +policy -6999 +rovers -7000 +strong -7001 +tennes -7002 +terfly -7003 +thetic -7004 +ucking -7005 +walker -7006 +▁adapt -7007 
+▁advoc -7008 +▁alive -7009 +▁appet -7010 +▁bench -7011 +▁blogs -7012 +▁cemet -7013 +▁cloud -7014 +▁cryst -7015 +▁drops -7016 +▁faces -7017 +▁giant -7018 +▁guard -7019 +▁hyper -7020 +▁iscsi -7021 +▁knife -7022 +▁leads -7023 +▁lists -7024 +▁minim -7025 +▁patch -7026 +▁plain -7027 +▁pleas -7028 +▁quant -7029 +▁reput -7030 +▁rough -7031 +▁shock -7032 +▁stone -7033 +▁turns -7034 +▁wings -7035 +academy -7036 +brandon -7037 +channel -7038 +collins -7039 +comment -7040 +discuss -7041 +donnell -7042 +ighters -7043 +kingdom -7044 +konerko -7045 +leagues -7046 +powered -7047 +rations -7048 +records -7049 +reprene -7050 +samsung -7051 +scholar -7052 +schools -7053 +seattle -7054 +stephen -7055 +▁astron -7056 +▁athlet -7057 +▁belong -7058 +▁borrow -7059 +▁broken -7060 +▁browse -7061 +▁caring -7062 +▁checks -7063 +▁config -7064 +▁debate -7065 +▁decent -7066 +▁desire -7067 +▁detect -7068 +▁enable -7069 +▁extent -7070 +▁factor -7071 +▁famous -7072 +▁fiscal -7073 +▁geomet -7074 +▁gotten -7075 +▁narrow -7076 +▁native -7077 +▁packed -7078 +▁panels -7079 +▁phrase -7080 +▁planet -7081 +▁refund -7082 +▁script -7083 +▁soccer -7084 +▁spaces -7085 +▁speech -7086 +▁tattoo -7087 +▁trends -7088 +▁upload -7089 +▁wealth -7090 +▁writes -7091 +brooklyn -7092 +cemetery -7093 +lonsdale -7094 +ministry -7095 +missouri -7096 +ologists -7097 +shipping -7098 +whitaker -7099 +▁anybody -7100 +▁aspects -7101 +▁citizen -7102 +▁convent -7103 +▁coupled -7104 +▁entered -7105 +▁fitness -7106 +▁injured -7107 +▁islands -7108 +▁picking -7109 +▁raising -7110 +▁recover -7111 +▁retired -7112 +▁scholar -7113 +▁seasons -7114 +▁somehow -7115 +▁strange -7116 +▁stretch -7117 +▁vintage -7118 +▁warrant -7119 +▁washing -7120 +▁winners -7121 +▁witness -7122 +▁worried -7123 +adventure -7124 +cleveland -7125 +elizabeth -7126 +financial -7127 +following -7128 +friedberg -7129 +meanwhile -7130 +minnesota -7131 +resources -7132 +transport -7133 +▁attended -7134 +▁chapters -7135 +▁chemical -7136 +▁colleges -7137 +▁creation -7138 +▁disaster -7139 +▁discover -7140 +▁distance -7141 +▁ensuring -7142 +▁expenses -7143 +▁hardware -7144 +▁imported -7145 +▁medicine -7146 +▁occurred -7147 +▁pleasure -7148 +▁vehicles -7149 +▁visiting -7150 +▁wherever -7151 +convention -7152 +everything -7153 +▁addresses -7154 +▁admission -7155 +▁attending -7156 +▁breakfast -7157 +▁discovery -7158 +▁estimated -7159 +▁interface -7160 +▁languages -7161 +▁messaging -7162 +▁obviously -7163 +▁ordinance -7164 +▁photograp -7165 +▁providers -7166 +▁satisfied -7167 +▁searching -7168 +▁trademark -7169 +description -7170 +▁colleagues -7171 +▁contribute -7172 +▁determined -7173 +▁disappoint -7174 +▁guaranteed -7175 +▁historical -7176 +▁instrument -7177 +▁originally -7178 +▁respective -7179 +▁smartphone -7180 +▁supporting -7181 +▁surprising -7182 +▁thoroughly -7183 +▁volunteers -7184 +▁appreciated -7185 +▁considering -7186 +▁distributed -7187 +▁outstanding -7188 +▁availability -7189 +▁contemporary -7190 +▁intelligence -7191 +▁collaboration -7192 +▁significantly -7193 +▁infrastructure -7194 +▁transportation -7195 +lb -7196 +mg -7197 +pn -7198 +sg -7199 +vo -7200 +vs -7201 +▁? 
-7202 +alm -7203 +amb -7204 +aub -7205 +beh -7206 +blr -7207 +bud -7208 +cho -7209 +dol -7210 +dre -7211 +fri -7212 +gre -7213 +gsa -7214 +jam -7215 +lab -7216 +mir -7217 +mom -7218 +nap -7219 +nba -7220 +nbc -7221 +ora -7222 +owe -7223 +pak -7224 +pam -7225 +pul -7226 +shi -7227 +six -7228 +ski -7229 +uct -7230 +usb -7231 +wid -7232 +wow -7233 +zes -7234 +▁(“ -7235 +aded -7236 +anor -7237 +bang -7238 +bart -7239 +beth -7240 +born -7241 +brew -7242 +ciss -7243 +drop -7244 +enth -7245 +fish -7246 +gust -7247 +hong -7248 +icts -7249 +imag -7250 +java -7251 +jord -7252 +kath -7253 +lini -7254 +mann -7255 +mort -7256 +much -7257 +nabb -7258 +nice -7259 +onda -7260 +oral -7261 +osph -7262 +prin -7263 +rors -7264 +safe -7265 +sole -7266 +tang -7267 +tery -7268 +tter -7269 +uous -7270 +walt -7271 +whel -7272 +wire -7273 +yeah -7274 +▁gem -7275 +▁gre -7276 +▁hub -7277 +▁jer -7278 +▁lip -7279 +▁mal -7280 +▁mig -7281 +▁oxy -7282 +▁syn -7283 +aaron -7284 +acter -7285 +added -7286 +advis -7287 +agers -7288 +arena -7289 +astro -7290 +athan -7291 +baker -7292 +barry -7293 +billy -7294 +blood -7295 +defin -7296 +early -7297 +ender -7298 +fined -7299 +hills -7300 +icals -7301 +iesel -7302 +iture -7303 +large -7304 +lisle -7305 +maria -7306 +mouse -7307 +ortex -7308 +otted -7309 +paper -7310 +prior -7311 +prize -7312 +rates -7313 +rence -7314 +rolls -7315 +saint -7316 +scape -7317 +stern -7318 +terry -7319 +think -7320 +tokyo -7321 +treat -7322 +value -7323 +▁auto -7324 +▁bran -7325 +▁caut -7326 +▁coat -7327 +▁edit -7328 +▁fort -7329 +▁gang -7330 +▁glut -7331 +▁hack -7332 +▁hate -7333 +▁impl -7334 +▁milk -7335 +▁outs -7336 +▁peer -7337 +▁tiny -7338 +amazon -7339 +anyway -7340 +arctic -7341 +awards -7342 +cience -7343 +consin -7344 +easter -7345 +estate -7346 +galaxy -7347 +gments -7348 +gordon -7349 +images -7350 +jeremy -7351 +nascar -7352 +native -7353 +oosing -7354 +orable -7355 +orange -7356 +stream -7357 +summer -7358 +swered -7359 +unning -7360 +within -7361 +▁absor -7362 +▁advis -7363 +▁assum -7364 +▁bonus -7365 +▁boots -7366 +▁bound -7367 +▁broke -7368 +▁chart -7369 +▁crown -7370 +▁dance -7371 +▁divid -7372 +▁error -7373 +▁falls -7374 +▁fewer -7375 +▁firef -7376 +▁forth -7377 +▁globe -7378 +▁grade -7379 +▁horse -7380 +▁layer -7381 +▁mouse -7382 +▁nurse -7383 +▁ocean -7384 +▁opens -7385 +▁ratio -7386 +▁renew -7387 +▁retur -7388 +▁rider -7389 +▁rooms -7390 +▁ships -7391 +▁sight -7392 +▁sixth -7393 +▁split -7394 +▁tired -7395 +▁tours -7396 +▁trend -7397 +▁truth -7398 +▁votes -7399 +bradley -7400 +cooking -7401 +eichert -7402 +essions -7403 +itution -7404 +jection -7405 +leading -7406 +lington -7407 +natural -7408 +ocation -7409 +porters -7410 +supreme -7411 +version -7412 +▁acting -7413 +▁belief -7414 +▁circle -7415 +▁colour -7416 +▁dining -7417 +▁dishes -7418 +▁festiv -7419 +▁filter -7420 +▁finest -7421 +▁gained -7422 +▁garlic -7423 +▁grants -7424 +▁hearts -7425 +▁infect -7426 +▁loaded -7427 +▁mainly -7428 +▁margin -7429 +▁oxygen -7430 +▁pepper -7431 +▁phones -7432 +▁pounds -7433 +▁prayer -7434 +▁reform -7435 +▁relief -7436 +▁studio -7437 +▁symbol -7438 +▁tackle -7439 +carlisle -7440 +computer -7441 +consider -7442 +friendly -7443 +harrison -7444 +ications -7445 +jennifer -7446 +material -7447 +unicipal -7448 +waterloo -7449 +▁anticip -7450 +▁approve -7451 +▁cancell -7452 +▁circuit -7453 +▁claimed -7454 +▁consequ -7455 +▁conserv -7456 +▁context -7457 +▁convinc -7458 +▁creates -7459 +▁crucial -7460 +▁defined -7461 +▁disappe -7462 +▁discipl -7463 +▁dresses -7464 +▁drivers -7465 +▁exposed -7466 
+▁fishing -7467 +▁funeral -7468 +▁gallery -7469 +▁glasses -7470 +▁greatly -7471 +▁harness -7472 +▁holders -7473 +▁lesbian -7474 +▁letters -7475 +▁massive -7476 +▁nervous -7477 +▁passing -7478 +▁penalty -7479 +▁promise -7480 +▁pushing -7481 +▁ratings -7482 +▁regions -7483 +▁routine -7484 +▁satisfy -7485 +▁speaker -7486 +▁stopped -7487 +▁threats -7488 +▁twitter -7489 +▁visited -7490 +▁warning -7491 +▁wrapped -7492 +administr -7493 +authority -7494 +essential -7495 +highlands -7496 +wisconsin -7497 +▁accurate -7498 +▁answered -7499 +▁approval -7500 +▁cemetery -7501 +▁clicking -7502 +▁climbing -7503 +▁consists -7504 +▁division -7505 +▁examples -7506 +▁internal -7507 +▁keywords -7508 +▁minister -7509 +▁priority -7510 +▁refriger -7511 +▁remained -7512 +▁stations -7513 +▁suitable -7514 +▁universe -7515 +▁violence -7516 +registered -7517 +▁appellant -7518 +▁awareness -7519 +▁candidate -7520 +▁container -7521 +▁dangerous -7522 +▁destroyed -7523 +▁diagnosis -7524 +▁encounter -7525 +▁enjoyable -7526 +▁favorites -7527 +▁generated -7528 +▁hospitals -7529 +▁household -7530 +▁introduce -7531 +▁listening -7532 +▁offensive -7533 +▁permanent -7534 +▁realistic -7535 +▁recording -7536 +▁recycling -7537 +▁sensitive -7538 +▁suggested -7539 +▁tanzanite -7540 +▁traveling -7541 +▁whitening -7542 +▁wondering -7543 +partnership -7544 +performance -7545 +▁accomplish -7546 +▁appreciate -7547 +▁compliment -7548 +▁consistent -7549 +▁controlled -7550 +▁controvers -7551 +▁evacuation -7552 +▁innovative -7553 +▁possession -7554 +▁prevention -7555 +▁scientists -7556 +organization -7557 +▁authorities -7558 +▁challenging -7559 +▁cooperation -7560 +▁investments -7561 +▁preparation -7562 +▁represented -7563 +▁residential -7564 +▁conferencing -7565 +,' -7566 +.’ -7567 +eo -7568 +eu -7569 +ho -7570 +hr -7571 +jo -7572 +kl -7573 +lg -7574 +lr -7575 +ua -7576 +aer -7577 +aft -7578 +aka -7579 +alg -7580 +ati -7581 +bac -7582 +bea -7583 +bio -7584 +cro -7585 +dak -7586 +dev -7587 +due -7588 +gem -7589 +gib -7590 +het -7591 +hil -7592 +iii -7593 +ima -7594 +iow -7595 +itz -7596 +joe -7597 +kid -7598 +lol -7599 +lus -7600 +moh -7601 +mrs -7602 +mvp -7603 +oul -7604 +php -7605 +pil -7606 +psy -7607 +rys -7608 +saw -7609 +say -7610 +ses -7611 +som -7612 +sus -7613 +ugg -7614 +uma -7615 +vor -7616 +▁eg -7617 +▁fo -7618 +▁tu -7619 +▁ye -7620 +adam -7621 +aria -7622 +arks -7623 +bach -7624 +boys -7625 +coln -7626 +crow -7627 +diff -7628 +etic -7629 +glas -7630 +gled -7631 +hawa -7632 +ilit -7633 +inar -7634 +iors -7635 +iowa -7636 +iraq -7637 +iron -7638 +izza -7639 +juan -7640 +kore -7641 +left -7642 +mant -7643 +mind -7644 +mitt -7645 +nhof -7646 +oche -7647 +ogen -7648 +onts -7649 +orle -7650 +orus -7651 +pson -7652 +qual -7653 +quer -7654 +requ -7655 +rots -7656 +salt -7657 +same -7658 +scot -7659 +shad -7660 +stra -7661 +swed -7662 +tele -7663 +ucky -7664 +udes -7665 +vard -7666 +vity -7667 +zeal -7668 +▁ads -7669 +▁aid -7670 +▁ber -7671 +▁clo -7672 +▁emb -7673 +▁fal -7674 +▁fir -7675 +▁inn -7676 +▁lad -7677 +▁lod -7678 +▁maj -7679 +▁mut -7680 +▁nic -7681 +▁nin -7682 +▁rhy -7683 +▁rig -7684 +▁rum -7685 +▁scr -7686 +▁tro -7687 +aceut -7688 +afood -7689 +alarm -7690 +amber -7691 +angle -7692 +audio -7693 +bears -7694 +birth -7695 +brugg -7696 +ducts -7697 +eland -7698 +eller -7699 +eness -7700 +essed -7701 +exhib -7702 +falls -7703 +gator -7704 +glatt -7705 +helor -7706 +henry -7707 +iform -7708 +inema -7709 +inger -7710 +ipals -7711 +jerry -7712 +links -7713 +lower -7714 +lusso -7715 +maker -7716 +metal -7717 +miths -7718 +movie -7719 
+nific -7720 +novel -7721 +ockey -7722 +ounge -7723 +oured -7724 +paint -7725 +rafts -7726 +retta -7727 +rhode -7728 +riley -7729 +salty -7730 +stage -7731 +stick -7732 +storm -7733 +third -7734 +title -7735 +trade -7736 +▁acts -7737 +▁army -7738 +▁bags -7739 +▁beer -7740 +▁crit -7741 +▁dial -7742 +▁diet -7743 +▁digs -7744 +▁duty -7745 +▁gear -7746 +▁gene -7747 +▁lets -7748 +▁liqu -7749 +▁medi -7750 +▁mile -7751 +▁oils -7752 +▁pics -7753 +▁puts -7754 +▁ribb -7755 +▁seek -7756 +▁tops -7757 +▁trim -7758 +▁vend -7759 +▁zero -7760 +amanda -7761 +annual -7762 +attend -7763 +battle -7764 +bullet -7765 +consum -7766 +dakota -7767 +daniel -7768 +dragon -7769 +engers -7770 +france -7771 +german -7772 +honors -7773 +houses -7774 +hunter -7775 +icient -7776 +isters -7777 +justin -7778 +lander -7779 +morgan -7780 +morris -7781 +murray -7782 +nation -7783 +origin -7784 +parent -7785 +proble -7786 +prodig -7787 +ravens -7788 +recent -7789 +sabres -7790 +scious -7791 +sheila -7792 +silver -7793 +simply -7794 +theast -7795 +urches -7796 +wiseco -7797 +writer -7798 +▁actor -7799 +▁bacon -7800 +▁carri -7801 +▁drawn -7802 +▁drugs -7803 +▁edges -7804 +▁embar -7805 +▁equal -7806 +▁genre -7807 +▁gonna -7808 +▁handy -7809 +▁heads -7810 +▁holes -7811 +▁login -7812 +▁micro -7813 +▁moves -7814 +▁neuro -7815 +▁olive -7816 +▁pilot -7817 +▁plane -7818 +▁quote -7819 +▁roads -7820 +▁route -7821 +▁sheet -7822 +▁shift -7823 +▁solar -7824 +▁sorry -7825 +▁suffe -7826 +▁traff -7827 +avilion -7828 +bahnhof -7829 +complex -7830 +digital -7831 +groupon -7832 +ivalent -7833 +lerless -7834 +lincoln -7835 +manager -7836 +onymous -7837 +options -7838 +orleans -7839 +osphere -7840 +pending -7841 +perhaps -7842 +rapbook -7843 +roscopy -7844 +systems -7845 +▁booked -7846 +▁bubble -7847 +▁builds -7848 +▁failed -7849 +▁fairly -7850 +▁flight -7851 +▁gaming -7852 +▁gender -7853 +▁graves -7854 +▁negoti -7855 +▁orange -7856 +▁patent -7857 +▁pushed -7858 +▁rarely -7859 +▁reader -7860 +▁recall -7861 +▁sample -7862 +▁sooner -7863 +▁tracks -7864 +▁unable -7865 +armaceut -7866 +campaign -7867 +chairman -7868 +interest -7869 +lombardi -7870 +manufact -7871 +marshall -7872 +nificent -7873 +perretta -7874 +richmond -7875 +together -7876 +▁absence -7877 +▁amended -7878 +▁beating -7879 +▁causing -7880 +▁choices -7881 +▁convict -7882 +▁delight -7883 +▁durable -7884 +▁editing -7885 +▁element -7886 +▁extreme -7887 +▁granted -7888 +▁heading -7889 +▁jewelry -7890 +▁lasting -7891 +▁reveals -7892 +▁shouldn -7893 +▁signing -7894 +▁statist -7895 +▁trading -7896 +▁veteran -7897 +allingham -7898 +directory -7899 +entertain -7900 +itionally -7901 +recommend -7902 +rochester -7903 +▁branches -7904 +▁commonly -7905 +▁concepts -7906 +▁concrete -7907 +▁cultural -7908 +▁declined -7909 +▁displays -7910 +▁equipped -7911 +▁everyday -7912 +▁exchange -7913 +▁expanded -7914 +▁findings -7915 +▁forecast -7916 +▁involves -7917 +▁licensed -7918 +▁mistakes -7919 +▁modeling -7920 +▁normally -7921 +▁operated -7922 +▁opinions -7923 +▁ordering -7924 +▁overwhel -7925 +▁premiere -7926 +▁provider -7927 +▁realized -7928 +▁romantic -7929 +▁soldiers -7930 +▁somewhat -7931 +▁superior -7932 +▁supplies -7933 +▁ultimate -7934 +▁wireless -7935 +ationalist -7936 +experience -7937 +glattbrugg -7938 +humanities -7939 +ifications -7940 +internship -7941 +▁amenities -7942 +▁complaint -7943 +▁contained -7944 +▁donations -7945 +▁interpret -7946 +▁investing -7947 +▁involving -7948 +▁literally -7949 +▁packaging -7950 +▁primarily -7951 +▁principal -7952 +▁protected -7953 +▁responded -7954 +▁returning 
-7955 +▁antlerless -7956 +▁approaches -7957 +▁categories -7958 +▁celebrated -7959 +▁collecting -7960 +▁commentary -7961 +▁encouraged -7962 +▁enterprise -7963 +▁exhibition -7964 +▁literature -7965 +▁maintained -7966 +▁monitoring -7967 +▁newsletter -7968 +▁publishing -7969 +▁purchasing -7970 +▁specialist -7971 +▁traditions -7972 +▁transition -7973 +▁accompanied -7974 +▁anniversary -7975 +▁concentrate -7976 +▁involvement -7977 +▁potentially -7978 +▁programming -7979 +▁quarterback -7980 +▁replacement -7981 +communication -7982 +environmental -7983 +▁announcement -7984 +▁developments -7985 +▁subscription -7986 +▁participating -7987 +▁administrative -7988 +▁recommendations -7989 +!” -7990 +', -7991 +aq -7992 +gp -7993 +sr -7994 +”) -7995 +▁u -7996 +)|| -7997 +--- -7998 +ald -7999 +awa -8000 +bbc -8001 +bib -8002 +boe -8003 +cbs -8004 +chi -8005 +coc -8006 +cst -8007 +cub -8008 +dad -8009 +dna -8010 +enh -8011 +gro -8012 +hug -8013 +ido -8014 +iki -8015 +jen -8016 +kle -8017 +lot -8018 +lov -8019 +nie -8020 +oen -8021 +ran -8022 +rep -8023 +sec -8024 +sle -8025 +tar -8026 +tip -8027 +vag -8028 +var -8029 +wan -8030 +wat -8031 +yet -8032 +yog -8033 +zee -8034 +▁ru -8035 +**** -8036 +adia -8037 +anic -8038 +ario -8039 +ashe -8040 +barn -8041 +buck -8042 +canc -8043 +cave -8044 +chev -8045 +clam -8046 +clip -8047 +cole -8048 +cool -8049 +deck -8050 +delt -8051 +docs -8052 +eled -8053 +eler -8054 +epis -8055 +feel -8056 +give -8057 +glen -8058 +grow -8059 +hole -8060 +holl -8061 +inel -8062 +iner -8063 +inge -8064 +itan -8065 +item -8066 +itis -8067 +kick -8068 +lang -8069 +lers -8070 +luke -8071 +maps -8072 +math -8073 +mons -8074 +odor -8075 +past -8076 +pick -8077 +poly -8078 +puff -8079 +pure -8080 +rank -8081 +rise -8082 +shan -8083 +skin -8084 +soul -8085 +spir -8086 +zing -8087 +▁>>> -8088 +▁awk -8089 +▁caf -8090 +▁cow -8091 +▁gam -8092 +▁lun -8093 +▁mir -8094 +▁nut -8095 +▁por -8096 +▁rid -8097 +▁rul -8098 +▁sec -8099 +▁sod -8100 +▁tan -8101 +..... 
-8102 +accur -8103 +activ -8104 +adobe -8105 +andal -8106 +andre -8107 +arbit -8108 +arest -8109 +arist -8110 +ashes -8111 +asion -8112 +asted -8113 +avier -8114 +berto -8115 +clear -8116 +delta -8117 +deput -8118 +dried -8119 +empts -8120 +estab -8121 +eties -8122 +event -8123 +hours -8124 +icide -8125 +icies -8126 +inent -8127 +items -8128 +juana -8129 +kenpo -8130 +leave -8131 +oenix -8132 +otive -8133 +phill -8134 +piece -8135 +plans -8136 +pract -8137 +range -8138 +redsk -8139 +rogen -8140 +roger -8141 +roman -8142 +royal -8143 +saver -8144 +thers -8145 +ugene -8146 +ulner -8147 +urers -8148 +users -8149 +woods -8150 +▁aims -8151 +▁boot -8152 +▁coin -8153 +▁cozy -8154 +▁crew -8155 +▁disk -8156 +▁eggs -8157 +▁enem -8158 +▁flav -8159 +▁fulf -8160 +▁genu -8161 +▁hire -8162 +▁hung -8163 +▁intr -8164 +▁jail -8165 +▁knee -8166 +▁mere -8167 +▁okay -8168 +▁orth -8169 +▁pill -8170 +▁pink -8171 +▁pled -8172 +▁sacr -8173 +▁soci -8174 +▁stir -8175 +▁stom -8176 +▁tent -8177 +▁tofu -8178 +▁tray -8179 +▁tube -8180 +▁unus -8181 +▁wave -8182 +▁wins -8183 +▁worn -8184 +▁wrap -8185 +▁zone -8186 +argent -8187 +asking -8188 +called -8189 +carter -8190 +client -8191 +estyle -8192 +gedcom -8193 +grants -8194 +harris -8195 +hasbro -8196 +hawaii -8197 +howard -8198 +import -8199 +inding -8200 +inteln -8201 +itting -8202 +jordan -8203 +miller -8204 +modern -8205 +nature -8206 +promot -8207 +rinkle -8208 +senate -8209 +simple -8210 +soccer -8211 +summit -8212 +terior -8213 +things -8214 +turner -8215 +yellow -8216 +▁asset -8217 +▁begun -8218 +▁boats -8219 +▁boost -8220 +▁comic -8221 +▁cycle -8222 +▁depth -8223 +▁dried -8224 +▁false -8225 +▁fiber -8226 +▁foods -8227 +▁habit -8228 +▁hilar -8229 +▁homem -8230 +▁immig -8231 +▁immun -8232 +▁likes -8233 +▁lined -8234 +▁loads -8235 +▁maxim -8236 +▁meets -8237 +▁miner -8238 +▁polit -8239 +▁prize -8240 +▁rally -8241 +▁renov -8242 +▁saves -8243 +▁shell -8244 +▁skept -8245 +▁skirt -8246 +▁slide -8247 +▁solve -8248 +▁steel -8249 +▁tries -8250 +▁urban -8251 +▁worse -8252 +▁zones -8253 +allison -8254 +anormal -8255 +chapter -8256 +douglas -8257 +eastern -8258 +evolved -8259 +flights -8260 +further -8261 +gallery -8262 +herlini -8263 +leonard -8264 +located -8265 +ologist -8266 +opening -8267 +outique -8268 +pection -8269 +phoenix -8270 +playing -8271 +popular -8272 +portion -8273 +provide -8274 +release -8275 +reviews -8276 +ributed -8277 +rinteln -8278 +session -8279 +several -8280 +springs -8281 +stevens -8282 +tickets -8283 +tretter -8284 +vermont -8285 +zealand -8286 +▁acknow -8287 +▁agents -8288 +▁backed -8289 +▁branch -8290 +▁bucket -8291 +▁collar -8292 +▁coupon -8293 +▁curric -8294 +▁deadly -8295 +▁dispos -8296 +▁engage -8297 +▁exclus -8298 +▁export -8299 +▁female -8300 +▁fundra -8301 +▁harder -8302 +▁hosted -8303 +▁hunter -8304 +▁imposs -8305 +▁intent -8306 +▁knives -8307 +▁locked -8308 +▁loving -8309 +▁manual -8310 +▁mixing -8311 +▁movies -8312 +▁myster -8313 +▁overse -8314 +▁poster -8315 +▁prompt -8316 +▁pursue -8317 +▁remote -8318 +▁resist -8319 +▁retire -8320 +▁roster -8321 +▁routes -8322 +▁scores -8323 +▁settle -8324 +▁shaped -8325 +▁solely -8326 +▁spoken -8327 +▁steaks -8328 +▁streak -8329 +▁surely -8330 +▁tempor -8331 +▁terror -8332 +▁tongue -8333 +▁whilst -8334 +aturally -8335 +columbus -8336 +curities -8337 +exchange -8338 +favorite -8339 +hartford -8340 +increase -8341 +junction -8342 +mathemat -8343 +military -8344 +personal -8345 +prodigie -8346 +products -8347 +redskins -8348 +regional -8349 +thompson -8350 +ulations -8351 +▁aggress -8352 +▁amateur -8353 
+▁awkward -8354 +▁carried -8355 +▁carrots -8356 +▁chances -8357 +▁clothes -8358 +▁concert -8359 +▁counsel -8360 +▁emerged -8361 +▁ensures -8362 +▁entries -8363 +▁expects -8364 +▁fastest -8365 +▁harmful -8366 +▁hosting -8367 +▁hunters -8368 +▁lawsuit -8369 +▁listing -8370 +▁portray -8371 +▁premier -8372 +▁recipes -8373 +▁repairs -8374 +▁runners -8375 +▁sending -8376 +▁seventh -8377 +▁signals -8378 +▁suppose -8379 +▁telling -8380 +▁tissues -8381 +▁univers -8382 +▁varying -8383 +▁visible -8384 +cambridge -8385 +challenge -8386 +developed -8387 +imensions -8388 +jefferson -8389 +represent -8390 +secretary -8391 +universal -8392 +▁advances -8393 +▁attempts -8394 +▁breaking -8395 +▁brothers -8396 +▁controls -8397 +▁database -8398 +▁delivers -8399 +▁deputies -8400 +▁describe -8401 +▁domestic -8402 +▁donation -8403 +▁dropping -8404 +▁external -8405 +▁facebook -8406 +▁freshman -8407 +▁giveaway -8408 +▁injuries -8409 +▁instance -8410 +▁observed -8411 +▁organize -8412 +▁portable -8413 +▁producer -8414 +▁produces -8415 +▁rebounds -8416 +▁referred -8417 +▁reliable -8418 +▁religion -8419 +▁revenues -8420 +▁sandwich -8421 +▁sections -8422 +▁sessions -8423 +▁settings -8424 +▁sponsors -8425 +▁stronger -8426 +▁strongly -8427 +▁suddenly -8428 +▁suffered -8429 +▁suggests -8430 +▁talented -8431 +▁taxpayer -8432 +additional -8433 +associated -8434 +corporated -8435 +discussion -8436 +healthcare -8437 +shutterfly -8438 +▁activists -8439 +▁attempted -8440 +▁authority -8441 +▁basically -8442 +▁creatures -8443 +▁elsewhere -8444 +▁emotional -8445 +▁estimates -8446 +▁formation -8447 +▁indicated -8448 +▁pollution -8449 +▁portfolio -8450 +▁preferred -8451 +▁resulting -8452 +▁specialty -8453 +▁testified -8454 +▁uncertain -8455 +application -8456 +tangherlini -8457 +▁affordable -8458 +▁consultant -8459 +▁continuing -8460 +▁convenient -8461 +▁designated -8462 +▁difficulty -8463 +▁disability -8464 +▁disclosure -8465 +▁engagement -8466 +▁fatalities -8467 +▁highlights -8468 +▁industries -8469 +▁integrated -8470 +▁officially -8471 +▁principals -8472 +▁principles -8473 +▁recognized -8474 +▁regardless -8475 +▁regulation -8476 +▁revolution -8477 +▁statistics -8478 +additionally -8479 +applications -8480 +commissioner -8481 +introduction -8482 +▁achievement -8483 +▁agriculture -8484 +▁appearances -8485 +▁communicate -8486 +▁conferences -8487 +▁destination -8488 +▁electronics -8489 +▁exceptional -8490 +▁implemented -8491 +▁improvement -8492 +▁inspiration -8493 +entertainment -8494 +▁championship -8495 +▁consequences -8496 +▁conservative -8497 +▁introduction -8498 +▁organisation -8499 +administration -8500 +▁administrator -8501 +▁commissioners -8502 +▁establishment -8503 +▁participation -8504 +▁presentations -8505 +▁undergraduate -8506 +▁characteristics -8507 +ao -8508 +cn -8509 +ei -8510 +ey -8511 +ji -8512 +”, -8513 +abt -8514 +adi -8515 +aga -8516 +aha -8517 +amy -8518 +bri -8519 +bsp -8520 +cil -8521 +cit -8522 +dub -8523 +edu -8524 +fab -8525 +fel -8526 +fen -8527 +gig -8528 +gnc -8529 +ito -8530 +mis -8531 +mla -8532 +mls -8533 +nfc -8534 +nin -8535 +nrl -8536 +nux -8537 +ori -8538 +pra -8539 +qua -8540 +sav -8541 +sed -8542 +sug -8543 +tab -8544 +tal -8545 +tea -8546 +too -8547 +tow -8548 +tuc -8549 +tum -8550 +uer -8551 +uls -8552 +wet -8553 +yak -8554 +▁oh -8555 +▁ou -8556 +addy -8557 +aids -8558 +anth -8559 +atom -8560 +benj -8561 +brid -8562 +carl -8563 +cats -8564 +corp -8565 +dear -8566 +dick -8567 +dney -8568 +elle -8569 +ello -8570 +epic -8571 +etch -8572 +fame -8573 +faye -8574 +fits -8575 +gear -8576 +gely -8577 
+gets -8578 +gran -8579 +gray -8580 +hair -8581 +heim -8582 +holy -8583 +iety -8584 +inas -8585 +ingu -8586 +kaya -8587 +lady -8588 +lgbt -8589 +lins -8590 +mach -8591 +mode -8592 +moon -8593 +nuxe -8594 +olly -8595 +onde -8596 +orse -8597 +osed -8598 +poll -8599 +pred -8600 +pton -8601 +pull -8602 +quir -8603 +rani -8604 +rape -8605 +rons -8606 +ruiz -8607 +sale -8608 +shap -8609 +sier -8610 +stri -8611 +sure -8612 +sust -8613 +tags -8614 +tail -8615 +tall -8616 +trip -8617 +ulum -8618 +utes -8619 +vine -8620 +virt -8621 +want -8622 +wife -8623 +zech -8624 +zens -8625 +———— -8626 +▁aer -8627 +▁arc -8628 +▁atm -8629 +▁bio -8630 +▁loo -8631 +▁sms -8632 +▁tub -8633 +▁ump -8634 +album -8635 +allow -8636 +along -8637 +ammar -8638 +andem -8639 +apart -8640 +apped -8641 +ardon -8642 +arity -8643 +aspen -8644 +avios -8645 +aware -8646 +beaut -8647 +bered -8648 +canad -8649 +chant -8650 +clone -8651 +corps -8652 +crazy -8653 +dwell -8654 +eding -8655 +eters -8656 +extra -8657 +final -8658 +flore -8659 +gence -8660 +graph -8661 +habil -8662 +icial -8663 +ifies -8664 +impro -8665 +issue -8666 +itzer -8667 +ivest -8668 +julie -8669 +kings -8670 +langu -8671 +login -8672 +looks -8673 +lucas -8674 +mario -8675 +marks -8676 +navin -8677 +ntine -8678 +ocean -8679 +opher -8680 +opped -8681 +oween -8682 +pense -8683 +pride -8684 +rapes -8685 +rared -8686 +reach -8687 +rives -8688 +roush -8689 +score -8690 +semin -8691 +sharp -8692 +shire -8693 +sound -8694 +stark -8695 +stein -8696 +suing -8697 +therm -8698 +tiger -8699 +treas -8700 +uated -8701 +ultra -8702 +ureau -8703 +woman -8704 +yahoo -8705 +▁adop -8706 +▁aged -8707 +▁alum -8708 +▁atom -8709 +▁bean -8710 +▁canv -8711 +▁cher -8712 +▁cock -8713 +▁corn -8714 +▁dump -8715 +▁ecos -8716 +▁evil -8717 +▁fasc -8718 +▁fold -8719 +▁gard -8720 +▁goat -8721 +▁grip -8722 +▁hill -8723 +▁hoop -8724 +▁lane -8725 +▁loop -8726 +▁mall -8727 +▁mart -8728 +▁noon -8729 +▁nurs -8730 +▁plum -8731 +▁rats -8732 +▁rein -8733 +▁rice -8734 +▁snap -8735 +▁sour -8736 +▁stol -8737 +▁tale -8738 +▁tape -8739 +▁twin -8740 +▁vast -8741 +abetes -8742 +africa -8743 +alaska -8744 +always -8745 +arians -8746 +backer -8747 +bigley -8748 +carmel -8749 +closed -8750 +commut -8751 +cowley -8752 +either -8753 +ellect -8754 +eugene -8755 +gerber -8756 +ifting -8757 +ington -8758 +itches -8759 +iteria -8760 +joseph -8761 +latest -8762 +listed -8763 +matthe -8764 +michel -8765 +orious -8766 +ounces -8767 +planet -8768 +ponder -8769 +prises -8770 +regard -8771 +retail -8772 +ritory -8773 +serena -8774 +shangh -8775 +survey -8776 +trevor -8777 +turkey -8778 +uccess -8779 +unless -8780 +vation -8781 +walter -8782 +wealth -8783 +wonder -8784 +▁alarm -8785 +▁belly -8786 +▁blank -8787 +▁blind -8788 +▁codes -8789 +▁delay -8790 +▁dough -8791 +▁drain -8792 +▁drama -8793 +▁excit -8794 +▁faced -8795 +▁firms -8796 +▁flash -8797 +▁fried -8798 +▁grain -8799 +▁juice -8800 +▁labor -8801 +▁loose -8802 +▁pizza -8803 +▁ports -8804 +▁prime -8805 +▁racks -8806 +▁recre -8807 +▁rhyth -8808 +▁sharp -8809 +▁sheep -8810 +▁shirt -8811 +▁strip -8812 +▁twist -8813 +▁vital -8814 +▁wagon -8815 +▁walls -8816 +▁weigh -8817 +▁weird -8818 +▁wrist -8819 +▁yield -8820 +circuit -8821 +clinton -8822 +episode -8823 +eration -8824 +estrian -8825 +faculty -8826 +holiday -8827 +holland -8828 +ignment -8829 +indiana -8830 +indones -8831 +itarian -8832 +machine -8833 +million -8834 +nothing -8835 +players -8836 +private -8837 +raymond -8838 +selling -8839 +stories -8840 +student -8841 +studios -8842 +subject -8843 +teacher -8844 +thunder 
-8845 +trading -8846 +ultural -8847 +▁abroad -8848 +▁agenda -8849 +▁argued -8850 +▁banner -8851 +▁barrel -8852 +▁behalf -8853 +▁calcul -8854 +▁causes -8855 +▁compan -8856 +▁deodor -8857 +▁elimin -8858 +▁ending -8859 +▁errors -8860 +▁escape -8861 +▁expans -8862 +▁explos -8863 +▁facing -8864 +▁finger -8865 +▁franch -8866 +▁garage -8867 +▁habits -8868 +▁hidden -8869 +▁horror -8870 +▁insert -8871 +▁jumped -8872 +▁logged -8873 +▁lowest -8874 +▁newest -8875 +▁papers -8876 +▁pardon -8877 +▁parent -8878 +▁prohib -8879 +▁proven -8880 +▁pulled -8881 +▁retrie -8882 +▁saving -8883 +▁server -8884 +▁silver -8885 +▁singer -8886 +▁sought -8887 +▁stable -8888 +▁stands -8889 +▁stated -8890 +▁sticks -8891 +▁stroke -8892 +▁titled -8893 +▁versat -8894 +▁vision -8895 +▁weapon -8896 +▁wheels -8897 +ammation -8898 +aviation -8899 +bulletin -8900 +daughter -8901 +document -8902 +franklin -8903 +inations -8904 +ivestock -8905 +original -8906 +pavilion -8907 +portland -8908 +quilting -8909 +sentinel -8910 +thinking -8911 +timeline -8912 +torrance -8913 +velength -8914 +workshop -8915 +▁actress -8916 +▁airport -8917 +▁anxiety -8918 +▁anymore -8919 +▁assists -8920 +▁assured -8921 +▁brewing -8922 +▁cameras -8923 +▁careful -8924 +▁centers -8925 +▁charity -8926 +▁closely -8927 +▁closest -8928 +▁closing -8929 +▁combine -8930 +▁concern -8931 +▁conflic -8932 +▁consent -8933 +▁cookies -8934 +▁corrupt -8935 +▁coupons -8936 +▁dangers -8937 +▁dealing -8938 +▁decades -8939 +▁deficit -8940 +▁desktop -8941 +▁dessert -8942 +▁dismiss -8943 +▁divided -8944 +▁electro -8945 +▁emotion -8946 +▁endorse -8947 +▁exhibit -8948 +▁harvest -8949 +▁inaccur -8950 +▁locally -8951 +▁lodging -8952 +▁married -8953 +▁mistake -8954 +▁nations -8955 +▁operate -8956 +▁opposed -8957 +▁outlets -8958 +▁periods -8959 +▁pockets -8960 +▁protein -8961 +▁protest -8962 +▁qualify -8963 +▁reserve -8964 +▁revenue -8965 +▁savings -8966 +▁scoring -8967 +▁screens -8968 +▁strings -8969 +▁tablets -8970 +▁theatre -8971 +▁underst -8972 +▁unusual -8973 +▁violent -8974 +▁windows -8975 +▁workout -8976 +engineers -8977 +exclusive -8978 +hopefully -8979 +melbourne -8980 +northwest -8981 +repreneur -8982 +scavenger -8983 +solutions -8984 +technical -8985 +tennessee -8986 +truepress -8987 +westbrook -8988 +▁abstract -8989 +▁clothing -8990 +▁contrast -8991 +▁creditor -8992 +▁deadline -8993 +▁dramatic -8994 +▁embedded -8995 +▁employer -8996 +▁entirely -8997 +▁focusing -8998 +▁founding -8999 +▁generate -9000 +▁geometry -9001 +▁handling -9002 +▁honestly -9003 +▁keyboard -9004 +▁obtained -9005 +▁policies -9006 +▁politics -9007 +▁promised -9008 +▁promises -9009 +▁reaching -9010 +▁reaction -9011 +▁receiver -9012 +▁reflects -9013 +▁relative -9014 +▁reminded -9015 +▁shoulder -9016 +▁spectrum -9017 +▁stunning -9018 +▁survived -9019 +▁teenager -9020 +▁throwing -9021 +▁upgraded -9022 +collective -9023 +disneyland -9024 +florentine -9025 +integrated -9026 +prodigieux -9027 +▁activated -9028 +▁campaigns -9029 +▁contacted -9030 +▁convinced -9031 +▁discussed -9032 +▁diversity -9033 +▁ecosystem -9034 +▁engineers -9035 +▁evolution -9036 +▁expanding -9037 +▁explained -9038 +▁frequency -9039 +▁gathering -9040 +▁indicates -9041 +▁initially -9042 +▁intellect -9043 +▁libraries -9044 +▁mechanics -9045 +▁offerings -9046 +▁ownership -9047 +▁paintings -9048 +▁producers -9049 +▁professor -9050 +▁promotion -9051 +▁qualified -9052 +▁relations -9053 +▁releasing -9054 +▁requested -9055 +▁sentenced -9056 +▁sophistic -9057 +▁specified -9058 +▁subscribe -9059 +▁typically -9060 +▁voluntary -9061 +commutative -9062 
+schoolcraft -9063 +springfield -9064 +▁attendance -9065 +▁complaints -9066 +▁compliance -9067 +▁conducting -9068 +▁containing -9069 +▁customized -9070 +▁durability -9071 +▁efficiency -9072 +▁equivalent -9073 +▁essentials -9074 +▁evaluation -9075 +▁excitement -9076 +▁internship -9077 +▁nationally -9078 +▁percentage -9079 +▁personally -9080 +▁popularity -9081 +▁threatened -9082 +▁treatments -9083 +commonwealth -9084 +habilitation -9085 +registration -9086 +▁conjunction -9087 +▁constructed -9088 +▁coordinator -9089 +▁demonstrate -9090 +▁documentary -9091 +▁investigate -9092 +▁personality -9093 +▁threatening -9094 +▁architecture -9095 +▁difficulties -9096 +▁experiencing -9097 +▁inflammation -9098 +▁certification -9099 +▁complimentary -9100 +▁concentration -9101 +▁consideration -9102 +▁illustrations -9103 +▁investigators -9104 +▁manufacturing -9105 +▁transcription -9106 +▁unfortunately -9107 +.- -9108 +?) -9109 +fy -9110 +pv -9111 +ql -9112 +rc -9113 +tz -9114 +vd -9115 +▁’ -9116 +.'' -9117 +aco -9118 +bir -9119 +bnz -9120 +cps -9121 +dra -9122 +eye -9123 +fro -9124 +gur -9125 +hex -9126 +ibe -9127 +iop -9128 +itu -9129 +jar -9130 +lec -9131 +lie -9132 +lux -9133 +mox -9134 +nhl -9135 +npr -9136 +nyc -9137 +nys -9138 +odi -9139 +ogy -9140 +oir -9141 +osh -9142 +pap -9143 +pun -9144 +rah -9145 +rig -9146 +rus -9147 +sid -9148 +sms -9149 +thu -9150 +vet -9151 +vim -9152 +vit -9153 +yme -9154 +▁mg -9155 +▁oz -9156 +!!!! -9157 +ache -9158 +acio -9159 +acre -9160 +agne -9161 +ante -9162 +apps -9163 +auge -9164 +bath -9165 +baus -9166 +caff -9167 +cake -9168 +cars -9169 +chel -9170 +cort -9171 +didd -9172 +dise -9173 +doug -9174 +earl -9175 +edit -9176 +eeee -9177 +ente -9178 +erie -9179 +esis -9180 +ests -9181 +fest -9182 +fett -9183 +file -9184 +flag -9185 +gans -9186 +gene -9187 +hend -9188 +hunt -9189 +ials -9190 +iana -9191 +icbc -9192 +ifty -9193 +ikes -9194 +inos -9195 +ione -9196 +issy -9197 +jean -9198 +jews -9199 +jobs -9200 +lara -9201 +lynd -9202 +nett -9203 +none -9204 +omed -9205 +oors -9206 +oras -9207 +poin -9208 +pope -9209 +rele -9210 +rene -9211 +reve -9212 +riad -9213 +rive -9214 +saud -9215 +seau -9216 +snow -9217 +thor -9218 +thus -9219 +tory -9220 +umph -9221 +vanc -9222 +vill -9223 +viol -9224 +xbox -9225 +yogi -9226 +▁aes -9227 +▁cig -9228 +▁cin -9229 +▁cra -9230 +▁dad -9231 +▁dot -9232 +▁dre -9233 +▁fax -9234 +▁fru -9235 +▁fur -9236 +▁ham -9237 +▁hoo -9238 +▁jug -9239 +▁mph -9240 +▁nob -9241 +▁nom -9242 +▁ren -9243 +▁shy -9244 +▁sin -9245 +▁ski -9246 +▁sne -9247 +▁vit -9248 +▁vou -9249 +▁yog -9250 +abyte -9251 +adams -9252 +ading -9253 +adise -9254 +alled -9255 +assem -9256 +atoes -9257 +autom -9258 +begin -9259 +bloom -9260 +bring -9261 +cases -9262 +chave -9263 +clark -9264 +coach -9265 +cript -9266 +dante -9267 +death -9268 +diddy -9269 +domin -9270 +dynam -9271 +elite -9272 +erves -9273 +flash -9274 +glenn -9275 +hence -9276 +homes -9277 +ideal -9278 +ighth -9279 +ilies -9280 +inois -9281 +istol -9282 +izard -9283 +karen -9284 +kelly -9285 +koren -9286 +laura -9287 +liber -9288 +lunch -9289 +marin -9290 +medix -9291 +monte -9292 +neath -9293 +nered -9294 +netic -9295 +ocent -9296 +opard -9297 +ousin -9298 +ouver -9299 +parts -9300 +peace -9301 +phony -9302 +piran -9303 +proof -9304 +pture -9305 +quant -9306 +royce -9307 +sales -9308 +salon -9309 +serie -9310 +sites -9311 +spect -9312 +spons -9313 +susan -9314 +sweet -9315 +syrac -9316 +syria -9317 +tains -9318 +tense -9319 +uates -9320 +ucker -9321 +undry -9322 +urtle -9323 +usaid -9324 +vated -9325 +views -9326 +▁asks 
-9327 +▁dose -9328 +▁drag -9329 +▁east -9330 +▁fert -9331 +▁flip -9332 +▁folk -9333 +▁fool -9334 +▁foul -9335 +▁fuck -9336 +▁glue -9337 +▁grat -9338 +▁grid -9339 +▁hint -9340 +▁hole -9341 +▁hood -9342 +▁icon -9343 +▁king -9344 +▁laid -9345 +▁lect -9346 +▁lies -9347 +▁male -9348 +▁masc -9349 +▁moon -9350 +▁mort -9351 +▁neat -9352 +▁pets -9353 +▁pose -9354 +▁prop -9355 +▁rail -9356 +▁rely -9357 +▁scar -9358 +▁sink -9359 +▁tags -9360 +▁tank -9361 +▁toll -9362 +▁toss -9363 +▁trem -9364 +▁trop -9365 +▁weak -9366 +▁wick -9367 +▁yeah -9368 +▁yoga -9369 +abella -9370 +acular -9371 +afiber -9372 +alties -9373 +amedei -9374 +apping -9375 +ashley -9376 +atured -9377 +browns -9378 +campus -9379 +carrie -9380 +castle -9381 +cotton -9382 +deputy -9383 +destin -9384 +erving -9385 +fellow -9386 +floren -9387 +forest -9388 +fourth -9389 +hollow -9390 +iments -9391 +irvine -9392 +ittees -9393 +leager -9394 +likely -9395 +lounge -9396 +matrix -9397 +method -9398 +muslim -9399 +opport -9400 +ortion -9401 +ospace -9402 +popeye -9403 +prices -9404 +reader -9405 +reduce -9406 +resort -9407 +rotary -9408 +rubber -9409 +seeing -9410 +shelia -9411 +shirts -9412 +sounds -9413 +spects -9414 +stevia -9415 +superv -9416 +theory -9417 +ticket -9418 +velope -9419 +waring -9420 +window -9421 +wright -9422 +xavier -9423 +▁alert -9424 +▁alter -9425 +▁angry -9426 +▁arbit -9427 +▁array -9428 +▁birds -9429 +▁brave -9430 +▁burst -9431 +▁cargo -9432 +▁cents -9433 +▁compr -9434 +▁decre -9435 +▁disag -9436 +▁drove -9437 +▁farms -9438 +▁femin -9439 +▁gains -9440 +▁gauge -9441 +▁ghost -9442 +▁grave -9443 +▁grill -9444 +▁hesit -9445 +▁irons -9446 +▁kinda -9447 +▁loans -9448 +▁magic -9449 +▁newly -9450 +▁noise -9451 +▁oblig -9452 +▁picks -9453 +▁queue -9454 +▁races -9455 +▁reply -9456 +▁revel -9457 +▁rival -9458 +▁roles -9459 +▁rural -9460 +▁setup -9461 +▁spray -9462 +▁storm -9463 +▁towns -9464 +▁union -9465 +▁walks -9466 +against -9467 +alition -9468 +assment -9469 +between -9470 +connect -9471 +connell -9472 +conserv -9473 +consult -9474 +coyotes -9475 +defense -9476 +embassy -9477 +getting -9478 +gustave -9479 +ication -9480 +ilation -9481 +issance -9482 +iveness -9483 +lessons -9484 +markets -9485 +marlins -9486 +mington -9487 +ographs -9488 +ometown -9489 +outside -9490 +publish -9491 +regular -9492 +reports -9493 +respons -9494 +results -9495 +richard -9496 +swedish -9497 +uration -9498 +volunte -9499 +▁acquis -9500 +▁almond -9501 +▁beaten -9502 +▁cancel -9503 +▁casual -9504 +▁clever -9505 +▁compat -9506 +▁conver -9507 +▁copies -9508 +▁defend -9509 +▁define -9510 +▁deploy -9511 +▁divide -9512 +▁dozens -9513 +▁evenly -9514 +▁feeder -9515 +▁filing -9516 +▁forced -9517 +▁genius -9518 +▁hockey -9519 +▁honors -9520 +▁hunger -9521 +▁indeed -9522 +▁invite -9523 +▁losses -9524 +▁lovers -9525 +▁luxury -9526 +▁majors -9527 +▁marine -9528 +▁occurs -9529 +▁organs -9530 +▁permit -9531 +▁prophe -9532 +▁proved -9533 +▁quoted -9534 +▁raises -9535 +▁reject -9536 +▁resort -9537 +▁rolled -9538 +▁senses -9539 +▁severe -9540 +▁slowly -9541 +▁stolen -9542 +▁suspic -9543 +▁testim -9544 +▁treats -9545 +▁triple -9546 +▁viewed -9547 +▁vulner -9548 +▁warmth -9549 +▁worker -9550 +▁yogurt -9551 +aminated -9552 +analysis -9553 +belladia -9554 +benjamin -9555 +breaking -9556 +caldwell -9557 +calendar -9558 +contract -9559 +designed -9560 +downtown -9561 +ensation -9562 +expected -9563 +florence -9564 +illinois -9565 +includes -9566 +japanese -9567 +marathon -9568 +meleager -9569 +pakistan -9570 +physical -9571 +planning -9572 +pression -9573 +programs -9574 
+resident -9575 +ribution -9576 +shanghai -9577 +speaking -9578 +syracuse -9579 +transfer -9580 +ultimate -9581 +visitors -9582 +whatever -9583 +▁anytime -9584 +▁applies -9585 +▁archive -9586 +▁arrives -9587 +▁beneath -9588 +▁carries -9589 +▁caution -9590 +▁checked -9591 +▁chopped -9592 +▁compact -9593 +▁crystal -9594 +▁facilit -9595 +▁founded -9596 +▁gravity -9597 +▁grounds -9598 +▁handful -9599 +▁heating -9600 +▁heavily -9601 +▁honored -9602 +▁intense -9603 +▁interns -9604 +▁keyword -9605 +▁lacking -9606 +▁largely -9607 +▁letting -9608 +▁liberal -9609 +▁lightly -9610 +▁magical -9611 +▁matched -9612 +▁monthly -9613 +▁neither -9614 +▁observe -9615 +▁partial -9616 +▁portion -9617 +▁predict -9618 +▁pursuit -9619 +▁queries -9620 +▁ranging -9621 +▁rapidly -9622 +▁reduces -9623 +▁scratch -9624 +▁servers -9625 +▁shortly -9626 +▁utility -9627 +▁vanilla -9628 +clamation -9629 +halloween -9630 +indonesia -9631 +novelties -9632 +powerpuff -9633 +themarket -9634 +vancouver -9635 +▁accuracy -9636 +▁achieved -9637 +▁analysts -9638 +▁applying -9639 +▁assigned -9640 +▁bankrupt -9641 +▁bathroom -9642 +▁bearings -9643 +▁captured -9644 +▁ceremony -9645 +▁chairman -9646 +▁choosing -9647 +▁constant -9648 +▁contents -9649 +▁costumes -9650 +▁criteria -9651 +▁declared -9652 +▁decrease -9653 +▁dressing -9654 +▁drinking -9655 +▁engineer -9656 +▁enjoying -9657 +▁estimate -9658 +▁firewall -9659 +▁gathered -9660 +▁graphics -9661 +▁holidays -9662 +▁integral -9663 +▁laughing -9664 +▁mathemat -9665 +▁mountain -9666 +▁portrait -9667 +▁precious -9668 +▁prospect -9669 +▁repeated -9670 +▁reporter -9671 +▁speaking -9672 +▁troubled -9673 +▁warranty -9674 +▁whenever -9675 +▁withdraw -9676 +basketball -9677 +democratic -9678 +exhibition -9679 +locksmiths -9680 +operations -9681 +philosophy -9682 +production -9683 +statistics -9684 +tournament -9685 +wilmington -9686 +▁acknowled -9687 +▁audiences -9688 +▁celebrity -9689 +▁classroom -9690 +▁companion -9691 +▁contracts -9692 +▁convicted -9693 +▁decorated -9694 +▁decreased -9695 +▁discrimin -9696 +▁districts -9697 +▁exploring -9698 +▁festivals -9699 +▁furniture -9700 +▁hilarious -9701 +▁instantly -9702 +▁intention -9703 +▁mechanism -9704 +▁motivated -9705 +▁municipal -9706 +▁prospects -9707 +▁screening -9708 +▁sentiment -9709 +▁shoulders -9710 +▁signature -9711 +▁sponsored -9712 +▁teammates -9713 +▁telephone -9714 +▁territory -9715 +▁thrilling -9716 +▁uncomfort -9717 +▁warehouse -9718 +anniversary -9719 +thermafiber -9720 +▁accounting -9721 +▁accurately -9722 +▁attractive -9723 +▁capability -9724 +▁collective -9725 +▁containers -9726 +▁girlfriend -9727 +▁impossible -9728 +▁industrial -9729 +▁motorcycle -9730 +▁performing -9731 +▁physiology -9732 +▁procedures -9733 +▁satisfying -9734 +▁sentencing -9735 +▁statements -9736 +▁strategies -9737 +▁strengthen -9738 +▁struggling -9739 +▁supplement -9740 +▁surrounded -9741 +▁translated -9742 +▁underneath -9743 +▁warehouses -9744 +▁wavelength -9745 +intermediate -9746 +▁accessories -9747 +▁departments -9748 +▁encouraging -9749 +▁essentially -9750 +▁exclusively -9751 +▁governments -9752 +▁institution -9753 +▁instruction -9754 +▁introducing -9755 +▁legislation -9756 +▁legislators -9757 +▁positioning -9758 +▁reservation -9759 +▁specializes -9760 +▁supermarket -9761 +▁tablespoons -9762 +▁trafficking -9763 +▁alternatives -9764 +▁arrangements -9765 +▁contribution -9766 +▁increasingly -9767 +▁negotiations -9768 +▁overwhelming -9769 +▁presidential -9770 +transportation -9771 +▁uncomfortable -9772 +▁accountability -9773 +▁discrimination -9774 
+▁interpretation -9775 +▁noncommutative -9776 +▁representation -9777 +,’ -9778 +.: -9779 +/. -9780 +:) -9781 +?! -9782 +bm -9783 +cu -9784 +dl -9785 +hu -9786 +oz -9787 +qb -9788 +tu -9789 +vu -9790 +wl -9791 +xl -9792 +yn -9793 +yo -9794 +zo -9795 +▁! -9796 +▁; -9797 +abd -9798 +aed -9799 +arv -9800 +bek -9801 +bos -9802 +buz -9803 +dif -9804 +dit -9805 +dod -9806 +dot -9807 +dun -9808 +egy -9809 +eta -9810 +exc -9811 +gil -9812 +gor -9813 +hum -9814 +hur -9815 +iev -9816 +ika -9817 +iov -9818 +isk -9819 +kin -9820 +mam -9821 +nex -9822 +nug -9823 +oak -9824 +oli -9825 +oyd -9826 +pbs -9827 +pey -9828 +pmt -9829 +pom -9830 +pty -9831 +rek -9832 +rip -9833 +rut -9834 +seo -9835 +sir -9836 +sox -9837 +tek -9838 +ulf -9839 +usc -9840 +usd -9841 +wra -9842 +yor -9843 +▁'' -9844 +▁** -9845 +▁ax -9846 +▁da -9847 +▁mp -9848 +▁ok -9849 +aber -9850 +abig -9851 +achy -9852 +afgh -9853 +aine -9854 +aire -9855 +airy -9856 +amew -9857 +argo -9858 +auer -9859 +bage -9860 +bapt -9861 +blvd -9862 +boba -9863 +bors -9864 +bted -9865 +cafe -9866 +camt -9867 +cand -9868 +carr -9869 +cart -9870 +cend -9871 +chip -9872 +cled -9873 +coat -9874 +comb -9875 +copy -9876 +eder -9877 +elij -9878 +elli -9879 +emen -9880 +esto -9881 +euch -9882 +exam -9883 +fear -9884 +fers -9885 +fine -9886 +frag -9887 +gary -9888 +gill -9889 +gins -9890 +goal -9891 +grab -9892 +guam -9893 +held -9894 +hide -9895 +hook -9896 +hort -9897 +iley -9898 +inis -9899 +ithm -9900 +kers -9901 +kiva -9902 +kiwi -9903 +laun -9904 +loot -9905 +lynn -9906 +maid -9907 +masi -9908 +menu -9909 +mott -9910 +move -9911 +nand -9912 +nasa -9913 +near -9914 +nine -9915 +ogan -9916 +▁ -9917 +e -9918 +t -9919 +a -9920 +o -9921 +i -9922 +n -9923 +s -9924 +r -9925 +h -9926 +l -9927 +d -9928 +c -9929 +u -9930 +m -9931 +p -9932 +f -9933 +g -9934 +y -9935 +w -9936 +b -9937 +. -9938 +v -9939 +, -9940 +k -9941 +- -9942 +0 -9943 +1 -9944 +2 -9945 +x -9946 +' -9947 +j -9948 +: -9949 +3 -9950 +’ -9951 +) -9952 +" -9953 +( -9954 +z -9955 +5 -9956 +q -9957 +4 -9958 +9 -9959 +8 -9960 +6 -9961 +7 -9962 +| -9963 +! -9964 +/ -9965 +? -9966 +“ -9967 +” -9968 +; -9969 +$ -9970 +& -9971 +– -9972 +— -9973 +* -9974 +> -9975 diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/tokenizer_specs_sp10240.json b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/tokenizer_specs_sp10240.json new file mode 100644 index 0000000000..9fcbe11191 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/tokenization_10kvocab/tokenizer/tokenizer_specs_sp10240.json @@ -0,0 +1,12 @@ +{ + "tokenizer_specs": [ + { + "name": "sp_bpe_10240", + "dataset_suffix": "sp10240", + "vocab_size": 10240, + "model_prefix": "fineweb_10240_bpe", + "tokenizer_skip_docs": 50000, + "_audit_note": "tokenizer_skip_docs=50000 excludes the canonical val docs [0, 50000) from BPE training. Train iterates docs [50000, end). Verified at download_hf_docs_and_tokenize.py _iter_sentencepiece_text skip_docs param." 
+ } + ] +} diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_gpt.py b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_gpt.py new file mode 100644 index 0000000000..bc8205456f --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_gpt.py @@ -0,0 +1,3859 @@ +import base64, collections, copy, fcntl, glob, io, lzma, math, os +from pathlib import Path +import random, re, subprocess, sys, time, uuid, numpy as np, sentencepiece as spm, torch, torch.distributed as dist, torch.nn.functional as F +from torch import Tensor, nn +from flash_attn_interface import ( + flash_attn_func as flash_attn_3_func, + flash_attn_varlen_func, +) +from concurrent.futures import ThreadPoolExecutor +import triton +import triton.language as tl +from triton.tools.tensor_descriptor import TensorDescriptor + + +# ===== Fused softcapped cross-entropy (Triton) — training-only path ===== +# Replaces the eager +# logits_softcap = softcap * tanh(logits / softcap) +# F.cross_entropy(logits_softcap.float(), targets, reduction="mean") +# sequence with a single fused kernel that reads logits_proj once, applies +# softcap in-register, and computes (LSE, loss) in one streaming pass. The +# backward kernel mirrors the forward so there's no stored softcapped logits. +# Numerically identical to the eager path up to fp32 accumulation differences. +_FUSED_CE_LIBRARY = "pgsubmission1draft7fusedce" +_FUSED_CE_BLOCK_SIZE = 1024 +_FUSED_CE_NUM_WARPS = 4 + + +@triton.jit +def _softcapped_ce_fwd_kernel( + logits_ptr, losses_ptr, lse_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + max_val = -float("inf") + sum_exp = 0.0 + A = 2.0 * softcap + inv_C = 2.0 / softcap + for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=-float("inf"), + ).to(tl.float32) + z = A * tl.sigmoid(val * inv_C) + z = tl.where(mask, z, -float("inf")) + curr_max = tl.max(z, axis=0) + new_max = tl.maximum(max_val, curr_max) + sum_exp = sum_exp * tl.exp(max_val - new_max) + tl.sum(tl.exp(z - new_max), axis=0) + max_val = new_max + lse = max_val + tl.log(sum_exp) + tl.store(lse_ptr + row_idx, lse) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + target_val = tl.load(logits_row_ptr + target * stride_logits_v).to(tl.float32) + target_z = A * tl.sigmoid(target_val * inv_C) + tl.store(losses_ptr + row_idx, lse - target_z) + + +@triton.jit +def _softcapped_ce_bwd_kernel( + grad_logits_ptr, grad_losses_ptr, lse_ptr, logits_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + stride_grad_n, stride_grad_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + grad_row_ptr = grad_logits_ptr + row_idx * stride_grad_n + lse = tl.load(lse_ptr + row_idx) + grad_loss = tl.load(grad_losses_ptr + row_idx).to(tl.float32) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + A = 2.0 * softcap + inv_C = 2.0 / softcap + dz_dx_scale = A * inv_C + for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=0.0, + ).to(tl.float32) + sigmoid_u = tl.sigmoid(val * inv_C) + z = A * sigmoid_u + probs = tl.exp(z - 
lse) + grad_z = grad_loss * (probs - tl.where(cols == target, 1.0, 0.0)) + grad_x = grad_z * (dz_dx_scale * sigmoid_u * (1.0 - sigmoid_u)) + tl.store(grad_row_ptr + cols * stride_grad_v, grad_x, mask=mask) + + +def _validate_softcapped_ce_inputs( + logits: Tensor, targets: Tensor, softcap: float, +) -> tuple[Tensor, Tensor]: + if logits.ndim != 2: + raise ValueError(f"Expected logits.ndim=2, got {logits.ndim}") + if targets.ndim != 1: + raise ValueError(f"Expected targets.ndim=1, got {targets.ndim}") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} targets={tuple(targets.shape)}" + ) + if not logits.is_cuda or not targets.is_cuda: + raise ValueError("softcapped_cross_entropy requires CUDA tensors") + if softcap <= 0.0: + raise ValueError(f"softcap must be positive, got {softcap}") + if logits.dtype not in (torch.float16, torch.bfloat16, torch.float32): + raise ValueError(f"Unsupported logits dtype: {logits.dtype}") + logits = logits.contiguous() + targets = targets.contiguous() + if targets.dtype != torch.int64: + targets = targets.to(dtype=torch.int64) + return logits, targets + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce", mutates_args=()) +def softcapped_ce_op(logits: Tensor, targets: Tensor, softcap: float) -> tuple[Tensor, Tensor]: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + n_rows, n_cols = logits.shape + losses = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + lse = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + _softcapped_ce_fwd_kernel[(n_rows,)]( + logits, losses, lse, targets, + logits.stride(0), logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return losses, lse + + +@softcapped_ce_op.register_fake +def _(logits: Tensor, targets: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1: + raise ValueError("softcapped_ce fake impl expects 2D logits and 1D targets") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} targets={tuple(targets.shape)}" + ) + n_rows = logits.shape[0] + return ( + logits.new_empty((n_rows,), dtype=torch.float32), + logits.new_empty((n_rows,), dtype=torch.float32), + ) + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce_backward", mutates_args=()) +def softcapped_ce_backward_op( + logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float, +) -> Tensor: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + lse = lse.contiguous() + grad_losses = grad_losses.contiguous().to(dtype=torch.float32) + if lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("Expected 1D lse and grad_losses") + if lse.shape[0] != logits.shape[0] or grad_losses.shape[0] != logits.shape[0]: + raise ValueError( + f"Expected row-aligned lse/grad_losses, got logits={tuple(logits.shape)} " + f"lse={tuple(lse.shape)} grad_losses={tuple(grad_losses.shape)}" + ) + grad_logits = torch.empty_like(logits) + n_rows, n_cols = logits.shape + _softcapped_ce_bwd_kernel[(n_rows,)]( + grad_logits, grad_losses, lse, logits, targets, + logits.stride(0), logits.stride(1), + grad_logits.stride(0), grad_logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return grad_logits + + +@softcapped_ce_backward_op.register_fake +def 
_(logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float):
+    if logits.ndim != 2 or targets.ndim != 1 or lse.ndim != 1 or grad_losses.ndim != 1:
+        raise ValueError("softcapped_ce_backward fake impl expects 2D logits and 1D row tensors")
+    if (
+        logits.shape[0] != targets.shape[0]
+        or logits.shape[0] != lse.shape[0]
+        or logits.shape[0] != grad_losses.shape[0]
+    ):
+        raise ValueError("softcapped_ce_backward fake impl expects row-aligned tensors")
+    return logits.new_empty(logits.shape)
+
+
+def _softcapped_ce_setup_context(
+    ctx: torch.autograd.function.FunctionCtx, inputs, output,
+) -> None:
+    logits, targets, softcap = inputs
+    _losses, lse = output
+    ctx.save_for_backward(logits, targets, lse)
+    ctx.softcap = float(softcap)
+
+
+def _softcapped_ce_backward(
+    ctx: torch.autograd.function.FunctionCtx, grad_losses: Tensor, grad_lse: "Tensor | None",
+):
+    del grad_lse
+    logits, targets, lse = ctx.saved_tensors
+    grad_logits = torch.ops.pgsubmission1draft7fusedce.softcapped_ce_backward(
+        logits, targets, lse, grad_losses, ctx.softcap
+    )
+    return grad_logits, None, None
+
+
+softcapped_ce_op.register_autograd(
+    _softcapped_ce_backward, setup_context=_softcapped_ce_setup_context,
+)
+
+
+def softcapped_cross_entropy(
+    logits: Tensor, targets: Tensor, softcap: float, reduction: str = "mean",
+) -> Tensor:
+    losses, _lse = torch.ops.pgsubmission1draft7fusedce.softcapped_ce(
+        logits, targets, float(softcap)
+    )
+    if reduction == "none":
+        return losses
+    if reduction == "sum":
+        return losses.sum()
+    if reduction == "mean":
+        return losses.mean()
+    raise ValueError(f"Unsupported reduction={reduction!r}")
+
+
+class Hyperparameters:
+    data_dir = os.environ.get("DATA_DIR", "./data/")
+    seed = int(os.environ.get("SEED", 1337))
+    run_id = os.environ.get("RUN_ID", str(uuid.uuid4()))
+    iterations = int(os.environ.get("ITERATIONS", 20000))
+    warmdown_frac = float(os.environ.get("WARMDOWN_FRAC", 0.75))
+    warmup_steps = int(os.environ.get("WARMUP_STEPS", 20))
+    train_batch_tokens = int(os.environ.get("TRAIN_BATCH_TOKENS", 786432))
+    # Fused softcapped CE (Triton). Training-only — forward_logits eval path still
+    # uses eager softcap+F.cross_entropy. Default ON since validated as at-worst neutral.
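+    # Equivalence sketch (illustrative, not the kernel itself): for a logits
+    # row x with target t, the eager path computes
+    #   loss = logsumexp(cap(x)) - cap(x)[t],  cap(v) = softcap * tanh(v / softcap).
+    # The kernel instead evaluates z(v) = 2*softcap*sigmoid(2v/softcap), which
+    # equals cap(v) + softcap; the constant shift cancels between the LSE and
+    # target terms, so the loss is unchanged up to fp32 accumulation order.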
+ fused_ce_enabled = bool(int(os.environ.get("FUSED_CE_ENABLED", "1"))) + train_seq_len = int(os.environ.get("TRAIN_SEQ_LEN", 2048)) + train_log_every = int(os.environ.get("TRAIN_LOG_EVERY", 500)) + max_wallclock_seconds = float(os.environ.get("MAX_WALLCLOCK_SECONDS", 6e2)) + val_batch_tokens = int(os.environ.get("VAL_BATCH_TOKENS", 524288)) + eval_seq_len = int(os.environ.get("EVAL_SEQ_LEN", 2048)) + val_loss_every = int(os.environ.get("VAL_LOSS_EVERY", 4000)) + vocab_size = int(os.environ.get("VOCAB_SIZE", 8192)) + num_layers = int(os.environ.get("NUM_LAYERS", 11)) + xsa_last_n = int(os.environ.get("XSA_LAST_N", 11)) + model_dim = int(os.environ.get("MODEL_DIM", 512)) + num_kv_heads = int(os.environ.get("NUM_KV_HEADS", 4)) + num_heads = int(os.environ.get("NUM_HEADS", 8)) + mlp_mult = float(os.environ.get("MLP_MULT", 4.0)) + skip_gates_enabled = bool(int(os.environ.get("SKIP_GATES_ENABLED", "1"))) + tie_embeddings = bool(int(os.environ.get("TIE_EMBEDDINGS", "1"))) + logit_softcap = float(os.environ.get("LOGIT_SOFTCAP", 3e1)) + rope_base = float(os.environ.get("ROPE_BASE", 1e4)) + rope_dims = int(os.environ.get("ROPE_DIMS", 16)) + rope_train_seq_len = int(os.environ.get("ROPE_TRAIN_SEQ_LEN", 2048)) + rope_yarn = bool(int(os.environ.get("ROPE_YARN", "0"))) + ln_scale = bool(int(os.environ.get("LN_SCALE", "1"))) + qk_gain_init = float(os.environ.get("QK_GAIN_INIT", 5.0)) + num_loops = int(os.environ.get("NUM_LOOPS", 2)) + loop_start = int(os.environ.get("LOOP_START", 3)) + loop_end = int(os.environ.get("LOOP_END", 5)) + enable_looping_at = float(os.environ.get("ENABLE_LOOPING_AT", 0.35)) + parallel_start_layer = int(os.environ.get("PARALLEL_START_LAYER", 8)) + parallel_final_lane = os.environ.get("PARALLEL_FINAL_LANE", "mean") + min_lr = float(os.environ.get("MIN_LR", 0.0)) + embed_lr = float(os.environ.get("EMBED_LR", 0.6)) + tied_embed_lr = float(os.environ.get("TIED_EMBED_LR", 0.03)) + tied_embed_init_std = float(os.environ.get("TIED_EMBED_INIT_STD", 0.005)) + matrix_lr = float(os.environ.get("MATRIX_LR", 0.026)) + scalar_lr = float(os.environ.get("SCALAR_LR", 0.02)) + muon_momentum = float(os.environ.get("MUON_MOMENTUM", 0.97)) + muon_backend_steps = int(os.environ.get("MUON_BACKEND_STEPS", 5)) + muon_momentum_warmup_start = float( + os.environ.get("MUON_MOMENTUM_WARMUP_START", 0.92) + ) + muon_momentum_warmup_steps = int(os.environ.get("MUON_MOMENTUM_WARMUP_STEPS", 1500)) + muon_row_normalize = bool(int(os.environ.get("MUON_ROW_NORMALIZE", "1"))) + beta1 = float(os.environ.get("BETA1", 0.9)) + beta2 = float(os.environ.get("BETA2", 0.95)) + adam_eps = float(os.environ.get("ADAM_EPS", 1e-08)) + grad_clip_norm = float(os.environ.get("GRAD_CLIP_NORM", 0.3)) + eval_stride = int(os.environ.get("EVAL_STRIDE", 64)) + adam_wd = float(os.environ.get("ADAM_WD", 0.02)) + muon_wd = float(os.environ.get("MUON_WD", 0.095)) + embed_wd = float(os.environ.get("EMBED_WD", 0.085)) + ema_decay = float(os.environ.get("EMA_DECAY", 0.9965)) + ttt_enabled = bool(int(os.environ.get("TTT_ENABLED", "1"))) + ttt_lora_rank = int(os.environ.get("TTT_LORA_RANK", 96)) + ttt_lora_lr = float(os.environ.get("TTT_LORA_LR", 0.0001)) + ttt_chunk_size = int(os.environ.get("TTT_CHUNK_SIZE", 48)) + ttt_eval_seq_len = int(os.environ.get("TTT_EVAL_SEQ_LEN", 2048)) + ttt_batch_size = int(os.environ.get("TTT_BATCH_SIZE", 64)) + ttt_grad_steps = int(os.environ.get("TTT_GRAD_STEPS", 1)) + ttt_weight_decay = float(os.environ.get("TTT_WEIGHT_DECAY", 1.0)) + ttt_beta1 = float(os.environ.get("TTT_BETA1", 0)) + ttt_beta2 = 
float(os.environ.get("TTT_BETA2", 0.999)) + ttt_k_lora = bool(int(os.environ.get("TTT_K_LORA", "1"))) + ttt_mlp_lora = bool(int(os.environ.get("TTT_MLP_LORA", "1"))) + ttt_o_lora = bool(int(os.environ.get("TTT_O_LORA", "1"))) + ttt_optimizer = os.environ.get("TTT_OPTIMIZER", "adam") + ttt_eval_batches = os.environ.get("TTT_EVAL_BATCHES", "") + val_doc_fraction = float(os.environ.get("VAL_DOC_FRACTION", 1.0)) + compressor = os.environ.get("COMPRESSOR", "brotli") + gptq_calibration_batches = int(os.environ.get("GPTQ_CALIBRATION_BATCHES", 16)) + gptq_reserve_seconds = float(os.environ.get("GPTQ_RESERVE_SECONDS", 4.0)) + phased_ttt_prefix_docs = int(os.environ.get("PHASED_TTT_PREFIX_DOCS", 2000)) + phased_ttt_num_phases = int(os.environ.get("PHASED_TTT_NUM_PHASES", 1)) + global_ttt_lr = float(os.environ.get("GLOBAL_TTT_LR", 0.001)) + global_ttt_momentum = float(os.environ.get("GLOBAL_TTT_MOMENTUM", 0.9)) + global_ttt_epochs = int(os.environ.get("GLOBAL_TTT_EPOCHS", 1)) + global_ttt_chunk_tokens = int(os.environ.get("GLOBAL_TTT_CHUNK_TOKENS", 32768)) + global_ttt_batch_seqs = int(os.environ.get("GLOBAL_TTT_BATCH_SEQS", 32)) + global_ttt_warmup_start_lr = float(os.environ.get("GLOBAL_TTT_WARMUP_START_LR", 0.0)) + global_ttt_warmup_chunks = int(os.environ.get("GLOBAL_TTT_WARMUP_CHUNKS", 0)) + global_ttt_grad_clip = float(os.environ.get("GLOBAL_TTT_GRAD_CLIP", 1.0)) + global_ttt_respect_doc_boundaries = bool(int(os.environ.get("GLOBAL_TTT_RESPECT_DOC_BOUNDARIES", "1"))) + matrix_bits = int(os.environ.get("MATRIX_BITS", 6)) + embed_bits = int(os.environ.get("EMBED_BITS", 8)) + matrix_clip_sigmas = float(os.environ.get("MATRIX_CLIP_SIGMAS", 12.85)) + embed_clip_sigmas = float(os.environ.get("EMBED_CLIP_SIGMAS", 2e1)) + mlp_clip_sigmas = float(os.environ.get("MLP_CLIP_SIGMAS", 10.0)) + attn_clip_sigmas = float(os.environ.get("ATTN_CLIP_SIGMAS", 13.0)) + # AttnOutGate (per-head multiplicative output gate, PR #1667 MarioPaerle). + # Zero-init weight: 2*sigmoid(0)=1 -> transparent at start. Source defaults to + # block input x ('proj'); 'q' uses raw Q projection output. + attn_out_gate_enabled = bool(int(os.environ.get("ATTN_OUT_GATE_ENABLED", "0"))) + attn_out_gate_src = os.environ.get("ATTN_OUT_GATE_SRC", "proj") + # SmearGate (input-dependent forward-1 token smear, modded-nanogpt @classiclarryd + # via PR #1667). x_t <- x_t + lam * sigmoid(W*x_t[:gate_window]) * x_{t-1}. + # lam=0 + W=0 -> transparent at init. + smear_gate_enabled = bool(int(os.environ.get("SMEAR_GATE_ENABLED", "0"))) + # Window: first GATE_WINDOW dims of the source feed the gate projection. + gate_window = int(os.environ.get("GATE_WINDOW", 12)) + # Gated Attention (Qwen, NeurIPS 2025 Best Paper, arXiv:2505.06708; + # qiuzh20/gated_attention). Per-head sigmoid gate on SDPA output, BEFORE + # out_proj. Gate input = full block input x (paper's headwise G1 variant + # driven from hidden_states). W_g shape (num_heads, dim), plain sigmoid. + # Near-zero init gives g~0.5 at step 0 (half attention output); per-block + # attn_scale (init 1.0) compensates during training. Name contains + # "attn_gate" so CONTROL_TENSOR_NAME_PATTERNS routes it to scalar AdamW. + gated_attn_enabled = bool(int(os.environ.get("GATED_ATTN_ENABLED", "0"))) + gated_attn_init_std = float(os.environ.get("GATED_ATTN_INIT_STD", 0.01)) + # Dedicated int8-per-row quantization for `attn_gate_w` tensors. 
These are
+    # small ((num_heads, dim) = (8, 512) = 4096 params) and bypass GPTQ via the
+    # numel<=65536 passthrough branch -> stored as fp16 (8 KB/layer, ~65 KB total
+    # compressed). int8-per-row cuts the raw tensor in half with negligible BPB
+    # impact: scales per head (8 values), symmetric quant over [-127, 127].
+    # No Hessian needed (gate weights not in collect_hessians()).
+    gated_attn_quant_gate = bool(int(os.environ.get("GATED_ATTN_QUANT_GATE", "0")))
+    # Sparse Attention Gate (modded-nanogpt-style). Keeps dense SDPA and only
+    # swaps the output-gate input to the first GATE_WINDOW residual dims.
+    # W_g: (num_heads, gate_window) = (8, 12) = 96 params/layer (~1K total over
+    # 11 layers), vs dense GatedAttn's (8, 512) = 4096 params/layer (~45K total,
+    # a ~44K diff). Name "attn_gate_w" is shared so quant routing and the int8
+    # gate passthrough Just Work; passthrough still applies via
+    # GATED_ATTN_QUANT_GATE=1.
+    # Mutually exclusive with ATTN_OUT_GATE_ENABLED and GATED_ATTN_ENABLED.
+    sparse_attn_gate_enabled = bool(int(os.environ.get("SPARSE_ATTN_GATE_ENABLED", "0")))
+    sparse_attn_gate_init_std = float(os.environ.get("SPARSE_ATTN_GATE_INIT_STD", 0.0))
+    sparse_attn_gate_scale = float(os.environ.get("SPARSE_ATTN_GATE_SCALE", 1.0))
+    # LQER asymmetric rank-k correction on top-K quant-error tensors (PR #1530 v2 port).
+    # Computes SVD of E = W_fp - W_quant, packs top-r A,B as INT2/INT4 (asym) or INTk (sym).
+    lqer_enabled = bool(int(os.environ.get("LQER_ENABLED", "1")))
+    lqer_rank = int(os.environ.get("LQER_RANK", 4))
+    lqer_top_k = int(os.environ.get("LQER_TOP_K", 3))
+    lqer_factor_bits = int(os.environ.get("LQER_FACTOR_BITS", 4))
+    lqer_asym_enabled = bool(int(os.environ.get("LQER_ASYM_ENABLED", "1")))
+    lqer_asym_group = int(os.environ.get("LQER_ASYM_GROUP", "64"))
+    distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ
+    rank = int(os.environ.get("RANK", "0"))
+    world_size = int(os.environ.get("WORLD_SIZE", "1"))
+    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
+    is_main_process = rank == 0
+    grad_accum_steps = 8 // world_size
+    # CaseOps integration: optional override of dataset root + tokenizer path.
+    # When CASEOPS_ENABLED=1, the wrapper loads a per-token byte sidecar
+    # (fineweb_val_bytes_*.bin, identical shard layout to val_*.bin) and uses
+    # it as the canonical raw-byte budget for BPB accounting. The sidecar
+    # REPLACES the build_sentencepiece_luts byte-counting path entirely.
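+    # BPB accounting sketch under the sidecar (illustrative): with per-token
+    # losses l_i in nats and sidecar byte budgets b_i over the scored positions,
+    #   val_bpb = sum_i(l_i) / (ln(2) * sum_i(b_i)).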
+ caseops_enabled = bool(int(os.environ.get("CASEOPS_ENABLED", "0"))) + _default_caseops_data = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "datasets", + "fineweb10B_sp8192_lossless_caps_caseops_v1_reserved", + ) + _default_caseops_tok = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "tokenizers", + "fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model", + ) + if caseops_enabled: + datasets_dir = os.environ.get("DATA_PATH", _default_caseops_data) + tokenizer_path = os.environ.get("TOKENIZER_PATH", _default_caseops_tok) + else: + datasets_dir = os.environ.get( + "DATA_PATH", + os.path.join(data_dir, "datasets", f"fineweb10B_sp{vocab_size}"), + ) + tokenizer_path = os.environ.get( + "TOKENIZER_PATH", + os.path.join(data_dir, "tokenizers", f"fineweb_{vocab_size}_bpe.model"), + ) + train_files = os.path.join(datasets_dir, "fineweb_train_*.bin") + val_files = os.path.join(datasets_dir, "fineweb_val_*.bin") + val_bytes_files = os.path.join(datasets_dir, "fineweb_val_bytes_*.bin") + artifact_dir = os.environ.get("ARTIFACT_DIR", "") + logfile = ( + os.path.join(artifact_dir, f"{run_id}.txt") + if artifact_dir + else f"logs/{run_id}.txt" + ) + model_path = ( + os.path.join(artifact_dir, "final_model.pt") + if artifact_dir + else "final_model.pt" + ) + quantized_model_path = ( + os.path.join(artifact_dir, "final_model.int6.ptz") + if artifact_dir + else "final_model.int6.ptz" + ) + + +# ===== 2026-04-30 SP10240 CaseOps MLP3.75 late045 promoted test car ===== +# Source of truth for this new experiment. The launcher only checks files and +# calls this run.py; it does not define model or eval conditions. +TEST_ID = "2026-04-30_pr1855_sp10240_caseops_mlp375_late045_8x" +TEST_DATE = "2026-04-30" +RUN_LABEL = "standard_8x" +RUN_KIND = "new_experiment" +SOURCE_PARENT = "legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py" +SOURCE_PARENT_SHA256 = "454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b" +SOURCE_TOKENIZER_LANE = "scripts/prepare_sp10240_caseops_data.py" +PARENT_RUN = "2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x" +HYPOTHESIS = ( + "Promote the best legal SP10240 CaseOps Side4 mechanics candidate to a " + "clean standard 8x run: 11L MLP3.75 with loop2 enabled at 0.45, keeping " + "PR1855 LQER/pergroup/phased-TTT compression/eval machinery fixed." 
+) +SIZE_CAP_BYTES = 16000000 +BUILD_SECONDS = 600 +EVAL_SECONDS = 600 + +Hyperparameters.test_id = TEST_ID +Hyperparameters.test_date = TEST_DATE +Hyperparameters.run_label = RUN_LABEL +Hyperparameters.run_kind = RUN_KIND +Hyperparameters.source_parent = SOURCE_PARENT +Hyperparameters.source_parent_sha256 = SOURCE_PARENT_SHA256 +Hyperparameters.source_tokenizer_lane = SOURCE_TOKENIZER_LANE +Hyperparameters.parent_run = PARENT_RUN +Hyperparameters.hypothesis = HYPOTHESIS +Hyperparameters.size_cap_bytes = SIZE_CAP_BYTES +Hyperparameters.build_seconds = BUILD_SECONDS +Hyperparameters.eval_seconds = EVAL_SECONDS + +Hyperparameters.data_dir = "/workspace/SOTA_FINAL/data" +_caseops_root = os.path.join( + Hyperparameters.data_dir, "datasets", "fineweb10B_sp10240_caseops", "datasets" +) +Hyperparameters.vocab_size = 10240 +Hyperparameters.caseops_enabled = True +Hyperparameters.datasets_dir = os.path.join( + _caseops_root, "datasets", "fineweb10B_sp10240_lossless_caps_caseops_v1_reserved" +) +Hyperparameters.train_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_train_*.bin") +Hyperparameters.val_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_*.bin") +Hyperparameters.val_bytes_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_bytes_*.bin") +Hyperparameters.tokenizer_path = os.path.join( + _caseops_root, "tokenizers", "fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model" +) + +Hyperparameters.seed = 42 +Hyperparameters.run_id = "pr1855_sp10240_caseops_mlp375_late045_8x_seed42" +Hyperparameters.artifact_dir = "logs" +Hyperparameters.logfile = os.path.join(Hyperparameters.artifact_dir, f"{Hyperparameters.run_id}.txt") +Hyperparameters.model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.pt") +Hyperparameters.quantized_model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.int6.ptz") +Hyperparameters.iterations = 20000 +Hyperparameters.max_wallclock_seconds = float(BUILD_SECONDS) +Hyperparameters.num_layers = 11 +Hyperparameters.xsa_last_n = 11 +Hyperparameters.model_dim = 512 +Hyperparameters.num_heads = 8 +Hyperparameters.num_kv_heads = 4 +Hyperparameters.mlp_mult = 3.75 +Hyperparameters.num_loops = 2 +Hyperparameters.loop_start = 3 +Hyperparameters.loop_end = 5 +Hyperparameters.enable_looping_at = 0.45 +Hyperparameters.parallel_start_layer = 8 +Hyperparameters.qk_gain_init = 5.25 +Hyperparameters.warmdown_frac = 0.85 +Hyperparameters.warmup_steps = 20 +Hyperparameters.min_lr = 0.1 +Hyperparameters.matrix_lr = 0.026 +Hyperparameters.beta2 = 0.99 +Hyperparameters.muon_backend_steps = 5 +Hyperparameters.grad_clip_norm = 0.3 +Hyperparameters.val_loss_every = 0 +Hyperparameters.ttt_enabled = True +Hyperparameters.ttt_lora_rank = 80 +Hyperparameters.ttt_chunk_size = 48 +Hyperparameters.ttt_weight_decay = 0.5 +Hyperparameters.ttt_beta2 = 0.99 +Hyperparameters.phased_ttt_prefix_docs = 2500 +Hyperparameters.phased_ttt_num_phases = 3 +Hyperparameters.global_ttt_momentum = 0.9 +Hyperparameters.compressor = "pergroup" +Hyperparameters.gptq_reserve_seconds = 0.5 +Hyperparameters.gptq_calibration_batches = 16 +Hyperparameters.matrix_bits = 6 +Hyperparameters.embed_bits = 7 +Hyperparameters.mlp_clip_sigmas = 11.5 +Hyperparameters.attn_clip_sigmas = 13.0 +Hyperparameters.embed_clip_sigmas = 14.0 +Hyperparameters.gated_attn_quant_gate = True +Hyperparameters.sparse_attn_gate_enabled = True +Hyperparameters.sparse_attn_gate_scale = 0.5 +Hyperparameters.gate_window = 12 +Hyperparameters.smear_gate_enabled = True 
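+# Sketch of the LQER correction configured just below (illustrative only and
+# never called; the committed packing additionally quantizes the factors to
+# INT4 in asymmetric per-64 groups, which this sketch skips):
+def _lqer_rank_r_sketch(w_fp: Tensor, w_q: Tensor, r: int = 4) -> Tensor:
+    # Rank-r SVD of the quantization error E = W_fp - W_quant; the top-r
+    # factors form a low-rank additive correction on top of the quantized weight.
+    U, S, Vh = torch.linalg.svd((w_fp - w_q).float(), full_matrices=False)
+    return w_q.float() + (U[:, :r] * S[:r]) @ Vh[:r]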
+Hyperparameters.lqer_enabled = True +Hyperparameters.lqer_asym_enabled = True +Hyperparameters.lqer_rank = 4 +Hyperparameters.lqer_factor_bits = 4 +Hyperparameters.lqer_asym_group = 64 +Hyperparameters.lqer_top_k = 3 +Hyperparameters.fused_ce_enabled = True + +_logger_hparams = None + + +def set_logging_hparams(h): + global _logger_hparams + _logger_hparams = h + + +def log(msg, console=True): + if _logger_hparams is None: + print(msg) + return + if _logger_hparams.is_main_process: + if console: + print(msg) + if _logger_hparams.logfile is not None: + with open(_logger_hparams.logfile, "a", encoding="utf-8") as f: + print(msg, file=f) + + +class ValidationData: + def __init__(self, h, device): + self.sp = spm.SentencePieceProcessor(model_file=h.tokenizer_path) + if int(self.sp.vocab_size()) != h.vocab_size: + raise ValueError( + f"VOCAB_SIZE={h.vocab_size} does not match tokenizer vocab_size={int(self.sp.vocab_size())}" + ) + self.val_tokens = load_validation_tokens(h.val_files, h.eval_seq_len) + self.caseops_enabled = bool(getattr(h, "caseops_enabled", False)) + if self.caseops_enabled: + self.base_bytes_lut = None + self.has_leading_space_lut = None + self.is_boundary_token_lut = None + else: + ( + self.base_bytes_lut, + self.has_leading_space_lut, + self.is_boundary_token_lut, + ) = build_sentencepiece_luts(self.sp, h.vocab_size, device) + self.val_bytes = None + if self.caseops_enabled: + self.val_bytes = load_validation_byte_sidecar( + h.val_bytes_files, h.eval_seq_len, self.val_tokens.numel() + ) + + +def build_sentencepiece_luts(sp, vocab_size, device): + sp_vocab_size = int(sp.vocab_size()) + assert ( + sp.piece_to_id("▁") != sp.unk_id() + ), "Tokenizer must have '▁' (space) as its own token for correct BPB byte counting" + table_size = max(sp_vocab_size, vocab_size) + base_bytes_np = np.zeros((table_size,), dtype=np.int16) + has_leading_space_np = np.zeros((table_size,), dtype=np.bool_) + is_boundary_token_np = np.ones((table_size,), dtype=np.bool_) + for token_id in range(sp_vocab_size): + if sp.is_control(token_id) or sp.is_unknown(token_id) or sp.is_unused(token_id): + continue + is_boundary_token_np[token_id] = False + if sp.is_byte(token_id): + base_bytes_np[token_id] = 1 + continue + piece = sp.id_to_piece(token_id) + if piece.startswith("▁"): + has_leading_space_np[token_id] = True + piece = piece[1:] + base_bytes_np[token_id] = len(piece.encode("utf-8")) + return ( + torch.tensor(base_bytes_np, dtype=torch.int16, device=device), + torch.tensor(has_leading_space_np, dtype=torch.bool, device=device), + torch.tensor(is_boundary_token_np, dtype=torch.bool, device=device), + ) + + +def load_validation_tokens(pattern, seq_len): + # Filter out CaseOps byte sidecar shards which share the val_*.bin glob. + files = [ + Path(p) + for p in sorted(glob.glob(pattern)) + if "_bytes_" not in Path(p).name + ] + if not files: + raise FileNotFoundError(f"No files found for pattern: {pattern}") + tokens = torch.cat([load_data_shard(file) for file in files]).contiguous() + usable = (tokens.numel() - 1) // seq_len * seq_len + if usable <= 0: + raise ValueError(f"Validation split is too short for TRAIN_SEQ_LEN={seq_len}") + return tokens[: usable + 1] + + +def load_validation_byte_sidecar(pattern, seq_len, expected_len): + """Load CaseOps per-token byte sidecar(s). Same shard layout as token shards + (256 int32 header + uint16 array). Each entry = canonical raw-text byte + budget for that token in the corresponding val shard. Returns a CPU + int16 tensor sliced to match expected_len (i.e. 
val_tokens length)."""
+    files = [Path(p) for p in sorted(glob.glob(pattern))]
+    if not files:
+        raise FileNotFoundError(f"No byte sidecar files for pattern: {pattern}")
+    shards = [load_data_shard(file) for file in files]
+    # load_data_shard returns uint16 — that's exactly what the sidecar stores.
+    bytes_full = torch.cat(shards).contiguous()
+    if bytes_full.numel() < expected_len:
+        raise ValueError(
+            f"Byte sidecar too short: {bytes_full.numel()} < val_tokens {expected_len}"
+        )
+    return bytes_full[:expected_len].to(torch.int32)
+
+
+BOS_ID = None
+
+
+def load_data_shard(file):
+    # Shard layout (see load_validation_byte_sidecar docstring): 256 int32
+    # header, with header[2] = token count, followed by a uint16 array.
+    header_bytes = 256 * np.dtype("<i4").itemsize
+    with open(file, "rb") as f:
+        header = np.frombuffer(f.read(header_bytes), dtype="<i4")
+        num_tokens = int(header[2])
+        tokens = np.frombuffer(f.read(2 * num_tokens), dtype="<u2")
+    return torch.from_numpy(tokens.copy())
+
+
+def _read_num_tokens(file):
+    # Token count from the shard header without reading the payload.
+    with open(file, "rb") as f:
+        header = np.frombuffer(f.read(256 * np.dtype("<i4").itemsize), dtype="<i4")
+    return int(header[2])
+
+
+def _get_shard_memmap(file):
+    # Read-only uint16 view of the token payload, skipping the 256 int32 header.
+    return np.memmap(file, dtype="<u2", mode="r", offset=256 * np.dtype("<i4").itemsize)
+
+
+def get_next_multiple_of_n(v, n):
+    return n * ((v + n - 1) // n)
+
+
+def _build_cu_seqlens(doc_starts, total_len, device, max_doc_len, bucket_size):
+    # Segment boundaries for varlen attention: every doc start opens a segment,
+    # and segments longer than max_doc_len are split into max_doc_len pieces.
+    starts = [0] + [int(s) for s in doc_starts if 0 < s < total_len]
+    seg_starts = []
+    for start, end in zip(starts, starts[1:] + [total_len]):
+        if max_doc_len is not None and max_doc_len > 0:
+            pos = start
+            while pos < end:
+                seg_starts.append(pos)
+                pos += max_doc_len
+        else:
+            seg_starts.append(start)
+    boundaries = seg_starts + [total_len]
+    padded_len = get_next_multiple_of_n(len(boundaries), bucket_size)
+    cu = torch.full((padded_len,), total_len, dtype=torch.int32, device=device)
+    cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device)
+    seg_ends = seg_starts[1:] + [total_len]
+    max_seqlen = max(end - start for start, end in zip(seg_starts, seg_ends))
+    return cu, max_seqlen
+
+class DocumentPackingLoader:
+    _shard_pool = ThreadPoolExecutor(1)
+
+    def __init__(self, h, device, cu_bucket_size=64):
+        self.rank = h.rank
+        self.world_size = h.world_size
+        self.device = device
+        self.cu_bucket_size = cu_bucket_size
+        self.max_seq_len = h.train_seq_len
+        all_files = [Path(p) for p in sorted(glob.glob(h.train_files))]
+        if not all_files:
+            raise FileNotFoundError(f"No files found for pattern: {h.train_files}")
+        self.files = all_files
+        self.file_iter = iter(self.files)
+        self._init_shard(load_data_shard(next(self.file_iter)))
+        self._next_shard = self._submit_next_shard()
+        self._batch_pool = ThreadPoolExecutor(1)
+        self._prefetch_queue = []
+
+    def _init_shard(self, tokens):
+        global BOS_ID
+        self.tokens = tokens
+        self.shard_size = tokens.numel()
+        if BOS_ID is None:
+            BOS_ID = 1
+        self.bos_idx = (
+            (tokens == BOS_ID).nonzero(as_tuple=True)[0].to(torch.int64).cpu().numpy()
+        )
+        self.cursor = int(self.bos_idx[0])
+
+    def _submit_next_shard(self):
+        try:
+            path = next(self.file_iter)
+            return self._shard_pool.submit(load_data_shard, path)
+        except StopIteration:
+            return None
+
+    def _advance_shard(self):
+        if self._next_shard is None:
+            self.file_iter = iter(self.files)
+            self._next_shard = self._shard_pool.submit(
+                load_data_shard, next(self.file_iter)
+            )
+        self._init_shard(self._next_shard.result())
+        self._next_shard = self._submit_next_shard()
+
+    def _local_doc_starts(self, local_start, total_len):
+        lo = np.searchsorted(self.bos_idx, local_start, side="left")
+        hi = np.searchsorted(self.bos_idx, local_start + total_len, side="left")
+        return (self.bos_idx[lo:hi] - local_start).tolist()
+
+    def _prepare_batch(self, num_tokens_local, max_seq_len):
+        per_rank_span = num_tokens_local + 1
+        global_span = per_rank_span * self.world_size
+        while self.cursor + global_span > self.shard_size:
+            self._advance_shard()
+        local_start = self.cursor + self.rank * per_rank_span
+        buf = self.tokens[local_start : local_start + per_rank_span]
+        inputs = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        targets = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        inputs.copy_(buf[:-1])
+        targets.copy_(buf[1:])
+        starts = self._local_doc_starts(local_start, inputs.numel())
+        cu_seqlens, max_seqlen = _build_cu_seqlens(
+            starts, inputs.numel(), inputs.device, max_seq_len, self.cu_bucket_size
+        )
+        cu_seqlens =
cu_seqlens.pin_memory() + self.cursor += global_span + return inputs, targets, cu_seqlens, max_seqlen + + def next_batch(self, global_tokens, grad_accum_steps): + num_tokens_local = global_tokens // (self.world_size * grad_accum_steps) + while len(self._prefetch_queue) < 2: + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + inputs, targets, cu_seqlens, max_seqlen = self._prefetch_queue.pop(0).result() + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + return ( + inputs[None].to(self.device, non_blocking=True), + targets[None].to(self.device, non_blocking=True), + cu_seqlens.to(self.device, non_blocking=True), + max_seqlen, + ) + + +class ShuffledSequenceLoader: + def __init__(self, h, device): + self.world_size = h.world_size + self.seq_len = h.train_seq_len + self.device = device + all_files = [Path(p) for p in sorted(glob.glob(h.train_files))] + if not all_files: + raise FileNotFoundError(f"No files found for pattern: {h.train_files}") + self.files = all_files[h.rank :: h.world_size] + self.rng = np.random.Generator(np.random.PCG64(h.rank)) + self.num_tokens = [_read_num_tokens(f) for f in self.files] + self.start_inds = [[] for _ in self.files] + for si in range(len(self.files)): + self._reset_shard(si) + + def _reset_shard(self, si): + max_phase = min( + self.seq_len - 1, max(0, self.num_tokens[si] - self.seq_len - 1) + ) + phase = int(self.rng.integers(max_phase + 1)) if max_phase > 0 else 0 + num_sequences = (self.num_tokens[si] - 1 - phase) // self.seq_len + sequence_order = self.rng.permutation(num_sequences) + self.start_inds[si] = (phase + sequence_order * self.seq_len).tolist() + + def next_batch(self, global_tokens, grad_accum_steps): + device_tokens = global_tokens // (self.world_size * grad_accum_steps) + device_batch_size = device_tokens // self.seq_len + remaining = np.array([len(s) for s in self.start_inds], dtype=np.float64) + x = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + y = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + for bi in range(device_batch_size): + total = remaining.sum() + if total <= 0: + for si in range(len(self.files)): + self._reset_shard(si) + remaining = np.array( + [len(s) for s in self.start_inds], dtype=np.float64 + ) + total = remaining.sum() + probs = remaining / total + si = int(self.rng.choice(len(self.files), p=probs)) + start_ind = self.start_inds[si].pop() + remaining[si] -= 1 + mm = _get_shard_memmap(self.files[si]) + window = torch.as_tensor( + np.array(mm[start_ind : start_ind + self.seq_len + 1], dtype=np.int64) + ) + x[bi] = window[:-1] + y[bi] = window[1:] + return x.to(self.device, non_blocking=True), y.to( + self.device, non_blocking=True + ) + + +class RMSNorm(nn.Module): + def __init__(self, eps=None): + super().__init__() + self.eps = eps + + def forward(self, x): + return F.rms_norm(x, (x.size(-1),), eps=self.eps) + + +class CastedLinear(nn.Linear): + def forward(self, x): + w = self.weight.to(x.dtype) + bias = self.bias.to(x.dtype) if self.bias is not None else None + return F.linear(x, w, bias) + + +@triton.jit +def linear_leaky_relu_square_kernel( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M: tl.constexpr, + BLOCK_SIZE_N: tl.constexpr, + BLOCK_SIZE_K: tl.constexpr, + NUM_SMS: tl.constexpr, + FORWARD: tl.constexpr, +): + dtype = tl.bfloat16 + start_pid = tl.program_id(axis=0) + num_pid_m = tl.cdiv(M, BLOCK_SIZE_M) + num_pid_n = 
tl.cdiv(N, BLOCK_SIZE_N) + k_tiles = tl.cdiv(K, BLOCK_SIZE_K) + num_tiles = num_pid_m * num_pid_n + tile_id_c = start_pid - NUM_SMS + for tile_id in tl.range(start_pid, num_tiles, NUM_SMS, flatten=True): + pid_m = tile_id // num_pid_n + pid_n = tile_id % num_pid_n + offs_am = pid_m * BLOCK_SIZE_M + offs_bn = pid_n * BLOCK_SIZE_N + accumulator = tl.zeros((BLOCK_SIZE_M, BLOCK_SIZE_N), dtype=tl.float32) + for ki in range(k_tiles): + offs_k = ki * BLOCK_SIZE_K + a = a_desc.load([offs_am, offs_k]) + b = b_desc.load([offs_bn, offs_k]) + accumulator = tl.dot(a, b.T, accumulator) + tile_id_c += NUM_SMS + offs_am_c = offs_am + offs_bn_c = offs_bn + acc = tl.reshape(accumulator, (BLOCK_SIZE_M, 2, BLOCK_SIZE_N // 2)) + acc = tl.permute(acc, (0, 2, 1)) + acc0, acc1 = tl.split(acc) + c0 = acc0.to(dtype) + c1 = acc1.to(dtype) + if not FORWARD: + pre0 = aux_desc.load([offs_am_c, offs_bn_c]) + pre1 = aux_desc.load([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2]) + c0 = c0 * tl.where(pre0 > 0, 2.0 * pre0, 0.5 * pre0) + c1 = c1 * tl.where(pre1 > 0, 2.0 * pre1, 0.5 * pre1) + c_desc.store([offs_am_c, offs_bn_c], c0) + c_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], c1) + if FORWARD: + aux0 = tl.where(c0 > 0, c0, 0.5 * c0) + aux1 = tl.where(c1 > 0, c1, 0.5 * c1) + aux_desc.store([offs_am_c, offs_bn_c], aux0 * aux0) + aux_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], aux1 * aux1) + + +def linear_leaky_relu_square(a, b, aux=None): + M, K = a.shape + N, K2 = b.shape + assert K == K2 + c = torch.empty((M, N), device=a.device, dtype=a.dtype) + forward = aux is None + if aux is None: + aux = torch.empty((M, N), device=a.device, dtype=a.dtype) + num_sms = torch.cuda.get_device_properties(a.device).multi_processor_count + BLOCK_SIZE_M, BLOCK_SIZE_N, BLOCK_SIZE_K = 256, 128, 64 + num_stages = 4 if forward else 3 + a_desc = TensorDescriptor.from_tensor(a, [BLOCK_SIZE_M, BLOCK_SIZE_K]) + b_desc = TensorDescriptor.from_tensor(b, [BLOCK_SIZE_N, BLOCK_SIZE_K]) + c_desc = TensorDescriptor.from_tensor(c, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + aux_desc = TensorDescriptor.from_tensor(aux, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + grid = lambda _meta: ( + min(num_sms, triton.cdiv(M, BLOCK_SIZE_M) * triton.cdiv(N, BLOCK_SIZE_N)), + ) + linear_leaky_relu_square_kernel[grid]( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M=BLOCK_SIZE_M, + BLOCK_SIZE_N=BLOCK_SIZE_N, + BLOCK_SIZE_K=BLOCK_SIZE_K, + NUM_SMS=num_sms, + FORWARD=forward, + num_stages=num_stages, + num_warps=8, + ) + if forward: + return c, aux + return c + + +class FusedLinearLeakyReLUSquareFunction(torch.autograd.Function): + @staticmethod + def forward(ctx, x, w1, w2): + x_flat = x.reshape(-1, x.shape[-1]) + pre, post = linear_leaky_relu_square(x_flat, w1) + out = F.linear(post, w2) + ctx.save_for_backward(x, w1, w2, pre, post) + return out.view(*x.shape[:-1], out.shape[-1]) + + @staticmethod + def backward(ctx, grad_output): + x, w1, w2, pre, post = ctx.saved_tensors + x_flat = x.reshape(-1, x.shape[-1]) + grad_output_flat = grad_output.reshape(-1, grad_output.shape[-1]) + dw2 = grad_output_flat.T @ post + dpre = linear_leaky_relu_square(grad_output_flat, w2.T.contiguous(), aux=pre) + dw1 = dpre.T @ x_flat + dx = dpre @ w1 + return dx.view_as(x), dw1, dw2 + + +FusedLeakyReLUSquareMLP = FusedLinearLeakyReLUSquareFunction.apply + + +class Rotary(nn.Module): + def __init__(self, dim, base=1e4, train_seq_len=1024, rope_dims=0, yarn=True): + super().__init__() + self.dim = dim + self.base = base + self.train_seq_len = train_seq_len + self.yarn 
= yarn + self.rope_dims = rope_dims if rope_dims > 0 else dim + inv_freq = 1.0 / base ** ( + torch.arange(0, self.rope_dims, 2, dtype=torch.float32) / self.rope_dims + ) + self.register_buffer("inv_freq", inv_freq, persistent=False) + self._seq_len_cached = 0 + self._cos_cached = None + self._sin_cached = None + + def forward(self, seq_len, device, dtype): + if ( + self._cos_cached is None + or self._sin_cached is None + or self._seq_len_cached < seq_len + or self._cos_cached.device != device + ): + rd = self.rope_dims + if self.yarn and seq_len > self.train_seq_len: + scale = seq_len / self.train_seq_len + new_base = self.base * scale ** (rd / (rd - 2)) + inv_freq = 1.0 / new_base ** ( + torch.arange(0, rd, 2, dtype=torch.float32, device=device) / rd + ) + else: + inv_freq = self.inv_freq.float().to(device) + t = torch.arange(seq_len, device=device, dtype=torch.float32) + freqs = torch.outer(t, inv_freq) + self._cos_cached = freqs.cos()[None, :, None, :] + self._sin_cached = freqs.sin()[None, :, None, :] + self._seq_len_cached = seq_len + return self._cos_cached[:, :seq_len].to(dtype=dtype), self._sin_cached[:, :seq_len].to(dtype=dtype) + + +def apply_rotary_emb(x, cos, sin, rope_dims=0): + if rope_dims > 0 and rope_dims < x.size(-1): + x_rope, x_pass = x[..., :rope_dims], x[..., rope_dims:] + half = rope_dims // 2 + x1, x2 = x_rope[..., :half], x_rope[..., half:] + x_rope = torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + return torch.cat((x_rope, x_pass), dim=-1) + half = x.size(-1) // 2 + x1, x2 = x[..., :half], x[..., half:] + return torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + + +class CausalSelfAttention(nn.Module): + def __init__( + self, dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=True, + attn_out_gate=False, attn_out_gate_src="proj", gate_window=12, + gated_attn=False, gated_attn_init_std=0.01, + sparse_attn_gate=False, sparse_attn_gate_init_std=0.0, sparse_attn_gate_scale=1.0, + ): + super().__init__() + if dim % num_heads != 0: + raise ValueError("model_dim must be divisible by num_heads") + if num_heads % num_kv_heads != 0: + raise ValueError("num_heads must be divisible by num_kv_heads") + if int(attn_out_gate) + int(gated_attn) + int(sparse_attn_gate) > 1: + raise ValueError( + "attn_out_gate, gated_attn, and sparse_attn_gate are mutually exclusive" + ) + self.num_heads = num_heads + self.num_kv_heads = num_kv_heads + self.head_dim = dim // num_heads + if self.head_dim % 2 != 0: + raise ValueError("head_dim must be even for RoPE") + self.q_gain = nn.Parameter( + torch.full((num_heads,), qk_gain_init, dtype=torch.float32) + ) + self.rope_dims = 0 + self.rotary = Rotary(self.head_dim, base=rope_base, train_seq_len=train_seq_len, yarn=yarn) + self.use_xsa = False + # AttnOutGate (PR #1667 MarioPaerle): per-head multiplicative gate on attention + # output. CastedLinear so restore_fp32_params casts back to fp32 for GPTQ. + # _zero_init -> 2*sigmoid(0)=1 -> transparent at init. + self.attn_out_gate = attn_out_gate + self.attn_out_gate_src = attn_out_gate_src + self.gate_window = gate_window + if attn_out_gate: + self.attn_gate_proj = CastedLinear(gate_window, num_heads, bias=False) + self.attn_gate_proj._zero_init = True + # Gated Attention (arXiv:2505.06708, Qwen, NeurIPS 2025). Per-head sigmoid + # gate on SDPA output, BEFORE out_proj. Gate projection W_g: (num_heads, dim). + # Name "attn_gate_w" contains "attn_gate" substring so it matches + # CONTROL_TENSOR_NAME_PATTERNS and routes to the scalar AdamW group. 
+ # fp32 Parameter -> restore_fp32_params path covers it via the ndim<2 OR + # name-pattern check (name matches "attn_gate"). Cast to x.dtype on use. + self.gated_attn = gated_attn + if gated_attn: + W = torch.empty(num_heads, dim, dtype=torch.float32) + nn.init.normal_(W, mean=0.0, std=gated_attn_init_std) + self.attn_gate_w = nn.Parameter(W) + # Sparse attention head-output gate (modded-nanogpt style). Keeps dense SDPA + # and only narrows the gate input to the first gate_window residual dims. + # W_g: (num_heads, gate_window). y_{t,h} <- sigmoid(scale * W_g_h @ x_t[:gate_window]) * y_{t,h}. + # Shares attn_gate_w name with dense GatedAttn so the quant routing + # (CONTROL_TENSOR_NAME_PATTERNS / attn_gate_w int8 passthrough) is unchanged. + self.sparse_attn_gate = sparse_attn_gate + self.sparse_attn_gate_scale = sparse_attn_gate_scale + if sparse_attn_gate: + W = torch.empty(num_heads, gate_window, dtype=torch.float32) + if sparse_attn_gate_init_std > 0: + nn.init.normal_(W, mean=0.0, std=sparse_attn_gate_init_std) + else: + nn.init.zeros_(W) + self.attn_gate_w = nn.Parameter(W) + + def _xsa_efficient(self, y, v): + B, T, H, D = y.shape + Hkv = v.size(-2) + group = H // Hkv + y_g = y.reshape(B, T, Hkv, group, D) + vn = F.normalize(v, dim=-1).unsqueeze(-2) + proj = (y_g * vn).sum(dim=-1, keepdim=True) * vn + return (y_g - proj).reshape(B, T, H, D) + + def forward(self, x, q_w, k_w, v_w, out_w, cu_seqlens=None, max_seqlen=0): + bsz, seqlen, dim = x.shape + # q_raw kept around as a tap point for attn_out_gate_src='q' (post-projection, + # pre-reshape, pre-RoPE). + q_raw = F.linear(x, q_w.to(x.dtype)) + q = q_raw.reshape(bsz, seqlen, self.num_heads, self.head_dim) + k = F.linear(x, k_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + v = F.linear(x, v_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = self.rotary(seqlen, x.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, self.rope_dims) + k = apply_rotary_emb(k, cos, sin, self.rope_dims) + q = q * self.q_gain.to(dtype=q.dtype)[None, None, :, None] + if cu_seqlens is not None: + y = flash_attn_varlen_func( + q[0], + k[0], + v[0], + cu_seqlens_q=cu_seqlens, + cu_seqlens_k=cu_seqlens, + max_seqlen_q=max_seqlen, + max_seqlen_k=max_seqlen, + causal=True, + window_size=(-1, -1), + )[None] + else: + y = flash_attn_3_func(q, k, v, causal=True) + if self.use_xsa: + y = self._xsa_efficient(y, v) + # AttnOutGate inlined (PR #1667). Inline + .contiguous() barrier so torch.compile + # fullgraph=True is happy (this avoids the @torch.compiler.disable trap that + # crashed gates v3). Per-head gate on (B,T,H,D) tensor: g shape [B,T,H], broadcast + # over D via [..., None]. zero-init weight -> 2*sigmoid(0)=1 -> transparent. + if self.attn_out_gate: + gate_src = q_raw if self.attn_out_gate_src == "q" else x + gate_in = gate_src[..., : self.gate_window].contiguous() + g = 2.0 * torch.sigmoid(self.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (arXiv:2505.06708 G1). Inline + .contiguous() barrier so + # torch.compile fullgraph=True is happy. Per-head gate on (B,T,H,D): g shape + # [B,T,H], broadcast over D via [..., None]. Paper: g = sigmoid(x @ W_g.T) + # where W_g: (H, dim). .to(x.dtype) on fp32 param before broadcast with bf16. 
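+        # Shape walk-through for the gates below (dims for this config,
+        # illustrative): x is (B, T, 512); F.linear with W_g (8, 512) gives
+        # g (B, T, 8) in (0, 1); y (B, T, 8, 64) * g[..., None] scales each
+        # head's output before the reshape back to (B, T, 512) and out_proj.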
+ if self.gated_attn: + x_c = x.contiguous() + g = torch.sigmoid(F.linear(x_c, self.attn_gate_w.to(x.dtype))) + y = y * g[..., None] + # Sparse head-output gate: narrower (gate_window) input, same shape g as GatedAttn. + if self.sparse_attn_gate: + gate_in = x[..., : self.gate_window].contiguous() + g = torch.sigmoid( + self.sparse_attn_gate_scale + * F.linear(gate_in, self.attn_gate_w.to(x.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + self._last_proj_input = y.detach() if getattr(self, "_calib", False) else None + return F.linear(y, out_w.to(x.dtype)) + + +class MLP(nn.Module): + def __init__(self, dim, mlp_mult): + super().__init__() + self.use_fused = True + + def forward(self, x, up_w, down_w): + if self.training and self.use_fused: + return FusedLeakyReLUSquareMLP(x, up_w.to(x.dtype), down_w.to(x.dtype)) + hidden = F.leaky_relu(F.linear(x, up_w.to(x.dtype)), negative_slope=0.5).square() + self._last_down_input = hidden.detach() if getattr(self, "_calib", False) else None + return F.linear(hidden, down_w.to(x.dtype)) + + +class Block(nn.Module): + def __init__( + self, + dim, + num_heads, + num_kv_heads, + mlp_mult, + rope_base, + qk_gain_init, + train_seq_len, + layer_idx=0, + ln_scale=False, + yarn=True, + attn_out_gate=False, + attn_out_gate_src="proj", + gate_window=12, + gated_attn=False, + gated_attn_init_std=0.01, + sparse_attn_gate=False, + sparse_attn_gate_init_std=0.0, + sparse_attn_gate_scale=1.0, + ): + super().__init__() + self.attn_norm = RMSNorm() + self.mlp_norm = RMSNorm() + self.attn = CausalSelfAttention( + dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=yarn, + attn_out_gate=attn_out_gate, attn_out_gate_src=attn_out_gate_src, gate_window=gate_window, + gated_attn=gated_attn, gated_attn_init_std=gated_attn_init_std, + sparse_attn_gate=sparse_attn_gate, + sparse_attn_gate_init_std=sparse_attn_gate_init_std, + sparse_attn_gate_scale=sparse_attn_gate_scale, + ) + self.mlp = MLP(dim, mlp_mult) + self.attn_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.mlp_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.resid_mix = nn.Parameter( + torch.stack((torch.ones(dim), torch.zeros(dim))).float() + ) + self.ln_scale_factor = 1.0 / math.sqrt(layer_idx + 1) if ln_scale else 1.0 + + def forward(self, x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=None, max_seqlen=0): + mix = self.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + attn_out = self.attn( + self.attn_norm(x_in) * self.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + x_out = x_in + self.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + x_out = x_out + self.mlp_scale.to(dtype=x_out.dtype)[ + None, None, : + ] * self.mlp(self.mlp_norm(x_out) * self.ln_scale_factor, up_w, down_w) + return x_out + +class GPT(nn.Module): + def __init__(self, h): + super().__init__() + if h.logit_softcap <= 0.0: + raise ValueError(f"logit_softcap must be positive, got {h.logit_softcap}") + self.tie_embeddings = h.tie_embeddings + self.tied_embed_init_std = h.tied_embed_init_std + self.logit_softcap = h.logit_softcap + self.fused_ce_enabled = bool(h.fused_ce_enabled) + self.tok_emb = nn.Embedding(h.vocab_size, h.model_dim) + self.num_layers = h.num_layers + head_dim = h.model_dim // h.num_heads + kv_dim = h.num_kv_heads * head_dim + hidden_dim = int(h.mlp_mult * h.model_dim) + self.qo_bank = nn.Parameter(torch.empty(2 * h.num_layers, h.model_dim, 
h.model_dim)) + self.kv_bank = nn.Parameter(torch.empty(2 * h.num_layers, kv_dim, h.model_dim)) + self.mlp_up_bank = nn.Parameter(torch.empty(h.num_layers, hidden_dim, h.model_dim)) + self.mlp_down_bank = nn.Parameter(torch.empty(h.num_layers, h.model_dim, hidden_dim)) + self.num_encoder_layers = h.num_layers // 2 + self.num_decoder_layers = h.num_layers - self.num_encoder_layers + self.blocks = nn.ModuleList( + [ + Block( + h.model_dim, + h.num_heads, + h.num_kv_heads, + h.mlp_mult, + h.rope_base, + h.qk_gain_init, + h.train_seq_len, + layer_idx=i, + ln_scale=h.ln_scale, + yarn=h.rope_yarn, + attn_out_gate=h.attn_out_gate_enabled, + attn_out_gate_src=h.attn_out_gate_src, + gate_window=h.gate_window, + gated_attn=h.gated_attn_enabled, + gated_attn_init_std=h.gated_attn_init_std, + sparse_attn_gate=h.sparse_attn_gate_enabled, + sparse_attn_gate_init_std=h.sparse_attn_gate_init_std, + sparse_attn_gate_scale=h.sparse_attn_gate_scale, + ) + for i in range(h.num_layers) + ] + ) + if h.rope_dims > 0: + head_dim = h.model_dim // h.num_heads + for block in self.blocks: + block.attn.rope_dims = h.rope_dims + block.attn.rotary = Rotary( + head_dim, + base=h.rope_base, + train_seq_len=h.train_seq_len, + rope_dims=h.rope_dims, + yarn=h.rope_yarn, + ) + self.final_norm = RMSNorm() + self.lm_head = ( + None + if h.tie_embeddings + else CastedLinear(h.model_dim, h.vocab_size, bias=False) + ) + if self.lm_head is not None: + self.lm_head._zero_init = True + if h.xsa_last_n > 0: + for i in range(max(0, h.num_layers - h.xsa_last_n), h.num_layers): + self.blocks[i].attn.use_xsa = True + self.looping_active = False + if h.num_loops > 0: + loop_seg = list(range(h.loop_start, h.loop_end + 1)) + all_indices = list(range(h.loop_start)) + for _ in range(h.num_loops + 1): + all_indices.extend(loop_seg) + all_indices.extend(range(h.loop_end + 1, h.num_layers)) + num_enc = len(all_indices) // 2 + self.encoder_indices = all_indices[:num_enc] + self.decoder_indices = all_indices[num_enc:] + else: + self.encoder_indices = list(range(self.num_encoder_layers)) + self.decoder_indices = list(range(self.num_encoder_layers, h.num_layers)) + self.num_skip_weights = min( + len(self.encoder_indices), len(self.decoder_indices) + ) + self.skip_weights = nn.Parameter( + torch.ones(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + self.skip_gates = ( + nn.Parameter( + torch.zeros(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + if h.skip_gates_enabled + else None + ) + self.parallel_start_layer = h.parallel_start_layer + self.parallel_final_lane = h.parallel_final_lane.lower() + self.parallel_post_lambdas = nn.Parameter( + torch.ones(h.num_layers, 2, 2, dtype=torch.float32) + ) + self.parallel_resid_lambdas = nn.Parameter( + torch.full((h.num_layers, 2), 1.1, dtype=torch.float32) + ) + # SmearGate (PR #1667 / modded-nanogpt @classiclarryd): + # x_t <- x_t + lam * sigmoid(W * x_t[:gate_window]) * x_{t-1}. + # Per-token forward-1 smear of the embedding lane. W zero-init + lam=0 -> + # transparent at init. Uses CastedLinear so restore_fp32_params handles dtype. 
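+        # Cost check: with gate_window=12 (the constructor default) the gate reads only
+        # x_t[:12], so the whole mechanism is 13 parameters: a CastedLinear(12, 1)
+        # plus smear_lambda.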
+ self.smear_gate_enabled = h.smear_gate_enabled + if self.smear_gate_enabled: + self.smear_window = h.gate_window + self.smear_gate = CastedLinear(self.smear_window, 1, bias=False) + self.smear_gate._zero_init = True + self.smear_lambda = nn.Parameter(torch.zeros(1, dtype=torch.float32)) + self._init_weights() + + def _init_weights(self): + if self.tie_embeddings: + nn.init.normal_(self.tok_emb.weight, mean=0.0, std=self.tied_embed_init_std) + n = self.num_layers + proj_scale = 1.0 / math.sqrt(2 * n) + for i in range(n): + nn.init.orthogonal_(self.qo_bank.data[i], gain=1.0) + nn.init.zeros_(self.qo_bank.data[n + i]) + self.qo_bank.data[n + i].mul_(proj_scale) + nn.init.orthogonal_(self.kv_bank.data[i], gain=1.0) + nn.init.orthogonal_(self.kv_bank.data[n + i], gain=1.0) + for i in range(n): + nn.init.orthogonal_(self.mlp_up_bank.data[i], gain=1.0) + nn.init.zeros_(self.mlp_down_bank.data[i]) + self.mlp_down_bank.data[i].mul_(proj_scale) + for name, module in self.named_modules(): + if isinstance(module, nn.Linear): + if getattr(module, "_zero_init", False): + nn.init.zeros_(module.weight) + elif ( + module.weight.ndim == 2 + and module.weight.shape[0] >= 64 + and module.weight.shape[1] >= 64 + ): + nn.init.orthogonal_(module.weight, gain=1.0) + + def _bank_weights(self, i): + n = self.num_layers + return ( + self.qo_bank[i], + self.kv_bank[i], + self.kv_bank[n + i], + self.qo_bank[n + i], + self.mlp_up_bank[i], + self.mlp_down_bank[i], + ) + + def _parallel_block( + self, block_idx, lane0, lane1, x0, + q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=None, max_seqlen=0, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + attn_out = block.attn( + block.attn_norm(attn_read) * block.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * block.mlp( + block.mlp_norm(mlp_read) * block.ln_scale_factor, up_w, down_w + ) + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + def _final_parallel_hidden(self, lane0, lane1): + if self.parallel_final_lane == "mlp": + return lane1 + if self.parallel_final_lane == "attn": + return lane0 + return 0.5 * (lane0 + lane1) + + def _forward_hidden(self, input_ids, cu_seqlens=None, max_seqlen=0): + """Run the encoder/decoder stack to the final RMSNorm; returns pre-projection hidden. + Shared by eval (softcap+projection via forward_logits) and train (fused CE path).""" + x = self.tok_emb(input_ids) + # SmearGate (PR #1667). lam=0 + W=0 -> identity at init. + # Cross-doc leak fix: zero the prev-token smear at any position whose current token + # is BOS, so the BOS embedding starting doc N+1 in a packed stream is not + # contaminated by doc N's last token (audited issue on PR#1797 base). 
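+        # e.g. packed stream [... doc_N_last, BOS, doc_N+1_first ...]: at the BOS
+        # position not_bos is 0, so the smear term g * x_prev vanishes and doc N+1
+        # starts from the clean BOS embedding.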
+ if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else range(self.num_encoder_layers) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block( + i, lane0, lane1, x0, q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + return x + + def _project_logits(self, hidden): + if self.tie_embeddings: + return F.linear(hidden, self.tok_emb.weight) + return self.lm_head(hidden) + + def forward_logits(self, input_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + return self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + + def forward(self, input_ids, target_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + flat_targets = target_ids.reshape(-1) + # Fused softcapped-CE kernel (training path only). Applies softcap inside the + # Triton kernel; takes pre-softcap logits_proj. Non-fused path matches stock + # PR-1736 numerics exactly (softcap in fp32, then F.cross_entropy on fp32). 
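+        # Softcap: z_capped = c * tanh(z / c) with c = logit_softcap, so logits saturate
+        # smoothly into (-c, c); both branches below compute this same function, the
+        # fused one just folds the tanh into the CE kernel.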
+ if self.fused_ce_enabled: + return softcapped_cross_entropy( + logits_proj.reshape(-1, logits_proj.size(-1)), + flat_targets, + self.logit_softcap, + reduction="mean", + ) + logits = self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + return F.cross_entropy( + logits.reshape(-1, logits.size(-1)).float(), + flat_targets, + reduction="mean", + ) + + def forward_ttt(self, input_ids, target_ids, lora): + x = self.tok_emb(input_ids) + # SmearGate on the TTT path — same inline compute as forward_logits. + # Cross-doc leak fix: see _forward_hidden comment. + if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else list(range(self.num_encoder_layers)) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else list( + range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + ) + slot = 0 + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block_with_lora( + i, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + if self.tie_embeddings: + logits = F.linear(x, self.tok_emb.weight) + else: + logits = self.lm_head(x) + logits = logits + lora.lm_head_lora(x) + logits = self.logit_softcap * torch.tanh(logits / self.logit_softcap) + bsz, sl, V = logits.shape + return F.cross_entropy( + logits.float().reshape(-1, V), target_ids.reshape(-1), reduction="none" + ).reshape(bsz, sl) + + def _block_with_lora(self, block, x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w): + mix = block.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + n = block.attn_norm(x_in) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + # Keep raw Q for AttnOutGate src='q' (matches forward path semantics). 
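+        # Each projection is W x plus the batched low-rank delta (alpha/rank) * B(A x),
+        # so q below is the frozen bank matmul plus this doc's LoRA correction.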
+ q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT path) — inline + .contiguous() barrier, same as the eval path. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT path). Gate input is n (post-norm block input), same + # as eval path. .to(n.dtype) on fp32 param before bf16 broadcast. + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT path) — must match the eval path in + # forward() exactly, else training (which applied the gate) and TTT eval (which + # skipped it) produce mismatched representations and catastrophic BPB regression. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + x_out = x_in + block.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + mlp_n = block.mlp_norm(x_out) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + x_out = x_out + block.mlp_scale.to(dtype=x_out.dtype)[None, None, :] * mlp_out + return x_out + + def _parallel_block_with_lora( + self, block_idx, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + n = block.attn_norm(attn_read) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT 
parallel path) — inline + .contiguous() barrier. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT parallel path). Gate input is n (post-norm block input). + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT parallel path) — must match the + # eval path in forward() to keep train/eval semantics in sync. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_n = block.mlp_norm(mlp_read) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * mlp_out + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + +class BatchedLinearLoRA(nn.Module): + # PR-1767: rank-scaled output (alpha/rank), like standard LoRA. Decouples + # effective magnitude from rank so changing rank does not change LR scale. + _ALPHA = float(os.environ.get("TTT_LORA_ALPHA", "144")) + # PR-1767: optionally keep A warm across per-doc resets (only B is zeroed). + # Accumulates useful feature directions across documents within a TTT phase. 
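+    # With warm start, reset() leaves A alone and only zeroes B: the delta
+    # (alpha/rank) * B @ A collapses to 0 at each reset while A keeps its accumulated
+    # input directions; a cold start would also re-draw A from U(-bound, bound).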
+    _WARM_START_A = bool(int(os.environ.get("TTT_WARM_START_A", "1")))
+
+    def __init__(self, bsz, in_features, out_features, rank):
+        super().__init__()
+        self._bound = 1.0 / math.sqrt(in_features)
+        self._scale = self._ALPHA / rank
+        self.A = nn.Parameter(
+            torch.empty(bsz, rank, in_features).uniform_(-self._bound, self._bound)
+        )
+        self.B = nn.Parameter(torch.zeros(bsz, out_features, rank))
+
+    def reset(self):
+        with torch.no_grad():
+            if not self._WARM_START_A:
+                self.A.uniform_(-self._bound, self._bound)
+            self.B.zero_()
+
+    def forward(self, x):
+        return ((x @ self.A.transpose(1, 2)) @ self.B.transpose(1, 2)) * self._scale
+
+
+class BatchedTTTLoRA(nn.Module):
+    def __init__(self, bsz, model, rank, k_lora=True, mlp_lora=True, o_lora=True):
+        super().__init__()
+        self.bsz = bsz
+        dim = model.qo_bank.shape[-1]
+        vocab = model.tok_emb.num_embeddings
+        if getattr(model, "looping_active", False):
+            num_slots = len(model.encoder_indices) + len(model.decoder_indices)
+        else:
+            num_slots = len(model.blocks)
+        kv_dim = model.blocks[0].attn.num_kv_heads * (
+            dim // model.blocks[0].attn.num_heads
+        )
+        embed_dim = model.tok_emb.embedding_dim
+        self.lm_head_lora = BatchedLinearLoRA(bsz, embed_dim, vocab, rank)
+        self.q_loras = nn.ModuleList(
+            [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)]
+        )
+        self.v_loras = nn.ModuleList(
+            [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)]
+        )
+        self.k_loras = (
+            nn.ModuleList(
+                [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)]
+            )
+            if k_lora
+            else None
+        )
+        self.mlp_loras = (
+            nn.ModuleList(
+                [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)]
+            )
+            if mlp_lora
+            else None
+        )
+        self.o_loras = (
+            nn.ModuleList(
+                [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)]
+            )
+            if o_lora
+            else None
+        )
+
+    def reset(self):
+        with torch.no_grad():
+            self.lm_head_lora.reset()
+            for loras in [self.q_loras, self.v_loras, self.k_loras,
+                          self.mlp_loras, self.o_loras]:
+                if loras is not None:
+                    for lora in loras:
+                        lora.reset()
+
+
+# Polar Express per-iteration minimax Newton-Schulz coefficients (PR #1344).
+# Replaces the fixed (3.4445, -4.775, 2.0315) coefficients of stock Muon.
+# Applied at backend_steps=5: the slice guard below caps the iteration count at the
+# length of this list, so requesting more than 5 steps still runs exactly these 5 tuples.
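+# One Newton-Schulz step with tuple (a, b, c) maps X -> a*X + b*(X X^T)X + c*(X X^T)^2 X;
+# the tuples below are per-iteration minimax coefficients rather than one fixed triple
+# reused every step.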
+_PE_COEFFS = ( + (8.156554524902461, -22.48329292557795, 15.878769915207462), + (4.042929935166739, -2.808917465908714, 0.5000178451051316), + (3.8916678022926607, -2.772484153217685, 0.5060648178503393), + (3.285753657755655, -2.3681294933425376, 0.46449024233003106), + (2.3465413258596377, -1.7097828382687081, 0.42323551169305323), +) + + +@torch.compile +def zeropower_via_newtonschulz5(G, steps=10, eps=1e-07): + was_2d = G.ndim == 2 + if was_2d: + G = G.unsqueeze(0) + X = G.bfloat16() + transposed = X.size(-2) > X.size(-1) + if transposed: + X = X.mT + X = X / (X.norm(dim=(-2, -1), keepdim=True) + eps) + coeffs = _PE_COEFFS[:steps] if steps <= len(_PE_COEFFS) else _PE_COEFFS + for a, b, c in coeffs: + A = X @ X.mT + B = b * A + c * (A @ A) + X = a * X + B @ X + if transposed: + X = X.mT + if was_2d: + X = X.squeeze(0) + return X + + +class Muon(torch.optim.Optimizer): + def __init__( + self, + params, + lr, + momentum, + backend_steps, + nesterov=True, + weight_decay=0.0, + row_normalize=False, + ): + super().__init__( + params, + dict( + lr=lr, + momentum=momentum, + backend_steps=backend_steps, + nesterov=nesterov, + weight_decay=weight_decay, + row_normalize=row_normalize, + ), + ) + self._built = False + + def _build(self): + self._distributed = dist.is_available() and dist.is_initialized() + self._world_size = dist.get_world_size() if self._distributed else 1 + self._rank = dist.get_rank() if self._distributed else 0 + ws = self._world_size + self._bank_meta = [] + for group in self.param_groups: + for p in group["params"]: + B = p.shape[0] + padded_B = ((B + ws - 1) // ws) * ws + shard_B = padded_B // ws + tail = p.shape[1:] + dev = p.device + self._bank_meta.append({ + "p": p, + "B": B, + "padded_grad": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "shard": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "shard_mom": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "full_update": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "scale": max(1, p.shape[-2] / p.shape[-1]) ** 0.5, + }) + self._bank_meta.sort(key=lambda m: -m["p"].numel()) + self._built = True + + def launch_reduce_scatters(self): + if not self._built: + self._build() + if not self._distributed: + return + self._rs_futures = [] + for m in self._bank_meta: + p = m["p"] + if p.grad is None: + self._rs_futures.append(None) + continue + pg = m["padded_grad"] + pg[: m["B"]].copy_(p.grad) + fut = dist.reduce_scatter_tensor( + m["shard"], pg, op=dist.ReduceOp.AVG, async_op=True + ) + self._rs_futures.append(fut) + + @torch.no_grad() + def step(self, closure=None): + loss = None + if closure is not None: + with torch.enable_grad(): + loss = closure() + if not self._built: + self._build() + for group in self.param_groups: + lr = group["lr"] + momentum = group["momentum"] + backend_steps = group["backend_steps"] + nesterov = group["nesterov"] + wd = group.get("weight_decay", 0.0) + row_normalize = group.get("row_normalize", False) + prev_ag_handle = None + prev_m = None + sharded = self._distributed and hasattr(self, "_rs_futures") + for idx, m in enumerate(self._bank_meta): + p = m["p"] + if p.grad is None: + continue + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if sharded and self._rs_futures[idx] is not None: + self._rs_futures[idx].wait() + g = m["shard"] + buf = m["shard_mom"] + else: + g 
= p.grad.bfloat16() + state = self.state[p] + if "momentum_buffer" not in state: + state["momentum_buffer"] = torch.zeros_like(g) + buf = state["momentum_buffer"] + buf.mul_(momentum).add_(g) + if nesterov: + update = g.add(buf, alpha=momentum) + else: + update = buf + if row_normalize: + rn = update.float().norm(dim=-1, keepdim=True).clamp_min(1e-07) + update = update / rn.to(update.dtype) + update = zeropower_via_newtonschulz5(update, steps=backend_steps) + if sharded: + prev_ag_handle = dist.all_gather_into_tensor( + m["full_update"], update, async_op=True + ) + prev_m = m + else: + if wd > 0.0: + p.data.mul_(1.0 - lr * wd) + p.add_(update, alpha=-lr * m["scale"]) + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if hasattr(self, "_rs_futures"): + del self._rs_futures + return loss + + +CONTROL_TENSOR_NAME_PATTERNS = tuple( + pattern + for pattern in os.environ.get( + "CONTROL_TENSOR_NAME_PATTERNS", + "attn_scale,attn_scales,mlp_scale,mlp_scales,resid_mix,resid_mixes,q_gain,skip_weight,skip_weights,skip_gates,parallel_post_lambdas,parallel_resid_lambdas,attn_gate_proj,attn_gate_w,smear_gate,smear_lambda", + ).split(",") + if pattern +) + + +PACKED_REPLICATED_GRAD_MAX_NUMEL = 1 << 15 + + +class Optimizers: + def __init__(self, h, base_model): + matrix_params = [ + base_model.qo_bank, + base_model.kv_bank, + base_model.mlp_up_bank, + base_model.mlp_down_bank, + ] + block_named_params = list(base_model.blocks.named_parameters()) + scalar_params = [ + p + for (name, p) in block_named_params + if p.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ] + if base_model.skip_weights.numel() > 0: + scalar_params.append(base_model.skip_weights) + if base_model.skip_gates is not None and base_model.skip_gates.numel() > 0: + scalar_params.append(base_model.skip_gates) + if base_model.parallel_post_lambdas is not None: + scalar_params.append(base_model.parallel_post_lambdas) + if base_model.parallel_resid_lambdas is not None: + scalar_params.append(base_model.parallel_resid_lambdas) + # SmearGate params live on GPT root (not in .blocks), so add them by hand. + # Both are tiny (gate_window scalars + 1 lambda). Optimized via scalar Adam. 
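+        # (named_parameters() above walked only base_model.blocks, which is also why
+        # root-level tensors like tok_emb and skip_weights are appended explicitly.)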
+ if getattr(base_model, "smear_gate_enabled", False): + scalar_params.append(base_model.smear_gate.weight) + scalar_params.append(base_model.smear_lambda) + token_lr = h.tied_embed_lr if h.tie_embeddings else h.embed_lr + tok_params = [ + {"params": [base_model.tok_emb.weight], "lr": token_lr, "base_lr": token_lr} + ] + self.optimizer_tok = torch.optim.AdamW( + tok_params, + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.embed_wd, + fused=True, + ) + self.optimizer_muon = Muon( + matrix_params, + lr=h.matrix_lr, + momentum=h.muon_momentum, + backend_steps=h.muon_backend_steps, + weight_decay=h.muon_wd, + row_normalize=h.muon_row_normalize, + ) + for group in self.optimizer_muon.param_groups: + group["base_lr"] = h.matrix_lr + self.optimizer_scalar = torch.optim.AdamW( + [{"params": scalar_params, "lr": h.scalar_lr, "base_lr": h.scalar_lr}], + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.adam_wd, + fused=True, + ) + self.optimizers = [ + self.optimizer_tok, + self.optimizer_muon, + self.optimizer_scalar, + ] + self.replicated_params = list(tok_params[0]["params"]) + self.replicated_params.extend(scalar_params) + self.replicated_large_params = [] + self.replicated_packed_params = [] + for p in self.replicated_params: + if p.numel() <= PACKED_REPLICATED_GRAD_MAX_NUMEL: + self.replicated_packed_params.append(p) + else: + self.replicated_large_params.append(p) + self._aux_stream = torch.cuda.Stream() + + def __iter__(self): + return iter(self.optimizers) + + def zero_grad_all(self): + for opt in self.optimizers: + opt.zero_grad(set_to_none=True) + + def _all_reduce_packed_grads(self): + grads_by_key = collections.defaultdict(list) + for p in self.replicated_packed_params: + if p.grad is not None: + grads_by_key[(p.grad.device, p.grad.dtype)].append(p.grad) + for grads in grads_by_key.values(): + flat = torch.empty( + sum(g.numel() for g in grads), + device=grads[0].device, + dtype=grads[0].dtype, + ) + offset = 0 + for g in grads: + n = g.numel() + flat[offset : offset + n].copy_(g.contiguous().view(-1)) + offset += n + dist.all_reduce(flat, op=dist.ReduceOp.AVG) + offset = 0 + for g in grads: + n = g.numel() + g.copy_(flat[offset : offset + n].view_as(g)) + offset += n + + def step(self, distributed=False): + self.optimizer_muon.launch_reduce_scatters() + if distributed: + reduce_handles = [ + dist.all_reduce(p.grad, op=dist.ReduceOp.AVG, async_op=True) + for p in self.replicated_large_params + if p.grad is not None + ] + self._all_reduce_packed_grads() + for handle in reduce_handles: + handle.wait() + self._aux_stream.wait_stream(torch.cuda.current_stream()) + with torch.cuda.stream(self._aux_stream): + self.optimizer_tok.step() + self.optimizer_scalar.step() + self.optimizer_muon.step() + torch.cuda.current_stream().wait_stream(self._aux_stream) + self.zero_grad_all() + + +def restore_fp32_params(model): + for module in model.modules(): + if isinstance(module, CastedLinear): + module.float() + for name, param in model.named_parameters(): + if ( + param.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ) and param.dtype != torch.float32: + param.data = param.data.float() + if hasattr(model, "qo_bank") and model.qo_bank is not None: + model.qo_bank.data = model.qo_bank.data.float() + model.kv_bank.data = model.kv_bank.data.float() + model.mlp_up_bank.data = model.mlp_up_bank.data.float() + model.mlp_down_bank.data = model.mlp_down_bank.data.float() + + +def collect_hessians(model, train_loader, h, device, n_calibration_batches=64): 
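+    # GPTQ Hessian proxy: for each quantized weight, accumulate H = sum_x x x^T over
+    # calibration activations (averaged over batches at the end); the Cholesky of H
+    # later drives the column-by-column error compensation in gptq_quantize_weight.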
+ hessians = {} + hooks = [] + for i, block in enumerate(model.blocks): + block.attn._calib = True + block.mlp._calib = True + block.mlp.use_fused = False + + def make_attn_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + for suffix in ["c_q", "c_k", "c_v"]: + name = f"blocks.{layer_idx}.attn.{suffix}.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + y = module._last_proj_input + if y is not None: + y = y.float() + if y.ndim == 3: + y = y.reshape(-1, y.shape[-1]) + name = f"blocks.{layer_idx}.attn.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + y.shape[1], y.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(y.T, y) + return hook_fn + + def make_mlp_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + name = f"blocks.{layer_idx}.mlp.fc.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + h_act = module._last_down_input + if h_act is not None: + h_act = h_act.float() + if h_act.ndim == 3: + h_act = h_act.reshape(-1, h_act.shape[-1]) + name = f"blocks.{layer_idx}.mlp.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + h_act.shape[1], h_act.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(h_act.T, h_act) + return hook_fn + + for i, block in enumerate(model.blocks): + hooks.append(block.attn.register_forward_hook(make_attn_hook(i))) + hooks.append(block.mlp.register_forward_hook(make_mlp_hook(i))) + + # Hessian hooks for embedding factorization projection layers + def make_linear_input_hook(weight_name): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if weight_name not in hessians: + hessians[weight_name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[weight_name].addmm_(x.T, x) + return hook_fn + + if model.tie_embeddings: + hook_module = model.final_norm + + def make_output_hook(name): + def hook_fn(module, inp, out): + x = out.detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + return hook_fn + + hooks.append( + hook_module.register_forward_hook(make_output_hook("tok_emb.weight")) + ) + model.eval() + with torch.no_grad(): + for _ in range(n_calibration_batches): + x, _ = train_loader.next_batch(h.train_batch_tokens, h.grad_accum_steps) + model.forward_logits(x) + for hook in hooks: + hook.remove() + for i, block in enumerate(model.blocks): + block.attn._calib = False + block.mlp._calib = False + block.mlp.use_fused = True + for name in hessians: + hessians[name] = hessians[name].cpu() / n_calibration_batches + return hessians + + +def gptq_quantize_weight(w, H, clip_sigmas=3.0, clip_range=63, block_size=128): + W_orig = w.float().clone() + rows, cols = W_orig.shape + H = H.float().clone() + dead = torch.diag(H) == 0 + H[dead, dead] = 1 + damp = 0.01 * H.diag().mean() + H.diagonal().add_(damp) + perm = torch.argsort(H.diag(), descending=True) + invperm = torch.argsort(perm) + W_perm = W_orig[:, perm].clone() + W_perm[:, dead[perm]] 
= 0 + H = H[perm][:, perm] + Hinv = torch.cholesky_inverse(torch.linalg.cholesky(H)) + Hinv = torch.linalg.cholesky(Hinv, upper=True) + row_std = W_orig.std(dim=1) + s = (clip_sigmas * row_std / clip_range).clamp_min(1e-10).to(torch.float16) + sf = s.float() + Q = torch.zeros(rows, cols, dtype=torch.int8) + W_work = W_perm.clone() + for i1 in range(0, cols, block_size): + i2 = min(i1 + block_size, cols) + W_block = W_work[:, i1:i2].clone() + Hinv_block = Hinv[i1:i2, i1:i2] + Err = torch.zeros(rows, i2 - i1) + for j in range(i2 - i1): + w_col = W_block[:, j] + d = Hinv_block[j, j] + q_col = torch.clamp(torch.round(w_col / sf), -clip_range, clip_range) + Q[:, i1 + j] = q_col.to(torch.int8) + err = (w_col - q_col.float() * sf) / d + Err[:, j] = err + W_block[:, j:] -= err.unsqueeze(1) * Hinv_block[j, j:].unsqueeze(0) + if i2 < cols: + W_work[:, i2:] -= Err @ Hinv[i1:i2, i2:] + return Q[:, invperm], s + + +def _quantize_gate_int8_row(w): + # Symmetric int8-per-row quantization for small gate tensors. w shape + # (R, C) -> (R,) scales in fp16, int8 values in [-127, 127]. Single scale + # per row keeps accuracy high while halving storage vs fp16. + W = w.float().contiguous() + row_max = W.abs().amax(dim=1).clamp_min(1e-10) + s = (row_max / 127.0).to(torch.float16) + sf = s.float().view(-1, 1) + q = torch.clamp(torch.round(W / sf), -127, 127).to(torch.int8) + return q, s + + +def _lqer_pack(A, B, bits): + rng = 2 ** (bits - 1) - 1 + sA = (A.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + sB = (B.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float().view(-1, 1)), -rng, rng).to(torch.int8) + qB = torch.clamp(torch.round(B / sB.float().view(-1, 1)), -rng, rng).to(torch.int8) + return qA, sA, qB, sB + + +def _lqer_pack_asym(A, B, g=64): + # A: INT2 per-matrix scalar (signed [-2,1], scale = |A|max/1.5). + sA = (A.abs().amax().clamp_min(1e-10) / 1.5).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float()), -2, 1).to(torch.int8) + # B: INT4 groupwise g over flattened B (signed [-8,7], per-group scale). + Bf = B.reshape(-1, g) + Bmax = Bf.abs().amax(dim=-1, keepdim=True).clamp_min(1e-10) + sB = (Bmax / 7.5).to(torch.float16).reshape(-1) + qB = torch.clamp(torch.round(Bf / sB.float().reshape(-1, 1)), -8, 7).to( + torch.int8 + ).reshape(B.shape) + return qA, sA, qB, sB + + +def gptq_mixed_quantize(state_dict, hessians, h): + result = {} + meta = {} + quant_gate = bool(getattr(h, "gated_attn_quant_gate", False)) + lqer_on = bool(getattr(h, "lqer_enabled", False)) + lqer_cands = {} + for (name, tensor) in state_dict.items(): + t = tensor.detach().cpu().contiguous() + # Dedicated int8-per-row path for attn_gate_w (bypasses both GPTQ and + # fp16 passthrough). Applied BEFORE the numel<=65536 passthrough check + # so the gate tensor is routed here instead of to fp16. + if ( + quant_gate + and t.is_floating_point() + and t.ndim == 2 + and name.endswith(".attn_gate_w") + # Dense GatedAttn: (num_heads, dim) = (8, 512) = 4096. + # Sparse gate: (num_heads, gate_window) = (8, 12) = 96. + # Both need int8-per-row routing; the 1024 lower bound in stock + # PR-1736 presumed dense-only. Widen to catch both. 
+ and 32 <= t.numel() <= 8192 + ): + gq, gs = _quantize_gate_int8_row(t) + result[name + ".gq"] = gq + result[name + ".gs"] = gs + meta[name] = "gate_int8_row" + continue + if not t.is_floating_point() or t.numel() <= 65536: + result[name] = t.to(torch.float16) if t.is_floating_point() else t + meta[name] = "passthrough (float16)" + continue + if "tok_emb" in name: + cs = h.embed_clip_sigmas + elif ".mlp." in name: + cs = h.mlp_clip_sigmas + elif ".attn." in name: + cs = h.attn_clip_sigmas + else: + cs = h.matrix_clip_sigmas + bits = h.embed_bits if "tok_emb" in name else h.matrix_bits + clip_range = 2 ** (bits - 1) - 1 + ret = gptq_quantize_weight( + t, hessians[name], clip_sigmas=cs, clip_range=clip_range + ) + q, s = ret + result[name + ".q"] = q + result[name + ".scale"] = s + meta[name] = f"gptq (int{bits})" + if lqer_on: + W_q = q.float() * s.float().view(-1, 1) + E = t.float() - W_q + lqer_cands[name] = (E, float(E.norm())) + if lqer_on and lqer_cands: + top = sorted(lqer_cands.items(), key=lambda kv: -kv[1][1])[: h.lqer_top_k] + asym_on = bool(getattr(h, "lqer_asym_enabled", False)) + asym_g = int(getattr(h, "lqer_asym_group", 64)) + for (name, (E, _)) in top: + U, S, Vh = torch.linalg.svd(E, full_matrices=False) + r = min(h.lqer_rank, S.numel()) + A = (U[:, :r] * S[:r]).contiguous() + B = Vh[:r, :].contiguous() + if asym_on and B.numel() % asym_g == 0: + qA, sA, qB, sB = _lqer_pack_asym(A, B, asym_g) + result[name + ".lqA_a"] = qA + result[name + ".lqAs_a"] = sA + result[name + ".lqB_a"] = qB + result[name + ".lqBs_a"] = sB + meta[name] = meta[name] + "+lqer_asym" + else: + qA, sA, qB, sB = _lqer_pack(A, B, h.lqer_factor_bits) + result[name + ".lqA"] = qA + result[name + ".lqAs"] = sA + result[name + ".lqB"] = qB + result[name + ".lqBs"] = sB + meta[name] = meta[name] + "+lqer" + categories = collections.defaultdict(set) + for (name, cat) in meta.items(): + short = re.sub("\\.\\d+$", "", re.sub("blocks\\.\\d+", "blocks", name)) + categories[cat].add(short) + log("Quantized weights:") + for cat in sorted(categories): + log(f" {cat}: {', '.join(sorted(categories[cat]))}") + return result, meta + +def dequantize_mixed(result, meta, template_sd): + out = {} + for (name, orig) in template_sd.items(): + info = meta.get(name) + if info is None: + continue + orig_dtype = orig.dtype + if "passthrough" in info: + t = result[name] + if t.dtype == torch.float16 and orig_dtype in ( + torch.float32, + torch.bfloat16, + ): + t = t.to(orig_dtype) + out[name] = t + continue + if info == "gate_int8_row": + gq = result[name + ".gq"] + gs = result[name + ".gs"] + out[name] = (gq.float() * gs.float().view(-1, 1)).to(orig_dtype) + continue + q, s = result[name + ".q"], result[name + ".scale"] + if s.ndim > 0: + W = q.float() * s.float().view(q.shape[0], *[1] * (q.ndim - 1)) + else: + W = q.float() * float(s.item()) + if "lqer_asym" in info: + qA_t = result[name + ".lqA_a"] + sA_t = result[name + ".lqAs_a"] + qB_t = result[name + ".lqB_a"] + sB_t = result[name + ".lqBs_a"] + qA = qA_t.float() * float(sA_t) + g_sz = qB_t.numel() // sB_t.numel() + qB = (qB_t.reshape(-1, g_sz).float() * sB_t.float().view(-1, 1)).reshape( + qB_t.shape + ) + W = W + qA @ qB + elif "lqer" in info: + qA = result[name + ".lqA"].float() * result[name + ".lqAs"].float().view(-1, 1) + qB = result[name + ".lqB"].float() * result[name + ".lqBs"].float().view(-1, 1) + W = W + qA @ qB + out[name] = W.to(orig_dtype) + return out + + +_BSHF_MAGIC = b"BSHF" + + +# ── Per-group lrzip compression (ported from PR#1586 via PR#1667/1729) 
────────

+_GROUP_ORDER = [
+    "_tok_emb.weight.q",
+    "attn.c_k.weight.q", "attn.c_q.weight.q",
+    "attn.c_v.weight.q", "attn.proj.weight.q",
+    "mlp.fc.weight.q", "mlp.proj.weight.q",
+]
+_SIMSORT_KEYS = {"_tok_emb.weight.q", "attn.c_q.weight.q", "mlp.fc.weight.q"}
+_PACK_MAGIC = b"PGRP"
+
+
+def _similarity_sort_l1(matrix):
+    import numpy as _np
+    n = matrix.shape[0]
+    used = _np.zeros(n, dtype=bool)
+    order = [0]
+    used[0] = True
+    cur = matrix[0].astype(_np.float32)
+    for _ in range(n - 1):
+        dists = _np.sum(_np.abs(matrix[~used].astype(_np.float32) - cur), axis=1)
+        unused = _np.where(~used)[0]
+        best = unused[_np.argmin(dists)]
+        order.append(best)
+        used[best] = True
+        cur = matrix[best].astype(_np.float32)
+    return _np.array(order, dtype=_np.uint16)
+
+
+def _lrzip_compress(data, tmpdir, label):
+    inp = os.path.join(tmpdir, f"{label}.bin")
+    out = f"{inp}.lrz"
+    with open(inp, "wb") as f:
+        f.write(data)
+    subprocess.run(["lrzip", "-z", "-L", "9", "-o", out, inp], capture_output=True, check=True)
+    with open(out, "rb") as f:
+        result = f.read()
+    os.remove(inp); os.remove(out)
+    return result
+
+
+def _lrzip_decompress(data, tmpdir, label):
+    inp = os.path.join(tmpdir, f"{label}.lrz")
+    out = os.path.join(tmpdir, f"{label}.bin")
+    with open(inp, "wb") as f:
+        f.write(data)
+    subprocess.run(["lrzip", "-d", "-f", "-o", out, inp], capture_output=True, check=True)
+    with open(out, "rb") as f:
+        result = f.read()
+    os.remove(inp); os.remove(out)
+    return result
+
+
+def _pack_streams(streams):
+    import struct
+    n = len(streams)
+    # Header: magic + u32 stream count + one u32 length per stream, then raw payloads.
+    hdr = _PACK_MAGIC + struct.pack("<I", n)
+    hdr += b"".join(struct.pack("<I", len(s)) for s in streams)
+    return hdr + b"".join(streams)
+
+
+def _find_docs(tokens):
+    # Documents are BOS-delimited spans of the val tokens; keep (start, length) pairs
+    # with length >= 2 so every doc yields at least one prediction target.
+    bos_positions = (tokens == BOS_ID).nonzero(as_tuple=True)[0].tolist()
+    docs = []
+    for i, start in enumerate(bos_positions):
+        end = bos_positions[i + 1] if i + 1 < len(bos_positions) else tokens.numel()
+        if end - start >= 2:
+            docs.append((start, end - start))
+    return docs
+
+
+def _build_ttt_global_batches(doc_entries, h, ascending=False):
+    batch_size = h.ttt_batch_size
+    global_doc_entries = sorted(doc_entries, key=lambda x: x[1][1])
+    global_batches = [
+        global_doc_entries[i : i + batch_size]
+        for i in range(0, len(global_doc_entries), batch_size)
+    ]
+    indexed = list(enumerate(global_batches))
+    if not ascending:
+        indexed.sort(key=lambda ib: -max(dl for _, (_, dl) in ib[1]))
+    return indexed
+
+
+def _init_batch_counter(path):
+    with open(path, "wb") as f:
+        f.write((0).to_bytes(4, "little"))
+
+
+def _claim_next_batch(counter_path, queue_len):
+    try:
+        with open(counter_path, "r+b") as f:
+            fcntl.flock(f, fcntl.LOCK_EX)
+            idx = int.from_bytes(f.read(4), "little")
+            f.seek(0)
+            f.write((idx + 1).to_bytes(4, "little"))
+            f.flush()
+    except FileNotFoundError:
+        return queue_len
+    return idx
+
+
+def _compute_chunk_window(ci, pred_len, num_chunks, chunk_size, eval_seq_len):
+    chunk_end = pred_len if ci == num_chunks - 1 else (ci + 1) * chunk_size
+    win_start = max(0, chunk_end - eval_seq_len)
+    win_len = chunk_end - win_start
+    chunk_start = ci * chunk_size
+    chunk_offset = chunk_start - win_start
+    chunk_len = chunk_end - chunk_start
+    return win_start, win_len, chunk_offset, chunk_len
+
+
+def _accumulate_bpb(
+    ptl,
+    x,
+    y,
+    chunk_offsets,
+    chunk_lens,
+    pos_idx,
+    base_bytes_lut,
+    has_leading_space_lut,
+    is_boundary_token_lut,
+    loss_sum,
+    byte_sum,
+    token_count,
+    y_bytes=None,
+):
+    pos = pos_idx[: x.size(1)].unsqueeze(0)
+    mask = (
+        (chunk_lens.unsqueeze(1) > 0)
+        & (pos >= chunk_offsets.unsqueeze(1))
+        & (pos < (chunk_offsets + chunk_lens).unsqueeze(1))
+    )
+    mask_f64 = mask.to(torch.float64)
+    if y_bytes is not None:
+        tok_bytes = y_bytes.to(torch.float64)
+    else:
+        tok_bytes = base_bytes_lut[y].to(torch.float64)
+        tok_bytes += (has_leading_space_lut[y] & ~is_boundary_token_lut[x]).to(
+            torch.float64
+        )
+    loss_sum += 
(ptl.to(torch.float64) * mask_f64).sum() + byte_sum += (tok_bytes * mask_f64).sum() + token_count += chunk_lens.to(torch.float64).sum() + + +def _loss_bpb_from_sums(loss_sum, token_count, byte_sum): + val_loss = (loss_sum / token_count).item() + val_bpb = val_loss / math.log(2.0) * (token_count.item() / byte_sum.item()) + return val_loss, val_bpb + + +def _add_to_counter(path, delta): + try: + with open(path, "r+b") as f: + fcntl.flock(f, fcntl.LOCK_EX) + cur = int.from_bytes(f.read(8), "little", signed=True) + cur += int(delta) + f.seek(0) + f.write(int(cur).to_bytes(8, "little", signed=True)) + f.flush() + return cur + except FileNotFoundError: + return int(delta) + + +def _init_int64_counter(path): + with open(path, "wb") as f: + f.write((0).to_bytes(8, "little", signed=True)) + + +def _select_ttt_doc_entries(docs, h): + doc_entries = list(enumerate(docs)) + if h.val_doc_fraction < 1.0: + sample_n = max(1, int(round(len(docs) * h.val_doc_fraction))) + sampled_indices = sorted( + random.Random(h.seed).sample(range(len(docs)), sample_n) + ) + return [(i, docs[i]) for i in sampled_indices] + return doc_entries + + +def train_val_ttt_global_sgd_distributed(h, device, val_data, base_model, val_tokens, batch_seqs=None): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + seq_len = h.eval_seq_len + total_tokens = val_tokens.numel() - 1 + ttt_chunk = h.global_ttt_chunk_tokens + batch_seqs = h.global_ttt_batch_seqs if batch_seqs is None else batch_seqs + num_chunks = (total_tokens + ttt_chunk - 1) // ttt_chunk + ttt_params = [p for p in base_model.parameters()] + for p in ttt_params: + p.requires_grad_(True) + optimizer = torch.optim.SGD( + ttt_params, lr=h.global_ttt_lr, momentum=h.global_ttt_momentum + ) + t_start = time.perf_counter() + for ci in range(num_chunks): + chunk_start = ci * ttt_chunk + chunk_end = min((ci + 1) * ttt_chunk, total_tokens) + is_last_chunk = ci == num_chunks - 1 + if is_last_chunk or h.global_ttt_epochs <= 0: + continue + base_model.train() + chunk_seqs = (chunk_end - chunk_start) // seq_len + if chunk_seqs <= 0: + continue + warmup_chunks = max(0, min(h.global_ttt_warmup_chunks, num_chunks - 1)) + if warmup_chunks > 0 and ci < warmup_chunks: + warmup_denom = max(warmup_chunks - 1, 1) + warmup_t = ci / warmup_denom + lr_now = ( + h.global_ttt_warmup_start_lr + + (h.global_ttt_lr - h.global_ttt_warmup_start_lr) * warmup_t + ) + else: + decay_steps = max(num_chunks - 1 - warmup_chunks, 1) + decay_ci = max(ci - warmup_chunks, 0) + lr_now = h.global_ttt_lr * 0.5 * ( + 1.0 + math.cos(math.pi * decay_ci / decay_steps) + ) + for pg in optimizer.param_groups: + pg["lr"] = lr_now + my_seq_s = chunk_seqs * h.rank // h.world_size + my_seq_e = chunk_seqs * (h.rank + 1) // h.world_size + my_chunk_seqs = my_seq_e - my_seq_s + for _ in range(h.global_ttt_epochs): + for bs in range(0, my_chunk_seqs, batch_seqs): + be = min(bs + batch_seqs, my_chunk_seqs) + actual_bs = my_seq_s + bs + start_tok = chunk_start + actual_bs * seq_len + end_tok = chunk_start + (my_seq_s + be) * seq_len + 1 + if end_tok > val_tokens.numel(): + continue + local = val_tokens[start_tok:end_tok].to(device=device, dtype=torch.int64) + x_flat = local[:-1] + y_flat = local[1:] + optimizer.zero_grad(set_to_none=True) + with torch.enable_grad(): + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + if h.global_ttt_respect_doc_boundaries: + bos_pos = (x_flat == BOS_ID).nonzero(as_tuple=True)[0].tolist() + cu_seqlens, max_seqlen = _build_cu_seqlens( + bos_pos, x_flat.numel(), 
x_flat.device, h.eval_seq_len, 64 + ) + loss = base_model( + x_flat[None], + y_flat[None], + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + else: + x = x_flat.reshape(-1, seq_len) + y = y_flat.reshape(-1, seq_len) + loss = base_model(x, y) + loss.backward() + if dist.is_available() and dist.is_initialized(): + for p in ttt_params: + if p.grad is not None: + dist.all_reduce(p.grad, op=dist.ReduceOp.SUM) + p.grad.mul_(1.0 / h.world_size) + if h.global_ttt_grad_clip > 0: + torch.nn.utils.clip_grad_norm_(ttt_params, h.global_ttt_grad_clip) + optimizer.step() + base_model.eval() + if h.rank == 0: + elapsed = time.perf_counter() - t_start + log( + f"tttg: c{ci+1}/{num_chunks} lr:{lr_now:.6f} t:{elapsed:.1f}s" + ) + for p in base_model.parameters(): + p.requires_grad_(True) + base_model.eval() + + +def eval_val_ttt_phased(h, base_model, device, val_data, forward_ttt_train): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + for p in base_model.parameters(): + p.requires_grad_(False) + all_tokens = val_data.val_tokens + all_tokens_idx = all_tokens.to(torch.int32) + docs = _find_docs(all_tokens) + doc_entries = _select_ttt_doc_entries(docs, h) + prefix_doc_limit = max(0, min(len(doc_entries), int(h.phased_ttt_prefix_docs))) + num_phases = max(1, int(h.phased_ttt_num_phases)) + phase_boundaries = [] + for pi in range(num_phases): + boundary = prefix_doc_limit * (pi + 1) // num_phases + phase_boundaries.append(boundary) + current_phase = 0 + current_phase_boundary = phase_boundaries[0] + log( + "ttt_phased:" + f" total_docs:{len(doc_entries)} prefix_docs:{prefix_doc_limit} " + f"suffix_docs:{len(doc_entries) - prefix_doc_limit}" + f" num_phases:{num_phases} boundaries:{phase_boundaries}" + ) + chunk_size, eval_seq_len = h.ttt_chunk_size, h.ttt_eval_seq_len + eval_batch_set = None + if h.ttt_eval_batches: + eval_batch_set = set(int(x) for x in h.ttt_eval_batches.split(",") if x.strip()) + use_ascending = eval_batch_set is not None + global_batches_sorted = _build_ttt_global_batches( + doc_entries, h, ascending=use_ascending + ) + queue_len = len(global_batches_sorted) + counter_path = f"/tmp/ttt_counter_{h.run_id}" + prefix_counter_path = f"/tmp/ttt_prefix_counter_{h.run_id}" + pause_flag_path = f"/tmp/ttt_pause_flag_{h.run_id}" + if h.rank == 0: + _init_batch_counter(counter_path) + _init_int64_counter(prefix_counter_path) + try: + os.remove(pause_flag_path) + except FileNotFoundError: + pass + if dist.is_available() and dist.is_initialized(): + path_list = [counter_path, prefix_counter_path, pause_flag_path] + dist.broadcast_object_list(path_list, src=0) + counter_path, prefix_counter_path, pause_flag_path = path_list + dist.barrier() + loss_sum = torch.zeros((), device=device, dtype=torch.float64) + byte_sum = torch.zeros((), device=device, dtype=torch.float64) + token_count = torch.zeros((), device=device, dtype=torch.float64) + t_start = time.perf_counter() + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + + def _build_opt(lora): + if h.ttt_optimizer == "sgd": + return torch.optim.SGD( + lora.parameters(), lr=h.ttt_lora_lr, + momentum=h.ttt_beta1, weight_decay=h.ttt_weight_decay, + ) + return torch.optim.AdamW( + lora.parameters(), lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, weight_decay=h.ttt_weight_decay, fused=True, + ) + + reusable_opt = _build_opt(reusable_lora) + local_scored_docs = [] + global_ttt_done = prefix_doc_limit 
== 0 + try: + while True: + queue_idx = _claim_next_batch(counter_path, queue_len) + if queue_idx >= queue_len: + break + orig_batch_idx, batch_entries = global_batches_sorted[queue_idx] + batch = [doc for _, doc in batch_entries] + bsz = len(batch) + prev_loss = loss_sum.item() + prev_bytes = byte_sum.item() + prev_tokens = token_count.item() + if bsz == reusable_lora.bsz: + reusable_lora.reset() + for s in reusable_opt.state.values(): + for k, v in s.items(): + if isinstance(v, torch.Tensor): + v.zero_() + elif k == "step": + s[k] = 0 + cur_lora = reusable_lora + cur_opt = reusable_opt + else: + cur_lora = BatchedTTTLoRA( + bsz, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + cur_opt = _build_opt(cur_lora) + pred_lens = [doc_len - 1 for _, doc_len in batch] + num_chunks = [(pl + chunk_size - 1) // chunk_size for pl in pred_lens] + max_nc = max(num_chunks) + num_chunks_t = torch.tensor(num_chunks, dtype=torch.int64, device=device) + for ci in range(max_nc): + active = [ci < nc for nc in num_chunks] + needs_train = any(ci < nc - 1 for nc in num_chunks) + tok_starts = torch.zeros(bsz, dtype=torch.int64) + tok_wls = torch.zeros(bsz, dtype=torch.int64) + chunk_offsets_cpu = torch.zeros(bsz, dtype=torch.int64) + chunk_lens_cpu = torch.zeros(bsz, dtype=torch.int64) + for b in range(bsz): + if not active[b]: + continue + doc_start, doc_len = batch[b] + win_start, win_len, chunk_offset, chunk_len = _compute_chunk_window( + ci, pred_lens[b], num_chunks[b], chunk_size, eval_seq_len + ) + tok_starts[b] = doc_start + win_start + tok_wls[b] = win_len + chunk_offsets_cpu[b] = chunk_offset + chunk_lens_cpu[b] = chunk_len + _, context_size, chunk_offset, _ = _compute_chunk_window( + ci, (ci + 1) * chunk_size, ci + 1, chunk_size, eval_seq_len + ) + col_idx = torch.arange(context_size + 1) + idx = tok_starts.unsqueeze(1) + col_idx.unsqueeze(0) + idx.clamp_(max=all_tokens.numel() - 1) + gathered_gpu = all_tokens_idx[idx].to( + device=device, dtype=torch.int64, non_blocking=True + ) + valid = (col_idx[:context_size].unsqueeze(0) < tok_wls.unsqueeze(1)).to( + device, non_blocking=True + ) + chunk_offsets = chunk_offsets_cpu.to(device, non_blocking=True) + chunk_lens = chunk_lens_cpu.to(device, non_blocking=True) + x = torch.where(valid, gathered_gpu[:, :context_size], 0) + y = torch.where(valid, gathered_gpu[:, 1 : context_size + 1], 0) + ctx_pos = torch.arange(context_size, device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + # CaseOps sidecar-driven byte budget. Mirror the index pattern + # used to build y from all_tokens: y[b, j] corresponds to the + # token at global position tok_starts[b] + 1 + j (when valid). + y_bytes_arg = None + if val_data.caseops_enabled and val_data.val_bytes is not None: + y_idx = ( + tok_starts.unsqueeze(1) + + 1 + + col_idx[:context_size].unsqueeze(0) + ) + y_idx = y_idx.clamp_(max=val_data.val_bytes.numel() - 1) + y_bytes_arg = val_data.val_bytes[y_idx].to( + device=device, dtype=torch.int32, non_blocking=True + ) + # Mirror the `valid` masking used for y so out-of-range tokens + # contribute zero bytes (matches y=0 substitution above). 
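+                # (With the sidecar active, per-token byte costs come from val_bytes
+                # instead of the base_bytes_lut + leading-space adjustment that
+                # _accumulate_bpb falls back to otherwise.)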
+ y_bytes_arg = torch.where( + valid, y_bytes_arg, torch.zeros_like(y_bytes_arg) + ) + with torch.no_grad(): + _accumulate_bpb( + per_tok_loss, + x, + y, + chunk_offsets, + chunk_lens, + ctx_pos, + val_data.base_bytes_lut, + val_data.has_leading_space_lut, + val_data.is_boundary_token_lut, + loss_sum, + byte_sum, + token_count, + y_bytes=y_bytes_arg, + ) + if needs_train: + activate_chunk_mask = (num_chunks_t - 1 > ci).float() + for gi in range(h.ttt_grad_steps): + if gi > 0: + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + per_doc = per_tok_loss[ + :, chunk_offset : chunk_offset + chunk_size + ].mean(dim=-1) + cur_opt.zero_grad(set_to_none=True) + (per_doc * activate_chunk_mask).sum().backward() + cur_opt.step() + else: + del per_tok_loss + batch_num = orig_batch_idx + 1 + doc_lens = [dl for _, dl in batch] + should_report = batch_num in eval_batch_set if eval_batch_set is not None else True + if should_report: + cur_tokens = token_count.item() + cur_loss_val = loss_sum.item() + cur_bytes_val = byte_sum.item() + dt = cur_tokens - prev_tokens + db = cur_bytes_val - prev_bytes + if dt > 0 and db > 0: + b_loss = (cur_loss_val - prev_loss) / dt + b_bpb = b_loss / math.log(2.0) * (dt / db) + else: + b_loss = b_bpb = 0.0 + r_loss = cur_loss_val / max(cur_tokens, 1) + r_bpb = r_loss / math.log(2.0) * (cur_tokens / max(cur_bytes_val, 1)) + elapsed = time.perf_counter() - t_start + log( + f"ttp: b{batch_num}/{queue_len} bl:{b_loss:.4f} bb:{b_bpb:.4f} " + f"rl:{r_loss:.4f} rb:{r_bpb:.4f} dl:{min(doc_lens)}-{max(doc_lens)} " + f"gd:{int(global_ttt_done)}" + ) + if not global_ttt_done: + local_scored_docs.extend( + (orig_batch_idx, pos, doc_start, doc_len) + for pos, (doc_start, doc_len) in enumerate(batch) + ) + prefix_done = _add_to_counter(prefix_counter_path, len(batch_entries)) + if prefix_done >= current_phase_boundary: + try: + with open(pause_flag_path, "x"): + pass + except FileExistsError: + pass + should_pause = os.path.exists(pause_flag_path) + if should_pause: + if dist.is_available() and dist.is_initialized(): + dist.barrier() + gathered_scored_docs = [None] * h.world_size + if dist.is_available() and dist.is_initialized(): + dist.all_gather_object(gathered_scored_docs, local_scored_docs) + else: + gathered_scored_docs = [local_scored_docs] + scored_docs_for_global = [] + for rank_docs in gathered_scored_docs: + if rank_docs: + scored_docs_for_global.extend(rank_docs) + scored_docs_for_global.sort(key=lambda x: (x[0], x[1])) + scored_docs_for_global = scored_docs_for_global[:current_phase_boundary] + scored_token_chunks = [ + val_data.val_tokens[doc_start : doc_start + doc_len] + for _, _, doc_start, doc_len in scored_docs_for_global + ] + if scored_token_chunks: + global_ttt_tokens = torch.cat(scored_token_chunks) + else: + global_ttt_tokens = val_data.val_tokens[:0] + if h.rank == 0: + prefix_done = 0 + try: + with open(prefix_counter_path, "rb") as f: + prefix_done = int.from_bytes( + f.read(8), "little", signed=True + ) + except FileNotFoundError: + pass + log( + f"ttpp: phase:{current_phase + 1}/{num_phases} pd:{prefix_done} " + f"gd:{len(scored_docs_for_global)} " + f"t:{time.perf_counter() - t_start:.1f}s" + ) + train_val_ttt_global_sgd_distributed( + h, device, val_data, base_model, global_ttt_tokens + ) + for p in base_model.parameters(): + p.requires_grad_(False) + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, 
o_lora=h.ttt_o_lora,
+                ).to(device)
+                reusable_opt = _build_opt(reusable_lora)
+                current_phase += 1
+                if current_phase >= num_phases:
+                    global_ttt_done = True
+                else:
+                    current_phase_boundary = phase_boundaries[current_phase]
+                if h.rank == 0:
+                    try:
+                        os.remove(pause_flag_path)
+                    except FileNotFoundError:
+                        pass
+                if dist.is_available() and dist.is_initialized():
+                    dist.barrier()
+                if h.rank == 0:
+                    log(f"ttpr: phase:{current_phase}/{num_phases} t:{time.perf_counter() - t_start:.1f}s")
+            del cur_lora, cur_opt
+    finally:
+        pass
+    if dist.is_available() and dist.is_initialized():
+        dist.all_reduce(loss_sum, op=dist.ReduceOp.SUM)
+        dist.all_reduce(byte_sum, op=dist.ReduceOp.SUM)
+        dist.all_reduce(token_count, op=dist.ReduceOp.SUM)
+    for p in base_model.parameters():
+        p.requires_grad_(True)
+    base_model.train()
+    return _loss_bpb_from_sums(loss_sum, token_count, byte_sum)
+
+
+def timed_eval(label, fn, *args, **kwargs):
+    torch.cuda.synchronize()
+    t0 = time.perf_counter()
+    val_loss, val_bpb = fn(*args, **kwargs)
+    torch.cuda.synchronize()
+    elapsed_ms = 1e3 * (time.perf_counter() - t0)
+    log(
+        f"{label} val_loss:{val_loss:.8f} val_bpb:{val_bpb:.8f} eval_time:{elapsed_ms:.0f}ms"
+    )
+    return val_loss, val_bpb
+
+
+def train_model(h, device, val_data):
+    base_model = GPT(h).to(device).bfloat16()
+    restore_fp32_params(base_model)
+    compiled_model = torch.compile(base_model, dynamic=False, fullgraph=True)
+    compiled_forward_logits = torch.compile(
+        base_model.forward_logits, dynamic=False, fullgraph=True
+    )
+    model = compiled_model
+    log(f"model_params:{sum(p.numel() for p in base_model.parameters())}")
+    optimizers = Optimizers(h, base_model)
+    train_loader = DocumentPackingLoader(h, device)
+    max_wallclock_ms = (
+        1e3 * h.max_wallclock_seconds if h.max_wallclock_seconds > 0 else None
+    )
+    if max_wallclock_ms is not None:
+        max_wallclock_ms -= h.gptq_reserve_seconds * 1e3
+        log(
+            f"gptq:reserving {h.gptq_reserve_seconds:.0f}s, effective={max_wallclock_ms:.0f}ms"
+        )
+
+    def training_frac(step, elapsed_ms):
+        if max_wallclock_ms is None:
+            return step / max(h.iterations, 1)
+        return elapsed_ms / max(max_wallclock_ms, 1e-09)
+
+    def lr_mul(frac):
+        if h.warmdown_frac <= 0:
+            return 1.0
+        if frac >= 1.0 - h.warmdown_frac:
+            return max((1.0 - frac) / h.warmdown_frac, h.min_lr)
+        return 1.0
+
+    _clip_params = [p for p in base_model.parameters() if p.requires_grad]
+
+    def step_fn(step, lr_scale):
+        train_loss = torch.zeros((), device=device)
+        for micro_step in range(h.grad_accum_steps):
+            x, y, cu_seqlens, _max_seqlen = train_loader.next_batch(
+                h.train_batch_tokens, h.grad_accum_steps
+            )
+            with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True):
+                loss = model(x, y, cu_seqlens=cu_seqlens, max_seqlen=h.train_seq_len)
+            train_loss += loss.detach()
+            (loss / h.grad_accum_steps).backward()
+        train_loss /= h.grad_accum_steps
+        if step <= h.muon_momentum_warmup_steps:
+            frac = (
+                min(step / h.muon_momentum_warmup_steps, 1.0)
+                if h.muon_momentum_warmup_steps > 0
+                else 1.0
+            )
+            muon_momentum = (
+                1 - frac
+            ) * h.muon_momentum_warmup_start + frac * h.muon_momentum
+            for group in optimizers.optimizer_muon.param_groups:
+                group["momentum"] = muon_momentum
+        for opt in optimizers:
+            for group in opt.param_groups:
+                group["lr"] = group["base_lr"] * lr_scale
+        if h.grad_clip_norm > 0:
+            torch.nn.utils.clip_grad_norm_(_clip_params, h.grad_clip_norm)
+        optimizers.step(distributed=h.distributed)
+        return train_loss
+
+    if h.warmup_steps > 0:
+        
initial_model_state = { + name: tensor.detach().cpu().clone() + for (name, tensor) in base_model.state_dict().items() + } + initial_optimizer_states = [ + copy.deepcopy(opt.state_dict()) for opt in optimizers + ] + model.train() + num_tokens_local = h.train_batch_tokens // h.world_size + for blk in base_model.blocks: + blk.attn.rotary(num_tokens_local, device, torch.bfloat16) + cu_bucket_size = train_loader.cu_bucket_size + warmup_cu_buckets = tuple(cu_bucket_size * i for i in range(1, 5)) + warmup_cu_iters = 3 + x, y, cu_seqlens, _ = train_loader.next_batch( + h.train_batch_tokens, h.grad_accum_steps + ) + log(f"warmup_cu_buckets:{','.join(str(b) for b in warmup_cu_buckets)} iters_each:{warmup_cu_iters}") + def _run_cu_bucket_warmup(): + for bucket_len in warmup_cu_buckets: + boundaries = list(range(0, x.size(1), max(h.train_seq_len, 1))) + if boundaries[-1] != x.size(1): + boundaries.append(x.size(1)) + cu = torch.full((bucket_len,), x.size(1), dtype=torch.int32, device=device) + cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device) + for _ in range(warmup_cu_iters): + optimizers.zero_grad_all() + with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True): + wloss = model(x, y, cu_seqlens=cu, max_seqlen=h.train_seq_len) + (wloss / h.grad_accum_steps).backward() + optimizers.zero_grad_all() + _run_cu_bucket_warmup() + if h.num_loops > 0: + base_model.looping_active = True + _run_cu_bucket_warmup() + base_model.looping_active = False + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"warmup_step: {warmup_step+1}/{h.warmup_steps}") + if h.num_loops > 0: + base_model.looping_active = True + log( + f"loop_warmup:enabled encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"loop_warmup_step: {warmup_step+1}/{h.warmup_steps}") + base_model.looping_active = False + base_model.load_state_dict(initial_model_state, strict=True) + for (opt, state) in zip(optimizers, initial_optimizer_states, strict=True): + opt.load_state_dict(state) + optimizers.zero_grad_all() + train_loader = DocumentPackingLoader(h, device) + _live_state = base_model.state_dict(keep_vars=True) + ema_state = { + name: t.detach().float().clone() + for (name, t) in _live_state.items() + } + _ema_pairs = [(ema_state[name], t) for (name, t) in _live_state.items()] + ema_decay = h.ema_decay + training_time_ms = 0.0 + stop_after_step = None + torch.cuda.synchronize() + t0 = time.perf_counter() + step = 0 + while True: + last_step = ( + step == h.iterations + or stop_after_step is not None + and step >= stop_after_step + ) + should_validate = ( + last_step or h.val_loss_every > 0 and step % h.val_loss_every == 0 + ) + if should_validate: + torch.cuda.synchronize() + training_time_ms += 1e3 * (time.perf_counter() - t0) + val_loss, val_bpb = eval_val( + h, device, val_data, model, compiled_forward_logits + ) + log( + f"{step}/{h.iterations} val_loss: {val_loss:.4f} val_bpb: {val_bpb:.4f}" + ) + torch.cuda.synchronize() + t0 = time.perf_counter() + if last_step: + if stop_after_step is not None and step < h.iterations: + log( + f"stopping_early: wallclock_cap train_time: {training_time_ms:.0f}ms step: {step}/{h.iterations}" + ) + break + elapsed_ms = 
training_time_ms + 1e3 * (time.perf_counter() - t0) + frac = training_frac(step, elapsed_ms) + scale = lr_mul(frac) + if ( + h.num_loops > 0 + and not base_model.looping_active + and frac >= h.enable_looping_at + ): + base_model.looping_active = True + log( + f"layer_loop:enabled step:{step} frac:{frac:.3f} encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + train_loss = step_fn(step, scale) + with torch.no_grad(): + for ema_t, t in _ema_pairs: + ema_t.mul_(ema_decay).add_(t.detach(), alpha=1.0 - ema_decay) + step += 1 + approx_training_time_ms = training_time_ms + 1e3 * (time.perf_counter() - t0) + should_log_train = h.train_log_every > 0 and ( + step <= 5 or step % h.train_log_every == 0 or stop_after_step is not None + ) + if should_log_train: + tok_per_sec = step * h.train_batch_tokens / (approx_training_time_ms / 1e3) + log( + f"{step}/{h.iterations} train_loss: {train_loss.item():.4f} train_time: {approx_training_time_ms/60000:.1f}m tok/s: {tok_per_sec:.0f}" + ) + reached_cap = ( + max_wallclock_ms is not None and approx_training_time_ms >= max_wallclock_ms + ) + if h.distributed and max_wallclock_ms is not None: + reached_cap_tensor = torch.tensor(int(reached_cap), device=device) + dist.all_reduce(reached_cap_tensor, op=dist.ReduceOp.MAX) + reached_cap = bool(reached_cap_tensor.item()) + if stop_after_step is None and reached_cap: + stop_after_step = step + log( + f"peak memory allocated: {torch.cuda.max_memory_allocated()//1024//1024} MiB reserved: {torch.cuda.max_memory_reserved()//1024//1024} MiB" + ) + log("ema:applying EMA weights") + current_state = base_model.state_dict() + avg_state = { + name: t.to(dtype=current_state[name].dtype) for (name, t) in ema_state.items() + } + base_model.load_state_dict(avg_state, strict=True) + return base_model, compiled_model, compiled_forward_logits + + +def train_and_eval(h, device): + random.seed(h.seed) + np.random.seed(h.seed) + torch.manual_seed(h.seed) + torch.cuda.manual_seed_all(h.seed) + if h.artifact_dir and h.is_main_process: + os.makedirs(h.artifact_dir, exist_ok=True) + val_data = ValidationData(h, device) + log( + f"train_shards: {len(list(Path(h.datasets_dir).resolve().glob('fineweb_train_*.bin')))}" + ) + log(f"val_tokens: {val_data.val_tokens.numel()-1}") + # TTT_EVAL_ONLY: skip training + GPTQ, jump straight to TTT eval on a + # pre-existing quantized artifact. Used to test TTT-only improvements + # (e.g., PR-1767's alpha/warm-start/WD) without retraining. 
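+    # Typical use (paths per the defaults above): run once normally so
+    # serialize() writes logs/final_model.int6.ptz, then relaunch with
+    # TTT_EVAL_ONLY=1 and the same ARTIFACT_DIR so deserialize() below finds
+    # the saved artifact without retraining.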
+ ttt_eval_only = os.environ.get("TTT_EVAL_ONLY", "0") == "1" + if ttt_eval_only: + log("TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval") + log(f"ttt_lora_alpha: {BatchedLinearLoRA._ALPHA}") + log(f"ttt_warm_start_a: {BatchedLinearLoRA._WARM_START_A}") + log(f"ttt_weight_decay: {h.ttt_weight_decay}") + else: + base_model, compiled_model, compiled_forward_logits = train_model( + h, device, val_data + ) + torch._dynamo.reset() + timed_eval( + "diagnostic pre-quantization post-ema", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + if os.environ.get("PREQUANT_ONLY", "0") == "1": + log("PREQUANT_ONLY=1 — skipping serialize/GPTQ/post-quant eval/TTT") + return + serialize(h, base_model, Path(__file__).read_text(encoding="utf-8")) + if h.distributed: + dist.barrier() + eval_model = deserialize(h, device) + if h.num_loops > 0: + eval_model.looping_active = True + if not ttt_eval_only: + compiled_model = torch.compile(eval_model, dynamic=False, fullgraph=True) + compiled_forward_logits = torch.compile( + eval_model.forward_logits, dynamic=False, fullgraph=True + ) + timed_eval( + "diagnostic quantized", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + del eval_model + if h.ttt_enabled: + if not ttt_eval_only: + del compiled_model + if ttt_eval_only: + del eval_model + torch._dynamo.reset() + torch.cuda.empty_cache() + ttt_model = deserialize(h, device) + if h.num_loops > 0: + ttt_model.looping_active = True + for p in ttt_model.parameters(): + p.requires_grad_(False) + + if h.rope_yarn: + _yarn_seqlen = h.train_batch_tokens // h.grad_accum_steps + for block in ttt_model.blocks: + block.attn.rotary(_yarn_seqlen, device, torch.bfloat16) + else: + for block in ttt_model.blocks: + block.attn.rotary._cos_cached = None + block.attn.rotary._sin_cached = None + block.attn.rotary._seq_len_cached = 0 + block.attn.rotary(h.ttt_eval_seq_len, device, torch.bfloat16) + + def _fwd_ttt_inner(input_ids, target_ids, lora): + return ttt_model.forward_ttt(input_ids, target_ids, lora=lora) + + _fwd_ttt_compiled_inner = None + + def _fwd_ttt(input_ids, target_ids, lora): + nonlocal _fwd_ttt_compiled_inner + if _fwd_ttt_compiled_inner is None: + _fwd_ttt_compiled_inner = torch.compile(_fwd_ttt_inner, dynamic=True) + return _fwd_ttt_compiled_inner(input_ids, target_ids, lora=lora) + + fwd_ttt_compiled = _fwd_ttt + log(f"ttt_lora:warming up compile (random tokens, no val data)") + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + t_warmup = time.perf_counter() + warmup_bszes = [h.ttt_batch_size] + for bsz in warmup_bszes: + wl = BatchedTTTLoRA( + bsz, ttt_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + wo = torch.optim.AdamW( + wl.parameters(), + lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, + weight_decay=h.ttt_weight_decay, + fused=True, + ) + for ctx_len in (h.ttt_chunk_size, h.ttt_eval_seq_len): + xw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + yw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + ptl = fwd_ttt_compiled(xw, yw, lora=wl) + ptl[:, : min(h.ttt_chunk_size, ctx_len)].mean(dim=-1).sum().backward() + wo.step() + wo.zero_grad(set_to_none=True) + del wl, wo + torch.cuda.empty_cache() + compile_elapsed = time.perf_counter() - t_warmup + log(f"ttt_lora:compile warmup done 
({compile_elapsed:.1f}s)") + log("\nbeginning TTT eval timer") + torch.cuda.synchronize() + t_ttt = time.perf_counter() + ttt_val_loss, ttt_val_bpb = eval_val_ttt_phased( + h, ttt_model, device, val_data, forward_ttt_train=fwd_ttt_compiled + ) + torch.cuda.synchronize() + ttt_eval_elapsed = time.perf_counter() - t_ttt + log( + "quantized_ttt_phased " + f"val_loss:{ttt_val_loss:.8f} val_bpb:{ttt_val_bpb:.8f} " + f"eval_time:{1e3*ttt_eval_elapsed:.0f}ms" + ) + log(f"total_eval_time:{ttt_eval_elapsed:.1f}s") + del ttt_model + + +def main(): + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + if not torch.cuda.is_available(): + raise RuntimeError("CUDA is required") + if world_size <= 0: + raise ValueError(f"WORLD_SIZE must be positive, got {world_size}") + if 8 % world_size != 0: + raise ValueError( + f"WORLD_SIZE={world_size} must divide 8 so grad_accum_steps stays integral" + ) + device = torch.device("cuda", local_rank) + torch.cuda.set_device(device) + if distributed: + dist.init_process_group(backend="nccl", device_id=device) + dist.barrier() + torch.backends.cuda.matmul.allow_tf32 = True + torch.backends.cudnn.allow_tf32 = True + torch.set_float32_matmul_precision("high") + from torch.backends.cuda import ( + enable_cudnn_sdp, + enable_flash_sdp, + enable_math_sdp, + enable_mem_efficient_sdp, + ) + + enable_cudnn_sdp(False) + enable_flash_sdp(True) + enable_mem_efficient_sdp(False) + enable_math_sdp(False) + torch._dynamo.config.optimize_ddp = False + torch._dynamo.config.cache_size_limit = 64 + h = Hyperparameters() + set_logging_hparams(h) + if h.is_main_process: + os.makedirs(h.artifact_dir if h.artifact_dir else "logs", exist_ok=True) + log(100 * "=", console=False) + log("Hyperparameters:", console=True) + for (k, v) in sorted(vars(type(h)).items()): + if not k.startswith("_"): + log(f" {k}: {v}", console=True) + log("=" * 100, console=False) + log("Source code:", console=False) + log("=" * 100, console=False) + with open(__file__, "r", encoding="utf-8") as _src: + log(_src.read(), console=False) + log("=" * 100, console=False) + log(f"Running Python {sys.version}", console=False) + log(f"Running PyTorch {torch.__version__}", console=False) + log("=" * 100, console=False) + train_and_eval(h, device) + if distributed: + dist.destroy_process_group() + + +if __name__ == "__main__": + main() diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed0.log b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed0.log new file mode 100644 index 0000000000..751a4181d7 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed0.log @@ -0,0 +1,4798 @@ +==================================================================================================== +Hyperparameters: + adam_eps: 1e-08 + adam_wd: 0.02 + artifact_dir: logs + attn_clip_sigmas: 13.0 + attn_out_gate_enabled: False + attn_out_gate_src: proj + beta1: 0.9 + beta2: 0.99 + build_seconds: 600 + caseops_enabled: True + compressor: pergroup + data_dir: /workspace/SOTA_FINAL/data + datasets_dir: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved + distributed: True + ema_decay: 0.9965 + embed_bits: 7 + embed_clip_sigmas: 14.0 + embed_lr: 0.6 + embed_wd: 0.085 + enable_looping_at: 0.45 + eval_seconds: 600 + eval_seq_len: 2048 + eval_stride: 64 + fused_ce_enabled: True + 
gate_window: 12 + gated_attn_enabled: False + gated_attn_init_std: 0.01 + gated_attn_quant_gate: True + global_ttt_batch_seqs: 32 + global_ttt_chunk_tokens: 32768 + global_ttt_epochs: 1 + global_ttt_grad_clip: 1.0 + global_ttt_lr: 0.001 + global_ttt_momentum: 0.9 + global_ttt_respect_doc_boundaries: True + global_ttt_warmup_chunks: 0 + global_ttt_warmup_start_lr: 0.0 + gptq_calibration_batches: 16 + gptq_reserve_seconds: 0.5 + grad_accum_steps: 1 + grad_clip_norm: 0.3 + hypothesis: Seed repeat of the clean SP10240 CaseOps MLP3.75 late045 standard 8x submission candidate, changing only seed/run identity. + is_main_process: True + iterations: 20000 + ln_scale: True + local_rank: 0 + logfile: logs/pr1855_sp10240_caseops_mlp375_late045_seed0_8x.txt + logit_softcap: 30.0 + loop_end: 5 + loop_start: 3 + lqer_asym_enabled: True + lqer_asym_group: 64 + lqer_enabled: True + lqer_factor_bits: 4 + lqer_rank: 4 + lqer_top_k: 3 + matrix_bits: 6 + matrix_clip_sigmas: 12.85 + matrix_lr: 0.026 + max_wallclock_seconds: 600.0 + min_lr: 0.1 + mlp_clip_sigmas: 11.5 + mlp_mult: 3.75 + model_dim: 512 + model_path: logs/final_model.pt + muon_backend_steps: 5 + muon_momentum: 0.97 + muon_momentum_warmup_start: 0.92 + muon_momentum_warmup_steps: 1500 + muon_row_normalize: True + muon_wd: 0.095 + num_heads: 8 + num_kv_heads: 4 + num_layers: 11 + num_loops: 2 + parallel_final_lane: mean + parallel_start_layer: 8 + parent_run: 2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x + phased_ttt_num_phases: 3 + phased_ttt_prefix_docs: 2500 + qk_gain_init: 5.25 + quantized_model_path: logs/final_model.int6.ptz + rank: 0 + rope_base: 10000.0 + rope_dims: 16 + rope_train_seq_len: 2048 + rope_yarn: False + run_id: pr1855_sp10240_caseops_mlp375_late045_seed0_8x + run_kind: seed_repeat + run_label: standard_8x + scalar_lr: 0.02 + seed: 0 + size_cap_bytes: 16000000 + skip_gates_enabled: True + smear_gate_enabled: True + source_parent: legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py + source_parent_sha256: 454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b + source_tokenizer_lane: scripts/prepare_sp10240_caseops_data.py + sparse_attn_gate_enabled: True + sparse_attn_gate_init_std: 0.0 + sparse_attn_gate_scale: 0.5 + test_date: 2026-05-01 + test_id: 2026-05-01_pr1855_sp10240_caseops_mlp375_late045_seed0_8x + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/tokenizers/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.99 + ttt_chunk_size: 48 + ttt_enabled: True + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 80 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 0.5 + val_batch_tokens: 524288 + val_bytes_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 0 + vocab_size: 10240 + 
warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +==================================================================================================== +Source code: +==================================================================================================== +import base64, collections, copy, fcntl, glob, io, lzma, math, os +from pathlib import Path +import random, re, subprocess, sys, time, uuid, numpy as np, sentencepiece as spm, torch, torch.distributed as dist, torch.nn.functional as F +from torch import Tensor, nn +from flash_attn_interface import ( + flash_attn_func as flash_attn_3_func, + flash_attn_varlen_func, +) +from concurrent.futures import ThreadPoolExecutor +import triton +import triton.language as tl +from triton.tools.tensor_descriptor import TensorDescriptor + + +# ===== Fused softcapped cross-entropy (Triton) — training-only path ===== +# Replaces the eager +# logits_softcap = softcap * tanh(logits / softcap) +# F.cross_entropy(logits_softcap.float(), targets, reduction="mean") +# sequence with a single fused kernel that reads logits_proj once, applies +# softcap in-register, and computes (LSE, loss) in one streaming pass. The +# backward kernel mirrors the forward so there's no stored softcapped logits. +# Numerically identical to the eager path up to fp32 accumulation differences. +_FUSED_CE_LIBRARY = "pgsubmission1draft7fusedce" +_FUSED_CE_BLOCK_SIZE = 1024 +_FUSED_CE_NUM_WARPS = 4 + + +@triton.jit +def _softcapped_ce_fwd_kernel( + logits_ptr, losses_ptr, lse_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + max_val = -float("inf") + sum_exp = 0.0 + A = 2.0 * softcap + inv_C = 2.0 / softcap + for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=-float("inf"), + ).to(tl.float32) + z = A * tl.sigmoid(val * inv_C) + z = tl.where(mask, z, -float("inf")) + curr_max = tl.max(z, axis=0) + new_max = tl.maximum(max_val, curr_max) + sum_exp = sum_exp * tl.exp(max_val - new_max) + tl.sum(tl.exp(z - new_max), axis=0) + max_val = new_max + lse = max_val + tl.log(sum_exp) + tl.store(lse_ptr + row_idx, lse) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + target_val = tl.load(logits_row_ptr + target * stride_logits_v).to(tl.float32) + target_z = A * tl.sigmoid(target_val * inv_C) + tl.store(losses_ptr + row_idx, lse - target_z) + + +@triton.jit +def _softcapped_ce_bwd_kernel( + grad_logits_ptr, grad_losses_ptr, lse_ptr, logits_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + stride_grad_n, stride_grad_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + grad_row_ptr = grad_logits_ptr + row_idx * stride_grad_n + lse = tl.load(lse_ptr + row_idx) + grad_loss = tl.load(grad_losses_ptr + row_idx).to(tl.float32) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + A = 2.0 * softcap + inv_C = 2.0 / softcap + dz_dx_scale = A * inv_C + for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=0.0, + ).to(tl.float32) + sigmoid_u = tl.sigmoid(val * inv_C) + z = A * sigmoid_u + probs = tl.exp(z - lse) + grad_z = grad_loss * 
(probs - tl.where(cols == target, 1.0, 0.0)) + grad_x = grad_z * (dz_dx_scale * sigmoid_u * (1.0 - sigmoid_u)) + tl.store(grad_row_ptr + cols * stride_grad_v, grad_x, mask=mask) + + +def _validate_softcapped_ce_inputs( + logits: Tensor, targets: Tensor, softcap: float, +) -> tuple[Tensor, Tensor]: + if logits.ndim != 2: + raise ValueError(f"Expected logits.ndim=2, got {logits.ndim}") + if targets.ndim != 1: + raise ValueError(f"Expected targets.ndim=1, got {targets.ndim}") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} targets={tuple(targets.shape)}" + ) + if not logits.is_cuda or not targets.is_cuda: + raise ValueError("softcapped_cross_entropy requires CUDA tensors") + if softcap <= 0.0: + raise ValueError(f"softcap must be positive, got {softcap}") + if logits.dtype not in (torch.float16, torch.bfloat16, torch.float32): + raise ValueError(f"Unsupported logits dtype: {logits.dtype}") + logits = logits.contiguous() + targets = targets.contiguous() + if targets.dtype != torch.int64: + targets = targets.to(dtype=torch.int64) + return logits, targets + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce", mutates_args=()) +def softcapped_ce_op(logits: Tensor, targets: Tensor, softcap: float) -> tuple[Tensor, Tensor]: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + n_rows, n_cols = logits.shape + losses = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + lse = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + _softcapped_ce_fwd_kernel[(n_rows,)]( + logits, losses, lse, targets, + logits.stride(0), logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return losses, lse + + +@softcapped_ce_op.register_fake +def _(logits: Tensor, targets: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1: + raise ValueError("softcapped_ce fake impl expects 2D logits and 1D targets") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} targets={tuple(targets.shape)}" + ) + n_rows = logits.shape[0] + return ( + logits.new_empty((n_rows,), dtype=torch.float32), + logits.new_empty((n_rows,), dtype=torch.float32), + ) + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce_backward", mutates_args=()) +def softcapped_ce_backward_op( + logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float, +) -> Tensor: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + lse = lse.contiguous() + grad_losses = grad_losses.contiguous().to(dtype=torch.float32) + if lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("Expected 1D lse and grad_losses") + if lse.shape[0] != logits.shape[0] or grad_losses.shape[0] != logits.shape[0]: + raise ValueError( + f"Expected row-aligned lse/grad_losses, got logits={tuple(logits.shape)} " + f"lse={tuple(lse.shape)} grad_losses={tuple(grad_losses.shape)}" + ) + grad_logits = torch.empty_like(logits) + n_rows, n_cols = logits.shape + _softcapped_ce_bwd_kernel[(n_rows,)]( + grad_logits, grad_losses, lse, logits, targets, + logits.stride(0), logits.stride(1), + grad_logits.stride(0), grad_logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return grad_logits + + +@softcapped_ce_backward_op.register_fake +def _(logits: Tensor, targets: 
Tensor, lse: Tensor, grad_losses: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1 or lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("softcapped_ce_backward fake impl expects 2D logits and 1D row tensors") + if ( + logits.shape[0] != targets.shape[0] + or logits.shape[0] != lse.shape[0] + or logits.shape[0] != grad_losses.shape[0] + ): + raise ValueError("softcapped_ce_backward fake impl expects row-aligned tensors") + return logits.new_empty(logits.shape) + + +def _softcapped_ce_setup_context( + ctx: torch.autograd.function.FunctionCtx, inputs, output, +) -> None: + logits, targets, softcap = inputs + _losses, lse = output + ctx.save_for_backward(logits, targets, lse) + ctx.softcap = float(softcap) + + +def _softcapped_ce_backward( + ctx: torch.autograd.function.FunctionCtx, grad_losses: Tensor, grad_lse: "Tensor | None", +): + del grad_lse + logits, targets, lse = ctx.saved_tensors + grad_logits = torch.ops.pgsubmission1draft7fusedce.softcapped_ce_backward( + logits, targets, lse, grad_losses, ctx.softcap + ) + return grad_logits, None, None + + +softcapped_ce_op.register_autograd( + _softcapped_ce_backward, setup_context=_softcapped_ce_setup_context, +) + + +def softcapped_cross_entropy( + logits: Tensor, targets: Tensor, softcap: float, reduction: str = "mean", +) -> Tensor: + losses, _lse = torch.ops.pgsubmission1draft7fusedce.softcapped_ce( + logits, targets, float(softcap) + ) + if reduction == "none": + return losses + if reduction == "sum": + return losses.sum() + if reduction == "mean": + return losses.mean() + raise ValueError(f"Unsupported reduction={reduction!r}") + + +class Hyperparameters: + data_dir = os.environ.get("DATA_DIR", "./data/") + seed = int(os.environ.get("SEED", 1337)) + run_id = os.environ.get("RUN_ID", str(uuid.uuid4())) + iterations = int(os.environ.get("ITERATIONS", 20000)) + warmdown_frac = float(os.environ.get("WARMDOWN_FRAC", 0.75)) + warmup_steps = int(os.environ.get("WARMUP_STEPS", 20)) + train_batch_tokens = int(os.environ.get("TRAIN_BATCH_TOKENS", 786432)) + # Fused softcapped CE (Triton). Training-only — forward_logits eval path still uses + # eager softcap+F.cross_entropy. Default ON since validated as at-worst neutral. 
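+    # Identity used by the kernel: softcap*tanh(x/softcap) ==
+    # 2*softcap*sigmoid(2*x/softcap) - softcap. The forward stores
+    # z = 2*softcap*sigmoid(2*x/softcap), i.e. the capped logit plus a
+    # constant, and the constant cancels in loss = lse - target_z.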
+ fused_ce_enabled = bool(int(os.environ.get("FUSED_CE_ENABLED", "1"))) + train_seq_len = int(os.environ.get("TRAIN_SEQ_LEN", 2048)) + train_log_every = int(os.environ.get("TRAIN_LOG_EVERY", 500)) + max_wallclock_seconds = float(os.environ.get("MAX_WALLCLOCK_SECONDS", 6e2)) + val_batch_tokens = int(os.environ.get("VAL_BATCH_TOKENS", 524288)) + eval_seq_len = int(os.environ.get("EVAL_SEQ_LEN", 2048)) + val_loss_every = int(os.environ.get("VAL_LOSS_EVERY", 4000)) + vocab_size = int(os.environ.get("VOCAB_SIZE", 8192)) + num_layers = int(os.environ.get("NUM_LAYERS", 11)) + xsa_last_n = int(os.environ.get("XSA_LAST_N", 11)) + model_dim = int(os.environ.get("MODEL_DIM", 512)) + num_kv_heads = int(os.environ.get("NUM_KV_HEADS", 4)) + num_heads = int(os.environ.get("NUM_HEADS", 8)) + mlp_mult = float(os.environ.get("MLP_MULT", 4.0)) + skip_gates_enabled = bool(int(os.environ.get("SKIP_GATES_ENABLED", "1"))) + tie_embeddings = bool(int(os.environ.get("TIE_EMBEDDINGS", "1"))) + logit_softcap = float(os.environ.get("LOGIT_SOFTCAP", 3e1)) + rope_base = float(os.environ.get("ROPE_BASE", 1e4)) + rope_dims = int(os.environ.get("ROPE_DIMS", 16)) + rope_train_seq_len = int(os.environ.get("ROPE_TRAIN_SEQ_LEN", 2048)) + rope_yarn = bool(int(os.environ.get("ROPE_YARN", "0"))) + ln_scale = bool(int(os.environ.get("LN_SCALE", "1"))) + qk_gain_init = float(os.environ.get("QK_GAIN_INIT", 5.0)) + num_loops = int(os.environ.get("NUM_LOOPS", 2)) + loop_start = int(os.environ.get("LOOP_START", 3)) + loop_end = int(os.environ.get("LOOP_END", 5)) + enable_looping_at = float(os.environ.get("ENABLE_LOOPING_AT", 0.35)) + parallel_start_layer = int(os.environ.get("PARALLEL_START_LAYER", 8)) + parallel_final_lane = os.environ.get("PARALLEL_FINAL_LANE", "mean") + min_lr = float(os.environ.get("MIN_LR", 0.0)) + embed_lr = float(os.environ.get("EMBED_LR", 0.6)) + tied_embed_lr = float(os.environ.get("TIED_EMBED_LR", 0.03)) + tied_embed_init_std = float(os.environ.get("TIED_EMBED_INIT_STD", 0.005)) + matrix_lr = float(os.environ.get("MATRIX_LR", 0.026)) + scalar_lr = float(os.environ.get("SCALAR_LR", 0.02)) + muon_momentum = float(os.environ.get("MUON_MOMENTUM", 0.97)) + muon_backend_steps = int(os.environ.get("MUON_BACKEND_STEPS", 5)) + muon_momentum_warmup_start = float( + os.environ.get("MUON_MOMENTUM_WARMUP_START", 0.92) + ) + muon_momentum_warmup_steps = int(os.environ.get("MUON_MOMENTUM_WARMUP_STEPS", 1500)) + muon_row_normalize = bool(int(os.environ.get("MUON_ROW_NORMALIZE", "1"))) + beta1 = float(os.environ.get("BETA1", 0.9)) + beta2 = float(os.environ.get("BETA2", 0.95)) + adam_eps = float(os.environ.get("ADAM_EPS", 1e-08)) + grad_clip_norm = float(os.environ.get("GRAD_CLIP_NORM", 0.3)) + eval_stride = int(os.environ.get("EVAL_STRIDE", 64)) + adam_wd = float(os.environ.get("ADAM_WD", 0.02)) + muon_wd = float(os.environ.get("MUON_WD", 0.095)) + embed_wd = float(os.environ.get("EMBED_WD", 0.085)) + ema_decay = float(os.environ.get("EMA_DECAY", 0.9965)) + ttt_enabled = bool(int(os.environ.get("TTT_ENABLED", "1"))) + ttt_lora_rank = int(os.environ.get("TTT_LORA_RANK", 96)) + ttt_lora_lr = float(os.environ.get("TTT_LORA_LR", 0.0001)) + ttt_chunk_size = int(os.environ.get("TTT_CHUNK_SIZE", 48)) + ttt_eval_seq_len = int(os.environ.get("TTT_EVAL_SEQ_LEN", 2048)) + ttt_batch_size = int(os.environ.get("TTT_BATCH_SIZE", 64)) + ttt_grad_steps = int(os.environ.get("TTT_GRAD_STEPS", 1)) + ttt_weight_decay = float(os.environ.get("TTT_WEIGHT_DECAY", 1.0)) + ttt_beta1 = float(os.environ.get("TTT_BETA1", 0)) + ttt_beta2 = 
float(os.environ.get("TTT_BETA2", 0.999)) + ttt_k_lora = bool(int(os.environ.get("TTT_K_LORA", "1"))) + ttt_mlp_lora = bool(int(os.environ.get("TTT_MLP_LORA", "1"))) + ttt_o_lora = bool(int(os.environ.get("TTT_O_LORA", "1"))) + ttt_optimizer = os.environ.get("TTT_OPTIMIZER", "adam") + ttt_eval_batches = os.environ.get("TTT_EVAL_BATCHES", "") + val_doc_fraction = float(os.environ.get("VAL_DOC_FRACTION", 1.0)) + compressor = os.environ.get("COMPRESSOR", "brotli") + gptq_calibration_batches = int(os.environ.get("GPTQ_CALIBRATION_BATCHES", 16)) + gptq_reserve_seconds = float(os.environ.get("GPTQ_RESERVE_SECONDS", 4.0)) + phased_ttt_prefix_docs = int(os.environ.get("PHASED_TTT_PREFIX_DOCS", 2000)) + phased_ttt_num_phases = int(os.environ.get("PHASED_TTT_NUM_PHASES", 1)) + global_ttt_lr = float(os.environ.get("GLOBAL_TTT_LR", 0.001)) + global_ttt_momentum = float(os.environ.get("GLOBAL_TTT_MOMENTUM", 0.9)) + global_ttt_epochs = int(os.environ.get("GLOBAL_TTT_EPOCHS", 1)) + global_ttt_chunk_tokens = int(os.environ.get("GLOBAL_TTT_CHUNK_TOKENS", 32768)) + global_ttt_batch_seqs = int(os.environ.get("GLOBAL_TTT_BATCH_SEQS", 32)) + global_ttt_warmup_start_lr = float(os.environ.get("GLOBAL_TTT_WARMUP_START_LR", 0.0)) + global_ttt_warmup_chunks = int(os.environ.get("GLOBAL_TTT_WARMUP_CHUNKS", 0)) + global_ttt_grad_clip = float(os.environ.get("GLOBAL_TTT_GRAD_CLIP", 1.0)) + global_ttt_respect_doc_boundaries = bool(int(os.environ.get("GLOBAL_TTT_RESPECT_DOC_BOUNDARIES", "1"))) + matrix_bits = int(os.environ.get("MATRIX_BITS", 6)) + embed_bits = int(os.environ.get("EMBED_BITS", 8)) + matrix_clip_sigmas = float(os.environ.get("MATRIX_CLIP_SIGMAS", 12.85)) + embed_clip_sigmas = float(os.environ.get("EMBED_CLIP_SIGMAS", 2e1)) + mlp_clip_sigmas = float(os.environ.get("MLP_CLIP_SIGMAS", 10.0)) + attn_clip_sigmas = float(os.environ.get("ATTN_CLIP_SIGMAS", 13.0)) + # AttnOutGate (per-head multiplicative output gate, PR #1667 MarioPaerle). + # Zero-init weight: 2*sigmoid(0)=1 -> transparent at start. Source defaults to + # block input x ('proj'); 'q' uses raw Q projection output. + attn_out_gate_enabled = bool(int(os.environ.get("ATTN_OUT_GATE_ENABLED", "0"))) + attn_out_gate_src = os.environ.get("ATTN_OUT_GATE_SRC", "proj") + # SmearGate (input-dependent forward-1 token smear, modded-nanogpt @classiclarryd + # via PR #1667). x_t <- x_t + lam * sigmoid(W*x_t[:gate_window]) * x_{t-1}. + # lam=0 + W=0 -> transparent at init. + smear_gate_enabled = bool(int(os.environ.get("SMEAR_GATE_ENABLED", "0"))) + # Window: first GATE_WINDOW dims of the source feed the gate projection. + gate_window = int(os.environ.get("GATE_WINDOW", 12)) + # Gated Attention (Qwen, NeurIPS 2025 Best Paper, arXiv:2505.06708; + # qiuzh20/gated_attention). Per-head sigmoid gate on SDPA output, BEFORE + # out_proj. Gate input = full block input x (paper's headwise G1 variant + # driven from hidden_states). W_g shape (num_heads, dim), plain sigmoid. + # Near-zero init gives g~0.5 at step 0 (half attention output); per-block + # attn_scale (init 1.0) compensates during training. Name contains + # "attn_gate" so CONTROL_TENSOR_NAME_PATTERNS routes it to scalar AdamW. + gated_attn_enabled = bool(int(os.environ.get("GATED_ATTN_ENABLED", "0"))) + gated_attn_init_std = float(os.environ.get("GATED_ATTN_INIT_STD", 0.01)) + # Dedicated int8-per-row quantization for `attn_gate_w` tensors. 
These are + # small ((num_heads, dim) = (8, 512) = 4096 params) and bypass GPTQ via the + # numel<=65536 passthrough branch -> stored as fp16 (8 KB/layer, ~65 KB total + # compressed). int8-per-row cuts the raw tensor in half with negligible BPB + # impact: scales per head (8 values), symmetric quant over [-127, 127]. + # No Hessian needed (gate weights not in collect_hessians()). + gated_attn_quant_gate = bool(int(os.environ.get("GATED_ATTN_QUANT_GATE", "0"))) + # Sparse Attention Gate (modded-nanogpt-style). Keeps dense SDPA and only + # swaps the output-gate input to the first GATE_WINDOW residual dims. + # W_g: (num_heads, gate_window) = (8, 12) = 96 params/layer (~44K total), + # vs dense GatedAttn's (8, 512) = 4K/layer (~44K diff). Name "attn_gate_w" + # is shared so quant routing and int8 gate passthrough Just Work. Gate + # passthrough int8 still applies via GATED_ATTN_QUANT_GATE=1. + # Mutually exclusive with ATTN_OUT_GATE_ENABLED and GATED_ATTN_ENABLED. + sparse_attn_gate_enabled = bool(int(os.environ.get("SPARSE_ATTN_GATE_ENABLED", "0"))) + sparse_attn_gate_init_std = float(os.environ.get("SPARSE_ATTN_GATE_INIT_STD", 0.0)) + sparse_attn_gate_scale = float(os.environ.get("SPARSE_ATTN_GATE_SCALE", 1.0)) + # LQER asymmetric rank-k correction on top-K quant-error tensors (PR #1530 v2 port). + # Computes SVD of E = W_fp - W_quant, packs top-r A,B as INT2/INT4 (asym) or INTk (sym). + lqer_enabled = bool(int(os.environ.get("LQER_ENABLED", "1"))) + lqer_rank = int(os.environ.get("LQER_RANK", 4)) + lqer_top_k = int(os.environ.get("LQER_TOP_K", 3)) + lqer_factor_bits = int(os.environ.get("LQER_FACTOR_BITS", 4)) + lqer_asym_enabled = bool(int(os.environ.get("LQER_ASYM_ENABLED", "1"))) + lqer_asym_group = int(os.environ.get("LQER_ASYM_GROUP", "64")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + rank = int(os.environ.get("RANK", "0")) + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + is_main_process = rank == 0 + grad_accum_steps = 8 // world_size + # CaseOps integration: optional override of dataset root + tokenizer path. + # When CASEOPS_ENABLED=1, the wrapper loads a per-token byte sidecar + # (fineweb_val_bytes_*.bin, identical shard layout to val_*.bin) and uses + # it as the canonical raw-byte budget for BPB accounting. The sidecar + # REPLACES the build_sentencepiece_luts byte-counting path entirely. 
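+    # Pairing example (shard numbering illustrative): fineweb_val_000000.bin
+    # holds uint16 token ids and fineweb_val_bytes_000000.bin holds, per
+    # position, the raw-text byte count that token decodes to, so
+    # val_bytes[i] prices val_tokens[i] in the BPB denominator.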
+ caseops_enabled = bool(int(os.environ.get("CASEOPS_ENABLED", "0"))) + _default_caseops_data = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "datasets", + "fineweb10B_sp8192_lossless_caps_caseops_v1_reserved", + ) + _default_caseops_tok = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "tokenizers", + "fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model", + ) + if caseops_enabled: + datasets_dir = os.environ.get("DATA_PATH", _default_caseops_data) + tokenizer_path = os.environ.get("TOKENIZER_PATH", _default_caseops_tok) + else: + datasets_dir = os.environ.get( + "DATA_PATH", + os.path.join(data_dir, "datasets", f"fineweb10B_sp{vocab_size}"), + ) + tokenizer_path = os.environ.get( + "TOKENIZER_PATH", + os.path.join(data_dir, "tokenizers", f"fineweb_{vocab_size}_bpe.model"), + ) + train_files = os.path.join(datasets_dir, "fineweb_train_*.bin") + val_files = os.path.join(datasets_dir, "fineweb_val_*.bin") + val_bytes_files = os.path.join(datasets_dir, "fineweb_val_bytes_*.bin") + artifact_dir = os.environ.get("ARTIFACT_DIR", "") + logfile = ( + os.path.join(artifact_dir, f"{run_id}.txt") + if artifact_dir + else f"logs/{run_id}.txt" + ) + model_path = ( + os.path.join(artifact_dir, "final_model.pt") + if artifact_dir + else "final_model.pt" + ) + quantized_model_path = ( + os.path.join(artifact_dir, "final_model.int6.ptz") + if artifact_dir + else "final_model.int6.ptz" + ) + + +# ===== 2026-04-30 SP10240 CaseOps MLP3.75 late045 promoted test car ===== +# Source of truth for this new experiment. The launcher only checks files and +# calls this run.py; it does not define model or eval conditions. +TEST_ID = "2026-05-01_pr1855_sp10240_caseops_mlp375_late045_seed0_8x" +TEST_DATE = "2026-05-01" +RUN_LABEL = "standard_8x" +RUN_KIND = "seed_repeat" +SOURCE_PARENT = "legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py" +SOURCE_PARENT_SHA256 = "454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b" +SOURCE_TOKENIZER_LANE = "scripts/prepare_sp10240_caseops_data.py" +PARENT_RUN = "2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x" +HYPOTHESIS = ( + "Seed repeat of the clean SP10240 CaseOps MLP3.75 late045 standard 8x " + "submission candidate, changing only seed/run identity." 
+) +SIZE_CAP_BYTES = 16000000 +BUILD_SECONDS = 600 +EVAL_SECONDS = 600 + +Hyperparameters.test_id = TEST_ID +Hyperparameters.test_date = TEST_DATE +Hyperparameters.run_label = RUN_LABEL +Hyperparameters.run_kind = RUN_KIND +Hyperparameters.source_parent = SOURCE_PARENT +Hyperparameters.source_parent_sha256 = SOURCE_PARENT_SHA256 +Hyperparameters.source_tokenizer_lane = SOURCE_TOKENIZER_LANE +Hyperparameters.parent_run = PARENT_RUN +Hyperparameters.hypothesis = HYPOTHESIS +Hyperparameters.size_cap_bytes = SIZE_CAP_BYTES +Hyperparameters.build_seconds = BUILD_SECONDS +Hyperparameters.eval_seconds = EVAL_SECONDS + +Hyperparameters.data_dir = "/workspace/SOTA_FINAL/data" +_caseops_root = os.path.join( + Hyperparameters.data_dir, "datasets", "fineweb10B_sp10240_caseops", "datasets" +) +Hyperparameters.vocab_size = 10240 +Hyperparameters.caseops_enabled = True +Hyperparameters.datasets_dir = os.path.join( + _caseops_root, "datasets", "fineweb10B_sp10240_lossless_caps_caseops_v1_reserved" +) +Hyperparameters.train_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_train_*.bin") +Hyperparameters.val_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_*.bin") +Hyperparameters.val_bytes_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_bytes_*.bin") +Hyperparameters.tokenizer_path = os.path.join( + _caseops_root, "tokenizers", "fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model" +) + +Hyperparameters.seed = 0 +Hyperparameters.run_id = "pr1855_sp10240_caseops_mlp375_late045_seed0_8x" +Hyperparameters.artifact_dir = "logs" +Hyperparameters.logfile = os.path.join(Hyperparameters.artifact_dir, f"{Hyperparameters.run_id}.txt") +Hyperparameters.model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.pt") +Hyperparameters.quantized_model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.int6.ptz") +Hyperparameters.iterations = 20000 +Hyperparameters.max_wallclock_seconds = float(BUILD_SECONDS) +Hyperparameters.num_layers = 11 +Hyperparameters.xsa_last_n = 11 +Hyperparameters.model_dim = 512 +Hyperparameters.num_heads = 8 +Hyperparameters.num_kv_heads = 4 +Hyperparameters.mlp_mult = 3.75 +Hyperparameters.num_loops = 2 +Hyperparameters.loop_start = 3 +Hyperparameters.loop_end = 5 +Hyperparameters.enable_looping_at = 0.45 +Hyperparameters.parallel_start_layer = 8 +Hyperparameters.qk_gain_init = 5.25 +Hyperparameters.warmdown_frac = 0.85 +Hyperparameters.warmup_steps = 20 +Hyperparameters.min_lr = 0.1 +Hyperparameters.matrix_lr = 0.026 +Hyperparameters.beta2 = 0.99 +Hyperparameters.muon_backend_steps = 5 +Hyperparameters.grad_clip_norm = 0.3 +Hyperparameters.val_loss_every = 0 +Hyperparameters.ttt_enabled = True +Hyperparameters.ttt_lora_rank = 80 +Hyperparameters.ttt_chunk_size = 48 +Hyperparameters.ttt_weight_decay = 0.5 +Hyperparameters.ttt_beta2 = 0.99 +Hyperparameters.phased_ttt_prefix_docs = 2500 +Hyperparameters.phased_ttt_num_phases = 3 +Hyperparameters.global_ttt_momentum = 0.9 +Hyperparameters.compressor = "pergroup" +Hyperparameters.gptq_reserve_seconds = 0.5 +Hyperparameters.gptq_calibration_batches = 16 +Hyperparameters.matrix_bits = 6 +Hyperparameters.embed_bits = 7 +Hyperparameters.mlp_clip_sigmas = 11.5 +Hyperparameters.attn_clip_sigmas = 13.0 +Hyperparameters.embed_clip_sigmas = 14.0 +Hyperparameters.gated_attn_quant_gate = True +Hyperparameters.sparse_attn_gate_enabled = True +Hyperparameters.sparse_attn_gate_scale = 0.5 +Hyperparameters.gate_window = 12 +Hyperparameters.smear_gate_enabled = True +Hyperparameters.lqer_enabled 
= True +Hyperparameters.lqer_asym_enabled = True +Hyperparameters.lqer_rank = 4 +Hyperparameters.lqer_factor_bits = 4 +Hyperparameters.lqer_asym_group = 64 +Hyperparameters.lqer_top_k = 3 +Hyperparameters.fused_ce_enabled = True + +_logger_hparams = None + + +def set_logging_hparams(h): + global _logger_hparams + _logger_hparams = h + + +def log(msg, console=True): + if _logger_hparams is None: + print(msg) + return + if _logger_hparams.is_main_process: + if console: + print(msg) + if _logger_hparams.logfile is not None: + with open(_logger_hparams.logfile, "a", encoding="utf-8") as f: + print(msg, file=f) + + +class ValidationData: + def __init__(self, h, device): + self.sp = spm.SentencePieceProcessor(model_file=h.tokenizer_path) + if int(self.sp.vocab_size()) != h.vocab_size: + raise ValueError( + f"VOCAB_SIZE={h.vocab_size} does not match tokenizer vocab_size={int(self.sp.vocab_size())}" + ) + self.val_tokens = load_validation_tokens(h.val_files, h.eval_seq_len) + self.caseops_enabled = bool(getattr(h, "caseops_enabled", False)) + if self.caseops_enabled: + self.base_bytes_lut = None + self.has_leading_space_lut = None + self.is_boundary_token_lut = None + else: + ( + self.base_bytes_lut, + self.has_leading_space_lut, + self.is_boundary_token_lut, + ) = build_sentencepiece_luts(self.sp, h.vocab_size, device) + self.val_bytes = None + if self.caseops_enabled: + self.val_bytes = load_validation_byte_sidecar( + h.val_bytes_files, h.eval_seq_len, self.val_tokens.numel() + ) + + +def build_sentencepiece_luts(sp, vocab_size, device): + sp_vocab_size = int(sp.vocab_size()) + assert ( + sp.piece_to_id("▁") != sp.unk_id() + ), "Tokenizer must have '▁' (space) as its own token for correct BPB byte counting" + table_size = max(sp_vocab_size, vocab_size) + base_bytes_np = np.zeros((table_size,), dtype=np.int16) + has_leading_space_np = np.zeros((table_size,), dtype=np.bool_) + is_boundary_token_np = np.ones((table_size,), dtype=np.bool_) + for token_id in range(sp_vocab_size): + if sp.is_control(token_id) or sp.is_unknown(token_id) or sp.is_unused(token_id): + continue + is_boundary_token_np[token_id] = False + if sp.is_byte(token_id): + base_bytes_np[token_id] = 1 + continue + piece = sp.id_to_piece(token_id) + if piece.startswith("▁"): + has_leading_space_np[token_id] = True + piece = piece[1:] + base_bytes_np[token_id] = len(piece.encode("utf-8")) + return ( + torch.tensor(base_bytes_np, dtype=torch.int16, device=device), + torch.tensor(has_leading_space_np, dtype=torch.bool, device=device), + torch.tensor(is_boundary_token_np, dtype=torch.bool, device=device), + ) + + +def load_validation_tokens(pattern, seq_len): + # Filter out CaseOps byte sidecar shards which share the val_*.bin glob. + files = [ + Path(p) + for p in sorted(glob.glob(pattern)) + if "_bytes_" not in Path(p).name + ] + if not files: + raise FileNotFoundError(f"No files found for pattern: {pattern}") + tokens = torch.cat([load_data_shard(file) for file in files]).contiguous() + usable = (tokens.numel() - 1) // seq_len * seq_len + if usable <= 0: + raise ValueError(f"Validation split is too short for TRAIN_SEQ_LEN={seq_len}") + return tokens[: usable + 1] + + +def load_validation_byte_sidecar(pattern, seq_len, expected_len): + """Load CaseOps per-token byte sidecar(s). Same shard layout as token shards + (256 int32 header + uint16 array). Each entry = canonical raw-text byte + budget for that token in the corresponding val shard. Returns a CPU + int16 tensor sliced to match expected_len (i.e. 
val_tokens length)."""
+    files = [Path(p) for p in sorted(glob.glob(pattern))]
+    if not files:
+        raise FileNotFoundError(f"No byte sidecar files for pattern: {pattern}")
+    shards = [load_data_shard(file) for file in files]
+    # load_data_shard returns uint16 — that's exactly what the sidecar stores.
+    bytes_full = torch.cat(shards).contiguous()
+    if bytes_full.numel() < expected_len:
+        raise ValueError(
+            f"Byte sidecar too short: {bytes_full.numel()} < val_tokens {expected_len}"
+        )
+    return bytes_full[:expected_len].to(torch.int32)
+
+
+def load_data_shard(file):
+    header_bytes = 256 * np.dtype("<i4").itemsize
+    # Shard layout (matches the byte-sidecar docstring above): 256-int32 header
+    # with the token count at header[2], followed by a flat uint16 array.
+    header = np.fromfile(file, dtype="<i4", count=256)
+    num_tokens = int(header[2])
+    tokens = np.fromfile(file, dtype="<u2", count=num_tokens, offset=header_bytes)
+    assert tokens.size == num_tokens, f"short read in {file}"
+    return torch.from_numpy(tokens)
+
+
+def _read_num_tokens(file):
+    return int(np.fromfile(file, dtype="<i4", count=256)[2])
+
+
+_shard_memmaps = {}
+
+
+def _get_shard_memmap(file):
+    # Lazily memmap the uint16 payload (header skipped) so ShuffledSequenceLoader
+    # can gather random windows without reading whole shards into RAM.
+    mm = _shard_memmaps.get(file)
+    if mm is None:
+        mm = np.memmap(
+            file,
+            dtype="<u2",
+            mode="r",
+            offset=256 * np.dtype("<i4").itemsize,
+            shape=(_read_num_tokens(file),),
+        )
+        _shard_memmaps[file] = mm
+    return mm
+
+
+def get_next_multiple_of_n(value, n):
+    return n * ((value + n - 1) // n)
+
+
+BOS_ID = None
+
+
+def _build_cu_seqlens(doc_starts, total_len, device, max_doc_len, bucket_size):
+    # Flash-attn varlen boundaries: one segment per document, split into pieces
+    # of at most max_doc_len tokens, padded to a multiple of bucket_size so
+    # cu_seqlens only takes a handful of shapes (fewer recompiles).
+    starts = list(doc_starts)
+    if not starts or starts[0] != 0:
+        starts = [0] + starts
+    seg_starts = []
+    for start, end in zip(starts, starts[1:] + [total_len]):
+        if max_doc_len > 0:
+            pos = start
+            while pos < end:
+                seg_starts.append(pos)
+                pos += max_doc_len
+        else:
+            seg_starts.append(start)
+    boundaries = seg_starts + [total_len]
+    padded_len = get_next_multiple_of_n(len(boundaries), bucket_size)
+    cu = torch.full((padded_len,), total_len, dtype=torch.int32, device=device)
+    cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device)
+    seg_ends = seg_starts[1:] + [total_len]
+    max_seqlen = max(end - start for start, end in zip(seg_starts, seg_ends))
+    return cu, max_seqlen
+
+class DocumentPackingLoader:
+    _shard_pool = ThreadPoolExecutor(1)
+
+    def __init__(self, h, device, cu_bucket_size=64):
+        self.rank = h.rank
+        self.world_size = h.world_size
+        self.device = device
+        self.cu_bucket_size = cu_bucket_size
+        self.max_seq_len = h.train_seq_len
+        all_files = [Path(p) for p in sorted(glob.glob(h.train_files))]
+        if not all_files:
+            raise FileNotFoundError(f"No files found for pattern: {h.train_files}")
+        self.files = all_files
+        self.file_iter = iter(self.files)
+        self._init_shard(load_data_shard(next(self.file_iter)))
+        self._next_shard = self._submit_next_shard()
+        self._batch_pool = ThreadPoolExecutor(1)
+        self._prefetch_queue = []
+
+    def _init_shard(self, tokens):
+        global BOS_ID
+        self.tokens = tokens
+        self.shard_size = tokens.numel()
+        if BOS_ID is None:
+            BOS_ID = 1
+        self.bos_idx = (
+            (tokens == BOS_ID).nonzero(as_tuple=True)[0].to(torch.int64).cpu().numpy()
+        )
+        self.cursor = int(self.bos_idx[0])
+
+    def _submit_next_shard(self):
+        try:
+            path = next(self.file_iter)
+            return self._shard_pool.submit(load_data_shard, path)
+        except StopIteration:
+            return None
+
+    def _advance_shard(self):
+        if self._next_shard is None:
+            self.file_iter = iter(self.files)
+            self._next_shard = self._shard_pool.submit(
+                load_data_shard, next(self.file_iter)
+            )
+        self._init_shard(self._next_shard.result())
+        self._next_shard = self._submit_next_shard()
+
+    def _local_doc_starts(self, local_start, total_len):
+        lo = np.searchsorted(self.bos_idx, local_start, side="left")
+        hi = np.searchsorted(self.bos_idx, local_start + total_len, side="left")
+        return (self.bos_idx[lo:hi] - local_start).tolist()
+
+    def _prepare_batch(self, num_tokens_local, max_seq_len):
+        per_rank_span = num_tokens_local + 1
+        global_span = per_rank_span * self.world_size
+        while self.cursor + global_span > self.shard_size:
+            self._advance_shard()
+        local_start = self.cursor + self.rank * per_rank_span
+        buf = self.tokens[local_start : local_start + per_rank_span]
+        inputs = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        targets = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        inputs.copy_(buf[:-1])
+        targets.copy_(buf[1:])
+        starts = self._local_doc_starts(local_start, inputs.numel())
+        cu_seqlens, max_seqlen = _build_cu_seqlens(
+            starts, inputs.numel(), inputs.device, max_seq_len, self.cu_bucket_size
+        )
+        cu_seqlens = 
cu_seqlens.pin_memory() + self.cursor += global_span + return inputs, targets, cu_seqlens, max_seqlen + + def next_batch(self, global_tokens, grad_accum_steps): + num_tokens_local = global_tokens // (self.world_size * grad_accum_steps) + while len(self._prefetch_queue) < 2: + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + inputs, targets, cu_seqlens, max_seqlen = self._prefetch_queue.pop(0).result() + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + return ( + inputs[None].to(self.device, non_blocking=True), + targets[None].to(self.device, non_blocking=True), + cu_seqlens.to(self.device, non_blocking=True), + max_seqlen, + ) + + +class ShuffledSequenceLoader: + def __init__(self, h, device): + self.world_size = h.world_size + self.seq_len = h.train_seq_len + self.device = device + all_files = [Path(p) for p in sorted(glob.glob(h.train_files))] + if not all_files: + raise FileNotFoundError(f"No files found for pattern: {h.train_files}") + self.files = all_files[h.rank :: h.world_size] + self.rng = np.random.Generator(np.random.PCG64(h.rank)) + self.num_tokens = [_read_num_tokens(f) for f in self.files] + self.start_inds = [[] for _ in self.files] + for si in range(len(self.files)): + self._reset_shard(si) + + def _reset_shard(self, si): + max_phase = min( + self.seq_len - 1, max(0, self.num_tokens[si] - self.seq_len - 1) + ) + phase = int(self.rng.integers(max_phase + 1)) if max_phase > 0 else 0 + num_sequences = (self.num_tokens[si] - 1 - phase) // self.seq_len + sequence_order = self.rng.permutation(num_sequences) + self.start_inds[si] = (phase + sequence_order * self.seq_len).tolist() + + def next_batch(self, global_tokens, grad_accum_steps): + device_tokens = global_tokens // (self.world_size * grad_accum_steps) + device_batch_size = device_tokens // self.seq_len + remaining = np.array([len(s) for s in self.start_inds], dtype=np.float64) + x = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + y = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + for bi in range(device_batch_size): + total = remaining.sum() + if total <= 0: + for si in range(len(self.files)): + self._reset_shard(si) + remaining = np.array( + [len(s) for s in self.start_inds], dtype=np.float64 + ) + total = remaining.sum() + probs = remaining / total + si = int(self.rng.choice(len(self.files), p=probs)) + start_ind = self.start_inds[si].pop() + remaining[si] -= 1 + mm = _get_shard_memmap(self.files[si]) + window = torch.as_tensor( + np.array(mm[start_ind : start_ind + self.seq_len + 1], dtype=np.int64) + ) + x[bi] = window[:-1] + y[bi] = window[1:] + return x.to(self.device, non_blocking=True), y.to( + self.device, non_blocking=True + ) + + +class RMSNorm(nn.Module): + def __init__(self, eps=None): + super().__init__() + self.eps = eps + + def forward(self, x): + return F.rms_norm(x, (x.size(-1),), eps=self.eps) + + +class CastedLinear(nn.Linear): + def forward(self, x): + w = self.weight.to(x.dtype) + bias = self.bias.to(x.dtype) if self.bias is not None else None + return F.linear(x, w, bias) + + +@triton.jit +def linear_leaky_relu_square_kernel( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M: tl.constexpr, + BLOCK_SIZE_N: tl.constexpr, + BLOCK_SIZE_K: tl.constexpr, + NUM_SMS: tl.constexpr, + FORWARD: tl.constexpr, +): + dtype = tl.bfloat16 + start_pid = tl.program_id(axis=0) + num_pid_m = tl.cdiv(M, BLOCK_SIZE_M) + num_pid_n = 
tl.cdiv(N, BLOCK_SIZE_N) + k_tiles = tl.cdiv(K, BLOCK_SIZE_K) + num_tiles = num_pid_m * num_pid_n + tile_id_c = start_pid - NUM_SMS + for tile_id in tl.range(start_pid, num_tiles, NUM_SMS, flatten=True): + pid_m = tile_id // num_pid_n + pid_n = tile_id % num_pid_n + offs_am = pid_m * BLOCK_SIZE_M + offs_bn = pid_n * BLOCK_SIZE_N + accumulator = tl.zeros((BLOCK_SIZE_M, BLOCK_SIZE_N), dtype=tl.float32) + for ki in range(k_tiles): + offs_k = ki * BLOCK_SIZE_K + a = a_desc.load([offs_am, offs_k]) + b = b_desc.load([offs_bn, offs_k]) + accumulator = tl.dot(a, b.T, accumulator) + tile_id_c += NUM_SMS + offs_am_c = offs_am + offs_bn_c = offs_bn + acc = tl.reshape(accumulator, (BLOCK_SIZE_M, 2, BLOCK_SIZE_N // 2)) + acc = tl.permute(acc, (0, 2, 1)) + acc0, acc1 = tl.split(acc) + c0 = acc0.to(dtype) + c1 = acc1.to(dtype) + if not FORWARD: + pre0 = aux_desc.load([offs_am_c, offs_bn_c]) + pre1 = aux_desc.load([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2]) + c0 = c0 * tl.where(pre0 > 0, 2.0 * pre0, 0.5 * pre0) + c1 = c1 * tl.where(pre1 > 0, 2.0 * pre1, 0.5 * pre1) + c_desc.store([offs_am_c, offs_bn_c], c0) + c_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], c1) + if FORWARD: + aux0 = tl.where(c0 > 0, c0, 0.5 * c0) + aux1 = tl.where(c1 > 0, c1, 0.5 * c1) + aux_desc.store([offs_am_c, offs_bn_c], aux0 * aux0) + aux_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], aux1 * aux1) + + +def linear_leaky_relu_square(a, b, aux=None): + M, K = a.shape + N, K2 = b.shape + assert K == K2 + c = torch.empty((M, N), device=a.device, dtype=a.dtype) + forward = aux is None + if aux is None: + aux = torch.empty((M, N), device=a.device, dtype=a.dtype) + num_sms = torch.cuda.get_device_properties(a.device).multi_processor_count + BLOCK_SIZE_M, BLOCK_SIZE_N, BLOCK_SIZE_K = 256, 128, 64 + num_stages = 4 if forward else 3 + a_desc = TensorDescriptor.from_tensor(a, [BLOCK_SIZE_M, BLOCK_SIZE_K]) + b_desc = TensorDescriptor.from_tensor(b, [BLOCK_SIZE_N, BLOCK_SIZE_K]) + c_desc = TensorDescriptor.from_tensor(c, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + aux_desc = TensorDescriptor.from_tensor(aux, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + grid = lambda _meta: ( + min(num_sms, triton.cdiv(M, BLOCK_SIZE_M) * triton.cdiv(N, BLOCK_SIZE_N)), + ) + linear_leaky_relu_square_kernel[grid]( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M=BLOCK_SIZE_M, + BLOCK_SIZE_N=BLOCK_SIZE_N, + BLOCK_SIZE_K=BLOCK_SIZE_K, + NUM_SMS=num_sms, + FORWARD=forward, + num_stages=num_stages, + num_warps=8, + ) + if forward: + return c, aux + return c + + +class FusedLinearLeakyReLUSquareFunction(torch.autograd.Function): + @staticmethod + def forward(ctx, x, w1, w2): + x_flat = x.reshape(-1, x.shape[-1]) + pre, post = linear_leaky_relu_square(x_flat, w1) + out = F.linear(post, w2) + ctx.save_for_backward(x, w1, w2, pre, post) + return out.view(*x.shape[:-1], out.shape[-1]) + + @staticmethod + def backward(ctx, grad_output): + x, w1, w2, pre, post = ctx.saved_tensors + x_flat = x.reshape(-1, x.shape[-1]) + grad_output_flat = grad_output.reshape(-1, grad_output.shape[-1]) + dw2 = grad_output_flat.T @ post + dpre = linear_leaky_relu_square(grad_output_flat, w2.T.contiguous(), aux=pre) + dw1 = dpre.T @ x_flat + dx = dpre @ w1 + return dx.view_as(x), dw1, dw2 + + +FusedLeakyReLUSquareMLP = FusedLinearLeakyReLUSquareFunction.apply + + +class Rotary(nn.Module): + def __init__(self, dim, base=1e4, train_seq_len=1024, rope_dims=0, yarn=True): + super().__init__() + self.dim = dim + self.base = base + self.train_seq_len = train_seq_len + self.yarn 
= yarn + self.rope_dims = rope_dims if rope_dims > 0 else dim + inv_freq = 1.0 / base ** ( + torch.arange(0, self.rope_dims, 2, dtype=torch.float32) / self.rope_dims + ) + self.register_buffer("inv_freq", inv_freq, persistent=False) + self._seq_len_cached = 0 + self._cos_cached = None + self._sin_cached = None + + def forward(self, seq_len, device, dtype): + if ( + self._cos_cached is None + or self._sin_cached is None + or self._seq_len_cached < seq_len + or self._cos_cached.device != device + ): + rd = self.rope_dims + if self.yarn and seq_len > self.train_seq_len: + scale = seq_len / self.train_seq_len + new_base = self.base * scale ** (rd / (rd - 2)) + inv_freq = 1.0 / new_base ** ( + torch.arange(0, rd, 2, dtype=torch.float32, device=device) / rd + ) + else: + inv_freq = self.inv_freq.float().to(device) + t = torch.arange(seq_len, device=device, dtype=torch.float32) + freqs = torch.outer(t, inv_freq) + self._cos_cached = freqs.cos()[None, :, None, :] + self._sin_cached = freqs.sin()[None, :, None, :] + self._seq_len_cached = seq_len + return self._cos_cached[:, :seq_len].to(dtype=dtype), self._sin_cached[:, :seq_len].to(dtype=dtype) + + +def apply_rotary_emb(x, cos, sin, rope_dims=0): + if rope_dims > 0 and rope_dims < x.size(-1): + x_rope, x_pass = x[..., :rope_dims], x[..., rope_dims:] + half = rope_dims // 2 + x1, x2 = x_rope[..., :half], x_rope[..., half:] + x_rope = torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + return torch.cat((x_rope, x_pass), dim=-1) + half = x.size(-1) // 2 + x1, x2 = x[..., :half], x[..., half:] + return torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + + +class CausalSelfAttention(nn.Module): + def __init__( + self, dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=True, + attn_out_gate=False, attn_out_gate_src="proj", gate_window=12, + gated_attn=False, gated_attn_init_std=0.01, + sparse_attn_gate=False, sparse_attn_gate_init_std=0.0, sparse_attn_gate_scale=1.0, + ): + super().__init__() + if dim % num_heads != 0: + raise ValueError("model_dim must be divisible by num_heads") + if num_heads % num_kv_heads != 0: + raise ValueError("num_heads must be divisible by num_kv_heads") + if int(attn_out_gate) + int(gated_attn) + int(sparse_attn_gate) > 1: + raise ValueError( + "attn_out_gate, gated_attn, and sparse_attn_gate are mutually exclusive" + ) + self.num_heads = num_heads + self.num_kv_heads = num_kv_heads + self.head_dim = dim // num_heads + if self.head_dim % 2 != 0: + raise ValueError("head_dim must be even for RoPE") + self.q_gain = nn.Parameter( + torch.full((num_heads,), qk_gain_init, dtype=torch.float32) + ) + self.rope_dims = 0 + self.rotary = Rotary(self.head_dim, base=rope_base, train_seq_len=train_seq_len, yarn=yarn) + self.use_xsa = False + # AttnOutGate (PR #1667 MarioPaerle): per-head multiplicative gate on attention + # output. CastedLinear so restore_fp32_params casts back to fp32 for GPTQ. + # _zero_init -> 2*sigmoid(0)=1 -> transparent at init. + self.attn_out_gate = attn_out_gate + self.attn_out_gate_src = attn_out_gate_src + self.gate_window = gate_window + if attn_out_gate: + self.attn_gate_proj = CastedLinear(gate_window, num_heads, bias=False) + self.attn_gate_proj._zero_init = True + # Gated Attention (arXiv:2505.06708, Qwen, NeurIPS 2025). Per-head sigmoid + # gate on SDPA output, BEFORE out_proj. Gate projection W_g: (num_heads, dim). + # Name "attn_gate_w" contains "attn_gate" substring so it matches + # CONTROL_TENSOR_NAME_PATTERNS and routes to the scalar AdamW group. 
+ # fp32 Parameter -> restore_fp32_params path covers it via the ndim<2 OR + # name-pattern check (name matches "attn_gate"). Cast to x.dtype on use. + self.gated_attn = gated_attn + if gated_attn: + W = torch.empty(num_heads, dim, dtype=torch.float32) + nn.init.normal_(W, mean=0.0, std=gated_attn_init_std) + self.attn_gate_w = nn.Parameter(W) + # Sparse attention head-output gate (modded-nanogpt style). Keeps dense SDPA + # and only narrows the gate input to the first gate_window residual dims. + # W_g: (num_heads, gate_window). y_{t,h} <- sigmoid(scale * W_g_h @ x_t[:gate_window]) * y_{t,h}. + # Shares attn_gate_w name with dense GatedAttn so the quant routing + # (CONTROL_TENSOR_NAME_PATTERNS / attn_gate_w int8 passthrough) is unchanged. + self.sparse_attn_gate = sparse_attn_gate + self.sparse_attn_gate_scale = sparse_attn_gate_scale + if sparse_attn_gate: + W = torch.empty(num_heads, gate_window, dtype=torch.float32) + if sparse_attn_gate_init_std > 0: + nn.init.normal_(W, mean=0.0, std=sparse_attn_gate_init_std) + else: + nn.init.zeros_(W) + self.attn_gate_w = nn.Parameter(W) + + def _xsa_efficient(self, y, v): + B, T, H, D = y.shape + Hkv = v.size(-2) + group = H // Hkv + y_g = y.reshape(B, T, Hkv, group, D) + vn = F.normalize(v, dim=-1).unsqueeze(-2) + proj = (y_g * vn).sum(dim=-1, keepdim=True) * vn + return (y_g - proj).reshape(B, T, H, D) + + def forward(self, x, q_w, k_w, v_w, out_w, cu_seqlens=None, max_seqlen=0): + bsz, seqlen, dim = x.shape + # q_raw kept around as a tap point for attn_out_gate_src='q' (post-projection, + # pre-reshape, pre-RoPE). + q_raw = F.linear(x, q_w.to(x.dtype)) + q = q_raw.reshape(bsz, seqlen, self.num_heads, self.head_dim) + k = F.linear(x, k_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + v = F.linear(x, v_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = self.rotary(seqlen, x.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, self.rope_dims) + k = apply_rotary_emb(k, cos, sin, self.rope_dims) + q = q * self.q_gain.to(dtype=q.dtype)[None, None, :, None] + if cu_seqlens is not None: + y = flash_attn_varlen_func( + q[0], + k[0], + v[0], + cu_seqlens_q=cu_seqlens, + cu_seqlens_k=cu_seqlens, + max_seqlen_q=max_seqlen, + max_seqlen_k=max_seqlen, + causal=True, + window_size=(-1, -1), + )[None] + else: + y = flash_attn_3_func(q, k, v, causal=True) + if self.use_xsa: + y = self._xsa_efficient(y, v) + # AttnOutGate inlined (PR #1667). Inline + .contiguous() barrier so torch.compile + # fullgraph=True is happy (this avoids the @torch.compiler.disable trap that + # crashed gates v3). Per-head gate on (B,T,H,D) tensor: g shape [B,T,H], broadcast + # over D via [..., None]. zero-init weight -> 2*sigmoid(0)=1 -> transparent. + if self.attn_out_gate: + gate_src = q_raw if self.attn_out_gate_src == "q" else x + gate_in = gate_src[..., : self.gate_window].contiguous() + g = 2.0 * torch.sigmoid(self.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (arXiv:2505.06708 G1). Inline + .contiguous() barrier so + # torch.compile fullgraph=True is happy. Per-head gate on (B,T,H,D): g shape + # [B,T,H], broadcast over D via [..., None]. Paper: g = sigmoid(x @ W_g.T) + # where W_g: (H, dim). .to(x.dtype) on fp32 param before broadcast with bf16. 
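For orientation, a minimal standalone sketch of the per-head output gate the comment above describes; every shape and value here is an illustrative stand-in, not this submission's config:

```python
# Minimal sketch of the Gated Attention output gate (arXiv:2505.06708).
# B/T/H/D and the init std are illustrative stand-ins.
import torch

B, T, H, D = 2, 16, 8, 64
dim = H * D
x = torch.randn(B, T, dim)            # residual-stream input feeding the gate
y = torch.randn(B, T, H, D)           # per-head SDPA output
W_g = 0.01 * torch.randn(H, dim)      # fp32 gate projection, one row per head

g = torch.sigmoid(x @ W_g.T)          # [B, T, H]; near 0.5 at init (small W_g)
y_gated = y * g[..., None]            # broadcast each head's scalar gate over D
assert y_gated.shape == (B, T, H, D)
```

Note the contrast with AttnOutGate's `2*sigmoid` form: a plain sigmoid gate starts near 0.5 at init rather than 1, so it roughly halves the attention output until training moves the projection.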
+ if self.gated_attn: + x_c = x.contiguous() + g = torch.sigmoid(F.linear(x_c, self.attn_gate_w.to(x.dtype))) + y = y * g[..., None] + # Sparse head-output gate: narrower (gate_window) input, same shape g as GatedAttn. + if self.sparse_attn_gate: + gate_in = x[..., : self.gate_window].contiguous() + g = torch.sigmoid( + self.sparse_attn_gate_scale + * F.linear(gate_in, self.attn_gate_w.to(x.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + self._last_proj_input = y.detach() if getattr(self, "_calib", False) else None + return F.linear(y, out_w.to(x.dtype)) + + +class MLP(nn.Module): + def __init__(self, dim, mlp_mult): + super().__init__() + self.use_fused = True + + def forward(self, x, up_w, down_w): + if self.training and self.use_fused: + return FusedLeakyReLUSquareMLP(x, up_w.to(x.dtype), down_w.to(x.dtype)) + hidden = F.leaky_relu(F.linear(x, up_w.to(x.dtype)), negative_slope=0.5).square() + self._last_down_input = hidden.detach() if getattr(self, "_calib", False) else None + return F.linear(hidden, down_w.to(x.dtype)) + + +class Block(nn.Module): + def __init__( + self, + dim, + num_heads, + num_kv_heads, + mlp_mult, + rope_base, + qk_gain_init, + train_seq_len, + layer_idx=0, + ln_scale=False, + yarn=True, + attn_out_gate=False, + attn_out_gate_src="proj", + gate_window=12, + gated_attn=False, + gated_attn_init_std=0.01, + sparse_attn_gate=False, + sparse_attn_gate_init_std=0.0, + sparse_attn_gate_scale=1.0, + ): + super().__init__() + self.attn_norm = RMSNorm() + self.mlp_norm = RMSNorm() + self.attn = CausalSelfAttention( + dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=yarn, + attn_out_gate=attn_out_gate, attn_out_gate_src=attn_out_gate_src, gate_window=gate_window, + gated_attn=gated_attn, gated_attn_init_std=gated_attn_init_std, + sparse_attn_gate=sparse_attn_gate, + sparse_attn_gate_init_std=sparse_attn_gate_init_std, + sparse_attn_gate_scale=sparse_attn_gate_scale, + ) + self.mlp = MLP(dim, mlp_mult) + self.attn_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.mlp_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.resid_mix = nn.Parameter( + torch.stack((torch.ones(dim), torch.zeros(dim))).float() + ) + self.ln_scale_factor = 1.0 / math.sqrt(layer_idx + 1) if ln_scale else 1.0 + + def forward(self, x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=None, max_seqlen=0): + mix = self.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + attn_out = self.attn( + self.attn_norm(x_in) * self.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + x_out = x_in + self.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + x_out = x_out + self.mlp_scale.to(dtype=x_out.dtype)[ + None, None, : + ] * self.mlp(self.mlp_norm(x_out) * self.ln_scale_factor, up_w, down_w) + return x_out + +class GPT(nn.Module): + def __init__(self, h): + super().__init__() + if h.logit_softcap <= 0.0: + raise ValueError(f"logit_softcap must be positive, got {h.logit_softcap}") + self.tie_embeddings = h.tie_embeddings + self.tied_embed_init_std = h.tied_embed_init_std + self.logit_softcap = h.logit_softcap + self.fused_ce_enabled = bool(h.fused_ce_enabled) + self.tok_emb = nn.Embedding(h.vocab_size, h.model_dim) + self.num_layers = h.num_layers + head_dim = h.model_dim // h.num_heads + kv_dim = h.num_kv_heads * head_dim + hidden_dim = int(h.mlp_mult * h.model_dim) + self.qo_bank = nn.Parameter(torch.empty(2 * h.num_layers, h.model_dim, 
h.model_dim)) + self.kv_bank = nn.Parameter(torch.empty(2 * h.num_layers, kv_dim, h.model_dim)) + self.mlp_up_bank = nn.Parameter(torch.empty(h.num_layers, hidden_dim, h.model_dim)) + self.mlp_down_bank = nn.Parameter(torch.empty(h.num_layers, h.model_dim, hidden_dim)) + self.num_encoder_layers = h.num_layers // 2 + self.num_decoder_layers = h.num_layers - self.num_encoder_layers + self.blocks = nn.ModuleList( + [ + Block( + h.model_dim, + h.num_heads, + h.num_kv_heads, + h.mlp_mult, + h.rope_base, + h.qk_gain_init, + h.train_seq_len, + layer_idx=i, + ln_scale=h.ln_scale, + yarn=h.rope_yarn, + attn_out_gate=h.attn_out_gate_enabled, + attn_out_gate_src=h.attn_out_gate_src, + gate_window=h.gate_window, + gated_attn=h.gated_attn_enabled, + gated_attn_init_std=h.gated_attn_init_std, + sparse_attn_gate=h.sparse_attn_gate_enabled, + sparse_attn_gate_init_std=h.sparse_attn_gate_init_std, + sparse_attn_gate_scale=h.sparse_attn_gate_scale, + ) + for i in range(h.num_layers) + ] + ) + if h.rope_dims > 0: + head_dim = h.model_dim // h.num_heads + for block in self.blocks: + block.attn.rope_dims = h.rope_dims + block.attn.rotary = Rotary( + head_dim, + base=h.rope_base, + train_seq_len=h.train_seq_len, + rope_dims=h.rope_dims, + yarn=h.rope_yarn, + ) + self.final_norm = RMSNorm() + self.lm_head = ( + None + if h.tie_embeddings + else CastedLinear(h.model_dim, h.vocab_size, bias=False) + ) + if self.lm_head is not None: + self.lm_head._zero_init = True + if h.xsa_last_n > 0: + for i in range(max(0, h.num_layers - h.xsa_last_n), h.num_layers): + self.blocks[i].attn.use_xsa = True + self.looping_active = False + if h.num_loops > 0: + loop_seg = list(range(h.loop_start, h.loop_end + 1)) + all_indices = list(range(h.loop_start)) + for _ in range(h.num_loops + 1): + all_indices.extend(loop_seg) + all_indices.extend(range(h.loop_end + 1, h.num_layers)) + num_enc = len(all_indices) // 2 + self.encoder_indices = all_indices[:num_enc] + self.decoder_indices = all_indices[num_enc:] + else: + self.encoder_indices = list(range(self.num_encoder_layers)) + self.decoder_indices = list(range(self.num_encoder_layers, h.num_layers)) + self.num_skip_weights = min( + len(self.encoder_indices), len(self.decoder_indices) + ) + self.skip_weights = nn.Parameter( + torch.ones(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + self.skip_gates = ( + nn.Parameter( + torch.zeros(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + if h.skip_gates_enabled + else None + ) + self.parallel_start_layer = h.parallel_start_layer + self.parallel_final_lane = h.parallel_final_lane.lower() + self.parallel_post_lambdas = nn.Parameter( + torch.ones(h.num_layers, 2, 2, dtype=torch.float32) + ) + self.parallel_resid_lambdas = nn.Parameter( + torch.full((h.num_layers, 2), 1.1, dtype=torch.float32) + ) + # SmearGate (PR #1667 / modded-nanogpt @classiclarryd): + # x_t <- x_t + lam * sigmoid(W * x_t[:gate_window]) * x_{t-1}. + # Per-token forward-1 smear of the embedding lane. W zero-init + lam=0 -> + # transparent at init. Uses CastedLinear so restore_fp32_params handles dtype. 
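A toy check of that recurrence and its init-time transparency; sizes are illustrative, and `gate_window=12` matches the default used here:

```python
# SmearGate recurrence: x_t <- x_t + lam * sigmoid(W @ x_t[:gate_window]) * x_{t-1}.
# Zero-init W plus lam=0 makes the op an exact identity, as claimed above.
import torch

B, T, dim, gate_window = 2, 8, 32, 12
x = torch.randn(B, T, dim)
W = torch.zeros(1, gate_window)
lam = torch.zeros(1)

g = lam * torch.sigmoid(x[:, 1:, :gate_window] @ W.T)        # [B, T-1, 1]
smeared = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1]], dim=1)
assert torch.equal(smeared, x)                               # identity at init
```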
+ self.smear_gate_enabled = h.smear_gate_enabled + if self.smear_gate_enabled: + self.smear_window = h.gate_window + self.smear_gate = CastedLinear(self.smear_window, 1, bias=False) + self.smear_gate._zero_init = True + self.smear_lambda = nn.Parameter(torch.zeros(1, dtype=torch.float32)) + self._init_weights() + + def _init_weights(self): + if self.tie_embeddings: + nn.init.normal_(self.tok_emb.weight, mean=0.0, std=self.tied_embed_init_std) + n = self.num_layers + proj_scale = 1.0 / math.sqrt(2 * n) + for i in range(n): + nn.init.orthogonal_(self.qo_bank.data[i], gain=1.0) + nn.init.zeros_(self.qo_bank.data[n + i]) + self.qo_bank.data[n + i].mul_(proj_scale) + nn.init.orthogonal_(self.kv_bank.data[i], gain=1.0) + nn.init.orthogonal_(self.kv_bank.data[n + i], gain=1.0) + for i in range(n): + nn.init.orthogonal_(self.mlp_up_bank.data[i], gain=1.0) + nn.init.zeros_(self.mlp_down_bank.data[i]) + self.mlp_down_bank.data[i].mul_(proj_scale) + for name, module in self.named_modules(): + if isinstance(module, nn.Linear): + if getattr(module, "_zero_init", False): + nn.init.zeros_(module.weight) + elif ( + module.weight.ndim == 2 + and module.weight.shape[0] >= 64 + and module.weight.shape[1] >= 64 + ): + nn.init.orthogonal_(module.weight, gain=1.0) + + def _bank_weights(self, i): + n = self.num_layers + return ( + self.qo_bank[i], + self.kv_bank[i], + self.kv_bank[n + i], + self.qo_bank[n + i], + self.mlp_up_bank[i], + self.mlp_down_bank[i], + ) + + def _parallel_block( + self, block_idx, lane0, lane1, x0, + q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=None, max_seqlen=0, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + attn_out = block.attn( + block.attn_norm(attn_read) * block.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * block.mlp( + block.mlp_norm(mlp_read) * block.ln_scale_factor, up_w, down_w + ) + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + def _final_parallel_hidden(self, lane0, lane1): + if self.parallel_final_lane == "mlp": + return lane1 + if self.parallel_final_lane == "attn": + return lane0 + return 0.5 * (lane0 + lane1) + + def _forward_hidden(self, input_ids, cu_seqlens=None, max_seqlen=0): + """Run the encoder/decoder stack to the final RMSNorm; returns pre-projection hidden. + Shared by eval (softcap+projection via forward_logits) and train (fused CE path).""" + x = self.tok_emb(input_ids) + # SmearGate (PR #1667). lam=0 + W=0 -> identity at init. + # Cross-doc leak fix: zero the prev-token smear at any position whose current token + # is BOS, so the BOS embedding starting doc N+1 in a packed stream is not + # contaminated by doc N's last token (audited issue on PR#1797 base). 
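A tiny worked example of that mask on a packed two-document stream; `BOS_ID = 1` matches the value this file assigns at eval setup:

```python
# Cross-doc leak fix: zero the previous-token smear wherever the current token
# is BOS, so doc N+1's opening embedding never mixes in doc N's last token.
import torch

BOS_ID = 1
input_ids = torch.tensor([[1, 5, 7, 1, 9]])     # packed docs [1,5,7] and [1,9]
not_bos = (input_ids[:, 1:] != BOS_ID).float().unsqueeze(-1)
print(not_bos.squeeze(-1))                      # tensor([[1., 1., 0., 1.]])
# The 0 sits exactly at the BOS opening the second doc, so g * x_prev
# contributes nothing there and the smear never crosses the boundary.
```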
+ if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else range(self.num_encoder_layers) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block( + i, lane0, lane1, x0, q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + return x + + def _project_logits(self, hidden): + if self.tie_embeddings: + return F.linear(hidden, self.tok_emb.weight) + return self.lm_head(hidden) + + def forward_logits(self, input_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + return self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + + def forward(self, input_ids, target_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + flat_targets = target_ids.reshape(-1) + # Fused softcapped-CE kernel (training path only). Applies softcap inside the + # Triton kernel; takes pre-softcap logits_proj. Non-fused path matches stock + # PR-1736 numerics exactly (softcap in fp32, then F.cross_entropy on fp32). 
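For reference, the exact numerics of that non-fused path, which any fused kernel must reproduce; the cap value and vocab size below are hypothetical:

```python
# Reference softcapped cross-entropy: bound logits to (-cap, cap) with a scaled
# tanh in fp32, then take standard CE. cap=15.0 and V=10240 are illustrative.
import torch
import torch.nn.functional as F

cap = 15.0
logits_proj = torch.randn(4, 10240)             # pre-softcap projections
targets = torch.randint(0, 10240, (4,))

logits = cap * torch.tanh(logits_proj.float() / cap)
loss = F.cross_entropy(logits, targets, reduction="mean")
```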
+ if self.fused_ce_enabled: + return softcapped_cross_entropy( + logits_proj.reshape(-1, logits_proj.size(-1)), + flat_targets, + self.logit_softcap, + reduction="mean", + ) + logits = self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + return F.cross_entropy( + logits.reshape(-1, logits.size(-1)).float(), + flat_targets, + reduction="mean", + ) + + def forward_ttt(self, input_ids, target_ids, lora): + x = self.tok_emb(input_ids) + # SmearGate on the TTT path — same inline compute as forward_logits. + # Cross-doc leak fix: see _forward_hidden comment. + if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else list(range(self.num_encoder_layers)) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else list( + range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + ) + slot = 0 + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block_with_lora( + i, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + if self.tie_embeddings: + logits = F.linear(x, self.tok_emb.weight) + else: + logits = self.lm_head(x) + logits = logits + lora.lm_head_lora(x) + logits = self.logit_softcap * torch.tanh(logits / self.logit_softcap) + bsz, sl, V = logits.shape + return F.cross_entropy( + logits.float().reshape(-1, V), target_ids.reshape(-1), reduction="none" + ).reshape(bsz, sl) + + def _block_with_lora(self, block, x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w): + mix = block.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + n = block.attn_norm(x_in) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + # Keep raw Q for AttnOutGate src='q' (matches forward path semantics). 
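A one-assert sanity check of the zero-init transparency that the gate comments above rely on (sizes illustrative):

```python
# With a zero-init projection, AttnOutGate is exactly the identity:
# 2 * sigmoid(0) = 1 for every head and position.
import torch

gate_window, num_heads = 12, 8
W = torch.zeros(num_heads, gate_window)
gate_in = torch.randn(3, 16, gate_window)
g = 2.0 * torch.sigmoid(gate_in @ W.T)
assert torch.allclose(g, torch.ones_like(g))
```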
+ q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT path) — inline + .contiguous() barrier, same as the eval path. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT path). Gate input is n (post-norm block input), same + # as eval path. .to(n.dtype) on fp32 param before bf16 broadcast. + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT path) — must match the eval path in + # forward() exactly, else training (which applied the gate) and TTT eval (which + # skipped it) produce mismatched representations and catastrophic BPB regression. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + x_out = x_in + block.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + mlp_n = block.mlp_norm(x_out) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + x_out = x_out + block.mlp_scale.to(dtype=x_out.dtype)[None, None, :] * mlp_out + return x_out + + def _parallel_block_with_lora( + self, block_idx, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + n = block.attn_norm(attn_read) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT 
parallel path) — inline + .contiguous() barrier. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT parallel path). Gate input is n (post-norm block input). + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT parallel path) — must match the + # eval path in forward() to keep train/eval semantics in sync. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_n = block.mlp_norm(mlp_read) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * mlp_out + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + +class BatchedLinearLoRA(nn.Module): + # PR-1767: rank-scaled output (alpha/rank), like standard LoRA. Decouples + # effective magnitude from rank so changing rank does not change LR scale. + _ALPHA = float(os.environ.get("TTT_LORA_ALPHA", "144")) + # PR-1767: optionally keep A warm across per-doc resets (only B is zeroed). + # Accumulates useful feature directions across documents within a TTT phase. 
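A compact sketch of the rank-scaled batched-LoRA delta and of what warm-start preserves; all sizes are illustrative, and alpha=144 matches the `TTT_LORA_ALPHA` default above:

```python
# Rank-scaled batched LoRA: delta = (x @ A^T) @ B^T * (alpha / rank). With alpha
# fixed, doubling the rank halves the per-direction scale, so the LR does not
# need retuning when rank changes. Zero-init B makes the delta start at exactly
# zero; a warm-start reset zeroes only B, so directions accumulated in A
# survive the per-doc reset while the delta still restarts from zero.
import torch

bsz, T, d_in, d_out, rank, alpha = 4, 8, 32, 32, 4, 144.0
A = torch.empty(bsz, rank, d_in).uniform_(-d_in ** -0.5, d_in ** -0.5)
B = torch.zeros(bsz, d_out, rank)
x = torch.randn(bsz, T, d_in)

delta = ((x @ A.transpose(1, 2)) @ B.transpose(1, 2)) * (alpha / rank)
assert torch.equal(delta, torch.zeros_like(delta))
```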
+    _WARM_START_A = bool(int(os.environ.get("TTT_WARM_START_A", "1")))
+
+    def __init__(self, bsz, in_features, out_features, rank):
+        super().__init__()
+        self._bound = 1.0 / math.sqrt(in_features)
+        self._scale = self._ALPHA / rank
+        self.A = nn.Parameter(
+            torch.empty(bsz, rank, in_features).uniform_(-self._bound, self._bound)
+        )
+        self.B = nn.Parameter(torch.zeros(bsz, out_features, rank))
+
+    def reset(self):
+        with torch.no_grad():
+            if not self._WARM_START_A:
+                self.A.uniform_(-self._bound, self._bound)
+            self.B.zero_()
+
+    def forward(self, x):
+        return ((x @ self.A.transpose(1, 2)) @ self.B.transpose(1, 2)) * self._scale
+
+
+class BatchedTTTLoRA(nn.Module):
+    def __init__(self, bsz, model, rank, k_lora=True, mlp_lora=True, o_lora=True):
+        super().__init__()
+        self.bsz = bsz
+        dim = model.qo_bank.shape[-1]
+        vocab = model.tok_emb.num_embeddings
+        if getattr(model, "looping_active", False):
+            num_slots = len(model.encoder_indices) + len(model.decoder_indices)
+        else:
+            num_slots = len(model.blocks)
+        kv_dim = model.blocks[0].attn.num_kv_heads * (
+            dim // model.blocks[0].attn.num_heads
+        )
+        embed_dim = model.tok_emb.embedding_dim
+        self.lm_head_lora = BatchedLinearLoRA(bsz, embed_dim, vocab, rank)
+        self.q_loras = nn.ModuleList(
+            [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)]
+        )
+        self.v_loras = nn.ModuleList(
+            [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)]
+        )
+        self.k_loras = (
+            nn.ModuleList(
+                [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)]
+            )
+            if k_lora
+            else None
+        )
+        self.mlp_loras = (
+            nn.ModuleList(
+                [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)]
+            )
+            if mlp_lora
+            else None
+        )
+        self.o_loras = (
+            nn.ModuleList(
+                [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)]
+            )
+            if o_lora
+            else None
+        )
+
+    def reset(self):
+        with torch.no_grad():
+            self.lm_head_lora.reset()
+            for loras in [self.q_loras, self.v_loras, self.k_loras,
+                          self.mlp_loras, self.o_loras]:
+                if loras is not None:
+                    for lora in loras:
+                        lora.reset()
+
+
+# Polar Express per-iteration minimax Newton-Schulz coefficients (PR #1344).
+# Replaces the fixed (3.4445, -4.775, 2.0315) coefficients of stock Muon.
+# Applied at backend_steps=5. Asking for more than 5 steps does not repeat the
+# final tuple: the slice guard below just runs all 5 tuples and stops there.
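How one coefficient tuple is consumed, checked on a toy matrix; the stock tuple is used here since the Polar Express table follows right below, and nothing in this sketch is the submission's exact schedule:

```python
# One quintic Newton-Schulz step: X <- a*X + (b*A + c*A@A) @ X with A = X @ X^T,
# pushing the singular values of the normalized update toward 1.
import torch

X = torch.randn(64, 48, dtype=torch.float64)
X = X / X.norm()                                   # spectral norm now < 1
for a, b, c in [(3.4445, -4.775, 2.0315)] * 12:    # stock tuple, repeated
    A = X @ X.mT
    X = a * X + (b * A + c * (A @ A)) @ X
s = torch.linalg.svdvals(X)
print(s.min().item(), s.max().item())              # both land in a band near 1
```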
+_PE_COEFFS = ( + (8.156554524902461, -22.48329292557795, 15.878769915207462), + (4.042929935166739, -2.808917465908714, 0.5000178451051316), + (3.8916678022926607, -2.772484153217685, 0.5060648178503393), + (3.285753657755655, -2.3681294933425376, 0.46449024233003106), + (2.3465413258596377, -1.7097828382687081, 0.42323551169305323), +) + + +@torch.compile +def zeropower_via_newtonschulz5(G, steps=10, eps=1e-07): + was_2d = G.ndim == 2 + if was_2d: + G = G.unsqueeze(0) + X = G.bfloat16() + transposed = X.size(-2) > X.size(-1) + if transposed: + X = X.mT + X = X / (X.norm(dim=(-2, -1), keepdim=True) + eps) + coeffs = _PE_COEFFS[:steps] if steps <= len(_PE_COEFFS) else _PE_COEFFS + for a, b, c in coeffs: + A = X @ X.mT + B = b * A + c * (A @ A) + X = a * X + B @ X + if transposed: + X = X.mT + if was_2d: + X = X.squeeze(0) + return X + + +class Muon(torch.optim.Optimizer): + def __init__( + self, + params, + lr, + momentum, + backend_steps, + nesterov=True, + weight_decay=0.0, + row_normalize=False, + ): + super().__init__( + params, + dict( + lr=lr, + momentum=momentum, + backend_steps=backend_steps, + nesterov=nesterov, + weight_decay=weight_decay, + row_normalize=row_normalize, + ), + ) + self._built = False + + def _build(self): + self._distributed = dist.is_available() and dist.is_initialized() + self._world_size = dist.get_world_size() if self._distributed else 1 + self._rank = dist.get_rank() if self._distributed else 0 + ws = self._world_size + self._bank_meta = [] + for group in self.param_groups: + for p in group["params"]: + B = p.shape[0] + padded_B = ((B + ws - 1) // ws) * ws + shard_B = padded_B // ws + tail = p.shape[1:] + dev = p.device + self._bank_meta.append({ + "p": p, + "B": B, + "padded_grad": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "shard": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "shard_mom": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "full_update": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "scale": max(1, p.shape[-2] / p.shape[-1]) ** 0.5, + }) + self._bank_meta.sort(key=lambda m: -m["p"].numel()) + self._built = True + + def launch_reduce_scatters(self): + if not self._built: + self._build() + if not self._distributed: + return + self._rs_futures = [] + for m in self._bank_meta: + p = m["p"] + if p.grad is None: + self._rs_futures.append(None) + continue + pg = m["padded_grad"] + pg[: m["B"]].copy_(p.grad) + fut = dist.reduce_scatter_tensor( + m["shard"], pg, op=dist.ReduceOp.AVG, async_op=True + ) + self._rs_futures.append(fut) + + @torch.no_grad() + def step(self, closure=None): + loss = None + if closure is not None: + with torch.enable_grad(): + loss = closure() + if not self._built: + self._build() + for group in self.param_groups: + lr = group["lr"] + momentum = group["momentum"] + backend_steps = group["backend_steps"] + nesterov = group["nesterov"] + wd = group.get("weight_decay", 0.0) + row_normalize = group.get("row_normalize", False) + prev_ag_handle = None + prev_m = None + sharded = self._distributed and hasattr(self, "_rs_futures") + for idx, m in enumerate(self._bank_meta): + p = m["p"] + if p.grad is None: + continue + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if sharded and self._rs_futures[idx] is not None: + self._rs_futures[idx].wait() + g = m["shard"] + buf = m["shard_mom"] + else: + g 
= p.grad.bfloat16() + state = self.state[p] + if "momentum_buffer" not in state: + state["momentum_buffer"] = torch.zeros_like(g) + buf = state["momentum_buffer"] + buf.mul_(momentum).add_(g) + if nesterov: + update = g.add(buf, alpha=momentum) + else: + update = buf + if row_normalize: + rn = update.float().norm(dim=-1, keepdim=True).clamp_min(1e-07) + update = update / rn.to(update.dtype) + update = zeropower_via_newtonschulz5(update, steps=backend_steps) + if sharded: + prev_ag_handle = dist.all_gather_into_tensor( + m["full_update"], update, async_op=True + ) + prev_m = m + else: + if wd > 0.0: + p.data.mul_(1.0 - lr * wd) + p.add_(update, alpha=-lr * m["scale"]) + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if hasattr(self, "_rs_futures"): + del self._rs_futures + return loss + + +CONTROL_TENSOR_NAME_PATTERNS = tuple( + pattern + for pattern in os.environ.get( + "CONTROL_TENSOR_NAME_PATTERNS", + "attn_scale,attn_scales,mlp_scale,mlp_scales,resid_mix,resid_mixes,q_gain,skip_weight,skip_weights,skip_gates,parallel_post_lambdas,parallel_resid_lambdas,attn_gate_proj,attn_gate_w,smear_gate,smear_lambda", + ).split(",") + if pattern +) + + +PACKED_REPLICATED_GRAD_MAX_NUMEL = 1 << 15 + + +class Optimizers: + def __init__(self, h, base_model): + matrix_params = [ + base_model.qo_bank, + base_model.kv_bank, + base_model.mlp_up_bank, + base_model.mlp_down_bank, + ] + block_named_params = list(base_model.blocks.named_parameters()) + scalar_params = [ + p + for (name, p) in block_named_params + if p.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ] + if base_model.skip_weights.numel() > 0: + scalar_params.append(base_model.skip_weights) + if base_model.skip_gates is not None and base_model.skip_gates.numel() > 0: + scalar_params.append(base_model.skip_gates) + if base_model.parallel_post_lambdas is not None: + scalar_params.append(base_model.parallel_post_lambdas) + if base_model.parallel_resid_lambdas is not None: + scalar_params.append(base_model.parallel_resid_lambdas) + # SmearGate params live on GPT root (not in .blocks), so add them by hand. + # Both are tiny (gate_window scalars + 1 lambda). Optimized via scalar Adam. 
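A compressed sketch of the routing predicate this constructor applies, with a shortened pattern list and made-up (name, ndim) stand-ins:

```python
# Control-tensor routing: a parameter lands in the scalar AdamW group when it is
# low-dimensional (ndim < 2) or its name contains a control pattern; everything
# else stays with the Muon matrix banks. Pattern list abbreviated here.
patterns = ("attn_scale", "resid_mix", "q_gain", "attn_gate", "smear_gate")
named = [("attn.q_gain", 1), ("attn.attn_gate_w", 2),
         ("resid_mix", 2), ("qo_bank", 3)]          # (name, ndim) stand-ins

scalar_group = [n for n, nd in named if nd < 2 or any(p in n for p in patterns)]
print(scalar_group)   # ['attn.q_gain', 'attn.attn_gate_w', 'resid_mix']
```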
+ if getattr(base_model, "smear_gate_enabled", False): + scalar_params.append(base_model.smear_gate.weight) + scalar_params.append(base_model.smear_lambda) + token_lr = h.tied_embed_lr if h.tie_embeddings else h.embed_lr + tok_params = [ + {"params": [base_model.tok_emb.weight], "lr": token_lr, "base_lr": token_lr} + ] + self.optimizer_tok = torch.optim.AdamW( + tok_params, + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.embed_wd, + fused=True, + ) + self.optimizer_muon = Muon( + matrix_params, + lr=h.matrix_lr, + momentum=h.muon_momentum, + backend_steps=h.muon_backend_steps, + weight_decay=h.muon_wd, + row_normalize=h.muon_row_normalize, + ) + for group in self.optimizer_muon.param_groups: + group["base_lr"] = h.matrix_lr + self.optimizer_scalar = torch.optim.AdamW( + [{"params": scalar_params, "lr": h.scalar_lr, "base_lr": h.scalar_lr}], + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.adam_wd, + fused=True, + ) + self.optimizers = [ + self.optimizer_tok, + self.optimizer_muon, + self.optimizer_scalar, + ] + self.replicated_params = list(tok_params[0]["params"]) + self.replicated_params.extend(scalar_params) + self.replicated_large_params = [] + self.replicated_packed_params = [] + for p in self.replicated_params: + if p.numel() <= PACKED_REPLICATED_GRAD_MAX_NUMEL: + self.replicated_packed_params.append(p) + else: + self.replicated_large_params.append(p) + self._aux_stream = torch.cuda.Stream() + + def __iter__(self): + return iter(self.optimizers) + + def zero_grad_all(self): + for opt in self.optimizers: + opt.zero_grad(set_to_none=True) + + def _all_reduce_packed_grads(self): + grads_by_key = collections.defaultdict(list) + for p in self.replicated_packed_params: + if p.grad is not None: + grads_by_key[(p.grad.device, p.grad.dtype)].append(p.grad) + for grads in grads_by_key.values(): + flat = torch.empty( + sum(g.numel() for g in grads), + device=grads[0].device, + dtype=grads[0].dtype, + ) + offset = 0 + for g in grads: + n = g.numel() + flat[offset : offset + n].copy_(g.contiguous().view(-1)) + offset += n + dist.all_reduce(flat, op=dist.ReduceOp.AVG) + offset = 0 + for g in grads: + n = g.numel() + g.copy_(flat[offset : offset + n].view_as(g)) + offset += n + + def step(self, distributed=False): + self.optimizer_muon.launch_reduce_scatters() + if distributed: + reduce_handles = [ + dist.all_reduce(p.grad, op=dist.ReduceOp.AVG, async_op=True) + for p in self.replicated_large_params + if p.grad is not None + ] + self._all_reduce_packed_grads() + for handle in reduce_handles: + handle.wait() + self._aux_stream.wait_stream(torch.cuda.current_stream()) + with torch.cuda.stream(self._aux_stream): + self.optimizer_tok.step() + self.optimizer_scalar.step() + self.optimizer_muon.step() + torch.cuda.current_stream().wait_stream(self._aux_stream) + self.zero_grad_all() + + +def restore_fp32_params(model): + for module in model.modules(): + if isinstance(module, CastedLinear): + module.float() + for name, param in model.named_parameters(): + if ( + param.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ) and param.dtype != torch.float32: + param.data = param.data.float() + if hasattr(model, "qo_bank") and model.qo_bank is not None: + model.qo_bank.data = model.qo_bank.data.float() + model.kv_bank.data = model.kv_bank.data.float() + model.mlp_up_bank.data = model.mlp_up_bank.data.float() + model.mlp_down_bank.data = model.mlp_down_bank.data.float() + + +def collect_hessians(model, train_loader, h, device, n_calibration_batches=64): 
+ hessians = {} + hooks = [] + for i, block in enumerate(model.blocks): + block.attn._calib = True + block.mlp._calib = True + block.mlp.use_fused = False + + def make_attn_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + for suffix in ["c_q", "c_k", "c_v"]: + name = f"blocks.{layer_idx}.attn.{suffix}.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + y = module._last_proj_input + if y is not None: + y = y.float() + if y.ndim == 3: + y = y.reshape(-1, y.shape[-1]) + name = f"blocks.{layer_idx}.attn.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + y.shape[1], y.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(y.T, y) + return hook_fn + + def make_mlp_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + name = f"blocks.{layer_idx}.mlp.fc.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + h_act = module._last_down_input + if h_act is not None: + h_act = h_act.float() + if h_act.ndim == 3: + h_act = h_act.reshape(-1, h_act.shape[-1]) + name = f"blocks.{layer_idx}.mlp.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + h_act.shape[1], h_act.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(h_act.T, h_act) + return hook_fn + + for i, block in enumerate(model.blocks): + hooks.append(block.attn.register_forward_hook(make_attn_hook(i))) + hooks.append(block.mlp.register_forward_hook(make_mlp_hook(i))) + + # Hessian hooks for embedding factorization projection layers + def make_linear_input_hook(weight_name): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if weight_name not in hessians: + hessians[weight_name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[weight_name].addmm_(x.T, x) + return hook_fn + + if model.tie_embeddings: + hook_module = model.final_norm + + def make_output_hook(name): + def hook_fn(module, inp, out): + x = out.detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + return hook_fn + + hooks.append( + hook_module.register_forward_hook(make_output_hook("tok_emb.weight")) + ) + model.eval() + with torch.no_grad(): + for _ in range(n_calibration_batches): + x, _ = train_loader.next_batch(h.train_batch_tokens, h.grad_accum_steps) + model.forward_logits(x) + for hook in hooks: + hook.remove() + for i, block in enumerate(model.blocks): + block.attn._calib = False + block.mlp._calib = False + block.mlp.use_fused = True + for name in hessians: + hessians[name] = hessians[name].cpu() / n_calibration_batches + return hessians + + +def gptq_quantize_weight(w, H, clip_sigmas=3.0, clip_range=63, block_size=128): + W_orig = w.float().clone() + rows, cols = W_orig.shape + H = H.float().clone() + dead = torch.diag(H) == 0 + H[dead, dead] = 1 + damp = 0.01 * H.diag().mean() + H.diagonal().add_(damp) + perm = torch.argsort(H.diag(), descending=True) + invperm = torch.argsort(perm) + W_perm = W_orig[:, perm].clone() + W_perm[:, dead[perm]] 
= 0 + H = H[perm][:, perm] + Hinv = torch.cholesky_inverse(torch.linalg.cholesky(H)) + Hinv = torch.linalg.cholesky(Hinv, upper=True) + row_std = W_orig.std(dim=1) + s = (clip_sigmas * row_std / clip_range).clamp_min(1e-10).to(torch.float16) + sf = s.float() + Q = torch.zeros(rows, cols, dtype=torch.int8) + W_work = W_perm.clone() + for i1 in range(0, cols, block_size): + i2 = min(i1 + block_size, cols) + W_block = W_work[:, i1:i2].clone() + Hinv_block = Hinv[i1:i2, i1:i2] + Err = torch.zeros(rows, i2 - i1) + for j in range(i2 - i1): + w_col = W_block[:, j] + d = Hinv_block[j, j] + q_col = torch.clamp(torch.round(w_col / sf), -clip_range, clip_range) + Q[:, i1 + j] = q_col.to(torch.int8) + err = (w_col - q_col.float() * sf) / d + Err[:, j] = err + W_block[:, j:] -= err.unsqueeze(1) * Hinv_block[j, j:].unsqueeze(0) + if i2 < cols: + W_work[:, i2:] -= Err @ Hinv[i1:i2, i2:] + return Q[:, invperm], s + + +def _quantize_gate_int8_row(w): + # Symmetric int8-per-row quantization for small gate tensors. w shape + # (R, C) -> (R,) scales in fp16, int8 values in [-127, 127]. Single scale + # per row keeps accuracy high while halving storage vs fp16. + W = w.float().contiguous() + row_max = W.abs().amax(dim=1).clamp_min(1e-10) + s = (row_max / 127.0).to(torch.float16) + sf = s.float().view(-1, 1) + q = torch.clamp(torch.round(W / sf), -127, 127).to(torch.int8) + return q, s + + +def _lqer_pack(A, B, bits): + rng = 2 ** (bits - 1) - 1 + sA = (A.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + sB = (B.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float().view(-1, 1)), -rng, rng).to(torch.int8) + qB = torch.clamp(torch.round(B / sB.float().view(-1, 1)), -rng, rng).to(torch.int8) + return qA, sA, qB, sB + + +def _lqer_pack_asym(A, B, g=64): + # A: INT2 per-matrix scalar (signed [-2,1], scale = |A|max/1.5). + sA = (A.abs().amax().clamp_min(1e-10) / 1.5).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float()), -2, 1).to(torch.int8) + # B: INT4 groupwise g over flattened B (signed [-8,7], per-group scale). + Bf = B.reshape(-1, g) + Bmax = Bf.abs().amax(dim=-1, keepdim=True).clamp_min(1e-10) + sB = (Bmax / 7.5).to(torch.float16).reshape(-1) + qB = torch.clamp(torch.round(Bf / sB.float().reshape(-1, 1)), -8, 7).to( + torch.int8 + ).reshape(B.shape) + return qA, sA, qB, sB + + +def gptq_mixed_quantize(state_dict, hessians, h): + result = {} + meta = {} + quant_gate = bool(getattr(h, "gated_attn_quant_gate", False)) + lqer_on = bool(getattr(h, "lqer_enabled", False)) + lqer_cands = {} + for (name, tensor) in state_dict.items(): + t = tensor.detach().cpu().contiguous() + # Dedicated int8-per-row path for attn_gate_w (bypasses both GPTQ and + # fp16 passthrough). Applied BEFORE the numel<=65536 passthrough check + # so the gate tensor is routed here instead of to fp16. + if ( + quant_gate + and t.is_floating_point() + and t.ndim == 2 + and name.endswith(".attn_gate_w") + # Dense GatedAttn: (num_heads, dim) = (8, 512) = 4096. + # Sparse gate: (num_heads, gate_window) = (8, 12) = 96. + # Both need int8-per-row routing; the 1024 lower bound in stock + # PR-1736 presumed dense-only. Widen to catch both. 
+ and 32 <= t.numel() <= 8192 + ): + gq, gs = _quantize_gate_int8_row(t) + result[name + ".gq"] = gq + result[name + ".gs"] = gs + meta[name] = "gate_int8_row" + continue + if not t.is_floating_point() or t.numel() <= 65536: + result[name] = t.to(torch.float16) if t.is_floating_point() else t + meta[name] = "passthrough (float16)" + continue + if "tok_emb" in name: + cs = h.embed_clip_sigmas + elif ".mlp." in name: + cs = h.mlp_clip_sigmas + elif ".attn." in name: + cs = h.attn_clip_sigmas + else: + cs = h.matrix_clip_sigmas + bits = h.embed_bits if "tok_emb" in name else h.matrix_bits + clip_range = 2 ** (bits - 1) - 1 + ret = gptq_quantize_weight( + t, hessians[name], clip_sigmas=cs, clip_range=clip_range + ) + q, s = ret + result[name + ".q"] = q + result[name + ".scale"] = s + meta[name] = f"gptq (int{bits})" + if lqer_on: + W_q = q.float() * s.float().view(-1, 1) + E = t.float() - W_q + lqer_cands[name] = (E, float(E.norm())) + if lqer_on and lqer_cands: + top = sorted(lqer_cands.items(), key=lambda kv: -kv[1][1])[: h.lqer_top_k] + asym_on = bool(getattr(h, "lqer_asym_enabled", False)) + asym_g = int(getattr(h, "lqer_asym_group", 64)) + for (name, (E, _)) in top: + U, S, Vh = torch.linalg.svd(E, full_matrices=False) + r = min(h.lqer_rank, S.numel()) + A = (U[:, :r] * S[:r]).contiguous() + B = Vh[:r, :].contiguous() + if asym_on and B.numel() % asym_g == 0: + qA, sA, qB, sB = _lqer_pack_asym(A, B, asym_g) + result[name + ".lqA_a"] = qA + result[name + ".lqAs_a"] = sA + result[name + ".lqB_a"] = qB + result[name + ".lqBs_a"] = sB + meta[name] = meta[name] + "+lqer_asym" + else: + qA, sA, qB, sB = _lqer_pack(A, B, h.lqer_factor_bits) + result[name + ".lqA"] = qA + result[name + ".lqAs"] = sA + result[name + ".lqB"] = qB + result[name + ".lqBs"] = sB + meta[name] = meta[name] + "+lqer" + categories = collections.defaultdict(set) + for (name, cat) in meta.items(): + short = re.sub("\\.\\d+$", "", re.sub("blocks\\.\\d+", "blocks", name)) + categories[cat].add(short) + log("Quantized weights:") + for cat in sorted(categories): + log(f" {cat}: {', '.join(sorted(categories[cat]))}") + return result, meta + +def dequantize_mixed(result, meta, template_sd): + out = {} + for (name, orig) in template_sd.items(): + info = meta.get(name) + if info is None: + continue + orig_dtype = orig.dtype + if "passthrough" in info: + t = result[name] + if t.dtype == torch.float16 and orig_dtype in ( + torch.float32, + torch.bfloat16, + ): + t = t.to(orig_dtype) + out[name] = t + continue + if info == "gate_int8_row": + gq = result[name + ".gq"] + gs = result[name + ".gs"] + out[name] = (gq.float() * gs.float().view(-1, 1)).to(orig_dtype) + continue + q, s = result[name + ".q"], result[name + ".scale"] + if s.ndim > 0: + W = q.float() * s.float().view(q.shape[0], *[1] * (q.ndim - 1)) + else: + W = q.float() * float(s.item()) + if "lqer_asym" in info: + qA_t = result[name + ".lqA_a"] + sA_t = result[name + ".lqAs_a"] + qB_t = result[name + ".lqB_a"] + sB_t = result[name + ".lqBs_a"] + qA = qA_t.float() * float(sA_t) + g_sz = qB_t.numel() // sB_t.numel() + qB = (qB_t.reshape(-1, g_sz).float() * sB_t.float().view(-1, 1)).reshape( + qB_t.shape + ) + W = W + qA @ qB + elif "lqer" in info: + qA = result[name + ".lqA"].float() * result[name + ".lqAs"].float().view(-1, 1) + qB = result[name + ".lqB"].float() * result[name + ".lqBs"].float().view(-1, 1) + W = W + qA @ qB + out[name] = W.to(orig_dtype) + return out + + +_BSHF_MAGIC = b"BSHF" + + +# ── Per-group lrzip compression (ported from PR#1586 via PR#1667/1729) 
────────

+_GROUP_ORDER = [
+    "_tok_emb.weight.q",
+    "attn.c_k.weight.q", "attn.c_q.weight.q",
+    "attn.c_v.weight.q", "attn.proj.weight.q",
+    "mlp.fc.weight.q", "mlp.proj.weight.q",
+]
+_SIMSORT_KEYS = {"_tok_emb.weight.q", "attn.c_q.weight.q", "mlp.fc.weight.q"}
+_PACK_MAGIC = b"PGRP"
+
+
+def _similarity_sort_l1(matrix):
+    import numpy as _np
+    n = matrix.shape[0]
+    used = _np.zeros(n, dtype=bool)
+    order = [0]
+    used[0] = True
+    cur = matrix[0].astype(_np.float32)
+    for _ in range(n - 1):
+        dists = _np.sum(_np.abs(matrix[~used].astype(_np.float32) - cur), axis=1)
+        unused = _np.where(~used)[0]
+        best = unused[_np.argmin(dists)]
+        order.append(best)
+        used[best] = True
+        cur = matrix[best].astype(_np.float32)
+    return _np.array(order, dtype=_np.uint16)
+
+
+def _lrzip_compress(data, tmpdir, label):
+    inp = os.path.join(tmpdir, f"{label}.bin")
+    out = f"{inp}.lrz"
+    with open(inp, "wb") as f:
+        f.write(data)
+    subprocess.run(["lrzip", "-z", "-L", "9", "-o", out, inp], capture_output=True, check=True)
+    with open(out, "rb") as f:
+        result = f.read()
+    os.remove(inp); os.remove(out)
+    return result
+
+
+def _lrzip_decompress(data, tmpdir, label):
+    inp = os.path.join(tmpdir, f"{label}.lrz")
+    out = os.path.join(tmpdir, f"{label}.bin")
+    with open(inp, "wb") as f:
+        f.write(data)
+    subprocess.run(["lrzip", "-d", "-f", "-o", out, inp], capture_output=True, check=True)
+    with open(out, "rb") as f:
+        result = f.read()
+    os.remove(inp); os.remove(out)
+    return result
+
+
+def _pack_streams(streams):
+    import struct
+    n = len(streams)
+    hdr = _PACK_MAGIC + struct.pack("
+    # [extraction gap: the pack format string, the rest of _pack_streams, its
+    #  unpack counterpart, and the per-group compress/decompress drivers were
+    #  lost from this hunk; the text resumes inside _find_docs below]
+
+
+def _find_docs(all_tokens):
+    # [function head lost in the same gap; the surviving tail keeps documents
+    #  of at least 2 tokens as (start, length) pairs]
+        if end - start >= 2:
+            docs.append((start, end - start))
+    return docs
+
+
+def _build_ttt_global_batches(doc_entries, h, ascending=False):
+    batch_size = h.ttt_batch_size
+    global_doc_entries = sorted(doc_entries, key=lambda x: x[1][1])
+    global_batches = [
+        global_doc_entries[i : i + batch_size]
+        for i in range(0, len(global_doc_entries), batch_size)
+    ]
+    indexed = list(enumerate(global_batches))
+    if not ascending:
+        indexed.sort(key=lambda ib: -max(dl for _, (_, dl) in ib[1]))
+    return indexed
+
+
+def _init_batch_counter(path):
+    with open(path, "wb") as f:
+        f.write((0).to_bytes(4, "little"))
+
+
+def _claim_next_batch(counter_path, queue_len):
+    try:
+        with open(counter_path, "r+b") as f:
+            fcntl.flock(f, fcntl.LOCK_EX)
+            idx = int.from_bytes(f.read(4), "little")
+            f.seek(0)
+            f.write((idx + 1).to_bytes(4, "little"))
+            f.flush()
+    except FileNotFoundError:
+        return queue_len
+    return idx
+
+
+def _compute_chunk_window(ci, pred_len, num_chunks, chunk_size, eval_seq_len):
+    chunk_end = pred_len if ci == num_chunks - 1 else (ci + 1) * chunk_size
+    win_start = max(0, chunk_end - eval_seq_len)
+    win_len = chunk_end - win_start
+    chunk_start = ci * chunk_size
+    chunk_offset = chunk_start - win_start
+    chunk_len = chunk_end - chunk_start
+    return win_start, win_len, chunk_offset, chunk_len
+
+
+def _accumulate_bpb(
+    ptl,
+    x,
+    y,
+    chunk_offsets,
+    chunk_lens,
+    pos_idx,
+    base_bytes_lut,
+    has_leading_space_lut,
+    is_boundary_token_lut,
+    loss_sum,
+    byte_sum,
+    token_count,
+    y_bytes=None,
+):
+    pos = pos_idx[: x.size(1)].unsqueeze(0)
+    mask = (
+        (chunk_lens.unsqueeze(1) > 0)
+        & (pos >= chunk_offsets.unsqueeze(1))
+        & (pos < (chunk_offsets + chunk_lens).unsqueeze(1))
+    )
+    mask_f64 = mask.to(torch.float64)
+    if y_bytes is not None:
+        tok_bytes = y_bytes.to(torch.float64)
+    else:
+        tok_bytes = base_bytes_lut[y].to(torch.float64)
+        tok_bytes += (has_leading_space_lut[y] & ~is_boundary_token_lut[x]).to(
+            torch.float64
+        )
+    loss_sum += 
(ptl.to(torch.float64) * mask_f64).sum() + byte_sum += (tok_bytes * mask_f64).sum() + token_count += chunk_lens.to(torch.float64).sum() + + +def _loss_bpb_from_sums(loss_sum, token_count, byte_sum): + val_loss = (loss_sum / token_count).item() + val_bpb = val_loss / math.log(2.0) * (token_count.item() / byte_sum.item()) + return val_loss, val_bpb + + +def _add_to_counter(path, delta): + try: + with open(path, "r+b") as f: + fcntl.flock(f, fcntl.LOCK_EX) + cur = int.from_bytes(f.read(8), "little", signed=True) + cur += int(delta) + f.seek(0) + f.write(int(cur).to_bytes(8, "little", signed=True)) + f.flush() + return cur + except FileNotFoundError: + return int(delta) + + +def _init_int64_counter(path): + with open(path, "wb") as f: + f.write((0).to_bytes(8, "little", signed=True)) + + +def _select_ttt_doc_entries(docs, h): + doc_entries = list(enumerate(docs)) + if h.val_doc_fraction < 1.0: + sample_n = max(1, int(round(len(docs) * h.val_doc_fraction))) + sampled_indices = sorted( + random.Random(h.seed).sample(range(len(docs)), sample_n) + ) + return [(i, docs[i]) for i in sampled_indices] + return doc_entries + + +def train_val_ttt_global_sgd_distributed(h, device, val_data, base_model, val_tokens, batch_seqs=None): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + seq_len = h.eval_seq_len + total_tokens = val_tokens.numel() - 1 + ttt_chunk = h.global_ttt_chunk_tokens + batch_seqs = h.global_ttt_batch_seqs if batch_seqs is None else batch_seqs + num_chunks = (total_tokens + ttt_chunk - 1) // ttt_chunk + ttt_params = [p for p in base_model.parameters()] + for p in ttt_params: + p.requires_grad_(True) + optimizer = torch.optim.SGD( + ttt_params, lr=h.global_ttt_lr, momentum=h.global_ttt_momentum + ) + t_start = time.perf_counter() + for ci in range(num_chunks): + chunk_start = ci * ttt_chunk + chunk_end = min((ci + 1) * ttt_chunk, total_tokens) + is_last_chunk = ci == num_chunks - 1 + if is_last_chunk or h.global_ttt_epochs <= 0: + continue + base_model.train() + chunk_seqs = (chunk_end - chunk_start) // seq_len + if chunk_seqs <= 0: + continue + warmup_chunks = max(0, min(h.global_ttt_warmup_chunks, num_chunks - 1)) + if warmup_chunks > 0 and ci < warmup_chunks: + warmup_denom = max(warmup_chunks - 1, 1) + warmup_t = ci / warmup_denom + lr_now = ( + h.global_ttt_warmup_start_lr + + (h.global_ttt_lr - h.global_ttt_warmup_start_lr) * warmup_t + ) + else: + decay_steps = max(num_chunks - 1 - warmup_chunks, 1) + decay_ci = max(ci - warmup_chunks, 0) + lr_now = h.global_ttt_lr * 0.5 * ( + 1.0 + math.cos(math.pi * decay_ci / decay_steps) + ) + for pg in optimizer.param_groups: + pg["lr"] = lr_now + my_seq_s = chunk_seqs * h.rank // h.world_size + my_seq_e = chunk_seqs * (h.rank + 1) // h.world_size + my_chunk_seqs = my_seq_e - my_seq_s + for _ in range(h.global_ttt_epochs): + for bs in range(0, my_chunk_seqs, batch_seqs): + be = min(bs + batch_seqs, my_chunk_seqs) + actual_bs = my_seq_s + bs + start_tok = chunk_start + actual_bs * seq_len + end_tok = chunk_start + (my_seq_s + be) * seq_len + 1 + if end_tok > val_tokens.numel(): + continue + local = val_tokens[start_tok:end_tok].to(device=device, dtype=torch.int64) + x_flat = local[:-1] + y_flat = local[1:] + optimizer.zero_grad(set_to_none=True) + with torch.enable_grad(): + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + if h.global_ttt_respect_doc_boundaries: + bos_pos = (x_flat == BOS_ID).nonzero(as_tuple=True)[0].tolist() + cu_seqlens, max_seqlen = _build_cu_seqlens( + bos_pos, x_flat.numel(), 
x_flat.device, h.eval_seq_len, 64 + ) + loss = base_model( + x_flat[None], + y_flat[None], + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + else: + x = x_flat.reshape(-1, seq_len) + y = y_flat.reshape(-1, seq_len) + loss = base_model(x, y) + loss.backward() + if dist.is_available() and dist.is_initialized(): + for p in ttt_params: + if p.grad is not None: + dist.all_reduce(p.grad, op=dist.ReduceOp.SUM) + p.grad.mul_(1.0 / h.world_size) + if h.global_ttt_grad_clip > 0: + torch.nn.utils.clip_grad_norm_(ttt_params, h.global_ttt_grad_clip) + optimizer.step() + base_model.eval() + if h.rank == 0: + elapsed = time.perf_counter() - t_start + log( + f"tttg: c{ci+1}/{num_chunks} lr:{lr_now:.6f} t:{elapsed:.1f}s" + ) + for p in base_model.parameters(): + p.requires_grad_(True) + base_model.eval() + + +def eval_val_ttt_phased(h, base_model, device, val_data, forward_ttt_train): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + for p in base_model.parameters(): + p.requires_grad_(False) + all_tokens = val_data.val_tokens + all_tokens_idx = all_tokens.to(torch.int32) + docs = _find_docs(all_tokens) + doc_entries = _select_ttt_doc_entries(docs, h) + prefix_doc_limit = max(0, min(len(doc_entries), int(h.phased_ttt_prefix_docs))) + num_phases = max(1, int(h.phased_ttt_num_phases)) + phase_boundaries = [] + for pi in range(num_phases): + boundary = prefix_doc_limit * (pi + 1) // num_phases + phase_boundaries.append(boundary) + current_phase = 0 + current_phase_boundary = phase_boundaries[0] + log( + "ttt_phased:" + f" total_docs:{len(doc_entries)} prefix_docs:{prefix_doc_limit} " + f"suffix_docs:{len(doc_entries) - prefix_doc_limit}" + f" num_phases:{num_phases} boundaries:{phase_boundaries}" + ) + chunk_size, eval_seq_len = h.ttt_chunk_size, h.ttt_eval_seq_len + eval_batch_set = None + if h.ttt_eval_batches: + eval_batch_set = set(int(x) for x in h.ttt_eval_batches.split(",") if x.strip()) + use_ascending = eval_batch_set is not None + global_batches_sorted = _build_ttt_global_batches( + doc_entries, h, ascending=use_ascending + ) + queue_len = len(global_batches_sorted) + counter_path = f"/tmp/ttt_counter_{h.run_id}" + prefix_counter_path = f"/tmp/ttt_prefix_counter_{h.run_id}" + pause_flag_path = f"/tmp/ttt_pause_flag_{h.run_id}" + if h.rank == 0: + _init_batch_counter(counter_path) + _init_int64_counter(prefix_counter_path) + try: + os.remove(pause_flag_path) + except FileNotFoundError: + pass + if dist.is_available() and dist.is_initialized(): + path_list = [counter_path, prefix_counter_path, pause_flag_path] + dist.broadcast_object_list(path_list, src=0) + counter_path, prefix_counter_path, pause_flag_path = path_list + dist.barrier() + loss_sum = torch.zeros((), device=device, dtype=torch.float64) + byte_sum = torch.zeros((), device=device, dtype=torch.float64) + token_count = torch.zeros((), device=device, dtype=torch.float64) + t_start = time.perf_counter() + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + + def _build_opt(lora): + if h.ttt_optimizer == "sgd": + return torch.optim.SGD( + lora.parameters(), lr=h.ttt_lora_lr, + momentum=h.ttt_beta1, weight_decay=h.ttt_weight_decay, + ) + return torch.optim.AdamW( + lora.parameters(), lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, weight_decay=h.ttt_weight_decay, fused=True, + ) + + reusable_opt = _build_opt(reusable_lora) + local_scored_docs = [] + global_ttt_done = prefix_doc_limit 
== 0
+    try:
+        while True:
+            queue_idx = _claim_next_batch(counter_path, queue_len)
+            if queue_idx >= queue_len:
+                break
+            orig_batch_idx, batch_entries = global_batches_sorted[queue_idx]
+            batch = [doc for _, doc in batch_entries]
+            bsz = len(batch)
+            prev_loss = loss_sum.item()
+            prev_bytes = byte_sum.item()
+            prev_tokens = token_count.item()
+            if bsz == reusable_lora.bsz:
+                reusable_lora.reset()
+                for s in reusable_opt.state.values():
+                    for k, v in s.items():
+                        if isinstance(v, torch.Tensor):
+                            v.zero_()
+                        elif k == "step":
+                            s[k] = 0
+                cur_lora = reusable_lora
+                cur_opt = reusable_opt
+            else:
+                cur_lora = BatchedTTTLoRA(
+                    bsz, base_model, h.ttt_lora_rank,
+                    k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora,
+                ).to(device)
+                cur_opt = _build_opt(cur_lora)
+            pred_lens = [doc_len - 1 for _, doc_len in batch]
+            num_chunks = [(pl + chunk_size - 1) // chunk_size for pl in pred_lens]
+            max_nc = max(num_chunks)
+            num_chunks_t = torch.tensor(num_chunks, dtype=torch.int64, device=device)
+            for ci in range(max_nc):
+                active = [ci < nc for nc in num_chunks]
+                needs_train = any(ci < nc - 1 for nc in num_chunks)
+                tok_starts = torch.zeros(bsz, dtype=torch.int64)
+                tok_wls = torch.zeros(bsz, dtype=torch.int64)
+                chunk_offsets_cpu = torch.zeros(bsz, dtype=torch.int64)
+                chunk_lens_cpu = torch.zeros(bsz, dtype=torch.int64)
+                for b in range(bsz):
+                    if not active[b]:
+                        continue
+                    doc_start, doc_len = batch[b]
+                    win_start, win_len, chunk_offset, chunk_len = _compute_chunk_window(
+                        ci, pred_lens[b], num_chunks[b], chunk_size, eval_seq_len
+                    )
+                    tok_starts[b] = doc_start + win_start
+                    tok_wls[b] = win_len
+                    chunk_offsets_cpu[b] = chunk_offset
+                    chunk_lens_cpu[b] = chunk_len
+                _, context_size, chunk_offset, _ = _compute_chunk_window(
+                    ci, (ci + 1) * chunk_size, ci + 1, chunk_size, eval_seq_len
+                )
+                col_idx = torch.arange(context_size + 1)
+                idx = tok_starts.unsqueeze(1) + col_idx.unsqueeze(0)
+                idx.clamp_(max=all_tokens.numel() - 1)
+                gathered_gpu = all_tokens_idx[idx].to(
+                    device=device, dtype=torch.int64, non_blocking=True
+                )
+                valid = (col_idx[:context_size].unsqueeze(0) < tok_wls.unsqueeze(1)).to(
+                    device, non_blocking=True
+                )
+                chunk_offsets = chunk_offsets_cpu.to(device, non_blocking=True)
+                chunk_lens = chunk_lens_cpu.to(device, non_blocking=True)
+                x = torch.where(valid, gathered_gpu[:, :context_size], 0)
+                y = torch.where(valid, gathered_gpu[:, 1 : context_size + 1], 0)
+                ctx_pos = torch.arange(context_size, device=device, dtype=torch.int64)
+                with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
+                    per_tok_loss = forward_ttt_train(x, y, lora=cur_lora)
+                # CaseOps sidecar-driven byte budget. Mirror the index pattern
+                # used to build y from all_tokens: y[b, j] corresponds to the
+                # token at global position tok_starts[b] + 1 + j (when valid).
+                y_bytes_arg = None
+                if val_data.caseops_enabled and val_data.val_bytes is not None:
+                    y_idx = (
+                        tok_starts.unsqueeze(1)
+                        + 1
+                        + col_idx[:context_size].unsqueeze(0)
+                    )
+                    y_idx = y_idx.clamp_(max=val_data.val_bytes.numel() - 1)
+                    y_bytes_arg = val_data.val_bytes[y_idx].to(
+                        device=device, dtype=torch.int32, non_blocking=True
+                    )
+                    # Mirror the `valid` masking used for y so out-of-range
+                    # tokens contribute zero bytes (matches the y=0
+                    # substitution above).
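+                    # (For row b and window position j: y[b, j] is
+                    # all_tokens[tok_starts[b] + 1 + j] and y_bytes_arg[b, j]
+                    # is that token's CaseOps byte cost, so the loss and byte
+                    # budgets stay position-aligned.)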
+ y_bytes_arg = torch.where( + valid, y_bytes_arg, torch.zeros_like(y_bytes_arg) + ) + with torch.no_grad(): + _accumulate_bpb( + per_tok_loss, + x, + y, + chunk_offsets, + chunk_lens, + ctx_pos, + val_data.base_bytes_lut, + val_data.has_leading_space_lut, + val_data.is_boundary_token_lut, + loss_sum, + byte_sum, + token_count, + y_bytes=y_bytes_arg, + ) + if needs_train: + activate_chunk_mask = (num_chunks_t - 1 > ci).float() + for gi in range(h.ttt_grad_steps): + if gi > 0: + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + per_doc = per_tok_loss[ + :, chunk_offset : chunk_offset + chunk_size + ].mean(dim=-1) + cur_opt.zero_grad(set_to_none=True) + (per_doc * activate_chunk_mask).sum().backward() + cur_opt.step() + else: + del per_tok_loss + batch_num = orig_batch_idx + 1 + doc_lens = [dl for _, dl in batch] + should_report = batch_num in eval_batch_set if eval_batch_set is not None else True + if should_report: + cur_tokens = token_count.item() + cur_loss_val = loss_sum.item() + cur_bytes_val = byte_sum.item() + dt = cur_tokens - prev_tokens + db = cur_bytes_val - prev_bytes + if dt > 0 and db > 0: + b_loss = (cur_loss_val - prev_loss) / dt + b_bpb = b_loss / math.log(2.0) * (dt / db) + else: + b_loss = b_bpb = 0.0 + r_loss = cur_loss_val / max(cur_tokens, 1) + r_bpb = r_loss / math.log(2.0) * (cur_tokens / max(cur_bytes_val, 1)) + elapsed = time.perf_counter() - t_start + log( + f"ttp: b{batch_num}/{queue_len} bl:{b_loss:.4f} bb:{b_bpb:.4f} " + f"rl:{r_loss:.4f} rb:{r_bpb:.4f} dl:{min(doc_lens)}-{max(doc_lens)} " + f"gd:{int(global_ttt_done)}" + ) + if not global_ttt_done: + local_scored_docs.extend( + (orig_batch_idx, pos, doc_start, doc_len) + for pos, (doc_start, doc_len) in enumerate(batch) + ) + prefix_done = _add_to_counter(prefix_counter_path, len(batch_entries)) + if prefix_done >= current_phase_boundary: + try: + with open(pause_flag_path, "x"): + pass + except FileExistsError: + pass + should_pause = os.path.exists(pause_flag_path) + if should_pause: + if dist.is_available() and dist.is_initialized(): + dist.barrier() + gathered_scored_docs = [None] * h.world_size + if dist.is_available() and dist.is_initialized(): + dist.all_gather_object(gathered_scored_docs, local_scored_docs) + else: + gathered_scored_docs = [local_scored_docs] + scored_docs_for_global = [] + for rank_docs in gathered_scored_docs: + if rank_docs: + scored_docs_for_global.extend(rank_docs) + scored_docs_for_global.sort(key=lambda x: (x[0], x[1])) + scored_docs_for_global = scored_docs_for_global[:current_phase_boundary] + scored_token_chunks = [ + val_data.val_tokens[doc_start : doc_start + doc_len] + for _, _, doc_start, doc_len in scored_docs_for_global + ] + if scored_token_chunks: + global_ttt_tokens = torch.cat(scored_token_chunks) + else: + global_ttt_tokens = val_data.val_tokens[:0] + if h.rank == 0: + prefix_done = 0 + try: + with open(prefix_counter_path, "rb") as f: + prefix_done = int.from_bytes( + f.read(8), "little", signed=True + ) + except FileNotFoundError: + pass + log( + f"ttpp: phase:{current_phase + 1}/{num_phases} pd:{prefix_done} " + f"gd:{len(scored_docs_for_global)} " + f"t:{time.perf_counter() - t_start:.1f}s" + ) + train_val_ttt_global_sgd_distributed( + h, device, val_data, base_model, global_ttt_tokens + ) + for p in base_model.parameters(): + p.requires_grad_(False) + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, 
o_lora=h.ttt_o_lora,
+                ).to(device)
+                reusable_opt = _build_opt(reusable_lora)
+                current_phase += 1
+                if current_phase >= num_phases:
+                    global_ttt_done = True
+                else:
+                    current_phase_boundary = phase_boundaries[current_phase]
+                if h.rank == 0:
+                    try:
+                        os.remove(pause_flag_path)
+                    except FileNotFoundError:
+                        pass
+                if dist.is_available() and dist.is_initialized():
+                    dist.barrier()
+                if h.rank == 0:
+                    log(f"ttpr: phase:{current_phase}/{num_phases} t:{time.perf_counter() - t_start:.1f}s")
+            del cur_lora, cur_opt
+    finally:
+        pass
+    if dist.is_available() and dist.is_initialized():
+        dist.all_reduce(loss_sum, op=dist.ReduceOp.SUM)
+        dist.all_reduce(byte_sum, op=dist.ReduceOp.SUM)
+        dist.all_reduce(token_count, op=dist.ReduceOp.SUM)
+    for p in base_model.parameters():
+        p.requires_grad_(True)
+    base_model.train()
+    return _loss_bpb_from_sums(loss_sum, token_count, byte_sum)
+
+
+def timed_eval(label, fn, *args, **kwargs):
+    torch.cuda.synchronize()
+    t0 = time.perf_counter()
+    val_loss, val_bpb = fn(*args, **kwargs)
+    torch.cuda.synchronize()
+    elapsed_ms = 1e3 * (time.perf_counter() - t0)
+    log(
+        f"{label} val_loss:{val_loss:.8f} val_bpb:{val_bpb:.8f} eval_time:{elapsed_ms:.0f}ms"
+    )
+    return val_loss, val_bpb
+
+
+def train_model(h, device, val_data):
+    base_model = GPT(h).to(device).bfloat16()
+    restore_fp32_params(base_model)
+    compiled_model = torch.compile(base_model, dynamic=False, fullgraph=True)
+    compiled_forward_logits = torch.compile(
+        base_model.forward_logits, dynamic=False, fullgraph=True
+    )
+    model = compiled_model
+    log(f"model_params:{sum(p.numel() for p in base_model.parameters())}")
+    optimizers = Optimizers(h, base_model)
+    train_loader = DocumentPackingLoader(h, device)
+    max_wallclock_ms = (
+        1e3 * h.max_wallclock_seconds if h.max_wallclock_seconds > 0 else None
+    )
+    if max_wallclock_ms is not None:
+        max_wallclock_ms -= h.gptq_reserve_seconds * 1e3
+        log(
+            f"gptq:reserving {h.gptq_reserve_seconds:.0f}s, effective={max_wallclock_ms:.0f}ms"
+        )
+
+    def training_frac(step, elapsed_ms):
+        if max_wallclock_ms is None:
+            return step / max(h.iterations, 1)
+        return elapsed_ms / max(max_wallclock_ms, 1e-09)
+
+    def lr_mul(frac):
+        if h.warmdown_frac <= 0:
+            return 1.0
+        if frac >= 1.0 - h.warmdown_frac:
+            return max((1.0 - frac) / h.warmdown_frac, h.min_lr)
+        return 1.0
+
+    _clip_params = [p for p in base_model.parameters() if p.requires_grad]
+
+    def step_fn(step, lr_scale):
+        train_loss = torch.zeros((), device=device)
+        for micro_step in range(h.grad_accum_steps):
+            x, y, cu_seqlens, _max_seqlen = train_loader.next_batch(
+                h.train_batch_tokens, h.grad_accum_steps
+            )
+            with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True):
+                loss = model(x, y, cu_seqlens=cu_seqlens, max_seqlen=h.train_seq_len)
+            train_loss += loss.detach()
+            (loss / h.grad_accum_steps).backward()
+        train_loss /= h.grad_accum_steps
+        if step <= h.muon_momentum_warmup_steps:
+            frac = (
+                min(step / h.muon_momentum_warmup_steps, 1.0)
+                if h.muon_momentum_warmup_steps > 0
+                else 1.0
+            )
+            muon_momentum = (
+                1 - frac
+            ) * h.muon_momentum_warmup_start + frac * h.muon_momentum
+            for group in optimizers.optimizer_muon.param_groups:
+                group["momentum"] = muon_momentum
+        for opt in optimizers:
+            for group in opt.param_groups:
+                group["lr"] = group["base_lr"] * lr_scale
+        if h.grad_clip_norm > 0:
+            torch.nn.utils.clip_grad_norm_(_clip_params, h.grad_clip_norm)
+        optimizers.step(distributed=h.distributed)
+        return train_loss
+
+    if h.warmup_steps > 0:
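+        # Snapshot model and optimizer state first: the warmup below runs
+        # real optimizer steps (with and without layer looping) purely to
+        # populate torch.compile caches, and everything is restored before
+        # the timed run begins.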
initial_model_state = { + name: tensor.detach().cpu().clone() + for (name, tensor) in base_model.state_dict().items() + } + initial_optimizer_states = [ + copy.deepcopy(opt.state_dict()) for opt in optimizers + ] + model.train() + num_tokens_local = h.train_batch_tokens // h.world_size + for blk in base_model.blocks: + blk.attn.rotary(num_tokens_local, device, torch.bfloat16) + cu_bucket_size = train_loader.cu_bucket_size + warmup_cu_buckets = tuple(cu_bucket_size * i for i in range(1, 5)) + warmup_cu_iters = 3 + x, y, cu_seqlens, _ = train_loader.next_batch( + h.train_batch_tokens, h.grad_accum_steps + ) + log(f"warmup_cu_buckets:{','.join(str(b) for b in warmup_cu_buckets)} iters_each:{warmup_cu_iters}") + def _run_cu_bucket_warmup(): + for bucket_len in warmup_cu_buckets: + boundaries = list(range(0, x.size(1), max(h.train_seq_len, 1))) + if boundaries[-1] != x.size(1): + boundaries.append(x.size(1)) + cu = torch.full((bucket_len,), x.size(1), dtype=torch.int32, device=device) + cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device) + for _ in range(warmup_cu_iters): + optimizers.zero_grad_all() + with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True): + wloss = model(x, y, cu_seqlens=cu, max_seqlen=h.train_seq_len) + (wloss / h.grad_accum_steps).backward() + optimizers.zero_grad_all() + _run_cu_bucket_warmup() + if h.num_loops > 0: + base_model.looping_active = True + _run_cu_bucket_warmup() + base_model.looping_active = False + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"warmup_step: {warmup_step+1}/{h.warmup_steps}") + if h.num_loops > 0: + base_model.looping_active = True + log( + f"loop_warmup:enabled encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"loop_warmup_step: {warmup_step+1}/{h.warmup_steps}") + base_model.looping_active = False + base_model.load_state_dict(initial_model_state, strict=True) + for (opt, state) in zip(optimizers, initial_optimizer_states, strict=True): + opt.load_state_dict(state) + optimizers.zero_grad_all() + train_loader = DocumentPackingLoader(h, device) + _live_state = base_model.state_dict(keep_vars=True) + ema_state = { + name: t.detach().float().clone() + for (name, t) in _live_state.items() + } + _ema_pairs = [(ema_state[name], t) for (name, t) in _live_state.items()] + ema_decay = h.ema_decay + training_time_ms = 0.0 + stop_after_step = None + torch.cuda.synchronize() + t0 = time.perf_counter() + step = 0 + while True: + last_step = ( + step == h.iterations + or stop_after_step is not None + and step >= stop_after_step + ) + should_validate = ( + last_step or h.val_loss_every > 0 and step % h.val_loss_every == 0 + ) + if should_validate: + torch.cuda.synchronize() + training_time_ms += 1e3 * (time.perf_counter() - t0) + val_loss, val_bpb = eval_val( + h, device, val_data, model, compiled_forward_logits + ) + log( + f"{step}/{h.iterations} val_loss: {val_loss:.4f} val_bpb: {val_bpb:.4f}" + ) + torch.cuda.synchronize() + t0 = time.perf_counter() + if last_step: + if stop_after_step is not None and step < h.iterations: + log( + f"stopping_early: wallclock_cap train_time: {training_time_ms:.0f}ms step: {step}/{h.iterations}" + ) + break + elapsed_ms = 
training_time_ms + 1e3 * (time.perf_counter() - t0) + frac = training_frac(step, elapsed_ms) + scale = lr_mul(frac) + if ( + h.num_loops > 0 + and not base_model.looping_active + and frac >= h.enable_looping_at + ): + base_model.looping_active = True + log( + f"layer_loop:enabled step:{step} frac:{frac:.3f} encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + train_loss = step_fn(step, scale) + with torch.no_grad(): + for ema_t, t in _ema_pairs: + ema_t.mul_(ema_decay).add_(t.detach(), alpha=1.0 - ema_decay) + step += 1 + approx_training_time_ms = training_time_ms + 1e3 * (time.perf_counter() - t0) + should_log_train = h.train_log_every > 0 and ( + step <= 5 or step % h.train_log_every == 0 or stop_after_step is not None + ) + if should_log_train: + tok_per_sec = step * h.train_batch_tokens / (approx_training_time_ms / 1e3) + log( + f"{step}/{h.iterations} train_loss: {train_loss.item():.4f} train_time: {approx_training_time_ms/60000:.1f}m tok/s: {tok_per_sec:.0f}" + ) + reached_cap = ( + max_wallclock_ms is not None and approx_training_time_ms >= max_wallclock_ms + ) + if h.distributed and max_wallclock_ms is not None: + reached_cap_tensor = torch.tensor(int(reached_cap), device=device) + dist.all_reduce(reached_cap_tensor, op=dist.ReduceOp.MAX) + reached_cap = bool(reached_cap_tensor.item()) + if stop_after_step is None and reached_cap: + stop_after_step = step + log( + f"peak memory allocated: {torch.cuda.max_memory_allocated()//1024//1024} MiB reserved: {torch.cuda.max_memory_reserved()//1024//1024} MiB" + ) + log("ema:applying EMA weights") + current_state = base_model.state_dict() + avg_state = { + name: t.to(dtype=current_state[name].dtype) for (name, t) in ema_state.items() + } + base_model.load_state_dict(avg_state, strict=True) + return base_model, compiled_model, compiled_forward_logits + + +def train_and_eval(h, device): + random.seed(h.seed) + np.random.seed(h.seed) + torch.manual_seed(h.seed) + torch.cuda.manual_seed_all(h.seed) + if h.artifact_dir and h.is_main_process: + os.makedirs(h.artifact_dir, exist_ok=True) + val_data = ValidationData(h, device) + log( + f"train_shards: {len(list(Path(h.datasets_dir).resolve().glob('fineweb_train_*.bin')))}" + ) + log(f"val_tokens: {val_data.val_tokens.numel()-1}") + # TTT_EVAL_ONLY: skip training + GPTQ, jump straight to TTT eval on a + # pre-existing quantized artifact. Used to test TTT-only improvements + # (e.g., PR-1767's alpha/warm-start/WD) without retraining. 
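+    # Hypothetical invocation (assumes a prior run already wrote its
+    # quantized artifact where deserialize() expects to find it):
+    #   TTT_EVAL_ONLY=1 torchrun --standalone --nproc_per_node=8 train_gpt.py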
+ ttt_eval_only = os.environ.get("TTT_EVAL_ONLY", "0") == "1" + if ttt_eval_only: + log("TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval") + log(f"ttt_lora_alpha: {BatchedLinearLoRA._ALPHA}") + log(f"ttt_warm_start_a: {BatchedLinearLoRA._WARM_START_A}") + log(f"ttt_weight_decay: {h.ttt_weight_decay}") + else: + base_model, compiled_model, compiled_forward_logits = train_model( + h, device, val_data + ) + torch._dynamo.reset() + timed_eval( + "diagnostic pre-quantization post-ema", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + if os.environ.get("PREQUANT_ONLY", "0") == "1": + log("PREQUANT_ONLY=1 — skipping serialize/GPTQ/post-quant eval/TTT") + return + serialize(h, base_model, Path(__file__).read_text(encoding="utf-8")) + if h.distributed: + dist.barrier() + eval_model = deserialize(h, device) + if h.num_loops > 0: + eval_model.looping_active = True + if not ttt_eval_only: + compiled_model = torch.compile(eval_model, dynamic=False, fullgraph=True) + compiled_forward_logits = torch.compile( + eval_model.forward_logits, dynamic=False, fullgraph=True + ) + timed_eval( + "diagnostic quantized", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + del eval_model + if h.ttt_enabled: + if not ttt_eval_only: + del compiled_model + if ttt_eval_only: + del eval_model + torch._dynamo.reset() + torch.cuda.empty_cache() + ttt_model = deserialize(h, device) + if h.num_loops > 0: + ttt_model.looping_active = True + for p in ttt_model.parameters(): + p.requires_grad_(False) + + if h.rope_yarn: + _yarn_seqlen = h.train_batch_tokens // h.grad_accum_steps + for block in ttt_model.blocks: + block.attn.rotary(_yarn_seqlen, device, torch.bfloat16) + else: + for block in ttt_model.blocks: + block.attn.rotary._cos_cached = None + block.attn.rotary._sin_cached = None + block.attn.rotary._seq_len_cached = 0 + block.attn.rotary(h.ttt_eval_seq_len, device, torch.bfloat16) + + def _fwd_ttt_inner(input_ids, target_ids, lora): + return ttt_model.forward_ttt(input_ids, target_ids, lora=lora) + + _fwd_ttt_compiled_inner = None + + def _fwd_ttt(input_ids, target_ids, lora): + nonlocal _fwd_ttt_compiled_inner + if _fwd_ttt_compiled_inner is None: + _fwd_ttt_compiled_inner = torch.compile(_fwd_ttt_inner, dynamic=True) + return _fwd_ttt_compiled_inner(input_ids, target_ids, lora=lora) + + fwd_ttt_compiled = _fwd_ttt + log(f"ttt_lora:warming up compile (random tokens, no val data)") + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + t_warmup = time.perf_counter() + warmup_bszes = [h.ttt_batch_size] + for bsz in warmup_bszes: + wl = BatchedTTTLoRA( + bsz, ttt_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + wo = torch.optim.AdamW( + wl.parameters(), + lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, + weight_decay=h.ttt_weight_decay, + fused=True, + ) + for ctx_len in (h.ttt_chunk_size, h.ttt_eval_seq_len): + xw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + yw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + ptl = fwd_ttt_compiled(xw, yw, lora=wl) + ptl[:, : min(h.ttt_chunk_size, ctx_len)].mean(dim=-1).sum().backward() + wo.step() + wo.zero_grad(set_to_none=True) + del wl, wo + torch.cuda.empty_cache() + compile_elapsed = time.perf_counter() - t_warmup + log(f"ttt_lora:compile warmup done 
({compile_elapsed:.1f}s)") + log("\nbeginning TTT eval timer") + torch.cuda.synchronize() + t_ttt = time.perf_counter() + ttt_val_loss, ttt_val_bpb = eval_val_ttt_phased( + h, ttt_model, device, val_data, forward_ttt_train=fwd_ttt_compiled + ) + torch.cuda.synchronize() + ttt_eval_elapsed = time.perf_counter() - t_ttt + log( + "quantized_ttt_phased " + f"val_loss:{ttt_val_loss:.8f} val_bpb:{ttt_val_bpb:.8f} " + f"eval_time:{1e3*ttt_eval_elapsed:.0f}ms" + ) + log(f"total_eval_time:{ttt_eval_elapsed:.1f}s") + del ttt_model + + +def main(): + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + if not torch.cuda.is_available(): + raise RuntimeError("CUDA is required") + if world_size <= 0: + raise ValueError(f"WORLD_SIZE must be positive, got {world_size}") + if 8 % world_size != 0: + raise ValueError( + f"WORLD_SIZE={world_size} must divide 8 so grad_accum_steps stays integral" + ) + device = torch.device("cuda", local_rank) + torch.cuda.set_device(device) + if distributed: + dist.init_process_group(backend="nccl", device_id=device) + dist.barrier() + torch.backends.cuda.matmul.allow_tf32 = True + torch.backends.cudnn.allow_tf32 = True + torch.set_float32_matmul_precision("high") + from torch.backends.cuda import ( + enable_cudnn_sdp, + enable_flash_sdp, + enable_math_sdp, + enable_mem_efficient_sdp, + ) + + enable_cudnn_sdp(False) + enable_flash_sdp(True) + enable_mem_efficient_sdp(False) + enable_math_sdp(False) + torch._dynamo.config.optimize_ddp = False + torch._dynamo.config.cache_size_limit = 64 + h = Hyperparameters() + set_logging_hparams(h) + if h.is_main_process: + os.makedirs(h.artifact_dir if h.artifact_dir else "logs", exist_ok=True) + log(100 * "=", console=False) + log("Hyperparameters:", console=True) + for (k, v) in sorted(vars(type(h)).items()): + if not k.startswith("_"): + log(f" {k}: {v}", console=True) + log("=" * 100, console=False) + log("Source code:", console=False) + log("=" * 100, console=False) + with open(__file__, "r", encoding="utf-8") as _src: + log(_src.read(), console=False) + log("=" * 100, console=False) + log(f"Running Python {sys.version}", console=False) + log(f"Running PyTorch {torch.__version__}", console=False) + log("=" * 100, console=False) + train_and_eval(h, device) + if distributed: + dist.destroy_process_group() + + +if __name__ == "__main__": + main() + +==================================================================================================== +Running Python 3.11.10 (main, Sep 7 2024, 18:35:41) [GCC 11.4.0] +Running PyTorch 2.11.0+cu130 +==================================================================================================== +train_shards: 80 +val_tokens: 46688256 +model_params:35552455 +gptq:reserving 0s, effective=599500ms +warmup_cu_buckets:64,128,192,256 iters_each:3 +warmup_step: 1/20 +warmup_step: 2/20 +warmup_step: 3/20 +warmup_step: 4/20 +warmup_step: 5/20 +warmup_step: 6/20 +warmup_step: 10/20 +warmup_step: 20/20 +loop_warmup:enabled encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +loop_warmup_step: 1/20 +loop_warmup_step: 2/20 +loop_warmup_step: 3/20 +loop_warmup_step: 4/20 +loop_warmup_step: 5/20 +loop_warmup_step: 6/20 +loop_warmup_step: 10/20 +loop_warmup_step: 20/20 +1/20000 train_loss: 9.2341 train_time: 0.0m tok/s: 17138894 +2/20000 train_loss: 12.8346 train_time: 0.0m tok/s: 11518801 +3/20000 train_loss: 10.3274 train_time: 0.0m tok/s: 10328236 +4/20000 
train_loss: 8.9738 train_time: 0.0m tok/s: 9762668 +5/20000 train_loss: 8.0432 train_time: 0.0m tok/s: 9488232 +500/20000 train_loss: 2.8365 train_time: 0.8m tok/s: 8387204 +1000/20000 train_loss: 2.7591 train_time: 1.6m tok/s: 8367479 +1500/20000 train_loss: 2.6979 train_time: 2.4m tok/s: 8360670 +2000/20000 train_loss: 2.7573 train_time: 3.1m tok/s: 8359866 +2500/20000 train_loss: 2.6630 train_time: 3.9m tok/s: 8361407 +layer_loop:enabled step:2869 frac:0.450 encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +3000/20000 train_loss: 2.6353 train_time: 4.8m tok/s: 8193083 +3500/20000 train_loss: 2.6111 train_time: 6.0m tok/s: 7640312 +4000/20000 train_loss: 2.5369 train_time: 7.2m tok/s: 7326155 +4500/20000 train_loss: 2.5039 train_time: 8.3m tok/s: 7099105 +5000/20000 train_loss: 2.4689 train_time: 9.5m tok/s: 6927258 +5231/20000 val_loss: 2.4196 val_bpb: 1.0788 +stopping_early: wallclock_cap train_time: 599626ms step: 5231/20000 +peak memory allocated: 41660 MiB reserved: 48494 MiB +ema:applying EMA weights +diagnostic pre-quantization post-ema val_loss:2.39082488 val_bpb:1.06593975 eval_time:11310ms +Serialized model: 131747517 bytes +Code size (uncompressed): 162947 bytes +Code size (compressed): 41171 bytes +GPTQ:collecting Hessians from calibration data... +GPTQ:collected 67 Hessians in 3.4s +Quantized weights: + gate_int8_row: blocks.attn.attn_gate_w + gptq (int6): blocks.attn.c_k.weight, blocks.attn.c_q.weight, blocks.attn.c_v.weight, blocks.attn.proj.weight, blocks.mlp.fc.weight, blocks.mlp.proj.weight + gptq (int6)+lqer_asym: blocks.mlp.fc.weight + gptq (int7)+lqer_asym: tok_emb.weight + passthrough (float16): blocks.attn.q_gain, blocks.attn_scale, blocks.mlp_scale, blocks.resid_mix, parallel_post_lambdas, parallel_resid_lambdas, skip_gates, skip_weights, smear_gate.weight, smear_lambda +Serialize: per-group lrzip compression... +Serialize: per-group compression done in 141.7s +Serialized model quantized+pergroup: 15777612 bytes +Total submission size quantized+pergroup: 15818783 bytes +Deserialize: per-group lrzip decompression... +Deserialize: decompression done in 21.0s +diagnostic quantized val_loss:2.41085747 val_bpb:1.07487120 eval_time:72292ms +Deserialize: per-group lrzip decompression... 
+Deserialize: decompression done in 21.0s +ttt_lora:warming up compile (random tokens, no val data) +ttt_lora:compile warmup done (179.8s) + +beginning TTT eval timer +ttt_phased: total_docs:50000 prefix_docs:2500 suffix_docs:47500 num_phases:3 boundaries:[833, 1666, 2500] +ttp: b775/782 bl:2.3225 bb:1.0645 rl:2.3225 rb:1.0645 dl:6695-7323 gd:0 +ttp: b774/782 bl:2.3627 bb:1.0713 rl:2.3419 rb:1.0678 dl:6300-6695 gd:0 +ttp: b769/782 bl:2.3772 bb:1.0883 rl:2.3515 rb:1.0734 dl:4969-5170 gd:0 +ttp: b763/782 bl:2.4611 bb:1.0852 rl:2.3714 rb:1.0756 dl:4049-4178 gd:0 +ttpp: phase:1/3 pd:1296 gd:833 t:227.9s +tttg: c1/127 lr:0.001000 t:1.4s +tttg: c2/127 lr:0.001000 t:1.5s +tttg: c3/127 lr:0.000999 t:1.6s +tttg: c4/127 lr:0.000999 t:1.7s +tttg: c5/127 lr:0.000998 t:1.7s +tttg: c6/127 lr:0.000996 t:1.8s +tttg: c7/127 lr:0.000994 t:1.9s +tttg: c8/127 lr:0.000992 t:2.0s +tttg: c9/127 lr:0.000990 t:2.1s +tttg: c10/127 lr:0.000987 t:2.1s +tttg: c11/127 lr:0.000985 t:2.2s +tttg: c12/127 lr:0.000981 t:2.3s +tttg: c13/127 lr:0.000978 t:2.4s +tttg: c14/127 lr:0.000974 t:2.4s +tttg: c15/127 lr:0.000970 t:2.5s +tttg: c16/127 lr:0.000965 t:2.6s +tttg: c17/127 lr:0.000961 t:2.7s +tttg: c18/127 lr:0.000956 t:2.8s +tttg: c19/127 lr:0.000950 t:2.9s +tttg: c20/127 lr:0.000945 t:2.9s +tttg: c21/127 lr:0.000939 t:3.0s +tttg: c22/127 lr:0.000933 t:3.1s +tttg: c23/127 lr:0.000927 t:3.2s +tttg: c24/127 lr:0.000920 t:3.3s +tttg: c25/127 lr:0.000913 t:3.3s +tttg: c26/127 lr:0.000906 t:3.4s +tttg: c27/127 lr:0.000899 t:3.5s +tttg: c28/127 lr:0.000891 t:3.6s +tttg: c29/127 lr:0.000883 t:3.7s +tttg: c30/127 lr:0.000875 t:3.7s +tttg: c31/127 lr:0.000867 t:3.8s +tttg: c32/127 lr:0.000858 t:3.9s +tttg: c33/127 lr:0.000849 t:4.0s +tttg: c34/127 lr:0.000840 t:4.1s +tttg: c35/127 lr:0.000831 t:4.1s +tttg: c36/127 lr:0.000821 t:4.2s +tttg: c37/127 lr:0.000812 t:4.3s +tttg: c38/127 lr:0.000802 t:4.4s +tttg: c39/127 lr:0.000792 t:4.5s +tttg: c40/127 lr:0.000782 t:4.6s +tttg: c41/127 lr:0.000771 t:4.7s +tttg: c42/127 lr:0.000761 t:4.7s +tttg: c43/127 lr:0.000750 t:4.8s +tttg: c44/127 lr:0.000739 t:4.9s +tttg: c45/127 lr:0.000728 t:5.0s +tttg: c46/127 lr:0.000717 t:5.1s +tttg: c47/127 lr:0.000706 t:5.1s +tttg: c48/127 lr:0.000694 t:5.2s +tttg: c49/127 lr:0.000683 t:5.3s +tttg: c50/127 lr:0.000671 t:5.4s +tttg: c51/127 lr:0.000659 t:5.5s +tttg: c52/127 lr:0.000647 t:5.5s +tttg: c53/127 lr:0.000635 t:5.6s +tttg: c54/127 lr:0.000623 t:5.7s +tttg: c55/127 lr:0.000611 t:5.8s +tttg: c56/127 lr:0.000599 t:5.9s +tttg: c57/127 lr:0.000587 t:6.0s +tttg: c58/127 lr:0.000575 t:6.1s +tttg: c59/127 lr:0.000562 t:6.1s +tttg: c60/127 lr:0.000550 t:6.2s +tttg: c61/127 lr:0.000537 t:6.3s +tttg: c62/127 lr:0.000525 t:6.4s +tttg: c63/127 lr:0.000512 t:6.5s +tttg: c64/127 lr:0.000500 t:6.5s +tttg: c65/127 lr:0.000488 t:6.6s +tttg: c66/127 lr:0.000475 t:6.7s +tttg: c67/127 lr:0.000463 t:6.8s +tttg: c68/127 lr:0.000450 t:6.9s +tttg: c69/127 lr:0.000438 t:6.9s +tttg: c70/127 lr:0.000425 t:7.0s +tttg: c71/127 lr:0.000413 t:7.1s +tttg: c72/127 lr:0.000401 t:7.2s +tttg: c73/127 lr:0.000389 t:7.3s +tttg: c74/127 lr:0.000377 t:7.3s +tttg: c75/127 lr:0.000365 t:7.4s +tttg: c76/127 lr:0.000353 t:7.5s +tttg: c77/127 lr:0.000341 t:7.6s +tttg: c78/127 lr:0.000329 t:7.7s +tttg: c79/127 lr:0.000317 t:7.7s +tttg: c80/127 lr:0.000306 t:7.8s +tttg: c81/127 lr:0.000294 t:7.9s +tttg: c82/127 lr:0.000283 t:8.0s +tttg: c83/127 lr:0.000272 t:8.1s +tttg: c84/127 lr:0.000261 t:8.2s +tttg: c85/127 lr:0.000250 t:8.2s +tttg: c86/127 lr:0.000239 t:8.3s +tttg: c87/127 lr:0.000229 
t:8.4s +tttg: c88/127 lr:0.000218 t:8.5s +tttg: c89/127 lr:0.000208 t:8.6s +tttg: c90/127 lr:0.000198 t:8.6s +tttg: c91/127 lr:0.000188 t:8.7s +tttg: c92/127 lr:0.000179 t:8.8s +tttg: c93/127 lr:0.000169 t:8.9s +tttg: c94/127 lr:0.000160 t:8.9s +tttg: c95/127 lr:0.000151 t:9.0s +tttg: c96/127 lr:0.000142 t:9.1s +tttg: c97/127 lr:0.000133 t:9.2s +tttg: c98/127 lr:0.000125 t:9.3s +tttg: c99/127 lr:0.000117 t:9.3s +tttg: c100/127 lr:0.000109 t:9.4s +tttg: c101/127 lr:0.000101 t:9.5s +tttg: c102/127 lr:0.000094 t:9.6s +tttg: c103/127 lr:0.000087 t:9.7s +tttg: c104/127 lr:0.000080 t:9.8s +tttg: c105/127 lr:0.000073 t:9.8s +tttg: c106/127 lr:0.000067 t:9.9s +tttg: c107/127 lr:0.000061 t:10.0s +tttg: c108/127 lr:0.000055 t:10.1s +tttg: c109/127 lr:0.000050 t:10.2s +tttg: c110/127 lr:0.000044 t:10.2s +tttg: c111/127 lr:0.000039 t:10.3s +tttg: c112/127 lr:0.000035 t:10.4s +tttg: c113/127 lr:0.000030 t:10.5s +tttg: c114/127 lr:0.000026 t:10.6s +tttg: c115/127 lr:0.000022 t:10.6s +tttg: c116/127 lr:0.000019 t:10.7s +tttg: c117/127 lr:0.000015 t:10.8s +tttg: c118/127 lr:0.000013 t:10.9s +tttg: c119/127 lr:0.000010 t:11.0s +tttg: c120/127 lr:0.000008 t:11.0s +tttg: c121/127 lr:0.000006 t:11.1s +tttg: c122/127 lr:0.000004 t:11.2s +tttg: c123/127 lr:0.000002 t:11.3s +tttg: c124/127 lr:0.000001 t:11.4s +tttg: c125/127 lr:0.000001 t:11.4s +tttg: c126/127 lr:0.000000 t:11.5s +ttpr: phase:1/3 t:241.2s +ttp: b759/782 bl:2.4374 bb:1.0676 rl:2.3806 rb:1.0744 dl:3647-3729 gd:0 +ttpp: phase:2/3 pd:2128 gd:1666 t:363.9s +tttg: c1/214 lr:0.001000 t:0.1s +tttg: c2/214 lr:0.001000 t:0.2s +tttg: c3/214 lr:0.001000 t:0.2s +tttg: c4/214 lr:0.001000 t:0.3s +tttg: c5/214 lr:0.000999 t:0.4s +tttg: c6/214 lr:0.000999 t:0.5s +tttg: c7/214 lr:0.000998 t:0.6s +tttg: c8/214 lr:0.000997 t:0.6s +tttg: c9/214 lr:0.000997 t:0.7s +tttg: c10/214 lr:0.000996 t:0.8s +tttg: c11/214 lr:0.000995 t:0.9s +tttg: c12/214 lr:0.000993 t:1.0s +tttg: c13/214 lr:0.000992 t:1.0s +tttg: c14/214 lr:0.000991 t:1.1s +tttg: c15/214 lr:0.000989 t:1.2s +tttg: c16/214 lr:0.000988 t:1.3s +tttg: c17/214 lr:0.000986 t:1.4s +tttg: c18/214 lr:0.000984 t:1.4s +tttg: c19/214 lr:0.000982 t:1.5s +tttg: c20/214 lr:0.000980 t:1.6s +tttg: c21/214 lr:0.000978 t:1.7s +tttg: c22/214 lr:0.000976 t:1.8s +tttg: c23/214 lr:0.000974 t:1.8s +tttg: c24/214 lr:0.000972 t:1.9s +tttg: c25/214 lr:0.000969 t:2.0s +tttg: c26/214 lr:0.000966 t:2.1s +tttg: c27/214 lr:0.000964 t:2.2s +tttg: c28/214 lr:0.000961 t:2.3s +tttg: c29/214 lr:0.000958 t:2.3s +tttg: c30/214 lr:0.000955 t:2.4s +tttg: c31/214 lr:0.000952 t:2.5s +tttg: c32/214 lr:0.000949 t:2.6s +tttg: c33/214 lr:0.000945 t:2.7s +tttg: c34/214 lr:0.000942 t:2.8s +tttg: c35/214 lr:0.000938 t:2.8s +tttg: c36/214 lr:0.000935 t:2.9s +tttg: c37/214 lr:0.000931 t:3.0s +tttg: c38/214 lr:0.000927 t:3.1s +tttg: c39/214 lr:0.000924 t:3.2s +tttg: c40/214 lr:0.000920 t:3.2s +tttg: c41/214 lr:0.000915 t:3.3s +tttg: c42/214 lr:0.000911 t:3.4s +tttg: c43/214 lr:0.000907 t:3.5s +tttg: c44/214 lr:0.000903 t:3.6s +tttg: c45/214 lr:0.000898 t:3.6s +tttg: c46/214 lr:0.000894 t:3.7s +tttg: c47/214 lr:0.000889 t:3.8s +tttg: c48/214 lr:0.000885 t:3.9s +tttg: c49/214 lr:0.000880 t:4.0s +tttg: c50/214 lr:0.000875 t:4.0s +tttg: c51/214 lr:0.000870 t:4.1s +tttg: c52/214 lr:0.000865 t:4.2s +tttg: c53/214 lr:0.000860 t:4.3s +tttg: c54/214 lr:0.000855 t:4.4s +tttg: c55/214 lr:0.000850 t:4.5s +tttg: c56/214 lr:0.000844 t:4.5s +tttg: c57/214 lr:0.000839 t:4.6s +tttg: c58/214 lr:0.000833 t:4.7s +tttg: c59/214 lr:0.000828 t:4.8s +tttg: c60/214 lr:0.000822 t:4.9s 
+tttg: c61/214 lr:0.000817 t:4.9s +tttg: c62/214 lr:0.000811 t:5.0s +tttg: c63/214 lr:0.000805 t:5.1s +tttg: c64/214 lr:0.000799 t:5.2s +tttg: c65/214 lr:0.000793 t:5.3s +tttg: c66/214 lr:0.000787 t:5.3s +tttg: c67/214 lr:0.000781 t:5.4s +tttg: c68/214 lr:0.000775 t:5.5s +tttg: c69/214 lr:0.000769 t:5.6s +tttg: c70/214 lr:0.000763 t:5.7s +tttg: c71/214 lr:0.000756 t:5.7s +tttg: c72/214 lr:0.000750 t:5.8s +tttg: c73/214 lr:0.000744 t:5.9s +tttg: c74/214 lr:0.000737 t:6.0s +tttg: c75/214 lr:0.000731 t:6.1s +tttg: c76/214 lr:0.000724 t:6.1s +tttg: c77/214 lr:0.000717 t:6.2s +tttg: c78/214 lr:0.000711 t:6.3s +tttg: c79/214 lr:0.000704 t:6.4s +tttg: c80/214 lr:0.000697 t:6.5s +tttg: c81/214 lr:0.000690 t:6.5s +tttg: c82/214 lr:0.000684 t:6.6s +tttg: c83/214 lr:0.000677 t:6.7s +tttg: c84/214 lr:0.000670 t:6.8s +tttg: c85/214 lr:0.000663 t:6.9s +tttg: c86/214 lr:0.000656 t:6.9s +tttg: c87/214 lr:0.000649 t:7.0s +tttg: c88/214 lr:0.000642 t:7.1s +tttg: c89/214 lr:0.000635 t:7.2s +tttg: c90/214 lr:0.000628 t:7.3s +tttg: c91/214 lr:0.000620 t:7.4s +tttg: c92/214 lr:0.000613 t:7.4s +tttg: c93/214 lr:0.000606 t:7.5s +tttg: c94/214 lr:0.000599 t:7.6s +tttg: c95/214 lr:0.000592 t:7.7s +tttg: c96/214 lr:0.000584 t:7.8s +tttg: c97/214 lr:0.000577 t:7.8s +tttg: c98/214 lr:0.000570 t:7.9s +tttg: c99/214 lr:0.000563 t:8.0s +tttg: c100/214 lr:0.000555 t:8.1s +tttg: c101/214 lr:0.000548 t:8.1s +tttg: c102/214 lr:0.000541 t:8.2s +tttg: c103/214 lr:0.000533 t:8.3s +tttg: c104/214 lr:0.000526 t:8.4s +tttg: c105/214 lr:0.000518 t:8.5s +tttg: c106/214 lr:0.000511 t:8.5s +tttg: c107/214 lr:0.000504 t:8.6s +tttg: c108/214 lr:0.000496 t:8.7s +tttg: c109/214 lr:0.000489 t:8.8s +tttg: c110/214 lr:0.000482 t:8.9s +tttg: c111/214 lr:0.000474 t:9.0s +tttg: c112/214 lr:0.000467 t:9.0s +tttg: c113/214 lr:0.000459 t:9.1s +tttg: c114/214 lr:0.000452 t:9.2s +tttg: c115/214 lr:0.000445 t:9.3s +tttg: c116/214 lr:0.000437 t:9.3s +tttg: c117/214 lr:0.000430 t:9.4s +tttg: c118/214 lr:0.000423 t:9.5s +tttg: c119/214 lr:0.000416 t:9.6s +tttg: c120/214 lr:0.000408 t:9.7s +tttg: c121/214 lr:0.000401 t:9.7s +tttg: c122/214 lr:0.000394 t:9.8s +tttg: c123/214 lr:0.000387 t:9.9s +tttg: c124/214 lr:0.000380 t:10.0s +tttg: c125/214 lr:0.000372 t:10.1s +tttg: c126/214 lr:0.000365 t:10.1s +tttg: c127/214 lr:0.000358 t:10.2s +tttg: c128/214 lr:0.000351 t:10.3s +tttg: c129/214 lr:0.000344 t:10.4s +tttg: c130/214 lr:0.000337 t:10.5s +tttg: c131/214 lr:0.000330 t:10.5s +tttg: c132/214 lr:0.000323 t:10.6s +tttg: c133/214 lr:0.000316 t:10.7s +tttg: c134/214 lr:0.000310 t:10.8s +tttg: c135/214 lr:0.000303 t:10.9s +tttg: c136/214 lr:0.000296 t:11.0s +tttg: c137/214 lr:0.000289 t:11.0s +tttg: c138/214 lr:0.000283 t:11.1s +tttg: c139/214 lr:0.000276 t:11.2s +tttg: c140/214 lr:0.000269 t:11.3s +tttg: c141/214 lr:0.000263 t:11.3s +tttg: c142/214 lr:0.000256 t:11.4s +tttg: c143/214 lr:0.000250 t:11.5s +tttg: c144/214 lr:0.000244 t:11.6s +tttg: c145/214 lr:0.000237 t:11.7s +tttg: c146/214 lr:0.000231 t:11.7s +tttg: c147/214 lr:0.000225 t:11.8s +tttg: c148/214 lr:0.000219 t:11.9s +tttg: c149/214 lr:0.000213 t:12.0s +tttg: c150/214 lr:0.000207 t:12.1s +tttg: c151/214 lr:0.000201 t:12.2s +tttg: c152/214 lr:0.000195 t:12.2s +tttg: c153/214 lr:0.000189 t:12.3s +tttg: c154/214 lr:0.000183 t:12.4s +tttg: c155/214 lr:0.000178 t:12.5s +tttg: c156/214 lr:0.000172 t:12.6s +tttg: c157/214 lr:0.000167 t:12.6s +tttg: c158/214 lr:0.000161 t:12.7s +tttg: c159/214 lr:0.000156 t:12.8s +tttg: c160/214 lr:0.000150 t:12.9s +tttg: c161/214 lr:0.000145 t:13.0s +tttg: c162/214 
lr:0.000140 t:13.0s +tttg: c163/214 lr:0.000135 t:13.1s +tttg: c164/214 lr:0.000130 t:13.2s +tttg: c165/214 lr:0.000125 t:13.3s +tttg: c166/214 lr:0.000120 t:13.4s +tttg: c167/214 lr:0.000115 t:13.4s +tttg: c168/214 lr:0.000111 t:13.5s +tttg: c169/214 lr:0.000106 t:13.6s +tttg: c170/214 lr:0.000102 t:13.7s +tttg: c171/214 lr:0.000097 t:13.8s +tttg: c172/214 lr:0.000093 t:13.8s +tttg: c173/214 lr:0.000089 t:13.9s +tttg: c174/214 lr:0.000085 t:14.0s +tttg: c175/214 lr:0.000080 t:14.1s +tttg: c176/214 lr:0.000076 t:14.2s +tttg: c177/214 lr:0.000073 t:14.2s +tttg: c178/214 lr:0.000069 t:14.3s +tttg: c179/214 lr:0.000065 t:14.4s +tttg: c180/214 lr:0.000062 t:14.5s +tttg: c181/214 lr:0.000058 t:14.6s +tttg: c182/214 lr:0.000055 t:14.6s +tttg: c183/214 lr:0.000051 t:14.7s +tttg: c184/214 lr:0.000048 t:14.8s +tttg: c185/214 lr:0.000045 t:14.9s +tttg: c186/214 lr:0.000042 t:15.0s +tttg: c187/214 lr:0.000039 t:15.0s +tttg: c188/214 lr:0.000036 t:15.1s +tttg: c189/214 lr:0.000034 t:15.2s +tttg: c190/214 lr:0.000031 t:15.3s +tttg: c191/214 lr:0.000028 t:15.4s +tttg: c192/214 lr:0.000026 t:15.5s +tttg: c193/214 lr:0.000024 t:15.5s +tttg: c194/214 lr:0.000022 t:15.6s +tttg: c195/214 lr:0.000020 t:15.7s +tttg: c196/214 lr:0.000018 t:15.8s +tttg: c197/214 lr:0.000016 t:15.9s +tttg: c198/214 lr:0.000014 t:15.9s +tttg: c199/214 lr:0.000012 t:16.0s +tttg: c200/214 lr:0.000011 t:16.1s +tttg: c201/214 lr:0.000009 t:16.2s +tttg: c202/214 lr:0.000008 t:16.3s +tttg: c203/214 lr:0.000007 t:16.3s +tttg: c204/214 lr:0.000005 t:16.4s +tttg: c205/214 lr:0.000004 t:16.5s +tttg: c206/214 lr:0.000003 t:16.6s +tttg: c207/214 lr:0.000003 t:16.7s +tttg: c208/214 lr:0.000002 t:16.7s +tttg: c209/214 lr:0.000001 t:16.8s +tttg: c210/214 lr:0.000001 t:16.9s +tttg: c211/214 lr:0.000000 t:17.0s +tttg: c212/214 lr:0.000000 t:17.1s +tttg: c213/214 lr:0.000000 t:17.1s +ttpr: phase:2/3 t:382.8s +ttp: b748/782 bl:2.3782 bb:1.0791 rl:2.3804 rb:1.0749 dl:2918-2965 gd:0 +ttpp: phase:3/3 pd:2960 gd:2500 t:397.7s +tttg: c1/282 lr:0.001000 t:0.1s +tttg: c2/282 lr:0.001000 t:0.2s +tttg: c3/282 lr:0.001000 t:0.2s +tttg: c4/282 lr:0.001000 t:0.3s +tttg: c5/282 lr:0.001000 t:0.4s +tttg: c6/282 lr:0.000999 t:0.5s +tttg: c7/282 lr:0.000999 t:0.5s +tttg: c8/282 lr:0.000998 t:0.6s +tttg: c9/282 lr:0.000998 t:0.7s +tttg: c10/282 lr:0.000997 t:0.8s +tttg: c11/282 lr:0.000997 t:0.8s +tttg: c12/282 lr:0.000996 t:0.9s +tttg: c13/282 lr:0.000996 t:1.0s +tttg: c14/282 lr:0.000995 t:1.1s +tttg: c15/282 lr:0.000994 t:1.1s +tttg: c16/282 lr:0.000993 t:1.2s +tttg: c17/282 lr:0.000992 t:1.3s +tttg: c18/282 lr:0.000991 t:1.4s +tttg: c19/282 lr:0.000990 t:1.5s +tttg: c20/282 lr:0.000989 t:1.5s +tttg: c21/282 lr:0.000988 t:1.6s +tttg: c22/282 lr:0.000986 t:1.7s +tttg: c23/282 lr:0.000985 t:1.8s +tttg: c24/282 lr:0.000984 t:1.9s +tttg: c25/282 lr:0.000982 t:2.0s +tttg: c26/282 lr:0.000981 t:2.0s +tttg: c27/282 lr:0.000979 t:2.1s +tttg: c28/282 lr:0.000977 t:2.2s +tttg: c29/282 lr:0.000976 t:2.3s +tttg: c30/282 lr:0.000974 t:2.3s +tttg: c31/282 lr:0.000972 t:2.4s +tttg: c32/282 lr:0.000970 t:2.5s +tttg: c33/282 lr:0.000968 t:2.6s +tttg: c34/282 lr:0.000966 t:2.7s +tttg: c35/282 lr:0.000964 t:2.7s +tttg: c36/282 lr:0.000962 t:2.8s +tttg: c37/282 lr:0.000960 t:2.9s +tttg: c38/282 lr:0.000958 t:3.0s +tttg: c39/282 lr:0.000956 t:3.1s +tttg: c40/282 lr:0.000953 t:3.1s +tttg: c41/282 lr:0.000951 t:3.2s +tttg: c42/282 lr:0.000948 t:3.3s +tttg: c43/282 lr:0.000946 t:3.4s +tttg: c44/282 lr:0.000943 t:3.5s +tttg: c45/282 lr:0.000941 t:3.6s +tttg: c46/282 lr:0.000938 t:3.6s 
+tttg: c47/282 lr:0.000935 t:3.7s +tttg: c48/282 lr:0.000933 t:3.8s +tttg: c49/282 lr:0.000930 t:3.9s +tttg: c50/282 lr:0.000927 t:3.9s +tttg: c51/282 lr:0.000924 t:4.0s +tttg: c52/282 lr:0.000921 t:4.1s +tttg: c53/282 lr:0.000918 t:4.2s +tttg: c54/282 lr:0.000915 t:4.3s +tttg: c55/282 lr:0.000912 t:4.3s +tttg: c56/282 lr:0.000908 t:4.4s +tttg: c57/282 lr:0.000905 t:4.5s +tttg: c58/282 lr:0.000902 t:4.6s +tttg: c59/282 lr:0.000899 t:4.7s +tttg: c60/282 lr:0.000895 t:4.7s +tttg: c61/282 lr:0.000892 t:4.8s +tttg: c62/282 lr:0.000888 t:4.9s +tttg: c63/282 lr:0.000885 t:5.0s +tttg: c64/282 lr:0.000881 t:5.1s +tttg: c65/282 lr:0.000877 t:5.2s +tttg: c66/282 lr:0.000874 t:5.2s +tttg: c67/282 lr:0.000870 t:5.3s +tttg: c68/282 lr:0.000866 t:5.4s +tttg: c69/282 lr:0.000862 t:5.5s +tttg: c70/282 lr:0.000858 t:5.6s +tttg: c71/282 lr:0.000855 t:5.6s +tttg: c72/282 lr:0.000851 t:5.7s +tttg: c73/282 lr:0.000847 t:5.8s +tttg: c74/282 lr:0.000843 t:5.9s +tttg: c75/282 lr:0.000838 t:6.0s +tttg: c76/282 lr:0.000834 t:6.0s +tttg: c77/282 lr:0.000830 t:6.1s +tttg: c78/282 lr:0.000826 t:6.2s +tttg: c79/282 lr:0.000822 t:6.3s +tttg: c80/282 lr:0.000817 t:6.3s +tttg: c81/282 lr:0.000813 t:6.4s +tttg: c82/282 lr:0.000809 t:6.5s +tttg: c83/282 lr:0.000804 t:6.6s +tttg: c84/282 lr:0.000800 t:6.7s +tttg: c85/282 lr:0.000795 t:6.8s +tttg: c86/282 lr:0.000791 t:6.8s +tttg: c87/282 lr:0.000786 t:6.9s +tttg: c88/282 lr:0.000782 t:7.0s +tttg: c89/282 lr:0.000777 t:7.1s +tttg: c90/282 lr:0.000772 t:7.2s +tttg: c91/282 lr:0.000768 t:7.2s +tttg: c92/282 lr:0.000763 t:7.3s +tttg: c93/282 lr:0.000758 t:7.4s +tttg: c94/282 lr:0.000753 t:7.5s +tttg: c95/282 lr:0.000748 t:7.6s +tttg: c96/282 lr:0.000744 t:7.6s +tttg: c97/282 lr:0.000739 t:7.7s +tttg: c98/282 lr:0.000734 t:7.8s +tttg: c99/282 lr:0.000729 t:7.9s +tttg: c100/282 lr:0.000724 t:8.0s +tttg: c101/282 lr:0.000719 t:8.1s +tttg: c102/282 lr:0.000714 t:8.1s +tttg: c103/282 lr:0.000709 t:8.2s +tttg: c104/282 lr:0.000704 t:8.3s +tttg: c105/282 lr:0.000698 t:8.4s +tttg: c106/282 lr:0.000693 t:8.5s +tttg: c107/282 lr:0.000688 t:8.5s +tttg: c108/282 lr:0.000683 t:8.6s +tttg: c109/282 lr:0.000678 t:8.7s +tttg: c110/282 lr:0.000672 t:8.8s +tttg: c111/282 lr:0.000667 t:8.9s +tttg: c112/282 lr:0.000662 t:8.9s +tttg: c113/282 lr:0.000657 t:9.0s +tttg: c114/282 lr:0.000651 t:9.1s +tttg: c115/282 lr:0.000646 t:9.2s +tttg: c116/282 lr:0.000641 t:9.3s +tttg: c117/282 lr:0.000635 t:9.3s +tttg: c118/282 lr:0.000630 t:9.4s +tttg: c119/282 lr:0.000624 t:9.5s +tttg: c120/282 lr:0.000619 t:9.6s +tttg: c121/282 lr:0.000614 t:9.6s +tttg: c122/282 lr:0.000608 t:9.7s +tttg: c123/282 lr:0.000603 t:9.8s +tttg: c124/282 lr:0.000597 t:9.9s +tttg: c125/282 lr:0.000592 t:10.0s +tttg: c126/282 lr:0.000586 t:10.0s +tttg: c127/282 lr:0.000581 t:10.1s +tttg: c128/282 lr:0.000575 t:10.2s +tttg: c129/282 lr:0.000570 t:10.3s +tttg: c130/282 lr:0.000564 t:10.4s +tttg: c131/282 lr:0.000559 t:10.5s +tttg: c132/282 lr:0.000553 t:10.5s +tttg: c133/282 lr:0.000547 t:10.6s +tttg: c134/282 lr:0.000542 t:10.7s +tttg: c135/282 lr:0.000536 t:10.8s +tttg: c136/282 lr:0.000531 t:10.9s +tttg: c137/282 lr:0.000525 t:10.9s +tttg: c138/282 lr:0.000520 t:11.0s +tttg: c139/282 lr:0.000514 t:11.1s +tttg: c140/282 lr:0.000508 t:11.2s +tttg: c141/282 lr:0.000503 t:11.3s +tttg: c142/282 lr:0.000497 t:11.3s +tttg: c143/282 lr:0.000492 t:11.4s +tttg: c144/282 lr:0.000486 t:11.5s +tttg: c145/282 lr:0.000480 t:11.6s +tttg: c146/282 lr:0.000475 t:11.7s +tttg: c147/282 lr:0.000469 t:11.7s +tttg: c148/282 lr:0.000464 t:11.8s +tttg: 
c149/282 lr:0.000458 t:11.9s +tttg: c150/282 lr:0.000453 t:12.0s +tttg: c151/282 lr:0.000447 t:12.1s +tttg: c152/282 lr:0.000441 t:12.1s +tttg: c153/282 lr:0.000436 t:12.2s +tttg: c154/282 lr:0.000430 t:12.3s +tttg: c155/282 lr:0.000425 t:12.4s +tttg: c156/282 lr:0.000419 t:12.5s +tttg: c157/282 lr:0.000414 t:12.5s +tttg: c158/282 lr:0.000408 t:12.6s +tttg: c159/282 lr:0.000403 t:12.7s +tttg: c160/282 lr:0.000397 t:12.8s +tttg: c161/282 lr:0.000392 t:12.9s +tttg: c162/282 lr:0.000386 t:13.0s +tttg: c163/282 lr:0.000381 t:13.0s +tttg: c164/282 lr:0.000376 t:13.1s +tttg: c165/282 lr:0.000370 t:13.2s +tttg: c166/282 lr:0.000365 t:13.3s +tttg: c167/282 lr:0.000359 t:13.4s +tttg: c168/282 lr:0.000354 t:13.4s +tttg: c169/282 lr:0.000349 t:13.5s +tttg: c170/282 lr:0.000343 t:13.6s +tttg: c171/282 lr:0.000338 t:13.7s +tttg: c172/282 lr:0.000333 t:13.8s +tttg: c173/282 lr:0.000328 t:13.8s +tttg: c174/282 lr:0.000322 t:13.9s +tttg: c175/282 lr:0.000317 t:14.0s +tttg: c176/282 lr:0.000312 t:14.1s +tttg: c177/282 lr:0.000307 t:14.2s +tttg: c178/282 lr:0.000302 t:14.2s +tttg: c179/282 lr:0.000296 t:14.3s +tttg: c180/282 lr:0.000291 t:14.4s +tttg: c181/282 lr:0.000286 t:14.5s +tttg: c182/282 lr:0.000281 t:14.6s +tttg: c183/282 lr:0.000276 t:14.7s +tttg: c184/282 lr:0.000271 t:14.7s +tttg: c185/282 lr:0.000266 t:14.8s +tttg: c186/282 lr:0.000261 t:14.9s +tttg: c187/282 lr:0.000256 t:15.0s +tttg: c188/282 lr:0.000252 t:15.1s +tttg: c189/282 lr:0.000247 t:15.1s +tttg: c190/282 lr:0.000242 t:15.2s +tttg: c191/282 lr:0.000237 t:15.3s +tttg: c192/282 lr:0.000232 t:15.4s +tttg: c193/282 lr:0.000228 t:15.5s +tttg: c194/282 lr:0.000223 t:15.5s +tttg: c195/282 lr:0.000218 t:15.6s +tttg: c196/282 lr:0.000214 t:15.7s +tttg: c197/282 lr:0.000209 t:15.8s +tttg: c198/282 lr:0.000205 t:15.9s +tttg: c199/282 lr:0.000200 t:15.9s +tttg: c200/282 lr:0.000196 t:16.0s +tttg: c201/282 lr:0.000191 t:16.1s +tttg: c202/282 lr:0.000187 t:16.2s +tttg: c203/282 lr:0.000183 t:16.3s +tttg: c204/282 lr:0.000178 t:16.3s +tttg: c205/282 lr:0.000174 t:16.4s +tttg: c206/282 lr:0.000170 t:16.5s +tttg: c207/282 lr:0.000166 t:16.6s +tttg: c208/282 lr:0.000162 t:16.7s +tttg: c209/282 lr:0.000157 t:16.8s +tttg: c210/282 lr:0.000153 t:16.8s +tttg: c211/282 lr:0.000149 t:16.9s +tttg: c212/282 lr:0.000145 t:17.0s +tttg: c213/282 lr:0.000142 t:17.1s +tttg: c214/282 lr:0.000138 t:17.2s +tttg: c215/282 lr:0.000134 t:17.2s +tttg: c216/282 lr:0.000130 t:17.3s +tttg: c217/282 lr:0.000126 t:17.4s +tttg: c218/282 lr:0.000123 t:17.5s +tttg: c219/282 lr:0.000119 t:17.6s +tttg: c220/282 lr:0.000115 t:17.6s +tttg: c221/282 lr:0.000112 t:17.7s +tttg: c222/282 lr:0.000108 t:17.8s +tttg: c223/282 lr:0.000105 t:17.9s +tttg: c224/282 lr:0.000101 t:18.0s +tttg: c225/282 lr:0.000098 t:18.0s +tttg: c226/282 lr:0.000095 t:18.1s +tttg: c227/282 lr:0.000092 t:18.2s +tttg: c228/282 lr:0.000088 t:18.3s +tttg: c229/282 lr:0.000085 t:18.4s +tttg: c230/282 lr:0.000082 t:18.4s +tttg: c231/282 lr:0.000079 t:18.5s +tttg: c232/282 lr:0.000076 t:18.6s +tttg: c233/282 lr:0.000073 t:18.7s +tttg: c234/282 lr:0.000070 t:18.8s +tttg: c235/282 lr:0.000067 t:18.8s +tttg: c236/282 lr:0.000065 t:18.9s +tttg: c237/282 lr:0.000062 t:19.0s +tttg: c238/282 lr:0.000059 t:19.1s +tttg: c239/282 lr:0.000057 t:19.2s +tttg: c240/282 lr:0.000054 t:19.2s +tttg: c241/282 lr:0.000052 t:19.3s +tttg: c242/282 lr:0.000049 t:19.4s +tttg: c243/282 lr:0.000047 t:19.5s +tttg: c244/282 lr:0.000044 t:19.6s +tttg: c245/282 lr:0.000042 t:19.6s +tttg: c246/282 lr:0.000040 t:19.7s +tttg: c247/282 lr:0.000038 
t:19.8s +tttg: c248/282 lr:0.000036 t:19.9s +tttg: c249/282 lr:0.000034 t:19.9s +tttg: c250/282 lr:0.000032 t:20.0s +tttg: c251/282 lr:0.000030 t:20.1s +tttg: c252/282 lr:0.000028 t:20.2s +tttg: c253/282 lr:0.000026 t:20.3s +tttg: c254/282 lr:0.000024 t:20.4s +tttg: c255/282 lr:0.000023 t:20.4s +tttg: c256/282 lr:0.000021 t:20.5s +tttg: c257/282 lr:0.000019 t:20.6s +tttg: c258/282 lr:0.000018 t:20.7s +tttg: c259/282 lr:0.000016 t:20.8s +tttg: c260/282 lr:0.000015 t:20.9s +tttg: c261/282 lr:0.000014 t:20.9s +tttg: c262/282 lr:0.000012 t:21.0s +tttg: c263/282 lr:0.000011 t:21.1s +tttg: c264/282 lr:0.000010 t:21.2s +tttg: c265/282 lr:0.000009 t:21.3s +tttg: c266/282 lr:0.000008 t:21.3s +tttg: c267/282 lr:0.000007 t:21.4s +tttg: c268/282 lr:0.000006 t:21.5s +tttg: c269/282 lr:0.000005 t:21.6s +tttg: c270/282 lr:0.000004 t:21.7s +tttg: c271/282 lr:0.000004 t:21.8s +tttg: c272/282 lr:0.000003 t:21.8s +tttg: c273/282 lr:0.000003 t:21.9s +tttg: c274/282 lr:0.000002 t:22.0s +tttg: c275/282 lr:0.000002 t:22.1s +tttg: c276/282 lr:0.000001 t:22.2s +tttg: c277/282 lr:0.000001 t:22.3s +tttg: c278/282 lr:0.000000 t:22.3s +tttg: c279/282 lr:0.000000 t:22.4s +tttg: c280/282 lr:0.000000 t:22.5s +tttg: c281/282 lr:0.000000 t:22.6s +ttpr: phase:3/3 t:422.1s +ttp: b734/782 bl:2.3674 bb:1.0768 rl:2.3794 rb:1.0750 dl:2404-2435 gd:1 +ttp: b722/782 bl:2.3905 bb:1.0528 rl:2.3801 rb:1.0736 dl:2111-2131 gd:1 +ttp: b714/782 bl:2.3970 bb:1.0289 rl:2.3810 rb:1.0710 dl:1968-1985 gd:1 +ttp: b705/782 bl:2.4372 bb:1.0733 rl:2.3838 rb:1.0711 dl:1839-1851 gd:1 +ttp: b701/782 bl:2.4124 bb:1.0423 rl:2.3851 rb:1.0698 dl:1791-1802 gd:1 +ttp: b692/782 bl:2.3560 bb:1.0752 rl:2.3839 rb:1.0700 dl:1695-1703 gd:1 +ttp: b685/782 bl:2.4347 bb:1.0392 rl:2.3858 rb:1.0688 dl:1625-1635 gd:1 +ttp: b672/782 bl:2.3820 bb:1.0765 rl:2.3857 rb:1.0690 dl:1515-1523 gd:1 +ttp: b667/782 bl:2.5318 bb:1.0744 rl:2.3904 rb:1.0692 dl:1477-1486 gd:1 +ttp: b663/782 bl:2.4210 bb:1.0413 rl:2.3914 rb:1.0683 dl:1450-1456 gd:1 +ttp: b651/782 bl:2.3524 bb:1.0009 rl:2.3903 rb:1.0663 dl:1372-1377 gd:1 +ttp: b645/782 bl:2.3520 bb:1.0326 rl:2.3892 rb:1.0654 dl:1333-1339 gd:1 +ttp: b639/782 bl:2.3221 bb:1.0275 rl:2.3875 rb:1.0644 dl:1299-1304 gd:1 +ttp: b625/782 bl:2.4403 bb:1.0293 rl:2.3888 rb:1.0636 dl:1223-1229 gd:1 +ttp: b618/782 bl:2.4165 bb:1.0242 rl:2.3894 rb:1.0627 dl:1187-1192 gd:1 +ttp: b613/782 bl:2.3503 bb:1.0501 rl:2.3886 rb:1.0624 dl:1161-1166 gd:1 +ttp: b603/782 bl:2.4319 bb:1.0702 rl:2.3894 rb:1.0626 dl:1117-1121 gd:1 +ttp: b594/782 bl:2.3072 bb:1.0289 rl:2.3879 rb:1.0619 dl:1079-1084 gd:1 +ttp: b587/782 bl:2.3061 bb:1.0044 rl:2.3864 rb:1.0608 dl:1051-1055 gd:1 +ttp: b582/782 bl:2.2476 bb:0.9985 rl:2.3840 rb:1.0598 dl:1030-1034 gd:1 +ttp: b573/782 bl:2.3242 bb:1.0302 rl:2.3830 rb:1.0593 dl:996-1000 gd:1 +ttp: b564/782 bl:2.3219 bb:1.0332 rl:2.3820 rb:1.0589 dl:966-969 gd:1 +ttp: b555/782 bl:2.4540 bb:1.0928 rl:2.3831 rb:1.0594 dl:936-939 gd:1 +ttp: b545/782 bl:2.3586 bb:1.0370 rl:2.3827 rb:1.0590 dl:905-908 gd:1 +ttp: b542/782 bl:2.5102 bb:1.0747 rl:2.3845 rb:1.0593 dl:896-899 gd:1 +ttp: b534/782 bl:2.4155 bb:1.0479 rl:2.3849 rb:1.0591 dl:871-874 gd:1 +ttp: b522/782 bl:2.3249 bb:1.0143 rl:2.3842 rb:1.0585 dl:837-839 gd:1 +ttp: b513/782 bl:2.4530 bb:1.0512 rl:2.3850 rb:1.0584 dl:811-815 gd:1 +ttp: b510/782 bl:2.3824 bb:1.0403 rl:2.3850 rb:1.0582 dl:803-806 gd:1 +ttp: b502/782 bl:2.3803 bb:1.0363 rl:2.3849 rb:1.0580 dl:782-785 gd:1 +ttp: b493/782 bl:2.4019 bb:1.0708 rl:2.3851 rb:1.0581 dl:759-762 gd:1 +ttp: b485/782 bl:2.4179 bb:1.0275 rl:2.3855 rb:1.0578 
dl:741-743 gd:1 +ttp: b478/782 bl:2.3381 bb:1.0275 rl:2.3850 rb:1.0574 dl:724-726 gd:1 +ttp: b468/782 bl:2.4922 bb:1.0684 rl:2.3860 rb:1.0576 dl:701-704 gd:1 +ttp: b461/782 bl:2.3216 bb:1.0090 rl:2.3854 rb:1.0571 dl:686-689 gd:1 +ttp: b454/782 bl:2.3855 bb:1.0651 rl:2.3854 rb:1.0572 dl:672-675 gd:1 +ttp: b447/782 bl:2.3066 bb:1.0210 rl:2.3847 rb:1.0568 dl:657-660 gd:1 +ttp: b441/782 bl:2.3813 bb:1.0372 rl:2.3847 rb:1.0567 dl:646-647 gd:1 +ttp: b434/782 bl:2.3747 bb:1.0382 rl:2.3846 rb:1.0565 dl:632-634 gd:1 +ttp: b426/782 bl:2.3908 bb:1.0464 rl:2.3846 rb:1.0564 dl:617-619 gd:1 +ttp: b418/782 bl:2.3562 bb:1.0551 rl:2.3844 rb:1.0564 dl:601-603 gd:1 +ttp: b410/782 bl:2.3628 bb:1.0601 rl:2.3843 rb:1.0564 dl:587-589 gd:1 +ttp: b402/782 bl:2.4008 bb:1.0644 rl:2.3844 rb:1.0565 dl:571-573 gd:1 +ttp: b394/782 bl:2.3830 bb:1.0595 rl:2.3844 rb:1.0565 dl:557-559 gd:1 +ttp: b386/782 bl:2.4488 bb:1.1000 rl:2.3848 rb:1.0568 dl:544-546 gd:1 +ttp: b378/782 bl:2.3926 bb:1.0613 rl:2.3849 rb:1.0568 dl:530-532 gd:1 +ttp: b368/782 bl:2.4216 bb:1.0516 rl:2.3851 rb:1.0568 dl:514-516 gd:1 +ttp: b360/782 bl:2.3785 bb:1.0363 rl:2.3851 rb:1.0567 dl:501-502 gd:1 +ttp: b352/782 bl:2.3243 bb:1.0400 rl:2.3847 rb:1.0566 dl:487-489 gd:1 +ttp: b344/782 bl:2.4123 bb:1.0770 rl:2.3849 rb:1.0567 dl:476-477 gd:1 +ttp: b336/782 bl:2.4173 bb:1.1041 rl:2.3850 rb:1.0570 dl:464-465 gd:1 +ttp: b328/782 bl:2.3778 bb:1.0936 rl:2.3850 rb:1.0572 dl:452-454 gd:1 +ttp: b320/782 bl:2.4853 bb:1.0932 rl:2.3855 rb:1.0573 dl:440-442 gd:1 +ttp: b312/782 bl:2.3831 bb:1.0716 rl:2.3855 rb:1.0574 dl:429-430 gd:1 +ttp: b304/782 bl:2.3670 bb:1.0730 rl:2.3854 rb:1.0575 dl:417-418 gd:1 +ttp: b296/782 bl:2.4368 bb:1.0956 rl:2.3857 rb:1.0577 dl:405-407 gd:1 +ttp: b288/782 bl:2.3967 bb:1.0730 rl:2.3857 rb:1.0578 dl:394-395 gd:1 +ttp: b280/782 bl:2.3526 bb:1.0754 rl:2.3856 rb:1.0578 dl:383-384 gd:1 +ttp: b272/782 bl:2.4180 bb:1.0879 rl:2.3857 rb:1.0580 dl:372-373 gd:1 +ttp: b264/782 bl:2.3815 bb:1.0634 rl:2.3857 rb:1.0580 dl:362-364 gd:1 +ttp: b256/782 bl:2.4677 bb:1.1145 rl:2.3861 rb:1.0582 dl:353-354 gd:1 +ttp: b248/782 bl:2.4100 bb:1.0636 rl:2.3862 rb:1.0582 dl:343-344 gd:1 +ttp: b240/782 bl:2.4853 bb:1.1303 rl:2.3865 rb:1.0585 dl:333-334 gd:1 +ttp: b232/782 bl:2.3841 bb:1.1060 rl:2.3865 rb:1.0587 dl:323-325 gd:1 +ttp: b224/782 bl:2.5271 bb:1.0990 rl:2.3870 rb:1.0588 dl:314-315 gd:1 +ttp: b216/782 bl:2.3947 bb:1.1122 rl:2.3871 rb:1.0590 dl:305-306 gd:1 +ttp: b208/782 bl:2.4384 bb:1.0860 rl:2.3873 rb:1.0591 dl:297-298 gd:1 +ttp: b200/782 bl:2.4524 bb:1.1412 rl:2.3875 rb:1.0594 dl:289-290 gd:1 +ttp: b191/782 bl:2.5307 bb:1.1373 rl:2.3879 rb:1.0596 dl:279-280 gd:1 +ttp: b183/782 bl:2.4141 bb:1.1052 rl:2.3880 rb:1.0598 dl:270-271 gd:1 +ttp: b175/782 bl:2.3752 bb:1.1052 rl:2.3880 rb:1.0599 dl:263-264 gd:1 +ttp: b167/782 bl:2.5252 bb:1.1330 rl:2.3884 rb:1.0601 dl:255-256 gd:1 +ttp: b159/782 bl:2.4373 bb:1.1630 rl:2.3885 rb:1.0604 dl:248-249 gd:1 +ttp: b151/782 bl:2.4520 bb:1.1281 rl:2.3887 rb:1.0605 dl:240-241 gd:1 +ttp: b144/782 bl:2.5022 bb:1.1477 rl:2.3890 rb:1.0608 dl:233-234 gd:1 +ttp: b136/782 bl:2.5561 bb:1.1444 rl:2.3894 rb:1.0610 dl:226-227 gd:1 +ttp: b128/782 bl:2.5109 bb:1.1607 rl:2.3897 rb:1.0612 dl:219-220 gd:1 +ttp: b121/782 bl:2.3996 bb:1.1043 rl:2.3897 rb:1.0613 dl:213-213 gd:1 +ttp: b110/782 bl:2.4613 bb:1.1398 rl:2.3899 rb:1.0615 dl:203-204 gd:1 +ttp: b102/782 bl:2.5343 bb:1.1740 rl:2.3902 rb:1.0617 dl:196-197 gd:1 +ttp: b94/782 bl:2.5586 bb:1.2029 rl:2.3906 rb:1.0620 dl:189-190 gd:1 +ttp: b87/782 bl:2.4751 bb:1.1480 rl:2.3907 rb:1.0622 
dl:183-183 gd:1 +ttp: b78/782 bl:2.5177 bb:1.1442 rl:2.3910 rb:1.0623 dl:174-175 gd:1 +ttp: b70/782 bl:2.5541 bb:1.1931 rl:2.3913 rb:1.0625 dl:168-169 gd:1 +ttp: b61/782 bl:2.5875 bb:1.1754 rl:2.3916 rb:1.0627 dl:160-161 gd:1 +ttp: b54/782 bl:2.7009 bb:1.2552 rl:2.3921 rb:1.0630 dl:153-154 gd:1 +ttp: b46/782 bl:2.5918 bb:1.2195 rl:2.3925 rb:1.0633 dl:146-147 gd:1 +ttp: b35/782 bl:2.5770 bb:1.1980 rl:2.3927 rb:1.0635 dl:135-136 gd:1 +ttp: b27/782 bl:2.7319 bb:1.2702 rl:2.3932 rb:1.0637 dl:127-128 gd:1 +ttp: b19/782 bl:2.7064 bb:1.2268 rl:2.3936 rb:1.0640 dl:118-119 gd:1 +ttp: b11/782 bl:2.6895 bb:1.1895 rl:2.3940 rb:1.0641 dl:106-108 gd:1 +ttp: b3/782 bl:2.7254 bb:1.1913 rl:2.3943 rb:1.0642 dl:88-91 gd:1 +quantized_ttt_phased val_loss:2.38257182 val_bpb:1.06226648 eval_time:517991ms +total_eval_time:518.0s diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed1.log b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed1.log new file mode 100644 index 0000000000..c77155eed4 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed1.log @@ -0,0 +1,4797 @@ +==================================================================================================== +Hyperparameters: + adam_eps: 1e-08 + adam_wd: 0.02 + artifact_dir: logs + attn_clip_sigmas: 13.0 + attn_out_gate_enabled: False + attn_out_gate_src: proj + beta1: 0.9 + beta2: 0.99 + build_seconds: 600 + caseops_enabled: True + compressor: pergroup + data_dir: /workspace/SOTA_FINAL/data + datasets_dir: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved + distributed: True + ema_decay: 0.9965 + embed_bits: 7 + embed_clip_sigmas: 14.0 + embed_lr: 0.6 + embed_wd: 0.085 + enable_looping_at: 0.45 + eval_seconds: 600 + eval_seq_len: 2048 + eval_stride: 64 + fused_ce_enabled: True + gate_window: 12 + gated_attn_enabled: False + gated_attn_init_std: 0.01 + gated_attn_quant_gate: True + global_ttt_batch_seqs: 32 + global_ttt_chunk_tokens: 32768 + global_ttt_epochs: 1 + global_ttt_grad_clip: 1.0 + global_ttt_lr: 0.001 + global_ttt_momentum: 0.9 + global_ttt_respect_doc_boundaries: True + global_ttt_warmup_chunks: 0 + global_ttt_warmup_start_lr: 0.0 + gptq_calibration_batches: 16 + gptq_reserve_seconds: 0.5 + grad_accum_steps: 1 + grad_clip_norm: 0.3 + hypothesis: Seed repeat of the clean SP10240 CaseOps MLP3.75 late045 standard 8x submission candidate, changing only seed/run identity. 
+ is_main_process: True + iterations: 20000 + ln_scale: True + local_rank: 0 + logfile: logs/pr1855_sp10240_caseops_mlp375_late045_seed1_8x.txt + logit_softcap: 30.0 + loop_end: 5 + loop_start: 3 + lqer_asym_enabled: True + lqer_asym_group: 64 + lqer_enabled: True + lqer_factor_bits: 4 + lqer_rank: 4 + lqer_top_k: 3 + matrix_bits: 6 + matrix_clip_sigmas: 12.85 + matrix_lr: 0.026 + max_wallclock_seconds: 600.0 + min_lr: 0.1 + mlp_clip_sigmas: 11.5 + mlp_mult: 3.75 + model_dim: 512 + model_path: logs/final_model.pt + muon_backend_steps: 5 + muon_momentum: 0.97 + muon_momentum_warmup_start: 0.92 + muon_momentum_warmup_steps: 1500 + muon_row_normalize: True + muon_wd: 0.095 + num_heads: 8 + num_kv_heads: 4 + num_layers: 11 + num_loops: 2 + parallel_final_lane: mean + parallel_start_layer: 8 + parent_run: 2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x + phased_ttt_num_phases: 3 + phased_ttt_prefix_docs: 2500 + qk_gain_init: 5.25 + quantized_model_path: logs/final_model.int6.ptz + rank: 0 + rope_base: 10000.0 + rope_dims: 16 + rope_train_seq_len: 2048 + rope_yarn: False + run_id: pr1855_sp10240_caseops_mlp375_late045_seed1_8x + run_kind: seed_repeat + run_label: standard_8x + scalar_lr: 0.02 + seed: 1 + size_cap_bytes: 16000000 + skip_gates_enabled: True + smear_gate_enabled: True + source_parent: legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py + source_parent_sha256: 454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b + source_tokenizer_lane: scripts/prepare_sp10240_caseops_data.py + sparse_attn_gate_enabled: True + sparse_attn_gate_init_std: 0.0 + sparse_attn_gate_scale: 0.5 + test_date: 2026-05-01 + test_id: 2026-05-01_pr1855_sp10240_caseops_mlp375_late045_seed1_8x + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/tokenizers/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.99 + ttt_chunk_size: 48 + ttt_enabled: True + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 80 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 0.5 + val_batch_tokens: 524288 + val_bytes_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 0 + vocab_size: 10240 + warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +==================================================================================================== +Source code: +==================================================================================================== +import base64, collections, copy, fcntl, glob, io, lzma, math, os +from pathlib import Path +import random, re, subprocess, sys, time, uuid, numpy as np, sentencepiece as spm, torch, torch.distributed as dist, torch.nn.functional as F +from torch import Tensor, nn +from flash_attn_interface import ( + flash_attn_func as 
flash_attn_3_func, + flash_attn_varlen_func, +) +from concurrent.futures import ThreadPoolExecutor +import triton +import triton.language as tl +from triton.tools.tensor_descriptor import TensorDescriptor + + +# ===== Fused softcapped cross-entropy (Triton) — training-only path ===== +# Replaces the eager +# logits_softcap = softcap * tanh(logits / softcap) +# F.cross_entropy(logits_softcap.float(), targets, reduction="mean") +# sequence with a single fused kernel that reads logits_proj once, applies +# softcap in-register, and computes (LSE, loss) in one streaming pass. The +# backward kernel mirrors the forward so there's no stored softcapped logits. +# Numerically identical to the eager path up to fp32 accumulation differences. +_FUSED_CE_LIBRARY = "pgsubmission1draft7fusedce" +_FUSED_CE_BLOCK_SIZE = 1024 +_FUSED_CE_NUM_WARPS = 4 + + +@triton.jit +def _softcapped_ce_fwd_kernel( + logits_ptr, losses_ptr, lse_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + max_val = -float("inf") + sum_exp = 0.0 + A = 2.0 * softcap + inv_C = 2.0 / softcap + for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=-float("inf"), + ).to(tl.float32) + z = A * tl.sigmoid(val * inv_C) + z = tl.where(mask, z, -float("inf")) + curr_max = tl.max(z, axis=0) + new_max = tl.maximum(max_val, curr_max) + sum_exp = sum_exp * tl.exp(max_val - new_max) + tl.sum(tl.exp(z - new_max), axis=0) + max_val = new_max + lse = max_val + tl.log(sum_exp) + tl.store(lse_ptr + row_idx, lse) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + target_val = tl.load(logits_row_ptr + target * stride_logits_v).to(tl.float32) + target_z = A * tl.sigmoid(target_val * inv_C) + tl.store(losses_ptr + row_idx, lse - target_z) + + +@triton.jit +def _softcapped_ce_bwd_kernel( + grad_logits_ptr, grad_losses_ptr, lse_ptr, logits_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + stride_grad_n, stride_grad_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + grad_row_ptr = grad_logits_ptr + row_idx * stride_grad_n + lse = tl.load(lse_ptr + row_idx) + grad_loss = tl.load(grad_losses_ptr + row_idx).to(tl.float32) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + A = 2.0 * softcap + inv_C = 2.0 / softcap + dz_dx_scale = A * inv_C + for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=0.0, + ).to(tl.float32) + sigmoid_u = tl.sigmoid(val * inv_C) + z = A * sigmoid_u + probs = tl.exp(z - lse) + grad_z = grad_loss * (probs - tl.where(cols == target, 1.0, 0.0)) + grad_x = grad_z * (dz_dx_scale * sigmoid_u * (1.0 - sigmoid_u)) + tl.store(grad_row_ptr + cols * stride_grad_v, grad_x, mask=mask) + + +def _validate_softcapped_ce_inputs( + logits: Tensor, targets: Tensor, softcap: float, +) -> tuple[Tensor, Tensor]: + if logits.ndim != 2: + raise ValueError(f"Expected logits.ndim=2, got {logits.ndim}") + if targets.ndim != 1: + raise ValueError(f"Expected targets.ndim=1, got {targets.ndim}") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} 
targets={tuple(targets.shape)}" + ) + if not logits.is_cuda or not targets.is_cuda: + raise ValueError("softcapped_cross_entropy requires CUDA tensors") + if softcap <= 0.0: + raise ValueError(f"softcap must be positive, got {softcap}") + if logits.dtype not in (torch.float16, torch.bfloat16, torch.float32): + raise ValueError(f"Unsupported logits dtype: {logits.dtype}") + logits = logits.contiguous() + targets = targets.contiguous() + if targets.dtype != torch.int64: + targets = targets.to(dtype=torch.int64) + return logits, targets + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce", mutates_args=()) +def softcapped_ce_op(logits: Tensor, targets: Tensor, softcap: float) -> tuple[Tensor, Tensor]: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + n_rows, n_cols = logits.shape + losses = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + lse = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + _softcapped_ce_fwd_kernel[(n_rows,)]( + logits, losses, lse, targets, + logits.stride(0), logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return losses, lse + + +@softcapped_ce_op.register_fake +def _(logits: Tensor, targets: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1: + raise ValueError("softcapped_ce fake impl expects 2D logits and 1D targets") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} targets={tuple(targets.shape)}" + ) + n_rows = logits.shape[0] + return ( + logits.new_empty((n_rows,), dtype=torch.float32), + logits.new_empty((n_rows,), dtype=torch.float32), + ) + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce_backward", mutates_args=()) +def softcapped_ce_backward_op( + logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float, +) -> Tensor: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + lse = lse.contiguous() + grad_losses = grad_losses.contiguous().to(dtype=torch.float32) + if lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("Expected 1D lse and grad_losses") + if lse.shape[0] != logits.shape[0] or grad_losses.shape[0] != logits.shape[0]: + raise ValueError( + f"Expected row-aligned lse/grad_losses, got logits={tuple(logits.shape)} " + f"lse={tuple(lse.shape)} grad_losses={tuple(grad_losses.shape)}" + ) + grad_logits = torch.empty_like(logits) + n_rows, n_cols = logits.shape + _softcapped_ce_bwd_kernel[(n_rows,)]( + grad_logits, grad_losses, lse, logits, targets, + logits.stride(0), logits.stride(1), + grad_logits.stride(0), grad_logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return grad_logits + + +@softcapped_ce_backward_op.register_fake +def _(logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1 or lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("softcapped_ce_backward fake impl expects 2D logits and 1D row tensors") + if ( + logits.shape[0] != targets.shape[0] + or logits.shape[0] != lse.shape[0] + or logits.shape[0] != grad_losses.shape[0] + ): + raise ValueError("softcapped_ce_backward fake impl expects row-aligned tensors") + return logits.new_empty(logits.shape) + + +def _softcapped_ce_setup_context( + ctx: torch.autograd.function.FunctionCtx, inputs, output, +) -> None: + 
logits, targets, softcap = inputs + _losses, lse = output + ctx.save_for_backward(logits, targets, lse) + ctx.softcap = float(softcap) + + +def _softcapped_ce_backward( + ctx: torch.autograd.function.FunctionCtx, grad_losses: Tensor, grad_lse: "Tensor | None", +): + del grad_lse + logits, targets, lse = ctx.saved_tensors + grad_logits = torch.ops.pgsubmission1draft7fusedce.softcapped_ce_backward( + logits, targets, lse, grad_losses, ctx.softcap + ) + return grad_logits, None, None + + +softcapped_ce_op.register_autograd( + _softcapped_ce_backward, setup_context=_softcapped_ce_setup_context, +) + + +def softcapped_cross_entropy( + logits: Tensor, targets: Tensor, softcap: float, reduction: str = "mean", +) -> Tensor: + losses, _lse = torch.ops.pgsubmission1draft7fusedce.softcapped_ce( + logits, targets, float(softcap) + ) + if reduction == "none": + return losses + if reduction == "sum": + return losses.sum() + if reduction == "mean": + return losses.mean() + raise ValueError(f"Unsupported reduction={reduction!r}") + + +class Hyperparameters: + data_dir = os.environ.get("DATA_DIR", "./data/") + seed = int(os.environ.get("SEED", 1337)) + run_id = os.environ.get("RUN_ID", str(uuid.uuid4())) + iterations = int(os.environ.get("ITERATIONS", 20000)) + warmdown_frac = float(os.environ.get("WARMDOWN_FRAC", 0.75)) + warmup_steps = int(os.environ.get("WARMUP_STEPS", 20)) + train_batch_tokens = int(os.environ.get("TRAIN_BATCH_TOKENS", 786432)) + # Fused softcapped CE (Triton). Training-only — forward_logits eval path still uses + # eager softcap+F.cross_entropy. Default ON since validated as at-worst neutral. + fused_ce_enabled = bool(int(os.environ.get("FUSED_CE_ENABLED", "1"))) + train_seq_len = int(os.environ.get("TRAIN_SEQ_LEN", 2048)) + train_log_every = int(os.environ.get("TRAIN_LOG_EVERY", 500)) + max_wallclock_seconds = float(os.environ.get("MAX_WALLCLOCK_SECONDS", 6e2)) + val_batch_tokens = int(os.environ.get("VAL_BATCH_TOKENS", 524288)) + eval_seq_len = int(os.environ.get("EVAL_SEQ_LEN", 2048)) + val_loss_every = int(os.environ.get("VAL_LOSS_EVERY", 4000)) + vocab_size = int(os.environ.get("VOCAB_SIZE", 8192)) + num_layers = int(os.environ.get("NUM_LAYERS", 11)) + xsa_last_n = int(os.environ.get("XSA_LAST_N", 11)) + model_dim = int(os.environ.get("MODEL_DIM", 512)) + num_kv_heads = int(os.environ.get("NUM_KV_HEADS", 4)) + num_heads = int(os.environ.get("NUM_HEADS", 8)) + mlp_mult = float(os.environ.get("MLP_MULT", 4.0)) + skip_gates_enabled = bool(int(os.environ.get("SKIP_GATES_ENABLED", "1"))) + tie_embeddings = bool(int(os.environ.get("TIE_EMBEDDINGS", "1"))) + logit_softcap = float(os.environ.get("LOGIT_SOFTCAP", 3e1)) + rope_base = float(os.environ.get("ROPE_BASE", 1e4)) + rope_dims = int(os.environ.get("ROPE_DIMS", 16)) + rope_train_seq_len = int(os.environ.get("ROPE_TRAIN_SEQ_LEN", 2048)) + rope_yarn = bool(int(os.environ.get("ROPE_YARN", "0"))) + ln_scale = bool(int(os.environ.get("LN_SCALE", "1"))) + qk_gain_init = float(os.environ.get("QK_GAIN_INIT", 5.0)) + num_loops = int(os.environ.get("NUM_LOOPS", 2)) + loop_start = int(os.environ.get("LOOP_START", 3)) + loop_end = int(os.environ.get("LOOP_END", 5)) + enable_looping_at = float(os.environ.get("ENABLE_LOOPING_AT", 0.35)) + parallel_start_layer = int(os.environ.get("PARALLEL_START_LAYER", 8)) + parallel_final_lane = os.environ.get("PARALLEL_FINAL_LANE", "mean") + min_lr = float(os.environ.get("MIN_LR", 0.0)) + embed_lr = float(os.environ.get("EMBED_LR", 0.6)) + tied_embed_lr = float(os.environ.get("TIED_EMBED_LR", 0.03)) + 
tied_embed_init_std = float(os.environ.get("TIED_EMBED_INIT_STD", 0.005)) + matrix_lr = float(os.environ.get("MATRIX_LR", 0.026)) + scalar_lr = float(os.environ.get("SCALAR_LR", 0.02)) + muon_momentum = float(os.environ.get("MUON_MOMENTUM", 0.97)) + muon_backend_steps = int(os.environ.get("MUON_BACKEND_STEPS", 5)) + muon_momentum_warmup_start = float( + os.environ.get("MUON_MOMENTUM_WARMUP_START", 0.92) + ) + muon_momentum_warmup_steps = int(os.environ.get("MUON_MOMENTUM_WARMUP_STEPS", 1500)) + muon_row_normalize = bool(int(os.environ.get("MUON_ROW_NORMALIZE", "1"))) + beta1 = float(os.environ.get("BETA1", 0.9)) + beta2 = float(os.environ.get("BETA2", 0.95)) + adam_eps = float(os.environ.get("ADAM_EPS", 1e-08)) + grad_clip_norm = float(os.environ.get("GRAD_CLIP_NORM", 0.3)) + eval_stride = int(os.environ.get("EVAL_STRIDE", 64)) + adam_wd = float(os.environ.get("ADAM_WD", 0.02)) + muon_wd = float(os.environ.get("MUON_WD", 0.095)) + embed_wd = float(os.environ.get("EMBED_WD", 0.085)) + ema_decay = float(os.environ.get("EMA_DECAY", 0.9965)) + ttt_enabled = bool(int(os.environ.get("TTT_ENABLED", "1"))) + ttt_lora_rank = int(os.environ.get("TTT_LORA_RANK", 96)) + ttt_lora_lr = float(os.environ.get("TTT_LORA_LR", 0.0001)) + ttt_chunk_size = int(os.environ.get("TTT_CHUNK_SIZE", 48)) + ttt_eval_seq_len = int(os.environ.get("TTT_EVAL_SEQ_LEN", 2048)) + ttt_batch_size = int(os.environ.get("TTT_BATCH_SIZE", 64)) + ttt_grad_steps = int(os.environ.get("TTT_GRAD_STEPS", 1)) + ttt_weight_decay = float(os.environ.get("TTT_WEIGHT_DECAY", 1.0)) + ttt_beta1 = float(os.environ.get("TTT_BETA1", 0)) + ttt_beta2 = float(os.environ.get("TTT_BETA2", 0.999)) + ttt_k_lora = bool(int(os.environ.get("TTT_K_LORA", "1"))) + ttt_mlp_lora = bool(int(os.environ.get("TTT_MLP_LORA", "1"))) + ttt_o_lora = bool(int(os.environ.get("TTT_O_LORA", "1"))) + ttt_optimizer = os.environ.get("TTT_OPTIMIZER", "adam") + ttt_eval_batches = os.environ.get("TTT_EVAL_BATCHES", "") + val_doc_fraction = float(os.environ.get("VAL_DOC_FRACTION", 1.0)) + compressor = os.environ.get("COMPRESSOR", "brotli") + gptq_calibration_batches = int(os.environ.get("GPTQ_CALIBRATION_BATCHES", 16)) + gptq_reserve_seconds = float(os.environ.get("GPTQ_RESERVE_SECONDS", 4.0)) + phased_ttt_prefix_docs = int(os.environ.get("PHASED_TTT_PREFIX_DOCS", 2000)) + phased_ttt_num_phases = int(os.environ.get("PHASED_TTT_NUM_PHASES", 1)) + global_ttt_lr = float(os.environ.get("GLOBAL_TTT_LR", 0.001)) + global_ttt_momentum = float(os.environ.get("GLOBAL_TTT_MOMENTUM", 0.9)) + global_ttt_epochs = int(os.environ.get("GLOBAL_TTT_EPOCHS", 1)) + global_ttt_chunk_tokens = int(os.environ.get("GLOBAL_TTT_CHUNK_TOKENS", 32768)) + global_ttt_batch_seqs = int(os.environ.get("GLOBAL_TTT_BATCH_SEQS", 32)) + global_ttt_warmup_start_lr = float(os.environ.get("GLOBAL_TTT_WARMUP_START_LR", 0.0)) + global_ttt_warmup_chunks = int(os.environ.get("GLOBAL_TTT_WARMUP_CHUNKS", 0)) + global_ttt_grad_clip = float(os.environ.get("GLOBAL_TTT_GRAD_CLIP", 1.0)) + global_ttt_respect_doc_boundaries = bool(int(os.environ.get("GLOBAL_TTT_RESPECT_DOC_BOUNDARIES", "1"))) + matrix_bits = int(os.environ.get("MATRIX_BITS", 6)) + embed_bits = int(os.environ.get("EMBED_BITS", 8)) + matrix_clip_sigmas = float(os.environ.get("MATRIX_CLIP_SIGMAS", 12.85)) + embed_clip_sigmas = float(os.environ.get("EMBED_CLIP_SIGMAS", 2e1)) + mlp_clip_sigmas = float(os.environ.get("MLP_CLIP_SIGMAS", 10.0)) + attn_clip_sigmas = float(os.environ.get("ATTN_CLIP_SIGMAS", 13.0)) + # AttnOutGate (per-head multiplicative output gate, PR #1667 
MarioPaerle).
+    # Zero-init weight: 2*sigmoid(0)=1 -> transparent at start. Source defaults to
+    # block input x ('proj'); 'q' uses raw Q projection output.
+    attn_out_gate_enabled = bool(int(os.environ.get("ATTN_OUT_GATE_ENABLED", "0")))
+    attn_out_gate_src = os.environ.get("ATTN_OUT_GATE_SRC", "proj")
+    # SmearGate (input-dependent forward-1 token smear, modded-nanogpt @classiclarryd
+    # via PR #1667). x_t <- x_t + lam * sigmoid(W*x_t[:gate_window]) * x_{t-1}.
+    # lam=0 + W=0 -> transparent at init.
+    smear_gate_enabled = bool(int(os.environ.get("SMEAR_GATE_ENABLED", "0")))
+    # Window: first GATE_WINDOW dims of the source feed the gate projection.
+    gate_window = int(os.environ.get("GATE_WINDOW", 12))
+    # Gated Attention (Qwen, NeurIPS 2025 Best Paper, arXiv:2505.06708;
+    # qiuzh20/gated_attention). Per-head sigmoid gate on SDPA output, BEFORE
+    # out_proj. Gate input = full block input x (paper's headwise G1 variant
+    # driven from hidden_states). W_g shape (num_heads, dim), plain sigmoid.
+    # Near-zero init gives g~0.5 at step 0 (half attention output); per-block
+    # attn_scale (init 1.0) compensates during training. Name contains
+    # "attn_gate" so CONTROL_TENSOR_NAME_PATTERNS routes it to scalar AdamW.
+    gated_attn_enabled = bool(int(os.environ.get("GATED_ATTN_ENABLED", "0")))
+    gated_attn_init_std = float(os.environ.get("GATED_ATTN_INIT_STD", 0.01))
+    # Dedicated int8-per-row quantization for `attn_gate_w` tensors. These are
+    # small ((num_heads, dim) = (8, 512) = 4096 params) and bypass GPTQ via the
+    # numel<=65536 passthrough branch -> stored as fp16 (8 KB/layer, ~65 KB total
+    # compressed). int8-per-row cuts the raw tensor in half with negligible BPB
+    # impact: scales per head (8 values), symmetric quant over [-127, 127].
+    # No Hessian needed (gate weights not in collect_hessians()).
+    gated_attn_quant_gate = bool(int(os.environ.get("GATED_ATTN_QUANT_GATE", "0")))
+    # Sparse Attention Gate (modded-nanogpt-style). Keeps dense SDPA and only
+    # swaps the output-gate input to the first GATE_WINDOW residual dims.
+    # W_g: (num_heads, gate_window) = (8, 12) = 96 params/layer (~1K total),
+    # vs dense GatedAttn's (8, 512) = 4K/layer (~44K total, so ~43K saved).
+    # Name "attn_gate_w" is shared so quant routing and the int8 gate
+    # passthrough (GATED_ATTN_QUANT_GATE=1) Just Work.
+    # Mutually exclusive with ATTN_OUT_GATE_ENABLED and GATED_ATTN_ENABLED.
+    sparse_attn_gate_enabled = bool(int(os.environ.get("SPARSE_ATTN_GATE_ENABLED", "0")))
+    sparse_attn_gate_init_std = float(os.environ.get("SPARSE_ATTN_GATE_INIT_STD", 0.0))
+    sparse_attn_gate_scale = float(os.environ.get("SPARSE_ATTN_GATE_SCALE", 1.0))
+    # LQER asymmetric rank-k correction on top-K quant-error tensors (PR #1530 v2 port).
+    # Computes SVD of E = W_fp - W_quant, packs top-r A,B as INT2/INT4 (asym) or INTk (sym).
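+    # Worked sketch of that correction (reviewer aid, never executed in this run;
+    # the names W_fp, W_quant, and r are illustrative, not tensors in this file):
+    #   E = W_fp - W_quant                        # quantization error, fp32
+    #   U, S, Vh = torch.linalg.svd(E, full_matrices=False)
+    #   A = U[:, :r] * S[:r]                      # (out, r) factor, stored low-bit
+    #   B = Vh[:r, :]                             # (r, in) factor, stored low-bit
+    #   W_hat = W_quant + A @ B                   # best rank-r repair (Eckart-Young)
+    # At r=4 on a 512x512 matrix the factors hold 4*(512+512) = 4096 values; at
+    # 4 bits that is ~2 KB per corrected tensor before compression.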
+ lqer_enabled = bool(int(os.environ.get("LQER_ENABLED", "1"))) + lqer_rank = int(os.environ.get("LQER_RANK", 4)) + lqer_top_k = int(os.environ.get("LQER_TOP_K", 3)) + lqer_factor_bits = int(os.environ.get("LQER_FACTOR_BITS", 4)) + lqer_asym_enabled = bool(int(os.environ.get("LQER_ASYM_ENABLED", "1"))) + lqer_asym_group = int(os.environ.get("LQER_ASYM_GROUP", "64")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + rank = int(os.environ.get("RANK", "0")) + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + is_main_process = rank == 0 + grad_accum_steps = 8 // world_size + # CaseOps integration: optional override of dataset root + tokenizer path. + # When CASEOPS_ENABLED=1, the wrapper loads a per-token byte sidecar + # (fineweb_val_bytes_*.bin, identical shard layout to val_*.bin) and uses + # it as the canonical raw-byte budget for BPB accounting. The sidecar + # REPLACES the build_sentencepiece_luts byte-counting path entirely. + caseops_enabled = bool(int(os.environ.get("CASEOPS_ENABLED", "0"))) + _default_caseops_data = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "datasets", + "fineweb10B_sp8192_lossless_caps_caseops_v1_reserved", + ) + _default_caseops_tok = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "tokenizers", + "fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model", + ) + if caseops_enabled: + datasets_dir = os.environ.get("DATA_PATH", _default_caseops_data) + tokenizer_path = os.environ.get("TOKENIZER_PATH", _default_caseops_tok) + else: + datasets_dir = os.environ.get( + "DATA_PATH", + os.path.join(data_dir, "datasets", f"fineweb10B_sp{vocab_size}"), + ) + tokenizer_path = os.environ.get( + "TOKENIZER_PATH", + os.path.join(data_dir, "tokenizers", f"fineweb_{vocab_size}_bpe.model"), + ) + train_files = os.path.join(datasets_dir, "fineweb_train_*.bin") + val_files = os.path.join(datasets_dir, "fineweb_val_*.bin") + val_bytes_files = os.path.join(datasets_dir, "fineweb_val_bytes_*.bin") + artifact_dir = os.environ.get("ARTIFACT_DIR", "") + logfile = ( + os.path.join(artifact_dir, f"{run_id}.txt") + if artifact_dir + else f"logs/{run_id}.txt" + ) + model_path = ( + os.path.join(artifact_dir, "final_model.pt") + if artifact_dir + else "final_model.pt" + ) + quantized_model_path = ( + os.path.join(artifact_dir, "final_model.int6.ptz") + if artifact_dir + else "final_model.int6.ptz" + ) + + +# ===== 2026-04-30 SP10240 CaseOps MLP3.75 late045 promoted test car ===== +# Source of truth for this new experiment. The launcher only checks files and +# calls this run.py; it does not define model or eval conditions. +TEST_ID = "2026-05-01_pr1855_sp10240_caseops_mlp375_late045_seed1_8x" +TEST_DATE = "2026-05-01" +RUN_LABEL = "standard_8x" +RUN_KIND = "seed_repeat" +SOURCE_PARENT = "legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py" +SOURCE_PARENT_SHA256 = "454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b" +SOURCE_TOKENIZER_LANE = "scripts/prepare_sp10240_caseops_data.py" +PARENT_RUN = "2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x" +HYPOTHESIS = ( + "Seed repeat of the clean SP10240 CaseOps MLP3.75 late045 standard 8x " + "submission candidate, changing only seed/run identity." 
+) +SIZE_CAP_BYTES = 16000000 +BUILD_SECONDS = 600 +EVAL_SECONDS = 600 + +Hyperparameters.test_id = TEST_ID +Hyperparameters.test_date = TEST_DATE +Hyperparameters.run_label = RUN_LABEL +Hyperparameters.run_kind = RUN_KIND +Hyperparameters.source_parent = SOURCE_PARENT +Hyperparameters.source_parent_sha256 = SOURCE_PARENT_SHA256 +Hyperparameters.source_tokenizer_lane = SOURCE_TOKENIZER_LANE +Hyperparameters.parent_run = PARENT_RUN +Hyperparameters.hypothesis = HYPOTHESIS +Hyperparameters.size_cap_bytes = SIZE_CAP_BYTES +Hyperparameters.build_seconds = BUILD_SECONDS +Hyperparameters.eval_seconds = EVAL_SECONDS + +Hyperparameters.data_dir = "/workspace/SOTA_FINAL/data" +_caseops_root = os.path.join( + Hyperparameters.data_dir, "datasets", "fineweb10B_sp10240_caseops", "datasets" +) +Hyperparameters.vocab_size = 10240 +Hyperparameters.caseops_enabled = True +Hyperparameters.datasets_dir = os.path.join( + _caseops_root, "datasets", "fineweb10B_sp10240_lossless_caps_caseops_v1_reserved" +) +Hyperparameters.train_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_train_*.bin") +Hyperparameters.val_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_*.bin") +Hyperparameters.val_bytes_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_bytes_*.bin") +Hyperparameters.tokenizer_path = os.path.join( + _caseops_root, "tokenizers", "fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model" +) + +Hyperparameters.seed = 1 +Hyperparameters.run_id = "pr1855_sp10240_caseops_mlp375_late045_seed1_8x" +Hyperparameters.artifact_dir = "logs" +Hyperparameters.logfile = os.path.join(Hyperparameters.artifact_dir, f"{Hyperparameters.run_id}.txt") +Hyperparameters.model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.pt") +Hyperparameters.quantized_model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.int6.ptz") +Hyperparameters.iterations = 20000 +Hyperparameters.max_wallclock_seconds = float(BUILD_SECONDS) +Hyperparameters.num_layers = 11 +Hyperparameters.xsa_last_n = 11 +Hyperparameters.model_dim = 512 +Hyperparameters.num_heads = 8 +Hyperparameters.num_kv_heads = 4 +Hyperparameters.mlp_mult = 3.75 +Hyperparameters.num_loops = 2 +Hyperparameters.loop_start = 3 +Hyperparameters.loop_end = 5 +Hyperparameters.enable_looping_at = 0.45 +Hyperparameters.parallel_start_layer = 8 +Hyperparameters.qk_gain_init = 5.25 +Hyperparameters.warmdown_frac = 0.85 +Hyperparameters.warmup_steps = 20 +Hyperparameters.min_lr = 0.1 +Hyperparameters.matrix_lr = 0.026 +Hyperparameters.beta2 = 0.99 +Hyperparameters.muon_backend_steps = 5 +Hyperparameters.grad_clip_norm = 0.3 +Hyperparameters.val_loss_every = 0 +Hyperparameters.ttt_enabled = True +Hyperparameters.ttt_lora_rank = 80 +Hyperparameters.ttt_chunk_size = 48 +Hyperparameters.ttt_weight_decay = 0.5 +Hyperparameters.ttt_beta2 = 0.99 +Hyperparameters.phased_ttt_prefix_docs = 2500 +Hyperparameters.phased_ttt_num_phases = 3 +Hyperparameters.global_ttt_momentum = 0.9 +Hyperparameters.compressor = "pergroup" +Hyperparameters.gptq_reserve_seconds = 0.5 +Hyperparameters.gptq_calibration_batches = 16 +Hyperparameters.matrix_bits = 6 +Hyperparameters.embed_bits = 7 +Hyperparameters.mlp_clip_sigmas = 11.5 +Hyperparameters.attn_clip_sigmas = 13.0 +Hyperparameters.embed_clip_sigmas = 14.0 +Hyperparameters.gated_attn_quant_gate = True +Hyperparameters.sparse_attn_gate_enabled = True +Hyperparameters.sparse_attn_gate_scale = 0.5 +Hyperparameters.gate_window = 12 +Hyperparameters.smear_gate_enabled = True +Hyperparameters.lqer_enabled 
= True +Hyperparameters.lqer_asym_enabled = True +Hyperparameters.lqer_rank = 4 +Hyperparameters.lqer_factor_bits = 4 +Hyperparameters.lqer_asym_group = 64 +Hyperparameters.lqer_top_k = 3 +Hyperparameters.fused_ce_enabled = True + +_logger_hparams = None + + +def set_logging_hparams(h): + global _logger_hparams + _logger_hparams = h + + +def log(msg, console=True): + if _logger_hparams is None: + print(msg) + return + if _logger_hparams.is_main_process: + if console: + print(msg) + if _logger_hparams.logfile is not None: + with open(_logger_hparams.logfile, "a", encoding="utf-8") as f: + print(msg, file=f) + + +class ValidationData: + def __init__(self, h, device): + self.sp = spm.SentencePieceProcessor(model_file=h.tokenizer_path) + if int(self.sp.vocab_size()) != h.vocab_size: + raise ValueError( + f"VOCAB_SIZE={h.vocab_size} does not match tokenizer vocab_size={int(self.sp.vocab_size())}" + ) + self.val_tokens = load_validation_tokens(h.val_files, h.eval_seq_len) + self.caseops_enabled = bool(getattr(h, "caseops_enabled", False)) + if self.caseops_enabled: + self.base_bytes_lut = None + self.has_leading_space_lut = None + self.is_boundary_token_lut = None + else: + ( + self.base_bytes_lut, + self.has_leading_space_lut, + self.is_boundary_token_lut, + ) = build_sentencepiece_luts(self.sp, h.vocab_size, device) + self.val_bytes = None + if self.caseops_enabled: + self.val_bytes = load_validation_byte_sidecar( + h.val_bytes_files, h.eval_seq_len, self.val_tokens.numel() + ) + + +def build_sentencepiece_luts(sp, vocab_size, device): + sp_vocab_size = int(sp.vocab_size()) + assert ( + sp.piece_to_id("▁") != sp.unk_id() + ), "Tokenizer must have '▁' (space) as its own token for correct BPB byte counting" + table_size = max(sp_vocab_size, vocab_size) + base_bytes_np = np.zeros((table_size,), dtype=np.int16) + has_leading_space_np = np.zeros((table_size,), dtype=np.bool_) + is_boundary_token_np = np.ones((table_size,), dtype=np.bool_) + for token_id in range(sp_vocab_size): + if sp.is_control(token_id) or sp.is_unknown(token_id) or sp.is_unused(token_id): + continue + is_boundary_token_np[token_id] = False + if sp.is_byte(token_id): + base_bytes_np[token_id] = 1 + continue + piece = sp.id_to_piece(token_id) + if piece.startswith("▁"): + has_leading_space_np[token_id] = True + piece = piece[1:] + base_bytes_np[token_id] = len(piece.encode("utf-8")) + return ( + torch.tensor(base_bytes_np, dtype=torch.int16, device=device), + torch.tensor(has_leading_space_np, dtype=torch.bool, device=device), + torch.tensor(is_boundary_token_np, dtype=torch.bool, device=device), + ) + + +def load_validation_tokens(pattern, seq_len): + # Filter out CaseOps byte sidecar shards which share the val_*.bin glob. + files = [ + Path(p) + for p in sorted(glob.glob(pattern)) + if "_bytes_" not in Path(p).name + ] + if not files: + raise FileNotFoundError(f"No files found for pattern: {pattern}") + tokens = torch.cat([load_data_shard(file) for file in files]).contiguous() + usable = (tokens.numel() - 1) // seq_len * seq_len + if usable <= 0: + raise ValueError(f"Validation split is too short for TRAIN_SEQ_LEN={seq_len}") + return tokens[: usable + 1] + + +def load_validation_byte_sidecar(pattern, seq_len, expected_len): + """Load CaseOps per-token byte sidecar(s). Same shard layout as token shards + (256 int32 header + uint16 array). Each entry = canonical raw-text byte + budget for that token in the corresponding val shard. Returns a CPU + int16 tensor sliced to match expected_len (i.e. 
val_tokens length)."""
+    files = [Path(p) for p in sorted(glob.glob(pattern))]
+    if not files:
+        raise FileNotFoundError(f"No byte sidecar files for pattern: {pattern}")
+    shards = [load_data_shard(file) for file in files]
+    # load_data_shard returns uint16 — that's exactly what the sidecar stores.
+    bytes_full = torch.cat(shards).contiguous()
+    if bytes_full.numel() < expected_len:
+        raise ValueError(
+            f"Byte sidecar too short: {bytes_full.numel()} < val_tokens {expected_len}"
+        )
+    return bytes_full[:expected_len].to(torch.int32)
+
+
+def load_data_shard(file):
+    header_bytes = 256 * np.dtype("<i4").itemsize
+    with open(file, "rb") as f:
+        header = np.frombuffer(f.read(header_bytes), dtype="<i4")
+        num_tokens = int(header[2])
+        tokens = np.frombuffer(f.read(2 * num_tokens), dtype="<u2")
+    return torch.from_numpy(tokens.copy())
+
+
+def _read_num_tokens(file):
+    header = np.fromfile(file, dtype="<i4", count=256)
+    return int(header[2])
+
+
+_shard_memmaps = {}
+
+
+def _get_shard_memmap(file):
+    mm = _shard_memmaps.get(file)
+    if mm is None:
+        mm = np.memmap(file, dtype=np.uint16, mode="r", offset=256 * np.dtype("<i4").itemsize)
+        _shard_memmaps[file] = mm
+    return mm
+
+
+BOS_ID = None
+
+
+def get_next_multiple_of_n(value, n):
+    return ((value + n - 1) // n) * n
+
+
+def _build_cu_seqlens(doc_starts, total_len, device, max_doc_len, bucket_size):
+    starts = [int(s) for s in doc_starts if 0 <= int(s) < total_len]
+    if not starts or starts[0] != 0:
+        starts = [0] + starts
+    seg_starts = []
+    for start, end in zip(starts, starts[1:] + [total_len]):
+        if max_doc_len > 0:
+            pos = start
+            while pos < end:
+                seg_starts.append(pos)
+                pos += max_doc_len
+        else:
+            seg_starts.append(start)
+    boundaries = seg_starts + [total_len]
+    padded_len = get_next_multiple_of_n(len(boundaries), bucket_size)
+    cu = torch.full((padded_len,), total_len, dtype=torch.int32, device=device)
+    cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device)
+    seg_ends = seg_starts[1:] + [total_len]
+    max_seqlen = max(end - start for start, end in zip(seg_starts, seg_ends))
+    return cu, max_seqlen
+
+class DocumentPackingLoader:
+    _shard_pool = ThreadPoolExecutor(1)
+
+    def __init__(self, h, device, cu_bucket_size=64):
+        self.rank = h.rank
+        self.world_size = h.world_size
+        self.device = device
+        self.cu_bucket_size = cu_bucket_size
+        self.max_seq_len = h.train_seq_len
+        all_files = [Path(p) for p in sorted(glob.glob(h.train_files))]
+        if not all_files:
+            raise FileNotFoundError(f"No files found for pattern: {h.train_files}")
+        self.files = all_files
+        self.file_iter = iter(self.files)
+        self._init_shard(load_data_shard(next(self.file_iter)))
+        self._next_shard = self._submit_next_shard()
+        self._batch_pool = ThreadPoolExecutor(1)
+        self._prefetch_queue = []
+
+    def _init_shard(self, tokens):
+        global BOS_ID
+        self.tokens = tokens
+        self.shard_size = tokens.numel()
+        if BOS_ID is None:
+            BOS_ID = 1
+        self.bos_idx = (
+            (tokens == BOS_ID).nonzero(as_tuple=True)[0].to(torch.int64).cpu().numpy()
+        )
+        self.cursor = int(self.bos_idx[0])
+
+    def _submit_next_shard(self):
+        try:
+            path = next(self.file_iter)
+            return self._shard_pool.submit(load_data_shard, path)
+        except StopIteration:
+            return None
+
+    def _advance_shard(self):
+        if self._next_shard is None:
+            self.file_iter = iter(self.files)
+            self._next_shard = self._shard_pool.submit(
+                load_data_shard, next(self.file_iter)
+            )
+        self._init_shard(self._next_shard.result())
+        self._next_shard = self._submit_next_shard()
+
+    def _local_doc_starts(self, local_start, total_len):
+        lo = np.searchsorted(self.bos_idx, local_start, side="left")
+        hi = np.searchsorted(self.bos_idx, local_start + total_len, side="left")
+        return (self.bos_idx[lo:hi] - local_start).tolist()
+
+    def _prepare_batch(self, num_tokens_local, max_seq_len):
+        per_rank_span = num_tokens_local + 1
+        global_span = per_rank_span * self.world_size
+        while self.cursor + global_span > self.shard_size:
+            self._advance_shard()
+        local_start = self.cursor + self.rank * per_rank_span
+        buf = self.tokens[local_start : local_start + per_rank_span]
+        inputs = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        targets = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        inputs.copy_(buf[:-1])
+        targets.copy_(buf[1:])
+        starts = self._local_doc_starts(local_start, inputs.numel())
+        cu_seqlens, max_seqlen = _build_cu_seqlens(
+            starts, inputs.numel(), inputs.device, max_seq_len, self.cu_bucket_size
+        )
+        cu_seqlens = 
cu_seqlens.pin_memory() + self.cursor += global_span + return inputs, targets, cu_seqlens, max_seqlen + + def next_batch(self, global_tokens, grad_accum_steps): + num_tokens_local = global_tokens // (self.world_size * grad_accum_steps) + while len(self._prefetch_queue) < 2: + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + inputs, targets, cu_seqlens, max_seqlen = self._prefetch_queue.pop(0).result() + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + return ( + inputs[None].to(self.device, non_blocking=True), + targets[None].to(self.device, non_blocking=True), + cu_seqlens.to(self.device, non_blocking=True), + max_seqlen, + ) + + +class ShuffledSequenceLoader: + def __init__(self, h, device): + self.world_size = h.world_size + self.seq_len = h.train_seq_len + self.device = device + all_files = [Path(p) for p in sorted(glob.glob(h.train_files))] + if not all_files: + raise FileNotFoundError(f"No files found for pattern: {h.train_files}") + self.files = all_files[h.rank :: h.world_size] + self.rng = np.random.Generator(np.random.PCG64(h.rank)) + self.num_tokens = [_read_num_tokens(f) for f in self.files] + self.start_inds = [[] for _ in self.files] + for si in range(len(self.files)): + self._reset_shard(si) + + def _reset_shard(self, si): + max_phase = min( + self.seq_len - 1, max(0, self.num_tokens[si] - self.seq_len - 1) + ) + phase = int(self.rng.integers(max_phase + 1)) if max_phase > 0 else 0 + num_sequences = (self.num_tokens[si] - 1 - phase) // self.seq_len + sequence_order = self.rng.permutation(num_sequences) + self.start_inds[si] = (phase + sequence_order * self.seq_len).tolist() + + def next_batch(self, global_tokens, grad_accum_steps): + device_tokens = global_tokens // (self.world_size * grad_accum_steps) + device_batch_size = device_tokens // self.seq_len + remaining = np.array([len(s) for s in self.start_inds], dtype=np.float64) + x = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + y = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + for bi in range(device_batch_size): + total = remaining.sum() + if total <= 0: + for si in range(len(self.files)): + self._reset_shard(si) + remaining = np.array( + [len(s) for s in self.start_inds], dtype=np.float64 + ) + total = remaining.sum() + probs = remaining / total + si = int(self.rng.choice(len(self.files), p=probs)) + start_ind = self.start_inds[si].pop() + remaining[si] -= 1 + mm = _get_shard_memmap(self.files[si]) + window = torch.as_tensor( + np.array(mm[start_ind : start_ind + self.seq_len + 1], dtype=np.int64) + ) + x[bi] = window[:-1] + y[bi] = window[1:] + return x.to(self.device, non_blocking=True), y.to( + self.device, non_blocking=True + ) + + +class RMSNorm(nn.Module): + def __init__(self, eps=None): + super().__init__() + self.eps = eps + + def forward(self, x): + return F.rms_norm(x, (x.size(-1),), eps=self.eps) + + +class CastedLinear(nn.Linear): + def forward(self, x): + w = self.weight.to(x.dtype) + bias = self.bias.to(x.dtype) if self.bias is not None else None + return F.linear(x, w, bias) + + +@triton.jit +def linear_leaky_relu_square_kernel( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M: tl.constexpr, + BLOCK_SIZE_N: tl.constexpr, + BLOCK_SIZE_K: tl.constexpr, + NUM_SMS: tl.constexpr, + FORWARD: tl.constexpr, +): + dtype = tl.bfloat16 + start_pid = tl.program_id(axis=0) + num_pid_m = tl.cdiv(M, BLOCK_SIZE_M) + num_pid_n = 
tl.cdiv(N, BLOCK_SIZE_N) + k_tiles = tl.cdiv(K, BLOCK_SIZE_K) + num_tiles = num_pid_m * num_pid_n + tile_id_c = start_pid - NUM_SMS + for tile_id in tl.range(start_pid, num_tiles, NUM_SMS, flatten=True): + pid_m = tile_id // num_pid_n + pid_n = tile_id % num_pid_n + offs_am = pid_m * BLOCK_SIZE_M + offs_bn = pid_n * BLOCK_SIZE_N + accumulator = tl.zeros((BLOCK_SIZE_M, BLOCK_SIZE_N), dtype=tl.float32) + for ki in range(k_tiles): + offs_k = ki * BLOCK_SIZE_K + a = a_desc.load([offs_am, offs_k]) + b = b_desc.load([offs_bn, offs_k]) + accumulator = tl.dot(a, b.T, accumulator) + tile_id_c += NUM_SMS + offs_am_c = offs_am + offs_bn_c = offs_bn + acc = tl.reshape(accumulator, (BLOCK_SIZE_M, 2, BLOCK_SIZE_N // 2)) + acc = tl.permute(acc, (0, 2, 1)) + acc0, acc1 = tl.split(acc) + c0 = acc0.to(dtype) + c1 = acc1.to(dtype) + if not FORWARD: + pre0 = aux_desc.load([offs_am_c, offs_bn_c]) + pre1 = aux_desc.load([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2]) + c0 = c0 * tl.where(pre0 > 0, 2.0 * pre0, 0.5 * pre0) + c1 = c1 * tl.where(pre1 > 0, 2.0 * pre1, 0.5 * pre1) + c_desc.store([offs_am_c, offs_bn_c], c0) + c_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], c1) + if FORWARD: + aux0 = tl.where(c0 > 0, c0, 0.5 * c0) + aux1 = tl.where(c1 > 0, c1, 0.5 * c1) + aux_desc.store([offs_am_c, offs_bn_c], aux0 * aux0) + aux_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], aux1 * aux1) + + +def linear_leaky_relu_square(a, b, aux=None): + M, K = a.shape + N, K2 = b.shape + assert K == K2 + c = torch.empty((M, N), device=a.device, dtype=a.dtype) + forward = aux is None + if aux is None: + aux = torch.empty((M, N), device=a.device, dtype=a.dtype) + num_sms = torch.cuda.get_device_properties(a.device).multi_processor_count + BLOCK_SIZE_M, BLOCK_SIZE_N, BLOCK_SIZE_K = 256, 128, 64 + num_stages = 4 if forward else 3 + a_desc = TensorDescriptor.from_tensor(a, [BLOCK_SIZE_M, BLOCK_SIZE_K]) + b_desc = TensorDescriptor.from_tensor(b, [BLOCK_SIZE_N, BLOCK_SIZE_K]) + c_desc = TensorDescriptor.from_tensor(c, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + aux_desc = TensorDescriptor.from_tensor(aux, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + grid = lambda _meta: ( + min(num_sms, triton.cdiv(M, BLOCK_SIZE_M) * triton.cdiv(N, BLOCK_SIZE_N)), + ) + linear_leaky_relu_square_kernel[grid]( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M=BLOCK_SIZE_M, + BLOCK_SIZE_N=BLOCK_SIZE_N, + BLOCK_SIZE_K=BLOCK_SIZE_K, + NUM_SMS=num_sms, + FORWARD=forward, + num_stages=num_stages, + num_warps=8, + ) + if forward: + return c, aux + return c + + +class FusedLinearLeakyReLUSquareFunction(torch.autograd.Function): + @staticmethod + def forward(ctx, x, w1, w2): + x_flat = x.reshape(-1, x.shape[-1]) + pre, post = linear_leaky_relu_square(x_flat, w1) + out = F.linear(post, w2) + ctx.save_for_backward(x, w1, w2, pre, post) + return out.view(*x.shape[:-1], out.shape[-1]) + + @staticmethod + def backward(ctx, grad_output): + x, w1, w2, pre, post = ctx.saved_tensors + x_flat = x.reshape(-1, x.shape[-1]) + grad_output_flat = grad_output.reshape(-1, grad_output.shape[-1]) + dw2 = grad_output_flat.T @ post + dpre = linear_leaky_relu_square(grad_output_flat, w2.T.contiguous(), aux=pre) + dw1 = dpre.T @ x_flat + dx = dpre @ w1 + return dx.view_as(x), dw1, dw2 + + +FusedLeakyReLUSquareMLP = FusedLinearLeakyReLUSquareFunction.apply + + +class Rotary(nn.Module): + def __init__(self, dim, base=1e4, train_seq_len=1024, rope_dims=0, yarn=True): + super().__init__() + self.dim = dim + self.base = base + self.train_seq_len = train_seq_len + self.yarn 
= yarn + self.rope_dims = rope_dims if rope_dims > 0 else dim + inv_freq = 1.0 / base ** ( + torch.arange(0, self.rope_dims, 2, dtype=torch.float32) / self.rope_dims + ) + self.register_buffer("inv_freq", inv_freq, persistent=False) + self._seq_len_cached = 0 + self._cos_cached = None + self._sin_cached = None + + def forward(self, seq_len, device, dtype): + if ( + self._cos_cached is None + or self._sin_cached is None + or self._seq_len_cached < seq_len + or self._cos_cached.device != device + ): + rd = self.rope_dims + if self.yarn and seq_len > self.train_seq_len: + scale = seq_len / self.train_seq_len + new_base = self.base * scale ** (rd / (rd - 2)) + inv_freq = 1.0 / new_base ** ( + torch.arange(0, rd, 2, dtype=torch.float32, device=device) / rd + ) + else: + inv_freq = self.inv_freq.float().to(device) + t = torch.arange(seq_len, device=device, dtype=torch.float32) + freqs = torch.outer(t, inv_freq) + self._cos_cached = freqs.cos()[None, :, None, :] + self._sin_cached = freqs.sin()[None, :, None, :] + self._seq_len_cached = seq_len + return self._cos_cached[:, :seq_len].to(dtype=dtype), self._sin_cached[:, :seq_len].to(dtype=dtype) + + +def apply_rotary_emb(x, cos, sin, rope_dims=0): + if rope_dims > 0 and rope_dims < x.size(-1): + x_rope, x_pass = x[..., :rope_dims], x[..., rope_dims:] + half = rope_dims // 2 + x1, x2 = x_rope[..., :half], x_rope[..., half:] + x_rope = torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + return torch.cat((x_rope, x_pass), dim=-1) + half = x.size(-1) // 2 + x1, x2 = x[..., :half], x[..., half:] + return torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + + +class CausalSelfAttention(nn.Module): + def __init__( + self, dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=True, + attn_out_gate=False, attn_out_gate_src="proj", gate_window=12, + gated_attn=False, gated_attn_init_std=0.01, + sparse_attn_gate=False, sparse_attn_gate_init_std=0.0, sparse_attn_gate_scale=1.0, + ): + super().__init__() + if dim % num_heads != 0: + raise ValueError("model_dim must be divisible by num_heads") + if num_heads % num_kv_heads != 0: + raise ValueError("num_heads must be divisible by num_kv_heads") + if int(attn_out_gate) + int(gated_attn) + int(sparse_attn_gate) > 1: + raise ValueError( + "attn_out_gate, gated_attn, and sparse_attn_gate are mutually exclusive" + ) + self.num_heads = num_heads + self.num_kv_heads = num_kv_heads + self.head_dim = dim // num_heads + if self.head_dim % 2 != 0: + raise ValueError("head_dim must be even for RoPE") + self.q_gain = nn.Parameter( + torch.full((num_heads,), qk_gain_init, dtype=torch.float32) + ) + self.rope_dims = 0 + self.rotary = Rotary(self.head_dim, base=rope_base, train_seq_len=train_seq_len, yarn=yarn) + self.use_xsa = False + # AttnOutGate (PR #1667 MarioPaerle): per-head multiplicative gate on attention + # output. CastedLinear so restore_fp32_params casts back to fp32 for GPTQ. + # _zero_init -> 2*sigmoid(0)=1 -> transparent at init. + self.attn_out_gate = attn_out_gate + self.attn_out_gate_src = attn_out_gate_src + self.gate_window = gate_window + if attn_out_gate: + self.attn_gate_proj = CastedLinear(gate_window, num_heads, bias=False) + self.attn_gate_proj._zero_init = True + # Gated Attention (arXiv:2505.06708, Qwen, NeurIPS 2025). Per-head sigmoid + # gate on SDPA output, BEFORE out_proj. Gate projection W_g: (num_heads, dim). + # Name "attn_gate_w" contains "attn_gate" substring so it matches + # CONTROL_TENSOR_NAME_PATTERNS and routes to the scalar AdamW group. 
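+        # (Illustrative routing check — assumes CONTROL_TENSOR_NAME_PATTERNS, defined
+        # with the optimizer setup, is a list of substrings; sketch only, not the
+        # actual grouping code:)
+        #   to_scalar_adamw = any(pat in name for pat in CONTROL_TENSOR_NAME_PATTERNS)
+        #   # e.g. "blocks.3.attn.attn_gate_w" matches "attn_gate" -> scalar AdamW group.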
+ # fp32 Parameter -> restore_fp32_params path covers it via the ndim<2 OR + # name-pattern check (name matches "attn_gate"). Cast to x.dtype on use. + self.gated_attn = gated_attn + if gated_attn: + W = torch.empty(num_heads, dim, dtype=torch.float32) + nn.init.normal_(W, mean=0.0, std=gated_attn_init_std) + self.attn_gate_w = nn.Parameter(W) + # Sparse attention head-output gate (modded-nanogpt style). Keeps dense SDPA + # and only narrows the gate input to the first gate_window residual dims. + # W_g: (num_heads, gate_window). y_{t,h} <- sigmoid(scale * W_g_h @ x_t[:gate_window]) * y_{t,h}. + # Shares attn_gate_w name with dense GatedAttn so the quant routing + # (CONTROL_TENSOR_NAME_PATTERNS / attn_gate_w int8 passthrough) is unchanged. + self.sparse_attn_gate = sparse_attn_gate + self.sparse_attn_gate_scale = sparse_attn_gate_scale + if sparse_attn_gate: + W = torch.empty(num_heads, gate_window, dtype=torch.float32) + if sparse_attn_gate_init_std > 0: + nn.init.normal_(W, mean=0.0, std=sparse_attn_gate_init_std) + else: + nn.init.zeros_(W) + self.attn_gate_w = nn.Parameter(W) + + def _xsa_efficient(self, y, v): + B, T, H, D = y.shape + Hkv = v.size(-2) + group = H // Hkv + y_g = y.reshape(B, T, Hkv, group, D) + vn = F.normalize(v, dim=-1).unsqueeze(-2) + proj = (y_g * vn).sum(dim=-1, keepdim=True) * vn + return (y_g - proj).reshape(B, T, H, D) + + def forward(self, x, q_w, k_w, v_w, out_w, cu_seqlens=None, max_seqlen=0): + bsz, seqlen, dim = x.shape + # q_raw kept around as a tap point for attn_out_gate_src='q' (post-projection, + # pre-reshape, pre-RoPE). + q_raw = F.linear(x, q_w.to(x.dtype)) + q = q_raw.reshape(bsz, seqlen, self.num_heads, self.head_dim) + k = F.linear(x, k_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + v = F.linear(x, v_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = self.rotary(seqlen, x.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, self.rope_dims) + k = apply_rotary_emb(k, cos, sin, self.rope_dims) + q = q * self.q_gain.to(dtype=q.dtype)[None, None, :, None] + if cu_seqlens is not None: + y = flash_attn_varlen_func( + q[0], + k[0], + v[0], + cu_seqlens_q=cu_seqlens, + cu_seqlens_k=cu_seqlens, + max_seqlen_q=max_seqlen, + max_seqlen_k=max_seqlen, + causal=True, + window_size=(-1, -1), + )[None] + else: + y = flash_attn_3_func(q, k, v, causal=True) + if self.use_xsa: + y = self._xsa_efficient(y, v) + # AttnOutGate inlined (PR #1667). Inline + .contiguous() barrier so torch.compile + # fullgraph=True is happy (this avoids the @torch.compiler.disable trap that + # crashed gates v3). Per-head gate on (B,T,H,D) tensor: g shape [B,T,H], broadcast + # over D via [..., None]. zero-init weight -> 2*sigmoid(0)=1 -> transparent. + if self.attn_out_gate: + gate_src = q_raw if self.attn_out_gate_src == "q" else x + gate_in = gate_src[..., : self.gate_window].contiguous() + g = 2.0 * torch.sigmoid(self.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (arXiv:2505.06708 G1). Inline + .contiguous() barrier so + # torch.compile fullgraph=True is happy. Per-head gate on (B,T,H,D): g shape + # [B,T,H], broadcast over D via [..., None]. Paper: g = sigmoid(x @ W_g.T) + # where W_g: (H, dim). .to(x.dtype) on fp32 param before broadcast with bf16. 
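+        # Shape walk-through for the gate below (sketch, with this run's dims
+        # H=8, dim=512, D=64): x is (B, T, 512); F.linear(x, W_g) with W_g (8, 512)
+        # gives (B, T, 8); sigmoid keeps it in (0, 1); g[..., None] -> (B, T, 8, 1)
+        # broadcasts over D, so each head's (B, T, 64) output is scaled by one
+        # scalar gate per token.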
+ if self.gated_attn: + x_c = x.contiguous() + g = torch.sigmoid(F.linear(x_c, self.attn_gate_w.to(x.dtype))) + y = y * g[..., None] + # Sparse head-output gate: narrower (gate_window) input, same shape g as GatedAttn. + if self.sparse_attn_gate: + gate_in = x[..., : self.gate_window].contiguous() + g = torch.sigmoid( + self.sparse_attn_gate_scale + * F.linear(gate_in, self.attn_gate_w.to(x.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + self._last_proj_input = y.detach() if getattr(self, "_calib", False) else None + return F.linear(y, out_w.to(x.dtype)) + + +class MLP(nn.Module): + def __init__(self, dim, mlp_mult): + super().__init__() + self.use_fused = True + + def forward(self, x, up_w, down_w): + if self.training and self.use_fused: + return FusedLeakyReLUSquareMLP(x, up_w.to(x.dtype), down_w.to(x.dtype)) + hidden = F.leaky_relu(F.linear(x, up_w.to(x.dtype)), negative_slope=0.5).square() + self._last_down_input = hidden.detach() if getattr(self, "_calib", False) else None + return F.linear(hidden, down_w.to(x.dtype)) + + +class Block(nn.Module): + def __init__( + self, + dim, + num_heads, + num_kv_heads, + mlp_mult, + rope_base, + qk_gain_init, + train_seq_len, + layer_idx=0, + ln_scale=False, + yarn=True, + attn_out_gate=False, + attn_out_gate_src="proj", + gate_window=12, + gated_attn=False, + gated_attn_init_std=0.01, + sparse_attn_gate=False, + sparse_attn_gate_init_std=0.0, + sparse_attn_gate_scale=1.0, + ): + super().__init__() + self.attn_norm = RMSNorm() + self.mlp_norm = RMSNorm() + self.attn = CausalSelfAttention( + dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=yarn, + attn_out_gate=attn_out_gate, attn_out_gate_src=attn_out_gate_src, gate_window=gate_window, + gated_attn=gated_attn, gated_attn_init_std=gated_attn_init_std, + sparse_attn_gate=sparse_attn_gate, + sparse_attn_gate_init_std=sparse_attn_gate_init_std, + sparse_attn_gate_scale=sparse_attn_gate_scale, + ) + self.mlp = MLP(dim, mlp_mult) + self.attn_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.mlp_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.resid_mix = nn.Parameter( + torch.stack((torch.ones(dim), torch.zeros(dim))).float() + ) + self.ln_scale_factor = 1.0 / math.sqrt(layer_idx + 1) if ln_scale else 1.0 + + def forward(self, x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=None, max_seqlen=0): + mix = self.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + attn_out = self.attn( + self.attn_norm(x_in) * self.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + x_out = x_in + self.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + x_out = x_out + self.mlp_scale.to(dtype=x_out.dtype)[ + None, None, : + ] * self.mlp(self.mlp_norm(x_out) * self.ln_scale_factor, up_w, down_w) + return x_out + +class GPT(nn.Module): + def __init__(self, h): + super().__init__() + if h.logit_softcap <= 0.0: + raise ValueError(f"logit_softcap must be positive, got {h.logit_softcap}") + self.tie_embeddings = h.tie_embeddings + self.tied_embed_init_std = h.tied_embed_init_std + self.logit_softcap = h.logit_softcap + self.fused_ce_enabled = bool(h.fused_ce_enabled) + self.tok_emb = nn.Embedding(h.vocab_size, h.model_dim) + self.num_layers = h.num_layers + head_dim = h.model_dim // h.num_heads + kv_dim = h.num_kv_heads * head_dim + hidden_dim = int(h.mlp_mult * h.model_dim) + self.qo_bank = nn.Parameter(torch.empty(2 * h.num_layers, h.model_dim, 
h.model_dim)) + self.kv_bank = nn.Parameter(torch.empty(2 * h.num_layers, kv_dim, h.model_dim)) + self.mlp_up_bank = nn.Parameter(torch.empty(h.num_layers, hidden_dim, h.model_dim)) + self.mlp_down_bank = nn.Parameter(torch.empty(h.num_layers, h.model_dim, hidden_dim)) + self.num_encoder_layers = h.num_layers // 2 + self.num_decoder_layers = h.num_layers - self.num_encoder_layers + self.blocks = nn.ModuleList( + [ + Block( + h.model_dim, + h.num_heads, + h.num_kv_heads, + h.mlp_mult, + h.rope_base, + h.qk_gain_init, + h.train_seq_len, + layer_idx=i, + ln_scale=h.ln_scale, + yarn=h.rope_yarn, + attn_out_gate=h.attn_out_gate_enabled, + attn_out_gate_src=h.attn_out_gate_src, + gate_window=h.gate_window, + gated_attn=h.gated_attn_enabled, + gated_attn_init_std=h.gated_attn_init_std, + sparse_attn_gate=h.sparse_attn_gate_enabled, + sparse_attn_gate_init_std=h.sparse_attn_gate_init_std, + sparse_attn_gate_scale=h.sparse_attn_gate_scale, + ) + for i in range(h.num_layers) + ] + ) + if h.rope_dims > 0: + head_dim = h.model_dim // h.num_heads + for block in self.blocks: + block.attn.rope_dims = h.rope_dims + block.attn.rotary = Rotary( + head_dim, + base=h.rope_base, + train_seq_len=h.train_seq_len, + rope_dims=h.rope_dims, + yarn=h.rope_yarn, + ) + self.final_norm = RMSNorm() + self.lm_head = ( + None + if h.tie_embeddings + else CastedLinear(h.model_dim, h.vocab_size, bias=False) + ) + if self.lm_head is not None: + self.lm_head._zero_init = True + if h.xsa_last_n > 0: + for i in range(max(0, h.num_layers - h.xsa_last_n), h.num_layers): + self.blocks[i].attn.use_xsa = True + self.looping_active = False + if h.num_loops > 0: + loop_seg = list(range(h.loop_start, h.loop_end + 1)) + all_indices = list(range(h.loop_start)) + for _ in range(h.num_loops + 1): + all_indices.extend(loop_seg) + all_indices.extend(range(h.loop_end + 1, h.num_layers)) + num_enc = len(all_indices) // 2 + self.encoder_indices = all_indices[:num_enc] + self.decoder_indices = all_indices[num_enc:] + else: + self.encoder_indices = list(range(self.num_encoder_layers)) + self.decoder_indices = list(range(self.num_encoder_layers, h.num_layers)) + self.num_skip_weights = min( + len(self.encoder_indices), len(self.decoder_indices) + ) + self.skip_weights = nn.Parameter( + torch.ones(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + self.skip_gates = ( + nn.Parameter( + torch.zeros(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + if h.skip_gates_enabled + else None + ) + self.parallel_start_layer = h.parallel_start_layer + self.parallel_final_lane = h.parallel_final_lane.lower() + self.parallel_post_lambdas = nn.Parameter( + torch.ones(h.num_layers, 2, 2, dtype=torch.float32) + ) + self.parallel_resid_lambdas = nn.Parameter( + torch.full((h.num_layers, 2), 1.1, dtype=torch.float32) + ) + # SmearGate (PR #1667 / modded-nanogpt @classiclarryd): + # x_t <- x_t + lam * sigmoid(W * x_t[:gate_window]) * x_{t-1}. + # Per-token forward-1 smear of the embedding lane. W zero-init + lam=0 -> + # transparent at init. Uses CastedLinear so restore_fp32_params handles dtype. 
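+        # Worked example of the smear recurrence (illustration only): for the
+        # packed embeddings [e0, e1, e2] of one document,
+        #   x_0 = e0
+        #   x_1 = e1 + lam * sigmoid(W @ e1[:gate_window]) * e0
+        #   x_2 = e2 + lam * sigmoid(W @ e2[:gate_window]) * e1
+        # so with W = 0 and lam = 0 the smear is the identity, and training
+        # opens the gate gradually through lam.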
+ self.smear_gate_enabled = h.smear_gate_enabled + if self.smear_gate_enabled: + self.smear_window = h.gate_window + self.smear_gate = CastedLinear(self.smear_window, 1, bias=False) + self.smear_gate._zero_init = True + self.smear_lambda = nn.Parameter(torch.zeros(1, dtype=torch.float32)) + self._init_weights() + + def _init_weights(self): + if self.tie_embeddings: + nn.init.normal_(self.tok_emb.weight, mean=0.0, std=self.tied_embed_init_std) + n = self.num_layers + proj_scale = 1.0 / math.sqrt(2 * n) + for i in range(n): + nn.init.orthogonal_(self.qo_bank.data[i], gain=1.0) + nn.init.zeros_(self.qo_bank.data[n + i]) + self.qo_bank.data[n + i].mul_(proj_scale) + nn.init.orthogonal_(self.kv_bank.data[i], gain=1.0) + nn.init.orthogonal_(self.kv_bank.data[n + i], gain=1.0) + for i in range(n): + nn.init.orthogonal_(self.mlp_up_bank.data[i], gain=1.0) + nn.init.zeros_(self.mlp_down_bank.data[i]) + self.mlp_down_bank.data[i].mul_(proj_scale) + for name, module in self.named_modules(): + if isinstance(module, nn.Linear): + if getattr(module, "_zero_init", False): + nn.init.zeros_(module.weight) + elif ( + module.weight.ndim == 2 + and module.weight.shape[0] >= 64 + and module.weight.shape[1] >= 64 + ): + nn.init.orthogonal_(module.weight, gain=1.0) + + def _bank_weights(self, i): + n = self.num_layers + return ( + self.qo_bank[i], + self.kv_bank[i], + self.kv_bank[n + i], + self.qo_bank[n + i], + self.mlp_up_bank[i], + self.mlp_down_bank[i], + ) + + def _parallel_block( + self, block_idx, lane0, lane1, x0, + q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=None, max_seqlen=0, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + attn_out = block.attn( + block.attn_norm(attn_read) * block.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * block.mlp( + block.mlp_norm(mlp_read) * block.ln_scale_factor, up_w, down_w + ) + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + def _final_parallel_hidden(self, lane0, lane1): + if self.parallel_final_lane == "mlp": + return lane1 + if self.parallel_final_lane == "attn": + return lane0 + return 0.5 * (lane0 + lane1) + + def _forward_hidden(self, input_ids, cu_seqlens=None, max_seqlen=0): + """Run the encoder/decoder stack to the final RMSNorm; returns pre-projection hidden. + Shared by eval (softcap+projection via forward_logits) and train (fused CE path).""" + x = self.tok_emb(input_ids) + # SmearGate (PR #1667). lam=0 + W=0 -> identity at init. + # Cross-doc leak fix: zero the prev-token smear at any position whose current token + # is BOS, so the BOS embedding starting doc N+1 in a packed stream is not + # contaminated by doc N's last token (audited issue on PR#1797 base). 
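+        # E.g. in a packed stream [..., w_k, BOS, w_0, ...] the position holding
+        # BOS has not_bos = 0, so it keeps its raw embedding and doc N+1's first
+        # state is not mixed with w_k from doc N; in-document positions smear as
+        # usual.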
+ if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else range(self.num_encoder_layers) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block( + i, lane0, lane1, x0, q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + return x + + def _project_logits(self, hidden): + if self.tie_embeddings: + return F.linear(hidden, self.tok_emb.weight) + return self.lm_head(hidden) + + def forward_logits(self, input_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + return self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + + def forward(self, input_ids, target_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + flat_targets = target_ids.reshape(-1) + # Fused softcapped-CE kernel (training path only). Applies softcap inside the + # Triton kernel; takes pre-softcap logits_proj. Non-fused path matches stock + # PR-1736 numerics exactly (softcap in fp32, then F.cross_entropy on fp32). 
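+        # Reference semantics both branches must reproduce, with c = logit_softcap:
+        #   logits = c * tanh(logits_proj / c)     # smooth clamp into (-c, c)
+        #   loss   = cross_entropy(logits.float(), targets)
+        # The fused kernel folds the tanh softcap into the CE computation, so the
+        # (N, V) softcapped logits never materialize in fp32.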
+ if self.fused_ce_enabled: + return softcapped_cross_entropy( + logits_proj.reshape(-1, logits_proj.size(-1)), + flat_targets, + self.logit_softcap, + reduction="mean", + ) + logits = self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + return F.cross_entropy( + logits.reshape(-1, logits.size(-1)).float(), + flat_targets, + reduction="mean", + ) + + def forward_ttt(self, input_ids, target_ids, lora): + x = self.tok_emb(input_ids) + # SmearGate on the TTT path — same inline compute as forward_logits. + # Cross-doc leak fix: see _forward_hidden comment. + if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else list(range(self.num_encoder_layers)) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else list( + range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + ) + slot = 0 + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block_with_lora( + i, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + if self.tie_embeddings: + logits = F.linear(x, self.tok_emb.weight) + else: + logits = self.lm_head(x) + logits = logits + lora.lm_head_lora(x) + logits = self.logit_softcap * torch.tanh(logits / self.logit_softcap) + bsz, sl, V = logits.shape + return F.cross_entropy( + logits.float().reshape(-1, V), target_ids.reshape(-1), reduction="none" + ).reshape(bsz, sl) + + def _block_with_lora(self, block, x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w): + mix = block.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + n = block.attn_norm(x_in) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + # Keep raw Q for AttnOutGate src='q' (matches forward path semantics). 
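+        # The per-doc adapter adds a rank-r delta to the frozen bank projection:
+        #   q_raw = n @ W_q.T + (alpha / r) * ((n @ A_q.T) @ B_q.T)
+        # (see BatchedLinearLoRA below); B starts at zero, so right after a
+        # reset the TTT path reproduces the base model exactly.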
+ q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT path) — inline + .contiguous() barrier, same as the eval path. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT path). Gate input is n (post-norm block input), same + # as eval path. .to(n.dtype) on fp32 param before bf16 broadcast. + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT path) — must match the eval path in + # forward() exactly, else training (which applied the gate) and TTT eval (which + # skipped it) produce mismatched representations and catastrophic BPB regression. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + x_out = x_in + block.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + mlp_n = block.mlp_norm(x_out) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + x_out = x_out + block.mlp_scale.to(dtype=x_out.dtype)[None, None, :] * mlp_out + return x_out + + def _parallel_block_with_lora( + self, block_idx, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + n = block.attn_norm(attn_read) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT 
parallel path) — inline + .contiguous() barrier. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT parallel path). Gate input is n (post-norm block input). + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT parallel path) — must match the + # eval path in forward() to keep train/eval semantics in sync. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_n = block.mlp_norm(mlp_read) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * mlp_out + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + +class BatchedLinearLoRA(nn.Module): + # PR-1767: rank-scaled output (alpha/rank), like standard LoRA. Decouples + # effective magnitude from rank so changing rank does not change LR scale. + _ALPHA = float(os.environ.get("TTT_LORA_ALPHA", "144")) + # PR-1767: optionally keep A warm across per-doc resets (only B is zeroed). + # Accumulates useful feature directions across documents within a TTT phase. 
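+    # E.g. with the default TTT_LORA_ALPHA=144, rank 4 gives scale 36 and rank 8
+    # gives scale 18: doubling the rank halves the per-direction magnitude, so a
+    # tuned LR transfers across ranks. Warm-starting A (below) keeps the learned
+    # input directions while B.zero_() still makes the delta exactly 0 at reset.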
+ _WARM_START_A = bool(int(os.environ.get("TTT_WARM_START_A", "1"))) + + def __init__(self, bsz, in_features, out_features, rank): + super().__init__() + self._bound = 1.0 / math.sqrt(in_features) + self._scale = self._ALPHA / rank + self.A = nn.Parameter( + torch.empty(bsz, rank, in_features).uniform_(-self._bound, self._bound) + ) + self.B = nn.Parameter(torch.zeros(bsz, out_features, rank)) + + def reset(self): + with torch.no_grad(): + if not self._WARM_START_A: + self.A.uniform_(-self._bound, self._bound) + self.B.zero_() + + def forward(self, x): + return ((x @ self.A.transpose(1, 2)) @ self.B.transpose(1, 2)) * self._scale + + +class BatchedTTTLoRA(nn.Module): + def __init__(self, bsz, model, rank, k_lora=True, mlp_lora=True, o_lora=True): + super().__init__() + self.bsz = bsz + dim = model.qo_bank.shape[-1] + vocab = model.tok_emb.num_embeddings + if getattr(model, "looping_active", False): + num_slots = len(model.encoder_indices) + len(model.decoder_indices) + else: + num_slots = len(model.blocks) + kv_dim = model.blocks[0].attn.num_kv_heads * ( + dim // model.blocks[0].attn.num_heads + ) + embed_dim = model.tok_emb.embedding_dim + self.lm_head_lora = BatchedLinearLoRA(bsz, embed_dim, vocab, rank) + self.q_loras = nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)] + ) + self.v_loras = nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)] + ) + self.k_loras = ( + nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)] + ) + if k_lora + else None + ) + self.mlp_loras = ( + nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)] + ) + if mlp_lora + else None + ) + self.o_loras = ( + nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)] + ) + if o_lora + else None + ) + + def reset(self): + with torch.no_grad(): + self.lm_head_lora.reset() + for loras in [self.q_loras, self.v_loras, self.k_loras, + self.mlp_loras, self.o_loras]: + if loras is not None: + for lora in loras: + lora.reset() + + +# Polar Express per-iteration minimax Newton-Schulz coefficients (PR #1344). +# Replaces the fixed (3.4445, -4.775, 2.0315) coefficients of stock Muon. +# Applied at backend_steps=5 — taking more than 5 iterations from this list +# falls back to the final (converged) tuple via the slice guard below. 
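+# Guard semantics, spelled out: zeropower_via_newtonschulz5 runs one iteration
+# per tuple it takes from this list, so requesting steps > 5 is capped at the
+# five tuples below rather than repeating the final tuple for extra iterations.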
+_PE_COEFFS = ( + (8.156554524902461, -22.48329292557795, 15.878769915207462), + (4.042929935166739, -2.808917465908714, 0.5000178451051316), + (3.8916678022926607, -2.772484153217685, 0.5060648178503393), + (3.285753657755655, -2.3681294933425376, 0.46449024233003106), + (2.3465413258596377, -1.7097828382687081, 0.42323551169305323), +) + + +@torch.compile +def zeropower_via_newtonschulz5(G, steps=10, eps=1e-07): + was_2d = G.ndim == 2 + if was_2d: + G = G.unsqueeze(0) + X = G.bfloat16() + transposed = X.size(-2) > X.size(-1) + if transposed: + X = X.mT + X = X / (X.norm(dim=(-2, -1), keepdim=True) + eps) + coeffs = _PE_COEFFS[:steps] if steps <= len(_PE_COEFFS) else _PE_COEFFS + for a, b, c in coeffs: + A = X @ X.mT + B = b * A + c * (A @ A) + X = a * X + B @ X + if transposed: + X = X.mT + if was_2d: + X = X.squeeze(0) + return X + + +class Muon(torch.optim.Optimizer): + def __init__( + self, + params, + lr, + momentum, + backend_steps, + nesterov=True, + weight_decay=0.0, + row_normalize=False, + ): + super().__init__( + params, + dict( + lr=lr, + momentum=momentum, + backend_steps=backend_steps, + nesterov=nesterov, + weight_decay=weight_decay, + row_normalize=row_normalize, + ), + ) + self._built = False + + def _build(self): + self._distributed = dist.is_available() and dist.is_initialized() + self._world_size = dist.get_world_size() if self._distributed else 1 + self._rank = dist.get_rank() if self._distributed else 0 + ws = self._world_size + self._bank_meta = [] + for group in self.param_groups: + for p in group["params"]: + B = p.shape[0] + padded_B = ((B + ws - 1) // ws) * ws + shard_B = padded_B // ws + tail = p.shape[1:] + dev = p.device + self._bank_meta.append({ + "p": p, + "B": B, + "padded_grad": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "shard": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "shard_mom": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "full_update": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "scale": max(1, p.shape[-2] / p.shape[-1]) ** 0.5, + }) + self._bank_meta.sort(key=lambda m: -m["p"].numel()) + self._built = True + + def launch_reduce_scatters(self): + if not self._built: + self._build() + if not self._distributed: + return + self._rs_futures = [] + for m in self._bank_meta: + p = m["p"] + if p.grad is None: + self._rs_futures.append(None) + continue + pg = m["padded_grad"] + pg[: m["B"]].copy_(p.grad) + fut = dist.reduce_scatter_tensor( + m["shard"], pg, op=dist.ReduceOp.AVG, async_op=True + ) + self._rs_futures.append(fut) + + @torch.no_grad() + def step(self, closure=None): + loss = None + if closure is not None: + with torch.enable_grad(): + loss = closure() + if not self._built: + self._build() + for group in self.param_groups: + lr = group["lr"] + momentum = group["momentum"] + backend_steps = group["backend_steps"] + nesterov = group["nesterov"] + wd = group.get("weight_decay", 0.0) + row_normalize = group.get("row_normalize", False) + prev_ag_handle = None + prev_m = None + sharded = self._distributed and hasattr(self, "_rs_futures") + for idx, m in enumerate(self._bank_meta): + p = m["p"] + if p.grad is None: + continue + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if sharded and self._rs_futures[idx] is not None: + self._rs_futures[idx].wait() + g = m["shard"] + buf = m["shard_mom"] + else: + g 
= p.grad.bfloat16() + state = self.state[p] + if "momentum_buffer" not in state: + state["momentum_buffer"] = torch.zeros_like(g) + buf = state["momentum_buffer"] + buf.mul_(momentum).add_(g) + if nesterov: + update = g.add(buf, alpha=momentum) + else: + update = buf + if row_normalize: + rn = update.float().norm(dim=-1, keepdim=True).clamp_min(1e-07) + update = update / rn.to(update.dtype) + update = zeropower_via_newtonschulz5(update, steps=backend_steps) + if sharded: + prev_ag_handle = dist.all_gather_into_tensor( + m["full_update"], update, async_op=True + ) + prev_m = m + else: + if wd > 0.0: + p.data.mul_(1.0 - lr * wd) + p.add_(update, alpha=-lr * m["scale"]) + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if hasattr(self, "_rs_futures"): + del self._rs_futures + return loss + + +CONTROL_TENSOR_NAME_PATTERNS = tuple( + pattern + for pattern in os.environ.get( + "CONTROL_TENSOR_NAME_PATTERNS", + "attn_scale,attn_scales,mlp_scale,mlp_scales,resid_mix,resid_mixes,q_gain,skip_weight,skip_weights,skip_gates,parallel_post_lambdas,parallel_resid_lambdas,attn_gate_proj,attn_gate_w,smear_gate,smear_lambda", + ).split(",") + if pattern +) + + +PACKED_REPLICATED_GRAD_MAX_NUMEL = 1 << 15 + + +class Optimizers: + def __init__(self, h, base_model): + matrix_params = [ + base_model.qo_bank, + base_model.kv_bank, + base_model.mlp_up_bank, + base_model.mlp_down_bank, + ] + block_named_params = list(base_model.blocks.named_parameters()) + scalar_params = [ + p + for (name, p) in block_named_params + if p.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ] + if base_model.skip_weights.numel() > 0: + scalar_params.append(base_model.skip_weights) + if base_model.skip_gates is not None and base_model.skip_gates.numel() > 0: + scalar_params.append(base_model.skip_gates) + if base_model.parallel_post_lambdas is not None: + scalar_params.append(base_model.parallel_post_lambdas) + if base_model.parallel_resid_lambdas is not None: + scalar_params.append(base_model.parallel_resid_lambdas) + # SmearGate params live on GPT root (not in .blocks), so add them by hand. + # Both are tiny (gate_window scalars + 1 lambda). Optimized via scalar Adam. 
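+        # Routing recap (descriptive only): the four banked weight tensors go to
+        # Muon; anything ndim < 2 or matching CONTROL_TENSOR_NAME_PATTERNS
+        # (gates, scales, lambdas) goes to the scalar AdamW; tok_emb gets its own
+        # AdamW group so embedding LR / weight decay can be set independently.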
+ if getattr(base_model, "smear_gate_enabled", False): + scalar_params.append(base_model.smear_gate.weight) + scalar_params.append(base_model.smear_lambda) + token_lr = h.tied_embed_lr if h.tie_embeddings else h.embed_lr + tok_params = [ + {"params": [base_model.tok_emb.weight], "lr": token_lr, "base_lr": token_lr} + ] + self.optimizer_tok = torch.optim.AdamW( + tok_params, + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.embed_wd, + fused=True, + ) + self.optimizer_muon = Muon( + matrix_params, + lr=h.matrix_lr, + momentum=h.muon_momentum, + backend_steps=h.muon_backend_steps, + weight_decay=h.muon_wd, + row_normalize=h.muon_row_normalize, + ) + for group in self.optimizer_muon.param_groups: + group["base_lr"] = h.matrix_lr + self.optimizer_scalar = torch.optim.AdamW( + [{"params": scalar_params, "lr": h.scalar_lr, "base_lr": h.scalar_lr}], + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.adam_wd, + fused=True, + ) + self.optimizers = [ + self.optimizer_tok, + self.optimizer_muon, + self.optimizer_scalar, + ] + self.replicated_params = list(tok_params[0]["params"]) + self.replicated_params.extend(scalar_params) + self.replicated_large_params = [] + self.replicated_packed_params = [] + for p in self.replicated_params: + if p.numel() <= PACKED_REPLICATED_GRAD_MAX_NUMEL: + self.replicated_packed_params.append(p) + else: + self.replicated_large_params.append(p) + self._aux_stream = torch.cuda.Stream() + + def __iter__(self): + return iter(self.optimizers) + + def zero_grad_all(self): + for opt in self.optimizers: + opt.zero_grad(set_to_none=True) + + def _all_reduce_packed_grads(self): + grads_by_key = collections.defaultdict(list) + for p in self.replicated_packed_params: + if p.grad is not None: + grads_by_key[(p.grad.device, p.grad.dtype)].append(p.grad) + for grads in grads_by_key.values(): + flat = torch.empty( + sum(g.numel() for g in grads), + device=grads[0].device, + dtype=grads[0].dtype, + ) + offset = 0 + for g in grads: + n = g.numel() + flat[offset : offset + n].copy_(g.contiguous().view(-1)) + offset += n + dist.all_reduce(flat, op=dist.ReduceOp.AVG) + offset = 0 + for g in grads: + n = g.numel() + g.copy_(flat[offset : offset + n].view_as(g)) + offset += n + + def step(self, distributed=False): + self.optimizer_muon.launch_reduce_scatters() + if distributed: + reduce_handles = [ + dist.all_reduce(p.grad, op=dist.ReduceOp.AVG, async_op=True) + for p in self.replicated_large_params + if p.grad is not None + ] + self._all_reduce_packed_grads() + for handle in reduce_handles: + handle.wait() + self._aux_stream.wait_stream(torch.cuda.current_stream()) + with torch.cuda.stream(self._aux_stream): + self.optimizer_tok.step() + self.optimizer_scalar.step() + self.optimizer_muon.step() + torch.cuda.current_stream().wait_stream(self._aux_stream) + self.zero_grad_all() + + +def restore_fp32_params(model): + for module in model.modules(): + if isinstance(module, CastedLinear): + module.float() + for name, param in model.named_parameters(): + if ( + param.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ) and param.dtype != torch.float32: + param.data = param.data.float() + if hasattr(model, "qo_bank") and model.qo_bank is not None: + model.qo_bank.data = model.qo_bank.data.float() + model.kv_bank.data = model.kv_bank.data.float() + model.mlp_up_bank.data = model.mlp_up_bank.data.float() + model.mlp_down_bank.data = model.mlp_down_bank.data.float() + + +def collect_hessians(model, train_loader, h, device, n_calibration_batches=64): 
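+    """Accumulate input Hessians H ~ sum_x x x^T for each GPTQ-quantized matmul.
+
+    Forward hooks capture the input activations of every quantized projection
+    (q/k/v, attn out, MLP fc/proj, and the tied-embedding head when enabled);
+    gptq_quantize_weight later uses these Hessians to order columns and
+    propagate rounding error. Returned on CPU, averaged over the batches.
+    """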
+ hessians = {} + hooks = [] + for i, block in enumerate(model.blocks): + block.attn._calib = True + block.mlp._calib = True + block.mlp.use_fused = False + + def make_attn_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + for suffix in ["c_q", "c_k", "c_v"]: + name = f"blocks.{layer_idx}.attn.{suffix}.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + y = module._last_proj_input + if y is not None: + y = y.float() + if y.ndim == 3: + y = y.reshape(-1, y.shape[-1]) + name = f"blocks.{layer_idx}.attn.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + y.shape[1], y.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(y.T, y) + return hook_fn + + def make_mlp_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + name = f"blocks.{layer_idx}.mlp.fc.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + h_act = module._last_down_input + if h_act is not None: + h_act = h_act.float() + if h_act.ndim == 3: + h_act = h_act.reshape(-1, h_act.shape[-1]) + name = f"blocks.{layer_idx}.mlp.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + h_act.shape[1], h_act.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(h_act.T, h_act) + return hook_fn + + for i, block in enumerate(model.blocks): + hooks.append(block.attn.register_forward_hook(make_attn_hook(i))) + hooks.append(block.mlp.register_forward_hook(make_mlp_hook(i))) + + # Hessian hooks for embedding factorization projection layers + def make_linear_input_hook(weight_name): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if weight_name not in hessians: + hessians[weight_name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[weight_name].addmm_(x.T, x) + return hook_fn + + if model.tie_embeddings: + hook_module = model.final_norm + + def make_output_hook(name): + def hook_fn(module, inp, out): + x = out.detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + return hook_fn + + hooks.append( + hook_module.register_forward_hook(make_output_hook("tok_emb.weight")) + ) + model.eval() + with torch.no_grad(): + for _ in range(n_calibration_batches): + x, _ = train_loader.next_batch(h.train_batch_tokens, h.grad_accum_steps) + model.forward_logits(x) + for hook in hooks: + hook.remove() + for i, block in enumerate(model.blocks): + block.attn._calib = False + block.mlp._calib = False + block.mlp.use_fused = True + for name in hessians: + hessians[name] = hessians[name].cpu() / n_calibration_batches + return hessians + + +def gptq_quantize_weight(w, H, clip_sigmas=3.0, clip_range=63, block_size=128): + W_orig = w.float().clone() + rows, cols = W_orig.shape + H = H.float().clone() + dead = torch.diag(H) == 0 + H[dead, dead] = 1 + damp = 0.01 * H.diag().mean() + H.diagonal().add_(damp) + perm = torch.argsort(H.diag(), descending=True) + invperm = torch.argsort(perm) + W_perm = W_orig[:, perm].clone() + W_perm[:, dead[perm]] 
= 0 + H = H[perm][:, perm] + Hinv = torch.cholesky_inverse(torch.linalg.cholesky(H)) + Hinv = torch.linalg.cholesky(Hinv, upper=True) + row_std = W_orig.std(dim=1) + s = (clip_sigmas * row_std / clip_range).clamp_min(1e-10).to(torch.float16) + sf = s.float() + Q = torch.zeros(rows, cols, dtype=torch.int8) + W_work = W_perm.clone() + for i1 in range(0, cols, block_size): + i2 = min(i1 + block_size, cols) + W_block = W_work[:, i1:i2].clone() + Hinv_block = Hinv[i1:i2, i1:i2] + Err = torch.zeros(rows, i2 - i1) + for j in range(i2 - i1): + w_col = W_block[:, j] + d = Hinv_block[j, j] + q_col = torch.clamp(torch.round(w_col / sf), -clip_range, clip_range) + Q[:, i1 + j] = q_col.to(torch.int8) + err = (w_col - q_col.float() * sf) / d + Err[:, j] = err + W_block[:, j:] -= err.unsqueeze(1) * Hinv_block[j, j:].unsqueeze(0) + if i2 < cols: + W_work[:, i2:] -= Err @ Hinv[i1:i2, i2:] + return Q[:, invperm], s + + +def _quantize_gate_int8_row(w): + # Symmetric int8-per-row quantization for small gate tensors. w shape + # (R, C) -> (R,) scales in fp16, int8 values in [-127, 127]. Single scale + # per row keeps accuracy high while halving storage vs fp16. + W = w.float().contiguous() + row_max = W.abs().amax(dim=1).clamp_min(1e-10) + s = (row_max / 127.0).to(torch.float16) + sf = s.float().view(-1, 1) + q = torch.clamp(torch.round(W / sf), -127, 127).to(torch.int8) + return q, s + + +def _lqer_pack(A, B, bits): + rng = 2 ** (bits - 1) - 1 + sA = (A.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + sB = (B.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float().view(-1, 1)), -rng, rng).to(torch.int8) + qB = torch.clamp(torch.round(B / sB.float().view(-1, 1)), -rng, rng).to(torch.int8) + return qA, sA, qB, sB + + +def _lqer_pack_asym(A, B, g=64): + # A: INT2 per-matrix scalar (signed [-2,1], scale = |A|max/1.5). + sA = (A.abs().amax().clamp_min(1e-10) / 1.5).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float()), -2, 1).to(torch.int8) + # B: INT4 groupwise g over flattened B (signed [-8,7], per-group scale). + Bf = B.reshape(-1, g) + Bmax = Bf.abs().amax(dim=-1, keepdim=True).clamp_min(1e-10) + sB = (Bmax / 7.5).to(torch.float16).reshape(-1) + qB = torch.clamp(torch.round(Bf / sB.float().reshape(-1, 1)), -8, 7).to( + torch.int8 + ).reshape(B.shape) + return qA, sA, qB, sB + + +def gptq_mixed_quantize(state_dict, hessians, h): + result = {} + meta = {} + quant_gate = bool(getattr(h, "gated_attn_quant_gate", False)) + lqer_on = bool(getattr(h, "lqer_enabled", False)) + lqer_cands = {} + for (name, tensor) in state_dict.items(): + t = tensor.detach().cpu().contiguous() + # Dedicated int8-per-row path for attn_gate_w (bypasses both GPTQ and + # fp16 passthrough). Applied BEFORE the numel<=65536 passthrough check + # so the gate tensor is routed here instead of to fp16. + if ( + quant_gate + and t.is_floating_point() + and t.ndim == 2 + and name.endswith(".attn_gate_w") + # Dense GatedAttn: (num_heads, dim) = (8, 512) = 4096. + # Sparse gate: (num_heads, gate_window) = (8, 12) = 96. + # Both need int8-per-row routing; the 1024 lower bound in stock + # PR-1736 presumed dense-only. Widen to catch both. 
+ and 32 <= t.numel() <= 8192 + ): + gq, gs = _quantize_gate_int8_row(t) + result[name + ".gq"] = gq + result[name + ".gs"] = gs + meta[name] = "gate_int8_row" + continue + if not t.is_floating_point() or t.numel() <= 65536: + result[name] = t.to(torch.float16) if t.is_floating_point() else t + meta[name] = "passthrough (float16)" + continue + if "tok_emb" in name: + cs = h.embed_clip_sigmas + elif ".mlp." in name: + cs = h.mlp_clip_sigmas + elif ".attn." in name: + cs = h.attn_clip_sigmas + else: + cs = h.matrix_clip_sigmas + bits = h.embed_bits if "tok_emb" in name else h.matrix_bits + clip_range = 2 ** (bits - 1) - 1 + ret = gptq_quantize_weight( + t, hessians[name], clip_sigmas=cs, clip_range=clip_range + ) + q, s = ret + result[name + ".q"] = q + result[name + ".scale"] = s + meta[name] = f"gptq (int{bits})" + if lqer_on: + W_q = q.float() * s.float().view(-1, 1) + E = t.float() - W_q + lqer_cands[name] = (E, float(E.norm())) + if lqer_on and lqer_cands: + top = sorted(lqer_cands.items(), key=lambda kv: -kv[1][1])[: h.lqer_top_k] + asym_on = bool(getattr(h, "lqer_asym_enabled", False)) + asym_g = int(getattr(h, "lqer_asym_group", 64)) + for (name, (E, _)) in top: + U, S, Vh = torch.linalg.svd(E, full_matrices=False) + r = min(h.lqer_rank, S.numel()) + A = (U[:, :r] * S[:r]).contiguous() + B = Vh[:r, :].contiguous() + if asym_on and B.numel() % asym_g == 0: + qA, sA, qB, sB = _lqer_pack_asym(A, B, asym_g) + result[name + ".lqA_a"] = qA + result[name + ".lqAs_a"] = sA + result[name + ".lqB_a"] = qB + result[name + ".lqBs_a"] = sB + meta[name] = meta[name] + "+lqer_asym" + else: + qA, sA, qB, sB = _lqer_pack(A, B, h.lqer_factor_bits) + result[name + ".lqA"] = qA + result[name + ".lqAs"] = sA + result[name + ".lqB"] = qB + result[name + ".lqBs"] = sB + meta[name] = meta[name] + "+lqer" + categories = collections.defaultdict(set) + for (name, cat) in meta.items(): + short = re.sub("\\.\\d+$", "", re.sub("blocks\\.\\d+", "blocks", name)) + categories[cat].add(short) + log("Quantized weights:") + for cat in sorted(categories): + log(f" {cat}: {', '.join(sorted(categories[cat]))}") + return result, meta + +def dequantize_mixed(result, meta, template_sd): + out = {} + for (name, orig) in template_sd.items(): + info = meta.get(name) + if info is None: + continue + orig_dtype = orig.dtype + if "passthrough" in info: + t = result[name] + if t.dtype == torch.float16 and orig_dtype in ( + torch.float32, + torch.bfloat16, + ): + t = t.to(orig_dtype) + out[name] = t + continue + if info == "gate_int8_row": + gq = result[name + ".gq"] + gs = result[name + ".gs"] + out[name] = (gq.float() * gs.float().view(-1, 1)).to(orig_dtype) + continue + q, s = result[name + ".q"], result[name + ".scale"] + if s.ndim > 0: + W = q.float() * s.float().view(q.shape[0], *[1] * (q.ndim - 1)) + else: + W = q.float() * float(s.item()) + if "lqer_asym" in info: + qA_t = result[name + ".lqA_a"] + sA_t = result[name + ".lqAs_a"] + qB_t = result[name + ".lqB_a"] + sB_t = result[name + ".lqBs_a"] + qA = qA_t.float() * float(sA_t) + g_sz = qB_t.numel() // sB_t.numel() + qB = (qB_t.reshape(-1, g_sz).float() * sB_t.float().view(-1, 1)).reshape( + qB_t.shape + ) + W = W + qA @ qB + elif "lqer" in info: + qA = result[name + ".lqA"].float() * result[name + ".lqAs"].float().view(-1, 1) + qB = result[name + ".lqB"].float() * result[name + ".lqBs"].float().view(-1, 1) + W = W + qA @ qB + out[name] = W.to(orig_dtype) + return out + + +_BSHF_MAGIC = b"BSHF" + + +# ── Per-group lrzip compression (ported from PR#1586 via PR#1667/1729) 
────────
+
+_GROUP_ORDER = [
+    "_tok_emb.weight.q",
+    "attn.c_k.weight.q", "attn.c_q.weight.q",
+    "attn.c_v.weight.q", "attn.proj.weight.q",
+    "mlp.fc.weight.q", "mlp.proj.weight.q",
+]
+_SIMSORT_KEYS = {"_tok_emb.weight.q", "attn.c_q.weight.q", "mlp.fc.weight.q"}
+_PACK_MAGIC = b"PGRP"
+
+
+def _similarity_sort_l1(matrix):
+    # Greedy nearest-neighbor row ordering under L1 distance: start from row 0
+    # and repeatedly append the closest unused row, so similar rows sit
+    # adjacently and the downstream lrzip pass compresses them better.
+    import numpy as _np
+    n = matrix.shape[0]
+    used = _np.zeros(n, dtype=bool)
+    order = [0]
+    used[0] = True
+    cur = matrix[0].astype(_np.float32)
+    for _ in range(n - 1):
+        dists = _np.sum(_np.abs(matrix[~used].astype(_np.float32) - cur), axis=1)
+        unused = _np.where(~used)[0]
+        best = unused[_np.argmin(dists)]
+        order.append(best)
+        used[best] = True
+        cur = matrix[best].astype(_np.float32)
+    return _np.array(order, dtype=_np.uint16)
+
+
+def _lrzip_compress(data, tmpdir, label):
+    inp = os.path.join(tmpdir, f"{label}.bin")
+    out = f"{inp}.lrz"
+    with open(inp, "wb") as f:
+        f.write(data)
+    subprocess.run(["lrzip", "-z", "-L", "9", "-o", out, inp], capture_output=True, check=True)
+    with open(out, "rb") as f:
+        result = f.read()
+    os.remove(inp); os.remove(out)
+    return result
+
+
+def _lrzip_decompress(data, tmpdir, label):
+    inp = os.path.join(tmpdir, f"{label}.lrz")
+    out = os.path.join(tmpdir, f"{label}.bin")
+    with open(inp, "wb") as f:
+        f.write(data)
+    subprocess.run(["lrzip", "-d", "-f", "-o", out, inp], capture_output=True, check=True)
+    with open(out, "rb") as f:
+        result = f.read()
+    os.remove(inp); os.remove(out)
+    return result
+
+
+def _pack_streams(streams):
+    # Length-prefixed container for the per-group byte streams. Only the header
+    # construction below survives in this copy; the remainder of the layout
+    # (per-stream lengths + concatenated payloads) is an assumption, as is the
+    # "<I" count field.
+    import struct
+    n = len(streams)
+    hdr = _PACK_MAGIC + struct.pack("<I", n)
+    for s in streams:
+        hdr += struct.pack("<Q", len(s))
+    return hdr + b"".join(streams)
+
+
+# (The matching unpack helper and the per-group compress/decompress drivers that
+# consume _GROUP_ORDER / _SIMSORT_KEYS and the lrzip helpers are elided here.)
+
+
+def _find_docs(tokens):
+    # Head of this helper is a reconstruction (the tail is original): docs are
+    # assumed to be BOS-delimited spans of the validation stream, and only spans
+    # with >= 2 tokens are kept so each doc yields at least one prediction target.
+    bos = (tokens == BOS_ID).nonzero(as_tuple=True)[0].tolist()
+    bos.append(tokens.numel())
+    docs = []
+    for start, end in zip(bos[:-1], bos[1:]):
+        if end - start >= 2:
+            docs.append((start, end - start))
+    return docs
+
+
+def _build_ttt_global_batches(doc_entries, h, ascending=False):
+    batch_size = h.ttt_batch_size
+    global_doc_entries = sorted(doc_entries, key=lambda x: x[1][1])
+    global_batches = [
+        global_doc_entries[i : i + batch_size]
+        for i in range(0, len(global_doc_entries), batch_size)
+    ]
+    indexed = list(enumerate(global_batches))
+    if not ascending:
+        indexed.sort(key=lambda ib: -max(dl for _, (_, dl) in ib[1]))
+    return indexed
+
+
+def _init_batch_counter(path):
+    with open(path, "wb") as f:
+        f.write((0).to_bytes(4, "little"))
+
+
+def _claim_next_batch(counter_path, queue_len):
+    try:
+        with open(counter_path, "r+b") as f:
+            fcntl.flock(f, fcntl.LOCK_EX)
+            idx = int.from_bytes(f.read(4), "little")
+            f.seek(0)
+            f.write((idx + 1).to_bytes(4, "little"))
+            f.flush()
+    except FileNotFoundError:
+        return queue_len
+    return idx
+
+
+def _compute_chunk_window(ci, pred_len, num_chunks, chunk_size, eval_seq_len):
+    chunk_end = pred_len if ci == num_chunks - 1 else (ci + 1) * chunk_size
+    win_start = max(0, chunk_end - eval_seq_len)
+    win_len = chunk_end - win_start
+    chunk_start = ci * chunk_size
+    chunk_offset = chunk_start - win_start
+    chunk_len = chunk_end - chunk_start
+    return win_start, win_len, chunk_offset, chunk_len
+
+
+def _accumulate_bpb(
+    ptl,
+    x,
+    y,
+    chunk_offsets,
+    chunk_lens,
+    pos_idx,
+    base_bytes_lut,
+    has_leading_space_lut,
+    is_boundary_token_lut,
+    loss_sum,
+    byte_sum,
+    token_count,
+    y_bytes=None,
+):
+    pos = pos_idx[: x.size(1)].unsqueeze(0)
+    mask = (
+        (chunk_lens.unsqueeze(1) > 0)
+        & (pos >= chunk_offsets.unsqueeze(1))
+        & (pos < (chunk_offsets + chunk_lens).unsqueeze(1))
+    )
+    mask_f64 = mask.to(torch.float64)
+    if y_bytes is not None:
+        tok_bytes = y_bytes.to(torch.float64)
+    else:
+        tok_bytes = base_bytes_lut[y].to(torch.float64)
+        tok_bytes += (has_leading_space_lut[y] & ~is_boundary_token_lut[x]).to(
+            torch.float64
+        )
+    loss_sum +=
(ptl.to(torch.float64) * mask_f64).sum() + byte_sum += (tok_bytes * mask_f64).sum() + token_count += chunk_lens.to(torch.float64).sum() + + +def _loss_bpb_from_sums(loss_sum, token_count, byte_sum): + val_loss = (loss_sum / token_count).item() + val_bpb = val_loss / math.log(2.0) * (token_count.item() / byte_sum.item()) + return val_loss, val_bpb + + +def _add_to_counter(path, delta): + try: + with open(path, "r+b") as f: + fcntl.flock(f, fcntl.LOCK_EX) + cur = int.from_bytes(f.read(8), "little", signed=True) + cur += int(delta) + f.seek(0) + f.write(int(cur).to_bytes(8, "little", signed=True)) + f.flush() + return cur + except FileNotFoundError: + return int(delta) + + +def _init_int64_counter(path): + with open(path, "wb") as f: + f.write((0).to_bytes(8, "little", signed=True)) + + +def _select_ttt_doc_entries(docs, h): + doc_entries = list(enumerate(docs)) + if h.val_doc_fraction < 1.0: + sample_n = max(1, int(round(len(docs) * h.val_doc_fraction))) + sampled_indices = sorted( + random.Random(h.seed).sample(range(len(docs)), sample_n) + ) + return [(i, docs[i]) for i in sampled_indices] + return doc_entries + + +def train_val_ttt_global_sgd_distributed(h, device, val_data, base_model, val_tokens, batch_seqs=None): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + seq_len = h.eval_seq_len + total_tokens = val_tokens.numel() - 1 + ttt_chunk = h.global_ttt_chunk_tokens + batch_seqs = h.global_ttt_batch_seqs if batch_seqs is None else batch_seqs + num_chunks = (total_tokens + ttt_chunk - 1) // ttt_chunk + ttt_params = [p for p in base_model.parameters()] + for p in ttt_params: + p.requires_grad_(True) + optimizer = torch.optim.SGD( + ttt_params, lr=h.global_ttt_lr, momentum=h.global_ttt_momentum + ) + t_start = time.perf_counter() + for ci in range(num_chunks): + chunk_start = ci * ttt_chunk + chunk_end = min((ci + 1) * ttt_chunk, total_tokens) + is_last_chunk = ci == num_chunks - 1 + if is_last_chunk or h.global_ttt_epochs <= 0: + continue + base_model.train() + chunk_seqs = (chunk_end - chunk_start) // seq_len + if chunk_seqs <= 0: + continue + warmup_chunks = max(0, min(h.global_ttt_warmup_chunks, num_chunks - 1)) + if warmup_chunks > 0 and ci < warmup_chunks: + warmup_denom = max(warmup_chunks - 1, 1) + warmup_t = ci / warmup_denom + lr_now = ( + h.global_ttt_warmup_start_lr + + (h.global_ttt_lr - h.global_ttt_warmup_start_lr) * warmup_t + ) + else: + decay_steps = max(num_chunks - 1 - warmup_chunks, 1) + decay_ci = max(ci - warmup_chunks, 0) + lr_now = h.global_ttt_lr * 0.5 * ( + 1.0 + math.cos(math.pi * decay_ci / decay_steps) + ) + for pg in optimizer.param_groups: + pg["lr"] = lr_now + my_seq_s = chunk_seqs * h.rank // h.world_size + my_seq_e = chunk_seqs * (h.rank + 1) // h.world_size + my_chunk_seqs = my_seq_e - my_seq_s + for _ in range(h.global_ttt_epochs): + for bs in range(0, my_chunk_seqs, batch_seqs): + be = min(bs + batch_seqs, my_chunk_seqs) + actual_bs = my_seq_s + bs + start_tok = chunk_start + actual_bs * seq_len + end_tok = chunk_start + (my_seq_s + be) * seq_len + 1 + if end_tok > val_tokens.numel(): + continue + local = val_tokens[start_tok:end_tok].to(device=device, dtype=torch.int64) + x_flat = local[:-1] + y_flat = local[1:] + optimizer.zero_grad(set_to_none=True) + with torch.enable_grad(): + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + if h.global_ttt_respect_doc_boundaries: + bos_pos = (x_flat == BOS_ID).nonzero(as_tuple=True)[0].tolist() + cu_seqlens, max_seqlen = _build_cu_seqlens( + bos_pos, x_flat.numel(), 
x_flat.device, h.eval_seq_len, 64 + ) + loss = base_model( + x_flat[None], + y_flat[None], + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + else: + x = x_flat.reshape(-1, seq_len) + y = y_flat.reshape(-1, seq_len) + loss = base_model(x, y) + loss.backward() + if dist.is_available() and dist.is_initialized(): + for p in ttt_params: + if p.grad is not None: + dist.all_reduce(p.grad, op=dist.ReduceOp.SUM) + p.grad.mul_(1.0 / h.world_size) + if h.global_ttt_grad_clip > 0: + torch.nn.utils.clip_grad_norm_(ttt_params, h.global_ttt_grad_clip) + optimizer.step() + base_model.eval() + if h.rank == 0: + elapsed = time.perf_counter() - t_start + log( + f"tttg: c{ci+1}/{num_chunks} lr:{lr_now:.6f} t:{elapsed:.1f}s" + ) + for p in base_model.parameters(): + p.requires_grad_(True) + base_model.eval() + + +def eval_val_ttt_phased(h, base_model, device, val_data, forward_ttt_train): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + for p in base_model.parameters(): + p.requires_grad_(False) + all_tokens = val_data.val_tokens + all_tokens_idx = all_tokens.to(torch.int32) + docs = _find_docs(all_tokens) + doc_entries = _select_ttt_doc_entries(docs, h) + prefix_doc_limit = max(0, min(len(doc_entries), int(h.phased_ttt_prefix_docs))) + num_phases = max(1, int(h.phased_ttt_num_phases)) + phase_boundaries = [] + for pi in range(num_phases): + boundary = prefix_doc_limit * (pi + 1) // num_phases + phase_boundaries.append(boundary) + current_phase = 0 + current_phase_boundary = phase_boundaries[0] + log( + "ttt_phased:" + f" total_docs:{len(doc_entries)} prefix_docs:{prefix_doc_limit} " + f"suffix_docs:{len(doc_entries) - prefix_doc_limit}" + f" num_phases:{num_phases} boundaries:{phase_boundaries}" + ) + chunk_size, eval_seq_len = h.ttt_chunk_size, h.ttt_eval_seq_len + eval_batch_set = None + if h.ttt_eval_batches: + eval_batch_set = set(int(x) for x in h.ttt_eval_batches.split(",") if x.strip()) + use_ascending = eval_batch_set is not None + global_batches_sorted = _build_ttt_global_batches( + doc_entries, h, ascending=use_ascending + ) + queue_len = len(global_batches_sorted) + counter_path = f"/tmp/ttt_counter_{h.run_id}" + prefix_counter_path = f"/tmp/ttt_prefix_counter_{h.run_id}" + pause_flag_path = f"/tmp/ttt_pause_flag_{h.run_id}" + if h.rank == 0: + _init_batch_counter(counter_path) + _init_int64_counter(prefix_counter_path) + try: + os.remove(pause_flag_path) + except FileNotFoundError: + pass + if dist.is_available() and dist.is_initialized(): + path_list = [counter_path, prefix_counter_path, pause_flag_path] + dist.broadcast_object_list(path_list, src=0) + counter_path, prefix_counter_path, pause_flag_path = path_list + dist.barrier() + loss_sum = torch.zeros((), device=device, dtype=torch.float64) + byte_sum = torch.zeros((), device=device, dtype=torch.float64) + token_count = torch.zeros((), device=device, dtype=torch.float64) + t_start = time.perf_counter() + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + + def _build_opt(lora): + if h.ttt_optimizer == "sgd": + return torch.optim.SGD( + lora.parameters(), lr=h.ttt_lora_lr, + momentum=h.ttt_beta1, weight_decay=h.ttt_weight_decay, + ) + return torch.optim.AdamW( + lora.parameters(), lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, weight_decay=h.ttt_weight_decay, fused=True, + ) + + reusable_opt = _build_opt(reusable_lora) + local_scored_docs = [] + global_ttt_done = prefix_doc_limit 
== 0 + try: + while True: + queue_idx = _claim_next_batch(counter_path, queue_len) + if queue_idx >= queue_len: + break + orig_batch_idx, batch_entries = global_batches_sorted[queue_idx] + batch = [doc for _, doc in batch_entries] + bsz = len(batch) + prev_loss = loss_sum.item() + prev_bytes = byte_sum.item() + prev_tokens = token_count.item() + if bsz == reusable_lora.bsz: + reusable_lora.reset() + for s in reusable_opt.state.values(): + for k, v in s.items(): + if isinstance(v, torch.Tensor): + v.zero_() + elif k == "step": + s[k] = 0 + cur_lora = reusable_lora + cur_opt = reusable_opt + else: + cur_lora = BatchedTTTLoRA( + bsz, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + cur_opt = _build_opt(cur_lora) + pred_lens = [doc_len - 1 for _, doc_len in batch] + num_chunks = [(pl + chunk_size - 1) // chunk_size for pl in pred_lens] + max_nc = max(num_chunks) + num_chunks_t = torch.tensor(num_chunks, dtype=torch.int64, device=device) + for ci in range(max_nc): + active = [ci < nc for nc in num_chunks] + needs_train = any(ci < nc - 1 for nc in num_chunks) + tok_starts = torch.zeros(bsz, dtype=torch.int64) + tok_wls = torch.zeros(bsz, dtype=torch.int64) + chunk_offsets_cpu = torch.zeros(bsz, dtype=torch.int64) + chunk_lens_cpu = torch.zeros(bsz, dtype=torch.int64) + for b in range(bsz): + if not active[b]: + continue + doc_start, doc_len = batch[b] + win_start, win_len, chunk_offset, chunk_len = _compute_chunk_window( + ci, pred_lens[b], num_chunks[b], chunk_size, eval_seq_len + ) + tok_starts[b] = doc_start + win_start + tok_wls[b] = win_len + chunk_offsets_cpu[b] = chunk_offset + chunk_lens_cpu[b] = chunk_len + _, context_size, chunk_offset, _ = _compute_chunk_window( + ci, (ci + 1) * chunk_size, ci + 1, chunk_size, eval_seq_len + ) + col_idx = torch.arange(context_size + 1) + idx = tok_starts.unsqueeze(1) + col_idx.unsqueeze(0) + idx.clamp_(max=all_tokens.numel() - 1) + gathered_gpu = all_tokens_idx[idx].to( + device=device, dtype=torch.int64, non_blocking=True + ) + valid = (col_idx[:context_size].unsqueeze(0) < tok_wls.unsqueeze(1)).to( + device, non_blocking=True + ) + chunk_offsets = chunk_offsets_cpu.to(device, non_blocking=True) + chunk_lens = chunk_lens_cpu.to(device, non_blocking=True) + x = torch.where(valid, gathered_gpu[:, :context_size], 0) + y = torch.where(valid, gathered_gpu[:, 1 : context_size + 1], 0) + ctx_pos = torch.arange(context_size, device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + # CaseOps sidecar-driven byte budget. Mirror the index pattern + # used to build y from all_tokens: y[b, j] corresponds to the + # token at global position tok_starts[b] + 1 + j (when valid). + y_bytes_arg = None + if val_data.caseops_enabled and val_data.val_bytes is not None: + y_idx = ( + tok_starts.unsqueeze(1) + + 1 + + col_idx[:context_size].unsqueeze(0) + ) + y_idx = y_idx.clamp_(max=val_data.val_bytes.numel() - 1) + y_bytes_arg = val_data.val_bytes[y_idx].to( + device=device, dtype=torch.int32, non_blocking=True + ) + # Mirror the `valid` masking used for y so out-of-range tokens + # contribute zero bytes (matches y=0 substitution above). 
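+                    # Net effect: byte_sum only counts sidecar bytes at positions
+                    # that are both `valid` and inside the scored chunk, so
+                    # bpb = loss_sum / (ln 2 * byte_sum) stays exact under
+                    # padding (padded tails add zero loss and zero bytes).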
+ y_bytes_arg = torch.where( + valid, y_bytes_arg, torch.zeros_like(y_bytes_arg) + ) + with torch.no_grad(): + _accumulate_bpb( + per_tok_loss, + x, + y, + chunk_offsets, + chunk_lens, + ctx_pos, + val_data.base_bytes_lut, + val_data.has_leading_space_lut, + val_data.is_boundary_token_lut, + loss_sum, + byte_sum, + token_count, + y_bytes=y_bytes_arg, + ) + if needs_train: + activate_chunk_mask = (num_chunks_t - 1 > ci).float() + for gi in range(h.ttt_grad_steps): + if gi > 0: + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + per_doc = per_tok_loss[ + :, chunk_offset : chunk_offset + chunk_size + ].mean(dim=-1) + cur_opt.zero_grad(set_to_none=True) + (per_doc * activate_chunk_mask).sum().backward() + cur_opt.step() + else: + del per_tok_loss + batch_num = orig_batch_idx + 1 + doc_lens = [dl for _, dl in batch] + should_report = batch_num in eval_batch_set if eval_batch_set is not None else True + if should_report: + cur_tokens = token_count.item() + cur_loss_val = loss_sum.item() + cur_bytes_val = byte_sum.item() + dt = cur_tokens - prev_tokens + db = cur_bytes_val - prev_bytes + if dt > 0 and db > 0: + b_loss = (cur_loss_val - prev_loss) / dt + b_bpb = b_loss / math.log(2.0) * (dt / db) + else: + b_loss = b_bpb = 0.0 + r_loss = cur_loss_val / max(cur_tokens, 1) + r_bpb = r_loss / math.log(2.0) * (cur_tokens / max(cur_bytes_val, 1)) + elapsed = time.perf_counter() - t_start + log( + f"ttp: b{batch_num}/{queue_len} bl:{b_loss:.4f} bb:{b_bpb:.4f} " + f"rl:{r_loss:.4f} rb:{r_bpb:.4f} dl:{min(doc_lens)}-{max(doc_lens)} " + f"gd:{int(global_ttt_done)}" + ) + if not global_ttt_done: + local_scored_docs.extend( + (orig_batch_idx, pos, doc_start, doc_len) + for pos, (doc_start, doc_len) in enumerate(batch) + ) + prefix_done = _add_to_counter(prefix_counter_path, len(batch_entries)) + if prefix_done >= current_phase_boundary: + try: + with open(pause_flag_path, "x"): + pass + except FileExistsError: + pass + should_pause = os.path.exists(pause_flag_path) + if should_pause: + if dist.is_available() and dist.is_initialized(): + dist.barrier() + gathered_scored_docs = [None] * h.world_size + if dist.is_available() and dist.is_initialized(): + dist.all_gather_object(gathered_scored_docs, local_scored_docs) + else: + gathered_scored_docs = [local_scored_docs] + scored_docs_for_global = [] + for rank_docs in gathered_scored_docs: + if rank_docs: + scored_docs_for_global.extend(rank_docs) + scored_docs_for_global.sort(key=lambda x: (x[0], x[1])) + scored_docs_for_global = scored_docs_for_global[:current_phase_boundary] + scored_token_chunks = [ + val_data.val_tokens[doc_start : doc_start + doc_len] + for _, _, doc_start, doc_len in scored_docs_for_global + ] + if scored_token_chunks: + global_ttt_tokens = torch.cat(scored_token_chunks) + else: + global_ttt_tokens = val_data.val_tokens[:0] + if h.rank == 0: + prefix_done = 0 + try: + with open(prefix_counter_path, "rb") as f: + prefix_done = int.from_bytes( + f.read(8), "little", signed=True + ) + except FileNotFoundError: + pass + log( + f"ttpp: phase:{current_phase + 1}/{num_phases} pd:{prefix_done} " + f"gd:{len(scored_docs_for_global)} " + f"t:{time.perf_counter() - t_start:.1f}s" + ) + train_val_ttt_global_sgd_distributed( + h, device, val_data, base_model, global_ttt_tokens + ) + for p in base_model.parameters(): + p.requires_grad_(False) + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, 
o_lora=h.ttt_o_lora, + ).to(device) + reusable_opt = _build_opt(reusable_lora) + current_phase += 1 + if current_phase >= num_phases: + global_ttt_done = True + else: + current_phase_boundary = phase_boundaries[current_phase] + if h.rank == 0: + try: + os.remove(pause_flag_path) + except FileNotFoundError: + pass + if dist.is_available() and dist.is_initialized(): + dist.barrier() + if h.rank == 0:
+ log(f"ttpr: phase:{current_phase}/{num_phases} t:{time.perf_counter() - t_start:.1f}s") + del cur_lora, cur_opt + finally: + pass + if dist.is_available() and dist.is_initialized(): + dist.all_reduce(loss_sum, op=dist.ReduceOp.SUM) + dist.all_reduce(byte_sum, op=dist.ReduceOp.SUM) + dist.all_reduce(token_count, op=dist.ReduceOp.SUM) + for p in base_model.parameters(): + p.requires_grad_(True) + base_model.train() + return _loss_bpb_from_sums(loss_sum, token_count, byte_sum) + +
+def timed_eval(label, fn, *args, **kwargs): + torch.cuda.synchronize() + t0 = time.perf_counter() + val_loss, val_bpb = fn(*args, **kwargs) + torch.cuda.synchronize() + elapsed_ms = 1e3 * (time.perf_counter() - t0) + log( + f"{label} val_loss:{val_loss:.8f} val_bpb:{val_bpb:.8f} eval_time:{elapsed_ms:.0f}ms" + ) + return val_loss, val_bpb + +
+def train_model(h, device, val_data): + base_model = GPT(h).to(device).bfloat16() + restore_fp32_params(base_model) + compiled_model = torch.compile(base_model, dynamic=False, fullgraph=True) + compiled_forward_logits = torch.compile( + base_model.forward_logits, dynamic=False, fullgraph=True + ) + model = compiled_model + log(f"model_params:{sum(p.numel() for p in base_model.parameters())}") + optimizers = Optimizers(h, base_model) + train_loader = DocumentPackingLoader(h, device) + max_wallclock_ms = ( + 1e3 * h.max_wallclock_seconds if h.max_wallclock_seconds > 0 else None + ) + if max_wallclock_ms is not None: + max_wallclock_ms -= h.gptq_reserve_seconds * 1e3 + log( + f"gptq:reserving {h.gptq_reserve_seconds:.0f}s, effective={max_wallclock_ms:.0f}ms" + ) + + def training_frac(step, elapsed_ms): + if max_wallclock_ms is None: + return step / max(h.iterations, 1) + return elapsed_ms / max(max_wallclock_ms, 1e-09) +
+ def lr_mul(frac): + if h.warmdown_frac <= 0: + return 1.0 + if frac >= 1.0 - h.warmdown_frac: + return max((1.0 - frac) / h.warmdown_frac, h.min_lr) + return 1.0 + + _clip_params = [p for p in base_model.parameters() if p.requires_grad] + def step_fn(step, lr_scale): + train_loss = torch.zeros((), device=device) + for micro_step in range(h.grad_accum_steps): + x, y, cu_seqlens, _max_seqlen = train_loader.next_batch( + h.train_batch_tokens, h.grad_accum_steps + ) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True): + loss = model(x, y, cu_seqlens=cu_seqlens, max_seqlen=h.train_seq_len) + train_loss += loss.detach() + (loss / h.grad_accum_steps).backward() + train_loss /= h.grad_accum_steps
+ if step <= h.muon_momentum_warmup_steps: + frac = ( + min(step / h.muon_momentum_warmup_steps, 1.0) + if h.muon_momentum_warmup_steps > 0 + else 1.0 + ) + muon_momentum = ( + 1 - frac + ) * h.muon_momentum_warmup_start + frac * h.muon_momentum + for group in optimizers.optimizer_muon.param_groups: + group["momentum"] = muon_momentum
+ for opt in optimizers: + for group in opt.param_groups: + group["lr"] = group["base_lr"] * lr_scale + if h.grad_clip_norm > 0: + torch.nn.utils.clip_grad_norm_(_clip_params, h.grad_clip_norm) + optimizers.step(distributed=h.distributed) + return train_loss + + if h.warmup_steps > 0: + 
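# Compile/autotune warmup: snapshot model + optimizer state first, run
+ # real step_fn iterations (non-looped and looped, across the cu_seqlens
+ # buckets) to trigger torch.compile and kernel autotuning, then restore
+ # the snapshots so nothing from warmup leaks into the timed run.
+ 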
initial_model_state = { + name: tensor.detach().cpu().clone() + for (name, tensor) in base_model.state_dict().items() + } + initial_optimizer_states = [ + copy.deepcopy(opt.state_dict()) for opt in optimizers + ] + model.train() + num_tokens_local = h.train_batch_tokens // h.world_size + for blk in base_model.blocks: + blk.attn.rotary(num_tokens_local, device, torch.bfloat16) + cu_bucket_size = train_loader.cu_bucket_size + warmup_cu_buckets = tuple(cu_bucket_size * i for i in range(1, 5)) + warmup_cu_iters = 3 + x, y, cu_seqlens, _ = train_loader.next_batch( + h.train_batch_tokens, h.grad_accum_steps + ) + log(f"warmup_cu_buckets:{','.join(str(b) for b in warmup_cu_buckets)} iters_each:{warmup_cu_iters}") + def _run_cu_bucket_warmup(): + for bucket_len in warmup_cu_buckets: + boundaries = list(range(0, x.size(1), max(h.train_seq_len, 1))) + if boundaries[-1] != x.size(1): + boundaries.append(x.size(1)) + cu = torch.full((bucket_len,), x.size(1), dtype=torch.int32, device=device) + cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device) + for _ in range(warmup_cu_iters): + optimizers.zero_grad_all() + with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True): + wloss = model(x, y, cu_seqlens=cu, max_seqlen=h.train_seq_len) + (wloss / h.grad_accum_steps).backward() + optimizers.zero_grad_all() + _run_cu_bucket_warmup() + if h.num_loops > 0: + base_model.looping_active = True + _run_cu_bucket_warmup() + base_model.looping_active = False + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"warmup_step: {warmup_step+1}/{h.warmup_steps}") + if h.num_loops > 0: + base_model.looping_active = True + log( + f"loop_warmup:enabled encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"loop_warmup_step: {warmup_step+1}/{h.warmup_steps}") + base_model.looping_active = False + base_model.load_state_dict(initial_model_state, strict=True) + for (opt, state) in zip(optimizers, initial_optimizer_states, strict=True): + opt.load_state_dict(state) + optimizers.zero_grad_all() + train_loader = DocumentPackingLoader(h, device) + _live_state = base_model.state_dict(keep_vars=True) + ema_state = { + name: t.detach().float().clone() + for (name, t) in _live_state.items() + } + _ema_pairs = [(ema_state[name], t) for (name, t) in _live_state.items()] + ema_decay = h.ema_decay + training_time_ms = 0.0 + stop_after_step = None + torch.cuda.synchronize() + t0 = time.perf_counter() + step = 0 + while True: + last_step = ( + step == h.iterations + or stop_after_step is not None + and step >= stop_after_step + ) + should_validate = ( + last_step or h.val_loss_every > 0 and step % h.val_loss_every == 0 + ) + if should_validate: + torch.cuda.synchronize() + training_time_ms += 1e3 * (time.perf_counter() - t0) + val_loss, val_bpb = eval_val( + h, device, val_data, model, compiled_forward_logits + ) + log( + f"{step}/{h.iterations} val_loss: {val_loss:.4f} val_bpb: {val_bpb:.4f}" + ) + torch.cuda.synchronize() + t0 = time.perf_counter() + if last_step: + if stop_after_step is not None and step < h.iterations: + log( + f"stopping_early: wallclock_cap train_time: {training_time_ms:.0f}ms step: {step}/{h.iterations}" + ) + break + elapsed_ms = 
training_time_ms + 1e3 * (time.perf_counter() - t0) + frac = training_frac(step, elapsed_ms) + scale = lr_mul(frac) + if ( + h.num_loops > 0 + and not base_model.looping_active + and frac >= h.enable_looping_at + ): + base_model.looping_active = True + log( + f"layer_loop:enabled step:{step} frac:{frac:.3f} encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + train_loss = step_fn(step, scale) + with torch.no_grad(): + for ema_t, t in _ema_pairs: + ema_t.mul_(ema_decay).add_(t.detach(), alpha=1.0 - ema_decay) + step += 1 + approx_training_time_ms = training_time_ms + 1e3 * (time.perf_counter() - t0) + should_log_train = h.train_log_every > 0 and ( + step <= 5 or step % h.train_log_every == 0 or stop_after_step is not None + ) + if should_log_train: + tok_per_sec = step * h.train_batch_tokens / (approx_training_time_ms / 1e3) + log( + f"{step}/{h.iterations} train_loss: {train_loss.item():.4f} train_time: {approx_training_time_ms/60000:.1f}m tok/s: {tok_per_sec:.0f}" + ) + reached_cap = ( + max_wallclock_ms is not None and approx_training_time_ms >= max_wallclock_ms + ) + if h.distributed and max_wallclock_ms is not None: + reached_cap_tensor = torch.tensor(int(reached_cap), device=device) + dist.all_reduce(reached_cap_tensor, op=dist.ReduceOp.MAX) + reached_cap = bool(reached_cap_tensor.item()) + if stop_after_step is None and reached_cap: + stop_after_step = step + log( + f"peak memory allocated: {torch.cuda.max_memory_allocated()//1024//1024} MiB reserved: {torch.cuda.max_memory_reserved()//1024//1024} MiB" + ) + log("ema:applying EMA weights") + current_state = base_model.state_dict() + avg_state = { + name: t.to(dtype=current_state[name].dtype) for (name, t) in ema_state.items() + } + base_model.load_state_dict(avg_state, strict=True) + return base_model, compiled_model, compiled_forward_logits + + +def train_and_eval(h, device): + random.seed(h.seed) + np.random.seed(h.seed) + torch.manual_seed(h.seed) + torch.cuda.manual_seed_all(h.seed) + if h.artifact_dir and h.is_main_process: + os.makedirs(h.artifact_dir, exist_ok=True) + val_data = ValidationData(h, device) + log( + f"train_shards: {len(list(Path(h.datasets_dir).resolve().glob('fineweb_train_*.bin')))}" + ) + log(f"val_tokens: {val_data.val_tokens.numel()-1}") + # TTT_EVAL_ONLY: skip training + GPTQ, jump straight to TTT eval on a + # pre-existing quantized artifact. Used to test TTT-only improvements + # (e.g., PR-1767's alpha/warm-start/WD) without retraining. 
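+ # Illustrative invocation (assumed environment, not part of this log):
+ # TTT_EVAL_ONLY=1 torchrun --standalone --nproc_per_node=8 train_gpt.py
+ # reusing a logs/final_model.int6.ptz serialized by an earlier full run.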
+ ttt_eval_only = os.environ.get("TTT_EVAL_ONLY", "0") == "1" + if ttt_eval_only: + log("TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval") + log(f"ttt_lora_alpha: {BatchedLinearLoRA._ALPHA}") + log(f"ttt_warm_start_a: {BatchedLinearLoRA._WARM_START_A}") + log(f"ttt_weight_decay: {h.ttt_weight_decay}") + else: + base_model, compiled_model, compiled_forward_logits = train_model( + h, device, val_data + ) + torch._dynamo.reset() + timed_eval( + "diagnostic pre-quantization post-ema", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + if os.environ.get("PREQUANT_ONLY", "0") == "1": + log("PREQUANT_ONLY=1 — skipping serialize/GPTQ/post-quant eval/TTT") + return + serialize(h, base_model, Path(__file__).read_text(encoding="utf-8")) + if h.distributed: + dist.barrier() + eval_model = deserialize(h, device) + if h.num_loops > 0: + eval_model.looping_active = True + if not ttt_eval_only: + compiled_model = torch.compile(eval_model, dynamic=False, fullgraph=True) + compiled_forward_logits = torch.compile( + eval_model.forward_logits, dynamic=False, fullgraph=True + ) + timed_eval( + "diagnostic quantized", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + del eval_model + if h.ttt_enabled: + if not ttt_eval_only: + del compiled_model + if ttt_eval_only: + del eval_model + torch._dynamo.reset() + torch.cuda.empty_cache() + ttt_model = deserialize(h, device) + if h.num_loops > 0: + ttt_model.looping_active = True + for p in ttt_model.parameters(): + p.requires_grad_(False) + + if h.rope_yarn: + _yarn_seqlen = h.train_batch_tokens // h.grad_accum_steps + for block in ttt_model.blocks: + block.attn.rotary(_yarn_seqlen, device, torch.bfloat16) + else: + for block in ttt_model.blocks: + block.attn.rotary._cos_cached = None + block.attn.rotary._sin_cached = None + block.attn.rotary._seq_len_cached = 0 + block.attn.rotary(h.ttt_eval_seq_len, device, torch.bfloat16) + + def _fwd_ttt_inner(input_ids, target_ids, lora): + return ttt_model.forward_ttt(input_ids, target_ids, lora=lora) + + _fwd_ttt_compiled_inner = None + + def _fwd_ttt(input_ids, target_ids, lora): + nonlocal _fwd_ttt_compiled_inner + if _fwd_ttt_compiled_inner is None: + _fwd_ttt_compiled_inner = torch.compile(_fwd_ttt_inner, dynamic=True) + return _fwd_ttt_compiled_inner(input_ids, target_ids, lora=lora) + + fwd_ttt_compiled = _fwd_ttt + log(f"ttt_lora:warming up compile (random tokens, no val data)") + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + t_warmup = time.perf_counter() + warmup_bszes = [h.ttt_batch_size] + for bsz in warmup_bszes: + wl = BatchedTTTLoRA( + bsz, ttt_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + wo = torch.optim.AdamW( + wl.parameters(), + lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, + weight_decay=h.ttt_weight_decay, + fused=True, + ) + for ctx_len in (h.ttt_chunk_size, h.ttt_eval_seq_len): + xw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + yw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + ptl = fwd_ttt_compiled(xw, yw, lora=wl) + ptl[:, : min(h.ttt_chunk_size, ctx_len)].mean(dim=-1).sum().backward() + wo.step() + wo.zero_grad(set_to_none=True) + del wl, wo + torch.cuda.empty_cache() + compile_elapsed = time.perf_counter() - t_warmup + log(f"ttt_lora:compile warmup done 
({compile_elapsed:.1f}s)") + log("\nbeginning TTT eval timer") + torch.cuda.synchronize() + t_ttt = time.perf_counter() + ttt_val_loss, ttt_val_bpb = eval_val_ttt_phased( + h, ttt_model, device, val_data, forward_ttt_train=fwd_ttt_compiled + ) + torch.cuda.synchronize() + ttt_eval_elapsed = time.perf_counter() - t_ttt + log( + "quantized_ttt_phased " + f"val_loss:{ttt_val_loss:.8f} val_bpb:{ttt_val_bpb:.8f} " + f"eval_time:{1e3*ttt_eval_elapsed:.0f}ms" + ) + log(f"total_eval_time:{ttt_eval_elapsed:.1f}s") + del ttt_model + + +def main(): + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + if not torch.cuda.is_available(): + raise RuntimeError("CUDA is required") + if world_size <= 0: + raise ValueError(f"WORLD_SIZE must be positive, got {world_size}") + if 8 % world_size != 0: + raise ValueError( + f"WORLD_SIZE={world_size} must divide 8 so grad_accum_steps stays integral" + ) + device = torch.device("cuda", local_rank) + torch.cuda.set_device(device) + if distributed: + dist.init_process_group(backend="nccl", device_id=device) + dist.barrier() + torch.backends.cuda.matmul.allow_tf32 = True + torch.backends.cudnn.allow_tf32 = True + torch.set_float32_matmul_precision("high") + from torch.backends.cuda import ( + enable_cudnn_sdp, + enable_flash_sdp, + enable_math_sdp, + enable_mem_efficient_sdp, + ) + + enable_cudnn_sdp(False) + enable_flash_sdp(True) + enable_mem_efficient_sdp(False) + enable_math_sdp(False) + torch._dynamo.config.optimize_ddp = False + torch._dynamo.config.cache_size_limit = 64 + h = Hyperparameters() + set_logging_hparams(h) + if h.is_main_process: + os.makedirs(h.artifact_dir if h.artifact_dir else "logs", exist_ok=True) + log(100 * "=", console=False) + log("Hyperparameters:", console=True) + for (k, v) in sorted(vars(type(h)).items()): + if not k.startswith("_"): + log(f" {k}: {v}", console=True) + log("=" * 100, console=False) + log("Source code:", console=False) + log("=" * 100, console=False) + with open(__file__, "r", encoding="utf-8") as _src: + log(_src.read(), console=False) + log("=" * 100, console=False) + log(f"Running Python {sys.version}", console=False) + log(f"Running PyTorch {torch.__version__}", console=False) + log("=" * 100, console=False) + train_and_eval(h, device) + if distributed: + dist.destroy_process_group() + + +if __name__ == "__main__": + main() + +==================================================================================================== +Running Python 3.11.10 (main, Sep 7 2024, 18:35:41) [GCC 11.4.0] +Running PyTorch 2.11.0+cu130 +==================================================================================================== +train_shards: 80 +val_tokens: 46688256 +model_params:35552455 +gptq:reserving 0s, effective=599500ms +warmup_cu_buckets:64,128,192,256 iters_each:3 +warmup_step: 1/20 +warmup_step: 2/20 +warmup_step: 3/20 +warmup_step: 4/20 +warmup_step: 5/20 +warmup_step: 6/20 +warmup_step: 10/20 +warmup_step: 20/20 +loop_warmup:enabled encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +loop_warmup_step: 1/20 +loop_warmup_step: 2/20 +loop_warmup_step: 3/20 +loop_warmup_step: 4/20 +loop_warmup_step: 5/20 +loop_warmup_step: 6/20 +loop_warmup_step: 10/20 +loop_warmup_step: 20/20 +1/20000 train_loss: 9.2417 train_time: 0.0m tok/s: 17090267 +2/20000 train_loss: 13.0257 train_time: 0.0m tok/s: 7627690 +3/20000 train_loss: 10.4072 train_time: 0.0m tok/s: 7885081 +4/20000 
train_loss: 9.0213 train_time: 0.0m tok/s: 7987632 +5/20000 train_loss: 8.0270 train_time: 0.0m tok/s: 8093319 +500/20000 train_loss: 2.8314 train_time: 0.8m tok/s: 8361736 +1000/20000 train_loss: 2.7614 train_time: 1.6m tok/s: 8338310 +1500/20000 train_loss: 2.6989 train_time: 2.4m tok/s: 8333829 +2000/20000 train_loss: 2.7614 train_time: 3.1m tok/s: 8334527 +2500/20000 train_loss: 2.6684 train_time: 3.9m tok/s: 8337016 +layer_loop:enabled step:2861 frac:0.450 encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +3000/20000 train_loss: 2.6377 train_time: 4.8m tok/s: 8162299 +3500/20000 train_loss: 2.6143 train_time: 6.0m tok/s: 7682798 +4000/20000 train_loss: 2.5371 train_time: 7.2m tok/s: 7305690 +4500/20000 train_loss: 2.5056 train_time: 8.3m tok/s: 7081000 +5000/20000 train_loss: 2.4696 train_time: 9.5m tok/s: 6911256 +5221/20000 val_loss: 2.4217 val_bpb: 1.0797 +stopping_early: wallclock_cap train_time: 599587ms step: 5221/20000 +peak memory allocated: 41660 MiB reserved: 48494 MiB +ema:applying EMA weights +diagnostic pre-quantization post-ema val_loss:2.39266005 val_bpb:1.06675795 eval_time:11062ms +Serialized model: 131747517 bytes +Code size (uncompressed): 162947 bytes +Code size (compressed): 41104 bytes +GPTQ:collecting Hessians from calibration data... +GPTQ:collected 67 Hessians in 3.5s +Quantized weights: + gate_int8_row: blocks.attn.attn_gate_w + gptq (int6): blocks.attn.c_k.weight, blocks.attn.c_q.weight, blocks.attn.c_v.weight, blocks.attn.proj.weight, blocks.mlp.fc.weight, blocks.mlp.proj.weight + gptq (int6)+lqer_asym: blocks.mlp.fc.weight + gptq (int7)+lqer_asym: tok_emb.weight + passthrough (float16): blocks.attn.q_gain, blocks.attn_scale, blocks.mlp_scale, blocks.resid_mix, parallel_post_lambdas, parallel_resid_lambdas, skip_gates, skip_weights, smear_gate.weight, smear_lambda +Serialize: per-group lrzip compression... +Serialize: per-group compression done in 136.3s +Serialized model quantized+pergroup: 15769440 bytes +Total submission size quantized+pergroup: 15810544 bytes +Deserialize: per-group lrzip decompression... +Deserialize: decompression done in 21.1s +diagnostic quantized val_loss:2.41224213 val_bpb:1.07548854 eval_time:73997ms +Deserialize: per-group lrzip decompression... 
+Deserialize: decompression done in 21.1s +ttt_lora:warming up compile (random tokens, no val data) +ttt_lora:compile warmup done (183.2s) + +beginning TTT eval timer +ttt_phased: total_docs:50000 prefix_docs:2500 suffix_docs:47500 num_phases:3 boundaries:[833, 1666, 2500] +ttp: b777/782 bl:2.3579 bb:1.0806 rl:2.3579 rb:1.0806 dl:8216-9027 gd:0 +ttp: b772/782 bl:2.3523 bb:1.0914 rl:2.3556 rb:1.0849 dl:5632-5956 gd:0 +ttp: b769/782 bl:2.3769 bb:1.0882 rl:2.3612 rb:1.0858 dl:4969-5170 gd:0 +ttpp: phase:1/3 pd:1296 gd:833 t:227.0s +tttg: c1/127 lr:0.001000 t:1.4s +tttg: c2/127 lr:0.001000 t:1.5s +tttg: c3/127 lr:0.000999 t:1.6s +tttg: c4/127 lr:0.000999 t:1.6s +tttg: c5/127 lr:0.000998 t:1.7s +tttg: c6/127 lr:0.000996 t:1.8s +tttg: c7/127 lr:0.000994 t:1.9s +tttg: c8/127 lr:0.000992 t:2.0s +tttg: c9/127 lr:0.000990 t:2.0s +tttg: c10/127 lr:0.000987 t:2.1s +tttg: c11/127 lr:0.000985 t:2.2s +tttg: c12/127 lr:0.000981 t:2.3s +tttg: c13/127 lr:0.000978 t:2.4s +tttg: c14/127 lr:0.000974 t:2.4s +tttg: c15/127 lr:0.000970 t:2.5s +tttg: c16/127 lr:0.000965 t:2.6s +tttg: c17/127 lr:0.000961 t:2.7s +tttg: c18/127 lr:0.000956 t:2.8s +tttg: c19/127 lr:0.000950 t:2.8s +tttg: c20/127 lr:0.000945 t:2.9s +tttg: c21/127 lr:0.000939 t:3.0s +tttg: c22/127 lr:0.000933 t:3.1s +tttg: c23/127 lr:0.000927 t:3.2s +tttg: c24/127 lr:0.000920 t:3.2s +tttg: c25/127 lr:0.000913 t:3.3s +tttg: c26/127 lr:0.000906 t:3.4s +tttg: c27/127 lr:0.000899 t:3.5s +tttg: c28/127 lr:0.000891 t:3.5s +tttg: c29/127 lr:0.000883 t:3.6s +tttg: c30/127 lr:0.000875 t:3.7s +tttg: c31/127 lr:0.000867 t:3.8s +tttg: c32/127 lr:0.000858 t:3.9s +tttg: c33/127 lr:0.000849 t:3.9s +tttg: c34/127 lr:0.000840 t:4.0s +tttg: c35/127 lr:0.000831 t:4.1s +tttg: c36/127 lr:0.000821 t:4.2s +tttg: c37/127 lr:0.000812 t:4.3s +tttg: c38/127 lr:0.000802 t:4.4s +tttg: c39/127 lr:0.000792 t:4.4s +tttg: c40/127 lr:0.000782 t:4.5s +tttg: c41/127 lr:0.000771 t:4.6s +tttg: c42/127 lr:0.000761 t:4.7s +tttg: c43/127 lr:0.000750 t:4.7s +tttg: c44/127 lr:0.000739 t:4.8s +tttg: c45/127 lr:0.000728 t:4.9s +tttg: c46/127 lr:0.000717 t:5.0s +tttg: c47/127 lr:0.000706 t:5.1s +tttg: c48/127 lr:0.000694 t:5.1s +tttg: c49/127 lr:0.000683 t:5.2s +tttg: c50/127 lr:0.000671 t:5.3s +tttg: c51/127 lr:0.000659 t:5.4s +tttg: c52/127 lr:0.000647 t:5.5s +tttg: c53/127 lr:0.000635 t:5.5s +tttg: c54/127 lr:0.000623 t:5.6s +tttg: c55/127 lr:0.000611 t:5.7s +tttg: c56/127 lr:0.000599 t:5.8s +tttg: c57/127 lr:0.000587 t:5.9s +tttg: c58/127 lr:0.000575 t:5.9s +tttg: c59/127 lr:0.000562 t:6.0s +tttg: c60/127 lr:0.000550 t:6.1s +tttg: c61/127 lr:0.000537 t:6.2s +tttg: c62/127 lr:0.000525 t:6.3s +tttg: c63/127 lr:0.000512 t:6.3s +tttg: c64/127 lr:0.000500 t:6.4s +tttg: c65/127 lr:0.000488 t:6.5s +tttg: c66/127 lr:0.000475 t:6.6s +tttg: c67/127 lr:0.000463 t:6.7s +tttg: c68/127 lr:0.000450 t:6.7s +tttg: c69/127 lr:0.000438 t:6.8s +tttg: c70/127 lr:0.000425 t:6.9s +tttg: c71/127 lr:0.000413 t:7.0s +tttg: c72/127 lr:0.000401 t:7.1s +tttg: c73/127 lr:0.000389 t:7.1s +tttg: c74/127 lr:0.000377 t:7.2s +tttg: c75/127 lr:0.000365 t:7.3s +tttg: c76/127 lr:0.000353 t:7.4s +tttg: c77/127 lr:0.000341 t:7.5s +tttg: c78/127 lr:0.000329 t:7.5s +tttg: c79/127 lr:0.000317 t:7.6s +tttg: c80/127 lr:0.000306 t:7.7s +tttg: c81/127 lr:0.000294 t:7.8s +tttg: c82/127 lr:0.000283 t:7.9s +tttg: c83/127 lr:0.000272 t:7.9s +tttg: c84/127 lr:0.000261 t:8.0s +tttg: c85/127 lr:0.000250 t:8.1s +tttg: c86/127 lr:0.000239 t:8.2s +tttg: c87/127 lr:0.000229 t:8.3s +tttg: c88/127 lr:0.000218 t:8.3s +tttg: c89/127 lr:0.000208 t:8.4s 
+tttg: c90/127 lr:0.000198 t:8.5s +tttg: c91/127 lr:0.000188 t:8.6s +tttg: c92/127 lr:0.000179 t:8.7s +tttg: c93/127 lr:0.000169 t:8.7s +tttg: c94/127 lr:0.000160 t:8.8s +tttg: c95/127 lr:0.000151 t:8.9s +tttg: c96/127 lr:0.000142 t:9.0s +tttg: c97/127 lr:0.000133 t:9.0s +tttg: c98/127 lr:0.000125 t:9.1s +tttg: c99/127 lr:0.000117 t:9.2s +tttg: c100/127 lr:0.000109 t:9.3s +tttg: c101/127 lr:0.000101 t:9.4s +tttg: c102/127 lr:0.000094 t:9.5s +tttg: c103/127 lr:0.000087 t:9.5s +tttg: c104/127 lr:0.000080 t:9.6s +tttg: c105/127 lr:0.000073 t:9.7s +tttg: c106/127 lr:0.000067 t:9.8s +tttg: c107/127 lr:0.000061 t:9.9s +tttg: c108/127 lr:0.000055 t:9.9s +tttg: c109/127 lr:0.000050 t:10.0s +tttg: c110/127 lr:0.000044 t:10.1s +tttg: c111/127 lr:0.000039 t:10.2s +tttg: c112/127 lr:0.000035 t:10.3s +tttg: c113/127 lr:0.000030 t:10.3s +tttg: c114/127 lr:0.000026 t:10.4s +tttg: c115/127 lr:0.000022 t:10.5s +tttg: c116/127 lr:0.000019 t:10.6s +tttg: c117/127 lr:0.000015 t:10.7s +tttg: c118/127 lr:0.000013 t:10.7s +tttg: c119/127 lr:0.000010 t:10.8s +tttg: c120/127 lr:0.000008 t:10.9s +tttg: c121/127 lr:0.000006 t:11.0s +tttg: c122/127 lr:0.000004 t:11.1s +tttg: c123/127 lr:0.000002 t:11.1s +tttg: c124/127 lr:0.000001 t:11.2s +tttg: c125/127 lr:0.000001 t:11.3s +tttg: c126/127 lr:0.000000 t:11.4s +ttpr: phase:1/3 t:240.2s +ttp: b756/782 bl:2.3915 bb:1.0324 rl:2.3657 rb:1.0773 dl:3382-3461 gd:0 +ttp: b750/782 bl:2.4167 bb:1.0845 rl:2.3717 rb:1.0782 dl:3016-3073 gd:0 +ttpp: phase:2/3 pd:2128 gd:1666 t:363.4s +tttg: c1/214 lr:0.001000 t:0.1s +tttg: c2/214 lr:0.001000 t:0.2s +tttg: c3/214 lr:0.001000 t:0.3s +tttg: c4/214 lr:0.001000 t:0.3s +tttg: c5/214 lr:0.000999 t:0.4s +tttg: c6/214 lr:0.000999 t:0.5s +tttg: c7/214 lr:0.000998 t:0.6s +tttg: c8/214 lr:0.000997 t:0.6s +tttg: c9/214 lr:0.000997 t:0.7s +tttg: c10/214 lr:0.000996 t:0.8s +tttg: c11/214 lr:0.000995 t:0.9s +tttg: c12/214 lr:0.000993 t:0.9s +tttg: c13/214 lr:0.000992 t:1.0s +tttg: c14/214 lr:0.000991 t:1.1s +tttg: c15/214 lr:0.000989 t:1.2s +tttg: c16/214 lr:0.000988 t:1.3s +tttg: c17/214 lr:0.000986 t:1.3s +tttg: c18/214 lr:0.000984 t:1.4s +tttg: c19/214 lr:0.000982 t:1.5s +tttg: c20/214 lr:0.000980 t:1.6s +tttg: c21/214 lr:0.000978 t:1.7s +tttg: c22/214 lr:0.000976 t:1.8s +tttg: c23/214 lr:0.000974 t:1.8s +tttg: c24/214 lr:0.000972 t:1.9s +tttg: c25/214 lr:0.000969 t:2.0s +tttg: c26/214 lr:0.000966 t:2.1s +tttg: c27/214 lr:0.000964 t:2.2s +tttg: c28/214 lr:0.000961 t:2.2s +tttg: c29/214 lr:0.000958 t:2.3s +tttg: c30/214 lr:0.000955 t:2.4s +tttg: c31/214 lr:0.000952 t:2.5s +tttg: c32/214 lr:0.000949 t:2.6s +tttg: c33/214 lr:0.000945 t:2.6s +tttg: c34/214 lr:0.000942 t:2.7s +tttg: c35/214 lr:0.000938 t:2.8s +tttg: c36/214 lr:0.000935 t:2.9s +tttg: c37/214 lr:0.000931 t:2.9s +tttg: c38/214 lr:0.000927 t:3.0s +tttg: c39/214 lr:0.000924 t:3.1s +tttg: c40/214 lr:0.000920 t:3.2s +tttg: c41/214 lr:0.000915 t:3.3s +tttg: c42/214 lr:0.000911 t:3.4s +tttg: c43/214 lr:0.000907 t:3.4s +tttg: c44/214 lr:0.000903 t:3.5s +tttg: c45/214 lr:0.000898 t:3.6s +tttg: c46/214 lr:0.000894 t:3.7s +tttg: c47/214 lr:0.000889 t:3.8s +tttg: c48/214 lr:0.000885 t:3.8s +tttg: c49/214 lr:0.000880 t:3.9s +tttg: c50/214 lr:0.000875 t:4.0s +tttg: c51/214 lr:0.000870 t:4.1s +tttg: c52/214 lr:0.000865 t:4.2s +tttg: c53/214 lr:0.000860 t:4.2s +tttg: c54/214 lr:0.000855 t:4.3s +tttg: c55/214 lr:0.000850 t:4.4s +tttg: c56/214 lr:0.000844 t:4.5s +tttg: c57/214 lr:0.000839 t:4.5s +tttg: c58/214 lr:0.000833 t:4.6s +tttg: c59/214 lr:0.000828 t:4.7s +tttg: c60/214 lr:0.000822 t:4.8s +tttg: 
c61/214 lr:0.000817 t:4.9s +tttg: c62/214 lr:0.000811 t:4.9s +tttg: c63/214 lr:0.000805 t:5.0s +tttg: c64/214 lr:0.000799 t:5.1s +tttg: c65/214 lr:0.000793 t:5.2s +tttg: c66/214 lr:0.000787 t:5.3s +tttg: c67/214 lr:0.000781 t:5.4s +tttg: c68/214 lr:0.000775 t:5.4s +tttg: c69/214 lr:0.000769 t:5.5s +tttg: c70/214 lr:0.000763 t:5.6s +tttg: c71/214 lr:0.000756 t:5.7s +tttg: c72/214 lr:0.000750 t:5.7s +tttg: c73/214 lr:0.000744 t:5.8s +tttg: c74/214 lr:0.000737 t:5.9s +tttg: c75/214 lr:0.000731 t:6.0s +tttg: c76/214 lr:0.000724 t:6.1s +tttg: c77/214 lr:0.000717 t:6.1s +tttg: c78/214 lr:0.000711 t:6.2s +tttg: c79/214 lr:0.000704 t:6.3s +tttg: c80/214 lr:0.000697 t:6.4s +tttg: c81/214 lr:0.000690 t:6.5s +tttg: c82/214 lr:0.000684 t:6.6s +tttg: c83/214 lr:0.000677 t:6.6s +tttg: c84/214 lr:0.000670 t:6.7s +tttg: c85/214 lr:0.000663 t:6.8s +tttg: c86/214 lr:0.000656 t:6.9s +tttg: c87/214 lr:0.000649 t:6.9s +tttg: c88/214 lr:0.000642 t:7.0s +tttg: c89/214 lr:0.000635 t:7.1s +tttg: c90/214 lr:0.000628 t:7.2s +tttg: c91/214 lr:0.000620 t:7.3s +tttg: c92/214 lr:0.000613 t:7.3s +tttg: c93/214 lr:0.000606 t:7.4s +tttg: c94/214 lr:0.000599 t:7.5s +tttg: c95/214 lr:0.000592 t:7.6s +tttg: c96/214 lr:0.000584 t:7.7s +tttg: c97/214 lr:0.000577 t:7.7s +tttg: c98/214 lr:0.000570 t:7.8s +tttg: c99/214 lr:0.000563 t:7.9s +tttg: c100/214 lr:0.000555 t:8.0s +tttg: c101/214 lr:0.000548 t:8.1s +tttg: c102/214 lr:0.000541 t:8.1s +tttg: c103/214 lr:0.000533 t:8.2s +tttg: c104/214 lr:0.000526 t:8.3s +tttg: c105/214 lr:0.000518 t:8.4s +tttg: c106/214 lr:0.000511 t:8.5s +tttg: c107/214 lr:0.000504 t:8.5s +tttg: c108/214 lr:0.000496 t:8.6s +tttg: c109/214 lr:0.000489 t:8.7s +tttg: c110/214 lr:0.000482 t:8.8s +tttg: c111/214 lr:0.000474 t:8.9s +tttg: c112/214 lr:0.000467 t:8.9s +tttg: c113/214 lr:0.000459 t:9.0s +tttg: c114/214 lr:0.000452 t:9.1s +tttg: c115/214 lr:0.000445 t:9.2s +tttg: c116/214 lr:0.000437 t:9.3s +tttg: c117/214 lr:0.000430 t:9.3s +tttg: c118/214 lr:0.000423 t:9.4s +tttg: c119/214 lr:0.000416 t:9.5s +tttg: c120/214 lr:0.000408 t:9.6s +tttg: c121/214 lr:0.000401 t:9.7s +tttg: c122/214 lr:0.000394 t:9.7s +tttg: c123/214 lr:0.000387 t:9.8s +tttg: c124/214 lr:0.000380 t:9.9s +tttg: c125/214 lr:0.000372 t:10.0s +tttg: c126/214 lr:0.000365 t:10.1s +tttg: c127/214 lr:0.000358 t:10.1s +tttg: c128/214 lr:0.000351 t:10.2s +tttg: c129/214 lr:0.000344 t:10.3s +tttg: c130/214 lr:0.000337 t:10.4s +tttg: c131/214 lr:0.000330 t:10.5s +tttg: c132/214 lr:0.000323 t:10.5s +tttg: c133/214 lr:0.000316 t:10.6s +tttg: c134/214 lr:0.000310 t:10.7s +tttg: c135/214 lr:0.000303 t:10.8s +tttg: c136/214 lr:0.000296 t:10.9s +tttg: c137/214 lr:0.000289 t:10.9s +tttg: c138/214 lr:0.000283 t:11.0s +tttg: c139/214 lr:0.000276 t:11.1s +tttg: c140/214 lr:0.000269 t:11.2s +tttg: c141/214 lr:0.000263 t:11.3s +tttg: c142/214 lr:0.000256 t:11.4s +tttg: c143/214 lr:0.000250 t:11.4s +tttg: c144/214 lr:0.000244 t:11.5s +tttg: c145/214 lr:0.000237 t:11.6s +tttg: c146/214 lr:0.000231 t:11.7s +tttg: c147/214 lr:0.000225 t:11.7s +tttg: c148/214 lr:0.000219 t:11.8s +tttg: c149/214 lr:0.000213 t:11.9s +tttg: c150/214 lr:0.000207 t:12.0s +tttg: c151/214 lr:0.000201 t:12.1s +tttg: c152/214 lr:0.000195 t:12.1s +tttg: c153/214 lr:0.000189 t:12.2s +tttg: c154/214 lr:0.000183 t:12.3s +tttg: c155/214 lr:0.000178 t:12.4s +tttg: c156/214 lr:0.000172 t:12.5s +tttg: c157/214 lr:0.000167 t:12.5s +tttg: c158/214 lr:0.000161 t:12.6s +tttg: c159/214 lr:0.000156 t:12.7s +tttg: c160/214 lr:0.000150 t:12.8s +tttg: c161/214 lr:0.000145 t:12.9s +tttg: c162/214 lr:0.000140 
t:12.9s +tttg: c163/214 lr:0.000135 t:13.0s +tttg: c164/214 lr:0.000130 t:13.1s +tttg: c165/214 lr:0.000125 t:13.2s +tttg: c166/214 lr:0.000120 t:13.2s +tttg: c167/214 lr:0.000115 t:13.3s +tttg: c168/214 lr:0.000111 t:13.4s +tttg: c169/214 lr:0.000106 t:13.5s +tttg: c170/214 lr:0.000102 t:13.6s +tttg: c171/214 lr:0.000097 t:13.6s +tttg: c172/214 lr:0.000093 t:13.7s +tttg: c173/214 lr:0.000089 t:13.8s +tttg: c174/214 lr:0.000085 t:13.9s +tttg: c175/214 lr:0.000080 t:14.0s +tttg: c176/214 lr:0.000076 t:14.0s +tttg: c177/214 lr:0.000073 t:14.1s +tttg: c178/214 lr:0.000069 t:14.2s +tttg: c179/214 lr:0.000065 t:14.3s +tttg: c180/214 lr:0.000062 t:14.3s +tttg: c181/214 lr:0.000058 t:14.4s +tttg: c182/214 lr:0.000055 t:14.5s +tttg: c183/214 lr:0.000051 t:14.6s +tttg: c184/214 lr:0.000048 t:14.7s +tttg: c185/214 lr:0.000045 t:14.7s +tttg: c186/214 lr:0.000042 t:14.8s +tttg: c187/214 lr:0.000039 t:14.9s +tttg: c188/214 lr:0.000036 t:15.0s +tttg: c189/214 lr:0.000034 t:15.1s +tttg: c190/214 lr:0.000031 t:15.1s +tttg: c191/214 lr:0.000028 t:15.2s +tttg: c192/214 lr:0.000026 t:15.3s +tttg: c193/214 lr:0.000024 t:15.4s +tttg: c194/214 lr:0.000022 t:15.5s +tttg: c195/214 lr:0.000020 t:15.5s +tttg: c196/214 lr:0.000018 t:15.6s +tttg: c197/214 lr:0.000016 t:15.7s +tttg: c198/214 lr:0.000014 t:15.8s +tttg: c199/214 lr:0.000012 t:15.8s +tttg: c200/214 lr:0.000011 t:15.9s +tttg: c201/214 lr:0.000009 t:16.0s +tttg: c202/214 lr:0.000008 t:16.1s +tttg: c203/214 lr:0.000007 t:16.2s +tttg: c204/214 lr:0.000005 t:16.2s +tttg: c205/214 lr:0.000004 t:16.3s +tttg: c206/214 lr:0.000003 t:16.4s +tttg: c207/214 lr:0.000003 t:16.5s +tttg: c208/214 lr:0.000002 t:16.5s +tttg: c209/214 lr:0.000001 t:16.6s +tttg: c210/214 lr:0.000001 t:16.7s +tttg: c211/214 lr:0.000000 t:16.8s +tttg: c212/214 lr:0.000000 t:16.9s +tttg: c213/214 lr:0.000000 t:16.9s +ttpr: phase:2/3 t:382.0s +ttp: b748/782 bl:2.3812 bb:1.0805 rl:2.3727 rb:1.0784 dl:2918-2965 gd:0 +ttpp: phase:3/3 pd:2960 gd:2500 t:396.9s +tttg: c1/282 lr:0.001000 t:0.1s +tttg: c2/282 lr:0.001000 t:0.2s +tttg: c3/282 lr:0.001000 t:0.2s +tttg: c4/282 lr:0.001000 t:0.3s +tttg: c5/282 lr:0.001000 t:0.4s +tttg: c6/282 lr:0.000999 t:0.5s +tttg: c7/282 lr:0.000999 t:0.5s +tttg: c8/282 lr:0.000998 t:0.6s +tttg: c9/282 lr:0.000998 t:0.7s +tttg: c10/282 lr:0.000997 t:0.8s +tttg: c11/282 lr:0.000997 t:0.9s +tttg: c12/282 lr:0.000996 t:0.9s +tttg: c13/282 lr:0.000996 t:1.0s +tttg: c14/282 lr:0.000995 t:1.1s +tttg: c15/282 lr:0.000994 t:1.2s +tttg: c16/282 lr:0.000993 t:1.2s +tttg: c17/282 lr:0.000992 t:1.3s +tttg: c18/282 lr:0.000991 t:1.4s +tttg: c19/282 lr:0.000990 t:1.5s +tttg: c20/282 lr:0.000989 t:1.6s +tttg: c21/282 lr:0.000988 t:1.7s +tttg: c22/282 lr:0.000986 t:1.7s +tttg: c23/282 lr:0.000985 t:1.8s +tttg: c24/282 lr:0.000984 t:1.9s +tttg: c25/282 lr:0.000982 t:2.0s +tttg: c26/282 lr:0.000981 t:2.1s +tttg: c27/282 lr:0.000979 t:2.1s +tttg: c28/282 lr:0.000977 t:2.2s +tttg: c29/282 lr:0.000976 t:2.3s +tttg: c30/282 lr:0.000974 t:2.4s +tttg: c31/282 lr:0.000972 t:2.5s +tttg: c32/282 lr:0.000970 t:2.5s +tttg: c33/282 lr:0.000968 t:2.6s +tttg: c34/282 lr:0.000966 t:2.7s +tttg: c35/282 lr:0.000964 t:2.8s +tttg: c36/282 lr:0.000962 t:2.9s +tttg: c37/282 lr:0.000960 t:2.9s +tttg: c38/282 lr:0.000958 t:3.0s +tttg: c39/282 lr:0.000956 t:3.1s +tttg: c40/282 lr:0.000953 t:3.2s +tttg: c41/282 lr:0.000951 t:3.3s +tttg: c42/282 lr:0.000948 t:3.3s +tttg: c43/282 lr:0.000946 t:3.4s +tttg: c44/282 lr:0.000943 t:3.5s +tttg: c45/282 lr:0.000941 t:3.6s +tttg: c46/282 lr:0.000938 t:3.7s +tttg: c47/282 
lr:0.000935 t:3.7s +tttg: c48/282 lr:0.000933 t:3.8s +tttg: c49/282 lr:0.000930 t:3.9s +tttg: c50/282 lr:0.000927 t:4.0s +tttg: c51/282 lr:0.000924 t:4.1s +tttg: c52/282 lr:0.000921 t:4.1s +tttg: c53/282 lr:0.000918 t:4.2s +tttg: c54/282 lr:0.000915 t:4.3s +tttg: c55/282 lr:0.000912 t:4.4s +tttg: c56/282 lr:0.000908 t:4.5s +tttg: c57/282 lr:0.000905 t:4.5s +tttg: c58/282 lr:0.000902 t:4.6s +tttg: c59/282 lr:0.000899 t:4.7s +tttg: c60/282 lr:0.000895 t:4.8s +tttg: c61/282 lr:0.000892 t:4.9s +tttg: c62/282 lr:0.000888 t:4.9s +tttg: c63/282 lr:0.000885 t:5.0s +tttg: c64/282 lr:0.000881 t:5.1s +tttg: c65/282 lr:0.000877 t:5.2s +tttg: c66/282 lr:0.000874 t:5.3s +tttg: c67/282 lr:0.000870 t:5.3s +tttg: c68/282 lr:0.000866 t:5.4s +tttg: c69/282 lr:0.000862 t:5.5s +tttg: c70/282 lr:0.000858 t:5.6s +tttg: c71/282 lr:0.000855 t:5.6s +tttg: c72/282 lr:0.000851 t:5.7s +tttg: c73/282 lr:0.000847 t:5.8s +tttg: c74/282 lr:0.000843 t:5.9s +tttg: c75/282 lr:0.000838 t:6.0s +tttg: c76/282 lr:0.000834 t:6.0s +tttg: c77/282 lr:0.000830 t:6.1s +tttg: c78/282 lr:0.000826 t:6.2s +tttg: c79/282 lr:0.000822 t:6.3s +tttg: c80/282 lr:0.000817 t:6.4s +tttg: c81/282 lr:0.000813 t:6.4s +tttg: c82/282 lr:0.000809 t:6.5s +tttg: c83/282 lr:0.000804 t:6.6s +tttg: c84/282 lr:0.000800 t:6.7s +tttg: c85/282 lr:0.000795 t:6.8s +tttg: c86/282 lr:0.000791 t:6.8s +tttg: c87/282 lr:0.000786 t:6.9s +tttg: c88/282 lr:0.000782 t:7.0s +tttg: c89/282 lr:0.000777 t:7.1s +tttg: c90/282 lr:0.000772 t:7.2s +tttg: c91/282 lr:0.000768 t:7.2s +tttg: c92/282 lr:0.000763 t:7.3s +tttg: c93/282 lr:0.000758 t:7.4s +tttg: c94/282 lr:0.000753 t:7.5s +tttg: c95/282 lr:0.000748 t:7.6s +tttg: c96/282 lr:0.000744 t:7.6s +tttg: c97/282 lr:0.000739 t:7.7s +tttg: c98/282 lr:0.000734 t:7.8s +tttg: c99/282 lr:0.000729 t:7.9s +tttg: c100/282 lr:0.000724 t:8.0s +tttg: c101/282 lr:0.000719 t:8.0s +tttg: c102/282 lr:0.000714 t:8.1s +tttg: c103/282 lr:0.000709 t:8.2s +tttg: c104/282 lr:0.000704 t:8.3s +tttg: c105/282 lr:0.000698 t:8.4s +tttg: c106/282 lr:0.000693 t:8.4s +tttg: c107/282 lr:0.000688 t:8.5s +tttg: c108/282 lr:0.000683 t:8.6s +tttg: c109/282 lr:0.000678 t:8.7s +tttg: c110/282 lr:0.000672 t:8.8s +tttg: c111/282 lr:0.000667 t:8.8s +tttg: c112/282 lr:0.000662 t:8.9s +tttg: c113/282 lr:0.000657 t:9.0s +tttg: c114/282 lr:0.000651 t:9.1s +tttg: c115/282 lr:0.000646 t:9.2s +tttg: c116/282 lr:0.000641 t:9.2s +tttg: c117/282 lr:0.000635 t:9.3s +tttg: c118/282 lr:0.000630 t:9.4s +tttg: c119/282 lr:0.000624 t:9.5s +tttg: c120/282 lr:0.000619 t:9.6s +tttg: c121/282 lr:0.000614 t:9.6s +tttg: c122/282 lr:0.000608 t:9.7s +tttg: c123/282 lr:0.000603 t:9.8s +tttg: c124/282 lr:0.000597 t:9.9s +tttg: c125/282 lr:0.000592 t:10.0s +tttg: c126/282 lr:0.000586 t:10.0s +tttg: c127/282 lr:0.000581 t:10.1s +tttg: c128/282 lr:0.000575 t:10.2s +tttg: c129/282 lr:0.000570 t:10.3s +tttg: c130/282 lr:0.000564 t:10.4s +tttg: c131/282 lr:0.000559 t:10.5s +tttg: c132/282 lr:0.000553 t:10.5s +tttg: c133/282 lr:0.000547 t:10.6s +tttg: c134/282 lr:0.000542 t:10.7s +tttg: c135/282 lr:0.000536 t:10.8s +tttg: c136/282 lr:0.000531 t:10.8s +tttg: c137/282 lr:0.000525 t:10.9s +tttg: c138/282 lr:0.000520 t:11.0s +tttg: c139/282 lr:0.000514 t:11.1s +tttg: c140/282 lr:0.000508 t:11.2s +tttg: c141/282 lr:0.000503 t:11.2s +tttg: c142/282 lr:0.000497 t:11.3s +tttg: c143/282 lr:0.000492 t:11.4s +tttg: c144/282 lr:0.000486 t:11.5s +tttg: c145/282 lr:0.000480 t:11.6s +tttg: c146/282 lr:0.000475 t:11.6s +tttg: c147/282 lr:0.000469 t:11.7s +tttg: c148/282 lr:0.000464 t:11.8s +tttg: c149/282 lr:0.000458 
t:11.9s +tttg: c150/282 lr:0.000453 t:12.0s +tttg: c151/282 lr:0.000447 t:12.0s +tttg: c152/282 lr:0.000441 t:12.1s +tttg: c153/282 lr:0.000436 t:12.2s +tttg: c154/282 lr:0.000430 t:12.3s +tttg: c155/282 lr:0.000425 t:12.4s +tttg: c156/282 lr:0.000419 t:12.4s +tttg: c157/282 lr:0.000414 t:12.5s +tttg: c158/282 lr:0.000408 t:12.6s +tttg: c159/282 lr:0.000403 t:12.7s +tttg: c160/282 lr:0.000397 t:12.8s +tttg: c161/282 lr:0.000392 t:12.8s +tttg: c162/282 lr:0.000386 t:12.9s +tttg: c163/282 lr:0.000381 t:13.0s +tttg: c164/282 lr:0.000376 t:13.1s +tttg: c165/282 lr:0.000370 t:13.2s +tttg: c166/282 lr:0.000365 t:13.2s +tttg: c167/282 lr:0.000359 t:13.3s +tttg: c168/282 lr:0.000354 t:13.4s +tttg: c169/282 lr:0.000349 t:13.5s +tttg: c170/282 lr:0.000343 t:13.6s +tttg: c171/282 lr:0.000338 t:13.6s +tttg: c172/282 lr:0.000333 t:13.7s +tttg: c173/282 lr:0.000328 t:13.8s +tttg: c174/282 lr:0.000322 t:13.9s +tttg: c175/282 lr:0.000317 t:14.0s +tttg: c176/282 lr:0.000312 t:14.0s +tttg: c177/282 lr:0.000307 t:14.1s +tttg: c178/282 lr:0.000302 t:14.2s +tttg: c179/282 lr:0.000296 t:14.3s +tttg: c180/282 lr:0.000291 t:14.4s +tttg: c181/282 lr:0.000286 t:14.4s +tttg: c182/282 lr:0.000281 t:14.5s +tttg: c183/282 lr:0.000276 t:14.6s +tttg: c184/282 lr:0.000271 t:14.7s +tttg: c185/282 lr:0.000266 t:14.8s +tttg: c186/282 lr:0.000261 t:14.8s +tttg: c187/282 lr:0.000256 t:14.9s +tttg: c188/282 lr:0.000252 t:15.0s +tttg: c189/282 lr:0.000247 t:15.1s +tttg: c190/282 lr:0.000242 t:15.2s +tttg: c191/282 lr:0.000237 t:15.2s +tttg: c192/282 lr:0.000232 t:15.3s +tttg: c193/282 lr:0.000228 t:15.4s +tttg: c194/282 lr:0.000223 t:15.5s +tttg: c195/282 lr:0.000218 t:15.6s +tttg: c196/282 lr:0.000214 t:15.6s +tttg: c197/282 lr:0.000209 t:15.7s +tttg: c198/282 lr:0.000205 t:15.8s +tttg: c199/282 lr:0.000200 t:15.9s +tttg: c200/282 lr:0.000196 t:16.0s +tttg: c201/282 lr:0.000191 t:16.0s +tttg: c202/282 lr:0.000187 t:16.1s +tttg: c203/282 lr:0.000183 t:16.2s +tttg: c204/282 lr:0.000178 t:16.3s +tttg: c205/282 lr:0.000174 t:16.4s +tttg: c206/282 lr:0.000170 t:16.4s +tttg: c207/282 lr:0.000166 t:16.5s +tttg: c208/282 lr:0.000162 t:16.6s +tttg: c209/282 lr:0.000157 t:16.7s +tttg: c210/282 lr:0.000153 t:16.8s +tttg: c211/282 lr:0.000149 t:16.8s +tttg: c212/282 lr:0.000145 t:16.9s +tttg: c213/282 lr:0.000142 t:17.0s +tttg: c214/282 lr:0.000138 t:17.1s +tttg: c215/282 lr:0.000134 t:17.2s +tttg: c216/282 lr:0.000130 t:17.2s +tttg: c217/282 lr:0.000126 t:17.3s +tttg: c218/282 lr:0.000123 t:17.4s +tttg: c219/282 lr:0.000119 t:17.5s +tttg: c220/282 lr:0.000115 t:17.6s +tttg: c221/282 lr:0.000112 t:17.6s +tttg: c222/282 lr:0.000108 t:17.7s +tttg: c223/282 lr:0.000105 t:17.8s +tttg: c224/282 lr:0.000101 t:17.9s +tttg: c225/282 lr:0.000098 t:18.0s +tttg: c226/282 lr:0.000095 t:18.0s +tttg: c227/282 lr:0.000092 t:18.1s +tttg: c228/282 lr:0.000088 t:18.2s +tttg: c229/282 lr:0.000085 t:18.3s +tttg: c230/282 lr:0.000082 t:18.4s +tttg: c231/282 lr:0.000079 t:18.4s +tttg: c232/282 lr:0.000076 t:18.5s +tttg: c233/282 lr:0.000073 t:18.6s +tttg: c234/282 lr:0.000070 t:18.7s +tttg: c235/282 lr:0.000067 t:18.8s +tttg: c236/282 lr:0.000065 t:18.8s +tttg: c237/282 lr:0.000062 t:18.9s +tttg: c238/282 lr:0.000059 t:19.0s +tttg: c239/282 lr:0.000057 t:19.1s +tttg: c240/282 lr:0.000054 t:19.2s +tttg: c241/282 lr:0.000052 t:19.2s +tttg: c242/282 lr:0.000049 t:19.3s +tttg: c243/282 lr:0.000047 t:19.4s +tttg: c244/282 lr:0.000044 t:19.5s +tttg: c245/282 lr:0.000042 t:19.6s +tttg: c246/282 lr:0.000040 t:19.6s +tttg: c247/282 lr:0.000038 t:19.7s +tttg: c248/282 
lr:0.000036 t:19.8s +tttg: c249/282 lr:0.000034 t:19.9s +tttg: c250/282 lr:0.000032 t:20.0s +tttg: c251/282 lr:0.000030 t:20.0s +tttg: c252/282 lr:0.000028 t:20.1s +tttg: c253/282 lr:0.000026 t:20.2s +tttg: c254/282 lr:0.000024 t:20.3s +tttg: c255/282 lr:0.000023 t:20.4s +tttg: c256/282 lr:0.000021 t:20.4s +tttg: c257/282 lr:0.000019 t:20.5s +tttg: c258/282 lr:0.000018 t:20.6s +tttg: c259/282 lr:0.000016 t:20.7s +tttg: c260/282 lr:0.000015 t:20.8s +tttg: c261/282 lr:0.000014 t:20.8s +tttg: c262/282 lr:0.000012 t:20.9s +tttg: c263/282 lr:0.000011 t:21.0s +tttg: c264/282 lr:0.000010 t:21.1s +tttg: c265/282 lr:0.000009 t:21.2s +tttg: c266/282 lr:0.000008 t:21.2s +tttg: c267/282 lr:0.000007 t:21.3s +tttg: c268/282 lr:0.000006 t:21.4s +tttg: c269/282 lr:0.000005 t:21.5s +tttg: c270/282 lr:0.000004 t:21.6s +tttg: c271/282 lr:0.000004 t:21.6s +tttg: c272/282 lr:0.000003 t:21.7s +tttg: c273/282 lr:0.000003 t:21.8s +tttg: c274/282 lr:0.000002 t:21.9s +tttg: c275/282 lr:0.000002 t:21.9s +tttg: c276/282 lr:0.000001 t:22.0s +tttg: c277/282 lr:0.000001 t:22.1s +tttg: c278/282 lr:0.000000 t:22.2s +tttg: c279/282 lr:0.000000 t:22.3s +tttg: c280/282 lr:0.000000 t:22.3s +tttg: c281/282 lr:0.000000 t:22.4s +ttpr: phase:3/3 t:421.1s +ttp: b732/782 bl:2.4514 bb:1.0529 rl:2.3787 rb:1.0764 dl:2354-2380 gd:1 +ttp: b724/782 bl:2.3507 bb:1.0286 rl:2.3768 rb:1.0732 dl:2151-2176 gd:1 +ttp: b713/782 bl:2.3780 bb:1.0346 rl:2.3769 rb:1.0710 dl:1953-1968 gd:1 +ttp: b711/782 bl:2.3198 bb:1.0415 rl:2.3740 rb:1.0694 dl:1919-1933 gd:1 +ttp: b699/782 bl:2.2928 bb:1.0352 rl:2.3703 rb:1.0679 dl:1768-1780 gd:1 +ttp: b690/782 bl:2.4643 bb:1.0663 rl:2.3741 rb:1.0678 dl:1672-1683 gd:1 +ttp: b681/782 bl:2.4192 bb:1.0588 rl:2.3758 rb:1.0675 dl:1587-1597 gd:1 +ttp: b678/782 bl:2.3681 bb:1.0432 rl:2.3756 rb:1.0666 dl:1562-1569 gd:1 +ttp: b668/782 bl:2.4271 bb:1.0721 rl:2.3773 rb:1.0668 dl:1486-1492 gd:1 +ttp: b657/782 bl:2.3754 bb:1.0216 rl:2.3772 rb:1.0654 dl:1410-1416 gd:1 +ttp: b650/782 bl:2.3999 bb:1.0431 rl:2.3778 rb:1.0647 dl:1365-1372 gd:1 +ttp: b645/782 bl:2.3527 bb:1.0329 rl:2.3772 rb:1.0638 dl:1333-1339 gd:1 +ttp: b639/782 bl:2.3235 bb:1.0282 rl:2.3758 rb:1.0629 dl:1299-1304 gd:1 +ttp: b627/782 bl:2.3907 bb:1.0514 rl:2.3761 rb:1.0626 dl:1234-1239 gd:1 +ttp: b620/782 bl:2.4031 bb:1.0343 rl:2.3768 rb:1.0620 dl:1197-1202 gd:1 +ttp: b610/782 bl:2.4280 bb:1.0480 rl:2.3778 rb:1.0617 dl:1149-1154 gd:1 +ttp: b601/782 bl:2.3546 bb:1.0348 rl:2.3774 rb:1.0611 dl:1110-1114 gd:1 +ttp: b593/782 bl:2.4033 bb:1.0591 rl:2.3779 rb:1.0611 dl:1074-1079 gd:1 +ttp: b585/782 bl:2.3778 bb:1.0352 rl:2.3779 rb:1.0606 dl:1043-1046 gd:1 +ttp: b581/782 bl:2.4282 bb:1.0262 rl:2.3787 rb:1.0600 dl:1026-1030 gd:1 +ttp: b572/782 bl:2.3855 bb:1.0534 rl:2.3789 rb:1.0598 dl:992-996 gd:1 +ttp: b564/782 bl:2.3267 bb:1.0353 rl:2.3780 rb:1.0595 dl:966-969 gd:1 +ttp: b556/782 bl:2.5461 bb:1.0814 rl:2.3806 rb:1.0598 dl:939-942 gd:1 +ttp: b547/782 bl:2.3403 bb:1.0654 rl:2.3800 rb:1.0599 dl:911-914 gd:1 +ttp: b537/782 bl:2.4417 bb:1.0678 rl:2.3809 rb:1.0600 dl:880-883 gd:1 +ttp: b529/782 bl:2.3955 bb:1.0252 rl:2.3811 rb:1.0595 dl:857-860 gd:1 +ttp: b525/782 bl:2.4102 bb:1.0217 rl:2.3814 rb:1.0590 dl:845-848 gd:1 +ttp: b517/782 bl:2.3934 bb:1.0453 rl:2.3816 rb:1.0588 dl:823-825 gd:1 +ttp: b506/782 bl:2.3622 bb:1.0354 rl:2.3814 rb:1.0585 dl:792-795 gd:1 +ttp: b499/782 bl:2.5385 bb:1.0889 rl:2.3832 rb:1.0589 dl:775-777 gd:1 +ttp: b491/782 bl:2.3928 bb:1.0279 rl:2.3833 rb:1.0586 dl:755-757 gd:1 +ttp: b483/782 bl:2.3993 bb:1.0244 rl:2.3834 rb:1.0582 dl:736-739 gd:1 +ttp: 
b475/782 bl:2.4723 bb:1.0797 rl:2.3844 rb:1.0584 dl:717-719 gd:1 +ttp: b467/782 bl:2.3926 bb:1.0665 rl:2.3844 rb:1.0585 dl:699-701 gd:1 +ttp: b459/782 bl:2.3602 bb:1.0703 rl:2.3842 rb:1.0586 dl:682-684 gd:1 +ttp: b451/782 bl:2.4006 bb:1.0194 rl:2.3844 rb:1.0582 dl:666-668 gd:1 +ttp: b444/782 bl:2.4099 bb:1.0688 rl:2.3846 rb:1.0583 dl:651-653 gd:1 +ttp: b436/782 bl:2.3117 bb:0.9986 rl:2.3839 rb:1.0578 dl:636-638 gd:1 +ttp: b428/782 bl:2.3886 bb:1.0572 rl:2.3840 rb:1.0578 dl:621-623 gd:1 +ttp: b422/782 bl:2.3820 bb:1.0482 rl:2.3840 rb:1.0577 dl:608-610 gd:1 +ttp: b413/782 bl:2.3398 bb:1.0394 rl:2.3836 rb:1.0575 dl:592-594 gd:1 +ttp: b405/782 bl:2.4329 bb:1.0365 rl:2.3840 rb:1.0574 dl:577-579 gd:1 +ttp: b394/782 bl:2.3841 bb:1.0600 rl:2.3840 rb:1.0574 dl:557-559 gd:1 +ttp: b386/782 bl:2.4511 bb:1.1010 rl:2.3845 rb:1.0577 dl:544-546 gd:1 +ttp: b378/782 bl:2.3900 bb:1.0601 rl:2.3845 rb:1.0577 dl:530-532 gd:1 +ttp: b371/782 bl:2.4087 bb:1.0537 rl:2.3847 rb:1.0577 dl:519-521 gd:1 +ttp: b362/782 bl:2.3720 bb:1.0410 rl:2.3846 rb:1.0576 dl:504-506 gd:1 +ttp: b354/782 bl:2.3733 bb:1.0731 rl:2.3845 rb:1.0577 dl:491-492 gd:1 +ttp: b346/782 bl:2.4690 bb:1.1072 rl:2.3850 rb:1.0580 dl:479-480 gd:1 +ttp: b338/782 bl:2.5181 bb:1.1210 rl:2.3858 rb:1.0583 dl:467-468 gd:1 +ttp: b331/782 bl:2.3696 bb:1.0700 rl:2.3857 rb:1.0584 dl:457-458 gd:1 +ttp: b323/782 bl:2.3508 bb:1.0508 rl:2.3855 rb:1.0584 dl:445-447 gd:1 +ttp: b315/782 bl:2.3463 bb:1.0755 rl:2.3853 rb:1.0585 dl:433-434 gd:1 +ttp: b305/782 bl:2.4007 bb:1.0778 rl:2.3854 rb:1.0585 dl:418-420 gd:1 +ttp: b297/782 bl:2.4726 bb:1.0897 rl:2.3858 rb:1.0587 dl:407-408 gd:1 +ttp: b289/782 bl:2.3439 bb:1.0787 rl:2.3856 rb:1.0588 dl:395-397 gd:1 +ttp: b281/782 bl:2.4589 bb:1.0988 rl:2.3860 rb:1.0590 dl:384-385 gd:1 +ttp: b277/782 bl:2.4022 bb:1.0917 rl:2.3861 rb:1.0591 dl:379-380 gd:1 +ttp: b267/782 bl:2.5330 bb:1.1264 rl:2.3867 rb:1.0594 dl:366-367 gd:1 +ttp: b258/782 bl:2.4502 bb:1.1093 rl:2.3870 rb:1.0596 dl:355-356 gd:1 +ttp: b250/782 bl:2.3855 bb:1.1206 rl:2.3870 rb:1.0599 dl:345-346 gd:1 +ttp: b242/782 bl:2.4265 bb:1.0784 rl:2.3871 rb:1.0599 dl:335-337 gd:1 +ttp: b236/782 bl:2.3527 bb:1.1103 rl:2.3870 rb:1.0601 dl:328-329 gd:1 +ttp: b227/782 bl:2.4754 bb:1.1270 rl:2.3873 rb:1.0604 dl:318-319 gd:1 +ttp: b218/782 bl:2.4826 bb:1.1271 rl:2.3877 rb:1.0606 dl:308-309 gd:1 +ttp: b210/782 bl:2.3541 bb:1.0816 rl:2.3875 rb:1.0607 dl:299-300 gd:1 +ttp: b202/782 bl:2.4900 bb:1.1527 rl:2.3879 rb:1.0610 dl:291-292 gd:1 +ttp: b194/782 bl:2.4540 bb:1.1219 rl:2.3881 rb:1.0612 dl:282-283 gd:1 +ttp: b186/782 bl:2.3360 bb:1.0369 rl:2.3879 rb:1.0611 dl:274-275 gd:1 +ttp: b178/782 bl:2.5251 bb:1.1926 rl:2.3884 rb:1.0615 dl:266-267 gd:1 +ttp: b170/782 bl:2.5488 bb:1.1350 rl:2.3888 rb:1.0617 dl:258-259 gd:1 +ttp: b161/782 bl:2.4216 bb:1.0803 rl:2.3889 rb:1.0617 dl:250-251 gd:1 +ttp: b154/782 bl:2.4917 bb:1.0946 rl:2.3892 rb:1.0618 dl:243-244 gd:1 +ttp: b146/782 bl:2.4757 bb:1.1234 rl:2.3894 rb:1.0620 dl:235-236 gd:1 +ttp: b138/782 bl:2.4260 bb:1.1294 rl:2.3895 rb:1.0622 dl:228-229 gd:1 +ttp: b130/782 bl:2.4487 bb:1.1795 rl:2.3897 rb:1.0624 dl:221-222 gd:1 +ttp: b122/782 bl:2.4558 bb:1.1259 rl:2.3899 rb:1.0626 dl:213-214 gd:1 +ttp: b114/782 bl:2.5420 bb:1.1234 rl:2.3902 rb:1.0627 dl:206-207 gd:1 +ttp: b106/782 bl:2.4728 bb:1.1242 rl:2.3904 rb:1.0629 dl:199-200 gd:1 +ttp: b98/782 bl:2.5303 bb:1.2210 rl:2.3907 rb:1.0632 dl:192-193 gd:1 +ttp: b89/782 bl:2.5104 bb:1.1310 rl:2.3909 rb:1.0633 dl:184-185 gd:1 +ttp: b81/782 bl:2.6527 bb:1.1883 rl:2.3915 rb:1.0636 dl:177-178 gd:1 +ttp: b73/782 
bl:2.5837 bb:1.2437 rl:2.3918 rb:1.0639 dl:170-171 gd:1 +ttp: b65/782 bl:2.5421 bb:1.1914 rl:2.3921 rb:1.0641 dl:164-165 gd:1 +ttp: b57/782 bl:2.5817 bb:1.1838 rl:2.3924 rb:1.0643 dl:156-157 gd:1 +ttp: b49/782 bl:2.6352 bb:1.1796 rl:2.3928 rb:1.0645 dl:149-150 gd:1 +ttp: b41/782 bl:2.5380 bb:1.2043 rl:2.3931 rb:1.0647 dl:141-142 gd:1 +ttp: b33/782 bl:2.6639 bb:1.2621 rl:2.3935 rb:1.0650 dl:133-134 gd:1 +ttp: b25/782 bl:2.5877 bb:1.1835 rl:2.3937 rb:1.0651 dl:125-126 gd:1 +ttp: b17/782 bl:2.7951 bb:1.2866 rl:2.3942 rb:1.0654 dl:115-117 gd:1 +ttp: b9/782 bl:2.7755 bb:1.2276 rl:2.3947 rb:1.0656 dl:103-105 gd:1 +quantized_ttt_phased val_loss:2.38419604 val_bpb:1.06299064 eval_time:510971ms +total_eval_time:511.0s diff --git a/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed42.log b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed42.log new file mode 100644 index 0000000000..c034c61782 --- /dev/null +++ b/records/track_10min_16mb/2026-05-01_Mockingbird_8xH100/train_seed42.log @@ -0,0 +1,4800 @@ +==================================================================================================== +Hyperparameters: + adam_eps: 1e-08 + adam_wd: 0.02 + artifact_dir: logs + attn_clip_sigmas: 13.0 + attn_out_gate_enabled: False + attn_out_gate_src: proj + beta1: 0.9 + beta2: 0.99 + build_seconds: 600 + caseops_enabled: True + compressor: pergroup + data_dir: /workspace/SOTA_FINAL/data + datasets_dir: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved + distributed: True + ema_decay: 0.9965 + embed_bits: 7 + embed_clip_sigmas: 14.0 + embed_lr: 0.6 + embed_wd: 0.085 + enable_looping_at: 0.45 + eval_seconds: 600 + eval_seq_len: 2048 + eval_stride: 64 + fused_ce_enabled: True + gate_window: 12 + gated_attn_enabled: False + gated_attn_init_std: 0.01 + gated_attn_quant_gate: True + global_ttt_batch_seqs: 32 + global_ttt_chunk_tokens: 32768 + global_ttt_epochs: 1 + global_ttt_grad_clip: 1.0 + global_ttt_lr: 0.001 + global_ttt_momentum: 0.9 + global_ttt_respect_doc_boundaries: True + global_ttt_warmup_chunks: 0 + global_ttt_warmup_start_lr: 0.0 + gptq_calibration_batches: 16 + gptq_reserve_seconds: 0.5 + grad_accum_steps: 1 + grad_clip_norm: 0.3 + hypothesis: Promote the best legal SP10240 CaseOps Side4 mechanics candidate to a clean standard 8x run: 11L MLP3.75 with loop2 enabled at 0.45, keeping PR1855 LQER/pergroup/phased-TTT compression/eval machinery fixed. 
+ is_main_process: True + iterations: 20000 + ln_scale: True + local_rank: 0 + logfile: logs/pr1855_sp10240_caseops_mlp375_late045_8x_seed42.txt + logit_softcap: 30.0 + loop_end: 5 + loop_start: 3 + lqer_asym_enabled: True + lqer_asym_group: 64 + lqer_enabled: True + lqer_factor_bits: 4 + lqer_rank: 4 + lqer_top_k: 3 + matrix_bits: 6 + matrix_clip_sigmas: 12.85 + matrix_lr: 0.026 + max_wallclock_seconds: 600.0 + min_lr: 0.1 + mlp_clip_sigmas: 11.5 + mlp_mult: 3.75 + model_dim: 512 + model_path: logs/final_model.pt + muon_backend_steps: 5 + muon_momentum: 0.97 + muon_momentum_warmup_start: 0.92 + muon_momentum_warmup_steps: 1500 + muon_row_normalize: True + muon_wd: 0.095 + num_heads: 8 + num_kv_heads: 4 + num_layers: 11 + num_loops: 2 + parallel_final_lane: mean + parallel_start_layer: 8 + parent_run: 2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x + phased_ttt_num_phases: 3 + phased_ttt_prefix_docs: 2500 + qk_gain_init: 5.25 + quantized_model_path: logs/final_model.int6.ptz + rank: 0 + rope_base: 10000.0 + rope_dims: 16 + rope_train_seq_len: 2048 + rope_yarn: False + run_id: pr1855_sp10240_caseops_mlp375_late045_8x_seed42 + run_kind: new_experiment + run_label: standard_8x + scalar_lr: 0.02 + seed: 42 + size_cap_bytes: 16000000 + skip_gates_enabled: True + smear_gate_enabled: True + source_parent: legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py + source_parent_sha256: 454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b + source_tokenizer_lane: scripts/prepare_sp10240_caseops_data.py + sparse_attn_gate_enabled: True + sparse_attn_gate_init_std: 0.0 + sparse_attn_gate_scale: 0.5 + test_date: 2026-04-30 + test_id: 2026-04-30_pr1855_sp10240_caseops_mlp375_late045_8x + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/tokenizers/fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.99 + ttt_chunk_size: 48 + ttt_enabled: True + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 80 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 0.5 + val_batch_tokens: 524288 + val_bytes_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: /workspace/SOTA_FINAL/data/datasets/fineweb10B_sp10240_caseops/datasets/datasets/fineweb10B_sp10240_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 0 + vocab_size: 10240 + warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +==================================================================================================== +Source code: +==================================================================================================== +import base64, collections, copy, fcntl, glob, io, lzma, math, os +from pathlib import Path +import random, re, subprocess, sys, time, uuid, numpy as np, sentencepiece as spm, torch, torch.distributed as dist, torch.nn.functional as F +from torch import Tensor, nn +from flash_attn_interface import ( + flash_attn_func as 
flash_attn_3_func, + flash_attn_varlen_func, +) +from concurrent.futures import ThreadPoolExecutor +import triton +import triton.language as tl +from triton.tools.tensor_descriptor import TensorDescriptor + + +# ===== Fused softcapped cross-entropy (Triton) — training-only path ===== +# Replaces the eager +# logits_softcap = softcap * tanh(logits / softcap) +# F.cross_entropy(logits_softcap.float(), targets, reduction="mean") +# sequence with a single fused kernel that reads logits_proj once, applies +# softcap in-register, and computes (LSE, loss) in one streaming pass. The +# backward kernel mirrors the forward so there's no stored softcapped logits. +# Numerically identical to the eager path up to fp32 accumulation differences. +_FUSED_CE_LIBRARY = "pgsubmission1draft7fusedce" +_FUSED_CE_BLOCK_SIZE = 1024 +_FUSED_CE_NUM_WARPS = 4 + +
+@triton.jit +def _softcapped_ce_fwd_kernel( + logits_ptr, losses_ptr, lse_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + max_val = -float("inf") + sum_exp = 0.0 + A = 2.0 * softcap + inv_C = 2.0 / softcap
+ # Identity (why no "- softcap" term appears): softcap * tanh(x / softcap)
+ # == A * sigmoid(x * inv_C) - softcap, and the constant shift is the same
+ # for every logit in a row, so it cancels in lse - target_z and in the
+ # backward softmax; we compute the unshifted z throughout.
+ for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=-float("inf"), + ).to(tl.float32) + z = A * tl.sigmoid(val * inv_C) + z = tl.where(mask, z, -float("inf")) + curr_max = tl.max(z, axis=0) + new_max = tl.maximum(max_val, curr_max) + sum_exp = sum_exp * tl.exp(max_val - new_max) + tl.sum(tl.exp(z - new_max), axis=0) + max_val = new_max + lse = max_val + tl.log(sum_exp) + tl.store(lse_ptr + row_idx, lse) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + target_val = tl.load(logits_row_ptr + target * stride_logits_v).to(tl.float32) + target_z = A * tl.sigmoid(target_val * inv_C) + tl.store(losses_ptr + row_idx, lse - target_z) + +
+@triton.jit +def _softcapped_ce_bwd_kernel( + grad_logits_ptr, grad_losses_ptr, lse_ptr, logits_ptr, targets_ptr, + stride_logits_n, stride_logits_v, + stride_grad_n, stride_grad_v, + n_rows, n_cols, softcap, + block_size: tl.constexpr, +): + row_idx = tl.program_id(0).to(tl.int64) + logits_row_ptr = logits_ptr + row_idx * stride_logits_n + grad_row_ptr = grad_logits_ptr + row_idx * stride_grad_n + lse = tl.load(lse_ptr + row_idx) + grad_loss = tl.load(grad_losses_ptr + row_idx).to(tl.float32) + target = tl.load(targets_ptr + row_idx).to(tl.int32) + A = 2.0 * softcap + inv_C = 2.0 / softcap + dz_dx_scale = A * inv_C
+ for off in range(0, n_cols, block_size): + cols = off + tl.arange(0, block_size) + mask = cols < n_cols + val = tl.load( + logits_row_ptr + cols * stride_logits_v, + mask=mask, other=0.0, + ).to(tl.float32) + sigmoid_u = tl.sigmoid(val * inv_C) + z = A * sigmoid_u + probs = tl.exp(z - lse) + grad_z = grad_loss * (probs - tl.where(cols == target, 1.0, 0.0)) + grad_x = grad_z * (dz_dx_scale * sigmoid_u * (1.0 - sigmoid_u)) + tl.store(grad_row_ptr + cols * stride_grad_v, grad_x, mask=mask) + +
+def _validate_softcapped_ce_inputs( + logits: Tensor, targets: Tensor, softcap: float, +) -> tuple[Tensor, Tensor]: + if logits.ndim != 2: + raise ValueError(f"Expected logits.ndim=2, got {logits.ndim}") + if targets.ndim != 1: + raise ValueError(f"Expected targets.ndim=1, got {targets.ndim}") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} 
targets={tuple(targets.shape)}" + ) + if not logits.is_cuda or not targets.is_cuda: + raise ValueError("softcapped_cross_entropy requires CUDA tensors") + if softcap <= 0.0: + raise ValueError(f"softcap must be positive, got {softcap}") + if logits.dtype not in (torch.float16, torch.bfloat16, torch.float32): + raise ValueError(f"Unsupported logits dtype: {logits.dtype}") + logits = logits.contiguous() + targets = targets.contiguous() + if targets.dtype != torch.int64: + targets = targets.to(dtype=torch.int64) + return logits, targets + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce", mutates_args=()) +def softcapped_ce_op(logits: Tensor, targets: Tensor, softcap: float) -> tuple[Tensor, Tensor]: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + n_rows, n_cols = logits.shape + losses = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + lse = torch.empty((n_rows,), device=logits.device, dtype=torch.float32) + _softcapped_ce_fwd_kernel[(n_rows,)]( + logits, losses, lse, targets, + logits.stride(0), logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return losses, lse + + +@softcapped_ce_op.register_fake +def _(logits: Tensor, targets: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1: + raise ValueError("softcapped_ce fake impl expects 2D logits and 1D targets") + if logits.shape[0] != targets.shape[0]: + raise ValueError( + f"Expected matching rows, got logits={tuple(logits.shape)} targets={tuple(targets.shape)}" + ) + n_rows = logits.shape[0] + return ( + logits.new_empty((n_rows,), dtype=torch.float32), + logits.new_empty((n_rows,), dtype=torch.float32), + ) + + +@torch.library.custom_op(f"{_FUSED_CE_LIBRARY}::softcapped_ce_backward", mutates_args=()) +def softcapped_ce_backward_op( + logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float, +) -> Tensor: + logits, targets = _validate_softcapped_ce_inputs(logits, targets, float(softcap)) + lse = lse.contiguous() + grad_losses = grad_losses.contiguous().to(dtype=torch.float32) + if lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("Expected 1D lse and grad_losses") + if lse.shape[0] != logits.shape[0] or grad_losses.shape[0] != logits.shape[0]: + raise ValueError( + f"Expected row-aligned lse/grad_losses, got logits={tuple(logits.shape)} " + f"lse={tuple(lse.shape)} grad_losses={tuple(grad_losses.shape)}" + ) + grad_logits = torch.empty_like(logits) + n_rows, n_cols = logits.shape + _softcapped_ce_bwd_kernel[(n_rows,)]( + grad_logits, grad_losses, lse, logits, targets, + logits.stride(0), logits.stride(1), + grad_logits.stride(0), grad_logits.stride(1), + n_rows, n_cols, float(softcap), + block_size=_FUSED_CE_BLOCK_SIZE, num_warps=_FUSED_CE_NUM_WARPS, + ) + return grad_logits + + +@softcapped_ce_backward_op.register_fake +def _(logits: Tensor, targets: Tensor, lse: Tensor, grad_losses: Tensor, softcap: float): + if logits.ndim != 2 or targets.ndim != 1 or lse.ndim != 1 or grad_losses.ndim != 1: + raise ValueError("softcapped_ce_backward fake impl expects 2D logits and 1D row tensors") + if ( + logits.shape[0] != targets.shape[0] + or logits.shape[0] != lse.shape[0] + or logits.shape[0] != grad_losses.shape[0] + ): + raise ValueError("softcapped_ce_backward fake impl expects row-aligned tensors") + return logits.new_empty(logits.shape) + + +def _softcapped_ce_setup_context( + ctx: torch.autograd.function.FunctionCtx, inputs, output, +) -> None: + 
logits, targets, softcap = inputs + _losses, lse = output + ctx.save_for_backward(logits, targets, lse) + ctx.softcap = float(softcap) + + +def _softcapped_ce_backward( + ctx: torch.autograd.function.FunctionCtx, grad_losses: Tensor, grad_lse: "Tensor | None", +): + del grad_lse + logits, targets, lse = ctx.saved_tensors + grad_logits = torch.ops.pgsubmission1draft7fusedce.softcapped_ce_backward( + logits, targets, lse, grad_losses, ctx.softcap + ) + return grad_logits, None, None + + +softcapped_ce_op.register_autograd( + _softcapped_ce_backward, setup_context=_softcapped_ce_setup_context, +) + + +def softcapped_cross_entropy( + logits: Tensor, targets: Tensor, softcap: float, reduction: str = "mean", +) -> Tensor: + losses, _lse = torch.ops.pgsubmission1draft7fusedce.softcapped_ce( + logits, targets, float(softcap) + ) + if reduction == "none": + return losses + if reduction == "sum": + return losses.sum() + if reduction == "mean": + return losses.mean() + raise ValueError(f"Unsupported reduction={reduction!r}") + + +class Hyperparameters: + data_dir = os.environ.get("DATA_DIR", "./data/") + seed = int(os.environ.get("SEED", 1337)) + run_id = os.environ.get("RUN_ID", str(uuid.uuid4())) + iterations = int(os.environ.get("ITERATIONS", 20000)) + warmdown_frac = float(os.environ.get("WARMDOWN_FRAC", 0.75)) + warmup_steps = int(os.environ.get("WARMUP_STEPS", 20)) + train_batch_tokens = int(os.environ.get("TRAIN_BATCH_TOKENS", 786432)) + # Fused softcapped CE (Triton). Training-only — forward_logits eval path still uses + # eager softcap+F.cross_entropy. Default ON since validated as at-worst neutral. + fused_ce_enabled = bool(int(os.environ.get("FUSED_CE_ENABLED", "1"))) + train_seq_len = int(os.environ.get("TRAIN_SEQ_LEN", 2048)) + train_log_every = int(os.environ.get("TRAIN_LOG_EVERY", 500)) + max_wallclock_seconds = float(os.environ.get("MAX_WALLCLOCK_SECONDS", 6e2)) + val_batch_tokens = int(os.environ.get("VAL_BATCH_TOKENS", 524288)) + eval_seq_len = int(os.environ.get("EVAL_SEQ_LEN", 2048)) + val_loss_every = int(os.environ.get("VAL_LOSS_EVERY", 4000)) + vocab_size = int(os.environ.get("VOCAB_SIZE", 8192)) + num_layers = int(os.environ.get("NUM_LAYERS", 11)) + xsa_last_n = int(os.environ.get("XSA_LAST_N", 11)) + model_dim = int(os.environ.get("MODEL_DIM", 512)) + num_kv_heads = int(os.environ.get("NUM_KV_HEADS", 4)) + num_heads = int(os.environ.get("NUM_HEADS", 8)) + mlp_mult = float(os.environ.get("MLP_MULT", 4.0)) + skip_gates_enabled = bool(int(os.environ.get("SKIP_GATES_ENABLED", "1"))) + tie_embeddings = bool(int(os.environ.get("TIE_EMBEDDINGS", "1"))) + logit_softcap = float(os.environ.get("LOGIT_SOFTCAP", 3e1)) + rope_base = float(os.environ.get("ROPE_BASE", 1e4)) + rope_dims = int(os.environ.get("ROPE_DIMS", 16)) + rope_train_seq_len = int(os.environ.get("ROPE_TRAIN_SEQ_LEN", 2048)) + rope_yarn = bool(int(os.environ.get("ROPE_YARN", "0"))) + ln_scale = bool(int(os.environ.get("LN_SCALE", "1"))) + qk_gain_init = float(os.environ.get("QK_GAIN_INIT", 5.0)) + num_loops = int(os.environ.get("NUM_LOOPS", 2)) + loop_start = int(os.environ.get("LOOP_START", 3)) + loop_end = int(os.environ.get("LOOP_END", 5)) + enable_looping_at = float(os.environ.get("ENABLE_LOOPING_AT", 0.35)) + parallel_start_layer = int(os.environ.get("PARALLEL_START_LAYER", 8)) + parallel_final_lane = os.environ.get("PARALLEL_FINAL_LANE", "mean") + min_lr = float(os.environ.get("MIN_LR", 0.0)) + embed_lr = float(os.environ.get("EMBED_LR", 0.6)) + tied_embed_lr = float(os.environ.get("TIED_EMBED_LR", 0.03)) + 
tied_embed_init_std = float(os.environ.get("TIED_EMBED_INIT_STD", 0.005)) + matrix_lr = float(os.environ.get("MATRIX_LR", 0.026)) + scalar_lr = float(os.environ.get("SCALAR_LR", 0.02)) + muon_momentum = float(os.environ.get("MUON_MOMENTUM", 0.97)) + muon_backend_steps = int(os.environ.get("MUON_BACKEND_STEPS", 5)) + muon_momentum_warmup_start = float( + os.environ.get("MUON_MOMENTUM_WARMUP_START", 0.92) + ) + muon_momentum_warmup_steps = int(os.environ.get("MUON_MOMENTUM_WARMUP_STEPS", 1500)) + muon_row_normalize = bool(int(os.environ.get("MUON_ROW_NORMALIZE", "1"))) + beta1 = float(os.environ.get("BETA1", 0.9)) + beta2 = float(os.environ.get("BETA2", 0.95)) + adam_eps = float(os.environ.get("ADAM_EPS", 1e-08)) + grad_clip_norm = float(os.environ.get("GRAD_CLIP_NORM", 0.3)) + eval_stride = int(os.environ.get("EVAL_STRIDE", 64)) + adam_wd = float(os.environ.get("ADAM_WD", 0.02)) + muon_wd = float(os.environ.get("MUON_WD", 0.095)) + embed_wd = float(os.environ.get("EMBED_WD", 0.085)) + ema_decay = float(os.environ.get("EMA_DECAY", 0.9965)) + ttt_enabled = bool(int(os.environ.get("TTT_ENABLED", "1"))) + ttt_lora_rank = int(os.environ.get("TTT_LORA_RANK", 96)) + ttt_lora_lr = float(os.environ.get("TTT_LORA_LR", 0.0001)) + ttt_chunk_size = int(os.environ.get("TTT_CHUNK_SIZE", 48)) + ttt_eval_seq_len = int(os.environ.get("TTT_EVAL_SEQ_LEN", 2048)) + ttt_batch_size = int(os.environ.get("TTT_BATCH_SIZE", 64)) + ttt_grad_steps = int(os.environ.get("TTT_GRAD_STEPS", 1)) + ttt_weight_decay = float(os.environ.get("TTT_WEIGHT_DECAY", 1.0)) + ttt_beta1 = float(os.environ.get("TTT_BETA1", 0)) + ttt_beta2 = float(os.environ.get("TTT_BETA2", 0.999)) + ttt_k_lora = bool(int(os.environ.get("TTT_K_LORA", "1"))) + ttt_mlp_lora = bool(int(os.environ.get("TTT_MLP_LORA", "1"))) + ttt_o_lora = bool(int(os.environ.get("TTT_O_LORA", "1"))) + ttt_optimizer = os.environ.get("TTT_OPTIMIZER", "adam") + ttt_eval_batches = os.environ.get("TTT_EVAL_BATCHES", "") + val_doc_fraction = float(os.environ.get("VAL_DOC_FRACTION", 1.0)) + compressor = os.environ.get("COMPRESSOR", "brotli") + gptq_calibration_batches = int(os.environ.get("GPTQ_CALIBRATION_BATCHES", 16)) + gptq_reserve_seconds = float(os.environ.get("GPTQ_RESERVE_SECONDS", 4.0)) + phased_ttt_prefix_docs = int(os.environ.get("PHASED_TTT_PREFIX_DOCS", 2000)) + phased_ttt_num_phases = int(os.environ.get("PHASED_TTT_NUM_PHASES", 1)) + global_ttt_lr = float(os.environ.get("GLOBAL_TTT_LR", 0.001)) + global_ttt_momentum = float(os.environ.get("GLOBAL_TTT_MOMENTUM", 0.9)) + global_ttt_epochs = int(os.environ.get("GLOBAL_TTT_EPOCHS", 1)) + global_ttt_chunk_tokens = int(os.environ.get("GLOBAL_TTT_CHUNK_TOKENS", 32768)) + global_ttt_batch_seqs = int(os.environ.get("GLOBAL_TTT_BATCH_SEQS", 32)) + global_ttt_warmup_start_lr = float(os.environ.get("GLOBAL_TTT_WARMUP_START_LR", 0.0)) + global_ttt_warmup_chunks = int(os.environ.get("GLOBAL_TTT_WARMUP_CHUNKS", 0)) + global_ttt_grad_clip = float(os.environ.get("GLOBAL_TTT_GRAD_CLIP", 1.0)) + global_ttt_respect_doc_boundaries = bool(int(os.environ.get("GLOBAL_TTT_RESPECT_DOC_BOUNDARIES", "1"))) + matrix_bits = int(os.environ.get("MATRIX_BITS", 6)) + embed_bits = int(os.environ.get("EMBED_BITS", 8)) + matrix_clip_sigmas = float(os.environ.get("MATRIX_CLIP_SIGMAS", 12.85)) + embed_clip_sigmas = float(os.environ.get("EMBED_CLIP_SIGMAS", 2e1)) + mlp_clip_sigmas = float(os.environ.get("MLP_CLIP_SIGMAS", 10.0)) + attn_clip_sigmas = float(os.environ.get("ATTN_CLIP_SIGMAS", 13.0)) + # AttnOutGate (per-head multiplicative output gate, PR #1667 
MarioPaerle).
+    # Zero-init weight: 2*sigmoid(0)=1 -> transparent at start. Source defaults to
+    # block input x ('proj'); 'q' uses the raw Q projection output.
+    attn_out_gate_enabled = bool(int(os.environ.get("ATTN_OUT_GATE_ENABLED", "0")))
+    attn_out_gate_src = os.environ.get("ATTN_OUT_GATE_SRC", "proj")
+    # SmearGate (input-dependent forward-1 token smear, modded-nanogpt @classiclarryd
+    # via PR #1667). x_t <- x_t + lam * sigmoid(W*x_t[:gate_window]) * x_{t-1}.
+    # lam=0 + W=0 -> transparent at init.
+    smear_gate_enabled = bool(int(os.environ.get("SMEAR_GATE_ENABLED", "0")))
+    # Window: first GATE_WINDOW dims of the source feed the gate projection.
+    gate_window = int(os.environ.get("GATE_WINDOW", 12))
+    # Gated Attention (Qwen, NeurIPS 2025 Best Paper, arXiv:2505.06708;
+    # qiuzh20/gated_attention). Per-head sigmoid gate on SDPA output, BEFORE
+    # out_proj. Gate input = full block input x (paper's headwise G1 variant
+    # driven from hidden_states). W_g shape (num_heads, dim), plain sigmoid.
+    # Near-zero init gives g~0.5 at step 0 (half attention output); per-block
+    # attn_scale (init 1.0) compensates during training. Name contains
+    # "attn_gate" so CONTROL_TENSOR_NAME_PATTERNS routes it to scalar AdamW.
+    gated_attn_enabled = bool(int(os.environ.get("GATED_ATTN_ENABLED", "0")))
+    gated_attn_init_std = float(os.environ.get("GATED_ATTN_INIT_STD", 0.01))
+    # Dedicated int8-per-row quantization for `attn_gate_w` tensors. These are
+    # small ((num_heads, dim) = (8, 512) = 4096 params) and bypass GPTQ via the
+    # numel<=65536 passthrough branch -> stored as fp16 (8 KB/layer, ~65 KB total
+    # compressed). int8-per-row cuts the raw tensor in half with negligible BPB
+    # impact: scales per head (8 values), symmetric quant over [-127, 127].
+    # No Hessian needed (gate weights not in collect_hessians()).
+    gated_attn_quant_gate = bool(int(os.environ.get("GATED_ATTN_QUANT_GATE", "0")))
+    # Sparse Attention Gate (modded-nanogpt-style). Keeps dense SDPA and only
+    # narrows the output-gate input to the first GATE_WINDOW residual dims.
+    # W_g: (num_heads, gate_window) = (8, 12) = 96 params/layer (~1K total),
+    # vs dense GatedAttn's (8, 512) = 4K/layer (~44K total), i.e. ~43K saved.
+    # Name "attn_gate_w" is shared, so the quant routing and the int8 gate
+    # passthrough (GATED_ATTN_QUANT_GATE=1) apply unchanged.
+    # Mutually exclusive with ATTN_OUT_GATE_ENABLED and GATED_ATTN_ENABLED.
+    sparse_attn_gate_enabled = bool(int(os.environ.get("SPARSE_ATTN_GATE_ENABLED", "0")))
+    sparse_attn_gate_init_std = float(os.environ.get("SPARSE_ATTN_GATE_INIT_STD", 0.0))
+    sparse_attn_gate_scale = float(os.environ.get("SPARSE_ATTN_GATE_SCALE", 1.0))
+    # LQER asymmetric rank-k correction on top-K quant-error tensors (PR #1530 v2 port).
+    # Computes SVD of E = W_fp - W_quant, packs top-r A,B as INT2/INT4 (asym) or INTk (sym). 
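+    # Illustrative sketch only (the names below are not the real checkpoint
+    # fields): with r = lqer_rank, one standard way to split the error SVD is
+    #   U, S, Vt = torch.linalg.svd(W_fp - W_quant)
+    #   A = U[:, :r] * S[:r].sqrt();  B = S[:r].sqrt()[:, None] * Vt[:r]
+    # so dequant reconstructs W ~= W_quant + A @ B, with A/B packed at
+    # lqer_factor_bits (per-group asymmetric when lqer_asym_enabled).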
+ lqer_enabled = bool(int(os.environ.get("LQER_ENABLED", "1"))) + lqer_rank = int(os.environ.get("LQER_RANK", 4)) + lqer_top_k = int(os.environ.get("LQER_TOP_K", 3)) + lqer_factor_bits = int(os.environ.get("LQER_FACTOR_BITS", 4)) + lqer_asym_enabled = bool(int(os.environ.get("LQER_ASYM_ENABLED", "1"))) + lqer_asym_group = int(os.environ.get("LQER_ASYM_GROUP", "64")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + rank = int(os.environ.get("RANK", "0")) + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + is_main_process = rank == 0 + grad_accum_steps = 8 // world_size + # CaseOps integration: optional override of dataset root + tokenizer path. + # When CASEOPS_ENABLED=1, the wrapper loads a per-token byte sidecar + # (fineweb_val_bytes_*.bin, identical shard layout to val_*.bin) and uses + # it as the canonical raw-byte budget for BPB accounting. The sidecar + # REPLACES the build_sentencepiece_luts byte-counting path entirely. + caseops_enabled = bool(int(os.environ.get("CASEOPS_ENABLED", "0"))) + _default_caseops_data = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "datasets", + "fineweb10B_sp8192_lossless_caps_caseops_v1_reserved", + ) + _default_caseops_tok = os.path.join( + data_dir, + "datasets", + "fineweb10B_sp8192_caseops", + "datasets", + "tokenizers", + "fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model", + ) + if caseops_enabled: + datasets_dir = os.environ.get("DATA_PATH", _default_caseops_data) + tokenizer_path = os.environ.get("TOKENIZER_PATH", _default_caseops_tok) + else: + datasets_dir = os.environ.get( + "DATA_PATH", + os.path.join(data_dir, "datasets", f"fineweb10B_sp{vocab_size}"), + ) + tokenizer_path = os.environ.get( + "TOKENIZER_PATH", + os.path.join(data_dir, "tokenizers", f"fineweb_{vocab_size}_bpe.model"), + ) + train_files = os.path.join(datasets_dir, "fineweb_train_*.bin") + val_files = os.path.join(datasets_dir, "fineweb_val_*.bin") + val_bytes_files = os.path.join(datasets_dir, "fineweb_val_bytes_*.bin") + artifact_dir = os.environ.get("ARTIFACT_DIR", "") + logfile = ( + os.path.join(artifact_dir, f"{run_id}.txt") + if artifact_dir + else f"logs/{run_id}.txt" + ) + model_path = ( + os.path.join(artifact_dir, "final_model.pt") + if artifact_dir + else "final_model.pt" + ) + quantized_model_path = ( + os.path.join(artifact_dir, "final_model.int6.ptz") + if artifact_dir + else "final_model.int6.ptz" + ) + + +# ===== 2026-04-30 SP10240 CaseOps MLP3.75 late045 promoted test car ===== +# Source of truth for this new experiment. The launcher only checks files and +# calls this run.py; it does not define model or eval conditions. +TEST_ID = "2026-04-30_pr1855_sp10240_caseops_mlp375_late045_8x" +TEST_DATE = "2026-04-30" +RUN_LABEL = "standard_8x" +RUN_KIND = "new_experiment" +SOURCE_PARENT = "legs/2026-04-30_pr1855_sp8192_lqer_smeargate_repro_8x/run.py" +SOURCE_PARENT_SHA256 = "454f710d174be80f4603069ca952833d694f60d1d34c0c25703528323bc8878b" +SOURCE_TOKENIZER_LANE = "scripts/prepare_sp10240_caseops_data.py" +PARENT_RUN = "2026-04-30_caseops4_gpu1_mlp375_late045_dup_1x" +HYPOTHESIS = ( + "Promote the best legal SP10240 CaseOps Side4 mechanics candidate to a " + "clean standard 8x run: 11L MLP3.75 with loop2 enabled at 0.45, keeping " + "PR1855 LQER/pergroup/phased-TTT compression/eval machinery fixed." 
+) +SIZE_CAP_BYTES = 16000000 +BUILD_SECONDS = 600 +EVAL_SECONDS = 600 + +Hyperparameters.test_id = TEST_ID +Hyperparameters.test_date = TEST_DATE +Hyperparameters.run_label = RUN_LABEL +Hyperparameters.run_kind = RUN_KIND +Hyperparameters.source_parent = SOURCE_PARENT +Hyperparameters.source_parent_sha256 = SOURCE_PARENT_SHA256 +Hyperparameters.source_tokenizer_lane = SOURCE_TOKENIZER_LANE +Hyperparameters.parent_run = PARENT_RUN +Hyperparameters.hypothesis = HYPOTHESIS +Hyperparameters.size_cap_bytes = SIZE_CAP_BYTES +Hyperparameters.build_seconds = BUILD_SECONDS +Hyperparameters.eval_seconds = EVAL_SECONDS + +Hyperparameters.data_dir = "/workspace/SOTA_FINAL/data" +_caseops_root = os.path.join( + Hyperparameters.data_dir, "datasets", "fineweb10B_sp10240_caseops", "datasets" +) +Hyperparameters.vocab_size = 10240 +Hyperparameters.caseops_enabled = True +Hyperparameters.datasets_dir = os.path.join( + _caseops_root, "datasets", "fineweb10B_sp10240_lossless_caps_caseops_v1_reserved" +) +Hyperparameters.train_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_train_*.bin") +Hyperparameters.val_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_*.bin") +Hyperparameters.val_bytes_files = os.path.join(Hyperparameters.datasets_dir, "fineweb_val_bytes_*.bin") +Hyperparameters.tokenizer_path = os.path.join( + _caseops_root, "tokenizers", "fineweb_10240_bpe_lossless_caps_caseops_v1_reserved.model" +) + +Hyperparameters.seed = 42 +Hyperparameters.run_id = "pr1855_sp10240_caseops_mlp375_late045_8x_seed42" +Hyperparameters.artifact_dir = "logs" +Hyperparameters.logfile = os.path.join(Hyperparameters.artifact_dir, f"{Hyperparameters.run_id}.txt") +Hyperparameters.model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.pt") +Hyperparameters.quantized_model_path = os.path.join(Hyperparameters.artifact_dir, "final_model.int6.ptz") +Hyperparameters.iterations = 20000 +Hyperparameters.max_wallclock_seconds = float(BUILD_SECONDS) +Hyperparameters.num_layers = 11 +Hyperparameters.xsa_last_n = 11 +Hyperparameters.model_dim = 512 +Hyperparameters.num_heads = 8 +Hyperparameters.num_kv_heads = 4 +Hyperparameters.mlp_mult = 3.75 +Hyperparameters.num_loops = 2 +Hyperparameters.loop_start = 3 +Hyperparameters.loop_end = 5 +Hyperparameters.enable_looping_at = 0.45 +Hyperparameters.parallel_start_layer = 8 +Hyperparameters.qk_gain_init = 5.25 +Hyperparameters.warmdown_frac = 0.85 +Hyperparameters.warmup_steps = 20 +Hyperparameters.min_lr = 0.1 +Hyperparameters.matrix_lr = 0.026 +Hyperparameters.beta2 = 0.99 +Hyperparameters.muon_backend_steps = 5 +Hyperparameters.grad_clip_norm = 0.3 +Hyperparameters.val_loss_every = 0 +Hyperparameters.ttt_enabled = True +Hyperparameters.ttt_lora_rank = 80 +Hyperparameters.ttt_chunk_size = 48 +Hyperparameters.ttt_weight_decay = 0.5 +Hyperparameters.ttt_beta2 = 0.99 +Hyperparameters.phased_ttt_prefix_docs = 2500 +Hyperparameters.phased_ttt_num_phases = 3 +Hyperparameters.global_ttt_momentum = 0.9 +Hyperparameters.compressor = "pergroup" +Hyperparameters.gptq_reserve_seconds = 0.5 +Hyperparameters.gptq_calibration_batches = 16 +Hyperparameters.matrix_bits = 6 +Hyperparameters.embed_bits = 7 +Hyperparameters.mlp_clip_sigmas = 11.5 +Hyperparameters.attn_clip_sigmas = 13.0 +Hyperparameters.embed_clip_sigmas = 14.0 +Hyperparameters.gated_attn_quant_gate = True +Hyperparameters.sparse_attn_gate_enabled = True +Hyperparameters.sparse_attn_gate_scale = 0.5 +Hyperparameters.gate_window = 12 +Hyperparameters.smear_gate_enabled = True 
+Hyperparameters.lqer_enabled = True
+Hyperparameters.lqer_asym_enabled = True
+Hyperparameters.lqer_rank = 4
+Hyperparameters.lqer_factor_bits = 4
+Hyperparameters.lqer_asym_group = 64
+Hyperparameters.lqer_top_k = 3
+Hyperparameters.fused_ce_enabled = True
+
+_logger_hparams = None
+
+
+def set_logging_hparams(h):
+    global _logger_hparams
+    _logger_hparams = h
+
+
+def log(msg, console=True):
+    if _logger_hparams is None:
+        print(msg)
+        return
+    if _logger_hparams.is_main_process:
+        if console:
+            print(msg)
+        if _logger_hparams.logfile is not None:
+            with open(_logger_hparams.logfile, "a", encoding="utf-8") as f:
+                print(msg, file=f)
+
+
+class ValidationData:
+    def __init__(self, h, device):
+        self.sp = spm.SentencePieceProcessor(model_file=h.tokenizer_path)
+        if int(self.sp.vocab_size()) != h.vocab_size:
+            raise ValueError(
+                f"VOCAB_SIZE={h.vocab_size} does not match tokenizer vocab_size={int(self.sp.vocab_size())}"
+            )
+        self.val_tokens = load_validation_tokens(h.val_files, h.eval_seq_len)
+        self.caseops_enabled = bool(getattr(h, "caseops_enabled", False))
+        if self.caseops_enabled:
+            self.base_bytes_lut = None
+            self.has_leading_space_lut = None
+            self.is_boundary_token_lut = None
+        else:
+            (
+                self.base_bytes_lut,
+                self.has_leading_space_lut,
+                self.is_boundary_token_lut,
+            ) = build_sentencepiece_luts(self.sp, h.vocab_size, device)
+        self.val_bytes = None
+        if self.caseops_enabled:
+            self.val_bytes = load_validation_byte_sidecar(
+                h.val_bytes_files, h.eval_seq_len, self.val_tokens.numel()
+            )
+
+
+def build_sentencepiece_luts(sp, vocab_size, device):
+    sp_vocab_size = int(sp.vocab_size())
+    assert (
+        sp.piece_to_id("▁") != sp.unk_id()
+    ), "Tokenizer must have '▁' (space) as its own token for correct BPB byte counting"
+    table_size = max(sp_vocab_size, vocab_size)
+    base_bytes_np = np.zeros((table_size,), dtype=np.int16)
+    has_leading_space_np = np.zeros((table_size,), dtype=np.bool_)
+    is_boundary_token_np = np.ones((table_size,), dtype=np.bool_)
+    for token_id in range(sp_vocab_size):
+        if sp.is_control(token_id) or sp.is_unknown(token_id) or sp.is_unused(token_id):
+            continue
+        is_boundary_token_np[token_id] = False
+        if sp.is_byte(token_id):
+            base_bytes_np[token_id] = 1
+            continue
+        piece = sp.id_to_piece(token_id)
+        if piece.startswith("▁"):
+            has_leading_space_np[token_id] = True
+            piece = piece[1:]
+        base_bytes_np[token_id] = len(piece.encode("utf-8"))
+    return (
+        torch.tensor(base_bytes_np, dtype=torch.int16, device=device),
+        torch.tensor(has_leading_space_np, dtype=torch.bool, device=device),
+        torch.tensor(is_boundary_token_np, dtype=torch.bool, device=device),
+    )
+
+
+def load_validation_tokens(pattern, seq_len):
+    # Filter out CaseOps byte sidecar shards which share the val_*.bin glob.
+    files = [
+        Path(p)
+        for p in sorted(glob.glob(pattern))
+        if "_bytes_" not in Path(p).name
+    ]
+    if not files:
+        raise FileNotFoundError(f"No files found for pattern: {pattern}")
+    tokens = torch.cat([load_data_shard(file) for file in files]).contiguous()
+    usable = (tokens.numel() - 1) // seq_len * seq_len
+    if usable <= 0:
+        raise ValueError(f"Validation split is too short for EVAL_SEQ_LEN={seq_len}")
+    return tokens[: usable + 1]
+
+
+def load_validation_byte_sidecar(pattern, seq_len, expected_len):
+    """Load CaseOps per-token byte sidecar(s). Same shard layout as token shards
+    (256 int32 header + uint16 array). Each entry = canonical raw-text byte
+    budget for that token in the corresponding val shard. Returns a CPU
+    int32 tensor sliced to match expected_len (i.e. 
val_tokens length)."""
+    files = [Path(p) for p in sorted(glob.glob(pattern))]
+    if not files:
+        raise FileNotFoundError(f"No byte sidecar files for pattern: {pattern}")
+    shards = [load_data_shard(file) for file in files]
+    # load_data_shard returns uint16 — that's exactly what the sidecar stores.
+    bytes_full = torch.cat(shards).contiguous()
+    if bytes_full.numel() < expected_len:
+        raise ValueError(
+            f"Byte sidecar too short: {bytes_full.numel()} < val_tokens {expected_len}"
+        )
+    return bytes_full[:expected_len].to(torch.int32)
+
+
+def load_data_shard(file):
+    # Shard layout (see the sidecar docstring above): 256 int32 header followed
+    # by a flat uint16 token array. The token count is derived from the file
+    # size, so nothing beyond the header's length is trusted.
+    header_bytes = 256 * np.dtype("<i4").itemsize
+    num_tokens = (os.path.getsize(file) - header_bytes) // np.dtype("<u2").itemsize
+    tokens = torch.empty(num_tokens, dtype=torch.uint16, pin_memory=True)
+    with open(file, "rb", buffering=0) as f:
+        f.seek(header_bytes)
+        nbytes = f.readinto(tokens.numpy())
+    assert nbytes == 2 * num_tokens, f"short read on shard {file}"
+    return tokens
+
+
+def _read_num_tokens(file):
+    header_bytes = 256 * np.dtype("<i4").itemsize
+    return (os.path.getsize(file) - header_bytes) // np.dtype("<u2").itemsize
+
+
+_shard_memmap_cache = {}
+
+
+def _get_shard_memmap(file):
+    # Read-only uint16 view of a shard's token region (header skipped),
+    # cached per file so ShuffledSequenceLoader can sample cheaply.
+    mm = _shard_memmap_cache.get(file)
+    if mm is None:
+        header_bytes = 256 * np.dtype("<i4").itemsize
+        mm = np.memmap(file, dtype="<u2", mode="r", offset=header_bytes)
+        _shard_memmap_cache[file] = mm
+    return mm
+
+
+def get_next_multiple_of_n(v, n):
+    return (v + n - 1) // n * n
+
+
+BOS_ID = None
+
+
+def _build_cu_seqlens(doc_starts, total_len, device, max_doc_len, bucket_size):
+    # Split each document into segments of at most max_doc_len tokens, then pad
+    # the boundary list to a bucketed length (trailing entries repeat total_len,
+    # i.e. zero-length segments) so cu_seqlens shapes stay stable across steps.
+    if not doc_starts or doc_starts[0] != 0:
+        doc_starts = [0] + list(doc_starts)
+    seg_starts = []
+    for start, end in zip(doc_starts, doc_starts[1:] + [total_len]):
+        if max_doc_len > 0:
+            pos = start
+            while pos < end:
+                seg_starts.append(pos)
+                pos += max_doc_len
+        else:
+            seg_starts.append(start)
+    boundaries = seg_starts + [total_len]
+    padded_len = get_next_multiple_of_n(len(boundaries), bucket_size)
+    cu = torch.full((padded_len,), total_len, dtype=torch.int32, device=device)
+    cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device)
+    seg_ends = seg_starts[1:] + [total_len]
+    max_seqlen = max(end - start for start, end in zip(seg_starts, seg_ends))
+    return cu, max_seqlen
+
+class DocumentPackingLoader:
+    _shard_pool = ThreadPoolExecutor(1)
+
+    def __init__(self, h, device, cu_bucket_size=64):
+        self.rank = h.rank
+        self.world_size = h.world_size
+        self.device = device
+        self.cu_bucket_size = cu_bucket_size
+        self.max_seq_len = h.train_seq_len
+        all_files = [Path(p) for p in sorted(glob.glob(h.train_files))]
+        if not all_files:
+            raise FileNotFoundError(f"No files found for pattern: {h.train_files}")
+        self.files = all_files
+        self.file_iter = iter(self.files)
+        self._init_shard(load_data_shard(next(self.file_iter)))
+        self._next_shard = self._submit_next_shard()
+        self._batch_pool = ThreadPoolExecutor(1)
+        self._prefetch_queue = []
+
+    def _init_shard(self, tokens):
+        global BOS_ID
+        self.tokens = tokens
+        self.shard_size = tokens.numel()
+        if BOS_ID is None:
+            BOS_ID = 1
+        self.bos_idx = (
+            (tokens == BOS_ID).nonzero(as_tuple=True)[0].to(torch.int64).cpu().numpy()
+        )
+        self.cursor = int(self.bos_idx[0])
+
+    def _submit_next_shard(self):
+        try:
+            path = next(self.file_iter)
+            return self._shard_pool.submit(load_data_shard, path)
+        except StopIteration:
+            return None
+
+    def _advance_shard(self):
+        if self._next_shard is None:
+            self.file_iter = iter(self.files)
+            self._next_shard = self._shard_pool.submit(
+                load_data_shard, next(self.file_iter)
+            )
+        self._init_shard(self._next_shard.result())
+        self._next_shard = self._submit_next_shard()
+
+    def _local_doc_starts(self, local_start, total_len):
+        lo = np.searchsorted(self.bos_idx, local_start, side="left")
+        hi = np.searchsorted(self.bos_idx, local_start + total_len, side="left")
+        return (self.bos_idx[lo:hi] - local_start).tolist()
+
+    def _prepare_batch(self, num_tokens_local, max_seq_len):
+        per_rank_span = num_tokens_local + 1
+        global_span = per_rank_span * self.world_size
+        while self.cursor + global_span > self.shard_size:
+            self._advance_shard()
+        local_start = self.cursor + self.rank * per_rank_span
+        buf = self.tokens[local_start : local_start + per_rank_span]
+        inputs = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        targets = torch.empty(per_rank_span - 1, dtype=torch.int64, pin_memory=True)
+        inputs.copy_(buf[:-1])
+        targets.copy_(buf[1:])
+        starts = self._local_doc_starts(local_start, inputs.numel())
+        cu_seqlens, max_seqlen = _build_cu_seqlens(
+            starts, inputs.numel(), inputs.device, max_seq_len, self.cu_bucket_size
+        )
+        cu_seqlens = 
cu_seqlens.pin_memory() + self.cursor += global_span + return inputs, targets, cu_seqlens, max_seqlen + + def next_batch(self, global_tokens, grad_accum_steps): + num_tokens_local = global_tokens // (self.world_size * grad_accum_steps) + while len(self._prefetch_queue) < 2: + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + inputs, targets, cu_seqlens, max_seqlen = self._prefetch_queue.pop(0).result() + self._prefetch_queue.append( + self._batch_pool.submit(self._prepare_batch, num_tokens_local, self.max_seq_len)) + return ( + inputs[None].to(self.device, non_blocking=True), + targets[None].to(self.device, non_blocking=True), + cu_seqlens.to(self.device, non_blocking=True), + max_seqlen, + ) + + +class ShuffledSequenceLoader: + def __init__(self, h, device): + self.world_size = h.world_size + self.seq_len = h.train_seq_len + self.device = device + all_files = [Path(p) for p in sorted(glob.glob(h.train_files))] + if not all_files: + raise FileNotFoundError(f"No files found for pattern: {h.train_files}") + self.files = all_files[h.rank :: h.world_size] + self.rng = np.random.Generator(np.random.PCG64(h.rank)) + self.num_tokens = [_read_num_tokens(f) for f in self.files] + self.start_inds = [[] for _ in self.files] + for si in range(len(self.files)): + self._reset_shard(si) + + def _reset_shard(self, si): + max_phase = min( + self.seq_len - 1, max(0, self.num_tokens[si] - self.seq_len - 1) + ) + phase = int(self.rng.integers(max_phase + 1)) if max_phase > 0 else 0 + num_sequences = (self.num_tokens[si] - 1 - phase) // self.seq_len + sequence_order = self.rng.permutation(num_sequences) + self.start_inds[si] = (phase + sequence_order * self.seq_len).tolist() + + def next_batch(self, global_tokens, grad_accum_steps): + device_tokens = global_tokens // (self.world_size * grad_accum_steps) + device_batch_size = device_tokens // self.seq_len + remaining = np.array([len(s) for s in self.start_inds], dtype=np.float64) + x = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + y = torch.empty((device_batch_size, self.seq_len), dtype=torch.int64) + for bi in range(device_batch_size): + total = remaining.sum() + if total <= 0: + for si in range(len(self.files)): + self._reset_shard(si) + remaining = np.array( + [len(s) for s in self.start_inds], dtype=np.float64 + ) + total = remaining.sum() + probs = remaining / total + si = int(self.rng.choice(len(self.files), p=probs)) + start_ind = self.start_inds[si].pop() + remaining[si] -= 1 + mm = _get_shard_memmap(self.files[si]) + window = torch.as_tensor( + np.array(mm[start_ind : start_ind + self.seq_len + 1], dtype=np.int64) + ) + x[bi] = window[:-1] + y[bi] = window[1:] + return x.to(self.device, non_blocking=True), y.to( + self.device, non_blocking=True + ) + + +class RMSNorm(nn.Module): + def __init__(self, eps=None): + super().__init__() + self.eps = eps + + def forward(self, x): + return F.rms_norm(x, (x.size(-1),), eps=self.eps) + + +class CastedLinear(nn.Linear): + def forward(self, x): + w = self.weight.to(x.dtype) + bias = self.bias.to(x.dtype) if self.bias is not None else None + return F.linear(x, w, bias) + + +@triton.jit +def linear_leaky_relu_square_kernel( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M: tl.constexpr, + BLOCK_SIZE_N: tl.constexpr, + BLOCK_SIZE_K: tl.constexpr, + NUM_SMS: tl.constexpr, + FORWARD: tl.constexpr, +): + dtype = tl.bfloat16 + start_pid = tl.program_id(axis=0) + num_pid_m = tl.cdiv(M, BLOCK_SIZE_M) + num_pid_n = 
tl.cdiv(N, BLOCK_SIZE_N) + k_tiles = tl.cdiv(K, BLOCK_SIZE_K) + num_tiles = num_pid_m * num_pid_n + tile_id_c = start_pid - NUM_SMS + for tile_id in tl.range(start_pid, num_tiles, NUM_SMS, flatten=True): + pid_m = tile_id // num_pid_n + pid_n = tile_id % num_pid_n + offs_am = pid_m * BLOCK_SIZE_M + offs_bn = pid_n * BLOCK_SIZE_N + accumulator = tl.zeros((BLOCK_SIZE_M, BLOCK_SIZE_N), dtype=tl.float32) + for ki in range(k_tiles): + offs_k = ki * BLOCK_SIZE_K + a = a_desc.load([offs_am, offs_k]) + b = b_desc.load([offs_bn, offs_k]) + accumulator = tl.dot(a, b.T, accumulator) + tile_id_c += NUM_SMS + offs_am_c = offs_am + offs_bn_c = offs_bn + acc = tl.reshape(accumulator, (BLOCK_SIZE_M, 2, BLOCK_SIZE_N // 2)) + acc = tl.permute(acc, (0, 2, 1)) + acc0, acc1 = tl.split(acc) + c0 = acc0.to(dtype) + c1 = acc1.to(dtype) + if not FORWARD: + pre0 = aux_desc.load([offs_am_c, offs_bn_c]) + pre1 = aux_desc.load([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2]) + c0 = c0 * tl.where(pre0 > 0, 2.0 * pre0, 0.5 * pre0) + c1 = c1 * tl.where(pre1 > 0, 2.0 * pre1, 0.5 * pre1) + c_desc.store([offs_am_c, offs_bn_c], c0) + c_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], c1) + if FORWARD: + aux0 = tl.where(c0 > 0, c0, 0.5 * c0) + aux1 = tl.where(c1 > 0, c1, 0.5 * c1) + aux_desc.store([offs_am_c, offs_bn_c], aux0 * aux0) + aux_desc.store([offs_am_c, offs_bn_c + BLOCK_SIZE_N // 2], aux1 * aux1) + + +def linear_leaky_relu_square(a, b, aux=None): + M, K = a.shape + N, K2 = b.shape + assert K == K2 + c = torch.empty((M, N), device=a.device, dtype=a.dtype) + forward = aux is None + if aux is None: + aux = torch.empty((M, N), device=a.device, dtype=a.dtype) + num_sms = torch.cuda.get_device_properties(a.device).multi_processor_count + BLOCK_SIZE_M, BLOCK_SIZE_N, BLOCK_SIZE_K = 256, 128, 64 + num_stages = 4 if forward else 3 + a_desc = TensorDescriptor.from_tensor(a, [BLOCK_SIZE_M, BLOCK_SIZE_K]) + b_desc = TensorDescriptor.from_tensor(b, [BLOCK_SIZE_N, BLOCK_SIZE_K]) + c_desc = TensorDescriptor.from_tensor(c, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + aux_desc = TensorDescriptor.from_tensor(aux, [BLOCK_SIZE_M, BLOCK_SIZE_N // 2]) + grid = lambda _meta: ( + min(num_sms, triton.cdiv(M, BLOCK_SIZE_M) * triton.cdiv(N, BLOCK_SIZE_N)), + ) + linear_leaky_relu_square_kernel[grid]( + a_desc, + b_desc, + c_desc, + aux_desc, + M, + N, + K, + BLOCK_SIZE_M=BLOCK_SIZE_M, + BLOCK_SIZE_N=BLOCK_SIZE_N, + BLOCK_SIZE_K=BLOCK_SIZE_K, + NUM_SMS=num_sms, + FORWARD=forward, + num_stages=num_stages, + num_warps=8, + ) + if forward: + return c, aux + return c + + +class FusedLinearLeakyReLUSquareFunction(torch.autograd.Function): + @staticmethod + def forward(ctx, x, w1, w2): + x_flat = x.reshape(-1, x.shape[-1]) + pre, post = linear_leaky_relu_square(x_flat, w1) + out = F.linear(post, w2) + ctx.save_for_backward(x, w1, w2, pre, post) + return out.view(*x.shape[:-1], out.shape[-1]) + + @staticmethod + def backward(ctx, grad_output): + x, w1, w2, pre, post = ctx.saved_tensors + x_flat = x.reshape(-1, x.shape[-1]) + grad_output_flat = grad_output.reshape(-1, grad_output.shape[-1]) + dw2 = grad_output_flat.T @ post + dpre = linear_leaky_relu_square(grad_output_flat, w2.T.contiguous(), aux=pre) + dw1 = dpre.T @ x_flat + dx = dpre @ w1 + return dx.view_as(x), dw1, dw2 + + +FusedLeakyReLUSquareMLP = FusedLinearLeakyReLUSquareFunction.apply + + +class Rotary(nn.Module): + def __init__(self, dim, base=1e4, train_seq_len=1024, rope_dims=0, yarn=True): + super().__init__() + self.dim = dim + self.base = base + self.train_seq_len = train_seq_len + self.yarn 
= yarn + self.rope_dims = rope_dims if rope_dims > 0 else dim + inv_freq = 1.0 / base ** ( + torch.arange(0, self.rope_dims, 2, dtype=torch.float32) / self.rope_dims + ) + self.register_buffer("inv_freq", inv_freq, persistent=False) + self._seq_len_cached = 0 + self._cos_cached = None + self._sin_cached = None + + def forward(self, seq_len, device, dtype): + if ( + self._cos_cached is None + or self._sin_cached is None + or self._seq_len_cached < seq_len + or self._cos_cached.device != device + ): + rd = self.rope_dims + if self.yarn and seq_len > self.train_seq_len: + scale = seq_len / self.train_seq_len + new_base = self.base * scale ** (rd / (rd - 2)) + inv_freq = 1.0 / new_base ** ( + torch.arange(0, rd, 2, dtype=torch.float32, device=device) / rd + ) + else: + inv_freq = self.inv_freq.float().to(device) + t = torch.arange(seq_len, device=device, dtype=torch.float32) + freqs = torch.outer(t, inv_freq) + self._cos_cached = freqs.cos()[None, :, None, :] + self._sin_cached = freqs.sin()[None, :, None, :] + self._seq_len_cached = seq_len + return self._cos_cached[:, :seq_len].to(dtype=dtype), self._sin_cached[:, :seq_len].to(dtype=dtype) + + +def apply_rotary_emb(x, cos, sin, rope_dims=0): + if rope_dims > 0 and rope_dims < x.size(-1): + x_rope, x_pass = x[..., :rope_dims], x[..., rope_dims:] + half = rope_dims // 2 + x1, x2 = x_rope[..., :half], x_rope[..., half:] + x_rope = torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + return torch.cat((x_rope, x_pass), dim=-1) + half = x.size(-1) // 2 + x1, x2 = x[..., :half], x[..., half:] + return torch.cat((x1 * cos + x2 * sin, x1 * -sin + x2 * cos), dim=-1) + + +class CausalSelfAttention(nn.Module): + def __init__( + self, dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=True, + attn_out_gate=False, attn_out_gate_src="proj", gate_window=12, + gated_attn=False, gated_attn_init_std=0.01, + sparse_attn_gate=False, sparse_attn_gate_init_std=0.0, sparse_attn_gate_scale=1.0, + ): + super().__init__() + if dim % num_heads != 0: + raise ValueError("model_dim must be divisible by num_heads") + if num_heads % num_kv_heads != 0: + raise ValueError("num_heads must be divisible by num_kv_heads") + if int(attn_out_gate) + int(gated_attn) + int(sparse_attn_gate) > 1: + raise ValueError( + "attn_out_gate, gated_attn, and sparse_attn_gate are mutually exclusive" + ) + self.num_heads = num_heads + self.num_kv_heads = num_kv_heads + self.head_dim = dim // num_heads + if self.head_dim % 2 != 0: + raise ValueError("head_dim must be even for RoPE") + self.q_gain = nn.Parameter( + torch.full((num_heads,), qk_gain_init, dtype=torch.float32) + ) + self.rope_dims = 0 + self.rotary = Rotary(self.head_dim, base=rope_base, train_seq_len=train_seq_len, yarn=yarn) + self.use_xsa = False + # AttnOutGate (PR #1667 MarioPaerle): per-head multiplicative gate on attention + # output. CastedLinear so restore_fp32_params casts back to fp32 for GPTQ. + # _zero_init -> 2*sigmoid(0)=1 -> transparent at init. + self.attn_out_gate = attn_out_gate + self.attn_out_gate_src = attn_out_gate_src + self.gate_window = gate_window + if attn_out_gate: + self.attn_gate_proj = CastedLinear(gate_window, num_heads, bias=False) + self.attn_gate_proj._zero_init = True + # Gated Attention (arXiv:2505.06708, Qwen, NeurIPS 2025). Per-head sigmoid + # gate on SDPA output, BEFORE out_proj. Gate projection W_g: (num_heads, dim). + # Name "attn_gate_w" contains "attn_gate" substring so it matches + # CONTROL_TENSOR_NAME_PATTERNS and routes to the scalar AdamW group. 
+ # fp32 Parameter -> restore_fp32_params path covers it via the ndim<2 OR + # name-pattern check (name matches "attn_gate"). Cast to x.dtype on use. + self.gated_attn = gated_attn + if gated_attn: + W = torch.empty(num_heads, dim, dtype=torch.float32) + nn.init.normal_(W, mean=0.0, std=gated_attn_init_std) + self.attn_gate_w = nn.Parameter(W) + # Sparse attention head-output gate (modded-nanogpt style). Keeps dense SDPA + # and only narrows the gate input to the first gate_window residual dims. + # W_g: (num_heads, gate_window). y_{t,h} <- sigmoid(scale * W_g_h @ x_t[:gate_window]) * y_{t,h}. + # Shares attn_gate_w name with dense GatedAttn so the quant routing + # (CONTROL_TENSOR_NAME_PATTERNS / attn_gate_w int8 passthrough) is unchanged. + self.sparse_attn_gate = sparse_attn_gate + self.sparse_attn_gate_scale = sparse_attn_gate_scale + if sparse_attn_gate: + W = torch.empty(num_heads, gate_window, dtype=torch.float32) + if sparse_attn_gate_init_std > 0: + nn.init.normal_(W, mean=0.0, std=sparse_attn_gate_init_std) + else: + nn.init.zeros_(W) + self.attn_gate_w = nn.Parameter(W) + + def _xsa_efficient(self, y, v): + B, T, H, D = y.shape + Hkv = v.size(-2) + group = H // Hkv + y_g = y.reshape(B, T, Hkv, group, D) + vn = F.normalize(v, dim=-1).unsqueeze(-2) + proj = (y_g * vn).sum(dim=-1, keepdim=True) * vn + return (y_g - proj).reshape(B, T, H, D) + + def forward(self, x, q_w, k_w, v_w, out_w, cu_seqlens=None, max_seqlen=0): + bsz, seqlen, dim = x.shape + # q_raw kept around as a tap point for attn_out_gate_src='q' (post-projection, + # pre-reshape, pre-RoPE). + q_raw = F.linear(x, q_w.to(x.dtype)) + q = q_raw.reshape(bsz, seqlen, self.num_heads, self.head_dim) + k = F.linear(x, k_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + v = F.linear(x, v_w.to(x.dtype)).reshape(bsz, seqlen, self.num_kv_heads, self.head_dim) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = self.rotary(seqlen, x.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, self.rope_dims) + k = apply_rotary_emb(k, cos, sin, self.rope_dims) + q = q * self.q_gain.to(dtype=q.dtype)[None, None, :, None] + if cu_seqlens is not None: + y = flash_attn_varlen_func( + q[0], + k[0], + v[0], + cu_seqlens_q=cu_seqlens, + cu_seqlens_k=cu_seqlens, + max_seqlen_q=max_seqlen, + max_seqlen_k=max_seqlen, + causal=True, + window_size=(-1, -1), + )[None] + else: + y = flash_attn_3_func(q, k, v, causal=True) + if self.use_xsa: + y = self._xsa_efficient(y, v) + # AttnOutGate inlined (PR #1667). Inline + .contiguous() barrier so torch.compile + # fullgraph=True is happy (this avoids the @torch.compiler.disable trap that + # crashed gates v3). Per-head gate on (B,T,H,D) tensor: g shape [B,T,H], broadcast + # over D via [..., None]. zero-init weight -> 2*sigmoid(0)=1 -> transparent. + if self.attn_out_gate: + gate_src = q_raw if self.attn_out_gate_src == "q" else x + gate_in = gate_src[..., : self.gate_window].contiguous() + g = 2.0 * torch.sigmoid(self.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (arXiv:2505.06708 G1). Inline + .contiguous() barrier so + # torch.compile fullgraph=True is happy. Per-head gate on (B,T,H,D): g shape + # [B,T,H], broadcast over D via [..., None]. Paper: g = sigmoid(x @ W_g.T) + # where W_g: (H, dim). .to(x.dtype) on fp32 param before broadcast with bf16. 
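+        # Worked shape example for this run's config (dim=512, H=8, D=64):
+        #   x: (B, T, 512) @ W_g.T (512, 8) -> g: (B, T, 8)
+        #   y: (B, T, 8, 64) * g[..., None]
+        # i.e. each head's 64-dim output is scaled by its own scalar gate in (0, 1).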
+ if self.gated_attn: + x_c = x.contiguous() + g = torch.sigmoid(F.linear(x_c, self.attn_gate_w.to(x.dtype))) + y = y * g[..., None] + # Sparse head-output gate: narrower (gate_window) input, same shape g as GatedAttn. + if self.sparse_attn_gate: + gate_in = x[..., : self.gate_window].contiguous() + g = torch.sigmoid( + self.sparse_attn_gate_scale + * F.linear(gate_in, self.attn_gate_w.to(x.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + self._last_proj_input = y.detach() if getattr(self, "_calib", False) else None + return F.linear(y, out_w.to(x.dtype)) + + +class MLP(nn.Module): + def __init__(self, dim, mlp_mult): + super().__init__() + self.use_fused = True + + def forward(self, x, up_w, down_w): + if self.training and self.use_fused: + return FusedLeakyReLUSquareMLP(x, up_w.to(x.dtype), down_w.to(x.dtype)) + hidden = F.leaky_relu(F.linear(x, up_w.to(x.dtype)), negative_slope=0.5).square() + self._last_down_input = hidden.detach() if getattr(self, "_calib", False) else None + return F.linear(hidden, down_w.to(x.dtype)) + + +class Block(nn.Module): + def __init__( + self, + dim, + num_heads, + num_kv_heads, + mlp_mult, + rope_base, + qk_gain_init, + train_seq_len, + layer_idx=0, + ln_scale=False, + yarn=True, + attn_out_gate=False, + attn_out_gate_src="proj", + gate_window=12, + gated_attn=False, + gated_attn_init_std=0.01, + sparse_attn_gate=False, + sparse_attn_gate_init_std=0.0, + sparse_attn_gate_scale=1.0, + ): + super().__init__() + self.attn_norm = RMSNorm() + self.mlp_norm = RMSNorm() + self.attn = CausalSelfAttention( + dim, num_heads, num_kv_heads, rope_base, qk_gain_init, train_seq_len, yarn=yarn, + attn_out_gate=attn_out_gate, attn_out_gate_src=attn_out_gate_src, gate_window=gate_window, + gated_attn=gated_attn, gated_attn_init_std=gated_attn_init_std, + sparse_attn_gate=sparse_attn_gate, + sparse_attn_gate_init_std=sparse_attn_gate_init_std, + sparse_attn_gate_scale=sparse_attn_gate_scale, + ) + self.mlp = MLP(dim, mlp_mult) + self.attn_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.mlp_scale = nn.Parameter(torch.ones(dim, dtype=torch.float32)) + self.resid_mix = nn.Parameter( + torch.stack((torch.ones(dim), torch.zeros(dim))).float() + ) + self.ln_scale_factor = 1.0 / math.sqrt(layer_idx + 1) if ln_scale else 1.0 + + def forward(self, x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=None, max_seqlen=0): + mix = self.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + attn_out = self.attn( + self.attn_norm(x_in) * self.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + x_out = x_in + self.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + x_out = x_out + self.mlp_scale.to(dtype=x_out.dtype)[ + None, None, : + ] * self.mlp(self.mlp_norm(x_out) * self.ln_scale_factor, up_w, down_w) + return x_out + +class GPT(nn.Module): + def __init__(self, h): + super().__init__() + if h.logit_softcap <= 0.0: + raise ValueError(f"logit_softcap must be positive, got {h.logit_softcap}") + self.tie_embeddings = h.tie_embeddings + self.tied_embed_init_std = h.tied_embed_init_std + self.logit_softcap = h.logit_softcap + self.fused_ce_enabled = bool(h.fused_ce_enabled) + self.tok_emb = nn.Embedding(h.vocab_size, h.model_dim) + self.num_layers = h.num_layers + head_dim = h.model_dim // h.num_heads + kv_dim = h.num_kv_heads * head_dim + hidden_dim = int(h.mlp_mult * h.model_dim) + self.qo_bank = nn.Parameter(torch.empty(2 * h.num_layers, h.model_dim, 
h.model_dim)) + self.kv_bank = nn.Parameter(torch.empty(2 * h.num_layers, kv_dim, h.model_dim)) + self.mlp_up_bank = nn.Parameter(torch.empty(h.num_layers, hidden_dim, h.model_dim)) + self.mlp_down_bank = nn.Parameter(torch.empty(h.num_layers, h.model_dim, hidden_dim)) + self.num_encoder_layers = h.num_layers // 2 + self.num_decoder_layers = h.num_layers - self.num_encoder_layers + self.blocks = nn.ModuleList( + [ + Block( + h.model_dim, + h.num_heads, + h.num_kv_heads, + h.mlp_mult, + h.rope_base, + h.qk_gain_init, + h.train_seq_len, + layer_idx=i, + ln_scale=h.ln_scale, + yarn=h.rope_yarn, + attn_out_gate=h.attn_out_gate_enabled, + attn_out_gate_src=h.attn_out_gate_src, + gate_window=h.gate_window, + gated_attn=h.gated_attn_enabled, + gated_attn_init_std=h.gated_attn_init_std, + sparse_attn_gate=h.sparse_attn_gate_enabled, + sparse_attn_gate_init_std=h.sparse_attn_gate_init_std, + sparse_attn_gate_scale=h.sparse_attn_gate_scale, + ) + for i in range(h.num_layers) + ] + ) + if h.rope_dims > 0: + head_dim = h.model_dim // h.num_heads + for block in self.blocks: + block.attn.rope_dims = h.rope_dims + block.attn.rotary = Rotary( + head_dim, + base=h.rope_base, + train_seq_len=h.train_seq_len, + rope_dims=h.rope_dims, + yarn=h.rope_yarn, + ) + self.final_norm = RMSNorm() + self.lm_head = ( + None + if h.tie_embeddings + else CastedLinear(h.model_dim, h.vocab_size, bias=False) + ) + if self.lm_head is not None: + self.lm_head._zero_init = True + if h.xsa_last_n > 0: + for i in range(max(0, h.num_layers - h.xsa_last_n), h.num_layers): + self.blocks[i].attn.use_xsa = True + self.looping_active = False + if h.num_loops > 0: + loop_seg = list(range(h.loop_start, h.loop_end + 1)) + all_indices = list(range(h.loop_start)) + for _ in range(h.num_loops + 1): + all_indices.extend(loop_seg) + all_indices.extend(range(h.loop_end + 1, h.num_layers)) + num_enc = len(all_indices) // 2 + self.encoder_indices = all_indices[:num_enc] + self.decoder_indices = all_indices[num_enc:] + else: + self.encoder_indices = list(range(self.num_encoder_layers)) + self.decoder_indices = list(range(self.num_encoder_layers, h.num_layers)) + self.num_skip_weights = min( + len(self.encoder_indices), len(self.decoder_indices) + ) + self.skip_weights = nn.Parameter( + torch.ones(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + self.skip_gates = ( + nn.Parameter( + torch.zeros(self.num_skip_weights, h.model_dim, dtype=torch.float32) + ) + if h.skip_gates_enabled + else None + ) + self.parallel_start_layer = h.parallel_start_layer + self.parallel_final_lane = h.parallel_final_lane.lower() + self.parallel_post_lambdas = nn.Parameter( + torch.ones(h.num_layers, 2, 2, dtype=torch.float32) + ) + self.parallel_resid_lambdas = nn.Parameter( + torch.full((h.num_layers, 2), 1.1, dtype=torch.float32) + ) + # SmearGate (PR #1667 / modded-nanogpt @classiclarryd): + # x_t <- x_t + lam * sigmoid(W * x_t[:gate_window]) * x_{t-1}. + # Per-token forward-1 smear of the embedding lane. W zero-init + lam=0 -> + # transparent at init. Uses CastedLinear so restore_fp32_params handles dtype. 
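+        # Note on init transparency: smear_lambda = 0 already zeroes the whole
+        # correction term, independent of the gate weight; the zero-init gate
+        # then contributes sigmoid(0) = 0.5, so early training sees a smooth
+        # ramp away from the identity as lam moves off zero.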
+ self.smear_gate_enabled = h.smear_gate_enabled + if self.smear_gate_enabled: + self.smear_window = h.gate_window + self.smear_gate = CastedLinear(self.smear_window, 1, bias=False) + self.smear_gate._zero_init = True + self.smear_lambda = nn.Parameter(torch.zeros(1, dtype=torch.float32)) + self._init_weights() + + def _init_weights(self): + if self.tie_embeddings: + nn.init.normal_(self.tok_emb.weight, mean=0.0, std=self.tied_embed_init_std) + n = self.num_layers + proj_scale = 1.0 / math.sqrt(2 * n) + for i in range(n): + nn.init.orthogonal_(self.qo_bank.data[i], gain=1.0) + nn.init.zeros_(self.qo_bank.data[n + i]) + self.qo_bank.data[n + i].mul_(proj_scale) + nn.init.orthogonal_(self.kv_bank.data[i], gain=1.0) + nn.init.orthogonal_(self.kv_bank.data[n + i], gain=1.0) + for i in range(n): + nn.init.orthogonal_(self.mlp_up_bank.data[i], gain=1.0) + nn.init.zeros_(self.mlp_down_bank.data[i]) + self.mlp_down_bank.data[i].mul_(proj_scale) + for name, module in self.named_modules(): + if isinstance(module, nn.Linear): + if getattr(module, "_zero_init", False): + nn.init.zeros_(module.weight) + elif ( + module.weight.ndim == 2 + and module.weight.shape[0] >= 64 + and module.weight.shape[1] >= 64 + ): + nn.init.orthogonal_(module.weight, gain=1.0) + + def _bank_weights(self, i): + n = self.num_layers + return ( + self.qo_bank[i], + self.kv_bank[i], + self.kv_bank[n + i], + self.qo_bank[n + i], + self.mlp_up_bank[i], + self.mlp_down_bank[i], + ) + + def _parallel_block( + self, block_idx, lane0, lane1, x0, + q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=None, max_seqlen=0, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + attn_out = block.attn( + block.attn_norm(attn_read) * block.ln_scale_factor, + q_w, k_w, v_w, out_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * block.mlp( + block.mlp_norm(mlp_read) * block.ln_scale_factor, up_w, down_w + ) + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + def _final_parallel_hidden(self, lane0, lane1): + if self.parallel_final_lane == "mlp": + return lane1 + if self.parallel_final_lane == "attn": + return lane0 + return 0.5 * (lane0 + lane1) + + def _forward_hidden(self, input_ids, cu_seqlens=None, max_seqlen=0): + """Run the encoder/decoder stack to the final RMSNorm; returns pre-projection hidden. + Shared by eval (softcap+projection via forward_logits) and train (fused CE path).""" + x = self.tok_emb(input_ids) + # SmearGate (PR #1667). lam=0 + W=0 -> identity at init. + # Cross-doc leak fix: zero the prev-token smear at any position whose current token + # is BOS, so the BOS embedding starting doc N+1 in a packed stream is not + # contaminated by doc N's last token (audited issue on PR#1797 base). 
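+        # Example on a packed stream [..., w_last, BOS, w_first, ...]: not_bos
+        # is 0 at the BOS position, so x_BOS receives no g * x_prev term and
+        # doc N+1 starts from the clean BOS embedding; every other position t
+        # still gets its lam * sigmoid(gate) * x_{t-1} smear.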
+ if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else range(self.num_encoder_layers) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block( + i, lane0, lane1, x0, q_w, k_w, v_w, out_w, up_w, down_w, + cu_seqlens=cu_seqlens, max_seqlen=max_seqlen, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self.blocks[i](x, x0, q_w, k_w, v_w, out_w, up_w, down_w, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + return x + + def _project_logits(self, hidden): + if self.tie_embeddings: + return F.linear(hidden, self.tok_emb.weight) + return self.lm_head(hidden) + + def forward_logits(self, input_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + return self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + + def forward(self, input_ids, target_ids, cu_seqlens=None, max_seqlen=0): + hidden = self._forward_hidden(input_ids, cu_seqlens=cu_seqlens, max_seqlen=max_seqlen) + logits_proj = self._project_logits(hidden) + flat_targets = target_ids.reshape(-1) + # Fused softcapped-CE kernel (training path only). Applies softcap inside the + # Triton kernel; takes pre-softcap logits_proj. Non-fused path matches stock + # PR-1736 numerics exactly (softcap in fp32, then F.cross_entropy on fp32). 
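+        # Why the kernels can work in sigmoids: with c = logit_softcap,
+        #   c * tanh(u / c) = 2c * sigmoid(2u / c) - c,
+        # so the Triton code computes z = 2c * sigmoid(2u / c), i.e. the
+        # softcapped logit shifted by +c. That shift is identical for every
+        # class, so it cancels in lse - z_target and the loss is unchanged.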
+ if self.fused_ce_enabled: + return softcapped_cross_entropy( + logits_proj.reshape(-1, logits_proj.size(-1)), + flat_targets, + self.logit_softcap, + reduction="mean", + ) + logits = self.logit_softcap * torch.tanh(logits_proj / self.logit_softcap) + return F.cross_entropy( + logits.reshape(-1, logits.size(-1)).float(), + flat_targets, + reduction="mean", + ) + + def forward_ttt(self, input_ids, target_ids, lora): + x = self.tok_emb(input_ids) + # SmearGate on the TTT path — same inline compute as forward_logits. + # Cross-doc leak fix: see _forward_hidden comment. + if self.smear_gate_enabled: + sl = self.smear_lambda.to(dtype=x.dtype) + gate_in = x[:, 1:, : self.smear_window].contiguous() + g = sl * torch.sigmoid(self.smear_gate(gate_in)) + not_bos = (input_ids[:, 1:] != BOS_ID).to(x.dtype).unsqueeze(-1) + x = torch.cat([x[:, :1], x[:, 1:] + g * x[:, :-1] * not_bos], dim=1) + x = F.rms_norm(x, (x.size(-1),)) + x0 = x + skips = [] + enc_iter = ( + self.encoder_indices + if self.looping_active + else list(range(self.num_encoder_layers)) + ) + dec_iter = ( + self.decoder_indices + if self.looping_active + else list( + range( + self.num_encoder_layers, + self.num_encoder_layers + self.num_decoder_layers, + ) + ) + ) + slot = 0 + for i in enc_iter: + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + skips.append(x) + psl = self.parallel_start_layer + lane0 = None + lane1 = None + for skip_idx, i in enumerate(dec_iter): + q_w, k_w, v_w, out_w, up_w, down_w = self._bank_weights(i) + if i >= psl and psl > 0: + if lane0 is None: + lane0 = x + lane1 = x + if skip_idx < self.num_skip_weights and skips: + skip = skips.pop() + w = self.skip_weights[skip_idx].to(dtype=lane0.dtype)[None, None, :] + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=lane0.dtype))[None, None, :] + lane0 = torch.lerp(w * skip, lane0, g) + else: + lane0 = lane0 + w * skip + lane0, lane1 = self._parallel_block_with_lora( + i, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ) + else: + if skip_idx < self.num_skip_weights and skips: + scaled_skip = ( + self.skip_weights[skip_idx].to(dtype=x.dtype)[None, None, :] + * skips.pop() + ) + if self.skip_gates is not None: + g = torch.sigmoid(self.skip_gates[skip_idx].to(dtype=x.dtype))[None, None, :] + x = torch.lerp(scaled_skip, x, g) + else: + x = x + scaled_skip + x = self._block_with_lora(self.blocks[i], x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w) + slot += 1 + if lane0 is not None: + x = self._final_parallel_hidden(lane0, lane1) + x = self.final_norm(x) + if self.tie_embeddings: + logits = F.linear(x, self.tok_emb.weight) + else: + logits = self.lm_head(x) + logits = logits + lora.lm_head_lora(x) + logits = self.logit_softcap * torch.tanh(logits / self.logit_softcap) + bsz, sl, V = logits.shape + return F.cross_entropy( + logits.float().reshape(-1, V), target_ids.reshape(-1), reduction="none" + ).reshape(bsz, sl) + + def _block_with_lora(self, block, x, x0, lora, slot, q_w, k_w, v_w, out_w, up_w, down_w): + mix = block.resid_mix.to(dtype=x.dtype) + x_in = mix[0][None, None, :] * x + mix[1][None, None, :] * x0 + n = block.attn_norm(x_in) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + # Keep raw Q for AttnOutGate src='q' (matches forward path semantics). 
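+        # LoRA composition sketch (the exact layout lives in BatchedLinearLoRA
+        # below): each adapter adds a rank-r residual of the standard form
+        #   delta = (alpha / r) * ((n @ A) @ B),  r = ttt_lora_rank,
+        # on top of the frozen bank projection; here it is fused directly into q.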
+ q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT path) — inline + .contiguous() barrier, same as the eval path. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT path). Gate input is n (post-norm block input), same + # as eval path. .to(n.dtype) on fp32 param before bf16 broadcast. + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT path) — must match the eval path in + # forward() exactly, else training (which applied the gate) and TTT eval (which + # skipped it) produce mismatched representations and catastrophic BPB regression. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + x_out = x_in + block.attn_scale.to(dtype=x_in.dtype)[None, None, :] * attn_out + mlp_n = block.mlp_norm(x_out) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + x_out = x_out + block.mlp_scale.to(dtype=x_out.dtype)[None, None, :] * mlp_out + return x_out + + def _parallel_block_with_lora( + self, block_idx, lane0, lane1, x0, lora, slot, + q_w, k_w, v_w, out_w, up_w, down_w, + ): + block = self.blocks[block_idx] + mix = block.resid_mix.to(dtype=lane0.dtype) + attn_read = mix[0][None, None, :] * lane0 + mix[1][None, None, :] * x0 + n = block.attn_norm(attn_read) * block.ln_scale_factor + attn = block.attn + bsz, seqlen, dim = n.shape + q_raw = F.linear(n, q_w.to(n.dtype)) + lora.q_loras[slot](n) + q = q_raw.reshape(bsz, seqlen, attn.num_heads, attn.head_dim) + k = F.linear(n, k_w.to(n.dtype)) + if lora.k_loras is not None: + k = k + lora.k_loras[slot](n) + k = k.reshape(bsz, seqlen, attn.num_kv_heads, attn.head_dim) + v = (F.linear(n, v_w.to(n.dtype)) + lora.v_loras[slot](n)).reshape( + bsz, seqlen, attn.num_kv_heads, attn.head_dim + ) + q = F.rms_norm(q, (q.size(-1),)) + k = F.rms_norm(k, (k.size(-1),)) + cos, sin = attn.rotary(seqlen, n.device, q.dtype) + q = apply_rotary_emb(q, cos, sin, attn.rope_dims) + k = apply_rotary_emb(k, cos, sin, attn.rope_dims) + q = q * attn.q_gain.to(dtype=q.dtype)[None, None, :, None] + y = flash_attn_3_func(q, k, v, causal=True) + if attn.use_xsa: + y = attn._xsa_efficient(y, v) + # AttnOutGate (TTT 
parallel path) — inline + .contiguous() barrier. + if attn.attn_out_gate: + gate_src = q_raw if attn.attn_out_gate_src == "q" else n + gate_in = gate_src[..., : attn.gate_window].contiguous() + g = 2.0 * torch.sigmoid(attn.attn_gate_proj(gate_in)) + y = y * g[..., None] + # Gated Attention (TTT parallel path). Gate input is n (post-norm block input). + if attn.gated_attn: + n_c = n.contiguous() + g = torch.sigmoid(F.linear(n_c, attn.attn_gate_w.to(n.dtype))) + y = y * g[..., None] + # Sparse attention head-output gate (TTT parallel path) — must match the + # eval path in forward() to keep train/eval semantics in sync. + if attn.sparse_attn_gate: + gate_in = n[..., : attn.gate_window].contiguous() + g = torch.sigmoid( + attn.sparse_attn_gate_scale + * F.linear(gate_in, attn.attn_gate_w.to(n.dtype)) + ) + y = y * g[..., None] + y = y.reshape(bsz, seqlen, dim) + attn_out = F.linear(y, out_w.to(n.dtype)) + if lora.o_loras is not None: + attn_out = attn_out + lora.o_loras[slot](n) + attn_out = block.attn_scale.to(dtype=attn_out.dtype)[None, None, :] * attn_out + mlp_read = lane1 + mlp_n = block.mlp_norm(mlp_read) * block.ln_scale_factor + mlp_out = block.mlp(mlp_n, up_w, down_w) + if lora.mlp_loras is not None: + mlp_out = mlp_out + lora.mlp_loras[slot](mlp_n) + mlp_out = block.mlp_scale.to(dtype=lane1.dtype)[None, None, :] * mlp_out + attn_resid = self.parallel_resid_lambdas[block_idx, 0].to(dtype=lane0.dtype) + attn_post = self.parallel_post_lambdas[block_idx, 0].to(dtype=lane0.dtype) + mlp_resid = self.parallel_resid_lambdas[block_idx, 1].to(dtype=lane0.dtype) + mlp_post = self.parallel_post_lambdas[block_idx, 1].to(dtype=lane0.dtype) + lane0 = attn_resid * lane0 + attn_post[0] * attn_out + mlp_post[0] * mlp_out + lane1 = mlp_resid * lane1 + attn_post[1] * attn_out + mlp_post[1] * mlp_out + return lane0, lane1 + + +class BatchedLinearLoRA(nn.Module): + # PR-1767: rank-scaled output (alpha/rank), like standard LoRA. Decouples + # effective magnitude from rank so changing rank does not change LR scale. + _ALPHA = float(os.environ.get("TTT_LORA_ALPHA", "144")) + # PR-1767: optionally keep A warm across per-doc resets (only B is zeroed). + # Accumulates useful feature directions across documents within a TTT phase. 
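+    # Reset regimes, sketched (B is always re-zeroed, so the adapter restarts
+    # as an exact no-op either way):
+    #     TTT_WARM_START_A=1 -> reset() zeroes B only; A keeps its learned
+    #                           directions across per-doc resets
+    #     TTT_WARM_START_A=0 -> reset() also re-draws A ~ U(-bound, bound),
+    #                           a full cold restart per document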
+ _WARM_START_A = bool(int(os.environ.get("TTT_WARM_START_A", "1"))) + + def __init__(self, bsz, in_features, out_features, rank): + super().__init__() + self._bound = 1.0 / math.sqrt(in_features) + self._scale = self._ALPHA / rank + self.A = nn.Parameter( + torch.empty(bsz, rank, in_features).uniform_(-self._bound, self._bound) + ) + self.B = nn.Parameter(torch.zeros(bsz, out_features, rank)) + + def reset(self): + with torch.no_grad(): + if not self._WARM_START_A: + self.A.uniform_(-self._bound, self._bound) + self.B.zero_() + + def forward(self, x): + return ((x @ self.A.transpose(1, 2)) @ self.B.transpose(1, 2)) * self._scale + + +class BatchedTTTLoRA(nn.Module): + def __init__(self, bsz, model, rank, k_lora=True, mlp_lora=True, o_lora=True): + super().__init__() + self.bsz = bsz + dim = model.qo_bank.shape[-1] + vocab = model.tok_emb.num_embeddings + if getattr(model, "looping_active", False): + num_slots = len(model.encoder_indices) + len(model.decoder_indices) + else: + num_slots = len(model.blocks) + kv_dim = model.blocks[0].attn.num_kv_heads * ( + dim // model.blocks[0].attn.num_heads + ) + embed_dim = model.tok_emb.embedding_dim + self.lm_head_lora = BatchedLinearLoRA(bsz, embed_dim, vocab, rank) + self.q_loras = nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)] + ) + self.v_loras = nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)] + ) + self.k_loras = ( + nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, kv_dim, rank) for _ in range(num_slots)] + ) + if k_lora + else None + ) + self.mlp_loras = ( + nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)] + ) + if mlp_lora + else None + ) + self.o_loras = ( + nn.ModuleList( + [BatchedLinearLoRA(bsz, dim, dim, rank) for _ in range(num_slots)] + ) + if o_lora + else None + ) + + def reset(self): + with torch.no_grad(): + self.lm_head_lora.reset() + for loras in [self.q_loras, self.v_loras, self.k_loras, + self.mlp_loras, self.o_loras]: + if loras is not None: + for lora in loras: + lora.reset() + + +# Polar Express per-iteration minimax Newton-Schulz coefficients (PR #1344). +# Replaces the fixed (3.4445, -4.775, 2.0315) coefficients of stock Muon. +# Applied at backend_steps=5 — taking more than 5 iterations from this list +# falls back to the final (converged) tuple via the slice guard below. 
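+# (More precisely, the guard caps the iteration count at len(_PE_COEFFS), so
+# the default steps=10 still runs exactly these five tuples once each.)
+# Minimal self-check, sketched as a comment (not executed anywhere here):
+#     G = torch.randn(64, 256)
+#     X = zeropower_via_newtonschulz5(G, steps=5).float()
+#     err = (X @ X.mT - torch.eye(64)).abs().max()   # ~0 up to bf16 noise
+# i.e. the rows of the output are approximately orthonormal, which is the
+# property Muon needs from its orthogonalization backend.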
+_PE_COEFFS = ( + (8.156554524902461, -22.48329292557795, 15.878769915207462), + (4.042929935166739, -2.808917465908714, 0.5000178451051316), + (3.8916678022926607, -2.772484153217685, 0.5060648178503393), + (3.285753657755655, -2.3681294933425376, 0.46449024233003106), + (2.3465413258596377, -1.7097828382687081, 0.42323551169305323), +) + + +@torch.compile +def zeropower_via_newtonschulz5(G, steps=10, eps=1e-07): + was_2d = G.ndim == 2 + if was_2d: + G = G.unsqueeze(0) + X = G.bfloat16() + transposed = X.size(-2) > X.size(-1) + if transposed: + X = X.mT + X = X / (X.norm(dim=(-2, -1), keepdim=True) + eps) + coeffs = _PE_COEFFS[:steps] if steps <= len(_PE_COEFFS) else _PE_COEFFS + for a, b, c in coeffs: + A = X @ X.mT + B = b * A + c * (A @ A) + X = a * X + B @ X + if transposed: + X = X.mT + if was_2d: + X = X.squeeze(0) + return X + + +class Muon(torch.optim.Optimizer): + def __init__( + self, + params, + lr, + momentum, + backend_steps, + nesterov=True, + weight_decay=0.0, + row_normalize=False, + ): + super().__init__( + params, + dict( + lr=lr, + momentum=momentum, + backend_steps=backend_steps, + nesterov=nesterov, + weight_decay=weight_decay, + row_normalize=row_normalize, + ), + ) + self._built = False + + def _build(self): + self._distributed = dist.is_available() and dist.is_initialized() + self._world_size = dist.get_world_size() if self._distributed else 1 + self._rank = dist.get_rank() if self._distributed else 0 + ws = self._world_size + self._bank_meta = [] + for group in self.param_groups: + for p in group["params"]: + B = p.shape[0] + padded_B = ((B + ws - 1) // ws) * ws + shard_B = padded_B // ws + tail = p.shape[1:] + dev = p.device + self._bank_meta.append({ + "p": p, + "B": B, + "padded_grad": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "shard": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "shard_mom": torch.zeros(shard_B, *tail, device=dev, dtype=torch.bfloat16), + "full_update": torch.zeros(padded_B, *tail, device=dev, dtype=torch.bfloat16), + "scale": max(1, p.shape[-2] / p.shape[-1]) ** 0.5, + }) + self._bank_meta.sort(key=lambda m: -m["p"].numel()) + self._built = True + + def launch_reduce_scatters(self): + if not self._built: + self._build() + if not self._distributed: + return + self._rs_futures = [] + for m in self._bank_meta: + p = m["p"] + if p.grad is None: + self._rs_futures.append(None) + continue + pg = m["padded_grad"] + pg[: m["B"]].copy_(p.grad) + fut = dist.reduce_scatter_tensor( + m["shard"], pg, op=dist.ReduceOp.AVG, async_op=True + ) + self._rs_futures.append(fut) + + @torch.no_grad() + def step(self, closure=None): + loss = None + if closure is not None: + with torch.enable_grad(): + loss = closure() + if not self._built: + self._build() + for group in self.param_groups: + lr = group["lr"] + momentum = group["momentum"] + backend_steps = group["backend_steps"] + nesterov = group["nesterov"] + wd = group.get("weight_decay", 0.0) + row_normalize = group.get("row_normalize", False) + prev_ag_handle = None + prev_m = None + sharded = self._distributed and hasattr(self, "_rs_futures") + for idx, m in enumerate(self._bank_meta): + p = m["p"] + if p.grad is None: + continue + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if sharded and self._rs_futures[idx] is not None: + self._rs_futures[idx].wait() + g = m["shard"] + buf = m["shard_mom"] + else: + g 
= p.grad.bfloat16() + state = self.state[p] + if "momentum_buffer" not in state: + state["momentum_buffer"] = torch.zeros_like(g) + buf = state["momentum_buffer"] + buf.mul_(momentum).add_(g) + if nesterov: + update = g.add(buf, alpha=momentum) + else: + update = buf + if row_normalize: + rn = update.float().norm(dim=-1, keepdim=True).clamp_min(1e-07) + update = update / rn.to(update.dtype) + update = zeropower_via_newtonschulz5(update, steps=backend_steps) + if sharded: + prev_ag_handle = dist.all_gather_into_tensor( + m["full_update"], update, async_op=True + ) + prev_m = m + else: + if wd > 0.0: + p.data.mul_(1.0 - lr * wd) + p.add_(update, alpha=-lr * m["scale"]) + if prev_ag_handle is not None: + prev_ag_handle.wait() + pp = prev_m["p"] + upd = prev_m["full_update"][: prev_m["B"]] + if wd > 0.0: + pp.data.mul_(1.0 - lr * wd) + pp.add_(upd, alpha=-lr * prev_m["scale"]) + if hasattr(self, "_rs_futures"): + del self._rs_futures + return loss + + +CONTROL_TENSOR_NAME_PATTERNS = tuple( + pattern + for pattern in os.environ.get( + "CONTROL_TENSOR_NAME_PATTERNS", + "attn_scale,attn_scales,mlp_scale,mlp_scales,resid_mix,resid_mixes,q_gain,skip_weight,skip_weights,skip_gates,parallel_post_lambdas,parallel_resid_lambdas,attn_gate_proj,attn_gate_w,smear_gate,smear_lambda", + ).split(",") + if pattern +) + + +PACKED_REPLICATED_GRAD_MAX_NUMEL = 1 << 15 + + +class Optimizers: + def __init__(self, h, base_model): + matrix_params = [ + base_model.qo_bank, + base_model.kv_bank, + base_model.mlp_up_bank, + base_model.mlp_down_bank, + ] + block_named_params = list(base_model.blocks.named_parameters()) + scalar_params = [ + p + for (name, p) in block_named_params + if p.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ] + if base_model.skip_weights.numel() > 0: + scalar_params.append(base_model.skip_weights) + if base_model.skip_gates is not None and base_model.skip_gates.numel() > 0: + scalar_params.append(base_model.skip_gates) + if base_model.parallel_post_lambdas is not None: + scalar_params.append(base_model.parallel_post_lambdas) + if base_model.parallel_resid_lambdas is not None: + scalar_params.append(base_model.parallel_resid_lambdas) + # SmearGate params live on GPT root (not in .blocks), so add them by hand. + # Both are tiny (gate_window scalars + 1 lambda). Optimized via scalar Adam. 
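+        # Routing recap: the four weight banks go to Muon; tok_emb gets its
+        # own AdamW group (tied vs untied LR); anything with ndim < 2 or a
+        # CONTROL_TENSOR_NAME_PATTERNS match lands in scalar AdamW.
+        # SmearGate's two tensors are missed by the .blocks scan above, hence
+        # the manual append below.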
+ if getattr(base_model, "smear_gate_enabled", False): + scalar_params.append(base_model.smear_gate.weight) + scalar_params.append(base_model.smear_lambda) + token_lr = h.tied_embed_lr if h.tie_embeddings else h.embed_lr + tok_params = [ + {"params": [base_model.tok_emb.weight], "lr": token_lr, "base_lr": token_lr} + ] + self.optimizer_tok = torch.optim.AdamW( + tok_params, + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.embed_wd, + fused=True, + ) + self.optimizer_muon = Muon( + matrix_params, + lr=h.matrix_lr, + momentum=h.muon_momentum, + backend_steps=h.muon_backend_steps, + weight_decay=h.muon_wd, + row_normalize=h.muon_row_normalize, + ) + for group in self.optimizer_muon.param_groups: + group["base_lr"] = h.matrix_lr + self.optimizer_scalar = torch.optim.AdamW( + [{"params": scalar_params, "lr": h.scalar_lr, "base_lr": h.scalar_lr}], + betas=(h.beta1, h.beta2), + eps=h.adam_eps, + weight_decay=h.adam_wd, + fused=True, + ) + self.optimizers = [ + self.optimizer_tok, + self.optimizer_muon, + self.optimizer_scalar, + ] + self.replicated_params = list(tok_params[0]["params"]) + self.replicated_params.extend(scalar_params) + self.replicated_large_params = [] + self.replicated_packed_params = [] + for p in self.replicated_params: + if p.numel() <= PACKED_REPLICATED_GRAD_MAX_NUMEL: + self.replicated_packed_params.append(p) + else: + self.replicated_large_params.append(p) + self._aux_stream = torch.cuda.Stream() + + def __iter__(self): + return iter(self.optimizers) + + def zero_grad_all(self): + for opt in self.optimizers: + opt.zero_grad(set_to_none=True) + + def _all_reduce_packed_grads(self): + grads_by_key = collections.defaultdict(list) + for p in self.replicated_packed_params: + if p.grad is not None: + grads_by_key[(p.grad.device, p.grad.dtype)].append(p.grad) + for grads in grads_by_key.values(): + flat = torch.empty( + sum(g.numel() for g in grads), + device=grads[0].device, + dtype=grads[0].dtype, + ) + offset = 0 + for g in grads: + n = g.numel() + flat[offset : offset + n].copy_(g.contiguous().view(-1)) + offset += n + dist.all_reduce(flat, op=dist.ReduceOp.AVG) + offset = 0 + for g in grads: + n = g.numel() + g.copy_(flat[offset : offset + n].view_as(g)) + offset += n + + def step(self, distributed=False): + self.optimizer_muon.launch_reduce_scatters() + if distributed: + reduce_handles = [ + dist.all_reduce(p.grad, op=dist.ReduceOp.AVG, async_op=True) + for p in self.replicated_large_params + if p.grad is not None + ] + self._all_reduce_packed_grads() + for handle in reduce_handles: + handle.wait() + self._aux_stream.wait_stream(torch.cuda.current_stream()) + with torch.cuda.stream(self._aux_stream): + self.optimizer_tok.step() + self.optimizer_scalar.step() + self.optimizer_muon.step() + torch.cuda.current_stream().wait_stream(self._aux_stream) + self.zero_grad_all() + + +def restore_fp32_params(model): + for module in model.modules(): + if isinstance(module, CastedLinear): + module.float() + for name, param in model.named_parameters(): + if ( + param.ndim < 2 + or any(pattern in name for pattern in CONTROL_TENSOR_NAME_PATTERNS) + ) and param.dtype != torch.float32: + param.data = param.data.float() + if hasattr(model, "qo_bank") and model.qo_bank is not None: + model.qo_bank.data = model.qo_bank.data.float() + model.kv_bank.data = model.kv_bank.data.float() + model.mlp_up_bank.data = model.mlp_up_bank.data.float() + model.mlp_down_bank.data = model.mlp_down_bank.data.float() + + +def collect_hessians(model, train_loader, h, device, n_calibration_batches=64): 
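+    # GPTQ Hessian proxy: for each linear layer with flattened input
+    # activations X of shape (tokens, in_features), accumulate H = X^T @ X
+    # over the calibration batches. gptq_quantize_weight below minimizes the
+    # layer-wise reconstruction error ||X @ W.T - X @ W_q.T||_F^2, whose
+    # curvature per weight row is exactly this H. Accumulation pattern, as a
+    # toy sketch with hypothetical shapes:
+    #     X = torch.randn(4096, 512)   # (tokens, in_features)
+    #     H = torch.zeros(512, 512)
+    #     H.addmm_(X.T, X)             # H += X^T X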
+ hessians = {} + hooks = [] + for i, block in enumerate(model.blocks): + block.attn._calib = True + block.mlp._calib = True + block.mlp.use_fused = False + + def make_attn_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + for suffix in ["c_q", "c_k", "c_v"]: + name = f"blocks.{layer_idx}.attn.{suffix}.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + y = module._last_proj_input + if y is not None: + y = y.float() + if y.ndim == 3: + y = y.reshape(-1, y.shape[-1]) + name = f"blocks.{layer_idx}.attn.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + y.shape[1], y.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(y.T, y) + return hook_fn + + def make_mlp_hook(layer_idx): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + name = f"blocks.{layer_idx}.mlp.fc.weight" + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + h_act = module._last_down_input + if h_act is not None: + h_act = h_act.float() + if h_act.ndim == 3: + h_act = h_act.reshape(-1, h_act.shape[-1]) + name = f"blocks.{layer_idx}.mlp.proj.weight" + if name not in hessians: + hessians[name] = torch.zeros( + h_act.shape[1], h_act.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(h_act.T, h_act) + return hook_fn + + for i, block in enumerate(model.blocks): + hooks.append(block.attn.register_forward_hook(make_attn_hook(i))) + hooks.append(block.mlp.register_forward_hook(make_mlp_hook(i))) + + # Hessian hooks for embedding factorization projection layers + def make_linear_input_hook(weight_name): + def hook_fn(module, inp, out): + x = inp[0].detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if weight_name not in hessians: + hessians[weight_name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[weight_name].addmm_(x.T, x) + return hook_fn + + if model.tie_embeddings: + hook_module = model.final_norm + + def make_output_hook(name): + def hook_fn(module, inp, out): + x = out.detach().float() + if x.ndim == 3: + x = x.reshape(-1, x.shape[-1]) + if name not in hessians: + hessians[name] = torch.zeros( + x.shape[1], x.shape[1], dtype=torch.float32, device=device + ) + hessians[name].addmm_(x.T, x) + return hook_fn + + hooks.append( + hook_module.register_forward_hook(make_output_hook("tok_emb.weight")) + ) + model.eval() + with torch.no_grad(): + for _ in range(n_calibration_batches): + x, _ = train_loader.next_batch(h.train_batch_tokens, h.grad_accum_steps) + model.forward_logits(x) + for hook in hooks: + hook.remove() + for i, block in enumerate(model.blocks): + block.attn._calib = False + block.mlp._calib = False + block.mlp.use_fused = True + for name in hessians: + hessians[name] = hessians[name].cpu() / n_calibration_batches + return hessians + + +def gptq_quantize_weight(w, H, clip_sigmas=3.0, clip_range=63, block_size=128): + W_orig = w.float().clone() + rows, cols = W_orig.shape + H = H.float().clone() + dead = torch.diag(H) == 0 + H[dead, dead] = 1 + damp = 0.01 * H.diag().mean() + H.diagonal().add_(damp) + perm = torch.argsort(H.diag(), descending=True) + invperm = torch.argsort(perm) + W_perm = W_orig[:, perm].clone() + W_perm[:, dead[perm]] 
= 0 + H = H[perm][:, perm] + Hinv = torch.cholesky_inverse(torch.linalg.cholesky(H)) + Hinv = torch.linalg.cholesky(Hinv, upper=True) + row_std = W_orig.std(dim=1) + s = (clip_sigmas * row_std / clip_range).clamp_min(1e-10).to(torch.float16) + sf = s.float() + Q = torch.zeros(rows, cols, dtype=torch.int8) + W_work = W_perm.clone() + for i1 in range(0, cols, block_size): + i2 = min(i1 + block_size, cols) + W_block = W_work[:, i1:i2].clone() + Hinv_block = Hinv[i1:i2, i1:i2] + Err = torch.zeros(rows, i2 - i1) + for j in range(i2 - i1): + w_col = W_block[:, j] + d = Hinv_block[j, j] + q_col = torch.clamp(torch.round(w_col / sf), -clip_range, clip_range) + Q[:, i1 + j] = q_col.to(torch.int8) + err = (w_col - q_col.float() * sf) / d + Err[:, j] = err + W_block[:, j:] -= err.unsqueeze(1) * Hinv_block[j, j:].unsqueeze(0) + if i2 < cols: + W_work[:, i2:] -= Err @ Hinv[i1:i2, i2:] + return Q[:, invperm], s + + +def _quantize_gate_int8_row(w): + # Symmetric int8-per-row quantization for small gate tensors. w shape + # (R, C) -> (R,) scales in fp16, int8 values in [-127, 127]. Single scale + # per row keeps accuracy high while halving storage vs fp16. + W = w.float().contiguous() + row_max = W.abs().amax(dim=1).clamp_min(1e-10) + s = (row_max / 127.0).to(torch.float16) + sf = s.float().view(-1, 1) + q = torch.clamp(torch.round(W / sf), -127, 127).to(torch.int8) + return q, s + + +def _lqer_pack(A, B, bits): + rng = 2 ** (bits - 1) - 1 + sA = (A.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + sB = (B.abs().amax(dim=1).clamp_min(1e-10) / rng).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float().view(-1, 1)), -rng, rng).to(torch.int8) + qB = torch.clamp(torch.round(B / sB.float().view(-1, 1)), -rng, rng).to(torch.int8) + return qA, sA, qB, sB + + +def _lqer_pack_asym(A, B, g=64): + # A: INT2 per-matrix scalar (signed [-2,1], scale = |A|max/1.5). + sA = (A.abs().amax().clamp_min(1e-10) / 1.5).to(torch.float16) + qA = torch.clamp(torch.round(A / sA.float()), -2, 1).to(torch.int8) + # B: INT4 groupwise g over flattened B (signed [-8,7], per-group scale). + Bf = B.reshape(-1, g) + Bmax = Bf.abs().amax(dim=-1, keepdim=True).clamp_min(1e-10) + sB = (Bmax / 7.5).to(torch.float16).reshape(-1) + qB = torch.clamp(torch.round(Bf / sB.float().reshape(-1, 1)), -8, 7).to( + torch.int8 + ).reshape(B.shape) + return qA, sA, qB, sB + + +def gptq_mixed_quantize(state_dict, hessians, h): + result = {} + meta = {} + quant_gate = bool(getattr(h, "gated_attn_quant_gate", False)) + lqer_on = bool(getattr(h, "lqer_enabled", False)) + lqer_cands = {} + for (name, tensor) in state_dict.items(): + t = tensor.detach().cpu().contiguous() + # Dedicated int8-per-row path for attn_gate_w (bypasses both GPTQ and + # fp16 passthrough). Applied BEFORE the numel<=65536 passthrough check + # so the gate tensor is routed here instead of to fp16. + if ( + quant_gate + and t.is_floating_point() + and t.ndim == 2 + and name.endswith(".attn_gate_w") + # Dense GatedAttn: (num_heads, dim) = (8, 512) = 4096. + # Sparse gate: (num_heads, gate_window) = (8, 12) = 96. + # Both need int8-per-row routing; the 1024 lower bound in stock + # PR-1736 presumed dense-only. Widen to catch both. 
+ and 32 <= t.numel() <= 8192 + ): + gq, gs = _quantize_gate_int8_row(t) + result[name + ".gq"] = gq + result[name + ".gs"] = gs + meta[name] = "gate_int8_row" + continue + if not t.is_floating_point() or t.numel() <= 65536: + result[name] = t.to(torch.float16) if t.is_floating_point() else t + meta[name] = "passthrough (float16)" + continue + if "tok_emb" in name: + cs = h.embed_clip_sigmas + elif ".mlp." in name: + cs = h.mlp_clip_sigmas + elif ".attn." in name: + cs = h.attn_clip_sigmas + else: + cs = h.matrix_clip_sigmas + bits = h.embed_bits if "tok_emb" in name else h.matrix_bits + clip_range = 2 ** (bits - 1) - 1 + ret = gptq_quantize_weight( + t, hessians[name], clip_sigmas=cs, clip_range=clip_range + ) + q, s = ret + result[name + ".q"] = q + result[name + ".scale"] = s + meta[name] = f"gptq (int{bits})" + if lqer_on: + W_q = q.float() * s.float().view(-1, 1) + E = t.float() - W_q + lqer_cands[name] = (E, float(E.norm())) + if lqer_on and lqer_cands: + top = sorted(lqer_cands.items(), key=lambda kv: -kv[1][1])[: h.lqer_top_k] + asym_on = bool(getattr(h, "lqer_asym_enabled", False)) + asym_g = int(getattr(h, "lqer_asym_group", 64)) + for (name, (E, _)) in top: + U, S, Vh = torch.linalg.svd(E, full_matrices=False) + r = min(h.lqer_rank, S.numel()) + A = (U[:, :r] * S[:r]).contiguous() + B = Vh[:r, :].contiguous() + if asym_on and B.numel() % asym_g == 0: + qA, sA, qB, sB = _lqer_pack_asym(A, B, asym_g) + result[name + ".lqA_a"] = qA + result[name + ".lqAs_a"] = sA + result[name + ".lqB_a"] = qB + result[name + ".lqBs_a"] = sB + meta[name] = meta[name] + "+lqer_asym" + else: + qA, sA, qB, sB = _lqer_pack(A, B, h.lqer_factor_bits) + result[name + ".lqA"] = qA + result[name + ".lqAs"] = sA + result[name + ".lqB"] = qB + result[name + ".lqBs"] = sB + meta[name] = meta[name] + "+lqer" + categories = collections.defaultdict(set) + for (name, cat) in meta.items(): + short = re.sub("\\.\\d+$", "", re.sub("blocks\\.\\d+", "blocks", name)) + categories[cat].add(short) + log("Quantized weights:") + for cat in sorted(categories): + log(f" {cat}: {', '.join(sorted(categories[cat]))}") + return result, meta + +def dequantize_mixed(result, meta, template_sd): + out = {} + for (name, orig) in template_sd.items(): + info = meta.get(name) + if info is None: + continue + orig_dtype = orig.dtype + if "passthrough" in info: + t = result[name] + if t.dtype == torch.float16 and orig_dtype in ( + torch.float32, + torch.bfloat16, + ): + t = t.to(orig_dtype) + out[name] = t + continue + if info == "gate_int8_row": + gq = result[name + ".gq"] + gs = result[name + ".gs"] + out[name] = (gq.float() * gs.float().view(-1, 1)).to(orig_dtype) + continue + q, s = result[name + ".q"], result[name + ".scale"] + if s.ndim > 0: + W = q.float() * s.float().view(q.shape[0], *[1] * (q.ndim - 1)) + else: + W = q.float() * float(s.item()) + if "lqer_asym" in info: + qA_t = result[name + ".lqA_a"] + sA_t = result[name + ".lqAs_a"] + qB_t = result[name + ".lqB_a"] + sB_t = result[name + ".lqBs_a"] + qA = qA_t.float() * float(sA_t) + g_sz = qB_t.numel() // sB_t.numel() + qB = (qB_t.reshape(-1, g_sz).float() * sB_t.float().view(-1, 1)).reshape( + qB_t.shape + ) + W = W + qA @ qB + elif "lqer" in info: + qA = result[name + ".lqA"].float() * result[name + ".lqAs"].float().view(-1, 1) + qB = result[name + ".lqB"].float() * result[name + ".lqBs"].float().view(-1, 1) + W = W + qA @ qB + out[name] = W.to(orig_dtype) + return out + + +_BSHF_MAGIC = b"BSHF" + + +# ── Per-group lrzip compression (ported from PR#1586 via PR#1667/1729) 
──────── + +_GROUP_ORDER = [ + "_tok_emb.weight.q", + "attn.c_k.weight.q", "attn.c_q.weight.q", + "attn.c_v.weight.q", "attn.proj.weight.q", + "mlp.fc.weight.q", "mlp.proj.weight.q", +] +_SIMSORT_KEYS = {"_tok_emb.weight.q", "attn.c_q.weight.q", "mlp.fc.weight.q"} +_PACK_MAGIC = b"PGRP" + + +def _similarity_sort_l1(matrix): + import numpy as _np + n = matrix.shape[0] + used = _np.zeros(n, dtype=bool) + order = [0] + used[0] = True + cur = matrix[0].astype(_np.float32) + for _ in range(n - 1): + dists = _np.sum(_np.abs(matrix[~used].astype(_np.float32) - cur), axis=1) + unused = _np.where(~used)[0] + best = unused[_np.argmin(dists)] + order.append(best) + used[best] = True + cur = matrix[best].astype(_np.float32) + return _np.array(order, dtype=_np.uint16) + + +def _lrzip_compress(data, tmpdir, label): + inp = os.path.join(tmpdir, f"{label}.bin") + out = f"{inp}.lrz" + with open(inp, "wb") as f: + f.write(data) + subprocess.run(["lrzip", "-z", "-L", "9", "-o", out, inp], capture_output=True, check=True) + with open(out, "rb") as f: + result = f.read() + os.remove(inp); os.remove(out) + return result + + +def _lrzip_decompress(data, tmpdir, label): + inp = os.path.join(tmpdir, f"{label}.lrz") + out = os.path.join(tmpdir, f"{label}.bin") + with open(inp, "wb") as f: + f.write(data) + subprocess.run(["lrzip", "-d", "-f", "-o", out, inp], capture_output=True, check=True) + with open(out, "rb") as f: + result = f.read() + os.remove(inp); os.remove(out) + return result + + +def _pack_streams(streams): + import struct + n = len(streams) + hdr = _PACK_MAGIC + struct.pack("= 2 + docs.append((start, end - start)) + return docs + + +def _build_ttt_global_batches(doc_entries, h, ascending=False): + batch_size = h.ttt_batch_size + global_doc_entries = sorted(doc_entries, key=lambda x: x[1][1]) + global_batches = [ + global_doc_entries[i : i + batch_size] + for i in range(0, len(global_doc_entries), batch_size) + ] + indexed = list(enumerate(global_batches)) + if not ascending: + indexed.sort(key=lambda ib: -max(dl for _, (_, dl) in ib[1])) + return indexed + + +def _init_batch_counter(path): + with open(path, "wb") as f: + f.write((0).to_bytes(4, "little")) + + +def _claim_next_batch(counter_path, queue_len): + try: + with open(counter_path, "r+b") as f: + fcntl.flock(f, fcntl.LOCK_EX) + idx = int.from_bytes(f.read(4), "little") + f.seek(0) + f.write((idx + 1).to_bytes(4, "little")) + f.flush() + except FileNotFoundError: + return queue_len + return idx + + +def _compute_chunk_window(ci, pred_len, num_chunks, chunk_size, eval_seq_len): + chunk_end = pred_len if ci == num_chunks - 1 else (ci + 1) * chunk_size + win_start = max(0, chunk_end - eval_seq_len) + win_len = chunk_end - win_start + chunk_start = ci * chunk_size + chunk_offset = chunk_start - win_start + chunk_len = chunk_end - chunk_start + return win_start, win_len, chunk_offset, chunk_len + + +def _accumulate_bpb( + ptl, + x, + y, + chunk_offsets, + chunk_lens, + pos_idx, + base_bytes_lut, + has_leading_space_lut, + is_boundary_token_lut, + loss_sum, + byte_sum, + token_count, + y_bytes=None, +): + pos = pos_idx[: x.size(1)].unsqueeze(0) + mask = ( + (chunk_lens.unsqueeze(1) > 0) + & (pos >= chunk_offsets.unsqueeze(1)) + & (pos < (chunk_offsets + chunk_lens).unsqueeze(1)) + ) + mask_f64 = mask.to(torch.float64) + if y_bytes is not None: + tok_bytes = y_bytes.to(torch.float64) + else: + tok_bytes = base_bytes_lut[y].to(torch.float64) + tok_bytes += (has_leading_space_lut[y] & ~is_boundary_token_lut[x]).to( + torch.float64 + ) + loss_sum += 
(ptl.to(torch.float64) * mask_f64).sum() + byte_sum += (tok_bytes * mask_f64).sum() + token_count += chunk_lens.to(torch.float64).sum() + + +def _loss_bpb_from_sums(loss_sum, token_count, byte_sum): + val_loss = (loss_sum / token_count).item() + val_bpb = val_loss / math.log(2.0) * (token_count.item() / byte_sum.item()) + return val_loss, val_bpb + + +def _add_to_counter(path, delta): + try: + with open(path, "r+b") as f: + fcntl.flock(f, fcntl.LOCK_EX) + cur = int.from_bytes(f.read(8), "little", signed=True) + cur += int(delta) + f.seek(0) + f.write(int(cur).to_bytes(8, "little", signed=True)) + f.flush() + return cur + except FileNotFoundError: + return int(delta) + + +def _init_int64_counter(path): + with open(path, "wb") as f: + f.write((0).to_bytes(8, "little", signed=True)) + + +def _select_ttt_doc_entries(docs, h): + doc_entries = list(enumerate(docs)) + if h.val_doc_fraction < 1.0: + sample_n = max(1, int(round(len(docs) * h.val_doc_fraction))) + sampled_indices = sorted( + random.Random(h.seed).sample(range(len(docs)), sample_n) + ) + return [(i, docs[i]) for i in sampled_indices] + return doc_entries + + +def train_val_ttt_global_sgd_distributed(h, device, val_data, base_model, val_tokens, batch_seqs=None): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + seq_len = h.eval_seq_len + total_tokens = val_tokens.numel() - 1 + ttt_chunk = h.global_ttt_chunk_tokens + batch_seqs = h.global_ttt_batch_seqs if batch_seqs is None else batch_seqs + num_chunks = (total_tokens + ttt_chunk - 1) // ttt_chunk + ttt_params = [p for p in base_model.parameters()] + for p in ttt_params: + p.requires_grad_(True) + optimizer = torch.optim.SGD( + ttt_params, lr=h.global_ttt_lr, momentum=h.global_ttt_momentum + ) + t_start = time.perf_counter() + for ci in range(num_chunks): + chunk_start = ci * ttt_chunk + chunk_end = min((ci + 1) * ttt_chunk, total_tokens) + is_last_chunk = ci == num_chunks - 1 + if is_last_chunk or h.global_ttt_epochs <= 0: + continue + base_model.train() + chunk_seqs = (chunk_end - chunk_start) // seq_len + if chunk_seqs <= 0: + continue + warmup_chunks = max(0, min(h.global_ttt_warmup_chunks, num_chunks - 1)) + if warmup_chunks > 0 and ci < warmup_chunks: + warmup_denom = max(warmup_chunks - 1, 1) + warmup_t = ci / warmup_denom + lr_now = ( + h.global_ttt_warmup_start_lr + + (h.global_ttt_lr - h.global_ttt_warmup_start_lr) * warmup_t + ) + else: + decay_steps = max(num_chunks - 1 - warmup_chunks, 1) + decay_ci = max(ci - warmup_chunks, 0) + lr_now = h.global_ttt_lr * 0.5 * ( + 1.0 + math.cos(math.pi * decay_ci / decay_steps) + ) + for pg in optimizer.param_groups: + pg["lr"] = lr_now + my_seq_s = chunk_seqs * h.rank // h.world_size + my_seq_e = chunk_seqs * (h.rank + 1) // h.world_size + my_chunk_seqs = my_seq_e - my_seq_s + for _ in range(h.global_ttt_epochs): + for bs in range(0, my_chunk_seqs, batch_seqs): + be = min(bs + batch_seqs, my_chunk_seqs) + actual_bs = my_seq_s + bs + start_tok = chunk_start + actual_bs * seq_len + end_tok = chunk_start + (my_seq_s + be) * seq_len + 1 + if end_tok > val_tokens.numel(): + continue + local = val_tokens[start_tok:end_tok].to(device=device, dtype=torch.int64) + x_flat = local[:-1] + y_flat = local[1:] + optimizer.zero_grad(set_to_none=True) + with torch.enable_grad(): + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + if h.global_ttt_respect_doc_boundaries: + bos_pos = (x_flat == BOS_ID).nonzero(as_tuple=True)[0].tolist() + cu_seqlens, max_seqlen = _build_cu_seqlens( + bos_pos, x_flat.numel(), 
x_flat.device, h.eval_seq_len, 64 + ) + loss = base_model( + x_flat[None], + y_flat[None], + cu_seqlens=cu_seqlens, + max_seqlen=max_seqlen, + ) + else: + x = x_flat.reshape(-1, seq_len) + y = y_flat.reshape(-1, seq_len) + loss = base_model(x, y) + loss.backward() + if dist.is_available() and dist.is_initialized(): + for p in ttt_params: + if p.grad is not None: + dist.all_reduce(p.grad, op=dist.ReduceOp.SUM) + p.grad.mul_(1.0 / h.world_size) + if h.global_ttt_grad_clip > 0: + torch.nn.utils.clip_grad_norm_(ttt_params, h.global_ttt_grad_clip) + optimizer.step() + base_model.eval() + if h.rank == 0: + elapsed = time.perf_counter() - t_start + log( + f"tttg: c{ci+1}/{num_chunks} lr:{lr_now:.6f} t:{elapsed:.1f}s" + ) + for p in base_model.parameters(): + p.requires_grad_(True) + base_model.eval() + + +def eval_val_ttt_phased(h, base_model, device, val_data, forward_ttt_train): + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + base_model.eval() + for p in base_model.parameters(): + p.requires_grad_(False) + all_tokens = val_data.val_tokens + all_tokens_idx = all_tokens.to(torch.int32) + docs = _find_docs(all_tokens) + doc_entries = _select_ttt_doc_entries(docs, h) + prefix_doc_limit = max(0, min(len(doc_entries), int(h.phased_ttt_prefix_docs))) + num_phases = max(1, int(h.phased_ttt_num_phases)) + phase_boundaries = [] + for pi in range(num_phases): + boundary = prefix_doc_limit * (pi + 1) // num_phases + phase_boundaries.append(boundary) + current_phase = 0 + current_phase_boundary = phase_boundaries[0] + log( + "ttt_phased:" + f" total_docs:{len(doc_entries)} prefix_docs:{prefix_doc_limit} " + f"suffix_docs:{len(doc_entries) - prefix_doc_limit}" + f" num_phases:{num_phases} boundaries:{phase_boundaries}" + ) + chunk_size, eval_seq_len = h.ttt_chunk_size, h.ttt_eval_seq_len + eval_batch_set = None + if h.ttt_eval_batches: + eval_batch_set = set(int(x) for x in h.ttt_eval_batches.split(",") if x.strip()) + use_ascending = eval_batch_set is not None + global_batches_sorted = _build_ttt_global_batches( + doc_entries, h, ascending=use_ascending + ) + queue_len = len(global_batches_sorted) + counter_path = f"/tmp/ttt_counter_{h.run_id}" + prefix_counter_path = f"/tmp/ttt_prefix_counter_{h.run_id}" + pause_flag_path = f"/tmp/ttt_pause_flag_{h.run_id}" + if h.rank == 0: + _init_batch_counter(counter_path) + _init_int64_counter(prefix_counter_path) + try: + os.remove(pause_flag_path) + except FileNotFoundError: + pass + if dist.is_available() and dist.is_initialized(): + path_list = [counter_path, prefix_counter_path, pause_flag_path] + dist.broadcast_object_list(path_list, src=0) + counter_path, prefix_counter_path, pause_flag_path = path_list + dist.barrier() + loss_sum = torch.zeros((), device=device, dtype=torch.float64) + byte_sum = torch.zeros((), device=device, dtype=torch.float64) + token_count = torch.zeros((), device=device, dtype=torch.float64) + t_start = time.perf_counter() + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + + def _build_opt(lora): + if h.ttt_optimizer == "sgd": + return torch.optim.SGD( + lora.parameters(), lr=h.ttt_lora_lr, + momentum=h.ttt_beta1, weight_decay=h.ttt_weight_decay, + ) + return torch.optim.AdamW( + lora.parameters(), lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, weight_decay=h.ttt_weight_decay, fused=True, + ) + + reusable_opt = _build_opt(reusable_lora) + local_scored_docs = [] + global_ttt_done = prefix_doc_limit 
== 0 + try: + while True: + queue_idx = _claim_next_batch(counter_path, queue_len) + if queue_idx >= queue_len: + break + orig_batch_idx, batch_entries = global_batches_sorted[queue_idx] + batch = [doc for _, doc in batch_entries] + bsz = len(batch) + prev_loss = loss_sum.item() + prev_bytes = byte_sum.item() + prev_tokens = token_count.item() + if bsz == reusable_lora.bsz: + reusable_lora.reset() + for s in reusable_opt.state.values(): + for k, v in s.items(): + if isinstance(v, torch.Tensor): + v.zero_() + elif k == "step": + s[k] = 0 + cur_lora = reusable_lora + cur_opt = reusable_opt + else: + cur_lora = BatchedTTTLoRA( + bsz, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + cur_opt = _build_opt(cur_lora) + pred_lens = [doc_len - 1 for _, doc_len in batch] + num_chunks = [(pl + chunk_size - 1) // chunk_size for pl in pred_lens] + max_nc = max(num_chunks) + num_chunks_t = torch.tensor(num_chunks, dtype=torch.int64, device=device) + for ci in range(max_nc): + active = [ci < nc for nc in num_chunks] + needs_train = any(ci < nc - 1 for nc in num_chunks) + tok_starts = torch.zeros(bsz, dtype=torch.int64) + tok_wls = torch.zeros(bsz, dtype=torch.int64) + chunk_offsets_cpu = torch.zeros(bsz, dtype=torch.int64) + chunk_lens_cpu = torch.zeros(bsz, dtype=torch.int64) + for b in range(bsz): + if not active[b]: + continue + doc_start, doc_len = batch[b] + win_start, win_len, chunk_offset, chunk_len = _compute_chunk_window( + ci, pred_lens[b], num_chunks[b], chunk_size, eval_seq_len + ) + tok_starts[b] = doc_start + win_start + tok_wls[b] = win_len + chunk_offsets_cpu[b] = chunk_offset + chunk_lens_cpu[b] = chunk_len + _, context_size, chunk_offset, _ = _compute_chunk_window( + ci, (ci + 1) * chunk_size, ci + 1, chunk_size, eval_seq_len + ) + col_idx = torch.arange(context_size + 1) + idx = tok_starts.unsqueeze(1) + col_idx.unsqueeze(0) + idx.clamp_(max=all_tokens.numel() - 1) + gathered_gpu = all_tokens_idx[idx].to( + device=device, dtype=torch.int64, non_blocking=True + ) + valid = (col_idx[:context_size].unsqueeze(0) < tok_wls.unsqueeze(1)).to( + device, non_blocking=True + ) + chunk_offsets = chunk_offsets_cpu.to(device, non_blocking=True) + chunk_lens = chunk_lens_cpu.to(device, non_blocking=True) + x = torch.where(valid, gathered_gpu[:, :context_size], 0) + y = torch.where(valid, gathered_gpu[:, 1 : context_size + 1], 0) + ctx_pos = torch.arange(context_size, device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + # CaseOps sidecar-driven byte budget. Mirror the index pattern + # used to build y from all_tokens: y[b, j] corresponds to the + # token at global position tok_starts[b] + 1 + j (when valid). + y_bytes_arg = None + if val_data.caseops_enabled and val_data.val_bytes is not None: + y_idx = ( + tok_starts.unsqueeze(1) + + 1 + + col_idx[:context_size].unsqueeze(0) + ) + y_idx = y_idx.clamp_(max=val_data.val_bytes.numel() - 1) + y_bytes_arg = val_data.val_bytes[y_idx].to( + device=device, dtype=torch.int32, non_blocking=True + ) + # Mirror the `valid` masking used for y so out-of-range tokens + # contribute zero bytes (matches y=0 substitution above). 
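+                    # Byte-budget recap: the reported metric is
+                    #     bpb = loss_sum / (byte_sum * ln 2)
+                    # with loss_sum in nats over masked chunk tokens and
+                    # byte_sum the bytes those tokens decode to. The where()
+                    # below zeroes the sidecar byte counts for padded
+                    # positions, mirroring the valid mask applied to y.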
+ y_bytes_arg = torch.where( + valid, y_bytes_arg, torch.zeros_like(y_bytes_arg) + ) + with torch.no_grad(): + _accumulate_bpb( + per_tok_loss, + x, + y, + chunk_offsets, + chunk_lens, + ctx_pos, + val_data.base_bytes_lut, + val_data.has_leading_space_lut, + val_data.is_boundary_token_lut, + loss_sum, + byte_sum, + token_count, + y_bytes=y_bytes_arg, + ) + if needs_train: + activate_chunk_mask = (num_chunks_t - 1 > ci).float() + for gi in range(h.ttt_grad_steps): + if gi > 0: + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + per_tok_loss = forward_ttt_train(x, y, lora=cur_lora) + per_doc = per_tok_loss[ + :, chunk_offset : chunk_offset + chunk_size + ].mean(dim=-1) + cur_opt.zero_grad(set_to_none=True) + (per_doc * activate_chunk_mask).sum().backward() + cur_opt.step() + else: + del per_tok_loss + batch_num = orig_batch_idx + 1 + doc_lens = [dl for _, dl in batch] + should_report = batch_num in eval_batch_set if eval_batch_set is not None else True + if should_report: + cur_tokens = token_count.item() + cur_loss_val = loss_sum.item() + cur_bytes_val = byte_sum.item() + dt = cur_tokens - prev_tokens + db = cur_bytes_val - prev_bytes + if dt > 0 and db > 0: + b_loss = (cur_loss_val - prev_loss) / dt + b_bpb = b_loss / math.log(2.0) * (dt / db) + else: + b_loss = b_bpb = 0.0 + r_loss = cur_loss_val / max(cur_tokens, 1) + r_bpb = r_loss / math.log(2.0) * (cur_tokens / max(cur_bytes_val, 1)) + elapsed = time.perf_counter() - t_start + log( + f"ttp: b{batch_num}/{queue_len} bl:{b_loss:.4f} bb:{b_bpb:.4f} " + f"rl:{r_loss:.4f} rb:{r_bpb:.4f} dl:{min(doc_lens)}-{max(doc_lens)} " + f"gd:{int(global_ttt_done)}" + ) + if not global_ttt_done: + local_scored_docs.extend( + (orig_batch_idx, pos, doc_start, doc_len) + for pos, (doc_start, doc_len) in enumerate(batch) + ) + prefix_done = _add_to_counter(prefix_counter_path, len(batch_entries)) + if prefix_done >= current_phase_boundary: + try: + with open(pause_flag_path, "x"): + pass + except FileExistsError: + pass + should_pause = os.path.exists(pause_flag_path) + if should_pause: + if dist.is_available() and dist.is_initialized(): + dist.barrier() + gathered_scored_docs = [None] * h.world_size + if dist.is_available() and dist.is_initialized(): + dist.all_gather_object(gathered_scored_docs, local_scored_docs) + else: + gathered_scored_docs = [local_scored_docs] + scored_docs_for_global = [] + for rank_docs in gathered_scored_docs: + if rank_docs: + scored_docs_for_global.extend(rank_docs) + scored_docs_for_global.sort(key=lambda x: (x[0], x[1])) + scored_docs_for_global = scored_docs_for_global[:current_phase_boundary] + scored_token_chunks = [ + val_data.val_tokens[doc_start : doc_start + doc_len] + for _, _, doc_start, doc_len in scored_docs_for_global + ] + if scored_token_chunks: + global_ttt_tokens = torch.cat(scored_token_chunks) + else: + global_ttt_tokens = val_data.val_tokens[:0] + if h.rank == 0: + prefix_done = 0 + try: + with open(prefix_counter_path, "rb") as f: + prefix_done = int.from_bytes( + f.read(8), "little", signed=True + ) + except FileNotFoundError: + pass + log( + f"ttpp: phase:{current_phase + 1}/{num_phases} pd:{prefix_done} " + f"gd:{len(scored_docs_for_global)} " + f"t:{time.perf_counter() - t_start:.1f}s" + ) + train_val_ttt_global_sgd_distributed( + h, device, val_data, base_model, global_ttt_tokens + ) + for p in base_model.parameters(): + p.requires_grad_(False) + reusable_lora = BatchedTTTLoRA( + h.ttt_batch_size, base_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, 
o_lora=h.ttt_o_lora, + ).to(device) + reusable_opt = _build_opt(reusable_lora) + current_phase += 1 + if current_phase >= num_phases: + global_ttt_done = True + else: + current_phase_boundary = phase_boundaries[current_phase] + if h.rank == 0: + try: + os.remove(pause_flag_path) + except FileNotFoundError: + pass + if dist.is_available() and dist.is_initialized(): + dist.barrier() + if h.rank == 0: + log(f"ttpr: phase:{current_phase}/{num_phases} t:{time.perf_counter() - t_start:.1f}s") + del cur_lora, cur_opt + finally: + pass + if dist.is_available() and dist.is_initialized(): + dist.all_reduce(loss_sum, op=dist.ReduceOp.SUM) + dist.all_reduce(byte_sum, op=dist.ReduceOp.SUM) + dist.all_reduce(token_count, op=dist.ReduceOp.SUM) + for p in base_model.parameters(): + p.requires_grad_(True) + base_model.train() + return _loss_bpb_from_sums(loss_sum, token_count, byte_sum) + + +def timed_eval(label, fn, *args, **kwargs): + torch.cuda.synchronize() + t0 = time.perf_counter() + val_loss, val_bpb = fn(*args, **kwargs) + torch.cuda.synchronize() + elapsed_ms = 1e3 * (time.perf_counter() - t0) + log( + f"{label} val_loss:{val_loss:.8f} val_bpb:{val_bpb:.8f} eval_time:{elapsed_ms:.0f}ms" + ) + return val_loss, val_bpb + + +def train_model(h, device, val_data): + base_model = GPT(h).to(device).bfloat16() + restore_fp32_params(base_model) + compiled_model = torch.compile(base_model, dynamic=False, fullgraph=True) + compiled_forward_logits = torch.compile( + base_model.forward_logits, dynamic=False, fullgraph=True + ) + model = compiled_model + log(f"model_params:{sum(p.numel()for p in base_model.parameters())}") + optimizers = Optimizers(h, base_model) + train_loader = DocumentPackingLoader(h, device) + max_wallclock_ms = ( + 1e3 * h.max_wallclock_seconds if h.max_wallclock_seconds > 0 else None + ) + if max_wallclock_ms is not None: + max_wallclock_ms -= h.gptq_reserve_seconds * 1e3 + log( + f"gptq:reserving {h.gptq_reserve_seconds:.0f}s, effective={max_wallclock_ms:.0f}ms" + ) + + def training_frac(step, elapsed_ms): + if max_wallclock_ms is None: + return step / max(h.iterations, 1) + return elapsed_ms / max(max_wallclock_ms, 1e-09) + + def lr_mul(frac): + if h.warmdown_frac <= 0: + return 1.0 + if frac >= 1.0 - h.warmdown_frac: + return max((1.0 - frac) / h.warmdown_frac, h.min_lr) + return 1.0 + + _clip_params = [p for p in base_model.parameters() if p.requires_grad] + def step_fn(step, lr_scale): + train_loss = torch.zeros((), device=device) + for micro_step in range(h.grad_accum_steps): + x, y, cu_seqlens, _max_seqlen = train_loader.next_batch( + h.train_batch_tokens, h.grad_accum_steps + ) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True): + loss = model(x, y, cu_seqlens=cu_seqlens, max_seqlen=h.train_seq_len) + train_loss += loss.detach() + (loss / h.grad_accum_steps).backward() + train_loss /= h.grad_accum_steps + if step <= h.muon_momentum_warmup_steps: + + frac = ( + + min(step / h.muon_momentum_warmup_steps, 1.0) + + if h.muon_momentum_warmup_steps > 0 + + else 1.0 + + ) + + muon_momentum = ( + + 1 - frac + + ) * h.muon_momentum_warmup_start + frac * h.muon_momentum + + for group in optimizers.optimizer_muon.param_groups: + + group["momentum"] = muon_momentum + for opt in optimizers: + for group in opt.param_groups: + group["lr"] = group["base_lr"] * lr_scale + if h.grad_clip_norm > 0: + torch.nn.utils.clip_grad_norm_(_clip_params, h.grad_clip_norm) + optimizers.step(distributed=h.distributed) + return train_loss + + if h.warmup_steps > 0: + 
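+        # Warmup runs real optimizer steps only to trigger torch.compile and
+        # cu_seqlens bucket specialization. Model and optimizer state are
+        # snapshotted here and restored afterwards (and the data loader is
+        # rebuilt), so no warmup update leaks into the timed run.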
initial_model_state = { + name: tensor.detach().cpu().clone() + for (name, tensor) in base_model.state_dict().items() + } + initial_optimizer_states = [ + copy.deepcopy(opt.state_dict()) for opt in optimizers + ] + model.train() + num_tokens_local = h.train_batch_tokens // h.world_size + for blk in base_model.blocks: + blk.attn.rotary(num_tokens_local, device, torch.bfloat16) + cu_bucket_size = train_loader.cu_bucket_size + warmup_cu_buckets = tuple(cu_bucket_size * i for i in range(1, 5)) + warmup_cu_iters = 3 + x, y, cu_seqlens, _ = train_loader.next_batch( + h.train_batch_tokens, h.grad_accum_steps + ) + log(f"warmup_cu_buckets:{','.join(str(b) for b in warmup_cu_buckets)} iters_each:{warmup_cu_iters}") + def _run_cu_bucket_warmup(): + for bucket_len in warmup_cu_buckets: + boundaries = list(range(0, x.size(1), max(h.train_seq_len, 1))) + if boundaries[-1] != x.size(1): + boundaries.append(x.size(1)) + cu = torch.full((bucket_len,), x.size(1), dtype=torch.int32, device=device) + cu[: len(boundaries)] = torch.tensor(boundaries, dtype=torch.int32, device=device) + for _ in range(warmup_cu_iters): + optimizers.zero_grad_all() + with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=True): + wloss = model(x, y, cu_seqlens=cu, max_seqlen=h.train_seq_len) + (wloss / h.grad_accum_steps).backward() + optimizers.zero_grad_all() + _run_cu_bucket_warmup() + if h.num_loops > 0: + base_model.looping_active = True + _run_cu_bucket_warmup() + base_model.looping_active = False + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"warmup_step: {warmup_step+1}/{h.warmup_steps}") + if h.num_loops > 0: + base_model.looping_active = True + log( + f"loop_warmup:enabled encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + for warmup_step in range(h.warmup_steps): + step_fn(warmup_step, 1.0) + if ( + warmup_step <= 5 + or (warmup_step + 1) % 10 == 0 + or warmup_step + 1 == h.warmup_steps + ): + log(f"loop_warmup_step: {warmup_step+1}/{h.warmup_steps}") + base_model.looping_active = False + base_model.load_state_dict(initial_model_state, strict=True) + for (opt, state) in zip(optimizers, initial_optimizer_states, strict=True): + opt.load_state_dict(state) + optimizers.zero_grad_all() + train_loader = DocumentPackingLoader(h, device) + _live_state = base_model.state_dict(keep_vars=True) + ema_state = { + name: t.detach().float().clone() + for (name, t) in _live_state.items() + } + _ema_pairs = [(ema_state[name], t) for (name, t) in _live_state.items()] + ema_decay = h.ema_decay + training_time_ms = 0.0 + stop_after_step = None + torch.cuda.synchronize() + t0 = time.perf_counter() + step = 0 + while True: + last_step = ( + step == h.iterations + or stop_after_step is not None + and step >= stop_after_step + ) + should_validate = ( + last_step or h.val_loss_every > 0 and step % h.val_loss_every == 0 + ) + if should_validate: + torch.cuda.synchronize() + training_time_ms += 1e3 * (time.perf_counter() - t0) + val_loss, val_bpb = eval_val( + h, device, val_data, model, compiled_forward_logits + ) + log( + f"{step}/{h.iterations} val_loss: {val_loss:.4f} val_bpb: {val_bpb:.4f}" + ) + torch.cuda.synchronize() + t0 = time.perf_counter() + if last_step: + if stop_after_step is not None and step < h.iterations: + log( + f"stopping_early: wallclock_cap train_time: {training_time_ms:.0f}ms step: {step}/{h.iterations}" + ) + break + elapsed_ms = 
training_time_ms + 1e3 * (time.perf_counter() - t0) + frac = training_frac(step, elapsed_ms) + scale = lr_mul(frac) + if ( + h.num_loops > 0 + and not base_model.looping_active + and frac >= h.enable_looping_at + ): + base_model.looping_active = True + log( + f"layer_loop:enabled step:{step} frac:{frac:.3f} encoder:{base_model.encoder_indices} decoder:{base_model.decoder_indices}" + ) + train_loss = step_fn(step, scale) + with torch.no_grad(): + for ema_t, t in _ema_pairs: + ema_t.mul_(ema_decay).add_(t.detach(), alpha=1.0 - ema_decay) + step += 1 + approx_training_time_ms = training_time_ms + 1e3 * (time.perf_counter() - t0) + should_log_train = h.train_log_every > 0 and ( + step <= 5 or step % h.train_log_every == 0 or stop_after_step is not None + ) + if should_log_train: + tok_per_sec = step * h.train_batch_tokens / (approx_training_time_ms / 1e3) + log( + f"{step}/{h.iterations} train_loss: {train_loss.item():.4f} train_time: {approx_training_time_ms/60000:.1f}m tok/s: {tok_per_sec:.0f}" + ) + reached_cap = ( + max_wallclock_ms is not None and approx_training_time_ms >= max_wallclock_ms + ) + if h.distributed and max_wallclock_ms is not None: + reached_cap_tensor = torch.tensor(int(reached_cap), device=device) + dist.all_reduce(reached_cap_tensor, op=dist.ReduceOp.MAX) + reached_cap = bool(reached_cap_tensor.item()) + if stop_after_step is None and reached_cap: + stop_after_step = step + log( + f"peak memory allocated: {torch.cuda.max_memory_allocated()//1024//1024} MiB reserved: {torch.cuda.max_memory_reserved()//1024//1024} MiB" + ) + log("ema:applying EMA weights") + current_state = base_model.state_dict() + avg_state = { + name: t.to(dtype=current_state[name].dtype) for (name, t) in ema_state.items() + } + base_model.load_state_dict(avg_state, strict=True) + return base_model, compiled_model, compiled_forward_logits + + +def train_and_eval(h, device): + random.seed(h.seed) + np.random.seed(h.seed) + torch.manual_seed(h.seed) + torch.cuda.manual_seed_all(h.seed) + if h.artifact_dir and h.is_main_process: + os.makedirs(h.artifact_dir, exist_ok=True) + val_data = ValidationData(h, device) + log( + f"train_shards: {len(list(Path(h.datasets_dir).resolve().glob('fineweb_train_*.bin')))}" + ) + log(f"val_tokens: {val_data.val_tokens.numel()-1}") + # TTT_EVAL_ONLY: skip training + GPTQ, jump straight to TTT eval on a + # pre-existing quantized artifact. Used to test TTT-only improvements + # (e.g., PR-1767's alpha/warm-start/WD) without retraining. 
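+    # Hypothetical invocation (requires a previously serialized artifact in
+    # h.artifact_dir; env knobs as defined above):
+    #     TTT_EVAL_ONLY=1 TTT_LORA_ALPHA=144 TTT_WARM_START_A=1 \
+    #         torchrun --standalone --nproc_per_node=8 train_gpt.py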
+ ttt_eval_only = os.environ.get("TTT_EVAL_ONLY", "0") == "1" + if ttt_eval_only: + log("TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval") + log(f"ttt_lora_alpha: {BatchedLinearLoRA._ALPHA}") + log(f"ttt_warm_start_a: {BatchedLinearLoRA._WARM_START_A}") + log(f"ttt_weight_decay: {h.ttt_weight_decay}") + else: + base_model, compiled_model, compiled_forward_logits = train_model( + h, device, val_data + ) + torch._dynamo.reset() + timed_eval( + "diagnostic pre-quantization post-ema", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + if os.environ.get("PREQUANT_ONLY", "0") == "1": + log("PREQUANT_ONLY=1 — skipping serialize/GPTQ/post-quant eval/TTT") + return + serialize(h, base_model, Path(__file__).read_text(encoding="utf-8")) + if h.distributed: + dist.barrier() + eval_model = deserialize(h, device) + if h.num_loops > 0: + eval_model.looping_active = True + if not ttt_eval_only: + compiled_model = torch.compile(eval_model, dynamic=False, fullgraph=True) + compiled_forward_logits = torch.compile( + eval_model.forward_logits, dynamic=False, fullgraph=True + ) + timed_eval( + "diagnostic quantized", + eval_val, + h, + device, + val_data, + compiled_model, + compiled_forward_logits, + ) + del eval_model + if h.ttt_enabled: + if not ttt_eval_only: + del compiled_model + if ttt_eval_only: + del eval_model + torch._dynamo.reset() + torch.cuda.empty_cache() + ttt_model = deserialize(h, device) + if h.num_loops > 0: + ttt_model.looping_active = True + for p in ttt_model.parameters(): + p.requires_grad_(False) + + if h.rope_yarn: + _yarn_seqlen = h.train_batch_tokens // h.grad_accum_steps + for block in ttt_model.blocks: + block.attn.rotary(_yarn_seqlen, device, torch.bfloat16) + else: + for block in ttt_model.blocks: + block.attn.rotary._cos_cached = None + block.attn.rotary._sin_cached = None + block.attn.rotary._seq_len_cached = 0 + block.attn.rotary(h.ttt_eval_seq_len, device, torch.bfloat16) + + def _fwd_ttt_inner(input_ids, target_ids, lora): + return ttt_model.forward_ttt(input_ids, target_ids, lora=lora) + + _fwd_ttt_compiled_inner = None + + def _fwd_ttt(input_ids, target_ids, lora): + nonlocal _fwd_ttt_compiled_inner + if _fwd_ttt_compiled_inner is None: + _fwd_ttt_compiled_inner = torch.compile(_fwd_ttt_inner, dynamic=True) + return _fwd_ttt_compiled_inner(input_ids, target_ids, lora=lora) + + fwd_ttt_compiled = _fwd_ttt + log(f"ttt_lora:warming up compile (random tokens, no val data)") + global BOS_ID + if BOS_ID is None: + BOS_ID = 1 + t_warmup = time.perf_counter() + warmup_bszes = [h.ttt_batch_size] + for bsz in warmup_bszes: + wl = BatchedTTTLoRA( + bsz, ttt_model, h.ttt_lora_rank, + k_lora=h.ttt_k_lora, mlp_lora=h.ttt_mlp_lora, o_lora=h.ttt_o_lora, + ).to(device) + wo = torch.optim.AdamW( + wl.parameters(), + lr=h.ttt_lora_lr, + betas=(h.ttt_beta1, h.ttt_beta2), + eps=1e-10, + weight_decay=h.ttt_weight_decay, + fused=True, + ) + for ctx_len in (h.ttt_chunk_size, h.ttt_eval_seq_len): + xw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + yw = torch.randint(0, h.vocab_size, (bsz, ctx_len), device=device, dtype=torch.int64) + with torch.autocast(device_type="cuda", dtype=torch.bfloat16): + ptl = fwd_ttt_compiled(xw, yw, lora=wl) + ptl[:, : min(h.ttt_chunk_size, ctx_len)].mean(dim=-1).sum().backward() + wo.step() + wo.zero_grad(set_to_none=True) + del wl, wo + torch.cuda.empty_cache() + compile_elapsed = time.perf_counter() - t_warmup + log(f"ttt_lora:compile warmup done 
({compile_elapsed:.1f}s)") + log("\nbeginning TTT eval timer") + torch.cuda.synchronize() + t_ttt = time.perf_counter() + ttt_val_loss, ttt_val_bpb = eval_val_ttt_phased( + h, ttt_model, device, val_data, forward_ttt_train=fwd_ttt_compiled + ) + torch.cuda.synchronize() + ttt_eval_elapsed = time.perf_counter() - t_ttt + log( + "quantized_ttt_phased " + f"val_loss:{ttt_val_loss:.8f} val_bpb:{ttt_val_bpb:.8f} " + f"eval_time:{1e3*ttt_eval_elapsed:.0f}ms" + ) + log(f"total_eval_time:{ttt_eval_elapsed:.1f}s") + del ttt_model + + +def main(): + world_size = int(os.environ.get("WORLD_SIZE", "1")) + local_rank = int(os.environ.get("LOCAL_RANK", "0")) + distributed = "RANK" in os.environ and "WORLD_SIZE" in os.environ + if not torch.cuda.is_available(): + raise RuntimeError("CUDA is required") + if world_size <= 0: + raise ValueError(f"WORLD_SIZE must be positive, got {world_size}") + if 8 % world_size != 0: + raise ValueError( + f"WORLD_SIZE={world_size} must divide 8 so grad_accum_steps stays integral" + ) + device = torch.device("cuda", local_rank) + torch.cuda.set_device(device) + if distributed: + dist.init_process_group(backend="nccl", device_id=device) + dist.barrier() + torch.backends.cuda.matmul.allow_tf32 = True + torch.backends.cudnn.allow_tf32 = True + torch.set_float32_matmul_precision("high") + from torch.backends.cuda import ( + enable_cudnn_sdp, + enable_flash_sdp, + enable_math_sdp, + enable_mem_efficient_sdp, + ) + + enable_cudnn_sdp(False) + enable_flash_sdp(True) + enable_mem_efficient_sdp(False) + enable_math_sdp(False) + torch._dynamo.config.optimize_ddp = False + torch._dynamo.config.cache_size_limit = 64 + h = Hyperparameters() + set_logging_hparams(h) + if h.is_main_process: + os.makedirs(h.artifact_dir if h.artifact_dir else "logs", exist_ok=True) + log(100 * "=", console=False) + log("Hyperparameters:", console=True) + for (k, v) in sorted(vars(type(h)).items()): + if not k.startswith("_"): + log(f" {k}: {v}", console=True) + log("=" * 100, console=False) + log("Source code:", console=False) + log("=" * 100, console=False) + with open(__file__, "r", encoding="utf-8") as _src: + log(_src.read(), console=False) + log("=" * 100, console=False) + log(f"Running Python {sys.version}", console=False) + log(f"Running PyTorch {torch.__version__}", console=False) + log("=" * 100, console=False) + train_and_eval(h, device) + if distributed: + dist.destroy_process_group() + + +if __name__ == "__main__": + main() + +==================================================================================================== +Running Python 3.11.10 (main, Sep 7 2024, 18:35:41) [GCC 11.4.0] +Running PyTorch 2.11.0+cu130 +==================================================================================================== +train_shards: 80 +val_tokens: 46688256 +model_params:35552455 +gptq:reserving 0s, effective=599500ms +warmup_cu_buckets:64,128,192,256 iters_each:3 +warmup_step: 1/20 +warmup_step: 2/20 +warmup_step: 3/20 +warmup_step: 4/20 +warmup_step: 5/20 +warmup_step: 6/20 +warmup_step: 10/20 +warmup_step: 20/20 +loop_warmup:enabled encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +loop_warmup_step: 1/20 +loop_warmup_step: 2/20 +loop_warmup_step: 3/20 +loop_warmup_step: 4/20 +loop_warmup_step: 5/20 +loop_warmup_step: 6/20 +loop_warmup_step: 10/20 +loop_warmup_step: 20/20 +1/20000 train_loss: 9.2182 train_time: 0.0m tok/s: 16986989 +2/20000 train_loss: 12.9049 train_time: 0.0m tok/s: 11478176 +3/20000 train_loss: 10.3646 train_time: 0.0m tok/s: 10264844 +4/20000 
train_loss: 8.9695 train_time: 0.0m tok/s: 9754365 +5/20000 train_loss: 8.0635 train_time: 0.0m tok/s: 9474196 +500/20000 train_loss: 2.8223 train_time: 0.8m tok/s: 8403682 +1000/20000 train_loss: 2.7586 train_time: 1.6m tok/s: 8378283 +1500/20000 train_loss: 2.6979 train_time: 2.3m tok/s: 8375063 +2000/20000 train_loss: 2.7597 train_time: 3.1m tok/s: 8375151 +2500/20000 train_loss: 2.6650 train_time: 3.9m tok/s: 8377853 +layer_loop:enabled step:2874 frac:0.450 encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +3000/20000 train_loss: 2.6405 train_time: 4.8m tok/s: 8216247 +3500/20000 train_loss: 2.6184 train_time: 5.9m tok/s: 7729161 +4000/20000 train_loss: 2.5376 train_time: 7.1m tok/s: 7399029 +4500/20000 train_loss: 2.5070 train_time: 8.2m tok/s: 7162585 +5000/20000 train_loss: 2.4674 train_time: 9.4m tok/s: 6983189 +5264/20000 val_loss: 2.4200 val_bpb: 1.0790 +stopping_early: wallclock_cap train_time: 599612ms step: 5264/20000 +peak memory allocated: 41649 MiB reserved: 48396 MiB +ema:applying EMA weights +diagnostic pre-quantization post-ema val_loss:2.39050404 val_bpb:1.06579670 eval_time:8935ms +Serialized model: 131747517 bytes +Code size (uncompressed): 163036 bytes +Code size (compressed): 41220 bytes +GPTQ:collecting Hessians from calibration data... +GPTQ:collected 67 Hessians in 3.4s +Quantized weights: + gate_int8_row: blocks.attn.attn_gate_w + gptq (int6): blocks.attn.c_k.weight, blocks.attn.c_q.weight, blocks.attn.c_v.weight, blocks.attn.proj.weight, blocks.mlp.fc.weight, blocks.mlp.proj.weight + gptq (int6)+lqer_asym: blocks.mlp.fc.weight + gptq (int7)+lqer_asym: tok_emb.weight + passthrough (float16): blocks.attn.q_gain, blocks.attn_scale, blocks.mlp_scale, blocks.resid_mix, parallel_post_lambdas, parallel_resid_lambdas, skip_gates, skip_weights, smear_gate.weight, smear_lambda +Serialize: per-group lrzip compression... +Serialize: per-group compression done in 139.7s +Serialized model quantized+pergroup: 15775768 bytes +Total submission size quantized+pergroup: 15816988 bytes +Deserialize: per-group lrzip decompression... +Deserialize: decompression done in 21.1s +diagnostic quantized val_loss:2.41030953 val_bpb:1.07462690 eval_time:9708ms +Deserialize: per-group lrzip decompression... 
+Deserialize: decompression done in 21.1s +ttt_lora:warming up compile (random tokens, no val data) +ttt_lora:compile warmup done (110.1s) + +beginning TTT eval timer +ttt_phased: total_docs:50000 prefix_docs:2500 suffix_docs:47500 num_phases:3 boundaries:[833, 1666, 2500] +ttp: b775/782 bl:2.3225 bb:1.0645 rl:2.3225 rb:1.0645 dl:6695-7323 gd:0 +ttp: b774/782 bl:2.3603 bb:1.0702 rl:2.3407 rb:1.0673 dl:6300-6695 gd:0 +ttp: b769/782 bl:2.3762 bb:1.0878 rl:2.3504 rb:1.0729 dl:4969-5170 gd:0 +ttp: b764/782 bl:2.3507 bb:1.0924 rl:2.3504 rb:1.0764 dl:4180-4277 gd:0 +ttpp: phase:1/3 pd:1296 gd:833 t:215.0s +tttg: c1/127 lr:0.001000 t:0.3s +tttg: c2/127 lr:0.001000 t:0.4s +tttg: c3/127 lr:0.000999 t:0.5s +tttg: c4/127 lr:0.000999 t:0.5s +tttg: c5/127 lr:0.000998 t:0.6s +tttg: c6/127 lr:0.000996 t:0.7s +tttg: c7/127 lr:0.000994 t:0.8s +tttg: c8/127 lr:0.000992 t:0.8s +tttg: c9/127 lr:0.000990 t:0.9s +tttg: c10/127 lr:0.000987 t:1.0s +tttg: c11/127 lr:0.000985 t:1.1s +tttg: c12/127 lr:0.000981 t:1.2s +tttg: c13/127 lr:0.000978 t:1.2s +tttg: c14/127 lr:0.000974 t:1.3s +tttg: c15/127 lr:0.000970 t:1.4s +tttg: c16/127 lr:0.000965 t:1.5s +tttg: c17/127 lr:0.000961 t:1.5s +tttg: c18/127 lr:0.000956 t:1.6s +tttg: c19/127 lr:0.000950 t:1.7s +tttg: c20/127 lr:0.000945 t:1.8s +tttg: c21/127 lr:0.000939 t:1.9s +tttg: c22/127 lr:0.000933 t:1.9s +tttg: c23/127 lr:0.000927 t:2.0s +tttg: c24/127 lr:0.000920 t:2.1s +tttg: c25/127 lr:0.000913 t:2.2s +tttg: c26/127 lr:0.000906 t:2.2s +tttg: c27/127 lr:0.000899 t:2.3s +tttg: c28/127 lr:0.000891 t:2.4s +tttg: c29/127 lr:0.000883 t:2.5s +tttg: c30/127 lr:0.000875 t:2.5s +tttg: c31/127 lr:0.000867 t:2.6s +tttg: c32/127 lr:0.000858 t:2.7s +tttg: c33/127 lr:0.000849 t:2.8s +tttg: c34/127 lr:0.000840 t:2.9s +tttg: c35/127 lr:0.000831 t:2.9s +tttg: c36/127 lr:0.000821 t:3.0s +tttg: c37/127 lr:0.000812 t:3.1s +tttg: c38/127 lr:0.000802 t:3.2s +tttg: c39/127 lr:0.000792 t:3.2s +tttg: c40/127 lr:0.000782 t:3.3s +tttg: c41/127 lr:0.000771 t:3.4s +tttg: c42/127 lr:0.000761 t:3.5s +tttg: c43/127 lr:0.000750 t:3.6s +tttg: c44/127 lr:0.000739 t:3.6s +tttg: c45/127 lr:0.000728 t:3.7s +tttg: c46/127 lr:0.000717 t:3.8s +tttg: c47/127 lr:0.000706 t:3.9s +tttg: c48/127 lr:0.000694 t:3.9s +tttg: c49/127 lr:0.000683 t:4.0s +tttg: c50/127 lr:0.000671 t:4.1s +tttg: c51/127 lr:0.000659 t:4.2s +tttg: c52/127 lr:0.000647 t:4.2s +tttg: c53/127 lr:0.000635 t:4.3s +tttg: c54/127 lr:0.000623 t:4.4s +tttg: c55/127 lr:0.000611 t:4.5s +tttg: c56/127 lr:0.000599 t:4.5s +tttg: c57/127 lr:0.000587 t:4.6s +tttg: c58/127 lr:0.000575 t:4.7s +tttg: c59/127 lr:0.000562 t:4.8s +tttg: c60/127 lr:0.000550 t:4.9s +tttg: c61/127 lr:0.000537 t:4.9s +tttg: c62/127 lr:0.000525 t:5.0s +tttg: c63/127 lr:0.000512 t:5.1s +tttg: c64/127 lr:0.000500 t:5.2s +tttg: c65/127 lr:0.000488 t:5.3s +tttg: c66/127 lr:0.000475 t:5.3s +tttg: c67/127 lr:0.000463 t:5.4s +tttg: c68/127 lr:0.000450 t:5.5s +tttg: c69/127 lr:0.000438 t:5.6s +tttg: c70/127 lr:0.000425 t:5.6s +tttg: c71/127 lr:0.000413 t:5.7s +tttg: c72/127 lr:0.000401 t:5.8s +tttg: c73/127 lr:0.000389 t:5.9s +tttg: c74/127 lr:0.000377 t:6.0s +tttg: c75/127 lr:0.000365 t:6.0s +tttg: c76/127 lr:0.000353 t:6.1s +tttg: c77/127 lr:0.000341 t:6.2s +tttg: c78/127 lr:0.000329 t:6.3s +tttg: c79/127 lr:0.000317 t:6.4s +tttg: c80/127 lr:0.000306 t:6.4s +tttg: c81/127 lr:0.000294 t:6.5s +tttg: c82/127 lr:0.000283 t:6.6s +tttg: c83/127 lr:0.000272 t:6.7s +tttg: c84/127 lr:0.000261 t:6.8s +tttg: c85/127 lr:0.000250 t:6.8s +tttg: c86/127 lr:0.000239 t:6.9s +tttg: c87/127 lr:0.000229 
t:7.0s +tttg: c88/127 lr:0.000218 t:7.1s +tttg: c89/127 lr:0.000208 t:7.2s +tttg: c90/127 lr:0.000198 t:7.2s +tttg: c91/127 lr:0.000188 t:7.3s +tttg: c92/127 lr:0.000179 t:7.4s +tttg: c93/127 lr:0.000169 t:7.5s +tttg: c94/127 lr:0.000160 t:7.5s +tttg: c95/127 lr:0.000151 t:7.6s +tttg: c96/127 lr:0.000142 t:7.7s +tttg: c97/127 lr:0.000133 t:7.8s +tttg: c98/127 lr:0.000125 t:7.8s +tttg: c99/127 lr:0.000117 t:7.9s +tttg: c100/127 lr:0.000109 t:8.0s +tttg: c101/127 lr:0.000101 t:8.1s +tttg: c102/127 lr:0.000094 t:8.2s +tttg: c103/127 lr:0.000087 t:8.2s +tttg: c104/127 lr:0.000080 t:8.3s +tttg: c105/127 lr:0.000073 t:8.4s +tttg: c106/127 lr:0.000067 t:8.5s +tttg: c107/127 lr:0.000061 t:8.5s +tttg: c108/127 lr:0.000055 t:8.6s +tttg: c109/127 lr:0.000050 t:8.7s +tttg: c110/127 lr:0.000044 t:8.8s +tttg: c111/127 lr:0.000039 t:8.9s +tttg: c112/127 lr:0.000035 t:8.9s +tttg: c113/127 lr:0.000030 t:9.0s +tttg: c114/127 lr:0.000026 t:9.1s +tttg: c115/127 lr:0.000022 t:9.2s +tttg: c116/127 lr:0.000019 t:9.2s +tttg: c117/127 lr:0.000015 t:9.3s +tttg: c118/127 lr:0.000013 t:9.4s +tttg: c119/127 lr:0.000010 t:9.5s +tttg: c120/127 lr:0.000008 t:9.6s +tttg: c121/127 lr:0.000006 t:9.6s +tttg: c122/127 lr:0.000004 t:9.7s +tttg: c123/127 lr:0.000002 t:9.8s +tttg: c124/127 lr:0.000001 t:9.9s +tttg: c125/127 lr:0.000001 t:10.0s +tttg: c126/127 lr:0.000000 t:10.0s +ttpr: phase:1/3 t:226.7s +ttp: b760/782 bl:2.4332 bb:1.0751 rl:2.3622 rb:1.0762 dl:3729-3820 gd:0 +ttpp: phase:2/3 pd:2128 gd:1666 t:299.5s +tttg: c1/214 lr:0.001000 t:0.1s +tttg: c2/214 lr:0.001000 t:0.2s +tttg: c3/214 lr:0.001000 t:0.2s +tttg: c4/214 lr:0.001000 t:0.3s +tttg: c5/214 lr:0.000999 t:0.4s +tttg: c6/214 lr:0.000999 t:0.5s +tttg: c7/214 lr:0.000998 t:0.5s +tttg: c8/214 lr:0.000997 t:0.6s +tttg: c9/214 lr:0.000997 t:0.7s +tttg: c10/214 lr:0.000996 t:0.8s +tttg: c11/214 lr:0.000995 t:0.9s +tttg: c12/214 lr:0.000993 t:0.9s +tttg: c13/214 lr:0.000992 t:1.0s +tttg: c14/214 lr:0.000991 t:1.1s +tttg: c15/214 lr:0.000989 t:1.2s +tttg: c16/214 lr:0.000988 t:1.2s +tttg: c17/214 lr:0.000986 t:1.3s +tttg: c18/214 lr:0.000984 t:1.4s +tttg: c19/214 lr:0.000982 t:1.5s +tttg: c20/214 lr:0.000980 t:1.6s +tttg: c21/214 lr:0.000978 t:1.6s +tttg: c22/214 lr:0.000976 t:1.7s +tttg: c23/214 lr:0.000974 t:1.8s +tttg: c24/214 lr:0.000972 t:1.9s +tttg: c25/214 lr:0.000969 t:1.9s +tttg: c26/214 lr:0.000966 t:2.0s +tttg: c27/214 lr:0.000964 t:2.1s +tttg: c28/214 lr:0.000961 t:2.2s +tttg: c29/214 lr:0.000958 t:2.2s +tttg: c30/214 lr:0.000955 t:2.3s +tttg: c31/214 lr:0.000952 t:2.4s +tttg: c32/214 lr:0.000949 t:2.5s +tttg: c33/214 lr:0.000945 t:2.6s +tttg: c34/214 lr:0.000942 t:2.6s +tttg: c35/214 lr:0.000938 t:2.7s +tttg: c36/214 lr:0.000935 t:2.8s +tttg: c37/214 lr:0.000931 t:2.9s +tttg: c38/214 lr:0.000927 t:2.9s +tttg: c39/214 lr:0.000924 t:3.0s +tttg: c40/214 lr:0.000920 t:3.1s +tttg: c41/214 lr:0.000915 t:3.2s +tttg: c42/214 lr:0.000911 t:3.3s +tttg: c43/214 lr:0.000907 t:3.3s +tttg: c44/214 lr:0.000903 t:3.4s +tttg: c45/214 lr:0.000898 t:3.5s +tttg: c46/214 lr:0.000894 t:3.6s +tttg: c47/214 lr:0.000889 t:3.6s +tttg: c48/214 lr:0.000885 t:3.7s +tttg: c49/214 lr:0.000880 t:3.8s +tttg: c50/214 lr:0.000875 t:3.9s +tttg: c51/214 lr:0.000870 t:3.9s +tttg: c52/214 lr:0.000865 t:4.0s +tttg: c53/214 lr:0.000860 t:4.1s +tttg: c54/214 lr:0.000855 t:4.2s +tttg: c55/214 lr:0.000850 t:4.3s +tttg: c56/214 lr:0.000844 t:4.3s +tttg: c57/214 lr:0.000839 t:4.4s +tttg: c58/214 lr:0.000833 t:4.5s +tttg: c59/214 lr:0.000828 t:4.6s +tttg: c60/214 lr:0.000822 t:4.6s +tttg: c61/214 
lr:0.000817 t:4.7s +tttg: c62/214 lr:0.000811 t:4.8s +tttg: c63/214 lr:0.000805 t:4.9s +tttg: c64/214 lr:0.000799 t:5.0s +tttg: c65/214 lr:0.000793 t:5.0s +tttg: c66/214 lr:0.000787 t:5.1s +tttg: c67/214 lr:0.000781 t:5.2s +tttg: c68/214 lr:0.000775 t:5.3s +tttg: c69/214 lr:0.000769 t:5.4s +tttg: c70/214 lr:0.000763 t:5.4s +tttg: c71/214 lr:0.000756 t:5.5s +tttg: c72/214 lr:0.000750 t:5.6s +tttg: c73/214 lr:0.000744 t:5.7s +tttg: c74/214 lr:0.000737 t:5.7s +tttg: c75/214 lr:0.000731 t:5.8s +tttg: c76/214 lr:0.000724 t:5.9s +tttg: c77/214 lr:0.000717 t:6.0s +tttg: c78/214 lr:0.000711 t:6.0s +tttg: c79/214 lr:0.000704 t:6.1s +tttg: c80/214 lr:0.000697 t:6.2s +tttg: c81/214 lr:0.000690 t:6.3s +tttg: c82/214 lr:0.000684 t:6.4s +tttg: c83/214 lr:0.000677 t:6.4s +tttg: c84/214 lr:0.000670 t:6.5s +tttg: c85/214 lr:0.000663 t:6.6s +tttg: c86/214 lr:0.000656 t:6.7s +tttg: c87/214 lr:0.000649 t:6.7s +tttg: c88/214 lr:0.000642 t:6.8s +tttg: c89/214 lr:0.000635 t:6.9s +tttg: c90/214 lr:0.000628 t:7.0s +tttg: c91/214 lr:0.000620 t:7.0s +tttg: c92/214 lr:0.000613 t:7.1s +tttg: c93/214 lr:0.000606 t:7.2s +tttg: c94/214 lr:0.000599 t:7.3s +tttg: c95/214 lr:0.000592 t:7.3s +tttg: c96/214 lr:0.000584 t:7.4s +tttg: c97/214 lr:0.000577 t:7.5s +tttg: c98/214 lr:0.000570 t:7.6s +tttg: c99/214 lr:0.000563 t:7.7s +tttg: c100/214 lr:0.000555 t:7.7s +tttg: c101/214 lr:0.000548 t:7.8s +tttg: c102/214 lr:0.000541 t:7.9s +tttg: c103/214 lr:0.000533 t:8.0s +tttg: c104/214 lr:0.000526 t:8.1s +tttg: c105/214 lr:0.000518 t:8.1s +tttg: c106/214 lr:0.000511 t:8.2s +tttg: c107/214 lr:0.000504 t:8.3s +tttg: c108/214 lr:0.000496 t:8.4s +tttg: c109/214 lr:0.000489 t:8.4s +tttg: c110/214 lr:0.000482 t:8.5s +tttg: c111/214 lr:0.000474 t:8.6s +tttg: c112/214 lr:0.000467 t:8.7s +tttg: c113/214 lr:0.000459 t:8.7s +tttg: c114/214 lr:0.000452 t:8.8s +tttg: c115/214 lr:0.000445 t:8.9s +tttg: c116/214 lr:0.000437 t:9.0s +tttg: c117/214 lr:0.000430 t:9.1s +tttg: c118/214 lr:0.000423 t:9.2s +tttg: c119/214 lr:0.000416 t:9.2s +tttg: c120/214 lr:0.000408 t:9.3s +tttg: c121/214 lr:0.000401 t:9.4s +tttg: c122/214 lr:0.000394 t:9.5s +tttg: c123/214 lr:0.000387 t:9.5s +tttg: c124/214 lr:0.000380 t:9.6s +tttg: c125/214 lr:0.000372 t:9.7s +tttg: c126/214 lr:0.000365 t:9.8s +tttg: c127/214 lr:0.000358 t:9.9s +tttg: c128/214 lr:0.000351 t:9.9s +tttg: c129/214 lr:0.000344 t:10.0s +tttg: c130/214 lr:0.000337 t:10.1s +tttg: c131/214 lr:0.000330 t:10.2s +tttg: c132/214 lr:0.000323 t:10.2s +tttg: c133/214 lr:0.000316 t:10.3s +tttg: c134/214 lr:0.000310 t:10.4s +tttg: c135/214 lr:0.000303 t:10.5s +tttg: c136/214 lr:0.000296 t:10.6s +tttg: c137/214 lr:0.000289 t:10.6s +tttg: c138/214 lr:0.000283 t:10.7s +tttg: c139/214 lr:0.000276 t:10.8s +tttg: c140/214 lr:0.000269 t:10.9s +tttg: c141/214 lr:0.000263 t:11.0s +tttg: c142/214 lr:0.000256 t:11.0s +tttg: c143/214 lr:0.000250 t:11.1s +tttg: c144/214 lr:0.000244 t:11.2s +tttg: c145/214 lr:0.000237 t:11.3s +tttg: c146/214 lr:0.000231 t:11.4s +tttg: c147/214 lr:0.000225 t:11.4s +tttg: c148/214 lr:0.000219 t:11.5s +tttg: c149/214 lr:0.000213 t:11.6s +tttg: c150/214 lr:0.000207 t:11.7s +tttg: c151/214 lr:0.000201 t:11.8s +tttg: c152/214 lr:0.000195 t:11.8s +tttg: c153/214 lr:0.000189 t:11.9s +tttg: c154/214 lr:0.000183 t:12.0s +tttg: c155/214 lr:0.000178 t:12.1s +tttg: c156/214 lr:0.000172 t:12.2s +tttg: c157/214 lr:0.000167 t:12.3s +tttg: c158/214 lr:0.000161 t:12.3s +tttg: c159/214 lr:0.000156 t:12.4s +tttg: c160/214 lr:0.000150 t:12.5s +tttg: c161/214 lr:0.000145 t:12.6s +tttg: c162/214 lr:0.000140 t:12.7s 
+tttg: c163/214 lr:0.000135 t:12.7s +tttg: c164/214 lr:0.000130 t:12.8s +tttg: c165/214 lr:0.000125 t:12.9s +tttg: c166/214 lr:0.000120 t:13.0s +tttg: c167/214 lr:0.000115 t:13.0s +tttg: c168/214 lr:0.000111 t:13.1s +tttg: c169/214 lr:0.000106 t:13.2s +tttg: c170/214 lr:0.000102 t:13.3s +tttg: c171/214 lr:0.000097 t:13.4s +tttg: c172/214 lr:0.000093 t:13.4s +tttg: c173/214 lr:0.000089 t:13.5s +tttg: c174/214 lr:0.000085 t:13.6s +tttg: c175/214 lr:0.000080 t:13.7s +tttg: c176/214 lr:0.000076 t:13.8s +tttg: c177/214 lr:0.000073 t:13.8s +tttg: c178/214 lr:0.000069 t:13.9s +tttg: c179/214 lr:0.000065 t:14.0s +tttg: c180/214 lr:0.000062 t:14.1s +tttg: c181/214 lr:0.000058 t:14.2s +tttg: c182/214 lr:0.000055 t:14.2s +tttg: c183/214 lr:0.000051 t:14.3s +tttg: c184/214 lr:0.000048 t:14.4s +tttg: c185/214 lr:0.000045 t:14.5s +tttg: c186/214 lr:0.000042 t:14.5s +tttg: c187/214 lr:0.000039 t:14.6s +tttg: c188/214 lr:0.000036 t:14.7s +tttg: c189/214 lr:0.000034 t:14.8s +tttg: c190/214 lr:0.000031 t:14.9s +tttg: c191/214 lr:0.000028 t:14.9s +tttg: c192/214 lr:0.000026 t:15.0s +tttg: c193/214 lr:0.000024 t:15.1s +tttg: c194/214 lr:0.000022 t:15.2s +tttg: c195/214 lr:0.000020 t:15.2s +tttg: c196/214 lr:0.000018 t:15.3s +tttg: c197/214 lr:0.000016 t:15.4s +tttg: c198/214 lr:0.000014 t:15.5s +tttg: c199/214 lr:0.000012 t:15.5s +tttg: c200/214 lr:0.000011 t:15.6s +tttg: c201/214 lr:0.000009 t:15.7s +tttg: c202/214 lr:0.000008 t:15.8s +tttg: c203/214 lr:0.000007 t:15.9s +tttg: c204/214 lr:0.000005 t:15.9s +tttg: c205/214 lr:0.000004 t:16.0s +tttg: c206/214 lr:0.000003 t:16.1s +tttg: c207/214 lr:0.000003 t:16.2s +tttg: c208/214 lr:0.000002 t:16.2s +tttg: c209/214 lr:0.000001 t:16.3s +tttg: c210/214 lr:0.000001 t:16.4s +tttg: c211/214 lr:0.000000 t:16.5s +tttg: c212/214 lr:0.000000 t:16.6s +tttg: c213/214 lr:0.000000 t:16.6s +ttpr: phase:2/3 t:317.8s +ttp: b748/782 bl:2.3785 bb:1.0792 rl:2.3638 rb:1.0765 dl:2918-2965 gd:0 +ttpp: phase:3/3 pd:2960 gd:2500 t:332.9s +tttg: c1/282 lr:0.001000 t:0.1s +tttg: c2/282 lr:0.001000 t:0.2s +tttg: c3/282 lr:0.001000 t:0.2s +tttg: c4/282 lr:0.001000 t:0.3s +tttg: c5/282 lr:0.001000 t:0.4s +tttg: c6/282 lr:0.000999 t:0.5s +tttg: c7/282 lr:0.000999 t:0.5s +tttg: c8/282 lr:0.000998 t:0.6s +tttg: c9/282 lr:0.000998 t:0.7s +tttg: c10/282 lr:0.000997 t:0.8s +tttg: c11/282 lr:0.000997 t:0.9s +tttg: c12/282 lr:0.000996 t:1.0s +tttg: c13/282 lr:0.000996 t:1.0s +tttg: c14/282 lr:0.000995 t:1.1s +tttg: c15/282 lr:0.000994 t:1.2s +tttg: c16/282 lr:0.000993 t:1.3s +tttg: c17/282 lr:0.000992 t:1.3s +tttg: c18/282 lr:0.000991 t:1.4s +tttg: c19/282 lr:0.000990 t:1.5s +tttg: c20/282 lr:0.000989 t:1.6s +tttg: c21/282 lr:0.000988 t:1.6s +tttg: c22/282 lr:0.000986 t:1.7s +tttg: c23/282 lr:0.000985 t:1.8s +tttg: c24/282 lr:0.000984 t:1.9s +tttg: c25/282 lr:0.000982 t:2.0s +tttg: c26/282 lr:0.000981 t:2.0s +tttg: c27/282 lr:0.000979 t:2.1s +tttg: c28/282 lr:0.000977 t:2.2s +tttg: c29/282 lr:0.000976 t:2.3s +tttg: c30/282 lr:0.000974 t:2.3s +tttg: c31/282 lr:0.000972 t:2.4s +tttg: c32/282 lr:0.000970 t:2.5s +tttg: c33/282 lr:0.000968 t:2.6s +tttg: c34/282 lr:0.000966 t:2.7s +tttg: c35/282 lr:0.000964 t:2.7s +tttg: c36/282 lr:0.000962 t:2.8s +tttg: c37/282 lr:0.000960 t:2.9s +tttg: c38/282 lr:0.000958 t:3.0s +tttg: c39/282 lr:0.000956 t:3.1s +tttg: c40/282 lr:0.000953 t:3.1s +tttg: c41/282 lr:0.000951 t:3.2s +tttg: c42/282 lr:0.000948 t:3.3s +tttg: c43/282 lr:0.000946 t:3.4s +tttg: c44/282 lr:0.000943 t:3.5s +tttg: c45/282 lr:0.000941 t:3.5s +tttg: c46/282 lr:0.000938 t:3.6s +tttg: c47/282 
lr:0.000935 t:3.7s +tttg: c48/282 lr:0.000933 t:3.8s +tttg: c49/282 lr:0.000930 t:3.8s +tttg: c50/282 lr:0.000927 t:3.9s +tttg: c51/282 lr:0.000924 t:4.0s +tttg: c52/282 lr:0.000921 t:4.1s +tttg: c53/282 lr:0.000918 t:4.2s +tttg: c54/282 lr:0.000915 t:4.2s +tttg: c55/282 lr:0.000912 t:4.3s +tttg: c56/282 lr:0.000908 t:4.4s +tttg: c57/282 lr:0.000905 t:4.5s +tttg: c58/282 lr:0.000902 t:4.5s +tttg: c59/282 lr:0.000899 t:4.6s +tttg: c60/282 lr:0.000895 t:4.7s +tttg: c61/282 lr:0.000892 t:4.8s +tttg: c62/282 lr:0.000888 t:4.9s +tttg: c63/282 lr:0.000885 t:4.9s +tttg: c64/282 lr:0.000881 t:5.0s +tttg: c65/282 lr:0.000877 t:5.1s +tttg: c66/282 lr:0.000874 t:5.2s +tttg: c67/282 lr:0.000870 t:5.3s +tttg: c68/282 lr:0.000866 t:5.3s +tttg: c69/282 lr:0.000862 t:5.4s +tttg: c70/282 lr:0.000858 t:5.5s +tttg: c71/282 lr:0.000855 t:5.6s +tttg: c72/282 lr:0.000851 t:5.7s +tttg: c73/282 lr:0.000847 t:5.7s +tttg: c74/282 lr:0.000843 t:5.8s +tttg: c75/282 lr:0.000838 t:5.9s +tttg: c76/282 lr:0.000834 t:6.0s +tttg: c77/282 lr:0.000830 t:6.1s +tttg: c78/282 lr:0.000826 t:6.1s +tttg: c79/282 lr:0.000822 t:6.2s +tttg: c80/282 lr:0.000817 t:6.3s +tttg: c81/282 lr:0.000813 t:6.4s +tttg: c82/282 lr:0.000809 t:6.4s +tttg: c83/282 lr:0.000804 t:6.5s +tttg: c84/282 lr:0.000800 t:6.6s +tttg: c85/282 lr:0.000795 t:6.7s +tttg: c86/282 lr:0.000791 t:6.8s +tttg: c87/282 lr:0.000786 t:6.8s +tttg: c88/282 lr:0.000782 t:6.9s +tttg: c89/282 lr:0.000777 t:7.0s +tttg: c90/282 lr:0.000772 t:7.1s +tttg: c91/282 lr:0.000768 t:7.1s +tttg: c92/282 lr:0.000763 t:7.2s +tttg: c93/282 lr:0.000758 t:7.3s +tttg: c94/282 lr:0.000753 t:7.4s +tttg: c95/282 lr:0.000748 t:7.5s +tttg: c96/282 lr:0.000744 t:7.5s +tttg: c97/282 lr:0.000739 t:7.6s +tttg: c98/282 lr:0.000734 t:7.7s +tttg: c99/282 lr:0.000729 t:7.8s +tttg: c100/282 lr:0.000724 t:7.9s +tttg: c101/282 lr:0.000719 t:7.9s +tttg: c102/282 lr:0.000714 t:8.0s +tttg: c103/282 lr:0.000709 t:8.1s +tttg: c104/282 lr:0.000704 t:8.2s +tttg: c105/282 lr:0.000698 t:8.2s +tttg: c106/282 lr:0.000693 t:8.3s +tttg: c107/282 lr:0.000688 t:8.4s +tttg: c108/282 lr:0.000683 t:8.5s +tttg: c109/282 lr:0.000678 t:8.6s +tttg: c110/282 lr:0.000672 t:8.6s +tttg: c111/282 lr:0.000667 t:8.7s +tttg: c112/282 lr:0.000662 t:8.8s +tttg: c113/282 lr:0.000657 t:8.9s +tttg: c114/282 lr:0.000651 t:8.9s +tttg: c115/282 lr:0.000646 t:9.0s +tttg: c116/282 lr:0.000641 t:9.1s +tttg: c117/282 lr:0.000635 t:9.2s +tttg: c118/282 lr:0.000630 t:9.3s +tttg: c119/282 lr:0.000624 t:9.3s +tttg: c120/282 lr:0.000619 t:9.4s +tttg: c121/282 lr:0.000614 t:9.5s +tttg: c122/282 lr:0.000608 t:9.6s +tttg: c123/282 lr:0.000603 t:9.6s +tttg: c124/282 lr:0.000597 t:9.7s +tttg: c125/282 lr:0.000592 t:9.8s +tttg: c126/282 lr:0.000586 t:9.9s +tttg: c127/282 lr:0.000581 t:9.9s +tttg: c128/282 lr:0.000575 t:10.0s +tttg: c129/282 lr:0.000570 t:10.1s +tttg: c130/282 lr:0.000564 t:10.2s +tttg: c131/282 lr:0.000559 t:10.3s +tttg: c132/282 lr:0.000553 t:10.3s +tttg: c133/282 lr:0.000547 t:10.4s +tttg: c134/282 lr:0.000542 t:10.5s +tttg: c135/282 lr:0.000536 t:10.6s +tttg: c136/282 lr:0.000531 t:10.6s +tttg: c137/282 lr:0.000525 t:10.7s +tttg: c138/282 lr:0.000520 t:10.8s +tttg: c139/282 lr:0.000514 t:10.9s +tttg: c140/282 lr:0.000508 t:11.0s +tttg: c141/282 lr:0.000503 t:11.0s +tttg: c142/282 lr:0.000497 t:11.1s +tttg: c143/282 lr:0.000492 t:11.2s +tttg: c144/282 lr:0.000486 t:11.3s +tttg: c145/282 lr:0.000480 t:11.4s +tttg: c146/282 lr:0.000475 t:11.4s +tttg: c147/282 lr:0.000469 t:11.5s +tttg: c148/282 lr:0.000464 t:11.6s +tttg: c149/282 lr:0.000458 
t:11.7s +tttg: c150/282 lr:0.000453 t:11.7s +tttg: c151/282 lr:0.000447 t:11.8s +tttg: c152/282 lr:0.000441 t:11.9s +tttg: c153/282 lr:0.000436 t:12.0s +tttg: c154/282 lr:0.000430 t:12.0s +tttg: c155/282 lr:0.000425 t:12.1s +tttg: c156/282 lr:0.000419 t:12.2s +tttg: c157/282 lr:0.000414 t:12.3s +tttg: c158/282 lr:0.000408 t:12.4s +tttg: c159/282 lr:0.000403 t:12.4s +tttg: c160/282 lr:0.000397 t:12.5s +tttg: c161/282 lr:0.000392 t:12.6s +tttg: c162/282 lr:0.000386 t:12.7s +tttg: c163/282 lr:0.000381 t:12.8s +tttg: c164/282 lr:0.000376 t:12.8s +tttg: c165/282 lr:0.000370 t:12.9s +tttg: c166/282 lr:0.000365 t:13.0s +tttg: c167/282 lr:0.000359 t:13.1s +tttg: c168/282 lr:0.000354 t:13.2s +tttg: c169/282 lr:0.000349 t:13.2s +tttg: c170/282 lr:0.000343 t:13.3s +tttg: c171/282 lr:0.000338 t:13.4s +tttg: c172/282 lr:0.000333 t:13.5s +tttg: c173/282 lr:0.000328 t:13.6s +tttg: c174/282 lr:0.000322 t:13.6s +tttg: c175/282 lr:0.000317 t:13.7s +tttg: c176/282 lr:0.000312 t:13.8s +tttg: c177/282 lr:0.000307 t:13.9s +tttg: c178/282 lr:0.000302 t:13.9s +tttg: c179/282 lr:0.000296 t:14.0s +tttg: c180/282 lr:0.000291 t:14.1s +tttg: c181/282 lr:0.000286 t:14.2s +tttg: c182/282 lr:0.000281 t:14.3s +tttg: c183/282 lr:0.000276 t:14.3s +tttg: c184/282 lr:0.000271 t:14.4s +tttg: c185/282 lr:0.000266 t:14.5s +tttg: c186/282 lr:0.000261 t:14.6s +tttg: c187/282 lr:0.000256 t:14.7s +tttg: c188/282 lr:0.000252 t:14.7s +tttg: c189/282 lr:0.000247 t:14.8s +tttg: c190/282 lr:0.000242 t:14.9s +tttg: c191/282 lr:0.000237 t:15.0s +tttg: c192/282 lr:0.000232 t:15.1s +tttg: c193/282 lr:0.000228 t:15.2s +tttg: c194/282 lr:0.000223 t:15.2s +tttg: c195/282 lr:0.000218 t:15.3s +tttg: c196/282 lr:0.000214 t:15.4s +tttg: c197/282 lr:0.000209 t:15.5s +tttg: c198/282 lr:0.000205 t:15.5s +tttg: c199/282 lr:0.000200 t:15.6s +tttg: c200/282 lr:0.000196 t:15.7s +tttg: c201/282 lr:0.000191 t:15.8s +tttg: c202/282 lr:0.000187 t:15.9s +tttg: c203/282 lr:0.000183 t:15.9s +tttg: c204/282 lr:0.000178 t:16.0s +tttg: c205/282 lr:0.000174 t:16.1s +tttg: c206/282 lr:0.000170 t:16.2s +tttg: c207/282 lr:0.000166 t:16.2s +tttg: c208/282 lr:0.000162 t:16.3s +tttg: c209/282 lr:0.000157 t:16.4s +tttg: c210/282 lr:0.000153 t:16.5s +tttg: c211/282 lr:0.000149 t:16.5s +tttg: c212/282 lr:0.000145 t:16.6s +tttg: c213/282 lr:0.000142 t:16.7s +tttg: c214/282 lr:0.000138 t:16.8s +tttg: c215/282 lr:0.000134 t:16.9s +tttg: c216/282 lr:0.000130 t:16.9s +tttg: c217/282 lr:0.000126 t:17.0s +tttg: c218/282 lr:0.000123 t:17.1s +tttg: c219/282 lr:0.000119 t:17.2s +tttg: c220/282 lr:0.000115 t:17.3s +tttg: c221/282 lr:0.000112 t:17.3s +tttg: c222/282 lr:0.000108 t:17.4s +tttg: c223/282 lr:0.000105 t:17.5s +tttg: c224/282 lr:0.000101 t:17.6s +tttg: c225/282 lr:0.000098 t:17.6s +tttg: c226/282 lr:0.000095 t:17.7s +tttg: c227/282 lr:0.000092 t:17.8s +tttg: c228/282 lr:0.000088 t:17.9s +tttg: c229/282 lr:0.000085 t:18.0s +tttg: c230/282 lr:0.000082 t:18.0s +tttg: c231/282 lr:0.000079 t:18.1s +tttg: c232/282 lr:0.000076 t:18.2s +tttg: c233/282 lr:0.000073 t:18.3s +tttg: c234/282 lr:0.000070 t:18.3s +tttg: c235/282 lr:0.000067 t:18.4s +tttg: c236/282 lr:0.000065 t:18.5s +tttg: c237/282 lr:0.000062 t:18.6s +tttg: c238/282 lr:0.000059 t:18.7s +tttg: c239/282 lr:0.000057 t:18.8s +tttg: c240/282 lr:0.000054 t:18.8s +tttg: c241/282 lr:0.000052 t:18.9s +tttg: c242/282 lr:0.000049 t:19.0s +tttg: c243/282 lr:0.000047 t:19.1s +tttg: c244/282 lr:0.000044 t:19.2s +tttg: c245/282 lr:0.000042 t:19.2s +tttg: c246/282 lr:0.000040 t:19.3s +tttg: c247/282 lr:0.000038 t:19.4s +tttg: c248/282 
lr:0.000036 t:19.5s +tttg: c249/282 lr:0.000034 t:19.5s +tttg: c250/282 lr:0.000032 t:19.6s +tttg: c251/282 lr:0.000030 t:19.7s +tttg: c252/282 lr:0.000028 t:19.8s +tttg: c253/282 lr:0.000026 t:19.8s +tttg: c254/282 lr:0.000024 t:19.9s +tttg: c255/282 lr:0.000023 t:20.0s +tttg: c256/282 lr:0.000021 t:20.1s +tttg: c257/282 lr:0.000019 t:20.2s +tttg: c258/282 lr:0.000018 t:20.2s +tttg: c259/282 lr:0.000016 t:20.3s +tttg: c260/282 lr:0.000015 t:20.4s +tttg: c261/282 lr:0.000014 t:20.5s +tttg: c262/282 lr:0.000012 t:20.6s +tttg: c263/282 lr:0.000011 t:20.6s +tttg: c264/282 lr:0.000010 t:20.7s +tttg: c265/282 lr:0.000009 t:20.8s +tttg: c266/282 lr:0.000008 t:20.9s +tttg: c267/282 lr:0.000007 t:20.9s +tttg: c268/282 lr:0.000006 t:21.0s +tttg: c269/282 lr:0.000005 t:21.1s +tttg: c270/282 lr:0.000004 t:21.2s +tttg: c271/282 lr:0.000004 t:21.3s +tttg: c272/282 lr:0.000003 t:21.3s +tttg: c273/282 lr:0.000003 t:21.4s +tttg: c274/282 lr:0.000002 t:21.5s +tttg: c275/282 lr:0.000002 t:21.6s +tttg: c276/282 lr:0.000001 t:21.6s +tttg: c277/282 lr:0.000001 t:21.7s +tttg: c278/282 lr:0.000000 t:21.8s +tttg: c279/282 lr:0.000000 t:21.9s +tttg: c280/282 lr:0.000000 t:22.0s +tttg: c281/282 lr:0.000000 t:22.0s +ttpr: phase:3/3 t:356.6s +ttp: b733/782 bl:2.4555 bb:1.0978 rl:2.3707 rb:1.0782 dl:2381-2404 gd:1 +ttp: b723/782 bl:2.3518 bb:1.0358 rl:2.3695 rb:1.0754 dl:2131-2151 gd:1 +ttp: b715/782 bl:2.3534 bb:1.0181 rl:2.3686 rb:1.0721 dl:1986-2003 gd:1 +ttp: b711/782 bl:2.3166 bb:1.0400 rl:2.3660 rb:1.0705 dl:1919-1933 gd:1 +ttp: b698/782 bl:2.4171 bb:1.0482 rl:2.3683 rb:1.0694 dl:1759-1768 gd:1 +ttp: b694/782 bl:2.3753 bb:1.0562 rl:2.3686 rb:1.0689 dl:1713-1724 gd:1 +ttp: b686/782 bl:2.3318 bb:1.0256 rl:2.3672 rb:1.0672 dl:1635-1645 gd:1 +ttp: b678/782 bl:2.3622 bb:1.0406 rl:2.3670 rb:1.0662 dl:1562-1569 gd:1 +ttp: b669/782 bl:2.3502 bb:1.0695 rl:2.3664 rb:1.0663 dl:1493-1499 gd:1 +ttp: b658/782 bl:2.3200 bb:1.0266 rl:2.3651 rb:1.0651 dl:1416-1423 gd:1 +ttp: b654/782 bl:2.3547 bb:1.0249 rl:2.3648 rb:1.0639 dl:1391-1398 gd:1 +ttp: b642/782 bl:2.3074 bb:1.0114 rl:2.3632 rb:1.0625 dl:1316-1322 gd:1 +ttp: b637/782 bl:2.3874 bb:1.0560 rl:2.3639 rb:1.0624 dl:1288-1293 gd:1 +ttp: b631/782 bl:2.4539 bb:1.0560 rl:2.3660 rb:1.0622 dl:1254-1260 gd:1 +ttp: b620/782 bl:2.4048 bb:1.0350 rl:2.3669 rb:1.0616 dl:1197-1202 gd:1 +ttp: b609/782 bl:2.3860 bb:1.0382 rl:2.3673 rb:1.0611 dl:1144-1149 gd:1 +ttp: b606/782 bl:2.3849 bb:1.0573 rl:2.3676 rb:1.0610 dl:1130-1135 gd:1 +ttp: b598/782 bl:2.3732 bb:1.0427 rl:2.3677 rb:1.0606 dl:1096-1101 gd:1 +ttp: b591/782 bl:2.3522 bb:1.0500 rl:2.3674 rb:1.0604 dl:1067-1071 gd:1 +ttp: b583/782 bl:2.4389 bb:1.0811 rl:2.3687 rb:1.0608 dl:1034-1038 gd:1 +ttp: b575/782 bl:2.4593 bb:1.0686 rl:2.3702 rb:1.0609 dl:1004-1008 gd:1 +ttp: b567/782 bl:2.3966 bb:1.0382 rl:2.3706 rb:1.0606 dl:975-978 gd:1 +ttp: b559/782 bl:2.3509 bb:1.0177 rl:2.3703 rb:1.0599 dl:949-952 gd:1 +ttp: b550/782 bl:2.3334 bb:1.0075 rl:2.3698 rb:1.0591 dl:920-923 gd:1 +ttp: b542/782 bl:2.5119 bb:1.0754 rl:2.3718 rb:1.0593 dl:896-899 gd:1 +ttp: b534/782 bl:2.4135 bb:1.0470 rl:2.3723 rb:1.0592 dl:871-874 gd:1 +ttp: b525/782 bl:2.4094 bb:1.0214 rl:2.3728 rb:1.0587 dl:845-848 gd:1 +ttp: b517/782 bl:2.3901 bb:1.0439 rl:2.3730 rb:1.0585 dl:823-825 gd:1 +ttp: b509/782 bl:2.3891 bb:1.0262 rl:2.3732 rb:1.0581 dl:800-803 gd:1 +ttp: b501/782 bl:2.3987 bb:1.0461 rl:2.3735 rb:1.0579 dl:780-782 gd:1 +ttp: b493/782 bl:2.4034 bb:1.0714 rl:2.3738 rb:1.0581 dl:759-762 gd:1 +ttp: b485/782 bl:2.4147 bb:1.0261 rl:2.3742 rb:1.0577 dl:741-743 gd:1 +ttp: 
b477/782 bl:2.4652 bb:1.0819 rl:2.3752 rb:1.0580 dl:721-724 gd:1 +ttp: b466/782 bl:2.3686 bb:1.0123 rl:2.3751 rb:1.0575 dl:697-699 gd:1 +ttp: b459/782 bl:2.3530 bb:1.0671 rl:2.3749 rb:1.0576 dl:682-684 gd:1 +ttp: b451/782 bl:2.3974 bb:1.0180 rl:2.3751 rb:1.0572 dl:666-668 gd:1 +ttp: b443/782 bl:2.2884 bb:1.0484 rl:2.3743 rb:1.0572 dl:649-651 gd:1 +ttp: b435/782 bl:2.4515 bb:1.0527 rl:2.3750 rb:1.0571 dl:634-636 gd:1 +ttp: b427/782 bl:2.4081 bb:1.0795 rl:2.3753 rb:1.0573 dl:619-621 gd:1 +ttp: b420/782 bl:2.3432 bb:1.0560 rl:2.3750 rb:1.0573 dl:605-607 gd:1 +ttp: b413/782 bl:2.3314 bb:1.0356 rl:2.3747 rb:1.0571 dl:592-594 gd:1 +ttp: b405/782 bl:2.4287 bb:1.0347 rl:2.3751 rb:1.0569 dl:577-579 gd:1 +ttp: b394/782 bl:2.3875 bb:1.0615 rl:2.3752 rb:1.0570 dl:557-559 gd:1 +ttp: b386/782 bl:2.4462 bb:1.0988 rl:2.3757 rb:1.0573 dl:544-546 gd:1 +ttp: b378/782 bl:2.3825 bb:1.0568 rl:2.3757 rb:1.0573 dl:530-532 gd:1 +ttp: b370/782 bl:2.4102 bb:1.0855 rl:2.3759 rb:1.0574 dl:518-519 gd:1 +ttp: b362/782 bl:2.3756 bb:1.0426 rl:2.3759 rb:1.0574 dl:504-506 gd:1 +ttp: b354/782 bl:2.3781 bb:1.0753 rl:2.3760 rb:1.0575 dl:491-492 gd:1 +ttp: b346/782 bl:2.4672 bb:1.1064 rl:2.3765 rb:1.0578 dl:479-480 gd:1 +ttp: b339/782 bl:2.3830 bb:1.0701 rl:2.3765 rb:1.0578 dl:468-470 gd:1 +ttp: b331/782 bl:2.3737 bb:1.0719 rl:2.3765 rb:1.0579 dl:457-458 gd:1 +ttp: b323/782 bl:2.3515 bb:1.0512 rl:2.3764 rb:1.0579 dl:445-447 gd:1 +ttp: b315/782 bl:2.3419 bb:1.0734 rl:2.3762 rb:1.0579 dl:433-434 gd:1 +ttp: b306/782 bl:2.4525 bb:1.0835 rl:2.3766 rb:1.0581 dl:420-421 gd:1 +ttp: b298/782 bl:2.4084 bb:1.0725 rl:2.3768 rb:1.0581 dl:408-410 gd:1 +ttp: b290/782 bl:2.3155 bb:1.0368 rl:2.3765 rb:1.0580 dl:397-398 gd:1 +ttp: b282/782 bl:2.3759 bb:1.1254 rl:2.3765 rb:1.0583 dl:385-387 gd:1 +ttp: b274/782 bl:2.4577 bb:1.1090 rl:2.3768 rb:1.0586 dl:374-376 gd:1 +ttp: b266/782 bl:2.4068 bb:1.1071 rl:2.3770 rb:1.0588 dl:365-366 gd:1 +ttp: b258/782 bl:2.4452 bb:1.1071 rl:2.3772 rb:1.0590 dl:355-356 gd:1 +ttp: b250/782 bl:2.3816 bb:1.1187 rl:2.3773 rb:1.0592 dl:345-346 gd:1 +ttp: b242/782 bl:2.4302 bb:1.0800 rl:2.3775 rb:1.0593 dl:335-337 gd:1 +ttp: b236/782 bl:2.3496 bb:1.1089 rl:2.3774 rb:1.0595 dl:328-329 gd:1 +ttp: b228/782 bl:2.4640 bb:1.0934 rl:2.3777 rb:1.0596 dl:319-320 gd:1 +ttp: b219/782 bl:2.4730 bb:1.1188 rl:2.3780 rb:1.0598 dl:309-310 gd:1 +ttp: b210/782 bl:2.3508 bb:1.0801 rl:2.3779 rb:1.0599 dl:299-300 gd:1 +ttp: b202/782 bl:2.4969 bb:1.1559 rl:2.3783 rb:1.0602 dl:291-292 gd:1 +ttp: b194/782 bl:2.4524 bb:1.1212 rl:2.3786 rb:1.0604 dl:282-283 gd:1 +ttp: b185/782 bl:2.4417 bb:1.1018 rl:2.3788 rb:1.0605 dl:272-274 gd:1 +ttp: b177/782 bl:2.4267 bb:1.1060 rl:2.3789 rb:1.0606 dl:265-266 gd:1 +ttp: b169/782 bl:2.3785 bb:1.1080 rl:2.3789 rb:1.0608 dl:257-258 gd:1 +ttp: b158/782 bl:2.4514 bb:1.1558 rl:2.3791 rb:1.0610 dl:247-248 gd:1 +ttp: b150/782 bl:2.4520 bb:1.1813 rl:2.3793 rb:1.0613 dl:239-240 gd:1 +ttp: b144/782 bl:2.5058 bb:1.1494 rl:2.3796 rb:1.0615 dl:233-234 gd:1 +ttp: b138/782 bl:2.4207 bb:1.1269 rl:2.3797 rb:1.0617 dl:228-229 gd:1 +ttp: b133/782 bl:2.4639 bb:1.1348 rl:2.3800 rb:1.0619 dl:224-224 gd:1 +ttp: b125/782 bl:2.5713 bb:1.1825 rl:2.3804 rb:1.0622 dl:216-217 gd:1 +ttp: b118/782 bl:2.5606 bb:1.1785 rl:2.3808 rb:1.0624 dl:210-211 gd:1 +ttp: b112/782 bl:2.4993 bb:1.1528 rl:2.3811 rb:1.0626 dl:205-205 gd:1 +ttp: b102/782 bl:2.5192 bb:1.1670 rl:2.3814 rb:1.0628 dl:196-197 gd:1 +ttp: b95/782 bl:2.5801 bb:1.2407 rl:2.3818 rb:1.0632 dl:190-191 gd:1 +ttp: b88/782 bl:2.6049 bb:1.2197 rl:2.3823 rb:1.0635 dl:183-184 gd:1 +ttp: 
b84/782 bl:2.5083 bb:1.1365 rl:2.3825 rb:1.0636 dl:180-181 gd:1 +ttp: b77/782 bl:2.4988 bb:1.1836 rl:2.3828 rb:1.0639 dl:174-174 gd:1 +ttp: b68/782 bl:2.5991 bb:1.1938 rl:2.3832 rb:1.0641 dl:167-167 gd:1 +ttp: b58/782 bl:2.4206 bb:1.1558 rl:2.3832 rb:1.0642 dl:157-158 gd:1 +ttp: b51/782 bl:2.5765 bb:1.2242 rl:2.3835 rb:1.0645 dl:151-151 gd:1 +ttp: b42/782 bl:2.5228 bb:1.1589 rl:2.3838 rb:1.0646 dl:142-143 gd:1 +ttp: b33/782 bl:2.6558 bb:1.2583 rl:2.3841 rb:1.0649 dl:133-134 gd:1 +ttp: b26/782 bl:2.6243 bb:1.2319 rl:2.3845 rb:1.0651 dl:126-127 gd:1 +ttp: b19/782 bl:2.7089 bb:1.2279 rl:2.3849 rb:1.0653 dl:118-119 gd:1 +ttp: b12/782 bl:2.5799 bb:1.2051 rl:2.3851 rb:1.0655 dl:108-110 gd:1 +ttp: b4/782 bl:2.6868 bb:1.1963 rl:2.3854 rb:1.0656 dl:91-94 gd:1 +quantized_ttt_phased val_loss:2.38207881 val_bpb:1.06204667 eval_time:446734ms +total_eval_time:446.7s
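As a quick consistency check on the byte-level accounting, the final `quantized_ttt_phased` pair above pins down the tokenizer's average bytes per token: bpb is nats-per-token converted to bits and divided by bytes-per-token, so bytes_per_token = val_loss / (ln 2 · val_bpb). A minimal sketch using only the two figures from the log (the ≈3.24 B/token it prints is derived from these numbers, not an independently measured property of the SP10240 tokenizer):

```python
import math

# Final figures from the quantized_ttt_phased line above.
val_loss = 2.38207881  # mean cross-entropy, nats per token
val_bpb = 1.06204667   # bits per byte

# bits/token = val_loss / ln 2; dividing by bpb (bits/byte) gives bytes/token.
bytes_per_token = val_loss / (math.log(2) * val_bpb)
print(f"implied avg bytes/token: {bytes_per_token:.3f}")  # -> 3.236
```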