multiscreen-pytorch is a Windows-native PyTorch reference implementation of a
Multiscreen language model in the style of the paper "Screening Is Enough".
This repository is intentionally scoped as a correctness-first reproduction scaffold:
- explicit `torch.nn.Module` model code
- reproducible training and evaluation artifacts
- a matched Transformer baseline
- deterministic ABCDigits-style retrieval evaluation
- long-context perplexity and latency scaffolding
uv python install 3.12.9
uv venv --python 3.12.9
uv sync --extra dev --extra hf --extra hf_qwen
uv run python scripts\env_check.py
uv run python scripts\train_multiscreen.py --steps 12 --output-dir artifacts\smoke_multiscreen
uv run python scripts\train_transformer_baseline.py --steps 12 --output-dir artifacts\smoke_transformer
uv run python scripts\eval_abcdigits.py --checkpoint artifacts\smoke_multiscreen\checkpoint.pt
uv run python scripts\eval_long_context.py --checkpoint artifacts\smoke_multiscreen\checkpoint.pt
uv run python scripts\benchmark_latency.py --checkpoint artifacts\smoke_multiscreen\checkpoint.pt
uv run python scripts\sweep_learning_rate.py --model-kind multiscreen
uv run python scripts\experiment_hf_qwen.py --mode inspect --model-path "H:\Qwen3.5-9B-official-hf"

- `multiscreen/config.py`: explicit configs
- `multiscreen/layers.py`: Screening unit, gating tile, TanhNorm
- `multiscreen/model.py`: Multiscreen LM and Transformer baseline
- `multiscreen/train.py`: deterministic training utilities
- `multiscreen/metrics.py`: perplexity, retrieval, and latency summaries
- `scripts/`: Windows-safe CLI entrypoints
- `scripts/experiment_hf_qwen.py`: local HF Qwen3.5 architecture inspection and inference experiments
- `tests/`: math, model, and smoke validation
- The reference path is pure PyTorch.
- Hugging Face integration is used only where it reduces glue.
- The code favors explicitness over performance shortcuts.
- For local Qwen3.5 experiments, this repo can inspect and optionally load `Qwen3_5ForConditionalGeneration` weights from a local HF directory.
Inspect the local model and tokenizer:
uv run python scripts\experiment_hf_qwen.py --mode inspect --model-path "H:\Qwen3.5-9B-official-hf"

Run text generation when a suitable CUDA torch build is available:
uv run python scripts\experiment_hf_qwen.py `
--mode generate `
--model-path "H:\Qwen3.5-9B-official-hf" `
--device auto `
--weight-load 4bit `
--prompt "Explain the difference between linear attention and screening attention."

Benchmark prompt throughput:
uv run python scripts\experiment_hf_qwen.py `
--mode benchmark `
--model-path "H:\Qwen3.5-9B-official-hf" `
--device auto `
--weight-load 4bit `
--prompt "Summarize the architecture in three bullets."