Mini-DLSS (DLSS-inspired, not NVIDIA DLSS): temporal video super-resolution trained on public VSR datasets, evaluated with standard SR metrics (PSNR/SSIM) plus temporal stability checks, and deployed via ONNX for reproducible inference.
- Task: temporal video super-resolution.
- Upscale factor: `2x` (fixed for all experiments).
- Focus: ML pipeline, model training, benchmarking, and deployment.
- Out of scope (for now): game-engine integration and custom rendering pipelines.
- Train: Vimeo-90K Septuplets.
- Eval/demo: REDS validation split (scene-disjoint from training data).
- Split rule: no overlapping scenes between train/val/test subsets.
- Local default setup in this repo uses merged Kaggle subsets (`vimeo-90k-3` + `vimeo-90k-4`) via:
  - sequence root: `data/raw/vimeo90k_union/sequence`
  - manifests: `data/splits/vimeo90k_union_{train,val,test}.txt`
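The no-overlap split rule can be spot-checked directly against the frozen manifests. A minimal sketch, assuming each manifest file lists one sequence ID per line:

```python
from pathlib import Path

def load_manifest(path):
    """Read one sequence ID per line, ignoring blank lines."""
    return {line.strip() for line in Path(path).read_text().splitlines() if line.strip()}

def check_disjoint(manifests):
    """Raise if any sequence appears in more than one split manifest."""
    seen = {}
    for split_name, path in manifests.items():
        for seq in load_manifest(path):
            if seq in seen:
                raise ValueError(f"{seq} appears in both {seen[seq]} and {split_name}")
            seen[seq] = split_name
    return len(seen)

if __name__ == "__main__":
    total = check_disjoint({
        s: f"data/splits/vimeo90k_union_{s}.txt" for s in ("train", "val", "test")
    })
    print(f"OK: {total} sequences, all splits disjoint")
```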
- Baseline A: bicubic upscaling.
- Baseline B: single-image SR applied frame-by-frame.
- Ours: BasicVSR-style temporal model.
- Training path:
  - Stage 1: `BasicVSR-Small` for pipeline validation and quick cycles.
  - Stage 2: scaled model for quality-focused runs.
- Optional extension: temporal consistency loss and/or flow-assisted alignment.
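Baseline A is the simplest of the three. A per-frame sketch using Pillow (Pillow is an assumption here; any bicubic resizer gives an equivalent baseline):

```python
import numpy as np
from PIL import Image

def bicubic_upscale(lr_frame: np.ndarray, scale: int = 2) -> np.ndarray:
    """Baseline A: bicubic upscaling of one LR frame (H, W, 3 uint8)."""
    h, w = lr_frame.shape[:2]
    img = Image.fromarray(lr_frame)
    return np.asarray(img.resize((w * scale, h * scale), Image.BICUBIC))
```

Applied frame-by-frame over a clip, this also doubles as the degenerate "no temporal information" reference for the stability metric.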
- Primary quality metrics: PSNR and SSIM.
- Primary reporting domain: Y channel in YCbCr.
- Secondary sanity report: RGB PSNR/SSIM (optional).
- Border crop before PSNR/SSIM: crop `scale` pixels on each side (2 pixels for 2x).
- Temporal stability metric:
  - Preferred: flow-warped temporal PSNR (`tPSNR`) between consecutive outputs.
  - Fallback: frame-to-frame difference energy in low-texture regions.
- Report set:
- REDS validation aggregate metrics.
- Curated qualitative clips (side-by-side LR/Bicubic/Single-frame/Temporal/GT).
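The Y-channel protocol above can be sketched in NumPy. Assumptions: BT.601 luma coefficients, images in [0, 255], and `border` equal to the scale factor; the fallback stability function below omits the low-texture masking for brevity:

```python
import numpy as np

def rgb_to_y(img: np.ndarray) -> np.ndarray:
    """BT.601 luma from an RGB image in [0, 255]."""
    return 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]

def psnr_y(sr: np.ndarray, hr: np.ndarray, border: int = 2) -> float:
    """PSNR on the Y channel after cropping `border` pixels on each side."""
    y_sr = rgb_to_y(sr.astype(np.float64))[border:-border, border:-border]
    y_hr = rgb_to_y(hr.astype(np.float64))[border:-border, border:-border]
    mse = np.mean((y_sr - y_hr) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def frame_diff_energy(frames: np.ndarray) -> float:
    """Fallback stability metric: mean squared difference between consecutive frames."""
    diffs = np.diff(frames.astype(np.float64), axis=0)
    return float(np.mean(diffs ** 2))
```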
- Runtime target: Google Colab GPU for training.
- All runs resumable with periodic checkpoints.
- Validation runs at fixed interval during training.
- Two experiment tiers:
- Fast cycle: short training budget for iteration/debugging.
- Full cycle: long budget for final reporting.
- Seeds/configs/checkpoints tracked per run for reproducibility.
- Temporal model improves REDS-val PSNR over single-frame SR by at least `+0.2 dB` (target range `+0.2` to `+0.5 dB`).
- Temporal stability metric improves over single-frame SR, or does not regress.
- ONNX export succeeds and runs on a fixed 10-second clip with reported latency (`ms/frame`) on the local machine.
- Repo includes reproducible train/eval/export/demo commands and final results artifacts.
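The first two criteria reduce to a simple gate that can run at the end of evaluation. A sketch with illustrative names, assuming the diff-energy stability convention where lower is better (invert the comparison if reporting `tPSNR`):

```python
def passes_success_gate(psnr_temporal: float, psnr_single: float,
                        stab_temporal: float, stab_single: float,
                        min_psnr_gain: float = 0.2) -> bool:
    """True when the temporal model beats single-frame SR by >= min_psnr_gain dB
    and its stability metric does not regress (lower diff energy is better)."""
    return (psnr_temporal - psnr_single >= min_psnr_gain
            and stab_temporal <= stab_single)
```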
- Week 1: lock protocol, metrics, splits, and run tracker format.
- Week 2: data ingestion, alignment checks, window sampling checks.
- Week 3: baseline training/eval and auto-generated markdown results table.
- Week 4: BasicVSR-Small training and validation loop stabilization.
- Week 5: scaled model training and reproducible best-checkpoint eval.
- Week 6: ablations (3/5/7 frames, loss variants, tiny speed model) and latency chart.
- Week 7: ONNX export + mp4 demo CLI.
- Week 8: final report page with metrics, videos, failure cases, and tradeoff analysis.
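Week 2's window-sampling check can be exercised with a small helper. A sketch for center-aligned windows over a clip, assuming an odd window size (e.g. 5-frame windows over a 7-frame Vimeo septuplet):

```python
def temporal_windows(num_total: int, num_frames: int = 5) -> list[list[int]]:
    """All center-aligned windows of `num_frames` consecutive frame indices."""
    if num_frames > num_total:
        return []
    radius = num_frames // 2
    return [list(range(c - radius, c + radius + 1))
            for c in range(radius, num_total - radius)]
```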
Planned experiment runs:

- `bicubic`
- `single_frame_sr`
- `temporal_vsr_3f`
- `temporal_vsr_5f`
- `temporal_vsr_7f`
- `temporal_vsr_5f_l1_perceptual`
- `temporal_vsr_5f_tiny` (speed-oriented)
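Each run above gets a row in the run tracker. A sketch that seeds empty markdown rows (the column names are illustrative, not the tracker template's actual schema):

```python
RUNS = [
    "bicubic", "single_frame_sr", "temporal_vsr_3f", "temporal_vsr_5f",
    "temporal_vsr_7f", "temporal_vsr_5f_l1_perceptual", "temporal_vsr_5f_tiny",
]

def tracker_rows(runs: list[str]) -> str:
    """Markdown table skeleton: one empty metrics row per experiment run."""
    header = "| run | seed | config | PSNR-Y | SSIM-Y | tPSNR | ms/frame |\n"
    header += "|---|---|---|---|---|---|---|\n"
    return header + "".join(f"| {r} | | | | | | |\n" for r in runs)
```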
Repository layout:

```
mini-dlss/
  configs/
  data/
    scripts/
    splits/
  models/
  metrics/
  notebooks/
  results/
    tables/
    videos/
  train.py
  eval.py
  export_onnx.py
  demo_video.py
  README.md
```
Week 2 dataset check (clip counts + LR/HR alignment + temporal windows):
```bash
python data/scripts/freeze_vimeo_splits.py --vimeo-root /path/to/vimeo90k

python data/scripts/freeze_vimeo_union_splits.py \
  --vimeo-a data/raw/kagglehub/datasets/wangsally/vimeo-90k-3/versions/1 \
  --vimeo-b data/raw/kagglehub/datasets/wangsally/vimeo-90k-4/versions/1

python data/scripts/build_vimeo_union_root.py \
  --seq-a data/raw/kagglehub/datasets/wangsally/vimeo-90k-3/versions/1/sequence \
  --seq-b data/raw/kagglehub/datasets/wangsally/vimeo-90k-4/versions/1/sequence

python data/scripts/check_dataset.py \
  --hr-root /path/to/reds/hr \
  --lr-root /path/to/reds/lr \
  --manifest data/splits/reds_val.txt \
  --num-frames 5 \
  --scale 2
```

If REDS is not available yet, build a local REDS-style val set from Vimeo for pipeline validation:

```bash
python data/scripts/create_reds_val_from_vimeo.py
```

Week 3 baseline eval (writes markdown table + example media):
```bash
python eval.py \
  --config configs/temporal_small.toml \
  --override '{"dataset":{"val_hr_root":"/path/to/reds/hr","val_lr_root":"/path/to/reds/lr"}}' \
  --mode bicubic \
  --save-videos 2 \
  --save-images 8
```

Week 4/5 temporal training (small -> full):

```bash
python train.py --config configs/temporal_small.toml \
  --override '{"dataset":{"train_hr_root":"/path/to/vimeo/hr","val_hr_root":"/path/to/reds/hr","val_lr_root":"/path/to/reds/lr"}}'

python train.py --config configs/temporal_full.toml \
  --override '{"dataset":{"train_hr_root":"/path/to/vimeo/hr","val_hr_root":"/path/to/reds/hr","val_lr_root":"/path/to/reds/lr"}}'
```

Week 5 reproducible best-checkpoint evaluation:
```bash
python eval.py \
  --config configs/temporal_full.toml \
  --override '{"dataset":{"val_hr_root":"/path/to/reds/hr","val_lr_root":"/path/to/reds/lr"}}' \
  --checkpoint results/runs/temporal_vsr_5f_full_2x/checkpoints/best.pt
```

Week 6 ablation tracking:

```bash
# Fill run rows in results/tables/experiment_tracker_template.md
# Then copy summary into results/tables/ablation_table.md
```

Week 7 ONNX export:
```bash
python export_onnx.py \
  --config configs/temporal_full.toml \
  --checkpoint results/runs/temporal_vsr_5f_full_2x/checkpoints/best.pt \
  --output results/onnx/mini_dlss_temporal_full.onnx
```

Week 7 demo inference on any LR mp4:
```bash
python demo_video.py \
  --input /path/to/lr_clip.mp4 \
  --output results/videos/demo_sr.mp4 \
  --config configs/temporal_full.toml \
  --checkpoint results/runs/temporal_vsr_5f_full_2x/checkpoints/best.pt
```

- Keep `scale=2` in all configs.
- Use only manifests in `data/splits/`.
- Log every run in `results/tables/experiment_tracker_template.md`.
- Report Y-channel metrics with crop border = 2.
- Record the same fixed 10-second clip for latency (`ms/frame`).
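The latency rule can be enforced with a small timing harness. A sketch that times any per-frame callable; for the real measurement, wrap the exported model's onnxruntime session in the callable (the names below are illustrative):

```python
import time

def measure_ms_per_frame(run_frame, frames, warmup: int = 3) -> float:
    """Mean latency in ms/frame over a fixed clip, excluding warm-up calls."""
    for frame in frames[:warmup]:
        run_frame(frame)  # warm-up iterations, not timed
    start = time.perf_counter()
    for frame in frames:
        run_frame(frame)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(frames)
```

Running this on the same fixed 10-second clip for every checkpoint keeps the `ms/frame` numbers in the tracker comparable across runs.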