keytech42/gen-pixelart

Pixel Art Generative Model

A learning-oriented project comparing three generative architectures for producing 16x16 pixel art sprites. Built with PyTorch, tracked with MLflow.

Results

Colored sprites (8-color palette, 1000 epochs)

Training data (Kenney 1-Bit Pack colored variant, CC0):

Real colored sprites

Diffusion (DDIM, 50-step sampling):

Diffusion colored

Coherent colored sprites with correct palette use — buildings, monitors, characters. DDIM sampling produces these in under 1 second.
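The speed comes from DDIM's deterministic update: estimate x_0 from the predicted noise, then jump directly to the previous timestep in a strided subset of the 1000 training steps. A minimal sketch, assuming a trained noise-prediction network `eps_model` and the cumulative alpha schedule `alpha_bar` from training (both names hypothetical, not the project's actual API):

```python
import torch

@torch.no_grad()
def ddim_sample(eps_model, alpha_bar, shape, n_steps=50, device="cpu"):
    """Deterministic DDIM sampling (eta=0) over a strided subset of timesteps."""
    T = alpha_bar.shape[0]
    # e.g. 50 evenly spaced timesteps out of the 1000 used during training
    timesteps = torch.linspace(T - 1, 0, n_steps).long()
    x = torch.randn(shape, device=device)  # start from pure noise
    for i, t in enumerate(timesteps):
        abar_t = alpha_bar[t]
        abar_prev = alpha_bar[timesteps[i + 1]] if i + 1 < n_steps else torch.tensor(1.0)
        eps = eps_model(x, t.expand(shape[0]))               # predict the noise
        x0 = (x - (1 - abar_t).sqrt() * eps) / abar_t.sqrt()  # estimate x_0
        x = abar_prev.sqrt() * x0 + (1 - abar_prev).sqrt() * eps  # jump to t_prev
    return x.clamp(-1, 1)
```

With 50 of the 1000 timesteps, each sample costs only 50 network evaluations, which is why a batch finishes in under a second.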

VAE:

VAE colored

Color placement is reasonable, but shapes are blurry; multiple colors make the VAE's tendency to average over plausible outputs more visible.
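The blur is inherent to how a plain VAE trains: the decoder reconstructs from a sampled latent, and the per-pixel loss averages over every plausible sprite. A sketch of the two standard ingredients, the reparameterization trick and a BCE + KL objective (function names illustrative, not the project's actual API):

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    """Draw z ~ N(mu, sigma^2) differentiably: z = mu + sigma * eps."""
    eps = torch.randn_like(mu)
    return mu + (0.5 * logvar).exp() * eps

def vae_loss(recon, target, mu, logvar):
    """Pixel-wise BCE reconstruction term plus KL divergence to N(0, I)."""
    bce = F.binary_cross_entropy(recon, target, reduction="sum")
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()
    return bce + kl
```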

Monochrome sprites (2-color, 500 epochs)

Training data:

Real sprites

Diffusion | VAE (BCE loss) | VQ-VAE (with prior)
Diffusion | VAE | VQ-VAE

Diffusion produces the sharpest output. VAE with BCE loss gives better edges than MSE. VQ-VAE with a learned autoregressive prior generates coherent sprites; without the prior, sampling random codebook indices produces checkerboard noise.
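The straight-through estimator behind the VQ-VAE can be sketched as a nearest-neighbor codebook lookup whose gradient is copied straight back to the encoder output; `vector_quantize` and its argument names here are illustrative, not the project's actual API:

```python
import torch
import torch.nn.functional as F

def vector_quantize(z, codebook, beta=0.25):
    """Map encoder outputs z (B, D) to their nearest codebook vectors (K, D),
    passing gradients straight through the non-differentiable lookup."""
    # squared distances between each latent and each codebook entry
    d = (z.pow(2).sum(1, keepdim=True)
         - 2 * z @ codebook.t()
         + codebook.pow(2).sum(1))
    idx = d.argmin(dim=1)              # nearest code index per latent
    z_q = codebook[idx]                # quantized latents
    # codebook loss pulls codes toward encoder outputs; commitment loss
    # (weighted by beta) pulls encoder outputs toward their codes
    loss = F.mse_loss(z_q, z.detach()) + beta * F.mse_loss(z, z_q.detach())
    # straight-through: forward pass uses z_q, backward copies grads to z
    z_q = z + (z_q - z).detach()
    return z_q, idx, loss
```

The prior then models the distribution over `idx` sequences, which is what turns random-looking index grids into coherent sprites.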

Architecture

The project uses the strategy pattern to swap generative architectures while sharing all infrastructure:

Trainer (context)
  ├── training loop, optimizer, device management
  ├── MLflow logging
  └── palette snapping + sample grids

GenerativeStrategy (ABC)
  ├── build_model(config) → nn.Module
  ├── train_step(model, optimizer, batch) → loss_dict
  ├── sample(model, n_samples, device) → images
  └── get_metrics(model, batch) → metrics

Concrete strategies:
  ├── VAEStrategy      (~138K params, continuous latent, reparameterization trick)
  ├── VQVAEStrategy    (~140K params, discrete codebook, straight-through estimator)
  └── DiffusionStrategy (7.4M params, U-Net, 1000-step linear noise schedule)

Adding a new strategy means implementing the ABC and adding one line to the registry. The Trainer, data pipeline, and logging remain untouched.
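A sketch of what that contract and registry could look like. The method names follow the diagram above; the decorator-based registry and the stub `VAEStrategy` body are assumptions for illustration, not the project's exact mechanism:

```python
from abc import ABC, abstractmethod

import torch
import torch.nn as nn

class GenerativeStrategy(ABC):
    """Contract every architecture implements; the Trainer only sees this."""

    @abstractmethod
    def build_model(self, config) -> nn.Module: ...

    @abstractmethod
    def train_step(self, model, optimizer, batch) -> dict: ...

    @abstractmethod
    def sample(self, model, n_samples, device) -> torch.Tensor: ...

    @abstractmethod
    def get_metrics(self, model, batch) -> dict: ...

STRATEGIES: dict[str, type[GenerativeStrategy]] = {}

def register(name):
    """Class decorator: the one-line registry hook a new strategy needs."""
    def wrap(cls):
        STRATEGIES[name] = cls
        return cls
    return wrap

@register("vae")
class VAEStrategy(GenerativeStrategy):
    # stub bodies, just to show the shape of an implementation
    def build_model(self, config): return nn.Identity()
    def train_step(self, model, optimizer, batch): return {"loss": 0.0}
    def sample(self, model, n_samples, device): return torch.zeros(n_samples, 1, 16, 16)
    def get_metrics(self, model, batch): return {}
```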

Tutorial Docs

This project is a learning tool. Each major concept has a tutorial-style explanation in docs/learn/.

Quick Start

# Install dependencies
uv sync

# Download dataset (Kenney 1-Bit Pack, CC0)
uv run python scripts/download_data.py

# Run smoke test (validates all strategy contracts)
uv run python scripts/smoke_test.py

# Train a specific strategy
uv run python scripts/train.py --config configs/diffusion.yaml

# Train all three and generate comparison grids
uv run python scripts/train_comparison.py

# View experiment tracking
mlflow ui

Project Structure

├── configs/                    # OmegaConf YAML configs (base + per-strategy)
├── docs/learn/                 # Tutorial docs (the project's primary learning artifact)
├── src/
│   ├── strategies/
│   │   ├── base.py             # GenerativeStrategy ABC
│   │   ├── vae.py              # VAE strategy
│   │   ├── vqvae.py            # VQ-VAE strategy
│   │   └── diffusion.py        # DDPM diffusion strategy
│   ├── models/
│   │   ├── encoder.py          # ConvEncoder, ConvDecoder, VAEModel
│   │   ├── codebook.py         # VectorQuantizer
│   │   ├── unet.py             # U-Net with time conditioning
│   │   └── blocks.py           # ResBlock, Downsample, Upsample, SinusoidalTimeEmbedding
│   ├── trainer.py              # Training loop (strategy-agnostic)
│   ├── palette.py              # K-means palette extraction + snapping
│   └── mlflow_logger.py        # MLflow wrapper
├── scripts/
│   ├── train.py                # Single strategy training
│   ├── train_comparison.py     # All-strategy comparison run
│   ├── smoke_test.py           # Strategy contract validation
│   └── download_data.py        # Dataset download + tilesheet slicing
└── .agents/dev_docs/           # Task plans, context, and design decisions
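The palette snapping listed under src/palette.py amounts to a nearest-color lookup for every pixel. This `snap_to_palette` function is a hypothetical sketch that assumes the (K, 3) palette has already been extracted (e.g. via k-means over training-set pixels):

```python
import torch

def snap_to_palette(images, palette):
    """Snap every pixel of (B, 3, H, W) images to its nearest palette color."""
    b, c, h, w = images.shape
    pixels = images.permute(0, 2, 3, 1).reshape(-1, 3)  # (B*H*W, 3)
    d = torch.cdist(pixels, palette)                     # distance to each color
    snapped = palette[d.argmin(dim=1)]                   # nearest color per pixel
    return snapped.reshape(b, h, w, 3).permute(0, 3, 1, 2)
```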

Dataset

Kenney 1-Bit Pack — 1,077 monochrome 16x16 sprites (CC0 license). The comparison training uses a focused subset of 873 sprites filtered for medium pixel density (characters, items, UI elements).
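The tilesheet slicing performed by scripts/download_data.py boils down to cutting one large image into a grid of 16x16 tiles. A sketch with a hypothetical `slice_tilesheet` helper (the script's real implementation may differ):

```python
import torch

def slice_tilesheet(sheet, tile=16):
    """Cut a (C, H, W) tilesheet into a stack of (N, C, tile, tile) sprites."""
    c, h, w = sheet.shape
    rows, cols = h // tile, w // tile
    tiles = sheet[:, :rows * tile, :cols * tile]          # drop ragged edges
    tiles = tiles.reshape(c, rows, tile, cols, tile)
    tiles = tiles.permute(1, 3, 0, 2, 4).reshape(rows * cols, c, tile, tile)
    return tiles
```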

Requirements

  • Python 3.11+
  • PyTorch (MPS backend for Apple Silicon, CUDA, or CPU)
  • uv for dependency management
