Reference document for building a Svelte-based image generation app with canvas/inpainting capabilities.
- Frontend: Svelte 5 + Vite (no SvelteKit, no TypeScript)
- Styling: Tailwind CSS (possibly with shadcn-svelte)
- Language: JavaScript only
- Backend: FastAPI (Python)
- UI Design: Custom/bespoke (not replicating original)
```python
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import uvicorn
import torch
import json
import os
from io import BytesIO
import base64

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Restrict in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Global pipeline - lazy loaded
pipe = None
CONFIG_FILE = "config.json"

def load_config():
    if os.path.exists(CONFIG_FILE):
        try:
            with open(CONFIG_FILE, "r") as f:
                return json.load(f)
        except Exception:
            pass
    return {
        "cache_dir": None,
        "model_id": "Tongyi-MAI/Z-Image-Turbo",
        "cpu_offload": False
    }

def save_config(config):
    with open(CONFIG_FILE, "w") as f:
        json.dump(config, f, indent=4)

def get_pipeline():
    global pipe
    if pipe is None:
        from diffusers import ZImagePipeline
        device = "cuda" if torch.cuda.is_available() else "cpu"
        dtype = torch.bfloat16 if device == "cuda" else torch.float32
        config = load_config()
        pipe = ZImagePipeline.from_pretrained(
            config["model_id"],
            torch_dtype=dtype,
            low_cpu_mem_usage=False,
            cache_dir=config.get("cache_dir"),
        )
        if config.get("cpu_offload", False) and device == "cuda":
            pipe.enable_model_cpu_offload()
        else:
            pipe.to(device)
    return pipe

class GenerateRequest(BaseModel):
    prompt: str
    height: int = 1024
    width: int = 1024
    steps: int = 8
    guidance_scale: float = 0.0
    seed: int = -1

class SettingsRequest(BaseModel):
    cache_dir: str
    cpu_offload: bool = False
```

| Endpoint | Method | Purpose |
|---|---|---|
| `/health` | GET | Returns `{"status": "ok"}` |
| `/settings` | GET | Returns current config |
| `/settings/model-path` | POST | Updates config, sets `pipe = None` to force reload |
| `/generate` | POST | Main generation endpoint |
```python
@app.post("/generate")
async def generate(req: GenerateRequest):
    # Validate dimensions
    if req.height % 16 != 0 or req.width % 16 != 0:
        raise HTTPException(400, "Height and Width must be divisible by 16")

    pipeline = get_pipeline()
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Handle seed
    generator = None
    if req.seed != -1:
        generator = torch.Generator(device).manual_seed(req.seed)

    # Generate
    image = pipeline(
        prompt=req.prompt,
        height=req.height,
        width=req.width,
        num_inference_steps=req.steps,
        guidance_scale=req.guidance_scale,
        generator=generator,
    ).images[0]

    # Convert to base64
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    img_str = base64.b64encode(buffer.getvalue()).decode()
    return {"image": f"data:image/png;base64,{img_str}"}
```

- Model ID: `Tongyi-MAI/Z-Image-Turbo`
- Parameters: 6 billion
- Architecture: S3-DiT (Scalable Single-Stream Diffusion Transformer)
- Text Encoder: Qwen 4B
- VAE: Flux Autoencoder
- Optimized Steps: 8 (distilled model)
| Parameter | Type | Range | Default | Notes |
|---|---|---|---|---|
| `prompt` | str | Any | Required | Text description |
| `height` | int | 256-2048 | 1024 | Must be divisible by 16 |
| `width` | int | 256-2048 | 1024 | Must be divisible by 16 |
| `num_inference_steps` | int | 1-50 | 8 | More steps = higher quality, slower |
| `guidance_scale` | float | 0.0-10.0 | 0.0 | 0 = no guidance |
| `generator` | Generator | - | None | For seed control |
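The range and divisibility rules in the table can be enforced server-side before the pipeline is invoked. A minimal sketch of a standalone validator (the function name and error messages are illustrative, mirroring the 400 responses used elsewhere in this document):

```python
def validate_generation_params(height, width, steps, guidance_scale):
    """Return a list of error strings; an empty list means the request is valid."""
    errors = []
    for name, value in (("height", height), ("width", width)):
        if not 256 <= value <= 2048:
            errors.append(f"{name} must be between 256 and 2048")
        if value % 16 != 0:
            errors.append(f"{name} must be divisible by 16")
    if not 1 <= steps <= 50:
        errors.append("steps must be between 1 and 50")
    if not 0.0 <= guidance_scale <= 10.0:
        errors.append("guidance_scale must be between 0.0 and 10.0")
    return errors
```

An endpoint can then raise `HTTPException(400, "; ".join(errors))` when the list is non-empty, reporting all problems at once instead of only the first.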
- Recommended VRAM: 16GB+
- CUDA Precision: bfloat16
- CPU Precision: float32
- CPU Offload: Available for low-VRAM GPUs (trades speed for memory)
These are the parameters the UI needs to expose for image generation:
- Steps: range 1-50, default 8
- Guidance scale: range 0.0-10.0 (0.1 increments), default 0.0
- Width: 256-2048px (must be divisible by 16)
- Height: 256-2048px (must be divisible by 16)
Useful Aspect Ratio Presets:
| Name | Ratio | Dimensions |
|---|---|---|
| Square | 1:1 | 1024×1024 |
| Portrait | 3:4 | 896×1152 |
| Landscape | 4:3 | 1152×896 |
| Wide | 16:9 | 1344×768 |
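When a user enters free-form dimensions or resizes a preset, the result still has to satisfy the divisible-by-16 constraint and the 256-2048 range. One way to handle this in the backend or a settings endpoint is to snap to the nearest valid value (a sketch; the helper name is illustrative):

```python
def snap_dimension(value, step=16, lo=256, hi=2048):
    """Round to the nearest multiple of `step`, clamped to the [lo, hi] range."""
    snapped = round(value / step) * step
    return max(lo, min(hi, snapped))
```

For example, an out-of-range input like 100 snaps up to the 256 minimum, and any in-range input comes back as a multiple of 16.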
- Seed: -1 = random; any positive integer for reproducibility
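The -1 sentinel can be resolved to a concrete seed before generation, so the UI can report which seed was actually used and make any result reproducible. A sketch using only the standard library (the resolved value would then seed the torch `Generator`):

```python
import random

def resolve_seed(seed):
    """Return `seed` unchanged, or a fresh random 32-bit seed when it is -1."""
    if seed == -1:
        return random.randint(0, 2**32 - 1)
    return seed
```

Returning the resolved seed alongside the generated image lets users re-run a random generation exactly.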
```javascript
// Using Svelte writable stores (src/lib/stores/)
// generation.js
import { writable } from 'svelte/store'

export const prompt = writable('')
export const generatedImage = writable(null) // data:image/png;base64,...
export const loading = writable(false)
export const settings = writable({
  steps: 8,
  guidance_scale: 0.0,
  width: 1024,
  height: 1024,
  seed: -1
})
```

```javascript
// src/lib/api.js
const API_BASE = 'http://localhost:8000'

export async function generateImage(prompt, settings) {
  const res = await fetch(`${API_BASE}/generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt, ...settings })
  })
  if (!res.ok) {
    const error = await res.json()
    throw new Error(error.detail || 'Generation failed')
  }
  return res.json() // { image: "data:image/png;base64,..." }
}

export async function getSettings() {
  const res = await fetch(`${API_BASE}/settings`)
  return res.json()
}

export async function updateSettings(settings) {
  const res = await fetch(`${API_BASE}/settings/model-path`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(settings)
  })
  return res.json()
}
```

```
# requirements.txt
fastapi
uvicorn
torch
transformers
accelerate
protobuf
sentencepiece
git+https://github.com/huggingface/diffusers.git
```

Notes:
- Diffusers must be installed from git for `ZImagePipeline`
- PyTorch installation depends on CUDA version
- Python 3.8+ recommended
```bash
# Create Svelte project with Vite
npm create vite@latest frontend -- --template svelte
cd frontend

# Install Tailwind
npm install -D tailwindcss postcss autoprefixer
npx tailwindcss init -p

# Install shadcn-svelte (optional - for UI components)
npx shadcn-svelte@latest init

# Icons
npm install lucide-svelte

# Canvas library (for later phases)
npm install konva svelte-konva
```

```json
{
  "dependencies": {
    "lucide-svelte": "^0.460.0",
    "konva": "^9.3.0",
    "svelte-konva": "^1.0.0"
  },
  "devDependencies": {
    "svelte": "^5.0.0",
    "vite": "^6.0.0",
    "tailwindcss": "^3.4.0"
  }
}
```

```python
# Validation (400)
if req.height % 16 != 0:
    raise HTTPException(400, "Height must be divisible by 16")

# Server Error (500)
try:
    result = operation()
except Exception as e:
    print(f"Error: {e}")
    raise HTTPException(500, str(e))
```

```javascript
try {
  const res = await fetch(...)
  if (!res.ok) throw new Error('Failed')
  // success handling
} catch (e) {
  console.error(e)
  alert('User-friendly error message')
} finally {
  loading = false
}
```

```javascript
function downloadImage() {
  const link = document.createElement('a')
  link.href = image // data URI
  link.download = `z-image-${Date.now()}.png`
  link.click()
}
```

- Load: Read `config.json` on startup, fall back to defaults
- Display: Populate settings UI from config
- Update: POST to `/settings/model-path` with new values
- Persist: Backend saves to `config.json`
- Reload: Set `pipe = None` to force model reload on next request
- Feedback: Alert user that the model will reload
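The load/persist steps above can be exercised in isolation. The sketch below mirrors the `load_config`/`save_config` pair from the backend, parameterized by path so the round trip is testable (the `DEFAULTS` name is illustrative):

```python
import json

DEFAULTS = {
    "cache_dir": None,
    "model_id": "Tongyi-MAI/Z-Image-Turbo",
    "cpu_offload": False,
}

def load_config(path):
    # Fall back to defaults when the file is missing or unreadable
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        return dict(DEFAULTS)

def save_config(path, config):
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
```

A missing file yields the defaults; after a save, the next load returns the updated values, which is what lets the backend survive restarts with the user's model path intact.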
| Endpoint | Purpose |
|---|---|
| `POST /inpaint` | Inpainting with mask |
| `POST /img2img` | Image-to-image generation |
| `POST /outpaint` | Extend canvas beyond boundaries |
```python
class InpaintRequest(BaseModel):
    prompt: str
    image: str  # Base64 encoded source image
    mask: str   # Base64 encoded mask (white = regenerate)
    height: int = 1024
    width: int = 1024
    steps: int = 8
    guidance_scale: float = 0.0
    seed: int = -1
    strength: float = 0.8  # How much to change (0-1)
```
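The `image` and `mask` fields arrive as base64 strings, possibly carrying the `data:image/png;base64,` prefix the frontend produces. A decoding sketch (helper name is illustrative; a real endpoint would wrap the returned bytes in `BytesIO` and hand them to `PIL.Image.open`):

```python
import base64

def decode_data_uri(data):
    """Return raw image bytes from a bare base64 string or a data URI."""
    if data.startswith("data:"):
        # Strip the "data:image/png;base64," prefix
        data = data.split(",", 1)[1]
    return base64.b64decode(data)
```

Accepting both forms keeps the endpoint tolerant of whether the frontend sends the canvas export verbatim or strips the prefix first.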
```typescript
// Layers
layers: Layer[] = []
activeLayerId: string | null = null

// Canvas
canvasWidth: number = 1024
canvasHeight: number = 1024
zoom: number = 1.0
panX: number = 0
panY: number = 0

// Tools
activeTool: 'brush' | 'eraser' | 'select' | 'move' | 'pan' = 'brush'
brushSize: number = 50
brushHardness: number = 100

// Mask
maskCanvas: HTMLCanvasElement | null = null
showMask: boolean = true
maskOpacity: number = 0.5

// History
undoStack: CanvasState[] = []
redoStack: CanvasState[] = []
```
- Layer System
  - Multiple image layers
  - Layer visibility toggles
  - Layer opacity controls
  - Layer reordering (drag & drop)
  - Layer merge/flatten
- Selection Tools
  - Rectangular selection
  - Lasso selection
  - Magic wand (color-based)
  - Selection invert/expand/contract
- Brush/Mask Tools
  - Variable size brush
  - Soft/hard edge options
  - Mask painting mode
  - Quick mask visualization
- Canvas Navigation
  - Pan (middle mouse / space+drag)
  - Zoom (scroll wheel / +/- keys)
  - Fit to screen
  - Reset view
- Inpainting Workflow
  - Paint mask over areas to regenerate
  - Mask feathering options
  - Preserve composition checkbox
  - Mask invert option
- History
  - Undo/redo stack
  - History panel with thumbnails
  - Snapshot system
- Export Options
  - Export single layer
  - Export merged/flattened
  - Export with transparency
  - Export mask only
- Konva.js + svelte-konva - Canvas library with built-in layer support, transforms, events
- Fabric.js - Alternative with more object manipulation features
- shadcn-svelte - Tailwind-based UI components (optional)
- lucide-svelte - Icon library
- tailwind-merge or clsx - Class name utilities
```
project-root/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── routes/
│   │   ├── generate.py      # Generation endpoints
│   │   ├── inpaint.py       # Inpainting endpoints
│   │   └── settings.py      # Settings endpoints
│   ├── services/
│   │   ├── pipeline.py      # Pipeline management
│   │   └── image.py         # Image processing utilities
│   └── config.py            # Configuration management
│
├── frontend/
│   ├── src/
│   │   ├── lib/
│   │   │   ├── components/
│   │   │   │   ├── Canvas.svelte
│   │   │   │   ├── LayerPanel.svelte
│   │   │   │   ├── ToolPanel.svelte
│   │   │   │   └── ...
│   │   │   ├── stores/
│   │   │   │   ├── canvas.js    # Canvas state
│   │   │   │   ├── layers.js    # Layer management
│   │   │   │   ├── tools.js     # Tool state
│   │   │   │   └── settings.js  # App settings
│   │   │   └── utils/
│   │   │       ├── api.js       # API communication
│   │   │       ├── canvas.js    # Canvas utilities
│   │   │       └── image.js     # Image processing
│   │   ├── App.svelte           # Main app component
│   │   ├── main.js              # Entry point
│   │   └── app.css              # Tailwind imports
│   ├── index.html
│   ├── vite.config.js
│   ├── tailwind.config.js
│   └── package.json
│
├── config.json
├── requirements.txt
└── README.md
```