Comments

refactor: simplify examples to flat-file flash init pattern #29

Merged
deanq merged 19 commits into main from refactor/ae-2210-simplified-examples
Feb 20, 2026
Conversation

deanq (Member) commented Feb 19, 2026

Prerequisite: runpod/flash#208

Summary

  • Refactor all 7 examples from the old FastAPI boilerplate pattern (main.py + routers + Pydantic models + mothership.py) to the simplified flash init flat-file pattern
  • Each example is now a directory of standalone @remote decorated files that flash run auto-discovers
  • Net change: 44 files reduced to 12 files, -1466 lines / +119 lines
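For orientation, the flat-file shape each refactored example follows looks roughly like this. This is a sketch only: `remote` here is a local stand-in so the snippet is self-contained; the real decorator and its config keywords come from the flash SDK and the exact names are assumptions, not its actual API.

```python
# gpu_worker.py -- sketch of the flat-file pattern: config + one decorated
# handler, nothing else. `remote` is a local stand-in for the flash SDK
# decorator (the real one would be imported from the flash package).
def remote(**config):
    def wrap(fn):
        fn.config = config  # the SDK would register the endpoint here
        return fn
    return wrap

@remote(name="gpu_worker", gpu=True)  # keyword names are illustrative
def handler(message: str) -> dict:
    """The single endpoint `flash run` auto-discovers from this file."""
    return {"echo": message}

if __name__ == "__main__":
    # Local smoke test: no FastAPI app, router, or uvicorn startup needed.
    print(handler("hello"))
```

The point of the pattern is that this one file replaces the former main.py, router `__init__.py`, Pydantic model, and mothership.py quartet.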

Changes per example

| Example | Before | After | Key change |
| --- | --- | --- | --- |
| 01_hello_world | 4 files | 1 file | Trimmed gpu_worker.py |
| 02_cpu_worker | 4 files | 1 file | Trimmed cpu_worker.py |
| 03_mixed_workers | 9 files | 3 files | Flattened + new pipeline.py (LB endpoint) |
| 04_dependencies | 8 files | 2 files | Flattened workers/ |
| 01_text_to_speech | 6 files | 1 file | Flattened, dropped /tts/audio binary endpoint |
| 05_load_balancer | 7 files | 2 files | Flattened to gpu_lb.py + cpu_lb.py |
| 01_network_volumes | 6 files | 2 files | Inlined NetworkVolume, removed FastAPI deps |

What was removed

  • All main.py files (FastAPI app with routers, uvicorn startup)
  • All mothership.py files (LB config for the app)
  • All __init__.py router files (APIRouter + Pydantic request models)
  • All workers/ directory nesting

What was added

  • 03_mixed_workers/pipeline.py: CpuLiveLoadBalancer endpoint that orchestrates CPU preprocess -> GPU inference -> CPU postprocess (replaces the old main.py /classify endpoint)
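The data flow pipeline.py orchestrates can be sketched as plain functions. In the real example each stage is a separate @remote worker and the chaining endpoint is a CpuLiveLoadBalancer; the function bodies and names below are purely illustrative of the CPU -> GPU -> CPU shape, not the example's actual code.

```python
# Illustrative CPU -> GPU -> CPU pipeline; in the actual example each
# stage runs on its own worker and pipeline.py is the LB endpoint that
# calls them in sequence.
def cpu_preprocess(text: str) -> list[str]:
    # CPU stage 1: normalize and tokenize the input.
    return text.lower().split()

def gpu_inference(tokens: list[str]) -> list[float]:
    # Stand-in for model inference on the GPU worker.
    return [float(len(t)) for t in tokens]

def cpu_postprocess(scores: list[float]) -> dict:
    # CPU stage 2: aggregate raw scores into the response payload.
    return {"mean": sum(scores) / len(scores), "count": len(scores)}

def classify(text: str) -> dict:
    # The /classify-style endpoint: chain the three stages in order.
    return cpu_postprocess(gpu_inference(cpu_preprocess(text)))
```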

Test plan

  • flash run from each example directory discovers all @remote endpoints
  • Generated Swagger UI shows correct endpoints
  • QB endpoints accessible at /{file}/run_sync or /{file}/{function}/run_sync
  • LB endpoints accessible at their configured method/path

Remove FastAPI boilerplate (main.py, mothership.py, __init__.py).
Keep gpu_worker.py with config + @Remote function only.
flash run auto-discovers endpoints.

Remove FastAPI boilerplate (main.py, mothership.py, __init__.py).
Keep cpu_worker.py with config + @Remote function only.

…eline

Flatten workers/ to gpu_worker.py + cpu_worker.py.
Create pipeline.py as CpuLiveLoadBalancer endpoint for CPU->GPU->CPU orchestration.
Remove main.py, mothership.py, all __init__.py, workers/ directory.

Flatten workers/ to gpu_worker.py + cpu_worker.py.
Remove main.py, mothership.py, all __init__.py, workers/ directory.

Flatten workers/ to gpu_worker.py.
Remove main.py, mothership.py, all __init__.py, workers/ directory.
Drop /tts/audio binary endpoint (was router-only).

Flatten workers/ to gpu_lb.py + cpu_lb.py.
Remove main.py, mothership.py, workers/ directory.

…olume config

Flatten workers/ to gpu_worker.py + cpu_worker.py.
Inline NetworkVolume definition into each worker file.
Remove FastAPI dependencies from CPU worker (pure dict returns).
Remove main.py, workers/ directory.
Copilot AI left a comment

Pull request overview

This PR refactors 7 examples from the legacy FastAPI boilerplate pattern to the simplified flat-file flash init pattern, dramatically reducing complexity from 44 files to 12 files (-1466 lines / +119 lines). The refactoring replaces the old architecture (main.py + routers + Pydantic models + mothership.py) with standalone @remote decorated worker files that flash run auto-discovers.

Changes:

  • Eliminated all FastAPI boilerplate (main.py, routers, __init__.py files) in favor of direct @remote decorators
  • Removed all mothership.py configuration files and workers/ directory nesting
  • Added pipeline.py in 03_mixed_workers to demonstrate cross-worker orchestration via CpuLiveLoadBalancer

Reviewed changes

Copilot reviewed 40 out of 44 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| 01_getting_started/01_hello_world/gpu_worker.py | Simplified from 4 files to single standalone GPU worker with @remote decorator |
| 01_getting_started/01_hello_world/main.py | Removed FastAPI app boilerplate (no longer needed) |
| 01_getting_started/01_hello_world/mothership.py | Removed mothership load balancer configuration |
| 01_getting_started/01_hello_world/__init__.py | Removed package initialization |
| 01_getting_started/02_cpu_worker/cpu_worker.py | Simplified from 4 files to single standalone CPU worker |
| 01_getting_started/02_cpu_worker/main.py | Removed FastAPI app boilerplate |
| 01_getting_started/02_cpu_worker/mothership.py | Removed mothership configuration |
| 01_getting_started/02_cpu_worker/__init__.py | Removed package initialization |
| 01_getting_started/03_mixed_workers/gpu_worker.py | Simplified GPU inference worker |
| 01_getting_started/03_mixed_workers/cpu_worker.py | Simplified CPU preprocessing/postprocessing workers |
| 01_getting_started/03_mixed_workers/pipeline.py | New orchestration layer with CpuLiveLoadBalancer endpoint |
| 01_getting_started/03_mixed_workers/workers/gpu/__init__.py | Removed router boilerplate |
| 01_getting_started/03_mixed_workers/workers/cpu/__init__.py | Removed router boilerplate |
| 01_getting_started/03_mixed_workers/workers/__init__.py | Removed workers package |
| 01_getting_started/03_mixed_workers/main.py | Removed FastAPI app |
| 01_getting_started/03_mixed_workers/mothership.py | Removed mothership configuration |
| 01_getting_started/03_mixed_workers/__init__.py | Removed package initialization |
| 01_getting_started/04_dependencies/gpu_worker.py | Simplified dependency demonstration, removed dotenv from test code |
| 01_getting_started/04_dependencies/cpu_worker.py | Simplified dependency demonstration, removed dotenv from test code |
| 01_getting_started/04_dependencies/workers/gpu/__init__.py | Removed router boilerplate |
| 01_getting_started/04_dependencies/workers/cpu/__init__.py | Removed router boilerplate |
| 01_getting_started/04_dependencies/workers/__init__.py | Removed workers package |
| 01_getting_started/04_dependencies/main.py | Removed FastAPI app |
| 01_getting_started/04_dependencies/mothership.py | Removed mothership configuration |
| 01_getting_started/04_dependencies/__init__.py | Removed package initialization |
| 02_ml_inference/01_text_to_speech/gpu_worker.py | Flattened to single file, removed binary /tts/audio endpoint |
| 02_ml_inference/01_text_to_speech/workers/gpu/__init__.py | Removed router boilerplate with Pydantic models |
| 02_ml_inference/01_text_to_speech/workers/__init__.py | Removed workers package |
| 02_ml_inference/01_text_to_speech/main.py | Removed FastAPI app |
| 02_ml_inference/01_text_to_speech/mothership.py | Removed mothership configuration |
| 02_ml_inference/01_text_to_speech/__init__.py | Removed package initialization |
| 03_advanced_workers/05_load_balancer/gpu_lb.py | Flattened load-balanced GPU endpoints to standalone file |
| 03_advanced_workers/05_load_balancer/cpu_lb.py | Flattened load-balanced CPU endpoints to standalone file |
| 03_advanced_workers/05_load_balancer/workers/gpu/__init__.py | Removed router boilerplate |
| 03_advanced_workers/05_load_balancer/workers/cpu/__init__.py | Removed router boilerplate |
| 03_advanced_workers/05_load_balancer/workers/__init__.py | Removed workers package |
| 03_advanced_workers/05_load_balancer/main.py | Removed FastAPI app |
| 03_advanced_workers/05_load_balancer/mothership.py | Removed mothership configuration |
| 05_data_workflows/01_network_volumes/gpu_worker.py | Inlined NetworkVolume definition, simplified comments |
| 05_data_workflows/01_network_volumes/cpu_worker.py | Inlined NetworkVolume definition, changed to return base64-encoded images |
| 05_data_workflows/01_network_volumes/workers/gpu/__init__.py | Removed router boilerplate |
| 05_data_workflows/01_network_volumes/workers/cpu/__init__.py | Removed router boilerplate with HTML response |
| 05_data_workflows/01_network_volumes/workers/__init__.py | Removed shared NetworkVolume import |
| 05_data_workflows/01_network_volumes/main.py | Removed FastAPI app with lifespan management |
Comments suppressed due to low confidence (6)

05_data_workflows/01_network_volumes/gpu_worker.py:18

  • Inconsistent naming convention for worker configurations. This example uses a generic name "gpu_worker" while other examples use prefixed names like "01_01_gpu_worker" or "02_01_text_to_speech_gpu". Consider following the pattern of prefixing with the example number for consistency and to avoid potential naming conflicts when deploying multiple examples.

05_data_workflows/01_network_volumes/cpu_worker.py:12

  • Inconsistent naming convention for worker configurations. This example uses a generic name "cpu_worker" while other examples use prefixed names like "01_02_cpu_worker" or "01_03_mixed_workers_cpu". Consider following the pattern of prefixing with the example number for consistency and to avoid potential naming conflicts when deploying multiple examples.

05_data_workflows/01_network_volumes/cpu_worker.py:25

  • Missing error handling for when the directory doesn't exist. The os.listdir call on line 25 will raise FileNotFoundError if "/runpod-volume/generated_images" doesn't exist yet. Consider adding a try-except block or checking if the directory exists first using os.path.exists() and returning an appropriate response.

05_data_workflows/01_network_volumes/cpu_worker.py:39

  • Potential path traversal vulnerability. The file_id parameter is directly concatenated into the file path without sanitization, allowing users to potentially access files outside the intended directory by using "../" sequences. Consider validating that the file_id doesn't contain path separators or use Path.resolve() to check that the final path is within the expected directory.

05_data_workflows/01_network_volumes/gpu_worker.py:15

  • The NetworkVolume is defined separately in both gpu_worker.py and cpu_worker.py with identical names and sizes. While this is intentional for the flat-file pattern where each worker is self-contained, it's important to note that both workers must use exactly the same name parameter for the volume to be shared. Consider adding a comment clarifying that the volume name must match between workers that need to share data.

05_data_workflows/01_network_volumes/cpu_worker.py:50

  • The function now returns image data as base64-encoded JSON instead of binary Response. This change is consistent with the flat-file pattern but results in larger payloads (base64 encoding increases size by ~33%). For large images, this could impact performance. Consider documenting this tradeoff or providing guidance on when to use this pattern versus serving binary data directly.
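The missing-directory and path-traversal points above can both be addressed with stdlib-only code along these lines. The paths and function names are illustrative, not the example's actual code:

```python
import os
from pathlib import Path

# Volume mount path referenced by the review comments.
IMAGE_DIR = Path("/runpod-volume/generated_images")

def list_images(base: Path = IMAGE_DIR) -> list[str]:
    # Guard: the directory may not exist before the GPU worker has run.
    if not base.is_dir():
        return []
    return sorted(os.listdir(base))

def resolve_image(file_id: str, base: Path = IMAGE_DIR) -> Path:
    # Guard: reject traversal by resolving the candidate path and
    # checking it is still contained inside the base directory.
    candidate = (base / file_id).resolve()
    if base.resolve() not in candidate.parents:
        raise ValueError(f"invalid file_id: {file_id!r}")
    return candidate
```

A usage sketch: `list_images()` returns `[]` before any image exists, and `resolve_image("../../etc/passwd")` raises rather than escaping the volume.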


jhcipar force-pushed the refactor/ae-2210-simplified-examples branch from 196bfac to 1bd9b09 on February 20, 2026 at 16:06
- Remove Pydantic validation and FastAPI router references
- Update port 8000 to 8888, fix curl paths to /worker/run_sync pattern
- Major trim on 03_mixed_workers (370 lines) and 04_dependencies (304 lines)

… flat-file refactor

- Fix curl paths to /gpu_worker/run_sync pattern
- Trim verbose sections to match simplified examples

…README

- CPU worker: CpuLiveServerless -> CpuLiveLoadBalancer with explicit paths
- CPU endpoints: GET /images (list), GET /images/{file_name} (get single)
- GPU config name: gpu_worker -> 05_01_gpu_worker
- README: update curl examples and API Endpoints to match actual routes
- Add .flash/ to 01_hello_world gitignore
- Replace CLAUDE.md with worktree-specific template
- Add fastapi, uvicorn dependencies to uv.lock

main.py was a pre-flat-file pattern artifact that manually discovered
APIRouters from worker directories. With the flat-file refactor,
flash run handles discovery directly from individual worker files.
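The discovery step that replaces the old main.py router scan amounts to something like the following. This is purely illustrative of the mechanism (import each top-level file, collect decorated functions); it is not flash's actual implementation, and the `config` attribute is the same stand-in convention used for illustration, not the SDK's internal marker.

```python
import importlib.util
from pathlib import Path

def discover_handlers(directory: str) -> dict:
    """Illustrative sketch of flat-file discovery: import every top-level
    .py file in the directory and collect callables carrying a
    remote-style config attribute."""
    handlers = {}
    for file in Path(directory).glob("*.py"):
        spec = importlib.util.spec_from_file_location(file.stem, file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for name, obj in vars(module).items():
            if callable(obj) and hasattr(obj, "config"):
                # Route shape mirrors the /{file}/{function} pattern
                # mentioned in the test plan.
                handlers[f"/{file.stem}/{name}"] = obj
    return handlers
```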
Copilot AI left a comment

Pull request overview

Copilot reviewed 52 out of 57 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

05_data_workflows/01_network_volumes/gpu_worker.py:60

  • SimpleSD.__init__() assumes MODEL_PATH is set and that the directory exists (os.listdir(model_path)). When running the advertised local test (python gpu_worker.py), MODEL_PATH may be unset and/or the directory may not exist, causing a crash. Consider defaulting model_path to MODEL_PATH, ensuring the directory exists (mkdir), and guarding the os.listdir log statement.


Each example already has pyproject.toml as the authoritative dep spec.
Runtime deps are declared in @Remote(dependencies=[...]). The root
requirements.txt (generated lockfile) and root .env.example remain.
The per-example copies were orphaned by the flat-file refactor.

Remove fastapi, uvicorn from 02_ml_inference/01_text_to_speech (worker
imports neither). Strip fastapi, uvicorn, numpy, pillow, python-multipart,
structlog from root deps -- flash-generated server.py pulls these
internally, and @Remote(dependencies=[...]) handles runtime deps.
Keep runpod-flash as the sole project dependency.

Replace generic request: dict with numbers: list[float], add
empty-list guards, and align __main__ test block with new signature.
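The typed-signature change and empty-list guard described in that commit look roughly like this (the function name and return shape are assumed for illustration):

```python
def average(numbers: list[float]) -> dict:
    # Typed parameter replaces the old generic `request: dict` payload.
    if not numbers:
        # Empty-list guard: return an error instead of dividing by zero.
        return {"error": "numbers must be a non-empty list"}
    return {"mean": sum(numbers) / len(numbers), "count": len(numbers)}

if __name__ == "__main__":
    # __main__ test block aligned with the new signature.
    print(average([1.0, 2.0, 3.0]))
```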
Strip pyproject.toml to single runpod-flash dependency and ruff dev dep.
Replace full lockfile with minimal requirements.txt. Remove dead Makefile
targets (typecheck, quality-check-strict, dep sync scripts). Reformat to
ruff defaults.

Remove dep-sync step from quality.yml (tomli_w removed, per-example deps
deleted). Replace quality-check-strict with quality-check in test matrix.
Delete orphaned sync_example_deps.py script.

Replace CLAUDE.md worktree template with flat-file pattern guidelines.

Guard os.listdir against missing directory in network volumes cpu_worker.
Remove stale localhost:8000 reference in cpu_worker README. Fix diagram
text to match actual code behavior.
deanq merged commit 654ca96 into main on Feb 20, 2026
6 checks passed
deanq deleted the refactor/ae-2210-simplified-examples branch on February 20, 2026 at 23:45