Comments

refactor: simplify examples to flat-file flash init pattern #29

Merged
deanq merged 19 commits into main from refactor/ae-2210-simplified-examples
Feb 20, 2026
Conversation

deanq (Member) commented Feb 19, 2026

Prerequisite: runpod/flash#208

Summary

  • Refactor all 7 examples from the old FastAPI boilerplate pattern (main.py + routers + Pydantic models + mothership.py) to the simplified flash init flat-file pattern
  • Each example is now a directory of standalone @remote decorated files that flash run auto-discovers
  • Net change: 44 files reduced to 12 files, -1466 lines / +119 lines
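For orientation, the flat-file shape each refactored example follows looks roughly like this. This is a sketch only: `remote` here is a local stand-in so the snippet is self-contained; the real decorator and its config keywords come from the flash SDK and the exact names are assumptions, not its actual API.

```python
# gpu_worker.py -- sketch of the flat-file pattern: config + one decorated
# handler, nothing else. `remote` is a local stand-in for the flash SDK
# decorator (the real one would be imported from the flash package).
def remote(**config):
    def wrap(fn):
        fn.config = config  # the SDK would register the endpoint here
        return fn
    return wrap

@remote(name="gpu_worker", gpu=True)  # keyword names are illustrative
def handler(message: str) -> dict:
    """The single endpoint `flash run` auto-discovers from this file."""
    return {"echo": message}

if __name__ == "__main__":
    # Local smoke test: no FastAPI app, router, or uvicorn startup needed.
    print(handler("hello"))
```

The point of the pattern is that this one file replaces the former main.py, router `__init__.py`, Pydantic model, and mothership.py quartet.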

Changes per example

| Example | Before | After | Key change |
| --- | --- | --- | --- |
| 01_hello_world | 4 files | 1 file | Trimmed gpu_worker.py |
| 02_cpu_worker | 4 files | 1 file | Trimmed cpu_worker.py |
| 03_mixed_workers | 9 files | 3 files | Flattened + new pipeline.py (LB endpoint) |
| 04_dependencies | 8 files | 2 files | Flattened workers/ |
| 01_text_to_speech | 6 files | 1 file | Flattened, dropped /tts/audio binary endpoint |
| 05_load_balancer | 7 files | 2 files | Flattened to gpu_lb.py + cpu_lb.py |
| 01_network_volumes | 6 files | 2 files | Inlined NetworkVolume, removed FastAPI deps |

What was removed

  • All main.py files (FastAPI app with routers, uvicorn startup)
  • All mothership.py files (LB config for the app)
  • All __init__.py router files (APIRouter + Pydantic request models)
  • All workers/ directory nesting

What was added

  • 03_mixed_workers/pipeline.py: CpuLiveLoadBalancer endpoint that orchestrates CPU preprocess -> GPU inference -> CPU postprocess (replaces the old main.py /classify endpoint)
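The data flow pipeline.py orchestrates can be sketched as plain functions. In the real example each stage is a separate @remote worker and the chaining endpoint is a CpuLiveLoadBalancer; the function bodies and names below are purely illustrative of the CPU -> GPU -> CPU shape, not the example's actual code.

```python
# Illustrative CPU -> GPU -> CPU pipeline; in the actual example each
# stage runs on its own worker and pipeline.py is the LB endpoint that
# calls them in sequence.
def cpu_preprocess(text: str) -> list[str]:
    # CPU stage 1: normalize and tokenize the input.
    return text.lower().split()

def gpu_inference(tokens: list[str]) -> list[float]:
    # Stand-in for model inference on the GPU worker.
    return [float(len(t)) for t in tokens]

def cpu_postprocess(scores: list[float]) -> dict:
    # CPU stage 2: aggregate raw scores into the response payload.
    return {"mean": sum(scores) / len(scores), "count": len(scores)}

def classify(text: str) -> dict:
    # The /classify-style endpoint: chain the three stages in order.
    return cpu_postprocess(gpu_inference(cpu_preprocess(text)))
```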

Test plan

  • flash run from each example directory discovers all @remote endpoints
  • Generated Swagger UI shows correct endpoints
  • QB endpoints accessible at /{file}/run_sync or /{file}/{function}/run_sync
  • LB endpoints accessible at their configured method/path

Remove FastAPI boilerplate (main.py, mothership.py, __init__.py).
Keep gpu_worker.py with config + @Remote function only.
flash run auto-discovers endpoints.

Remove FastAPI boilerplate (main.py, mothership.py, __init__.py).
Keep cpu_worker.py with config + @Remote function only.

…eline

Flatten workers/ to gpu_worker.py + cpu_worker.py.
Create pipeline.py as CpuLiveLoadBalancer endpoint for CPU->GPU->CPU orchestration.
Remove main.py, mothership.py, all __init__.py, workers/ directory.

Flatten workers/ to gpu_worker.py + cpu_worker.py.
Remove main.py, mothership.py, all __init__.py, workers/ directory.

Flatten workers/ to gpu_worker.py.
Remove main.py, mothership.py, all __init__.py, workers/ directory.
Drop /tts/audio binary endpoint (was router-only).

Flatten workers/ to gpu_lb.py + cpu_lb.py.
Remove main.py, mothership.py, workers/ directory.

…olume config

Flatten workers/ to gpu_worker.py + cpu_worker.py.
Inline NetworkVolume definition into each worker file.
Remove FastAPI dependencies from CPU worker (pure dict returns).
Remove main.py, workers/ directory.
Copilot AI left a comment

Pull request overview

This PR refactors 7 examples from the legacy FastAPI boilerplate pattern to the simplified flat-file flash init pattern, dramatically reducing complexity from 44 files to 12 files (-1466 lines / +119 lines). The refactoring replaces the old architecture (main.py + routers + Pydantic models + mothership.py) with standalone @remote decorated worker files that flash run auto-discovers.

Changes:

  • Eliminated all FastAPI boilerplate (main.py, routers, __init__.py files) in favor of direct @remote decorators
  • Removed all mothership.py configuration files and workers/ directory nesting
  • Added pipeline.py in 03_mixed_workers to demonstrate cross-worker orchestration via CpuLiveLoadBalancer

Reviewed changes

Copilot reviewed 40 out of 44 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| 01_getting_started/01_hello_world/gpu_worker.py | Simplified from 4 files to single standalone GPU worker with @remote decorator |
| 01_getting_started/01_hello_world/main.py | Removed FastAPI app boilerplate (no longer needed) |
| 01_getting_started/01_hello_world/mothership.py | Removed mothership load balancer configuration |
| 01_getting_started/01_hello_world/__init__.py | Removed package initialization |
| 01_getting_started/02_cpu_worker/cpu_worker.py | Simplified from 4 files to single standalone CPU worker |
| 01_getting_started/02_cpu_worker/main.py | Removed FastAPI app boilerplate |
| 01_getting_started/02_cpu_worker/mothership.py | Removed mothership configuration |
| 01_getting_started/02_cpu_worker/__init__.py | Removed package initialization |
| 01_getting_started/03_mixed_workers/gpu_worker.py | Simplified GPU inference worker |
| 01_getting_started/03_mixed_workers/cpu_worker.py | Simplified CPU preprocessing/postprocessing workers |
| 01_getting_started/03_mixed_workers/pipeline.py | New orchestration layer with CpuLiveLoadBalancer endpoint |
| 01_getting_started/03_mixed_workers/workers/gpu/__init__.py | Removed router boilerplate |
| 01_getting_started/03_mixed_workers/workers/cpu/__init__.py | Removed router boilerplate |
| 01_getting_started/03_mixed_workers/workers/__init__.py | Removed workers package |
| 01_getting_started/03_mixed_workers/main.py | Removed FastAPI app |
| 01_getting_started/03_mixed_workers/mothership.py | Removed mothership configuration |
| 01_getting_started/03_mixed_workers/__init__.py | Removed package initialization |
| 01_getting_started/04_dependencies/gpu_worker.py | Simplified dependency demonstration, removed dotenv from test code |
| 01_getting_started/04_dependencies/cpu_worker.py | Simplified dependency demonstration, removed dotenv from test code |
| 01_getting_started/04_dependencies/workers/gpu/__init__.py | Removed router boilerplate |
| 01_getting_started/04_dependencies/workers/cpu/__init__.py | Removed router boilerplate |
| 01_getting_started/04_dependencies/workers/__init__.py | Removed workers package |
| 01_getting_started/04_dependencies/main.py | Removed FastAPI app |
| 01_getting_started/04_dependencies/mothership.py | Removed mothership configuration |
| 01_getting_started/04_dependencies/__init__.py | Removed package initialization |
| 02_ml_inference/01_text_to_speech/gpu_worker.py | Flattened to single file, removed binary /tts/audio endpoint |
| 02_ml_inference/01_text_to_speech/workers/gpu/__init__.py | Removed router boilerplate with Pydantic models |
| 02_ml_inference/01_text_to_speech/workers/__init__.py | Removed workers package |
| 02_ml_inference/01_text_to_speech/main.py | Removed FastAPI app |
| 02_ml_inference/01_text_to_speech/mothership.py | Removed mothership configuration |
| 02_ml_inference/01_text_to_speech/__init__.py | Removed package initialization |
| 03_advanced_workers/05_load_balancer/gpu_lb.py | Flattened load-balanced GPU endpoints to standalone file |
| 03_advanced_workers/05_load_balancer/cpu_lb.py | Flattened load-balanced CPU endpoints to standalone file |
| 03_advanced_workers/05_load_balancer/workers/gpu/__init__.py | Removed router boilerplate |
| 03_advanced_workers/05_load_balancer/workers/cpu/__init__.py | Removed router boilerplate |
| 03_advanced_workers/05_load_balancer/workers/__init__.py | Removed workers package |
| 03_advanced_workers/05_load_balancer/main.py | Removed FastAPI app |
| 03_advanced_workers/05_load_balancer/mothership.py | Removed mothership configuration |
| 05_data_workflows/01_network_volumes/gpu_worker.py | Inlined NetworkVolume definition, simplified comments |
| 05_data_workflows/01_network_volumes/cpu_worker.py | Inlined NetworkVolume definition, changed to return base64-encoded images |
| 05_data_workflows/01_network_volumes/workers/gpu/__init__.py | Removed router boilerplate |
| 05_data_workflows/01_network_volumes/workers/cpu/__init__.py | Removed router boilerplate with HTML response |
| 05_data_workflows/01_network_volumes/workers/__init__.py | Removed shared NetworkVolume import |
| 05_data_workflows/01_network_volumes/main.py | Removed FastAPI app with lifespan management |
Comments suppressed due to low confidence (6)

05_data_workflows/01_network_volumes/gpu_worker.py:18

  • Inconsistent naming convention for worker configurations. This example uses a generic name "gpu_worker" while other examples use prefixed names like "01_01_gpu_worker" or "02_01_text_to_speech_gpu". Consider following the pattern of prefixing with the example number for consistency and to avoid potential naming conflicts when deploying multiple examples.

05_data_workflows/01_network_volumes/cpu_worker.py:12

  • Inconsistent naming convention for worker configurations. This example uses a generic name "cpu_worker" while other examples use prefixed names like "01_02_cpu_worker" or "01_03_mixed_workers_cpu". Consider following the pattern of prefixing with the example number for consistency and to avoid potential naming conflicts when deploying multiple examples.

05_data_workflows/01_network_volumes/cpu_worker.py:25

  • Missing error handling for when the directory doesn't exist. The os.listdir call on line 25 will raise FileNotFoundError if "/runpod-volume/generated_images" doesn't exist yet. Consider adding a try-except block or checking if the directory exists first using os.path.exists() and returning an appropriate response.

05_data_workflows/01_network_volumes/cpu_worker.py:39

  • Potential path traversal vulnerability. The file_id parameter is directly concatenated into the file path without sanitization, allowing users to potentially access files outside the intended directory by using "../" sequences. Consider validating that the file_id doesn't contain path separators or use Path.resolve() to check that the final path is within the expected directory.

05_data_workflows/01_network_volumes/gpu_worker.py:15

  • The NetworkVolume is defined separately in both gpu_worker.py and cpu_worker.py with identical names and sizes. While this is intentional for the flat-file pattern where each worker is self-contained, it's important to note that both workers must use exactly the same name parameter for the volume to be shared. Consider adding a comment clarifying that the volume name must match between workers that need to share data.

05_data_workflows/01_network_volumes/cpu_worker.py:50

  • The function now returns image data as base64-encoded JSON instead of binary Response. This change is consistent with the flat-file pattern but results in larger payloads (base64 encoding increases size by ~33%). For large images, this could impact performance. Consider documenting this tradeoff or providing guidance on when to use this pattern versus serving binary data directly.
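The missing-directory and path-traversal points above can both be addressed with stdlib-only code along these lines. The paths and function names are illustrative, not the example's actual code:

```python
import os
from pathlib import Path

# Volume mount path referenced by the review comments.
IMAGE_DIR = Path("/runpod-volume/generated_images")

def list_images(base: Path = IMAGE_DIR) -> list[str]:
    # Guard: the directory may not exist before the GPU worker has run.
    if not base.is_dir():
        return []
    return sorted(os.listdir(base))

def resolve_image(file_id: str, base: Path = IMAGE_DIR) -> Path:
    # Guard: reject traversal by resolving the candidate path and
    # checking it is still contained inside the base directory.
    candidate = (base / file_id).resolve()
    if base.resolve() not in candidate.parents:
        raise ValueError(f"invalid file_id: {file_id!r}")
    return candidate
```

A usage sketch: `list_images()` returns `[]` before any image exists, and `resolve_image("../../etc/passwd")` raises rather than escaping the volume.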


jhcipar force-pushed the refactor/ae-2210-simplified-examples branch from 196bfac to 1bd9b09 on February 20, 2026 at 16:06
- Remove Pydantic validation and FastAPI router references
- Update port 8000 to 8888, fix curl paths to /worker/run_sync pattern
- Major trim on 03_mixed_workers (370 lines) and 04_dependencies (304 lines)

… flat-file refactor

- Fix curl paths to /gpu_worker/run_sync pattern
- Trim verbose sections to match simplified examples

…README

- CPU worker: CpuLiveServerless -> CpuLiveLoadBalancer with explicit paths
- CPU endpoints: GET /images (list), GET /images/{file_name} (get single)
- GPU config name: gpu_worker -> 05_01_gpu_worker
- README: update curl examples and API Endpoints to match actual routes
- Add .flash/ to 01_hello_world gitignore
- Replace CLAUDE.md with worktree-specific template
- Add fastapi, uvicorn dependencies to uv.lock

main.py was a pre-flat-file pattern artifact that manually discovered
APIRouters from worker directories. With the flat-file refactor,
flash run handles discovery directly from individual worker files.
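The discovery step that replaces the old main.py router scan amounts to something like the following. This is purely illustrative of the mechanism (import each top-level file, collect decorated functions); it is not flash's actual implementation, and the `config` attribute is the same stand-in convention used for illustration, not the SDK's internal marker.

```python
import importlib.util
from pathlib import Path

def discover_handlers(directory: str) -> dict:
    """Illustrative sketch of flat-file discovery: import every top-level
    .py file in the directory and collect callables carrying a
    remote-style config attribute."""
    handlers = {}
    for file in Path(directory).glob("*.py"):
        spec = importlib.util.spec_from_file_location(file.stem, file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for name, obj in vars(module).items():
            if callable(obj) and hasattr(obj, "config"):
                # Route shape mirrors the /{file}/{function} pattern
                # mentioned in the test plan.
                handlers[f"/{file.stem}/{name}"] = obj
    return handlers
```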
Copilot AI left a comment

Pull request overview

Copilot reviewed 52 out of 57 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

05_data_workflows/01_network_volumes/gpu_worker.py:60

  • SimpleSD.__init__() assumes MODEL_PATH is set and that the directory exists (os.listdir(model_path)). When running the advertised local test (python gpu_worker.py), MODEL_PATH may be unset and/or the directory may not exist, causing a crash. Consider defaulting model_path to MODEL_PATH, ensuring the directory exists (mkdir), and guarding the os.listdir log statement.


Each example already has pyproject.toml as the authoritative dep spec.
Runtime deps are declared in @Remote(dependencies=[...]). The root
requirements.txt (generated lockfile) and root .env.example remain.
The per-example copies were orphaned by the flat-file refactor.

Remove fastapi, uvicorn from 02_ml_inference/01_text_to_speech (worker
imports neither). Strip fastapi, uvicorn, numpy, pillow, python-multipart,
structlog from root deps -- flash-generated server.py pulls these
internally, and @Remote(dependencies=[...]) handles runtime deps.
Keep runpod-flash as the sole project dependency.

Replace generic request: dict with numbers: list[float], add
empty-list guards, and align __main__ test block with new signature.
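The typed-signature change and empty-list guard described in that commit look roughly like this (the function name and return shape are assumed for illustration):

```python
def average(numbers: list[float]) -> dict:
    # Typed parameter replaces the old generic `request: dict` payload.
    if not numbers:
        # Empty-list guard: return an error instead of dividing by zero.
        return {"error": "numbers must be a non-empty list"}
    return {"mean": sum(numbers) / len(numbers), "count": len(numbers)}

if __name__ == "__main__":
    # __main__ test block aligned with the new signature.
    print(average([1.0, 2.0, 3.0]))
```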
Strip pyproject.toml to single runpod-flash dependency and ruff dev dep.
Replace full lockfile with minimal requirements.txt. Remove dead Makefile
targets (typecheck, quality-check-strict, dep sync scripts). Reformat to
ruff defaults.

Remove dep-sync step from quality.yml (tomli_w removed, per-example deps
deleted). Replace quality-check-strict with quality-check in test matrix.
Delete orphaned sync_example_deps.py script.

Replace CLAUDE.md worktree template with flat-file pattern guidelines.

Guard os.listdir against missing directory in network volumes cpu_worker.
Remove stale localhost:8000 reference in cpu_worker README. Fix diagram
text to match actual code behavior.
deanq merged commit 654ca96 into main on Feb 20, 2026
6 checks passed
deanq deleted the refactor/ae-2210-simplified-examples branch on February 20, 2026 at 23:45