refactor: simplify examples to flat-file flash init pattern #29
Remove FastAPI boilerplate (main.py, mothership.py, __init__.py). Keep gpu_worker.py with config + @remote function only. flash run auto-discovers endpoints.
Remove FastAPI boilerplate (main.py, mothership.py, __init__.py). Keep cpu_worker.py with config + @remote function only.
…eline Flatten workers/ to gpu_worker.py + cpu_worker.py. Create pipeline.py as CpuLiveLoadBalancer endpoint for CPU->GPU->CPU orchestration. Remove main.py, mothership.py, all __init__.py, workers/ directory.
Flatten workers/ to gpu_worker.py + cpu_worker.py. Remove main.py, mothership.py, all __init__.py, workers/ directory.
Flatten workers/ to gpu_worker.py. Remove main.py, mothership.py, all __init__.py, workers/ directory. Drop /tts/audio binary endpoint (was router-only).
Flatten workers/ to gpu_lb.py + cpu_lb.py. Remove main.py, mothership.py, workers/ directory.
…olume config Flatten workers/ to gpu_worker.py + cpu_worker.py. Inline NetworkVolume definition into each worker file. Remove FastAPI dependencies from CPU worker (pure dict returns). Remove main.py, workers/ directory.
Pull request overview
This PR refactors 7 examples from the legacy FastAPI boilerplate pattern to the simplified flat-file flash init pattern, dramatically reducing complexity from 44 files to 12 files (-1466 lines / +119 lines). The refactoring replaces the old architecture (main.py + routers + Pydantic models + mothership.py) with standalone @remote decorated worker files that flash run auto-discovers.
Changes:
- Eliminated all FastAPI boilerplate (main.py, routers, __init__.py files) in favor of direct @remote decorators
- Removed all mothership.py configuration files and workers/ directory nesting
- Added pipeline.py in 03_mixed_workers to demonstrate cross-worker orchestration via CpuLiveLoadBalancer
Reviewed changes
Copilot reviewed 40 out of 44 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| 01_getting_started/01_hello_world/gpu_worker.py | Simplified from 4 files to single standalone GPU worker with @remote decorator |
| 01_getting_started/01_hello_world/main.py | Removed FastAPI app boilerplate (no longer needed) |
| 01_getting_started/01_hello_world/mothership.py | Removed mothership load balancer configuration |
| 01_getting_started/01_hello_world/__init__.py | Removed package initialization |
| 01_getting_started/02_cpu_worker/cpu_worker.py | Simplified from 4 files to single standalone CPU worker |
| 01_getting_started/02_cpu_worker/main.py | Removed FastAPI app boilerplate |
| 01_getting_started/02_cpu_worker/mothership.py | Removed mothership configuration |
| 01_getting_started/02_cpu_worker/__init__.py | Removed package initialization |
| 01_getting_started/03_mixed_workers/gpu_worker.py | Simplified GPU inference worker |
| 01_getting_started/03_mixed_workers/cpu_worker.py | Simplified CPU preprocessing/postprocessing workers |
| 01_getting_started/03_mixed_workers/pipeline.py | New orchestration layer with CpuLiveLoadBalancer endpoint |
| 01_getting_started/03_mixed_workers/workers/gpu/__init__.py | Removed router boilerplate |
| 01_getting_started/03_mixed_workers/workers/cpu/__init__.py | Removed router boilerplate |
| 01_getting_started/03_mixed_workers/workers/__init__.py | Removed workers package |
| 01_getting_started/03_mixed_workers/main.py | Removed FastAPI app |
| 01_getting_started/03_mixed_workers/mothership.py | Removed mothership configuration |
| 01_getting_started/03_mixed_workers/__init__.py | Removed package initialization |
| 01_getting_started/04_dependencies/gpu_worker.py | Simplified dependency demonstration, removed dotenv from test code |
| 01_getting_started/04_dependencies/cpu_worker.py | Simplified dependency demonstration, removed dotenv from test code |
| 01_getting_started/04_dependencies/workers/gpu/__init__.py | Removed router boilerplate |
| 01_getting_started/04_dependencies/workers/cpu/__init__.py | Removed router boilerplate |
| 01_getting_started/04_dependencies/workers/__init__.py | Removed workers package |
| 01_getting_started/04_dependencies/main.py | Removed FastAPI app |
| 01_getting_started/04_dependencies/mothership.py | Removed mothership configuration |
| 01_getting_started/04_dependencies/__init__.py | Removed package initialization |
| 02_ml_inference/01_text_to_speech/gpu_worker.py | Flattened to single file, removed binary /tts/audio endpoint |
| 02_ml_inference/01_text_to_speech/workers/gpu/__init__.py | Removed router boilerplate with Pydantic models |
| 02_ml_inference/01_text_to_speech/workers/__init__.py | Removed workers package |
| 02_ml_inference/01_text_to_speech/main.py | Removed FastAPI app |
| 02_ml_inference/01_text_to_speech/mothership.py | Removed mothership configuration |
| 02_ml_inference/01_text_to_speech/__init__.py | Removed package initialization |
| 03_advanced_workers/05_load_balancer/gpu_lb.py | Flattened load-balanced GPU endpoints to standalone file |
| 03_advanced_workers/05_load_balancer/cpu_lb.py | Flattened load-balanced CPU endpoints to standalone file |
| 03_advanced_workers/05_load_balancer/workers/gpu/__init__.py | Removed router boilerplate |
| 03_advanced_workers/05_load_balancer/workers/cpu/__init__.py | Removed router boilerplate |
| 03_advanced_workers/05_load_balancer/workers/__init__.py | Removed workers package |
| 03_advanced_workers/05_load_balancer/main.py | Removed FastAPI app |
| 03_advanced_workers/05_load_balancer/mothership.py | Removed mothership configuration |
| 05_data_workflows/01_network_volumes/gpu_worker.py | Inlined NetworkVolume definition, simplified comments |
| 05_data_workflows/01_network_volumes/cpu_worker.py | Inlined NetworkVolume definition, changed to return base64-encoded images |
| 05_data_workflows/01_network_volumes/workers/gpu/__init__.py | Removed router boilerplate |
| 05_data_workflows/01_network_volumes/workers/cpu/__init__.py | Removed router boilerplate with HTML response |
| 05_data_workflows/01_network_volumes/workers/__init__.py | Removed shared NetworkVolume import |
| 05_data_workflows/01_network_volumes/main.py | Removed FastAPI app with lifespan management |
Comments suppressed due to low confidence (6)
05_data_workflows/01_network_volumes/gpu_worker.py:18 - Inconsistent naming convention for worker configurations. This example uses a generic name "gpu_worker" while other examples use prefixed names like "01_01_gpu_worker" or "02_01_text_to_speech_gpu". Consider following the pattern of prefixing with the example number for consistency and to avoid potential naming conflicts when deploying multiple examples.
05_data_workflows/01_network_volumes/cpu_worker.py:12 - Inconsistent naming convention for worker configurations. This example uses a generic name "cpu_worker" while other examples use prefixed names like "01_02_cpu_worker" or "01_03_mixed_workers_cpu". Consider following the pattern of prefixing with the example number for consistency and to avoid potential naming conflicts when deploying multiple examples.
05_data_workflows/01_network_volumes/cpu_worker.py:25 - Missing error handling for when the directory doesn't exist. The os.listdir call on line 25 will raise FileNotFoundError if "/runpod-volume/generated_images" doesn't exist yet. Consider adding a try-except block or checking if the directory exists first using os.path.exists() and returning an appropriate response.
05_data_workflows/01_network_volumes/cpu_worker.py:39 - Potential path traversal vulnerability. The file_id parameter is directly concatenated into the file path without sanitization, allowing users to potentially access files outside the intended directory by using "../" sequences. Consider validating that the file_id doesn't contain path separators or use Path.resolve() to check that the final path is within the expected directory.
05_data_workflows/01_network_volumes/gpu_worker.py:15 - The NetworkVolume is defined separately in both gpu_worker.py and cpu_worker.py with identical names and sizes. While this is intentional for the flat-file pattern where each worker is self-contained, it's important to note that both workers must use exactly the same name parameter for the volume to be shared. Consider adding a comment clarifying that the volume name must match between workers that need to share data.
05_data_workflows/01_network_volumes/cpu_worker.py:50 - The function now returns image data as base64-encoded JSON instead of binary Response. This change is consistent with the flat-file pattern but results in larger payloads (base64 encoding increases size by ~33%). For large images, this could impact performance. Consider documenting this tradeoff or providing guidance on when to use this pattern versus serving binary data directly.
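The three cpu_worker.py suggestions above (missing-directory guard, path traversal check, base64 payload size) can be sketched in plain Python. This is a minimal illustration, not the example's actual code: only the directory path comes from the review comment, and the helper names `list_images`, `safe_image_path`, and `encode_image` are hypothetical.

```python
import base64
import os
from pathlib import Path

# Directory named in the review comment; all helper names below are hypothetical.
IMAGE_DIR = Path("/runpod-volume/generated_images")


def list_images(image_dir: Path = IMAGE_DIR) -> list[str]:
    """Return file names, or an empty list if the directory does not exist yet."""
    if not image_dir.is_dir():
        return []
    return sorted(os.listdir(image_dir))


def safe_image_path(file_id: str, image_dir: Path = IMAGE_DIR) -> Path:
    """Resolve file_id inside image_dir and reject anything that escapes it."""
    base = image_dir.resolve()
    candidate = (base / file_id).resolve()
    # "../x" and absolute paths both resolve outside base and are rejected.
    if candidate == base or not candidate.is_relative_to(base):
        raise ValueError(f"invalid file_id: {file_id!r}")
    return candidate


def encode_image(data: bytes) -> dict:
    """Wrap raw bytes in a JSON-friendly dict; base64 adds roughly one third."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "image_base64": encoded,
        "raw_bytes": len(data),
        "encoded_chars": len(encoded),
    }
```

`Path.is_relative_to` requires Python 3.9 or later. Returning the raw and encoded sizes alongside the payload makes the base64 overhead visible to callers deciding whether the JSON route is acceptable for large images.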
196bfac to 1bd9b09
- Remove Pydantic validation and FastAPI router references
- Update port 8000 to 8888, fix curl paths to /worker/run_sync pattern
- Major trim on 03_mixed_workers (370 lines) and 04_dependencies (304 lines)
… flat-file refactor
- Fix curl paths to /gpu_worker/run_sync pattern
- Trim verbose sections to match simplified examples
…README
- CPU worker: CpuLiveServerless -> CpuLiveLoadBalancer with explicit paths
- CPU endpoints: GET /images (list), GET /images/{file_name} (get single)
- GPU config name: gpu_worker -> 05_01_gpu_worker
- README: update curl examples and API Endpoints to match actual routes
- Add .flash/ to 01_hello_world gitignore
- Replace CLAUDE.md with worktree-specific template
- Add fastapi, uvicorn dependencies to uv.lock
main.py was a pre-flat-file pattern artifact that manually discovered APIRouters from worker directories. With the flat-file refactor, flash run handles discovery directly from individual worker files.
Pull request overview
Copilot reviewed 52 out of 57 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
05_data_workflows/01_network_volumes/gpu_worker.py:60
SimpleSD.__init__() assumes MODEL_PATH is set and that the directory exists (os.listdir(model_path)). When running the advertised local test (python gpu_worker.py), MODEL_PATH may be unset and/or the directory may not exist, causing a crash. Consider defaulting model_path to MODEL_PATH, ensuring the directory exists (mkdir), and guarding the os.listdir log statement.
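One way to read this suggestion, as a minimal sketch: the class body below is a stand-in that shows only the path handling, not the example's real SimpleSD. The `DEFAULT_MODEL_PATH` fallback and the `cached_files` attribute are assumptions added for illustration.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical fallback, used only when MODEL_PATH is unset.
DEFAULT_MODEL_PATH = "./models"


class SimpleSD:
    """Stand-in for the example's class; only the path handling is shown."""

    def __init__(self, model_path: Optional[str] = None):
        # Prefer the explicit argument, then the environment, then the fallback.
        resolved = model_path or os.environ.get("MODEL_PATH") or DEFAULT_MODEL_PATH
        self.model_path = Path(resolved)
        # Create the directory first so the listing below cannot fail on a fresh run.
        self.model_path.mkdir(parents=True, exist_ok=True)
        self.cached_files = sorted(os.listdir(self.model_path))
        print(f"Model cache at {self.model_path}: {len(self.cached_files)} file(s)")
```

Creating the directory before listing it removes the need for a separate existence check around the log statement.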
Each example already has pyproject.toml as the authoritative dep spec. Runtime deps are declared in @remote(dependencies=[...]). The root requirements.txt (generated lockfile) and root .env.example remain. The per-example copies were orphaned by the flat-file refactor.
Remove fastapi, uvicorn from 02_ml_inference/01_text_to_speech (worker imports neither). Strip fastapi, uvicorn, numpy, pillow, python-multipart, structlog from root deps -- flash-generated server.py pulls these internally, and @remote(dependencies=[...]) handles runtime deps. Keep only runpod-flash as the sole project dependency.
Replace generic request: dict with numbers: list[float], add empty-list guards, and align __main__ test block with new signature.
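The shape of that change might look like the sketch below. The function name `summarize` and its return fields are hypothetical, since the commit message names only the new `numbers: list[float]` parameter, the empty-list guard, and the `__main__` test block; the @remote decorator is omitted so the snippet runs without runpod-flash installed.

```python
def summarize(numbers: list[float]) -> dict:
    """Typed signature in place of a generic request dict (hypothetical body)."""
    # Empty-list guard: avoid dividing by zero when computing the mean.
    if not numbers:
        return {"error": "numbers must be a non-empty list"}
    total = sum(numbers)
    return {"count": len(numbers), "sum": total, "mean": total / len(numbers)}


if __name__ == "__main__":
    # Local test block aligned with the typed signature.
    print(summarize([1.0, 2.0, 3.0]))
    print(summarize([]))
```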
Strip pyproject.toml to single runpod-flash dependency and ruff dev dep. Replace full lockfile with minimal requirements.txt. Remove dead Makefile targets (typecheck, quality-check-strict, dep sync scripts). Reformat to ruff defaults.
Remove dep-sync step from quality.yml (tomli_w removed, per-example deps deleted). Replace quality-check-strict with quality-check in test matrix. Delete orphaned sync_example_deps.py script.
Replace CLAUDE.md worktree template with flat-file pattern guidelines. Guard os.listdir against missing directory in network volumes cpu_worker. Remove stale localhost:8000 reference in cpu_worker README. Fix diagram text to match actual code behavior.
Summary
- flash init flat-file pattern
- @remote decorated files that flash run auto-discovers
Changes per example
What was removed
- main.py files (FastAPI app with routers, uvicorn startup)
- mothership.py files (LB config for the app)
- __init__.py router files (APIRouter + Pydantic request models)
- workers/ directory nesting
What was added
- 03_mixed_workers/pipeline.py: CpuLiveLoadBalancer endpoint that orchestrates CPU preprocess -> GPU inference -> CPU postprocess (replaces the old main.py /classify endpoint)
Test plan
- flash run from each example directory discovers all @remote endpoints
- /{file}/run_sync or /{file}/{function}/run_sync
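For checking the test plan by hand, the two route shapes can be built with a small helper. This is a sketch under assumptions: the helper `run_sync_url` is hypothetical, the default port 8888 is taken from the commit message about updating port 8000 to 8888, and the `preprocess` function name in the usage lines is illustrative only.

```python
from typing import Optional


def run_sync_url(
    file: str,
    function: Optional[str] = None,
    base: str = "http://localhost:8888",  # port assumed from the README update
) -> str:
    """Build /{file}/run_sync or /{file}/{function}/run_sync under base."""
    parts = [file] if function is None else [file, function]
    return f"{base}/" + "/".join(parts) + "/run_sync"


if __name__ == "__main__":
    print(run_sync_url("gpu_worker"))
    print(run_sync_url("cpu_worker", "preprocess"))
```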