refactor: simplify flash init skeleton for zero-boilerplate flash run by deanq · Pull Request #208 · runpod/flash

deanq · 2026-02-19T22:09:54Z

Summary

Replace the old multi-directory skeleton (main.py, mothership.py, workers/) with three flat files: gpu_worker.py, cpu_worker.py, lb_worker.py
flash run auto-discovers @remote functions — no FastAPI boilerplate, routers, or main.py needed
Rewrite skeleton README with uv setup, QB/LB worker examples, GpuType reference table, and auto-provision tips

Changes

Skeleton template:

Delete main.py, mothership.py, workers/, .ruff_cache/
Add gpu_worker.py (QB GPU), cpu_worker.py (QB CPU), lb_worker.py (LB HTTP)
Simplify pyproject.toml (remove fastapi/uvicorn deps)
Add .flash/ to .gitignore
Rewrite README.md for flat-file approach

CLI:

Update flash init panel output and next steps for new structure
Add Ctrl+C cleanup hint to flash run startup output

flash run engine (prior commits):

Zero-boilerplate dev server: scans @remote functions, generates .flash/server.py
LB route dispatch through LoadBalancerSlsStub
Hot-reload on file changes
Auto-provision with --auto-provision flag
Endpoint cleanup on Ctrl+C

Tests:

Update skeleton tests for new file structure
Add flash run unit tests

Test plan

make quality-check passes (1043 tests, 68% coverage)
flash init test_project creates flat structure
cd test_project && flash run starts dev server
QB endpoints respond at /gpu_worker/run_sync and /cpu_worker/run_sync
LB endpoints respond at /lb_worker/process and /lb_worker/health
Ctrl+C cleans up provisioned endpoints
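
The QB route from the test plan can be exercised with a plain HTTP POST once the dev server is up. The sketch below is illustrative only: the host and port in `BASE_URL` and the payload fields are assumptions, and the `{"input": ...}` envelope follows the `body.get("input", body)` convention mentioned in the commit notes for this PR.

```python
import json
import urllib.request

# Assumption: placeholder address; use whatever `flash run` prints at startup.
BASE_URL = "http://localhost:8888"


def build_run_sync_request(worker: str, payload: dict) -> urllib.request.Request:
    """Build a POST to /<worker>/run_sync with the payload under "input"."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{worker}/run_sync",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_run_sync(worker: str, payload: dict) -> dict:
    """Send the request and decode the JSON reply (needs a running dev server)."""
    with urllib.request.urlopen(build_run_sync_request(worker, payload)) as resp:
        return json.load(resp)
```

`build_run_sync_request("gpu_worker", {...})` targets `/gpu_worker/run_sync`; swapping the worker name gives the CPU route.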

Copilot

Pull request overview

Refactors the Flash “init + run” experience to be zero-boilerplate by removing the FastAPI-first skeleton, auto-discovering @remote functions, and generating a local dev server under .flash/ with hot-reload and LB dispatch support.

Changes:

Replace the skeleton template with a flat gpu_worker.py / cpu_worker.py / lb_worker.py layout and rewrite the skeleton README accordingly.
Rework flash run to scan for @remote functions, generate .flash/server.py, run uvicorn with targeted reload, and clean up live endpoints on Ctrl+C.
Update scanner/manifest/build plumbing to support file-path-derived routing fields and LB handler generation; adjust/add unit tests.

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 6 comments.

File	Description
tests/unit/test_skeleton.py	Updates skeleton expectations to the new flat-file template layout.
tests/unit/resources/test_serverless.py	Adds unit coverage for live-provisioning deploy checks and payload exclude behavior.
tests/unit/cli/test_run.py	Adds unit coverage for `flash run` server generation, reload behavior, watcher, and LB route generation.
tests/unit/cli/commands/build_utils/test_path_utilities.py	New tests for file-path → URL/module/resource naming utilities and LB handler detection.
tests/unit/cli/commands/build_utils/test_manifest_mothership.py	Removes legacy mothership manifest tests (mothership concept removed).
tests/integration/test_run_auto_provision.py	Removes old integration tests tied to the FastAPI entrypoint model.
src/runpod_flash/stubs/registry.py	Adds stubbing support for `CpuLiveLoadBalancer` via `LoadBalancerSlsStub`.
src/runpod_flash/stubs/load_balancer_sls.py	Broadens `/execute` routing decision to cover all live resources via `LiveServerlessMixin`.
src/runpod_flash/core/resources/serverless.py	Skips health check during live provisioning; excludes `template` when `templateId` set; injects `FLASH_MODULE_PATH` for LB deploys.
src/runpod_flash/core/resources/resource_manager.py	Removes noisy URL logging during deploy/get-or-deploy flows.
src/runpod_flash/core/resources/load_balancer_sls_resource.py	Promotes LB deploy logs to `info` with endpoint URL output.
src/runpod_flash/core/api/runpod.py	Tweaks GraphQL logging messages for endpoint save operations.
src/runpod_flash/client.py	Adds LB route-handler passthrough behavior for `@remote(method=..., path=...)` LB functions.
src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py	Removes legacy GPU worker skeleton module.
src/runpod_flash/cli/utils/skeleton_template/workers/gpu/__init__.py	Removes legacy FastAPI router wrapper for GPU worker.
src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py	Removes legacy CPU worker skeleton module.
src/runpod_flash/cli/utils/skeleton_template/workers/cpu/__init__.py	Removes legacy FastAPI router wrapper for CPU worker.
src/runpod_flash/cli/utils/skeleton_template/pyproject.toml	Simplifies skeleton pyproject (removes FastAPI/uvicorn + dev tooling from template).
src/runpod_flash/cli/utils/skeleton_template/mothership.py	Removes legacy mothership config from the skeleton template.
src/runpod_flash/cli/utils/skeleton_template/main.py	Removes legacy `main.py` FastAPI app from the skeleton template.
src/runpod_flash/cli/utils/skeleton_template/lb_worker.py	New skeleton LB worker example using `@remote(method=..., path=...)`.
src/runpod_flash/cli/utils/skeleton_template/gpu_worker.py	New skeleton GPU QB worker example.
src/runpod_flash/cli/utils/skeleton_template/cpu_worker.py	New skeleton CPU QB worker example.
src/runpod_flash/cli/utils/skeleton_template/README.md	Rewrites template docs for uv + flat files + new routes; adds GPU/CPU references.
src/runpod_flash/cli/utils/skeleton_template/.gitignore	Adds `.flash/` to template gitignore.
src/runpod_flash/cli/commands/run.py	Implements scanner-driven `.flash/server.py` generation, targeted reload, auto-provision, and Ctrl+C cleanup.
src/runpod_flash/cli/commands/init.py	Updates init output to reflect the new skeleton structure and removes mothership steps.
src/runpod_flash/cli/commands/build_utils/scanner.py	Adds file-path utilities and LB route-handler metadata; excludes `__init__.py` from scanning.
src/runpod_flash/cli/commands/build_utils/manifest.py	Removes mothership detection; adds `file_path`, `local_path_prefix`, `module_path` fields per resource.
src/runpod_flash/cli/commands/build_utils/lb_handler_generator.py	Simplifies LB handler lifespan (removes mothership reconciliation logic).
src/runpod_flash/cli/commands/build.py	Generates LB handlers during build; loosens project validation to any Python-containing dir.
src/runpod_flash/cli/commands/_run_server_helpers.py	Adds helper layer for LB route dispatch and body→kwargs mapping in generated server.
PRD.md	Adds a PRD/spec describing the intended zero-boilerplate model and route conventions.

src/runpod_flash/cli/commands/run.py

src/runpod_flash/stubs/load_balancer_sls.py

src/runpod_flash/cli/utils/skeleton_template/README.md

PRD.md

src/runpod_flash/cli/commands/run.py

Copilot

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 1 comment.

src/runpod_flash/cli/commands/run.py

Copilot

Pull request overview

Copilot reviewed 37 out of 38 changed files in this pull request and generated 3 comments.

src/runpod_flash/cli/commands/build_utils/manifest.py

src/runpod_flash/cli/commands/build.py

PRD.md

Copilot

Pull request overview

Copilot reviewed 43 out of 44 changed files in this pull request and generated 9 comments.

src/runpod_flash/cli/commands/run.py

src/runpod_flash/stubs/dependency_resolver.py

src/runpod_flash/core/resources/serverless.py

src/runpod_flash/cli/commands/_run_server_helpers.py

src/runpod_flash/cli/commands/run.py

src/runpod_flash/stubs/dependency_resolver.py

…covery

LB @remote functions (with method= and path=) now return the decorated function unwrapped with __is_lb_route_handler__=True. The function body executes directly on the LB endpoint server rather than being dispatched as a remote stub. QB stubs inside the body are unaffected.

Scanner gains three path utilities (file_to_url_prefix, file_to_resource_name, file_to_module_path) that convert file paths to URL prefixes, resource names, and dotted module paths respectively.

RemoteFunctionMetadata gains is_lb_route_handler to distinguish LB route handlers from QB remote stubs during discovery.
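
As a rough illustration of the three path utilities named above, here is a minimal sketch. Only the function names come from the commit message; the signatures, the use of `PurePosixPath`, and in particular the underscore-joined resource name are assumptions, not the scanner's actual implementation.

```python
from pathlib import PurePosixPath


def _relative_stem(file_path: str, project_root: str) -> PurePosixPath:
    """Path relative to the project root with the .py suffix removed."""
    return PurePosixPath(file_path).relative_to(project_root).with_suffix("")


def file_to_url_prefix(file_path: str, project_root: str) -> str:
    """gpu_worker.py -> /gpu_worker; sub/cpu_worker.py -> /sub/cpu_worker."""
    return "/" + "/".join(_relative_stem(file_path, project_root).parts)


def file_to_module_path(file_path: str, project_root: str) -> str:
    """sub/cpu_worker.py -> sub.cpu_worker (dotted import path)."""
    return ".".join(_relative_stem(file_path, project_root).parts)


def file_to_resource_name(file_path: str, project_root: str) -> str:
    """Assumed convention: join path parts with underscores."""
    return "_".join(_relative_stem(file_path, project_root).parts)
```

All three derive from the same relative stem, so a file's URL prefix, module path, and resource name stay in step when the file moves.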

Remove _serialize_routes, _create_mothership_resource, and _create_mothership_from_explicit — all referenced unimported symbols and caused F821 lint errors. The manifest now emits a flat resources dict with file_path, local_path_prefix, and module_path per resource; no is_mothership flag.

flash run now scans the project for all @remote functions, generates .flash/server.py with routes derived from file paths, and starts uvicorn with --app-dir .flash/. Route convention: gpu_worker.py -> /gpu_worker/run and /gpu_worker/run_sync; subdirectory files produce matching URL prefixes.

Cleanup on Ctrl+C is fixed: _cleanup_live_endpoints now reads .runpod/resources.pkl written by the uvicorn subprocess and deprovisions all live- prefixed endpoints, removing the dead in-process _SESSION_ENDPOINTS approach which never received data from the subprocess.

…project validation

LBHandlerGenerator is now called from run_build() for all is_load_balanced resources, wiring the build pipeline to the new module_path-based handler generation.

validate_project_structure switches from glob to rglob so projects with files only in subdirectories (e.g. 00_multi_resource) are not incorrectly rejected.

lb_handler_generator loses the mothership reconciliation lifespan (StateManagerClient, reconcile_children) in favour of a clean startup/shutdown lifespan.
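
The glob-to-rglob change can be seen in a small standalone sketch. `has_python_files` is a hypothetical helper written for this illustration and is not the body of `validate_project_structure`; it only shows why a non-recursive check rejects a project whose Python files all live in subdirectories.

```python
import tempfile
from pathlib import Path


def has_python_files(project_dir: Path, recursive: bool) -> bool:
    """True if the directory contains any .py file (optionally searching subdirs)."""
    matches = project_dir.rglob("*.py") if recursive else project_dir.glob("*.py")
    return any(matches)


def demo() -> tuple[bool, bool]:
    """Build a project with files only in a subdirectory and check both modes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        nested = root / "00_multi_resource"
        nested.mkdir()
        (nested / "gpu_worker.py").write_text("# worker\n")
        return has_python_files(root, recursive=False), has_python_files(root, recursive=True)
```

The non-recursive check misses `00_multi_resource/gpu_worker.py`, while the recursive one finds it.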

is_deployed skips the health check when FLASH_IS_LIVE_PROVISIONING=true. Newly created endpoints can fail RunPod's health API for a few seconds after creation (propagation delay), causing get_or_deploy_resource to trigger a spurious re-deploy on the second request (e.g. /run_sync immediately after /run).

_payload_exclude now excludes template when templateId is already set. After first deployment _do_deploy sets templateId on the config object while the set_serverless_template validator has already set template at construction time. Sending both fields in the same payload causes RunPod to return 'You can only provide one of templateId or template.'

Also adds _get_module_path helper and injects FLASH_MODULE_PATH into LB endpoint environment at deploy time so the deployed handler can import the correct user module.
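
The templateId/template rule can be sketched with plain dictionaries. The real `_payload_exclude` is a method on the resource config class; `payload_exclude` and `build_payload` below are hypothetical stand-ins that only show the precedence rule described above.

```python
def payload_exclude(config: dict) -> set[str]:
    """Field names to drop from the payload; templateId takes precedence."""
    exclude: set[str] = set()
    if config.get("templateId") and config.get("template") is not None:
        exclude.add("template")
    return exclude


def build_payload(config: dict) -> dict:
    """Serialize the config without excluded or unset fields."""
    skip = payload_exclude(config)
    return {k: v for k, v in config.items() if k not in skip and v is not None}
```

Before the first deploy only `template` is set, so it is sent as-is; after `templateId` appears on the same object, `template` is dropped and the two fields never travel together.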

Parent process watches project .py files via watchfiles and regenerates .flash/server.py on change. Uvicorn now watches only .flash/server.py instead of the whole project, so it reloads exactly once per change with the updated routes visible.

- Add _watch_and_regenerate() background thread using watchfiles
- Change --reload-dir from '.' to '.flash', --reload-include to 'server.py'
- Start watcher thread when reload=True, stop on KeyboardInterrupt/Exception
- Add TestRunCommandHotReload and TestWatchAndRegenerate test classes

watchfiles emits DEBUG-level messages ("all changes filtered out", "rust notify timeout") that are correct behavior but should not be visible to users. Silence the watchfiles logger at WARNING in _watch_and_regenerate() — scoped to that namespace only.
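
Scoping the change to one logger namespace needs only the standard library. In this minimal sketch the helper name `quiet_watchfiles_logger` is made up for illustration; the PR places the equivalent call inside `_watch_and_regenerate()`.

```python
import logging


def quiet_watchfiles_logger() -> None:
    """Raise the threshold for the "watchfiles" namespace only."""
    logging.getLogger("watchfiles").setLevel(logging.WARNING)
```

Child loggers such as `watchfiles.main` inherit the WARNING threshold, while unrelated loggers keep whatever level the application configured.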

FastAPI treats `body: dict` as a required JSON body. GET/HEAD routes must be zero-arg so Swagger UI and browsers do not attempt to send a body, which triggers a fetch TypeError. Split the LB route code generator in _generate_flash_server() on method: get/head emit no-arg handlers; all other methods keep body: dict.
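
The split on HTTP method might look roughly like the sketch below. It is not the code in `_generate_flash_server()`: `render_lb_handler` and the `_handler` suffix are hypothetical, and the emitted call body is simplified to show only the signature difference.

```python
BODYLESS_METHODS = {"get", "head"}


def render_lb_handler(method: str, path: str, fn_name: str) -> str:
    """Return source text for one generated FastAPI route handler."""
    verb = method.lower()
    decorator = f'@app.{verb}("{path}")'
    if verb in BODYLESS_METHODS:
        # No parameters: Swagger UI and browsers will not try to send a body.
        signature = f"async def {fn_name}_handler():"
        call = f"    return await {fn_name}()"
    else:
        # FastAPI treats `body: dict` as a required JSON body.
        signature = f"async def {fn_name}_handler(body: dict):"
        call = f"    return await {fn_name}(body)"
    return "\n".join([decorator, signature, call])
```

For example, `render_lb_handler("GET", "/lb_worker/health", "health")` yields a zero-argument handler, while the POST variant for `/lb_worker/process` keeps `body: dict`.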

…ision

LB route handlers were executing locally in the dev server process instead of forwarding to the deployed LB endpoint. The @remote decorator returns LB handlers unwrapped (passthrough) because in a deployed pod the body IS the HTTP handler, but in flash run there is no deployed pod.

Changes:

- Add _run_server_helpers.py with lb_proxy() that uses ResourceManager.get_or_deploy_resource() for on-demand provisioning and get_authenticated_httpx_client() for auth headers
- Generate proxy handlers for all LB routes (any HTTP method) that forward requests to the deployed endpoint transparently
- Import resource config variables (not function bodies) for LB workers so the actual DeployableResource object is passed to lb_proxy
- Restore --auto-provision flag dropped in 35cfa6e, using existing ResourceDiscovery and DeploymentOrchestrator to provision all endpoints upfront and eliminate cold-start latency
- Replace TestGenerateFlashServer tests with proxy-aware assertions

ResourceDiscovery._import_module() uses importlib to execute each file, but cross-module imports (e.g. "from longruns.stage1 import ...") fail when the project root isn't on sys.path. This caused --auto-provision to silently skip LB endpoints whose files import from sibling packages.

Cleanup on server stop now prints a summary line with undeployed count and wall-clock duration, matching the provisioning output format.

…proxy

Replace lb_proxy (transparent HTTP forwarding) with lb_execute which uses LoadBalancerSlsStub's /execute dispatch path. This fixes 404s on CpuLiveLoadBalancer resources where the remote container has no user routes, only the /execute endpoint that accepts serialized function code.

- Change isinstance check from LiveLoadBalancer to LiveServerlessMixin so all live resource types (including CpuLiveLoadBalancer) use /execute
- Add explicit CpuLiveLoadBalancer singledispatch registration in registry
- Generate server.py imports for both config var and function reference
- Clean up redundant URL debug logs in resource_manager

Restore lb_execute to dispatch through LoadBalancerSlsStub instead of calling functions locally — LB resources require Live Serverless containers and cannot execute on a local machine. Keep _map_body_to_params and body: dict signatures for OpenAPI/Swagger compatibility while dispatching remotely via the stub's /execute path. Remove /run from generated QB routes, retaining only /run_sync since the dev server executes synchronously.

Replace the old multi-directory skeleton (main.py, mothership.py, workers/) with three flat files: gpu_worker.py, cpu_worker.py, and lb_worker.py. flash run auto-discovers @remote functions so the FastAPI boilerplate and router structure are no longer needed.

- Remove main.py, mothership.py, workers/, .ruff_cache from skeleton
- Add gpu_worker.py (QB GPU), cpu_worker.py (QB CPU), lb_worker.py (LB)
- Simplify pyproject.toml deps (drop fastapi/uvicorn)
- Add .flash/ to .gitignore
- Rewrite README with uv setup, QB/LB examples, GpuType reference
- Update init command panel output and next steps
- Add Ctrl+C cleanup hint to flash run startup output
- Update skeleton tests for new file structure

Directory names starting with digits (e.g. 01_getting_started/) produce invalid Python when used in import statements and function names.

- Add _flash_import helper to generated server.py that uses importlib.import_module() with scoped sys.path so sibling imports (e.g. `from cpu_worker import ...`) resolve to the correct directory
- Prefix generated function names with '_' when they start with a digit
- Scope sys.path per-import to prevent name collisions when multiple directories contain files with the same name (e.g. cpu_worker.py)
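
The two fixes above can be sketched as follows. This is not the generated `_flash_import` helper: the names `safe_identifier`, `scoped_sys_path`, and `flash_import` are illustrative, and evicting the cached entry from `sys.modules` is an assumption made here so that two same-named files in different directories load independently.

```python
import importlib
import sys
import tempfile
from contextlib import contextmanager
from pathlib import Path


def safe_identifier(name: str) -> str:
    """Prefix names that start with a digit so they are valid identifiers."""
    return f"_{name}" if name[:1].isdigit() else name


@contextmanager
def scoped_sys_path(directory: str):
    """Put a directory first on sys.path for the duration of one import."""
    sys.path.insert(0, directory)
    try:
        yield
    finally:
        if directory in sys.path:
            sys.path.remove(directory)


def flash_import(module_name: str, directory: str):
    """Import module_name from directory without leaking the path entry."""
    with scoped_sys_path(directory):
        sys.modules.pop(module_name, None)  # assumption: avoid reusing a same-named module
        importlib.invalidate_caches()
        return importlib.import_module(module_name)


def demo() -> tuple[int, int]:
    """Two directories each hold cpu_worker.py; both load with their own contents."""
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        Path(first, "cpu_worker.py").write_text("VALUE = 1\n")
        Path(second, "cpu_worker.py").write_text("VALUE = 2\n")
        a = flash_import("cpu_worker", first)
        b = flash_import("cpu_worker", second)
        return a.VALUE, b.VALUE
```

Because the path entry exists only during the import, a later import from another directory cannot accidentally resolve against the earlier one.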

The skeleton template was replaced with flat worker files (cpu_worker.py, gpu_worker.py, lb_worker.py, pyproject.toml) but the wheel validation script still expected the old multi-directory structure (main.py, workers/**). This caused the Build Package CI check to fail.

- Guard watcher_thread.join() with is_alive() check for --no-reload
- Wrap watchfiles import in try/except for missing dependency
- Fix debug log to show actual type instead of hardcoded class name
- Fix invalid dict addition in skeleton README example
- Fix PRD spec to match actual /run_sync-only behavior

…tures

The dev server codegen always generated `await fn(body.get("input", body))` regardless of actual function signature. This crashed zero-param functions with TypeError and incorrectly passed a dict to multi-param functions.

Scanner changes:

- Extract param_names from function AST nodes (excluding self)
- Extract class_method_params per public method for @remote classes

Codegen changes:

- 0 params: `await fn()` with no `body: dict` in handler signature
- 1 param: `await fn(body.get("input", body))` (preserves current behavior)
- 2+ params: `await fn(**body.get("input", body))` (kwargs spread)
- LB GET routes with path params (e.g. `/images/{file_id}`) now declare typed parameters in handler signature for proper Swagger UI rendering
- LB POST routes with path params merge body and path params
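
The three-way branch on parameter count can be sketched in a few lines. `extract_param_names` and `build_call_expr` are illustrative names, not necessarily the helpers in the PR; the emitted expressions match the ones listed above.

```python
import ast


def extract_param_names(source: str, fn_name: str) -> list[str]:
    """Read positional parameter names from a function's AST node, skipping self."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == fn_name:
            return [arg.arg for arg in node.args.args if arg.arg != "self"]
    raise LookupError(f"function {fn_name!r} not found")


def build_call_expr(fn_name: str, param_names: list[str]) -> str:
    """Choose the call expression the generated handler should use."""
    if not param_names:
        return f"await {fn_name}()"
    payload = 'body.get("input", body)'
    if len(param_names) == 1:
        return f"await {fn_name}({payload})"
    return f"await {fn_name}(**{payload})"
```

Working from the AST keeps the scanner from importing user modules just to learn a signature.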

Use pydantic.create_model() at server startup to dynamically build input models from @remote function signatures. Swagger UI now shows typed form fields instead of a generic JSON text area.

- Add make_input_model(), call_with_body(), to_dict() helpers
- Codegen emits model creation lines and typed handler signatures
- Simplify _build_call_expr to 2-way branch (zero-param vs body)
- Fix class method introspection: use _class_type to bypass RemoteClassWrapper proxy signatures (*args, **kwargs)
- Skip VAR_POSITIONAL/VAR_KEYWORD params in model creation as safety net
- Fall back to dict when model creation fails (zero disruption)
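
To show how a signature could be turned into model fields, the sketch below builds the `name -> (type, default)` mapping that `pydantic.create_model()` accepts as keyword arguments, using only the standard library. The helper name `field_definitions` is hypothetical and this is not the PR's `make_input_model()`; passing the result on as `create_model("Input", **fields)` is the assumed next step.

```python
import inspect
from typing import Any


def field_definitions(fn) -> dict[str, tuple[Any, Any]]:
    """Map each named parameter to (annotation, default); ... marks required."""
    fields: dict[str, tuple[Any, Any]] = {}
    for name, param in inspect.signature(fn).parameters.items():
        # *args / **kwargs cannot become model fields, so skip them.
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        annotation = Any if param.annotation is param.empty else param.annotation
        default = ... if param.default is param.empty else param.default
        fields[name] = (annotation, default)
    return fields
```

A function with only `*args, **kwargs` (such as a proxy wrapper) produces an empty mapping, which is one reason to introspect the underlying class rather than the wrapper.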

When @remote funcA calls @remote funcB, the worker receives only funcA's source via exec(). funcB is undefined in that namespace, causing NameError. This adds dependency_resolver.py which AST-detects calls to other @remote functions, provisions their endpoints via ResourceManager, and generates async dispatch stubs that are prepended to the caller's source code. The worker's exec() then defines both the stubs and the caller in the same namespace, allowing stacked @remote calls to dispatch correctly.

- Add dependency_resolver.py with detect, resolve, generate, and build
- Change prepare_request to async in LiveServerlessStub and LoadBalancerSlsStub
- Move LB stub timeout to constants.py as DEFAULT_LB_STUB_TIMEOUT (60s)
- Update registry.py to await prepare_request
- Add 24 unit tests for dependency resolver
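
The detect, generate, and build steps can be sketched with the standard library alone. Everything here is illustrative: `_flash_dispatch` is a hypothetical dispatch function standing in for whatever the real stubs call, endpoint provisioning is skipped, and the function names are not confirmed to match `dependency_resolver.py`.

```python
import ast
import asyncio


def detect_remote_dependencies(source: str, remote_names: set[str]) -> list[str]:
    """Names of known remote functions that the source calls directly."""
    called = {
        node.func.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    return sorted(called & remote_names)


def generate_stub(name: str, endpoint_id: str) -> str:
    """Source for an async stub that forwards the call (hypothetical dispatcher)."""
    return (
        f"async def {name}(*args, **kwargs):\n"
        f"    return await _flash_dispatch({endpoint_id!r}, args, kwargs)\n"
    )


def build_augmented_source(source: str, endpoints: dict[str, str]) -> str:
    """Prepend one stub per detected dependency to the caller's source."""
    deps = detect_remote_dependencies(source, set(endpoints))
    stubs = "\n".join(generate_stub(dep, endpoints[dep]) for dep in deps)
    return f"{stubs}\n{source}" if stubs else source


def demo() -> int:
    """exec() the augmented source the way a worker would, with a fake dispatcher."""
    caller = "async def func_a(x):\n    return await func_b(x) + 1\n"

    async def fake_dispatch(endpoint_id, args, kwargs):
        return args[0] * 10  # stands in for a remote call to func_b

    namespace = {"_flash_dispatch": fake_dispatch}
    exec(build_augmented_source(caller, {"func_b": "endpoint-b"}), namespace)
    return asyncio.run(namespace["func_a"](2))
```

Without the prepended stub, the `exec()` namespace would contain only `func_a`, and awaiting it would raise NameError on `func_b`.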

Tags now use the parent directory path instead of per-file worker type labels. Routes from the same project appear under a single collapsible group in the Swagger UI, making multi-worker projects easier to navigate.

- Guard watcher_thread creation behind reload flag
- Fix project_root derivation in build mode (use build_dir, not parent)
- Update PRD QB route spec to match /run_sync-only implementation
- Harden _cleanup_live_endpoints with granular error handling
- Replace bare except in _watch_and_regenerate with specific handlers
- Differentiate error types in lb_execute (422 for app errors, 500 for infra)
- Add templateId/template mutual exclusivity validation
- Improve _flash_import sys.path cleanup with index-based pop
- Use explicit event loop in cleanup to avoid nested loop errors
- Parallelize dependency provisioning with asyncio.gather
- Clarify null-safety in detect_remote_dependencies

…oth templateId and template

_payload_exclude() raised ValueError after _do_deploy() set templateId on a config object that already had template from initialization. Remove the raise in favor of silently excluding template (templateId takes precedence), and clear self.template after deploy mutation to prevent the inconsistent state at its source.

Not meant to be committed; internal planning document.

deanq changed the title ~~Simplify flash init skeleton for zero-boilerplate flash run~~ refactor: simplify flash init skeleton for zero-boilerplate flash run Feb 19, 2026

deanq requested a review from Copilot February 19, 2026 22:12

Copilot started reviewing on behalf of deanq February 19, 2026 22:13

Copilot AI reviewed Feb 19, 2026

deanq mentioned this pull request Feb 20, 2026

docs: remove coordinator and mothership terminology from all documentation #210

Merged

5 tasks

deanq requested a review from Copilot February 20, 2026 17:50

Copilot started reviewing on behalf of deanq February 20, 2026 17:50

Copilot AI reviewed Feb 20, 2026

src/runpod_flash/cli/commands/run.py (resolved)

deanq requested a review from Copilot February 20, 2026 18:23

Copilot started reviewing on behalf of deanq February 20, 2026 18:24

Copilot AI reviewed Feb 20, 2026

src/runpod_flash/cli/commands/build_utils/manifest.py (outdated, resolved)

src/runpod_flash/cli/commands/build.py (resolved)

PRD.md (outdated, resolved)

deanq requested a review from Copilot February 20, 2026 20:33

Copilot started reviewing on behalf of deanq February 20, 2026 20:33

deanq mentioned this pull request Feb 20, 2026

refactor: simplify examples to flat-file flash init pattern runpod/flash-examples#29

Merged

4 tasks

Copilot AI reviewed Feb 20, 2026

deanq added 15 commits February 20, 2026 14:02

docs: add PRD for zero-boilerplate flash run experience

6079274

feat(run): show resource count and elapsed time during cleanup

1c285ce

deanq added 9 commits February 20, 2026 14:02

refactor(run): group Swagger UI tags by project directory

f262f65

deanq force-pushed the refactor/ae-2210-simplified-starter branch from 50ee92e to 9f1928d Compare February 20, 2026 22:06

chore: remove PRD.md from branch

35af650

Not meant to be committed; internal planning document.

KAJdev approved these changes Feb 20, 2026

deanq merged commit 22894d4 into main Feb 20, 2026
6 checks passed

deanq deleted the refactor/ae-2210-simplified-starter branch February 20, 2026 23:19

runpod-release-please-bot bot mentioned this pull request Feb 20, 2026

chore: release 1.3.0 #211

Merged

2 participants