Merged
19 commits
ee4b390
refactor(01_hello_world): simplify to flat-file pattern
deanq Feb 19, 2026
ed75b56
refactor(02_cpu_worker): simplify to flat-file pattern
deanq Feb 19, 2026
d369314
refactor(03_mixed_workers): simplify to flat-file pattern with LB pip…
deanq Feb 19, 2026
eff3e3f
refactor(04_dependencies): simplify to flat-file pattern
deanq Feb 19, 2026
1a73ab5
refactor(01_text_to_speech): simplify to flat-file pattern
deanq Feb 19, 2026
1a2357a
refactor(05_load_balancer): simplify to flat-file pattern
deanq Feb 19, 2026
1bd9b09
refactor(01_network_volumes): simplify to flat-file pattern, inline v…
deanq Feb 19, 2026
612f9e6
docs(01_getting_started): simplify READMEs to match flat-file refactor
deanq Feb 20, 2026
43b8f1f
docs(02_ml_inference, 03_advanced_workers): simplify READMEs to match…
deanq Feb 20, 2026
fed4540
refactor(05_network_volumes): update cpu_worker to LB pattern, align …
deanq Feb 20, 2026
fcccfb6
chore: add .flash to gitignore, update CLAUDE.md and lockfile
deanq Feb 20, 2026
def9ce4
chore: remove legacy unified FastAPI discovery app
deanq Feb 20, 2026
c61dde6
chore: remove per-example requirements.txt and .env.example files
deanq Feb 20, 2026
1336fff
chore: remove stale deps from pyproject.toml, regenerate lockfiles
deanq Feb 20, 2026
f8f205f
docs: update CONTRIBUTING, DEVELOPMENT, CLI-REFERENCE for flat-file p…
deanq Feb 20, 2026
abf4844
fix(examples): use typed params in gpu_lb compute_intensive
deanq Feb 20, 2026
cc2894c
chore: simplify pyproject.toml, requirements.txt, and Makefile
deanq Feb 20, 2026
76c5eb6
fix(ci): remove references to deleted targets and scripts
deanq Feb 20, 2026
340c53e
fix: address PR review feedback from Copilot
deanq Feb 20, 2026
Files changed
6 changes: 0 additions & 6 deletions .github/workflows/quality.yml
@@ -30,11 +30,5 @@ jobs:
       - name: Install dependencies
         run: make setup
 
-      - name: Check dependency sync
-        run: |
-          echo "::group::Dependency sync check"
-          uv run python scripts/sync_example_deps.py --check
-          echo "::endgroup::"
-
       - name: Run quality checks
         run: make ci-quality-github
2 changes: 1 addition & 1 deletion .github/workflows/test-examples.yml
@@ -38,4 +38,4 @@ jobs:
         run: make venv-info
 
       - name: Run quality checks
-        run: make quality-check-strict
+        run: make quality-check
4 changes: 0 additions & 4 deletions 01_getting_started/01_hello_world/.env.example

This file was deleted.

1 change: 1 addition & 0 deletions 01_getting_started/01_hello_world/.gitignore
@@ -42,3 +42,4 @@ uv.lock
 # OS
 .DS_Store
 Thumbs.db
+.flash/
18 changes: 8 additions & 10 deletions 01_getting_started/01_hello_world/README.md
@@ -26,21 +26,19 @@ Get your API key from [Runpod Settings](https://www.runpod.io/console/user/setti
 flash run
 ```
 
-Server starts at **http://localhost:8000**
+Server starts at **http://localhost:8888**
 
 ### 4. Test the API
 
-```bash
-# Health check
-curl http://localhost:8000/ping
-
-# GPU worker
-curl -X POST http://localhost:8000/gpu/hello \
+Visit **http://localhost:8888/docs** for interactive API documentation. QB endpoints are auto-generated by `flash run` based on your `@remote` functions.
+
+```bash
+curl -X POST http://localhost:8888/gpu_worker/run_sync \
   -H "Content-Type: application/json" \
   -d '{"message": "Hello GPU!"}'
 ```
 
-Visit **http://localhost:8000/docs** for interactive API documentation.
+Visit **http://localhost:8888/docs** for interactive API documentation.
 
 ### Full CLI Documentation
 
@@ -67,7 +65,9 @@ The worker demonstrates:
 
 ## API Endpoints
 
-### POST /gpu/hello
+QB (queue-based) endpoints are auto-generated from `@remote` functions. Visit `/docs` for the full API schema.
+
+### `gpu_hello`
 
 Executes a simple GPU worker and returns system/GPU information.
 
@@ -100,9 +100,7 @@
 
 ```
 01_hello_world/
-├── main.py              # FastAPI application
 ├── gpu_worker.py        # GPU worker with @remote decorator
-├── mothership.py        # Mothership endpoint configuration
 ├── pyproject.toml       # Project metadata
 ├── requirements.txt     # Dependencies
 ├── .env.example         # Environment variables template
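Note: the curl call added above translates directly to Python. A minimal stdlib sketch, assuming `flash run` is serving locally on port 8888 as this README describes:

```python
# Hypothetical test client for the endpoint shown in the diff above.
import json
from urllib import request

payload = json.dumps({"message": "Hello GPU!"}).encode()
req = request.Request(
    "http://localhost:8888/gpu_worker/run_sync",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with request.urlopen(req) as resp:
    print(json.dumps(json.load(resp), indent=2))
```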
1 change: 0 additions & 1 deletion 01_getting_started/01_hello_world/__init__.py

This file was deleted.

73 changes: 9 additions & 64 deletions 01_getting_started/01_hello_world/gpu_worker.py
@@ -1,46 +1,20 @@
-## Hello world: GPU serverless workers
-# In this part of the example code, we provision a GPU-based worker and have it
-# execute code. We can run the worker directly, or have it handle API requests
-# to the router function. It's registered to a subrouter in the __init__.py
-# file in this folder, and subsequently imported by main.py and attached to the
-# FastAPI app there.
+# GPU serverless worker -- detects available GPU hardware.
+# Run with: flash run
+# Test directly: python gpu_worker.py
+from runpod_flash import GpuGroup, LiveServerless, remote
 
-# Scaling behavior is controlled by configuration passed to the
-# `LiveServerless` class.
-from fastapi import APIRouter
-from pydantic import BaseModel
-from runpod_flash import (
-    GpuGroup,
-    LiveServerless,
-    remote,
-)
-
-# Here, we'll define several variables that change the
-# default behavior of our serverless endpoint. `workersMin` sets our endpoint
-# to scale to 0 active containers; `workersMax` will allow our endpoint to run
-# up to 3 workers in parallel as the endpoint receives more work. We also set
-# an idle timeout of 5 minutes so that any active worker stays alive for 5
-# minutes after completing a request.
 gpu_config = LiveServerless(
     name="01_01_gpu_worker",
-    gpus=[GpuGroup.ANY],  # Run on any GPU
+    gpus=[GpuGroup.ANY],
     workersMin=0,
     workersMax=3,
     idleTimeout=5,
 )
 
 
-# Decorating our function with `remote` will package up the function code and
-# deploy it on the infrastructure according to the passed input config. The
-# results from the worker will be returned to your terminal. In this example
-# the function will return a greeting to the input string passed in the `name`
-# key. The code itself will run on a GPU worker, and information about the GPU
-# the worker has access to will be included in the response.
 @remote(resource_config=gpu_config)
-async def gpu_hello(
-    input_data: dict,
-) -> dict:
-    """Simple GPU worker example with GPU detection."""
+async def gpu_hello(input_data: dict) -> dict:
+    """Simple GPU worker that returns GPU hardware info."""
     import platform
     from datetime import datetime
 
@@ -51,10 +25,7 @@ async def gpu_hello(
     gpu_count = torch.cuda.device_count()
     gpu_memory = torch.cuda.get_device_properties(0).total_memory / (1024**3)
 
-    message = input_data.get(
-        "message",
-        "Hello from GPU worker!",
-    )
+    message = input_data.get("message", "Hello from GPU worker!")
 
     return {
         "status": "success",
@@ -64,40 +35,14 @@ async def gpu_hello(
             "available": gpu_available,
             "name": gpu_name,
             "count": gpu_count,
-            "memory_gb": round(
-                gpu_memory,
-                2,
-            ),
+            "memory_gb": round(gpu_memory, 2),
         },
         "timestamp": datetime.now().isoformat(),
         "platform": platform.system(),
         "python_version": platform.python_version(),
     }
 
 
-# We define a subrouter for our gpu worker so that our main router in `main.py`
-# can attach it for routing gpu-specific requests.
-gpu_router = APIRouter()
-
-
-class MessageRequest(BaseModel):
-    """Request model for GPU worker."""
-
-    message: str = "Hello from GPU!"
-
-
-@gpu_router.post("/hello")
-async def hello(
-    request: MessageRequest,
-):
-    """Simple GPU worker endpoint."""
-    result = await gpu_hello({"message": request.message})
-    return result
-
-
-# This code is packaged up as a "worker" that will handle requests sent to the
-# endpoint at /gpu/hello, but you can also trigger it directly by running
-# python -m workers.gpu.endpoint
 if __name__ == "__main__":
     import asyncio
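Note: the hunk above truncates right after `import asyncio`, so the direct-run tail of the refactored file is not visible here. A plausible completion, consistent with the new `# Test directly: python gpu_worker.py` header comment — the exact body is an assumption:

```python
# Assumed tail of gpu_worker.py (the diff cuts off after `import asyncio`).
if __name__ == "__main__":
    import asyncio

    # Run the @remote function once with a sample payload and print the result.
    result = asyncio.run(gpu_hello({"message": "Hello from the command line!"}))
    print(result)
```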
71 changes: 0 additions & 71 deletions 01_getting_started/01_hello_world/main.py

This file was deleted.

53 changes: 0 additions & 53 deletions 01_getting_started/01_hello_world/mothership.py

This file was deleted.

1 change: 0 additions & 1 deletion 01_getting_started/01_hello_world/requirements.txt

This file was deleted.

4 changes: 0 additions & 4 deletions 01_getting_started/02_cpu_worker/.env.example

This file was deleted.

18 changes: 7 additions & 11 deletions 01_getting_started/02_cpu_worker/README.md
@@ -26,22 +26,18 @@ Get your API key from [Runpod Settings](https://www.runpod.io/console/user/setti
 flash run
 ```
 
-Server starts at **http://localhost:8000**
+Server starts at **http://localhost:8888**
 
 ### 4. Test the API
 
-```bash
-# Health check
-curl http://localhost:8000/ping
-
-# CPU worker
-curl -X POST http://localhost:8000/cpu/hello \
+Visit **http://localhost:8888/docs** for interactive API documentation. QB endpoints are auto-generated by `flash run` based on your `@remote` functions.
+
+```bash
+curl -X POST http://localhost:8888/cpu_worker/run_sync \
   -H "Content-Type: application/json" \
   -d '{"name": "Flash User"}'
 ```
 
-Visit **http://localhost:8000/docs** for interactive API documentation.
-
 ### Full CLI Documentation
 
 For complete CLI usage including deployment, environment management, and troubleshooting:
@@ -67,7 +63,9 @@ The worker demonstrates:
 
 ## API Endpoints
 
-### POST /cpu/hello
+QB (queue-based) endpoints are auto-generated from `@remote` functions. Visit `/docs` for the full API schema.
+
+### `cpu_hello`
 
 Executes a simple CPU worker and returns a greeting with system information.
 
@@ -94,9 +92,7 @@ Executes a simple CPU worker and returns a greeting with system information.
 
 ```
 02_cpu_worker/
-├── main.py              # FastAPI application
 ├── cpu_worker.py        # CPU worker with @remote decorator
-├── mothership.py        # Mothership endpoint configuration
 ├── pyproject.toml       # Project metadata
 ├── requirements.txt     # Dependencies
 ├── .env.example         # Environment variables template
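Note: this view does not include a diff for `cpu_worker.py` itself, but the README above implies it mirrors the flat-file shape of `gpu_worker.py`. A hypothetical sketch — the config fields, worker name, and return keys are assumptions modeled on the GPU example:

```python
# cpu_worker.py -- hypothetical flat-file CPU worker modeled on gpu_worker.py.
# Run with: flash run
# Test directly: python cpu_worker.py
from runpod_flash import LiveServerless, remote

# Assumed scaling config; a real CPU example may use CPU-specific fields.
cpu_config = LiveServerless(
    name="01_02_cpu_worker",
    workersMin=0,
    workersMax=3,
    idleTimeout=5,
)


@remote(resource_config=cpu_config)
async def cpu_hello(input_data: dict) -> dict:
    """Return a greeting plus basic system information."""
    # Imports stay inside the function so they resolve on the remote worker,
    # matching the pattern in gpu_worker.py.
    import platform

    name = input_data.get("name", "Flash User")
    return {
        "status": "success",
        "message": f"Hello, {name}!",
        "platform": platform.system(),
        "python_version": platform.python_version(),
    }


if __name__ == "__main__":
    import asyncio

    print(asyncio.run(cpu_hello({"name": "Flash User"})))
```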
1 change: 0 additions & 1 deletion 01_getting_started/02_cpu_worker/__init__.py

This file was deleted.
