+```
+
+### Flags
+
+
+Name of the Flash app to delete. Required explicitly for safety.
+
+
+
+
+Unlike other subcommands, `delete` requires the `--app` flag explicitly. This is a safety measure for destructive operations.
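+
+For example, assuming an app named `my-project` (the name is illustrative), the deletion command looks like this:
+
+```bash
+flash app delete --app my-project
+```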
+
+
+
+### Process
+
+1. Shows app details and resources to be deleted.
+2. Prompts for confirmation (required).
+3. Deletes all environments and their resources.
+4. Deletes all builds.
+5. Deletes the app.
+
+
+
+This operation is irreversible. All environments, builds, endpoints, volumes, and configuration will be permanently deleted.
+
+
+
+---
+
+## App hierarchy
+
+A Flash app contains environments and builds:
+
+```text
+Flash App (my-project)
+│
+├── Environments
+│   ├── dev
+│   │   ├── Endpoints (ep1, ep2)
+│   │   └── Volumes (vol1)
+│   ├── staging
+│   │   ├── Endpoints (ep1, ep2)
+│   │   └── Volumes (vol1)
+│   └── production
+│       ├── Endpoints (ep1, ep2)
+│       └── Volumes (vol1)
+│
+└── Builds
+    ├── build_v1 (2024-01-15)
+    ├── build_v2 (2024-01-18)
+    └── build_v3 (2024-01-20)
+```
+
+## Auto-detection
+
+Flash CLI automatically detects the app name from your current directory:
+
+```bash
+cd /path/to/APP_NAME
+flash deploy # Deploys to 'APP_NAME' app
+flash env list # Lists 'APP_NAME' environments
+```
+
+Override with the `--app` flag:
+
+```bash
+flash deploy --app other-project
+flash env list --app other-project
+```
+
+## Related commands
+
+- [`flash env`](/flash/cli/env) - Manage environments within an app
+- [`flash deploy`](/flash/cli/deploy) - Deploy to an app's environment
+- [`flash init`](/flash/cli/init) - Create a new project
diff --git a/flash/cli/build.mdx b/flash/cli/build.mdx
new file mode 100644
index 00000000..fb6da58f
--- /dev/null
+++ b/flash/cli/build.mdx
@@ -0,0 +1,184 @@
+---
+title: "build"
+sidebarTitle: "build"
+---
+
+Build a deployment-ready artifact for your Flash application without deploying. Use this for more control over the build process or to inspect the artifact before deploying.
+
+```bash
+flash build [OPTIONS]
+```
+
+## Examples
+
+Build with all dependencies:
+
+```bash
+flash build
+```
+
+Build and launch local preview environment:
+
+```bash
+flash build --preview
+```
+
+Build with excluded packages (for smaller deployment size):
+
+```bash
+flash build --exclude torch,torchvision,torchaudio
+```
+
+Keep the build directory for inspection:
+
+```bash
+flash build --keep-build
+```
+
+## Flags
+
+
+Skip transitive dependencies during pip install. Only installs direct dependencies specified in `@remote` decorators. Useful when the base image already includes dependencies.
+
+
+
+Keep the `.flash/.build` directory after creating the archive. Useful for debugging build issues or inspecting generated files.
+
+
+
+Custom name for the output archive file.
+
+
+
+Comma-separated list of packages to exclude from the build (e.g., `torch,torchvision`). Use this to skip packages already in the base image.
+
+
+
+Launch a local Docker-based test environment after building. Automatically enables `--keep-build`.
+
+
+## What happens during build
+
+1. **Function discovery**: Finds all `@remote` decorated functions.
+2. **Grouping**: Groups functions by their `resource_config`.
+3. **Manifest generation**: Creates `.flash/flash_manifest.json` with endpoint definitions.
+4. **Dependency installation**: Installs Python packages for Linux x86_64.
+5. **Packaging**: Bundles everything into `.flash/artifact.tar.gz`.
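+
+As an illustration of steps 1 and 2, the sketch below (function and config names are hypothetical) shows how functions that share a `resource_config` are grouped into a single endpoint, while a function with its own config becomes a separate endpoint:
+
+```python
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+# Hypothetical configs for illustration.
+gpu_config = LiveServerless(name="gpu-worker", gpus=[GpuGroup.ADA_24])
+cpu_config = LiveServerless(name="cpu-worker")
+
+@remote(resource_config=gpu_config, dependencies=["torch"])
+def embed(text):
+    import torch
+    return {"embedding": []}
+
+@remote(resource_config=gpu_config, dependencies=["torch"])
+def generate(prompt):
+    import torch
+    return {"output": prompt}
+
+@remote(resource_config=cpu_config)
+def postprocess(result):
+    return {"cleaned": result}
+
+# embed and generate share gpu_config, so they are grouped into one endpoint;
+# postprocess gets its own endpoint from cpu_config.
+```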
+
+## Build artifacts
+
+After running `flash build`:
+
+| File/Directory | Description |
+|----------------|-------------|
+| `.flash/artifact.tar.gz` | Deployment package ready for Runpod |
+| `.flash/flash_manifest.json` | Service discovery configuration |
+| `.flash/.build/` | Temporary build directory (removed unless `--keep-build`) |
+
+## Cross-platform builds
+
+Flash automatically handles cross-platform builds:
+
+- **Automatic platform targeting**: Dependencies are installed for Linux x86_64, regardless of your build platform.
+- **Python version matching**: Uses your current Python version for package compatibility.
+- **Binary wheel enforcement**: Only pre-built wheels are used, preventing compilation issues.
+
+You can build on macOS, Windows, or Linux, and the deployment will work on Runpod.
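+
+Conceptually, this is similar to asking pip for Linux binary wheels yourself. The command below only illustrates that idea; it is not the exact invocation Flash runs:
+
+```bash
+# Illustration: install Linux x86_64 binary wheels into a local directory
+pip install \
+  --platform manylinux2014_x86_64 \
+  --only-binary=:all: \
+  --target ./build-deps \
+  -r requirements.txt
+```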
+
+## Managing deployment size
+
+Runpod Serverless has a **500MB deployment limit**. Use `--exclude` to skip packages already in your base image:
+
+```bash
+# For GPU deployments (PyTorch pre-installed)
+flash build --exclude torch,torchvision,torchaudio
+```
+
+### Base image reference
+
+| Resource type | Base image | Safe to exclude |
+|--------------|------------|-----------------|
+| GPU | PyTorch base | `torch`, `torchvision`, `torchaudio` |
+| CPU | Python slim | Do not exclude ML packages |
+
+
+
+Check the [worker-flash repository](https://github.com/runpod-workers/worker-flash) for current base images and pre-installed packages.
+
+
+
+## Preview environment
+
+Test your deployment locally before pushing to Runpod:
+
+```bash
+flash build --preview
+```
+
+This:
+
+1. Builds your project (creates archive and manifest).
+2. Creates a Docker network for inter-container communication.
+3. Starts one container per resource config (mothership + workers).
+4. Exposes the mothership on `localhost:8000`.
+5. On shutdown (`Ctrl+C`), stops and removes all containers.
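+
+Once the preview is running, you can exercise the mothership locally. The route shown below is an example; substitute one of your own app's routes:
+
+```bash
+curl -X POST http://localhost:8000/api/hello \
+  -H "Content-Type: application/json" \
+  -d '{"message": "preview test"}'
+```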
+
+### When to use preview
+
+- Test deployment configuration before production.
+- Validate manifest structure.
+- Debug resource provisioning.
+- Verify cross-endpoint function calls.
+
+## Troubleshooting
+
+### Build fails with "functions not found"
+
+Ensure your project has `@remote` decorated functions:
+
+```python
+from runpod_flash import remote, LiveServerless
+
+config = LiveServerless(name="my-worker")
+
+@remote(resource_config=config)
+def my_function(data):
+ return {"result": data}
+```
+
+### Archive is too large
+
+Use `--exclude` or `--no-deps`:
+
+```bash
+flash build --exclude torch,torchvision,torchaudio
+```
+
+### Dependency installation fails
+
+If a package doesn't have Linux x86_64 wheels:
+
+1. Ensure standard pip is installed: `python -m ensurepip --upgrade`
+2. Check PyPI for Linux wheel availability.
+3. For Python 3.13+, some packages may require newer manylinux versions.
+
+### Need to examine generated files
+
+Use `--keep-build`:
+
+```bash
+flash build --keep-build
+ls .flash/.build/
+```
+
+## Related commands
+
+- [`flash deploy`](/flash/cli/deploy) - Build and deploy in one step
+- [`flash run`](/flash/cli/run) - Start development server
+- [`flash env`](/flash/cli/env) - Manage environments
+
+
+
+Most users should use `flash deploy` instead, which runs build and deploy in one step. Use `flash build` when you need more control or want to inspect the artifact.
+
+
diff --git a/flash/cli/deploy.mdx b/flash/cli/deploy.mdx
new file mode 100644
index 00000000..bd4224fa
--- /dev/null
+++ b/flash/cli/deploy.mdx
@@ -0,0 +1,247 @@
+---
+title: "deploy"
+sidebarTitle: "deploy"
+---
+
+Build and deploy your Flash application to Runpod Serverless endpoints in one step. This is the primary command for getting your application running in the cloud.
+
+```bash
+flash deploy [OPTIONS]
+```
+
+## Examples
+
+Build and deploy a Flash app from the current directory (auto-selects environment if only one exists):
+
+```bash
+flash deploy
+```
+
+Deploy to a specific environment:
+
+```bash
+flash deploy --env production
+```
+
+Deploy with excluded packages to reduce size:
+
+```bash
+flash deploy --exclude torch,torchvision,torchaudio
+```
+
+Build and test locally before deploying:
+
+```bash
+flash deploy --preview
+```
+
+## Flags
+
+
+Target environment name (e.g., `dev`, `staging`, `production`). Auto-selected if only one exists. Creates the environment if it doesn't exist.
+
+
+
+Flash app name. Auto-detected from the current directory if not specified.
+
+
+
+Skip transitive dependencies during pip install. Useful when the base image already includes dependencies.
+
+
+
+Comma-separated packages to exclude (e.g., `torch,torchvision`). Use this to stay under the 500MB deployment limit.
+
+
+
+Custom archive name for the build artifact.
+
+
+
+Build and launch a local Docker-based preview environment instead of deploying to Runpod.
+
+
+
+Bundle local `runpod_flash` source instead of the PyPI version. For development and testing only.
+
+
+## What happens during deployment
+
+1. **Build phase**: Creates the deployment artifact (same as `flash build`).
+2. **Environment resolution**: Detects or creates the target environment.
+3. **Upload**: Sends the artifact to Runpod storage.
+4. **Provisioning**: Creates or updates Serverless endpoints.
+5. **Configuration**: Sets up environment variables and service discovery.
+6. **Verification**: Confirms endpoints are healthy.
+
+## Architecture
+
+After deployment, your entire application runs on Runpod Serverless:
+
+
+```mermaid
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#9289FE','primaryTextColor':'#fff','primaryBorderColor':'#9289FE','lineColor':'#5F4CFE','secondaryColor':'#AE6DFF','tertiaryColor':'#FCB1FF','edgeLabelBackground':'#5F4CFE', 'fontSize':'14px','fontFamily':'font-inter'}}}%%
+
+flowchart TB
+ Users(["USERS"])
+
+ subgraph Runpod ["RUNPOD SERVERLESS"]
+ Mothership["MOTHERSHIP ENDPOINT
+(your FastAPI app from main.py)
+• Your HTTP routes
+• Orchestrates @remote calls
+• Public URL for users"]
+ GPU["gpu-worker
+(your @remote function)"]
+ CPU["cpu-worker
+(your @remote function)"]
+
+ Mothership -->|"internal"| GPU
+ Mothership -->|"internal"| CPU
+ end
+
+ Users -->|"HTTPS (authenticated)"| Mothership
+
+ style Runpod fill:#1a1a2e,stroke:#5F4CFE,stroke-width:2px,color:#fff
+ style Users fill:#4D38F5,stroke:#4D38F5,color:#fff
+ style Mothership fill:#5F4CFE,stroke:#5F4CFE,color:#fff
+ style GPU fill:#22C55E,stroke:#22C55E,color:#000
+ style CPU fill:#22C55E,stroke:#22C55E,color:#000
+```
+
+
+## Environment management
+
+### Automatic creation
+
+If the specified environment doesn't exist, `flash deploy` creates it:
+
+```bash
+# Creates 'staging' if it doesn't exist
+flash deploy --env staging
+```
+
+### Auto-selection
+
+When you have only one environment, it's selected automatically:
+
+```bash
+# Auto-selects the only available environment
+flash deploy
+```
+
+When multiple environments exist, you must specify one:
+
+```bash
+# Required when multiple environments exist
+flash deploy --env staging
+```
+
+### Default environment
+
+If no environment exists and none is specified, Flash creates a `production` environment by default.
+
+## Post-deployment
+
+After successful deployment, Flash displays:
+
+```text
+✓ Deployment Complete
+
+Your mothership is deployed at:
+https://api-xxxxx.runpod.net
+
+Available Routes:
+POST /api/hello
+POST /gpu/process
+
+All endpoints require authentication:
+curl -X POST https://api-xxxxx.runpod.net/api/hello \
+ -H "Authorization: Bearer $RUNPOD_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{"param": "value"}'
+```
+
+### Authentication
+
+All deployed endpoints require authentication with your Runpod API key:
+
+```bash
+export RUNPOD_API_KEY="your_key_here"
+
+curl -X POST https://YOUR_ENDPOINT_URL/path \
+ -H "Authorization: Bearer $RUNPOD_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{"param": "value"}'
+```
+
+## Preview mode
+
+Test locally before deploying:
+
+```bash
+flash deploy --preview
+```
+
+This builds your project and runs it in Docker containers locally:
+
+- Mothership exposed on `localhost:8000`.
+- All containers communicate via Docker network.
+- Press `Ctrl+C` to stop.
+
+## Managing deployment size
+
+Runpod Serverless has a **500MB limit**. Use `--exclude` to skip packages in the base image:
+
+```bash
+# GPU deployments (PyTorch pre-installed)
+flash deploy --exclude torch,torchvision,torchaudio
+```
+
+| Resource type | Safe to exclude |
+|--------------|-----------------|
+| GPU | `torch`, `torchvision`, `torchaudio` |
+| CPU | Do not exclude ML packages |
+
+## flash run vs flash deploy
+
+| Aspect | `flash run` | `flash deploy` |
+|--------|-------------|----------------|
+| FastAPI app runs on | Your machine | Runpod Serverless |
+| `@remote` functions run on | Runpod Serverless | Runpod Serverless |
+| Endpoint naming | `live-` prefix | No prefix |
+| Automatic updates | Yes | No |
+| Use case | Development | Production |
+
+## Troubleshooting
+
+### Multiple environments error
+
+```text
+Error: Multiple environments found: dev, staging, production
+```
+
+Specify the target environment:
+
+```bash
+flash deploy --env staging
+```
+
+### Deployment size limit
+
+Use `--exclude` to reduce size:
+
+```bash
+flash deploy --exclude torch,torchvision,torchaudio
+```
+
+### Authentication fails
+
+Ensure your API key is set:
+
+```bash
+echo $RUNPOD_API_KEY
+export RUNPOD_API_KEY="your_key_here"
+```
+
+## Related commands
+
+- [`flash build`](/flash/cli/build) - Build without deploying
+- [`flash run`](/flash/cli/run) - Local development server
+- [`flash env`](/flash/cli/env) - Manage environments
+- [`flash app`](/flash/cli/app) - Manage applications
+- [`flash undeploy`](/flash/cli/undeploy) - Remove endpoints
diff --git a/flash/cli/env.mdx b/flash/cli/env.mdx
new file mode 100644
index 00000000..7d4494ba
--- /dev/null
+++ b/flash/cli/env.mdx
@@ -0,0 +1,255 @@
+---
+title: "env"
+sidebarTitle: "env"
+---
+
+Manage deployment environments for Flash applications. Environments are isolated deployment contexts (like `dev`, `staging`, `production`) within a Flash app.
+
+```bash Command
+flash env [OPTIONS]
+```
+
+## Subcommands
+
+| Subcommand | Description |
+|------------|-------------|
+| `list` | Show all environments for an app |
+| `create` | Create a new environment |
+| `get` | Show details of an environment |
+| `delete` | Delete an environment and its resources |
+
+---
+
+## env list
+
+Show all available environments for an app.
+
+```bash Command
+flash env list [OPTIONS]
+```
+
+### Example
+
+```bash
+# List environments for current app
+flash env list
+
+# List environments for specific app
+flash env list --app APP_NAME
+```
+
+### Flags
+
+
+Flash app name. Auto-detected from current directory if not specified.
+
+
+### Output
+
+```text
+┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
+┃ Name       ┃ ID                  ┃ Active Build      ┃ Created At       ┃
+┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
+│ dev        │ env_abc123          │ build_xyz789      │ 2024-01-15 10:30 │
+│ staging    │ env_def456          │ build_uvw456      │ 2024-01-16 14:20 │
+│ production │ env_ghi789          │ build_rst123      │ 2024-01-20 09:15 │
+└────────────┴─────────────────────┴───────────────────┴──────────────────┘
+```
+
+---
+
+## env create
+
+Create a new deployment environment.
+
+```bash Command
+flash env create [OPTIONS]
+```
+
+### Example
+
+```bash
+# Create staging environment
+flash env create staging
+
+# Create environment in specific app
+flash env create production --app APP_NAME
+```
+
+### Arguments
+
+
+Name for the new environment (e.g., `dev`, `staging`, `production`).
+
+
+### Flags
+
+
+Flash app name. Auto-detected from current directory if not specified.
+
+
+### Notes
+
+- If the app doesn't exist, it's created automatically.
+- Environment names must be unique within an app.
+- Newly created environments have no active build until first deployment.
+
+
+
+You don't always need to create environments explicitly. Running `flash deploy --env ENVIRONMENT_NAME` creates the environment automatically if it doesn't exist.
+
+
+
+---
+
+## env get
+
+Show detailed information about a deployment environment.
+
+```bash Command
+flash env get [OPTIONS]
+```
+
+### Example
+
+```bash
+# Get details for production environment
+flash env get production
+
+# Get details for specific app's environment
+flash env get staging --app APP_NAME
+```
+
+### Arguments
+
+
+Name of the environment to inspect.
+
+
+### Flags
+
+
+Flash app name. Auto-detected from current directory if not specified.
+
+
+### Output
+
+```text
+╭────────────────────────────────────╮
+│ Environment: production │
+├────────────────────────────────────┤
+│ ID: env_ghi789 │
+│ State: DEPLOYED │
+│ Active Build: build_rst123 │
+│ Created: 2024-01-20 09:15:00 │
+╰────────────────────────────────────╯
+
+ Associated Endpoints
+┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃
+┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
+│ my-gpu │ ep_abc123 │
+│ my-cpu │ ep_def456 │
+└────────────────┴────────────────────┘
+
+ Associated Network Volumes
+┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃
+┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
+│ model-cache │ nv_xyz789 │
+└────────────────┴────────────────────┘
+```
+
+---
+
+## env delete
+
+Delete a deployment environment and all its associated resources.
+
+```bash Command
+flash env delete [OPTIONS]
+```
+
+### Examples
+
+```bash
+# Delete development environment
+flash env delete dev
+
+# Delete environment in specific app
+flash env delete staging --app APP_NAME
+```
+
+### Arguments
+
+
+Name of the environment to delete.
+
+
+### Flags
+
+
+Flash app name. Auto-detected from current directory if not specified.
+
+
+### Process
+
+1. Shows environment details and resources to be deleted.
+2. Prompts for confirmation (required).
+3. Undeploys all associated endpoints.
+4. Removes all associated network volumes.
+5. Deletes the environment from the app.
+
+
+
+This operation is irreversible. All endpoints, volumes, and configuration associated with the environment will be permanently deleted.
+
+
+
+---
+
+## Environment states
+
+| State | Description |
+|-------|-------------|
+| PENDING | Environment created but not deployed |
+| DEPLOYING | Deployment in progress |
+| DEPLOYED | Successfully deployed and running |
+| FAILED | Deployment or health check failed |
+| DELETING | Deletion in progress |
+
+## Common workflows
+
+### Three-tier deployment
+
+```bash
+# Create environments
+flash env create dev
+flash env create staging
+flash env create production
+
+# Deploy to each
+flash deploy --env dev
+flash deploy --env staging
+flash deploy --env production
+```
+
+### Feature branch testing
+
+```bash
+# Create feature environment
+flash env create FEATURE_NAME
+
+# Deploy feature branch
+git checkout FEATURE_NAME
+flash deploy --env FEATURE_NAME
+
+# Clean up after merge
+flash env delete FEATURE_NAME
+```
+
+## Related commands
+
+- [`flash deploy`](/flash/cli/deploy) - Deploy to an environment
+- [`flash app`](/flash/cli/app) - Manage applications
+- [`flash undeploy`](/flash/cli/undeploy) - Remove specific endpoints
diff --git a/flash/cli/init.mdx b/flash/cli/init.mdx
new file mode 100644
index 00000000..12f93b93
--- /dev/null
+++ b/flash/cli/init.mdx
@@ -0,0 +1,89 @@
+---
+title: "init"
+sidebarTitle: "init"
+---
+
+Create a new Flash project with a ready-to-use template structure including a FastAPI server, example GPU and CPU workers, and configuration files.
+
+```bash
+flash init [PROJECT_NAME] [OPTIONS]
+```
+
+## Example
+
+Create a new project directory:
+
+```bash
+flash init PROJECT_NAME
+cd PROJECT_NAME
+pip install -r requirements.txt
+flash run
+```
+
+Initialize in the current directory:
+
+```bash
+flash init .
+```
+
+## Arguments
+
+
+Name of the project directory to create. If omitted or set to `.`, initializes in the current directory.
+
+
+## Flags
+
+
+Overwrite existing files if they already exist in the target directory.
+
+
+## What it creates
+
+The command creates the following project structure:
+
+```text
+PROJECT_NAME/
+├── main.py              # FastAPI application entry point
+├── workers/
+│   ├── gpu/             # GPU worker example
+│   │   ├── __init__.py
+│   │   └── endpoint.py
+│   └── cpu/             # CPU worker example
+│       ├── __init__.py
+│       └── endpoint.py
+├── .env                 # Environment variables template
+├── .gitignore           # Git ignore patterns
+├── .flashignore         # Flash deployment ignore patterns
+├── requirements.txt     # Python dependencies
+└── README.md            # Project documentation
+```
+
+### Template contents
+
+- **main.py**: FastAPI application that imports routers from the `workers/` directory.
+- **workers/gpu/endpoint.py**: Example GPU worker with a `@remote` decorated function using `LiveServerless`.
+- **workers/cpu/endpoint.py**: Example CPU worker with a `@remote` decorated function using CPU configuration.
+- **.env**: Template for environment variables including `RUNPOD_API_KEY`.
+
+## Next steps
+
+After initialization:
+
+1. Copy `.env.example` to `.env` (if needed) and add your `RUNPOD_API_KEY`.
+2. Install dependencies: `pip install -r requirements.txt`
+3. Start the development server: `flash run`
+4. Open http://localhost:8888/docs to explore the API.
+5. Customize the workers for your use case.
+6. Deploy with `flash deploy` when ready.
+
+
+
+This command only creates local files. It doesn't interact with Runpod or create any cloud resources. Cloud resources are created when you run `flash run` or `flash deploy`.
+
+
+
+## Related commands
+
+- [`flash run`](/flash/cli/run) - Start the development server
+- [`flash deploy`](/flash/cli/deploy) - Build and deploy to Runpod
diff --git a/flash/cli/overview.mdx b/flash/cli/overview.mdx
new file mode 100644
index 00000000..6f1b0d66
--- /dev/null
+++ b/flash/cli/overview.mdx
@@ -0,0 +1,121 @@
+---
+title: "CLI overview"
+sidebarTitle: "Overview"
+description: "Learn how to use the Flash CLI for local development and deployment."
+---
+
+The Flash CLI provides commands for initializing projects, running local development servers, building deployment artifacts, and managing your applications on Runpod Serverless.
+
+## Install Flash
+
+Create a Python virtual environment and install Flash using pip:
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install runpod-flash
+```
+
+## Configure your API key
+
+Flash requires a Runpod API key to provision and manage Serverless endpoints. Create a `.env` file in your project directory:
+
+```bash
+echo "RUNPOD_API_KEY=your_api_key_here" > .env
+```
+
+You can also set the API key as an environment variable (use `export` on macOS/Linux, or `set` in the Windows Command Prompt):
+
+
+
+```bash
+export RUNPOD_API_KEY=your_api_key_here
+```
+
+
+```bash
+set RUNPOD_API_KEY=your_api_key_here
+```
+
+
+
+## Available commands
+
+| Command | Description |
+|---------|-------------|
+| [`flash init`](/flash/cli/init) | Create a new Flash project with a template structure |
+| [`flash run`](/flash/cli/run) | Start the local development server with automatic updates |
+| [`flash build`](/flash/cli/build) | Build a deployment artifact without deploying |
+| [`flash deploy`](/flash/cli/deploy) | Build and deploy your application to Runpod |
+| [`flash env`](/flash/cli/env) | Manage deployment environments |
+| [`flash app`](/flash/cli/app) | Manage Flash applications |
+| [`flash undeploy`](/flash/cli/undeploy) | Remove deployed endpoints |
+
+## Getting help
+
+View help for any command by adding `--help`:
+
+```bash
+flash --help
+flash deploy --help
+flash env --help
+```
+
+## Common workflows
+
+### Local development
+
+```bash
+# Create a new project
+flash init PROJECT_NAME
+cd PROJECT_NAME
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Add your API key to .env
+# Start the development server
+flash run
+```
+
+### Deploy to production
+
+```bash
+# Build and deploy
+flash deploy
+
+# Deploy to a specific environment
+flash deploy --env ENVIRONMENT_NAME
+```
+
+### Manage deployments
+
+```bash
+# List environments
+flash env list
+
+# Check environment status
+flash env get ENVIRONMENT_NAME
+
+# Remove an environment
+flash env delete ENVIRONMENT_NAME
+```
+
+### Clean up endpoints
+
+```bash
+# List deployed endpoints
+flash undeploy list
+
+# Remove specific endpoint
+flash undeploy ENDPOINT_NAME
+
+# Remove all endpoints
+flash undeploy --all
+```
+
+## Next steps
+
+- [Create a project](/flash/cli/init) with `flash init`.
+- [Start developing](/flash/cli/run) with `flash run`.
+- [Deploy your app](/flash/cli/deploy) with `flash deploy`.
diff --git a/flash/cli/run.mdx b/flash/cli/run.mdx
new file mode 100644
index 00000000..4dab9e6c
--- /dev/null
+++ b/flash/cli/run.mdx
@@ -0,0 +1,156 @@
+---
+title: "run"
+sidebarTitle: "run"
+---
+
+Start the Flash development server for local testing with automatic updates. Your FastAPI app runs locally while `@remote` functions execute on Runpod Serverless.
+
+```bash
+flash run [OPTIONS]
+```
+
+## Example
+
+Start the development server with defaults:
+
+```bash
+flash run
+```
+
+Start with auto-provisioning to eliminate cold-start delays:
+
+```bash
+flash run --auto-provision
+```
+
+Start on a custom port:
+
+```bash
+flash run --port 3000
+```
+
+## Flags
+
+
+Host address to bind the server to.
+
+
+
+Port number to bind the server to.
+
+
+
+Enable or disable auto-reload on code changes. Enabled by default.
+
+
+
+Auto-provision all Serverless endpoints on startup instead of lazily on first call. Eliminates cold-start delays during development.
+
+
+## Architecture
+
+With `flash run`, your system runs in a hybrid architecture:
+
+```mermaid
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#9289FE','primaryTextColor':'#fff','primaryBorderColor':'#9289FE','lineColor':'#5F4CFE','secondaryColor':'#AE6DFF','tertiaryColor':'#FCB1FF','edgeLabelBackground':'#5F4CFE', 'fontSize':'14px','fontFamily':'font-inter'}}}%%
+
+flowchart TB
+ subgraph Local ["YOUR MACHINE (localhost:8888)"]
+ FastAPI["FastAPI App (main.py)
+• Your HTTP routes
+• Orchestrates @remote calls
+• Updates automatically"]
+ end
+
+ subgraph Runpod ["RUNPOD SERVERLESS"]
+ GPU["live-gpu-worker
+(your @remote function)"]
+ CPU["live-cpu-worker
+(your @remote function)"]
+ end
+
+ FastAPI -->|"HTTPS"| GPU
+ FastAPI -->|"HTTPS"| CPU
+
+ style Local fill:#1a1a2e,stroke:#5F4CFE,stroke-width:2px,color:#fff
+ style Runpod fill:#1a1a2e,stroke:#5F4CFE,stroke-width:2px,color:#fff
+ style FastAPI fill:#5F4CFE,stroke:#5F4CFE,color:#fff
+ style GPU fill:#22C55E,stroke:#22C55E,color:#000
+ style CPU fill:#22C55E,stroke:#22C55E,color:#000
+```
+
+**Key points:**
+
+- Your FastAPI app runs locally and updates automatically for rapid iteration.
+- `@remote` functions run on Runpod as Serverless endpoints.
+- Endpoints are prefixed with `live-` to distinguish from production.
+- Changes to local code are picked up instantly.
+
+This is different from `flash deploy`, where everything runs on Runpod.
+
+## Auto-provisioning
+
+By default, endpoints are provisioned lazily on first `@remote` function call. Use `--auto-provision` to provision all endpoints at server startup:
+
+```bash
+flash run --auto-provision
+```
+
+### How it works
+
+1. **Discovery**: Scans your app for `@remote` decorated functions.
+2. **Deployment**: Deploys resources concurrently (up to 3 at a time).
+3. **Confirmation**: Asks for confirmation if deploying more than 5 endpoints.
+4. **Caching**: Stores deployed resources in `.runpod/resources.pkl` for reuse.
+5. **Updates**: Recognizes existing endpoints and updates if configuration changed.
+
+### Benefits
+
+- **Zero cold start**: All endpoints ready before you test them.
+- **Faster development**: No waiting for deployment on first HTTP call.
+- **Resource reuse**: Cached endpoints are reused across server restarts.
+
+### When to use
+
+- Local development with multiple endpoints.
+- Testing workflows that call multiple remote functions.
+- Debugging where you want deployment separated from handler logic.
+
+## Provisioning modes
+
+| Mode | When endpoints are deployed |
+|------|----------------------------|
+| Default (lazy) | On first `@remote` function call |
+| `--auto-provision` | At server startup |
+
+## Testing your API
+
+Once the server is running, test your endpoints:
+
+```bash
+# Health check
+curl http://localhost:8888/
+
+# Call a GPU endpoint
+curl -X POST http://localhost:8888/gpu/hello \
+ -H "Content-Type: application/json" \
+ -d '{"message": "Hello from GPU!"}'
+```
+
+Open http://localhost:8888/docs for the interactive API explorer.
+
+## Requirements
+
+- `RUNPOD_API_KEY` must be set in your `.env` file or environment.
+- A valid Flash project structure (created by `flash init` or manually).
+
+## flash run vs flash deploy
+
+| Aspect | `flash run` | `flash deploy` |
+|--------|-------------|----------------|
+| FastAPI app runs on | Your machine (localhost) | Runpod Serverless |
+| `@remote` functions run on | Runpod Serverless | Runpod Serverless |
+| Endpoint naming | `live-` prefix | No prefix |
+| Automatic updates | Yes | No |
+| Use case | Development | Production |
+
+## Related commands
+
+- [`flash init`](/flash/cli/init) - Create a new project
+- [`flash deploy`](/flash/cli/deploy) - Deploy to production
+- [`flash undeploy`](/flash/cli/undeploy) - Remove endpoints
diff --git a/flash/cli/undeploy.mdx b/flash/cli/undeploy.mdx
new file mode 100644
index 00000000..8225182f
--- /dev/null
+++ b/flash/cli/undeploy.mdx
@@ -0,0 +1,213 @@
+---
+title: "undeploy"
+sidebarTitle: "undeploy"
+---
+
+Manage and delete Runpod Serverless endpoints deployed via Flash. Use this command to clean up endpoints created during local development with `flash run`.
+
+```bash
+flash undeploy [NAME|list] [OPTIONS]
+```
+
+## Example
+
+List all tracked endpoints:
+
+```bash
+flash undeploy list
+```
+
+Remove a specific endpoint:
+
+```bash
+flash undeploy ENDPOINT_NAME
+```
+
+Remove all endpoints:
+
+```bash
+flash undeploy --all
+```
+
+## Usage modes
+
+### List endpoints
+
+Display all tracked endpoints with their current status:
+
+```bash
+flash undeploy list
+```
+
+Output includes:
+
+- **Name**: Endpoint name
+- **Endpoint ID**: Runpod endpoint identifier
+- **Status**: Current health status (Active/Inactive/Unknown)
+- **Type**: Resource type (Live Serverless, Cpu Live Serverless, etc.)
+
+**Status indicators:**
+
+| Status | Meaning |
+|--------|---------|
+| Active | Endpoint is running and responding |
+| Inactive | Tracking exists but endpoint deleted externally |
+| Unknown | Error during health check |
+
+### Undeploy by name
+
+Delete a specific endpoint:
+
+```bash
+flash undeploy ENDPOINT_NAME
+```
+
+This:
+
+1. Searches for endpoints matching the name.
+2. Shows endpoint details.
+3. Prompts for confirmation.
+4. Deletes the endpoint from Runpod.
+5. Removes from local tracking.
+
+### Undeploy all
+
+Delete all tracked endpoints (requires double confirmation):
+
+```bash
+flash undeploy --all
+```
+
+Safety features:
+
+1. Shows total count of endpoints.
+2. First confirmation: Yes/No prompt.
+3. Second confirmation: Type "DELETE ALL" exactly.
+4. Deletes all endpoints from Runpod.
+5. Removes all from tracking.
+
+### Interactive selection
+
+Select endpoints to undeploy using checkboxes:
+
+```bash
+flash undeploy --interactive
+```
+
+Use arrow keys to navigate, space bar to select/deselect, and Enter to confirm.
+
+### Clean up stale tracking
+
+Remove inactive endpoints from tracking without API deletion:
+
+```bash
+flash undeploy --cleanup-stale
+```
+
+Use this when endpoints were deleted via the Runpod console or API (not through Flash). The local tracking file (`.runpod/resources.pkl`) becomes stale, and this command cleans it up.
+
+## Flags
+
+
+Undeploy all tracked endpoints. Requires double confirmation for safety.
+
+
+
+Interactive checkbox selection mode. Select multiple endpoints to undeploy.
+
+
+
+Remove inactive endpoints from local tracking without attempting API deletion. Use when endpoints were deleted externally.
+
+
+## Arguments
+
+
+Name of the endpoint to undeploy. Use `list` to show all endpoints.
+
+
+## undeploy vs env delete
+
+| Command | Scope | When to use |
+|---------|-------|-------------|
+| `flash undeploy` | Individual endpoints from local tracking | Development cleanup, granular control |
+| `flash env delete` | Entire environment + all resources | Production cleanup, full teardown |
+
+For production deployments, use `flash env delete` to remove entire environments and all associated resources.
+
+## How tracking works
+
+Flash tracks deployed endpoints in `.runpod/resources.pkl`. Endpoints are added when you:
+
+- Run `flash run --auto-provision`
+- Run `flash run` and call `@remote` functions
+- Run `flash deploy`
+
+The tracking file is in `.gitignore` and should never be committed. It contains local deployment state.
+
+## Common workflows
+
+### Basic cleanup
+
+```bash
+# Check what's deployed
+flash undeploy list
+
+# Remove a specific endpoint
+flash undeploy ENDPOINT_NAME
+
+# Clean up stale tracking
+flash undeploy --cleanup-stale
+```
+
+### Bulk operations
+
+```bash
+# Undeploy all endpoints
+flash undeploy --all
+
+# Interactive selection
+flash undeploy --interactive
+```
+
+### Managing external deletions
+
+If you delete endpoints via the Runpod console:
+
+```bash
+# Check status - will show as "Inactive"
+flash undeploy list
+
+# Remove stale tracking entries
+flash undeploy --cleanup-stale
+```
+
+## Troubleshooting
+
+### Endpoint shows as "Inactive"
+
+The endpoint was deleted via Runpod console or API. Clean up:
+
+```bash
+flash undeploy --cleanup-stale
+```
+
+### Can't find endpoint by name
+
+Check the exact name:
+
+```bash
+flash undeploy list
+```
+
+### Undeploy fails with API error
+
+1. Check `RUNPOD_API_KEY` in `.env`.
+2. Verify network connectivity.
+3. Check if the endpoint still exists on Runpod.
+
+## Related commands
+
+- [`flash run`](/flash/cli/run) - Development server (creates endpoints)
+- [`flash deploy`](/flash/cli/deploy) - Deploy to Runpod
+- [`flash env delete`](/flash/cli/env) - Delete entire environment
diff --git a/flash/monitoring.mdx b/flash/monitoring.mdx
new file mode 100644
index 00000000..96212791
--- /dev/null
+++ b/flash/monitoring.mdx
@@ -0,0 +1,177 @@
+---
+title: "Monitor and debug remote functions"
+sidebarTitle: "Monitor and debug"
+description: "Monitor, debug, and troubleshoot Flash deployments."
+tag: "BETA"
+---
+
+This page covers how to monitor and debug your Flash deployments, including viewing logs, troubleshooting common issues, and optimizing performance.
+
+## Viewing logs
+
+When running Flash functions, logs are displayed in your terminal. The output includes:
+
+- Endpoint creation and reuse status.
+- Job submission and queue status.
+- Execution progress.
+- Worker information (delay time, execution time).
+
+Example output:
+
+```text
+2025-11-19 12:35:15,109 | INFO | Created endpoint: rb50waqznmn2kg - flash-quickstart-fb
+2025-11-19 12:35:15,112 | INFO | URL: https://console.runpod.io/serverless/user/endpoint/rb50waqznmn2kg
+2025-11-19 12:35:15,114 | INFO | LiveServerless:rb50waqznmn2kg | API /run
+2025-11-19 12:35:15,655 | INFO | LiveServerless:rb50waqznmn2kg | Started Job:b0b341e7-e460-4305-9acd-fc2dfd1bd65c-u2
+2025-11-19 12:35:15,762 | INFO | Job:b0b341e7-e460-4305-9acd-fc2dfd1bd65c-u2 | Status: IN_QUEUE
+2025-11-19 12:36:09,983 | INFO | Job:b0b341e7-e460-4305-9acd-fc2dfd1bd65c-u2 | Status: COMPLETED
+2025-11-19 12:36:10,068 | INFO | Worker:icmkdgnrmdf8gz | Delay Time: 51842 ms
+2025-11-19 12:36:10,068 | INFO | Worker:icmkdgnrmdf8gz | Execution Time: 1533 ms
+```
+
+### Log levels
+
+You can control log verbosity using the `LOG_LEVEL` environment variable:
+
+```bash
+LOG_LEVEL=DEBUG python your_script.py
+```
+
+Available log levels: `DEBUG`, `INFO`, `WARNING`, `ERROR`.
+
+## Monitoring in the Runpod console
+
+View detailed metrics and logs in the [Runpod console](https://www.runpod.io/console/serverless):
+
+1. Navigate to the **Serverless** section.
+2. Click on your endpoint to view:
+ - Active workers and queue depth.
+ - Request history and job status.
+ - Worker logs and execution details.
+ - Metrics (requests, latency, errors).
+
+### Endpoint metrics
+
+The console provides metrics including:
+
+- **Request rate**: Number of requests per minute.
+- **Queue depth**: Number of pending requests.
+- **Latency**: Average response time.
+- **Worker count**: Active and idle workers.
+- **Error rate**: Failed requests percentage.
+
+## Debugging common issues
+
+### Cold start delays
+
+If you're experiencing slow initial responses:
+
+- **Cause**: Workers need time to start, load dependencies, and initialize models.
+- **Solutions**:
+ - Set `workersMin=1` to keep at least one worker warm.
+ - Use smaller models or optimize model loading.
+ - Use `--auto-provision` with `flash run` for development.
+
+```python
+config = LiveServerless(
+ name="always-warm",
+ workersMin=1, # Keep one worker always running
+ idleTimeout=30 # Longer idle timeout
+)
+```
+
+### Timeout errors
+
+If requests are timing out:
+
+- **Cause**: Execution taking longer than the timeout limit.
+- **Solutions**:
+ - Increase `executionTimeoutMs` in your configuration.
+ - Optimize your function to run faster.
+ - Break long operations into smaller chunks.
+
+```python
+config = LiveServerless(
+ name="long-running",
+ executionTimeoutMs=600000 # 10 minutes
+)
+```
+
+### Memory errors
+
+If you're seeing out-of-memory errors:
+
+- **Cause**: Model or data too large for available GPU/CPU memory.
+- **Solutions**:
+ - Use a larger GPU type (e.g., `GpuGroup.AMPERE_80` for 80GB VRAM).
+ - Use model quantization or smaller batch sizes.
+ - Clear GPU memory between operations.
+
+```python
+config = LiveServerless(
+ name="large-model",
+ gpus=[GpuGroup.AMPERE_80], # A100 80GB
+ template=PodTemplate(containerDiskInGb=100) # More disk space
+)
+```
+
+### Dependency errors
+
+If packages aren't being installed correctly:
+
+- **Cause**: Missing or incompatible dependencies.
+- **Solutions**:
+ - Verify package names and versions in the `dependencies` list.
+ - Check that packages have Linux `x86_64` wheels available.
+ - Import packages inside the function, not at the top of the file.
+
+```python
+@remote(
+ resource_config=config,
+ dependencies=["torch==2.0.0", "transformers==4.36.0"] # Pin versions
+)
+def my_function(data):
+ import torch # Import inside the function
+ import transformers
+ # ...
+```
+
+### Authentication errors
+
+If you're seeing API key errors:
+
+- **Cause**: Missing or invalid Runpod API key.
+- **Solutions**:
+ - Verify your API key is set in the environment.
+ - Check that the `.env` file is in the correct directory.
+ - Ensure the API key has the required permissions.
+
+```bash
+# Check if API key is set
+echo $RUNPOD_API_KEY
+
+# Set API key directly
+export RUNPOD_API_KEY=your_api_key_here
+```
+
+## Performance optimization
+
+### Reducing cold starts
+
+- Set `workersMin=1` for endpoints that need fast responses.
+- Use `idleTimeout` to balance cost and warm worker availability.
+- Cache models on network volumes to reduce loading time.
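+
+A minimal sketch combining these settings (the values shown are illustrative):
+
+```python
+from runpod_flash import LiveServerless
+
+config = LiveServerless(
+    name="low-latency",
+    workersMin=1,    # keep one worker warm to avoid cold starts
+    idleTimeout=60,  # keep workers alive longer between requests
+)
+```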
+
+### Optimizing execution time
+
+- Profile your functions to identify bottlenecks.
+- Use appropriate GPU types for your workload.
+- Batch multiple inputs into a single request when possible.
+- Use async operations to parallelize independent tasks.
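+
+For independent remote calls, `asyncio.gather` is a straightforward way to parallelize work. A sketch assuming a `process_item` function decorated with `@remote`:
+
+```python
+import asyncio
+
+async def run_batch(items):
+    # Each call executes on its own Serverless worker; results return in input order.
+    return await asyncio.gather(*(process_item(item) for item in items))
+```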
+
+### Managing costs
+
+- Set appropriate `workersMax` limits to control scaling.
+- Use CPU workers for non-GPU tasks.
+- Monitor usage in the console to identify optimization opportunities.
+- Use shorter `idleTimeout` for sporadic workloads.
\ No newline at end of file
diff --git a/flash/overview.mdx b/flash/overview.mdx
new file mode 100644
index 00000000..9824ef64
--- /dev/null
+++ b/flash/overview.mdx
@@ -0,0 +1,318 @@
+---
+title: "Overview"
+sidebarTitle: "Overview"
+description: "Rapidly develop and deploy AI/ML apps with the Flash Python SDK."
+tag: "BETA"
+---
+
+
+Flash is currently in beta. [Join our Discord](https://discord.gg/cUpRmau42V) to provide feedback and get support.
+
+
+Flash is a Python SDK for developing and deploying AI workflows on [Runpod Serverless](/serverless/overview). You write Python functions locally, and Flash handles infrastructure management, GPU/CPU provisioning, dependency installation, and data transfer automatically.
+
+
+
+ Write a standalone Flash script for instant access to Runpod infrastructure.
+
+
+ Create a Flash app with a FastAPI server and deploy it on Runpod to serve production endpoints.
+
+
+
+## Why use Flash?
+
+**Flash is the easiest and fastest way to test and deploy AI/ML workloads on Runpod.** Whether you're prototyping a new model or deploying a production API, Flash handles the infrastructure complexity so you can focus on your code.
+
+When you run a `@remote` function, Flash:
+- Automatically provisions resources on Runpod's infrastructure.
+- Installs your dependencies automatically.
+- Runs your function on a remote GPU/CPU.
+- Returns the result to your local environment.
+
+You can specify the exact GPU hardware you need, from RTX 4090s to A100 80GB GPUs, for AI inference, training, and other compute-intensive tasks. Functions scale automatically based on demand and can run in parallel across multiple resources.
+
+Flash uses [Runpod's Serverless pricing](/serverless/pricing) with per-second billing. You're only charged for actual compute time; there are no costs when your code isn't running.
+
+## Install Flash
+
+
+Flash requires Python 3.10 or higher.
+
+
+Create a Python virtual environment and use `pip` to install Flash:
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install runpod-flash
+```
+
+In your project directory, create a `.env` file and add your Runpod API key, replacing `YOUR_API_KEY` with your actual API key:
+
+```bash
+touch .env && echo "RUNPOD_API_KEY=YOUR_API_KEY" > .env
+```
+
+## Core concepts
+
+### Remote functions
+
+The `@remote` decorator marks functions for execution on Runpod's infrastructure. Code inside the decorated function runs remotely on a Serverless worker, while code outside the function runs locally on your machine.
+
+```python
+@remote(resource_config=config, dependencies=["pandas"])
+def process_data(data):
+ # This code runs remotely on Runpod
+ import pandas as pd
+ df = pd.DataFrame(data)
+ return df.describe().to_dict()
+
+async def main():
+ # This code runs locally
+ result = await process_data(my_data)
+```
+
+### Resource configuration
+
+Flash provides fine-grained control over hardware allocation through configuration objects. You can configure GPU types, worker counts, idle timeouts, environment variables, and more.
+
+```python
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+gpu_config = LiveServerless(
+ name="ml-inference",
+ gpus=[GpuGroup.AMPERE_80], # A100 80GB
+ workersMax=5
+)
+```
+
+[View the complete configuration reference](/flash/resource-configuration).
+
+### Dependency management
+
+Specify Python packages in the decorator, and Flash installs them automatically on the remote worker:
+
+```python
+@remote(
+ resource_config=gpu_config,
+ dependencies=["transformers==4.36.0", "torch", "pillow"]
+)
+def generate_image(prompt):
+ # Import inside the function
+ from transformers import pipeline
+ # ...
+```
+
+Imports should be placed inside the function body because they need to happen on the remote worker, not in your local environment.
+
+### Parallel execution
+
+Run multiple remote functions concurrently using Python's async capabilities:
+
+```python
+results = await asyncio.gather(
+ process_item(item1),
+ process_item(item2),
+ process_item(item3)
+)
+```
+
+## Development workflows
+
+Flash supports two main methods for running workloads on Runpod: standalone scripts and Flash apps.
+
+
+### Standalone scripts
+
+This is the fastest way to get started with Flash. Just write a Python script with `@remote` decorated functions and run it locally with `python script.py`.
+
+```python
+import asyncio
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+config = LiveServerless(
+ name="gpu-inference",
+ gpus=[GpuGroup.ADA_24],
+)
+
+@remote(resource_config=config, dependencies=["torch"])
+def process_on_gpu(data):
+ import torch
+ # Your GPU workload here
+ return {"result": "processed"}
+
+async def main():
+ result = await process_on_gpu({"input": "data"})
+ print(result)
+
+if __name__ == "__main__":
+ asyncio.run(main())
+```
+
+Run the script locally, and Flash executes the `@remote` function on Runpod's infrastructure:
+
+```bash
+python my_script.py
+```
+
+**Use this approach for:**
+- Quick prototypes and experiments.
+- Batch processing jobs.
+- One-off data processing tasks.
+- Local development and testing.
+
+[Follow the quickstart](/flash/quickstart) to create your first Flash script.
+
+### Flash apps
+
+Build FastAPI applications with HTTP endpoints that run on Runpod Serverless. Flash apps provide a complete development and deployment workflow with local testing and production deployment.
+
+```python
+# main.py
+from fastapi import FastAPI
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+app = FastAPI()
+
+config = LiveServerless(
+ name="api-worker",
+ gpus=[GpuGroup.ADA_24],
+)
+
+@remote(resource_config=config, dependencies=["torch"])
+def inference(prompt: str):
+ import torch
+ # Your inference logic
+ return {"output": "result"}
+
+@app.post("/inference")
+async def inference_endpoint(prompt: str):
+ result = await inference(prompt)
+ return result
+```
+
+Develop and test locally with automatic updates:
+
+```bash
+flash run
+```
+
+Deploy to production when ready:
+
+```bash
+flash deploy
+```
+
+**Use this approach for:**
+
+- Production HTTP APIs.
+- Persistent endpoints.
+- Long-running services.
+- Team collaboration with staging/production environments.
+
+[Follow this tutorial](/flash/apps/build-app) to build your first Flash app.
+
+
+### Flash app workflow
+
+1. **Initialize**: Create a project with `flash init`
+2. **Develop**: Write your FastAPI app with `@remote` functions
+3. **Test locally**: Run `flash run` to test with automatic updates
+4. **Deploy**: Run `flash deploy` to push to production
+
+This workflow is ideal for production APIs and services that need persistent endpoints.
+
+```mermaid
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#9289FE','primaryTextColor':'#fff','primaryBorderColor':'#9289FE','lineColor':'#5F4CFE','secondaryColor':'#AE6DFF','tertiaryColor':'#FCB1FF','edgeLabelBackground':'#5F4CFE', 'fontSize':'14px','fontFamily':'font-inter'}}}%%
+
+flowchart LR
+ Init["flash init"]
+ Dev["Write code"]
+ Run["flash run
+(test locally)"]
+ Deploy["flash deploy
+(production)"]
+
+ Init --> Dev
+ Dev --> Run
+ Run -->|"Ready"| Deploy
+ Run -->|"Continue developing"| Dev
+
+ style Init fill:#5F4CFE,stroke:#5F4CFE,color:#fff
+ style Dev fill:#22C55E,stroke:#22C55E,color:#000
+ style Run fill:#4D38F5,stroke:#4D38F5,color:#fff
+ style Deploy fill:#AE6DFF,stroke:#AE6DFF,color:#000
+```
+
+[Learn more about the Flash app workflow](/flash/apps/overview).
+
+
+
+## CLI commands
+
+Flash provides CLI commands for managing Flash apps:
+
+| Command | Description |
+|---------|-------------|
+| [`flash init`](/flash/cli/init) | Create a new Flash app project |
+| [`flash run`](/flash/cli/run) | Start the local development server |
+| [`flash build`](/flash/cli/build) | Build a deployment artifact |
+| [`flash deploy`](/flash/cli/deploy) | Build and deploy to Runpod |
+| [`flash env`](/flash/cli/env) | Manage deployment environments |
+| [`flash app`](/flash/cli/app) | Manage Flash applications |
+| [`flash undeploy`](/flash/cli/undeploy) | Remove deployed endpoints |
+
+
+CLI commands are primarily for Flash apps. Standalone scripts don't require the CLI—just run them with `python`.
+
+
+See the [CLI reference](/flash/cli/overview) for detailed documentation on each command.
+
+## Use cases
+
+Flash is well-suited for a range of AI and data processing workloads:
+
+- **Multi-modal AI pipelines**: Orchestrate unified workflows combining text, image, and audio models with GPU acceleration.
+- **Distributed model training**: Scale training operations across multiple GPU workers for faster model development.
+- **AI research experimentation**: Rapidly prototype and test complex model combinations without infrastructure overhead.
+- **Production inference systems**: Deploy multi-stage inference pipelines for real-world applications.
+- **Data processing workflows**: Process large datasets using CPU workers for general computation and GPU workers for accelerated tasks.
+- **Hybrid GPU/CPU workflows**: Optimize cost and performance by combining CPU preprocessing with GPU inference.
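+
+As a sketch of the hybrid pattern (names and instance types are illustrative), a CPU worker can handle preprocessing and feed a GPU worker for inference:
+
+```python
+from runpod_flash import remote, LiveServerless, GpuGroup, CpuInstanceType
+
+cpu_config = LiveServerless(name="preprocess", instanceIds=[CpuInstanceType.CPU5C_2_4])
+gpu_config = LiveServerless(name="inference", gpus=[GpuGroup.ADA_24])
+
+@remote(resource_config=cpu_config, dependencies=["pandas"])
+def preprocess(raw_rows):
+    import pandas as pd
+    return pd.DataFrame(raw_rows).dropna().to_dict("records")
+
+@remote(resource_config=gpu_config, dependencies=["torch"])
+def infer(records):
+    import torch
+    return {"processed": len(records)}
+
+async def pipeline(raw_rows):
+    return await infer(await preprocess(raw_rows))
+```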
+
+## Limitations
+
+- Serverless deployments using Flash are currently restricted to the `EU-RO-1` datacenter.
+- Be aware of your account's maximum worker capacity limits. Flash can rapidly scale workers across multiple endpoints, and you may hit capacity constraints. Contact [Runpod support](https://www.runpod.io/contact) to increase your account's capacity allocation if needed.
+
+## Next steps
+
+
+
+ Write your first standalone script with Flash
+
+
+ Create a FastAPI app with Flash
+
+
+ Complete reference for resource configuration
+
+
+ Learn about Flash CLI commands
+
+
+
+
+## Coding agent integration
+
+Flash provides a skill package for AI coding agents like Claude Code, Cline, and Cursor. The skill gives these agents detailed context about the Flash SDK, CLI, best practices, and common patterns.
+
+Install the Flash skill by running the following command in your terminal:
+
+```bash
+npx skills add runpod/skills
+```
+
+This allows your coding agent to provide more accurate Flash code suggestions and troubleshooting help. See the [runpod/skills repository](https://github.com/runpod/skills) for more details.
+
+## Getting help
+
+Join the [Runpod community on Discord](https://discord.gg/cUpRmau42V) for support and discussion.
diff --git a/flash/pricing.mdx b/flash/pricing.mdx
new file mode 100644
index 00000000..28ca0df8
--- /dev/null
+++ b/flash/pricing.mdx
@@ -0,0 +1,109 @@
+---
+title: "Pricing"
+sidebarTitle: "Pricing"
+description: "Understand Flash pricing and optimize your costs."
+tag: "BETA"
+---
+
+Flash follows the same pricing model as [Runpod Serverless](/serverless/pricing). You pay per second of compute time, with no charges when your code isn't running. Pricing depends on the GPU or CPU type you configure for your endpoints.
+
+## How pricing works
+
+You're billed from when a worker starts until it completes your request, plus any idle time before scaling down. If a worker is already warm, you skip the cold start and only pay for execution time.
+
+### Compute cost breakdown
+
+Flash workers incur charges during these periods:
+
+1. **Start time**: The time required to initialize a worker and load models into GPU memory. This includes starting the container, installing dependencies, and preparing the runtime environment.
+2. **Execution time**: The time spent processing your request (running your `@remote` decorated function).
+3. **Idle time**: The period a worker remains active after completing a request, waiting for additional requests before scaling down.
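+
+As a rough worked example (the per-second rate below is hypothetical; see the Serverless pricing page for real rates):
+
+```python
+# Hypothetical GPU rate of $0.00044 per second
+rate_per_second = 0.00044
+
+start_time = 20      # seconds to start the worker and load the model
+execution_time = 5   # seconds running the @remote function
+idle_time = 5        # seconds of idle before scale-down (idleTimeout)
+
+cost_per_request = (start_time + execution_time + idle_time) * rate_per_second
+print(f"~${cost_per_request:.4f} per cold request")  # ~$0.0132
+```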
+
+### Pricing by resource type
+
+Flash supports both GPU and CPU workers. Pricing varies based on the hardware type:
+
+- **GPU workers**: Use `LiveServerless` or `ServerlessEndpoint` with GPU configurations. Pricing depends on the GPU type (e.g., RTX 4090, A100 80GB).
+- **CPU workers**: Use `LiveServerless` or `CpuServerlessEndpoint` with CPU configurations. Pricing depends on the CPU instance type.
+
+See the [Serverless pricing page](/serverless/pricing) for current rates by GPU and CPU type.
+
+## How to estimate and optimize costs
+
+To estimate costs for your Flash workloads, consider:
+
+- How long each function takes to execute.
+- How many concurrent workers you need (`workersMax` setting).
+- Which GPU or CPU types you'll use.
+- Your idle timeout configuration (`idleTimeout` setting).
+
+### Cost optimization strategies
+
+#### Choose appropriate hardware
+
+Select the smallest GPU or CPU that meets your performance requirements. For example, if your workload fits in 24GB of VRAM, use `GpuGroup.ADA_24` or `GpuGroup.AMPERE_24` instead of larger GPUs.
+
+```python
+# Cost-effective configuration for workloads that fit in 24GB VRAM
+config = LiveServerless(
+ name="cost-optimized",
+ gpus=[GpuGroup.ADA_24, GpuGroup.AMPERE_24], # RTX 4090, L4, A5000, 3090
+)
+```
+
+#### Configure idle timeouts
+
+Balance responsiveness and cost by adjusting the `idleTimeout` parameter. Shorter timeouts reduce idle costs but increase cold starts for sporadic traffic.
+
+```python
+# Lower idle timeout for cost savings (more cold starts)
+config = LiveServerless(
+ name="low-idle",
+ idleTimeout=5, # 5 seconds (default)
+)
+
+# Higher idle timeout for responsiveness (higher idle costs)
+config = LiveServerless(
+ name="responsive",
+ idleTimeout=30, # 30 seconds
+)
+```
+
+#### Use CPU workers for non-GPU tasks
+
+For data preprocessing, postprocessing, or other tasks that don't require GPU acceleration, use CPU workers instead of GPU workers.
+
+```python
+from runpod_flash import LiveServerless, CpuInstanceType
+
+# CPU configuration for non-GPU tasks
+cpu_config = LiveServerless(
+ name="data-processor",
+ instanceIds=[CpuInstanceType.CPU5C_2_4], # 2 vCPU, 4GB RAM
+)
+```
+
+#### Limit maximum workers
+
+Set `workersMax` to prevent runaway scaling and unexpected costs:
+
+```python
+config = LiveServerless(
+ name="controlled-scaling",
+ workersMax=3, # Limit to 3 concurrent workers
+)
+```
+
+### Monitoring costs
+
+Monitor your usage in the [Runpod console](https://www.runpod.io/console/serverless) to track:
+
+- Total compute time across endpoints.
+- Worker utilization and idle time.
+- Cost breakdown by endpoint.
+
+## Next steps
+
+- [Create remote functions](/flash/remote-functions) with optimized resource configurations.
+- [View Serverless pricing details](/serverless/pricing) for current rates.
+- [Configure resources](/flash/resource-configuration) for your workloads.
diff --git a/flash/quickstart.mdx b/flash/quickstart.mdx
new file mode 100644
index 00000000..2eaaa675
--- /dev/null
+++ b/flash/quickstart.mdx
@@ -0,0 +1,341 @@
+---
+title: "Get started with Flash"
+sidebarTitle: "Quickstart"
+description: "Set up your development environment and run your first GPU workload with Flash."
+tag: "BETA"
+---
+
+This tutorial shows you how to set up Flash and run a GPU workload on Runpod Serverless. You'll create a remote function that performs matrix operations on a GPU and returns the results to your local machine.
+
+## What you'll learn
+
+In this tutorial you'll learn how to:
+
+- Set up your development environment for Flash.
+- Configure a Serverless endpoint using a `LiveServerless` object.
+- Create and define remote functions with the `@remote` decorator.
+- Deploy a GPU-based workload using Runpod resources.
+- Pass data between your local environment and remote workers.
+- Run multiple operations in parallel.
+
+## Requirements
+
+- You've [created a Runpod account](/get-started/manage-accounts).
+- You've [created a Runpod API key](/get-started/api-keys).
+- You've installed [Python 3.10 or higher](https://www.python.org/downloads/).
+
+## Step 1: Install Flash
+
+Create a Python virtual environment and use `pip` to install Flash:
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install runpod-flash
+```
+
+## Step 2: Add your API key to the environment
+
+Add your Runpod API key to your development environment before using Flash to run workloads.
+
+Run this command to create a `.env` file, replacing `YOUR_API_KEY` with your Runpod API key:
+
+```bash
+touch .env && echo "RUNPOD_API_KEY=YOUR_API_KEY" > .env
+```
+
+
+
+You can create this in your project's root directory or in the `/examples` folder. Make sure your `.env` file is in the same folder as the Python file you create in the next step.
+
+
+
+## Step 3: Create your project file
+
+Create a new file called `matrix_operations.py` in the same directory as your `.env` file:
+
+```bash
+touch matrix_operations.py
+```
+
+Open this file in your code editor. The following steps walk through building a matrix multiplication example that demonstrates Flash's remote execution and parallel processing capabilities.
+
+## Step 4: Add imports and load the .env file
+
+Add the necessary import statements:
+
+```python
+import asyncio
+from dotenv import load_dotenv
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+# Load environment variables from .env file
+load_dotenv()
+```
+
+This imports:
+
+- `asyncio`: Python's asynchronous programming library, which Flash uses for non-blocking execution.
+- `dotenv`: Loads environment variables from your `.env` file, including your Runpod API key.
+- `remote` and `LiveServerless`: The core Flash components for defining remote functions and their resource requirements.
+
+`load_dotenv()` reads your API key from the `.env` file and makes it available to Flash.
+
+## Step 5: Add Serverless endpoint configuration
+
+Define the Serverless endpoint configuration for your Flash workload:
+
+```python
+# Configuration for a Serverless endpoint using GPU workers
+gpu_config = LiveServerless(
+ gpus=[GpuGroup.AMPERE_24, GpuGroup.ADA_24], # Use any 24GB GPU
+ workersMax=3,
+ name="flash_gpu",
+)
+```
+
+This `LiveServerless` object defines:
+
+- `gpus=[GpuGroup.AMPERE_24, GpuGroup.ADA_24]`: The GPU pools that workers on this endpoint can use. This restricts workers to 24 GB GPUs (L4, A5000, 3090, or 4090). See [GPU pools](/references/gpu-types#gpu-pools) for available GPU pool IDs. Removing this parameter allows the endpoint to use any available GPU.
+- `workersMax=3`: The maximum number of worker instances.
+- `name="flash_gpu"`: The name of the endpoint that Flash creates or reuses in the Runpod console.
+
+If you run a Flash function with a `LiveServerless` configuration identical to a previous run, Runpod reuses the existing endpoint rather than creating a new one. If any configuration value changes (not only the `name` parameter), a new endpoint is created.
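+
+For example, a configuration that matches `gpu_config` exactly reuses the same endpoint, while a sketch like the following (identical except for `workersMax`) would provision a new one:
+
+```python
+# Identical to gpu_config above except workersMax, so Runpod creates a new endpoint
+gpu_config_v2 = LiveServerless(
+    gpus=[GpuGroup.AMPERE_24, GpuGroup.ADA_24],
+    workersMax=5,  # changed from 3
+    name="flash_gpu",
+)
+```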
+
+## Step 6: Define your remote function
+
+Define the function that will run on the GPU worker:
+
+```python
+@remote(
+ resource_config=gpu_config,
+ dependencies=["numpy", "torch"]
+)
+def flash_matrix_operations(size):
+ """Perform large matrix operations using NumPy and check GPU availability."""
+ import numpy as np
+ import torch
+
+ # Get GPU count and name
+ device_count = torch.cuda.device_count()
+ device_name = torch.cuda.get_device_name(0)
+
+ # Create large random matrices
+ A = np.random.rand(size, size)
+ B = np.random.rand(size, size)
+
+ # Perform matrix multiplication
+ C = np.dot(A, B)
+
+ return {
+ "matrix_size": size,
+ "result_shape": C.shape,
+ "result_mean": float(np.mean(C)),
+ "result_std": float(np.std(C)),
+ "device_count": device_count,
+ "device_name": device_name
+ }
+```
+
+This code demonstrates several key concepts:
+
+- `@remote`: The decorator that marks the function for remote execution on Runpod's infrastructure.
+- `resource_config=gpu_config`: The function runs using the GPU configuration defined earlier.
+- `dependencies=["numpy", "torch"]`: Python packages that must be installed on the remote worker.
+
+The `flash_matrix_operations` function:
+
+- Gets GPU details using PyTorch's CUDA utilities.
+- Creates two large random matrices using NumPy.
+- Performs matrix multiplication.
+- Returns statistics about the result and information about the GPU.
+
+Notice that `numpy` and `torch` are imported inside the function, not at the top of the file. These imports need to happen on the remote worker, not in your local environment.
+
+## Step 7: Add the main function
+
+Add a `main` function to execute your GPU workload:
+
+```python
+async def main():
+ # Run the GPU matrix operations
+ print("Starting large matrix operations on GPU...")
+ result = await flash_matrix_operations(1000)
+
+ # Print the results
+ print("\nMatrix operations results:")
+ print(f"Matrix size: {result['matrix_size']}x{result['matrix_size']}")
+ print(f"Result shape: {result['result_shape']}")
+ print(f"Result mean: {result['result_mean']:.4f}")
+ print(f"Result standard deviation: {result['result_std']:.4f}")
+
+ # Print GPU information
+ print("\nGPU Information:")
+ print(f"GPU device count: {result['device_count']}")
+ print(f"GPU device name: {result['device_name']}")
+
+if __name__ == "__main__":
+ asyncio.run(main())
+```
+
+The `main` function:
+
+- Calls the remote function with `await`, which runs it asynchronously on Runpod's infrastructure.
+- Prints the results of the matrix operations.
+- Displays information about the GPU that was used.
+
+`asyncio.run(main())` is Python's standard way to execute an asynchronous `main` function from synchronous code.
+
+All code outside of the `@remote` decorated function runs on your local machine. The `main` function acts as a bridge between your local environment and Runpod's cloud infrastructure, allowing you to send input data to remote functions, wait for remote execution to complete without blocking your local process, and process returned results locally.
+
+The `await` keyword pauses execution of the `main` function until the remote operation completes, but doesn't block the entire Python process.
+
+## Step 8: Run your GPU example
+
+Run the example:
+
+```bash
+python matrix_operations.py
+```
+
+You should see output similar to this:
+
+```text
+Starting large matrix operations on GPU...
+Resource LiveServerless_33e1fa59c64b611c66c5a778e120c522 already exists, reusing.
+Registering RunPod endpoint: server_LiveServerless_33e1fa59c64b611c66c5a778e120c522 at https://api.runpod.ai/xvf32dan8rcilp
+Initialized RunPod stub for endpoint: https://api.runpod.ai/xvf32dan8rcilp (ID: xvf32dan8rcilp)
+Executing function on RunPod endpoint ID: xvf32dan8rcilp
+Initial job status: IN_QUEUE
+Job completed, output received
+
+Matrix operations results:
+Matrix size: 1000x1000
+Result shape: (1000, 1000)
+Result mean: 249.8286
+Result standard deviation: 6.8704
+
+GPU Information:
+GPU device count: 1
+GPU device name: NVIDIA GeForce RTX 4090
+```
+
+
+If you're having trouble running your code due to authentication issues:
+
+1. Verify your `.env` file is in the same directory as your `matrix_operations.py` file.
+2. Check that the API key in your `.env` file is correct and properly formatted.
+
+Alternatively, you can set the API key directly in your terminal:
+
+
+On macOS or Linux:
+
+```bash
+export RUNPOD_API_KEY=[YOUR_API_KEY]
+```
+
+On Windows (Command Prompt):
+
+```bash
+set RUNPOD_API_KEY=[YOUR_API_KEY]
+```
+
+
+
+
+## Step 9: Understand what's happening
+
+When you run this script:
+
+1. Flash reads your GPU resource configuration and provisions a worker on Runpod.
+2. It installs the required dependencies (NumPy and PyTorch) on the worker.
+3. Your `flash_matrix_operations` function runs on the remote worker.
+4. The function creates and multiplies large matrices, then calculates statistics.
+5. Your local `main` function receives these results and displays them in your terminal.
+
+## Step 10: Run multiple operations in parallel
+
+Flash makes it easy to run multiple remote operations in parallel.
+
+Replace your `main` function with this code:
+
+```python
+async def main():
+ # Run multiple matrix operations in parallel
+ print("Starting large matrix operations on GPU...")
+
+ # Run all matrix operations in parallel
+ results = await asyncio.gather(
+ flash_matrix_operations(500),
+ flash_matrix_operations(1000),
+ flash_matrix_operations(2000)
+ )
+
+ print("\nMatrix operations results:")
+
+ # Print the results for each matrix size
+ for result in results:
+ print(f"\nMatrix size: {result['matrix_size']}x{result['matrix_size']}")
+ print(f"Result shape: {result['result_shape']}")
+ print(f"Result mean: {result['result_mean']:.4f}")
+ print(f"Result standard deviation: {result['result_std']:.4f}")
+
+if __name__ == "__main__":
+ asyncio.run(main())
+```
+
+This updated `main` function demonstrates Flash's ability to run multiple operations in parallel using `asyncio.gather()`. Instead of running one matrix operation at a time, you're launching three operations with different matrix sizes (500, 1000, and 2000) simultaneously. This parallel execution significantly improves efficiency when you have multiple independent tasks.
+
+Run the example again:
+
+```bash
+python matrix_operations.py
+```
+
+You should see results for all three matrix sizes after the operations complete:
+
+```text
+Initial job status: IN_QUEUE
+Initial job status: IN_QUEUE
+Initial job status: IN_QUEUE
+Job completed, output received
+Job completed, output received
+Job completed, output received
+
+Matrix size: 500x500
+Result shape: (500, 500)
+Result mean: 125.3097
+Result standard deviation: 5.0425
+
+Matrix size: 1000x1000
+Result shape: (1000, 1000)
+Result mean: 249.9442
+Result standard deviation: 7.1072
+
+Matrix size: 2000x2000
+Result shape: (2000, 2000)
+Result mean: 500.1321
+Result standard deviation: 9.8879
+```
+
+## Clean up
+
+When you're done testing, you can clean up the endpoints created during this tutorial. Use the [`flash undeploy`](/flash/cli/undeploy) command to remove development endpoints:
+
+```bash
+# List all endpoints
+flash undeploy list
+
+# Remove a specific endpoint
+flash undeploy live-ENDPOINT_NAME
+
+# Remove all endpoints
+flash undeploy --all
+```
+
+## Next steps
+
+You've successfully used Flash to run a GPU workload on Runpod. Now you can:
+
+- [Create more complex remote functions](/flash/remote-functions) with custom dependencies and resource configurations.
+- [Build and deploy Flash apps](/flash/apps/overview) for production use.
+- Explore more examples on the [runpod-workers/flash](https://github.com/runpod-workers/flash) GitHub repository.
diff --git a/flash/remote-functions.mdx b/flash/remote-functions.mdx
new file mode 100644
index 00000000..dff3baca
--- /dev/null
+++ b/flash/remote-functions.mdx
@@ -0,0 +1,263 @@
+---
+title: "Create remote functions"
+sidebarTitle: "Create remote functions"
+description: "Learn how to create and configure remote functions with Flash."
+tag: "BETA"
+---
+
+Remote functions are the core building blocks of Flash. The `@remote` decorator marks Python functions for execution on Runpod's Serverless infrastructure, handling resource provisioning, dependency installation, and data transfer automatically.
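+
+Here's a minimal sketch of the pattern (the endpoint name is illustrative); the sections below cover each piece in detail:
+
+```python
+import asyncio
+from runpod_flash import remote, LiveServerless
+
+# With no gpus parameter, the endpoint can use any available GPU
+config = LiveServerless(name="hello-flash")
+
+@remote(resource_config=config)
+def hello(name):
+    # Runs on a Runpod Serverless worker, not your local machine
+    return f"Hello from the worker, {name}!"
+
+async def main():
+    print(await hello("Flash"))
+
+if __name__ == "__main__":
+    asyncio.run(main())
+```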
+
+## Resource configuration
+
+Every remote function requires a resource configuration that specifies the compute resources to use. Flash provides several configuration classes for different use cases.
+
+### LiveServerless
+
+`LiveServerless` is the primary configuration class for Flash. It supports full remote code execution, allowing you to run arbitrary Python functions on Runpod's infrastructure.
+
+```python
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+gpu_config = LiveServerless(
+ name="ml-inference",
+ gpus=[GpuGroup.AMPERE_80], # A100 80GB
+ workersMax=5,
+ idleTimeout=10
+)
+
+@remote(resource_config=gpu_config, dependencies=["torch"])
+def run_inference(data):
+ import torch
+    # Your inference code here; this placeholder just sums the input as a tensor
+    result = torch.tensor(data).sum().item()
+    return result
+```
+
+Common configuration options:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `name` | Name for your endpoint (required) | - |
+| `gpus` | GPU pool IDs that can be used | `[GpuGroup.ANY]` |
+| `workersMax` | Maximum number of workers | 3 |
+| `workersMin` | Minimum number of workers | 0 |
+| `idleTimeout` | Minutes before scaling down | 5 |
+
+See the [resource configuration reference](/flash/resource-configuration) for all available options.
+
+### CPU configuration
+
+For CPU-only workloads, specify `instanceIds` instead of `gpus`:
+
+```python
+from runpod_flash import remote, LiveServerless, CpuInstanceType
+
+cpu_config = LiveServerless(
+ name="data-processor",
+ instanceIds=[CpuInstanceType.CPU5C_4_8], # 4 vCPU, 8GB RAM
+ workersMax=3
+)
+
+@remote(resource_config=cpu_config, dependencies=["pandas"])
+def process_data(data):
+ import pandas as pd
+ df = pd.DataFrame(data)
+ return df.describe().to_dict()
+```
+
+## Dependency management
+
+Specify Python packages in the `dependencies` parameter of the `@remote` decorator. Flash installs these packages on the remote worker before executing your function.
+
+```python
+@remote(
+ resource_config=config,
+ dependencies=["transformers==4.36.0", "torch", "pillow"]
+)
+def generate_image(prompt):
+ from transformers import pipeline
+ import torch
+ from PIL import Image
+ # Your code here
+```
+
+### Important notes about dependencies
+
+**Import inside the function**: Always import packages inside the decorated function body, not at the top of your file. These imports need to happen on the remote worker, not in your local environment.
+
+```python
+# Correct - imports inside the function
+@remote(resource_config=config, dependencies=["numpy"])
+def compute(data):
+ import numpy as np # Import here
+ return np.sum(data)
+
+# Incorrect - imports at top of file won't work
+import numpy as np # This import happens locally, not on the worker
+
+@remote(resource_config=config, dependencies=["numpy"])
+def compute(data):
+ return np.sum(data) # numpy not available on worker
+```
+
+**Version pinning**: You can pin specific versions using standard pip syntax:
+
+```python
+dependencies=["transformers==4.36.0", "torch>=2.0.0"]
+```
+
+**Pre-installed packages**: Some packages (like PyTorch) are pre-installed on GPU workers. Including them in dependencies ensures the correct version is available.
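+
+For example, pinning a pre-installed package guarantees the build your code was tested against (the version below is illustrative):
+
+```python
+@remote(resource_config=config, dependencies=["torch==2.1.2"])  # illustrative pin
+def check_torch_version():
+    import torch
+    # Confirm the worker is running the pinned build
+    return torch.__version__
+```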
+
+## Parallel execution
+
+Flash functions are asynchronous by default. Use Python's `asyncio` to run multiple functions in parallel:
+
+```python
+import asyncio
+
+async def main():
+    # Run three @remote functions (defined elsewhere) in parallel
+ results = await asyncio.gather(
+ process_item(item1),
+ process_item(item2),
+ process_item(item3)
+ )
+ return results
+```
+
+This is particularly useful for:
+
+- Batch processing multiple inputs.
+- Running different models on the same data.
+- Parallelizing independent pipeline stages.
+
+### Example: Parallel batch processing
+
+```python
+import asyncio
+from runpod_flash import remote, LiveServerless, GpuGroup
+
+config = LiveServerless(
+ name="batch-processor",
+ gpus=[GpuGroup.ADA_24],
+ workersMax=5 # Allow up to 5 parallel workers
+)
+
+@remote(resource_config=config, dependencies=["torch"])
+def process_batch(batch_id, data):
+ import torch
+ # Process batch
+ return {"batch_id": batch_id, "result": len(data)}
+
+async def main():
+ batches = [
+ (1, [1, 2, 3]),
+ (2, [4, 5, 6]),
+ (3, [7, 8, 9])
+ ]
+
+ # Process all batches in parallel
+ results = await asyncio.gather(*[
+ process_batch(batch_id, data)
+ for batch_id, data in batches
+ ])
+
+ print(results)
+
+if __name__ == "__main__":
+ asyncio.run(main())
+```
+
+## Custom Docker images
+
+For specialized environments that require a custom Docker image, use `ServerlessEndpoint` or `CpuServerlessEndpoint` instead of `LiveServerless`:
+
+```python
+from runpod_flash import ServerlessEndpoint, GpuGroup
+
+custom_gpu = ServerlessEndpoint(
+ name="custom-ml-env",
+ imageName="pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime",
+ gpus=[GpuGroup.AMPERE_80]
+)
+```
+
+
+
+Unlike `LiveServerless`, `ServerlessEndpoint` and `CpuServerlessEndpoint` only support dictionary payloads in the form of `{"input": {...}}` (similar to a traditional [Serverless endpoint request](/serverless/endpoints/send-requests)). They cannot execute arbitrary Python functions remotely.
+
+
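+A request to an endpoint like `custom_gpu` is just a dictionary payload; a sketch using the `run()` pattern shown in the [configuration reference](/flash/resource-configuration):
+
+```python
+import asyncio
+
+async def main():
+    # ServerlessEndpoint accepts a standard {"input": {...}} payload
+    result = await custom_gpu.run({"input": {"prompt": "hello"}})
+    print(result)
+
+if __name__ == "__main__":
+    asyncio.run(main())
+```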
+
+Use custom Docker images when you need:
+
+- Pre-installed system-level dependencies.
+- Specific CUDA or cuDNN versions.
+- Custom base images with large models baked in.
+
+## Using persistent storage
+
+Attach [network volumes](/storage/network-volumes) for persistent storage across workers and endpoints. This is useful for sharing large models or datasets between workers without downloading them each time.
+
+```python
+from runpod_flash import LiveServerless, PodTemplate
+
+config = LiveServerless(
+ name="model-server",
+ networkVolumeId="vol_abc123", # Your network volume ID
+ template=PodTemplate(containerDiskInGb=100)
+)
+```
+
+To find your network volume ID:
+
+1. Go to the [Storage page](https://www.runpod.io/console/storage) in the Runpod console.
+2. Click on your network volume.
+3. Copy the volume ID from the URL or volume details.
+
+### Example: Using a network volume for model storage
+
+```python
+from runpod_flash import LiveServerless, GpuGroup, PodTemplate
+
+config = LiveServerless(
+ name="model-inference",
+ gpus=[GpuGroup.AMPERE_80],
+ networkVolumeId="vol_abc123",
+ template=PodTemplate(containerDiskInGb=100)
+)
+
+@remote(resource_config=config, dependencies=["torch", "transformers"])
+def run_inference(prompt):
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load model from network volume
+ model_path = "/runpod-volume/models/llama-7b"
+ model = AutoModelForCausalLM.from_pretrained(model_path)
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+
+ # Run inference
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(**inputs)
+ return tokenizer.decode(outputs[0])
+```
+
+## Environment variables
+
+Pass environment variables to remote functions using the `env` parameter:
+
+```python
+config = LiveServerless(
+ name="api-worker",
+ env={"HF_TOKEN": "your_token", "MODEL_ID": "gpt2"}
+)
+```
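+
+Inside the remote function, these values are available through the worker's environment. A minimal sketch (the variable names match the config above):
+
+```python
+@remote(resource_config=config)
+def load_model_id():
+    import os
+    # MODEL_ID is set on the worker by the env parameter above
+    return os.environ.get("MODEL_ID")
+```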
+
+
+
+Environment variables are excluded from configuration hashing. Changing environment values won't trigger endpoint recreation, which allows different processes to load environment variables from `.env` files without causing false drift detection.
+
+
+
+## Next steps
+
+- [Create API endpoints](/flash/apps/build-app) using FastAPI.
+- [Deploy Flash applications](/flash/apps/deploy-apps) for production.
+- [View the resource configuration reference](/flash/resource-configuration) for all available options.
+- [Clean up development endpoints](/flash/cli/undeploy) when you're done testing.
diff --git a/flash/resource-configuration.mdx b/flash/resource-configuration.mdx
new file mode 100644
index 00000000..00bb1710
--- /dev/null
+++ b/flash/resource-configuration.mdx
@@ -0,0 +1,269 @@
+---
+title: "Resource configuration reference"
+sidebarTitle: "Configuration reference"
+description: "A complete reference for Flash GPU/CPU resource configuration options."
+tag: "BETA"
+---
+
+Flash provides several resource configuration classes for different use cases. This reference covers all available parameters and options.
+
+## LiveServerless
+
+`LiveServerless` is the primary configuration class for Flash. It supports full remote code execution, allowing you to run arbitrary Python functions on Runpod's infrastructure.
+
+```python
+from runpod_flash import LiveServerless, GpuGroup, CpuInstanceType, PodTemplate
+
+gpu_config = LiveServerless(
+ name="ml-inference",
+ gpus=[GpuGroup.AMPERE_80],
+ workersMax=5,
+ idleTimeout=10,
+ template=PodTemplate(containerDiskInGb=100)
+)
+```
+
+### Parameters
+
+| Parameter | Type | Description | Default |
+|-----------|------|-------------|---------|
+| `name` | `string` | Name for your endpoint (required) | - |
+| `gpus` | `list[GpuGroup]` | GPU pool IDs that can be used by workers | `[GpuGroup.ANY]` |
+| `gpuCount` | `int` | Number of GPUs per worker | 1 |
+| `instanceIds` | `list[CpuInstanceType]` | CPU instance types (forces CPU endpoint) | `None` |
+| `workersMin` | `int` | Minimum number of workers | 0 |
+| `workersMax` | `int` | Maximum number of workers | 3 |
+| `idleTimeout` | `int` | Minutes before scaling down | 5 |
+| `env` | `dict` | Environment variables | `None` |
+| `networkVolumeId` | `string` | Persistent storage volume ID | `None` |
+| `executionTimeoutMs` | `int` | Max execution time in milliseconds | 0 (no limit) |
+| `scalerType` | `string` | Scaling strategy | `QUEUE_DELAY` |
+| `scalerValue` | `int` | Scaling parameter value | 4 |
+| `locations` | `string` | Preferred datacenter locations | `None` |
+| `template` | `PodTemplate` | Pod template overrides | `None` |
+
+### GPU configuration example
+
+```python
+from runpod_flash import LiveServerless, GpuGroup, PodTemplate
+
+config = LiveServerless(
+ name="gpu-inference",
+ gpus=[GpuGroup.AMPERE_80], # A100 80GB
+ gpuCount=1,
+ workersMin=0,
+ workersMax=5,
+ idleTimeout=10,
+ template=PodTemplate(containerDiskInGb=100),
+ env={"MODEL_ID": "llama-7b"}
+)
+```
+
+### CPU configuration example
+
+```python
+from runpod_flash import LiveServerless, CpuInstanceType
+
+config = LiveServerless(
+ name="cpu-processor",
+ instanceIds=[CpuInstanceType.CPU5C_4_8], # 4 vCPU, 8GB RAM
+ workersMax=3,
+ idleTimeout=5
+)
+```
+
+## ServerlessEndpoint
+
+`ServerlessEndpoint` is for GPU workloads that require custom Docker images. Unlike `LiveServerless`, it only supports dictionary payloads and cannot execute arbitrary Python functions.
+
+```python
+from runpod_flash import ServerlessEndpoint, GpuGroup
+
+config = ServerlessEndpoint(
+ name="custom-ml-env",
+ imageName="pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime",
+ gpus=[GpuGroup.AMPERE_80]
+)
+```
+
+### Parameters
+
+All parameters from `LiveServerless` are available, plus:
+
+| Parameter | Type | Description | Default |
+|-----------|------|-------------|---------|
+| `imageName` | `string` | Custom Docker image | - |
+
+### Limitations
+
+- Only supports dictionary payloads in the form of `{"input": {...}}`.
+- Cannot execute arbitrary Python functions remotely.
+- Requires a custom Docker image with a handler that processes the input dictionary.
+
+### Example
+
+```python
+from runpod_flash import ServerlessEndpoint, GpuGroup
+
+# Custom image with pre-installed models
+config = ServerlessEndpoint(
+ name="stable-diffusion",
+ imageName="my-registry/stable-diffusion:v1.0",
+ gpus=[GpuGroup.AMPERE_24],
+ workersMax=3
+)
+
+# Send requests as dictionaries
+result = await config.run({
+ "input": {
+ "prompt": "A beautiful sunset over mountains",
+ "width": 512,
+ "height": 512
+ }
+})
+```
+
+## CpuServerlessEndpoint
+
+`CpuServerlessEndpoint` is for CPU workloads that require custom Docker images. Like `ServerlessEndpoint`, it only supports dictionary payloads.
+
+```python
+from runpod_flash import CpuServerlessEndpoint, CpuInstanceType
+
+config = CpuServerlessEndpoint(
+ name="data-processor",
+ imageName="python:3.11-slim",
+ instanceIds=[CpuInstanceType.CPU5C_4_8]
+)
+```
+
+### Parameters
+
+| Parameter | Type | Description | Default |
+|-----------|------|-------------|---------|
+| `name` | `string` | Name for your endpoint (required) | - |
+| `imageName` | `string` | Custom Docker image | - |
+| `instanceIds` | `list[CpuInstanceType]` | CPU instance types | - |
+| `workersMin` | `int` | Minimum number of workers | 0 |
+| `workersMax` | `int` | Maximum number of workers | 3 |
+| `idleTimeout` | `int` | Minutes before scaling down | 5 |
+| `env` | `dict` | Environment variables | `None` |
+| `networkVolumeId` | `string` | Persistent storage volume ID | `None` |
+| `executionTimeoutMs` | `int` | Max execution time in milliseconds | 0 (no limit) |
+
+## Resource class comparison
+
+| Feature | LiveServerless | ServerlessEndpoint | CpuServerlessEndpoint |
+|---------|----------------|--------------------|-----------------------|
+| Remote code execution | ✅ Full Python function execution | ❌ Dictionary payload only | ❌ Dictionary payload only |
+| Custom Docker images | ❌ Fixed optimized images | ✅ Any Docker image | ✅ Any Docker image |
+| Use case | Dynamic remote functions | Traditional API endpoints | Traditional CPU endpoints |
+| Function returns | Any Python object | Dictionary only | Dictionary only |
+| `@remote` decorator | Full functionality | Limited to payload passing | Limited to payload passing |
+
+## Available GPU types
+
+The `GpuGroup` enum provides access to GPU pools. Some common options:
+
+| GpuGroup | Description | VRAM |
+|----------|-------------|------|
+| `GpuGroup.ANY` | Any available GPU (default) | Varies |
+| `GpuGroup.ADA_24` | RTX 4090 | 24GB |
+| `GpuGroup.AMPERE_24` | RTX A5000, L4, RTX 3090 | 24GB |
+| `GpuGroup.AMPERE_48` | A40, RTX A6000 | 48GB |
+| `GpuGroup.AMPERE_80` | A100 80GB | 80GB |
+
+See [GPU types](/references/gpu-types#gpu-pools) for the complete list of available GPU pools.
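+
+For example, listing several pools of the same memory class broadens availability (the endpoint name is illustrative):
+
+```python
+from runpod_flash import LiveServerless, GpuGroup
+
+# Accept any 24GB GPU by allowing both the Ada and Ampere 24GB pools
+config = LiveServerless(
+    name="any-24gb-worker",
+    gpus=[GpuGroup.ADA_24, GpuGroup.AMPERE_24],
+)
+```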
+
+## Available CPU instance types
+
+The `CpuInstanceType` enum provides access to CPU configurations:
+
+### 3rd generation general purpose
+
+| CpuInstanceType | ID | vCPU | RAM |
+|-----------------|-----|------|-----|
+| `CPU3G_1_4` | cpu3g-1-4 | 1 | 4GB |
+| `CPU3G_2_8` | cpu3g-2-8 | 2 | 8GB |
+| `CPU3G_4_16` | cpu3g-4-16 | 4 | 16GB |
+| `CPU3G_8_32` | cpu3g-8-32 | 8 | 32GB |
+
+### 3rd generation compute-optimized
+
+| CpuInstanceType | ID | vCPU | RAM |
+|-----------------|-----|------|-----|
+| `CPU3C_1_2` | cpu3c-1-2 | 1 | 2GB |
+| `CPU3C_2_4` | cpu3c-2-4 | 2 | 4GB |
+| `CPU3C_4_8` | cpu3c-4-8 | 4 | 8GB |
+| `CPU3C_8_16` | cpu3c-8-16 | 8 | 16GB |
+
+### 5th generation compute-optimized
+
+| CpuInstanceType | ID | vCPU | RAM |
+|-----------------|-----|------|-----|
+| `CPU5C_1_2` | cpu5c-1-2 | 1 | 2GB |
+| `CPU5C_2_4` | cpu5c-2-4 | 2 | 4GB |
+| `CPU5C_4_8` | cpu5c-4-8 | 4 | 8GB |
+| `CPU5C_8_16` | cpu5c-8-16 | 8 | 16GB |
+
+## PodTemplate
+
+Use `PodTemplate` to configure additional pod settings:
+
+```python
+from runpod_flash import LiveServerless, PodTemplate
+
+config = LiveServerless(
+ name="custom-template",
+ template=PodTemplate(
+ containerDiskInGb=100,
+ env=[{"key": "PYTHONPATH", "value": "/workspace"}]
+ )
+)
+```
+
+### Parameters
+
+| Parameter | Type | Description | Default |
+|-----------|------|-------------|---------|
+| `containerDiskInGb` | `int` | Container disk size in GB | 20 |
+| `env` | `list[dict]` | Environment variables as key-value pairs | `None` |
+
+## Environment variables
+
+Environment variables can be set in two ways:
+
+### Using the `env` parameter
+
+```python
+config = LiveServerless(
+ name="api-worker",
+ env={"HF_TOKEN": "your_token", "MODEL_ID": "gpt2"}
+)
+```
+
+### Using PodTemplate
+
+```python
+config = LiveServerless(
+ name="api-worker",
+ template=PodTemplate(
+ env=[
+ {"key": "HF_TOKEN", "value": "your_token"},
+ {"key": "MODEL_ID", "value": "gpt2"}
+ ]
+ )
+)
+```
+
+
+
+Environment variables are excluded from configuration hashing. Changing environment values won't trigger endpoint recreation, which allows different processes to load environment variables from `.env` files without causing false drift detection. Only structural changes (like GPU type, image, or template modifications) trigger endpoint updates.
+
+
+
+## Next steps
+
+- [Create remote functions](/flash/remote-functions) using these configurations.
+- [Deploy Flash applications](/flash/apps/deploy-apps) for production.
+- [Learn about pricing](/flash/pricing) to optimize costs.