
Add RunPod to Vast.ai migration guide#86

Open
wbrennan899 wants to merge 3 commits into `main` from `examples/migrate-from-runpod`

Conversation


@wbrennan899 wbrennan899 commented Mar 19, 2026

Summary

Adds a comprehensive migration guide for users moving GPU workloads from RunPod to Vast.ai. The guide covers the full surface area: account setup, instance management (Pods → Instances), serverless migration, networking, and API/CLI reference.

What's included

  • Concept mapping table — RunPod terms to Vast equivalents (Pods, Templates, Volumes, Serverless, etc.)
  • Account setup — API key, CLI install, SSH key, billing
  • Migrating from Pods — Finding GPUs, Docker images, environment variables, startup scripts, port mapping, storage, and lifecycle management with side-by-side RunPod vs Vast code examples
  • Migrating from Serverless — Pre-built templates (vLLM, TGI, ComfyUI), endpoint/workergroup creation, and SDK client code. Leads with the simple template-based path rather than custom PyWorker code
  • API and CLI reference — Mapping of common RunPod CLI/API calls to Vast equivalents
  • SEO structured data — HowTo schema markup for search engines
  • Navigation entry — Added to docs.json under Examples → Migrations
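The account-setup flow the guide describes can be sketched end to end with the vastai CLI. A minimal sketch, assuming the CLI is installed from PyPI; the search query fields and ordering flag are illustrative, so verify against `vastai search offers --help` on your version:

```shell
# Install the Vast.ai CLI (published on PyPI as "vastai")
pip install vastai

# Store the API key from the console so later commands authenticate
vastai set api-key "$VAST_API_KEY"

# Browse marketplace offers; the query string and -o ordering flag
# are illustrative here -- check your CLI version for exact syntax
vastai search offers 'gpu_name=RTX_4090 reliability>0.98' -o 'dph'
```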

Revisions (post-review)

  • De-emphasized Vast REST API throughout — Removed all 12 Vast API curl blocks from tutorial sections; users are nudged toward the vastai CLI and Python SDK instead. CodeGroups with only one remaining block were unwrapped. A note at the API reference table directs API users to the reference docs.
  • Tone made more favorable to Vast — Several comparison passages that inadvertently highlighted RunPod advantages were rewritten:
    • Bandwidth: removed "RunPod includes free bandwidth" framing; rewritten to emphasize Vast's transparent per-use billing vs. competitors that bundle bandwidth costs into inflated GPU rates
    • Volumes: removed "unlike RunPod Network Volumes, they can't move between hosts"; reframed around object storage as a more portable, provider-agnostic solution
    • Host variability: removed "home connection with occasional downtime" implication; reframed around the data Vast surfaces (reliability scores, network speeds) to make an informed choice
    • Opening paragraph: "you own more of the stack" → "you have more control…rather than accepting an opaque allocation"
    • Jupyter section: removed "RunPod pre-configures JupyterLab on official templates" framing; focused on what Vast offers
  • Docker compatibility language softened — "will likely work as-is" / "Most Runpod-compatible images" → "may work as-is" / "Many Runpod-compatible images" to avoid overpromising
  • Concept mapping table reordered — "Community Cloud / Secure Cloud" row moved up to sit immediately after "Serverless Endpoint → Worker"
  • Stale notes removed — Removed the MB-vs-GB API note and "REST API uses a JSON object instead" sentence that were only relevant alongside the now-removed API blocks

Two-track guide (Pods and Serverless) with three-tab code examples
showing RunPod API, Vast CLI, and Vast REST API side-by-side. Covers
instance creation, Docker config, networking, storage, logs, lifecycle
management, PyWorker migration, and a full API/CLI reference table.
The serverless migration section previously led with a PyWorker vs RunPod
handler code comparison, making it look like writing custom PyWorker code
was required. Now it leads with pre-built templates (vLLM, TGI, ComfyUI),
adds a "Calling Your Endpoint" section with SDK client code from the
official quickstart, and links to PyWorker docs for advanced users only.
Comment on lines +748 to +771
```bash Vast API
# Create an endpoint
curl -X POST "https://console.vast.ai/api/v0/endptjobs/" \
  -H "Authorization: Bearer $VAST_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "endpoint_name": "my-llm-endpoint",
    "max_workers": 5,
    "cold_workers": 1,
    "target_util": 0.9
  }'

# Create a workergroup
curl -X POST "https://console.vast.ai/api/v0/workergroups/" \
  -H "Authorization: Bearer $VAST_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "endpoint_name": "my-llm-endpoint",
    "template_hash": "<TEMPLATE_HASH>",
    "gpu_ram": 24
  }'
```

</CodeGroup>
Contributor

We should not be including the API in guides.


## Migrating from Serverless

RunPod **Serverless** lets you deploy a handler function that scales to zero — you send a request, RunPod spins up a worker, runs your handler, and tears it down. You pay per second of compute, not for idle GPUs.
Contributor

Why are we advertising why RunPod is better?


RunPod **Serverless** lets you deploy a handler function that scales to zero — you send a request, RunPod spins up a worker, runs your handler, and tears it down. You pay per second of compute, not for idle GPUs.

Vast **Serverless** serves the same purpose — autoscaling inference without managing instances — but the architecture is different. Instead of wrapping a handler function, you pick a pre-built template (vLLM, TGI, ComfyUI) and Vast runs it behind a managed proxy that handles routing, queueing, and autoscaling. For most migrations, no custom code is needed.
Contributor

The serverless templates are just that -- templates. The Vast Serverless platform is by no means fundamentally built around them; they mostly serve as examples. We expect people to implement their own templates and API wrappers and to configure their own endpoints. Yes, you can use the templates out of the box, but that is more akin to using RunPod's pre-built templates. You are absolutely implementing handlers for Vast Serverless, just in a different way than on RunPod.

- Replace cloud.vast.ai/search/ → cloud.vast.ai/create/ (3 occurrences)
- Replace "25–50% discount/savings" → "up to 50%" to match official docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@guthrie-vast
Collaborator

@wbrennan899 @LucasArmandVast
I made some updates to the PR, focused on de-emphasizing the Vast REST API, improving tone, and smaller fixes (wording, table reorder, stale notes).

I will get more eyeballs on this

@robballantyne robballantyne self-requested a review March 20, 2026 18:01
1. **Your existing Runpod images will likely work as-is** — Most Runpod-compatible Docker images run on Vast with minimal or no modification.
2. **Often cheaper for the same GPU** — Marketplace competition drives prices down. You'll frequently find the same hardware at lower rates than fixed-tier providers.
3. **You pick the individual machine, not just the GPU type** — Every offer shows reliability score, network speed, CPU, location, and other critical specs. Two A100s at the same price can be very different machines. Vast gives you the data to choose the right one.
4. **Bandwidth is metered** — Runpod includes free bandwidth; on Vast, egress is charged per GB at a rate shown on each offer (typically much lower than AWS).
5. **Set your disk size right at launch** — Resizing requires recreating the container. Storage is cheap — err on the side of more space.
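A hedged launch sketch tying points 1, 4, and 5 together: reuse the image you already ran on Runpod and size the disk generously at creation time. `<OFFER_ID>` and the image name are placeholders, and the flags follow the vastai CLI as we understand it, so verify with `vastai create instance --help`:

```shell
# Pick an offer ID from `vastai search offers`, then launch with the
# same Docker image you used on Runpod. Disk size is fixed at launch;
# resizing later means recreating the container, so over-provision.
# Note that data transfer is billed per GB.
vastai create instance <OFFER_ID> \
  --image your-dockerhub-user/your-runpod-image:latest \
  --disk 100 \
  --ssh
```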
Contributor

Vast charges for both inbound and outbound data transfers. Users will often pull large models, so we should be very transparent here.

| RunPod | Vast | Notes |
| --- | --- | --- |
| Pod | Instance | Docker container with exclusive GPU access |
| Serverless Endpoint -> Worker | Serverless Endpoint -> Workergroup -> Worker | Vast has managed autoscaling inference — see [Migrating from Serverless](#migrating-from-serverless) |
| Template | Template / Docker image | Specify a Docker image and configuration at launch |
| Network Volume | (Local) Volume | Vast volumes are currently local to one machine, not network-portable — see [Storage](#storage) |
Contributor

Our volumes are more closely related to RP volume-disk. The benefit is that they can attach to any one of several GPUs in a single node.

| Serverless Endpoint -> Worker | Serverless Endpoint -> Workergroup -> Worker | Vast has managed autoscaling inference — see [Migrating from Serverless](#migrating-from-serverless) |
| Template | Template / Docker image | Specify a Docker image and configuration at launch |
| Network Volume | (Local) Volume | Vast volumes are currently local to one machine, not network-portable — see [Storage](#storage) |
| Hub | [Model Library](/documentation/serverless/getting-started-with-serverless) + [Template Library](/documentation/templates/introduction) | Vast has official templates for specific models in addition to base templates for lower-level control |
Contributor

Official templates for many popular inference engines & applications along with specific model configs through the model library - Or similar?

If you have a working Runpod template, you likely already have a Docker image that works on Vast. Most Runpod-compatible images run as-is — just specify the image in the `--image` flag.

To minimize cold start times:
- Use **Vast base images** which are pre-cached on many hosts
Contributor

Needs further explanation and link to the base-image github/dockerhub repos. Docs there should help users
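For reference, the base images live in the vast-ai/base-image GitHub repo and are published on Docker Hub; the image tag below is an assumption for illustration, so check the repo for the tags actually available:

```shell
# Launch on a Vast base image, which many hosts pre-cache to cut
# cold start times. The tag is illustrative; see the
# vast-ai/base-image repo for the tags actually published.
vastai create instance <OFFER_ID> \
  --image vastai/base-image:cuda-12.1-auto \
  --disk 60 \
  --ssh
```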


Both platforms provide proxy access to services. On Runpod, proxy URLs are static: `https://<POD_ID>-<PORT>.proxy.runpod.net`. On Vast, there are two proxy mechanisms:

- **HTTP/HTTPS proxy** — instances using [Vast base images](https://github.com/vast-ai/base-image/) get auto-generated Cloudflare tunnel URLs (`https://four-random-words.trycloudflare.com`) per open port via the [Instance Portal](/documentation/instances/connect/instance-portal).
Contributor

Cloudflare tunnels are best effort and may not always be available. Users can configure their instances to use the built-in Jupyter cert to ensure TLS.
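When a tunnel URL is unavailable, plain SSH port forwarding works against any instance. A sketch, where the host and port placeholders come from the instance card in the console:

```shell
# Forward local port 8080 to a service listening on port 8080 inside
# the instance, then browse to http://localhost:8080
ssh -p <SSH_PORT> root@<SSH_HOST> -L 8080:localhost:8080
```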

Comment on lines +782 to +786
Runpod **Serverless** lets you deploy a handler function that scales to zero — you send a request, Runpod spins up a worker, runs your handler, and tears it down. You pay per second of compute, not for idle GPUs.

Vast **Serverless** delivers autoscaling inference at marketplace rates — no usage tiers, no hidden surcharges, just per-second billing across 68+ GPU types globally. Rather than wrapping a handler function, you select a pre-built template (vLLM, TGI, ComfyUI) and Vast handles routing, queueing, and autoscaling automatically.

**Pricing:** Runpod charges a premium for Serverless GPU time on top of the base instance cost. On Vast, Serverless workers run on the same marketplace instances you'd rent directly — you pay the same rate, just with autoscaling on top.
Contributor

Just want to point out that having "—" everywhere makes it very obvious that this text was LLM generated. Even if it wasn't, people will assume that it was. That's not necessarily a problem, but feels a bit unprofessional to me. Just my opinion though.
