A self-hosted API gateway that lets you talk to OpenAI, Anthropic Claude, and Google Gemini through a single endpoint. You get one API key, one billing account, and one place to see what everything costs.
Juggling three separate API keys, three dashboards, and three billing accounts gets old fast. This project proxies all three providers behind a unified /completions endpoint, tracks token usage per request, and handles billing through a simple prepaid credit system — so you always know what you're spending before you hit a limit.
- One endpoint for everything — send `{ model, messages, apiKey }` to `/completions`, get a streaming SSE response back regardless of which provider actually handles it
- Token counting and cost math — input and output tokens are measured per request, multiplied by the per-model rate from the database, and deducted from your credit balance atomically
- API key management — create multiple keys, track spending per key, toggle them on/off, soft-delete when done
- Credit top-ups — prepaid credits system with a transaction history so nothing surprises you
- React dashboard — full-featured UI to manage keys, check your balance, and see usage stats at a glance
| Area | Choice |
|---|---|
| Runtime | Bun 1.3.9 |
| Backend | Express.js + Zod validation |
| Frontend | React 19 + Tailwind CSS 4 + shadcn/ui |
| Database | PostgreSQL via Neon Serverless |
| ORM | Prisma 7 |
| Auth | JWT + bcrypt |
| Data fetching | TanStack Query 5 |
| Routing | React Router 7 |
| Monorepo | Turborepo |
| Deployment | Vercel (single serverless function + static frontend) |
The repo is a Turborepo monorepo with three apps that collapse into one Vercel deployment:
```
openrouter/
├── apps/
│   ├── backend/             # Auth, API keys, models, payments
│   ├── api-backend/         # LLM proxy — routes requests to OpenAI / Claude / Gemini
│   └── dashboard-frontend/  # React SPA
├── api/
│   └── [...route].ts        # Vercel entrypoint — mounts both Express apps at /api/*
└── packages/
    ├── db/                  # Prisma client + schema
    ├── ui/                  # Shared component primitives
    ├── typescript-config/
    └── eslint-config/
```
In development the three apps run on separate ports. In production they collapse into a single Vercel serverless function that mounts both Express apps under /api, with the React build served as static files from the same deployment.
When a request hits /completions:
- The API key is validated against the database (must exist, not deleted, not disabled)
- The model slug is resolved to a provider + per-token cost via `ModelProviderMapping`
- The request is forwarded to the right SDK — `Openai.chat()`, `Claude.chat()`, or `Gemini.chat()`
- Token usage comes back from the provider; the cost in credits is calculated
- A Prisma transaction atomically deducts credits, updates the key's usage counter, and writes the conversation record
- The response streams back as Server-Sent Events, character by character
If the user doesn't have enough credits, the request is rejected with a 402 before any provider API is called.
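The credit check and deduction described above can be sketched as pure logic. This is illustrative only — the field and function names are assumptions, and in the real code the update runs inside a Prisma transaction so the balance, the key's usage counter, and the conversation record change atomically:

```typescript
interface Account {
  credits: number;  // prepaid balance
  keyUsage: number; // per-key spend counter
}

// Returns the updated account, or null to signal a 402 (insufficient credits).
function deduct(account: Account, cost: number): Account | null {
  if (account.credits < cost) return null; // reject before calling any provider
  return {
    credits: account.credits - cost,
    keyUsage: account.keyUsage + cost,
  };
}

console.log(deduct({ credits: 10, keyUsage: 0 }, 3)); // { credits: 7, keyUsage: 3 }
console.log(deduct({ credits: 1, keyUsage: 0 }, 3));  // null → HTTP 402
```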
```
User ──── ApiKey ──── Conversation
 │                        │
 │              ModelProviderMapping ──── Model ──── Company
 │                        │
 └── OnrampTransaction    └──── Provider
```
Everything billing-related traces back to a ModelProviderMapping row that holds the per-million-token input and output costs for each model/provider combination. That's what makes it easy to add new models without touching code.
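Since rates are stored per million tokens, the per-request cost math reduces to a small function. A minimal sketch — the field names on the mapping are assumptions, not the actual Prisma schema:

```typescript
// Assumed shape of the rate columns on a ModelProviderMapping row.
interface RateMapping {
  inputCostPerMillion: number;  // credits per 1M input tokens
  outputCostPerMillion: number; // credits per 1M output tokens
}

function costInCredits(
  rates: RateMapping,
  inputTokens: number,
  outputTokens: number
): number {
  return (
    (inputTokens / 1_000_000) * rates.inputCostPerMillion +
    (outputTokens / 1_000_000) * rates.outputCostPerMillion
  );
}

// e.g. 2M input + 1M output tokens at 5 / 15 credits per million
console.log(costInCredits({ inputCostPerMillion: 5, outputCostPerMillion: 15 }, 2_000_000, 1_000_000)); // 25
```

Adding a new model is then just inserting a row with its rates — no code changes.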
All routes are mounted under /api in production.
Auth & management (/api/auth/*, /api/api-keys, etc.)
| Route | Method | Description |
|---|---|---|
| `/auth/sign-up` | POST | Register |
| `/auth/sign-in` | POST | Login, returns JWT cookie |
| `/auth/profile` | GET | Current user + credit balance |
| `/api-keys` | POST | Create a key |
| `/api-keys` | GET | List all your keys |
| `/api-keys` | PUT | Enable / disable a key |
| `/api-keys/:id` | DELETE | Soft-delete a key |
| `/models` | GET | All available models |
| `/payments/onramp` | POST | Add 1,000 credits |
Gateway
| Route | Method | Description |
|---|---|---|
| `/completions` | POST | Stream a chat completion via SSE |
The completions endpoint expects:
```json
{
  "model": "gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "apiKey": "your-openrouter-key"
}
```

And streams back `data: {"content": "..."}` events, ending with `data: [DONE]`.
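A minimal sketch of consuming that stream on the client side, assuming each event line is a `data:`-prefixed JSON payload and the `[DONE]` sentinel closes the stream (function name is illustrative):

```typescript
// Concatenate the `content` fields from an SSE event stream.
function collectSseContent(raw: string): string {
  let text = "";
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blank lines between events
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;          // end-of-stream sentinel
    text += (JSON.parse(payload) as { content: string }).content;
  }
  return text;
}

console.log(collectSseContent(
  'data: {"content":"Hel"}\n\ndata: {"content":"lo"}\n\ndata: [DONE]\n'
)); // prints "Hello"
```

In a real client you would feed chunks from a `fetch` response body into the same parsing loop instead of a complete string.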
Requirements: Bun 1.3.9+, a Neon (or any PostgreSQL) database
```bash
# Install
bun install

# Copy and fill in env vars
cp packages/db/.env.example .env

# Push the schema and generate the Prisma client
bun run db:generate
```

Environment variables:
```bash
DATABASE_URL=postgresql://...
JWT_SECRET=something-long-and-random
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
```

Run locally:
```bash
bun run dev                               # all three apps at once
bun run dev --filter=backend              # port 3000
bun run dev --filter=api-backend          # port 3001
bun run dev --filter=dashboard-frontend   # port 9001
```

Deploy to Vercel:
Import the repo, add the env vars, and push. The included vercel.json already handles the build command, output directory, and SPA rewrite rules. Or from the CLI:
```bash
vercel
```

```bash
bun run build         # build everything
bun run dev           # run everything in watch mode
bun run lint          # lint all packages
bun run check-types   # tsc across the monorepo
bun run format        # prettier
bun run db:generate   # regenerate the Prisma client after schema changes
```

MIT
