An open-source AI Gateway built for stability, extensibility, and operability.
Multi-provider / multi-model access with first-class observability, dynamic configuration, and graceful operations.
TiyGate is an independent AI Gateway product written in Rust. It sits between your applications and upstream LLM providers (OpenAI, Anthropic, Bedrock, and any OpenAI-compatible service) and gives you a single, stable control point for routing, observability, and policy.
The two things it does best:
- Multi-backend / multi-model access: one canonical entry, many providers. Cross-protocol translation (e.g. OpenAI `chat_completions` → Anthropic `messages`) is a first-class capability, not a hack.
- Logs and analytics: every request is captured, structured, and routed to a hot-path-safe async pipeline. No blocking the request path. No silent drops.
Most gateways optimize for one dimension. TiyGate is engineered to hold three at once.
| Quality goal | What carries it |
|---|---|
| Stability | Per-instance circuit breaker + fine-grained FallbackPolicy (error classification, retry vs. failover separated, global attempt/time budget, idempotency gate), respect for upstream Retry-After, ingress body/slow-read/concurrency limits, SIGTERM graceful drain, telemetry off the hot path |
| Extensibility | Trait + inventory decentralized registration (adding a provider = new file + one submit!); hook pipeline; Executor escape hatch for SDK-style providers; three-segment protocol identity; pluggable strategies, cache, and log sinks |
| Maintainability | core has zero dependencies on concrete providers/protocols/DB; canonical IR collapses N×N protocol translation to N; field-level capability matrix makes lossiness explicit; heavy dependencies isolated in dedicated crates |
The field-level lossiness matrix used by lossy_default_reject lives in docs/protocol-capability-matrix.md.
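To make the per-instance circuit breaker from the Stability row more concrete, here is a minimal sketch of the usual closed / open / half-open state machine. It is not TiyGate's implementation: `CircuitBreaker`, `BreakerState`, the consecutive-failure threshold, and the millisecond clock passed in by the caller are all assumptions made for illustration, and the half-open state here admits every request rather than a single probe.

```rust
/// Hypothetical breaker states; names are illustrative, not TiyGate's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BreakerState {
    Closed,
    Open { until_ms: u64 },
    HalfOpen,
}

#[derive(Debug)]
pub struct CircuitBreaker {
    state: BreakerState,
    consecutive_failures: u32,
    failure_threshold: u32,
    cooldown_ms: u64,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown_ms: u64) -> Self {
        Self { state: BreakerState::Closed, consecutive_failures: 0, failure_threshold, cooldown_ms }
    }

    /// Returns true if a request may be sent to this instance at `now_ms`.
    pub fn allow(&mut self, now_ms: u64) -> bool {
        match self.state {
            BreakerState::Closed | BreakerState::HalfOpen => true,
            // Cooldown elapsed: let traffic probe the instance again.
            BreakerState::Open { until_ms } if now_ms >= until_ms => {
                self.state = BreakerState::HalfOpen;
                true
            }
            BreakerState::Open { .. } => false,
        }
    }

    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = BreakerState::Closed;
    }

    pub fn record_failure(&mut self, now_ms: u64) {
        self.consecutive_failures += 1;
        // A failure while half-open re-opens immediately; otherwise wait for the threshold.
        let trip = self.state == BreakerState::HalfOpen
            || self.consecutive_failures >= self.failure_threshold;
        if trip {
            self.state = BreakerState::Open { until_ms: now_ms + self.cooldown_ms };
        }
    }

    pub fn state(&self) -> BreakerState {
        self.state
    }
}
```

Passing the clock in as a plain number keeps the sketch deterministic and easy to test; a real breaker would more likely read a monotonic clock itself.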
tiygate/
├── crates/
│ ├── core/ # Canonical IR, traits, pipeline. Zero I/O, zero concrete deps.
│ ├── protocols/ # Protocol codecs (chat_completions, messages, responses, gemini, embeddings)
│ ├── providers/ # Built-in provider metadata + auth
│ ├── provider-bedrock/ # SDK-shape provider (Executor escape hatch), heavy deps isolated
│ ├── store/ # Config OLTP (SQLite/Postgres) + pluggable log sinks
│ ├── cache/ # Embedding cache (deterministic, LLM chat/completion are NOT cached)
│ ├── admin/ # Admin REST API + OAuth flows
│ └── server/ # Ingress, data/control plane assembly, deployment modes
├── webui/ # Embedded admin console (React + TS + Vite, served at /admin/ui)
├── docs/ # Architecture design + protocol capability matrix
└── scripts/ # Operational scripts
- Rust 1.88+ (`rustup update stable`)
- Node.js 20+ (for building the embedded WebUI)
- An upstream provider key, e.g. `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`
git clone https://github.com/tiylabs/tiygate.git
cd tiygate

Configure environment variables by copying the template, then fill in the required values:

cp .env.example .env

Edit `.env`. These are the two variables you must set for a working WebUI:

# SQLite is the easiest local backend (file is created on first run)
TIYGATE_DATABASE_URL=sqlite://./tiygate.db?mode=rwc
# Admin API token — the WebUI login screen asks for this exact value
TIYGATE_ADMIN_TOKEN=dev-admin-token-change-me

See `.env.example` for the full list of knobs (listen address, mode, logging, streaming timeouts, etc.). The server loads `.env` automatically at startup when the `dotenv` feature is on.

Start the gateway with the embedded WebUI:

make dev

`make dev` builds the frontend first (so `rust-embed` can embed it), then runs the server with the `webui` feature. The default listen address is `0.0.0.0:3000`.

Once the server is running, open http://localhost:3000/admin/ui in your browser. Paste your TIYGATE_ADMIN_TOKEN on the login screen to enter the console. From there you can manage providers, routes, API keys, and view analytics.
curl -sS http://localhost:3000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Say hi in one short sentence."}]
}'

For streaming, add `"stream": true`. The server speaks Server-Sent Events end-to-end.

The same gateway will accept chat_completions and translate it to messages (Anthropic) when you route to that provider — the field-level capability matrix decides what's lossless and rejects combinations that aren't.
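One way to picture the reject-on-lossy behaviour is a check that compares the fields a request actually uses against the fields the target protocol can represent. The sketch below is only an illustration: `check_translation` is a hypothetical helper, and the field lists in the usage note are made-up examples, not entries from docs/protocol-capability-matrix.md.

```rust
/// Hypothetical capability check: succeeds only when every field used by the
/// request can be represented in the target protocol.
/// On failure, returns the fields that would be lost in translation.
pub fn check_translation<'a>(
    used: &[&'a str],
    supported_by_target: &[&str],
) -> Result<(), Vec<&'a str>> {
    let lossy: Vec<&'a str> = used
        .iter()
        .copied()
        .filter(|field| !supported_by_target.contains(field))
        .collect();
    if lossy.is_empty() {
        Ok(())
    } else {
        // Reject by default instead of silently dropping the fields.
        Err(lossy)
    }
}
```

With an assumed target that supports `model`, `messages`, `max_tokens`, and `temperature`, a request using only `model` and `messages` passes, while one that also sets `logit_bias` is rejected with that field named in the error.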
The tiygate binary supports three modes (selected via --mode / env / config):
| Mode | What it runs | When to use |
|---|---|---|
| `all` | Data plane + control plane + DB in one process | Local dev, single-node, small teams |
| `proxy` | Data plane only (stateless, horizontally scalable) | Production data plane |
| `admin` | Control plane only (Admin API + WebUI) | Production control plane |

Health probes are wired by default:
- `GET /healthz`: liveness, returns 200 even while draining (so you don't get killed mid-roll)
- `GET /readyz`: readiness, returns 503 once the pod enters draining (so the load balancer stops sending traffic)

In `all` / `admin` modes the binary serves an embedded React console at `/admin/ui` (e.g. http://localhost:3000/admin/ui with the default listen address). It covers the full control plane (providers, routes, API keys with one-time secret + quota editing and live usage, and the OAuth authorization-code flow) plus analytics: per-model / provider / API-key stats, circuit-breaker status, request-log drill-down with replay, and the audit trail. It is bilingual (English / 简体中文).
Authentication reuses the single TIYGATE_ADMIN_TOKEN: paste it on the login screen (validated against the Admin API, stored in the browser). The UI is compiled into the binary via rust-embed (the opt-in webui feature), so the frontend must be built before the Rust crate — run scripts/build-with-webui.sh, or cd webui && npm install && npm run build followed by cargo build -p tiygate-server --features webui. See webui/README.md for development details.
Send SIGTERM (or K8s preStop) and the gateway:
- Flips `/readyz` to `503` so the load balancer removes it from the pool
- Refuses new requests with `503 + Retry-After`
- Lets in-flight requests (including long SSE streams) finish naturally
- On `drain_timeout` (default 30s, must be ≥ single-request `deadline`), sends a protocol-native error frame to any still-open streams and runs `UsageAccumulator` to prevent billing drift. The streaming path is implemented in `crates/server/src/ingress.rs::drive_upstream_stream`. It also adds a 120s idle timer (configurable via `TIYGATE_UPSTREAM_STREAM_IDLE_TIMEOUT_SECS`), an opt-in total wall-clock budget (`TIYGATE_UPSTREAM_STREAM_TOTAL_TIMEOUT_SECS`, default disabled), and a 30s SSE keepalive (`SseKeepaliveStream`) so middleboxes do not silently drop long-quiet streams
- Flushes the telemetry channel, releases resources, exits

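The first steps of the drain sequence, together with the probe semantics, can be sketched with two atomics. This is a simplified model under stated assumptions, not the gateway's code: `DrainState` and its methods are hypothetical names, status codes are returned as plain `u16` values, and the timeout and stream handling are left out.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical drain state shared by the probes and the request path.
#[derive(Default)]
pub struct DrainState {
    draining: AtomicBool,
    inflight: AtomicUsize,
}

impl DrainState {
    /// Liveness stays 200 even while draining.
    pub fn healthz_status(&self) -> u16 {
        200
    }

    /// Readiness flips to 503 once draining has begun.
    pub fn readyz_status(&self) -> u16 {
        if self.draining.load(Ordering::SeqCst) { 503 } else { 200 }
    }

    /// Would be called from the SIGTERM handler.
    pub fn begin_drain(&self) {
        self.draining.store(true, Ordering::SeqCst);
    }

    /// Admit a new request, or return the status code to reject it with.
    pub fn try_admit(&self) -> Result<(), u16> {
        // Count first, then check the flag, so a request racing with
        // begin_drain is either counted as in-flight or rejected.
        self.inflight.fetch_add(1, Ordering::SeqCst);
        if self.draining.load(Ordering::SeqCst) {
            self.inflight.fetch_sub(1, Ordering::SeqCst);
            return Err(503);
        }
        Ok(())
    }

    /// Mark one admitted request as finished.
    pub fn finish(&self) {
        self.inflight.fetch_sub(1, Ordering::SeqCst);
    }

    /// Number of requests still running; the drain loop would wait for zero.
    pub fn in_flight(&self) -> usize {
        self.inflight.load(Ordering::SeqCst)
    }
}
```

A drain loop built on this would call `begin_drain`, then wait until `in_flight` reaches zero or the drain timeout fires.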
All TiyGate knobs are read from environment variables. Unknown keys are ignored. The gateway also loads .env from the working directory at startup (when the dotenv feature is on).
| Variable | Default | Purpose |
|---|---|---|
| `TIYGATE_LISTEN_ADDR` | `0.0.0.0:3000` | Listen address for the HTTP server. |
| `TIYGATE_MODE` | `all` | Deployment mode. `all` (data + control in one process), `proxy` (data plane only), `admin` (control plane only). |
| `TIYGATE_MAX_BODY_BYTES` | `10485760` (10 MiB) | Per-request body size limit for plain text / JSON. |
| `TIYGATE_MAX_INFLIGHT` | `1024` | Maximum concurrent in-flight requests. Beyond this, additional requests queue and are eventually rejected with 503 + Retry-After. |
| `TIYGATE_ROUTING_STRATEGY` | `weighted` | Routing strategy across targets. `weighted` (default), `priority`, `cooldown`, `latency`. |

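For intuition about the `weighted` strategy, the sketch below picks a target in proportion to its weight. It is an assumption about how weighted selection is commonly done, not a description of TiyGate's router: `pick_weighted` is a hypothetical function, and the caller supplies the random `roll` so the example stays deterministic.

```rust
/// Hypothetical weighted pick. `roll` can be any number (e.g. from an RNG);
/// it is reduced modulo the total weight before walking the targets.
pub fn pick_weighted<'a>(targets: &[(&'a str, u32)], roll: u32) -> Option<&'a str> {
    let total: u32 = targets.iter().map(|(_, weight)| *weight).sum();
    if total == 0 {
        return None; // no targets, or all weights are zero
    }
    let mut remaining = roll % total;
    for (name, weight) in targets {
        if remaining < *weight {
            return Some(*name);
        }
        remaining -= *weight;
    }
    None
}
```

With weights 3 and 1, three out of every four consecutive `roll` values land on the first target.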
| Variable | Default | Purpose |
|---|---|---|
| `TIYGATE_UPSTREAM_STREAM_IDLE_TIMEOUT_SECS` | `120` | Idle window for upstream streaming responses. If no chunk arrives for this long, the stream is closed with a protocol-native end frame. |
| `TIYGATE_UPSTREAM_STREAM_TOTAL_TIMEOUT_SECS` | `0` (disabled) | Wall-clock budget for upstream streaming responses. When it elapses, the stream is closed with a protocol-native error frame. Set to 0 to opt out. |

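The idle-timeout behaviour can be modelled with a plain channel: keep relaying chunks, and stop when the upstream stays silent for longer than the idle window. This is a synchronous stand-in using `std::sync::mpsc`, written only to illustrate the idea; the real streaming path is presumably async, and `relay_with_idle_timeout` and `StreamEnd` are hypothetical names.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why the relay loop stopped.
#[derive(Debug, PartialEq, Eq)]
pub enum StreamEnd {
    UpstreamDone,
    IdleTimeout,
}

/// Forward chunks into `out` until the upstream closes the channel or no
/// chunk arrives within `idle`. The idle timer restarts after every chunk.
pub fn relay_with_idle_timeout(
    rx: &Receiver<String>,
    idle: Duration,
    out: &mut Vec<String>,
) -> StreamEnd {
    loop {
        match rx.recv_timeout(idle) {
            Ok(chunk) => out.push(chunk),
            Err(RecvTimeoutError::Disconnected) => return StreamEnd::UpstreamDone,
            // A real gateway would emit a protocol-native end frame here.
            Err(RecvTimeoutError::Timeout) => return StreamEnd::IdleTimeout,
        }
    }
}
```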
| Variable | Default | Purpose |
|---|---|---|
| `TIYGATE_ADMIN_TOKEN` | unset | Bearer token required by the Admin API. When unset, Admin API requests are rejected. |
| `TIYGATE_MASTER_KEY` | unset | Master key used to AES-GCM-encrypt provider keys/tokens at rest. Planned for the DB-backed phase; the in-memory config store does not yet read it. Treat unset as "not encrypted" today. |

| Variable | Default | Purpose |
|---|---|---|
| `RUST_LOG` | `info` | `tracing` / `tracing-subscriber` filter. Examples: `info`, `tiygate=debug`, `tiygate_server::ingress=trace`. |

- DB-driven config (OLTP): provider / route / API key CRUD via Admin API, no restart required
- Epoch versioning: data plane polls for config changes, atomically switches to the new snapshot; in-flight requests keep the old epoch until they finish — no half-old, half-new state mid-request
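- Secret encryption: provider keys/tokens are AES-GCM encrypted at rest, with the master key read from `TIYGATE_MASTER_KEY`. Per the variable reference, the in-memory config store does not yet read this key, so treat secrets as unencrypted until the DB-backed phase lands.
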
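The epoch-versioning bullet describes a snapshot swap: a request grabs the current configuration once and keeps it, while the poller publishes newer epochs behind it. A minimal sketch with the standard library follows. `ConfigSnapshot`, `ConfigHandle`, and the `routes` field are hypothetical, and the real data plane may well use a different primitive than `RwLock<Arc<_>>`.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical immutable configuration snapshot.
#[derive(Debug, PartialEq)]
pub struct ConfigSnapshot {
    pub epoch: u64,
    pub routes: Vec<String>,
}

pub struct ConfigHandle {
    current: RwLock<Arc<ConfigSnapshot>>,
}

impl ConfigHandle {
    pub fn new(initial: ConfigSnapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// A request calls this once and holds the returned Arc until it finishes,
    /// so it never observes a mix of old and new configuration.
    pub fn load(&self) -> Arc<ConfigSnapshot> {
        // Recover the guard from a poisoned lock instead of panicking.
        let guard = self.current.read().unwrap_or_else(|e| e.into_inner());
        Arc::clone(&*guard)
    }

    /// Swap in a newer snapshot. Returns false if `next` is not newer.
    pub fn publish(&self, next: ConfigSnapshot) -> bool {
        let mut guard = self.current.write().unwrap_or_else(|e| e.into_inner());
        if next.epoch <= guard.epoch {
            return false;
        }
        *guard = Arc::new(next);
        true
    }
}
```

The old snapshot is freed automatically once the last in-flight request drops its `Arc`.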
Only embedding requests are cached. LLM chat/completion is not cached — by design (non-determinism makes response caching value-low and risk-high). The cache is pluggable: process-local LRU by default, Redis shared backend for multi-replica deployments.
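As a rough picture of a process-local LRU for embeddings, the sketch below keys entries by model and input text and evicts the least recently used entry when full. Everything here is an assumption for illustration: `EmbeddingLru`, the `(model, input)` key, and the `Vec<f32>` value are not taken from the `cache` crate, and the O(n) recency update is chosen for brevity, not performance.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical process-local LRU keyed by (model, input text).
pub struct EmbeddingLru {
    capacity: usize,
    entries: HashMap<(String, String), Vec<f32>>,
    order: VecDeque<(String, String)>, // front = least recently used
}

impl EmbeddingLru {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Move `key` to the most-recently-used end of the queue.
    fn touch(&mut self, key: &(String, String)) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            if let Some(k) = self.order.remove(pos) {
                self.order.push_back(k);
            }
        }
    }

    pub fn get(&mut self, model: &str, input: &str) -> Option<Vec<f32>> {
        let key = (model.to_string(), input.to_string());
        let hit = self.entries.get(&key).cloned();
        if hit.is_some() {
            self.touch(&key);
        }
        hit
    }

    pub fn put(&mut self, model: &str, input: &str, vector: Vec<f32>) {
        if self.capacity == 0 {
            return;
        }
        let key = (model.to_string(), input.to_string());
        if self.entries.insert(key.clone(), vector).is_some() {
            // Existing key: value replaced, just refresh its recency.
            self.touch(&key);
            return;
        }
        self.order.push_back(key);
        if self.order.len() > self.capacity {
            if let Some(evicted) = self.order.pop_front() {
                self.entries.remove(&evicted);
            }
        }
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Caching is safe for embeddings because the same model and input are expected to produce the same vector, which is the determinism argument the paragraph above makes.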
When enabled, a background worker gzip-compresses the full request/response payload detail of each request (8 objects per request — raw body + parsed metadata for each of the 4 hops: client→gateway, gateway→provider, provider→gateway, gateway→client), uploads them to S3-compatible object storage, verifies sha256/size, and then clears the payload text from the database in the same transaction. This keeps the DB lean for high-volume deployments while preserving full replay fidelity.
The Admin Console's request replay feature transparently hydrates archived objects back from S3 on demand (verify → decompress → return), so the user experience is unchanged whether a request's payloads live in the DB or in object storage.
Object lifecycle is decoupled from DB retention — the worker never deletes from S3; use bucket lifecycle policies for expiry.
Set TIYGATE_PAYLOAD_ARCHIVE_ENABLED=true and configure the S3 endpoint / credentials in .env. Full variable list in .env.example (TIYGATE_PAYLOAD_ARCHIVE_*).
W3C `traceparent` / `tracestate` are extracted from the inbound request and re-injected on the upstream call. The gateway span joins the caller's trace as a child of the inbound span, and the upstream call carries the gateway span as its parent. Logs and traces are cross-linkable by `trace_id`.
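The propagation step can be illustrated by parsing a version-00 `traceparent` header and building the header for the upstream call, which keeps the trace id and swaps in the gateway's own span id. This is a sketch under assumptions: `TraceParent`, `parse_traceparent`, and `child_header` are hypothetical helpers, only version `00` is accepted, and `tracestate` is not handled.

```rust
/// Fields of a version-00 W3C traceparent header.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: String,  // 32 lowercase hex chars
    pub parent_id: String, // 16 lowercase hex chars
    pub flags: String,     // 2 lowercase hex chars
}

fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parse `00-<trace-id>-<parent-id>-<flags>`; returns None for anything else.
pub fn parse_traceparent(header: &str) -> Option<TraceParent> {
    let mut parts = header.trim().split('-');
    let (version, trace_id, parent_id, flags) =
        (parts.next()?, parts.next()?, parts.next()?, parts.next()?);
    // Version 00 has exactly four fields.
    if version != "00" || parts.next().is_some() {
        return None;
    }
    if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
        return None;
    }
    // All-zero ids are invalid per the W3C Trace Context spec.
    if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
        return None;
    }
    Some(TraceParent {
        trace_id: trace_id.to_string(),
        parent_id: parent_id.to_string(),
        flags: flags.to_string(),
    })
}

/// Header for the upstream call: same trace id, gateway span as the new parent.
pub fn child_header(inbound: &TraceParent, gateway_span_id: &str) -> Option<String> {
    if !is_lower_hex(gateway_span_id, 16) {
        return None;
    }
    Some(format!("00-{}-{}-{}", inbound.trace_id, gateway_span_id, inbound.flags))
}
```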
# Run the full test suite
cargo test --all-features
# Lint (workspace lints forbid unsafe_code and deny unwrap/expect/panic in libs)
cargo clippy --all-features -- -D warnings
# Format check
cargo fmt --all -- --check
# Workspace-wide dependency tree
cargo tree --workspace
# Verify a heavy-dep crate is isolated (e.g. AWS SDK stays out of core)
cargo tree -p tiygate-core | grep -i aws # should be empty
cargo tree -p tiygate-provider-bedrock | head # AWS SDK lives here only

The CI baseline is strict: no `#[allow(...)]` workarounds, no `unwrap`/`expect`/`panic!` in library code, no dead code.

The tiygate binary is feature-gated. Pick the smallest set that matches your deployment so you don't pay compile time or binary size for components you don't ship.
| Feature | What it pulls in | When you need it |
|---|---|---|
| `admin` | `tiygate-admin` (control plane, Admin API, OAuth) | `admin` / `all` deploy mode |
| `cache` | `tiygate-cache` (embedding cache; LLM chat/completion responses are not cached) | Anywhere that benefits from caching |
| `providers` | `tiygate-providers` (OpenAI / Anthropic / generic OpenAI-compatible) | Any non-Bedrock LLM traffic |
| `bedrock` | `tiygate-provider-bedrock` (AWS SDK) | Routes that target AWS Bedrock |
| `tracing` | `tracing-subscriber` with JSON formatter | The default `tiygate` binary |
| `dotenv` | `dotenvy`, which auto-loads `.env` at startup | Local development |
| `webui` | `rust-embed`, which embeds `webui/dist` and serves the admin console at `/admin/ui` | `admin` / `all` deploy mode with a UI |

Defaults: admin, cache, providers, tracing, dotenv — the common case. bedrock is opt-in (it pulls the heavy AWS SDK) — add it explicitly if you need AWS Bedrock routes. webui is also opt-in: it embeds webui/dist at compile time, so build the frontend first (cd webui && npm install && npm run build) and then build with --features webui, or just run scripts/build-with-webui.sh which does both in order.
# Default build (everything except Bedrock — that's now opt-in)
cargo build -p tiygate-server --release
# Add Bedrock back when you need it
cargo build -p tiygate-server --release --features bedrock
# Minimal data-plane proxy — drop admin / cache / bedrock
cargo build -p tiygate-server --release \
--no-default-features --features "providers,tracing,dotenv"
# Bedrock-only — skip OpenAI / Anthropic to keep the binary lean
cargo build -p tiygate-server --release \
--no-default-features --features "bedrock,tracing,dotenv"
# Control-plane only — for the `admin` deploy mode
cargo build -p tiygate-server --release \
--no-default-features --features "admin,tracing,dotenv"
# Inspect what's actually compiled in
cargo tree -p tiygate-server -e features --depth 1

`bedrock` is opt-in by design. Compiling the AWS SDK is the single biggest hit to your cold-build time, so we keep it out of the default. If you route to Bedrock, opt in explicitly:

cargo build -p tiygate-server --release --features bedrock

CI smoke matrix: `bash scripts/verify-deps.sh` will still pass under any feature combination, because dependency isolation lives in `core` / `providers` and is enforced separately from the `server` build matrix.

Issues and pull requests are welcome. The design is opinionated, and contributions that fight the layering (e.g. adding a concrete provider dependency to core, or introducing allow_lossy) will be declined.