This document defines the V1 production shape and the local Docker topology. It does not provision cloud resources.
For initial us-east and london production edge host deployments, use the
operational runbook and production-style examples in
docs/edge-host-runbook-v1.md.
Use the image release workflow in docs/release-images-v1.md
to build production images and deploy immutable digest refs from the release
manifest.
Public V1 billing uses Autumn; see
docs/billing-autumn-v1.md for the customer mapping,
feature IDs, failure policy, and usage tracking model.
Run the public launch gate before deploy or promotion; see
docs/launch-gate-v1.md.
- rend-api: Rust API and control plane. It owns upload ingest, asset state, Postgres migrations, playback bootstrap, the rend.edge_nodes registry, best-effort edge warm/purge fanout, and telemetry ingestion into ClickHouse.
- rend-media-worker: the same repo runtime, started as rend-api worker media. It claims queued media jobs, uses ffmpeg and ffprobe, writes artifacts to S3-compatible storage, and asks healthy registered edges to warm playback artifacts.
- rend-edge: Rust playback edge. It validates signed playback URLs locally, serves playback artifacts, fills and coalesces local cache misses from object storage, exposes internal warm/purge endpoints, registers and heartbeats with rend-api when configured, and spools playback telemetry locally before sending it to rend-api.
Production dependencies are external managed services: Postgres, Redis, S3-compatible object storage, and ClickHouse.
Production browser playback should use the site route only for JSON bootstrap, then fetch hot media bytes directly from the selected public edge hostname:
browser
-> https://www.rend.so/api/player/{assetId}
-> Next.js route handler uses Vercel geo headers and shared route config
-> route returns tokenless https://ash-1.play.rend.so/v/{assetId}/... URLs
-> https://ash-1.play.rend.so/v/{assetId}/{artifactPath}
-> edge ingress
-> rend-edge
-> edge-local cache or object-storage fill

Production edge selection comes from the checked-in
@rend/playback-routing route table. The site route uses Vercel's
@vercel/functions geolocation(request) helper to read latitude, longitude,
country, and country-region from the incoming request. It selects the closest
configured metal route by great-circle distance. If coordinates are not
available, it falls back through country, optional continent header, then the
default metal route. Current public metal routes:
ash-1 us-east https://ash-1.play.rend.so
ams-1 amsterdam https://ams-1.play.rend.so

REND_PLAYER_EDGE_BASE_URLS remains available as a non-production/local escape
hatch, or as an explicit override with REND_PLAYER_EDGE_BASE_URLS_MODE=override.
Production should not need a per-request control-plane lookup or a separate
media load balancer just to choose the closest edge.
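The selection order described above can be sketched as follows. This is a minimal illustration, not the @rend/playback-routing implementation: the route coordinates, the country and continent maps, and the default route id are all assumptions made up for the example.

```python
import math

# Hypothetical route table. The coordinates are illustrative assumptions,
# not values taken from @rend/playback-routing.
ROUTES = [
    {"id": "ash-1", "region": "us-east", "base_url": "https://ash-1.play.rend.so",
     "lat": 39.04, "lon": -77.49},
    {"id": "ams-1", "region": "amsterdam", "base_url": "https://ams-1.play.rend.so",
     "lat": 52.37, "lon": 4.90},
]
# Hypothetical fallback maps and default route id.
COUNTRY_ROUTES = {"US": "ash-1", "NL": "ams-1"}
CONTINENT_ROUTES = {"NA": "ash-1", "EU": "ams-1"}
DEFAULT_ROUTE_ID = "ash-1"


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def select_route(lat=None, lon=None, country=None, continent=None):
    """Return (route, reason): coordinates, then country, then continent, then default."""
    by_id = {route["id"]: route for route in ROUTES}
    if lat is not None and lon is not None:
        closest = min(ROUTES, key=lambda r: great_circle_km(lat, lon, r["lat"], r["lon"]))
        return closest, "closest"
    if country in COUNTRY_ROUTES:
        return by_id[COUNTRY_ROUTES[country]], "country"
    if continent in CONTINENT_ROUTES:
        return by_id[CONTINENT_ROUTES[continent]], "continent"
    return by_id[DEFAULT_ROUTE_ID], "default"
```

Because the table is static and checked in, the whole decision is a pure function of the request's geo fields, which is why no per-request control-plane lookup is needed.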
Keep REND_PLAYER_PLAYBACK_BASE_URL as an emergency fallback single-edge base.
GET /api/player/{assetId} returns direct
/v/{assetId}/... media URLs and sets an HttpOnly playback cookie scoped to
/v/{assetId}/. The cookie may use Domain=rend.so only when both the site host
and media host are trusted Rend domains.
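A minimal sketch of the cookie scoping rule, assuming a hypothetical cookie name (rend_playback) and a simple suffix check for "trusted Rend domain". The real route may add further attributes that are not shown here.

```python
from http.cookies import SimpleCookie

TRUSTED_PARENT = "rend.so"


def is_trusted_rend_host(host):
    """Assumed trust rule: the host is rend.so or a subdomain of it."""
    host = host.lower().rstrip(".")
    return host == TRUSTED_PARENT or host.endswith("." + TRUSTED_PARENT)


def playback_cookie_header(asset_id, token, site_host, media_host):
    """Build a Set-Cookie value scoped to /v/{assetId}/ and marked HttpOnly."""
    cookie = SimpleCookie()
    cookie["rend_playback"] = token  # hypothetical cookie name
    morsel = cookie["rend_playback"]
    morsel["path"] = f"/v/{asset_id}/"
    morsel["httponly"] = True
    # Share the cookie across hosts only when both ends are trusted Rend domains.
    if is_trusted_rend_host(site_host) and is_trusted_rend_host(media_host):
        morsel["domain"] = TRUSTED_PARENT
    return morsel.OutputString()
```

With untrusted or local hosts the Domain attribute is omitted, so the cookie stays host-only.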
The legacy same-origin artifact route
/api/player/{assetId}/artifact/{artifactPath} remains as a fallback for local
or unconfigured environments, but it should not carry production media bytes
once direct edge delivery is configured. Edge headers such as
x-rend-cache: HIT, x-rend-edge-id, and x-rend-region prove that the
upstream rend-edge served the artifact from its local cache. The direct path
removes Next/Vercel from the media byte path.
The direct path keeps playback tokens out of JavaScript-visible URLs, uses only
configured public edge hosts, requires credentialed CORS from allowed Rend
origins, preserves the HttpOnly playback credential boundary, avoids exposing
/internal/*, and keeps private/authenticated media private.
The site route logs one structured rend_player_edge_selected event per
playback bootstrap. The event includes Vercel request id, Vercel edge region
from the geolocation helper, country, country-region, optional continent header,
whether valid coordinates were present, selected metal route id/region,
selection reason, playback host, and rounded route distance. It does not log IP
addresses, cookies, auth headers, playback tokens, signed URLs, or raw
coordinates.
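As an illustration of that logging contract, the sketch below builds one event line and refuses to emit sensitive keys. The field names, the forbidden-key spellings, and rounding to whole kilometres are assumptions for the example, not the site route's actual schema.

```python
import json

# Keys that must never reach the log. The exact spellings are assumptions.
FORBIDDEN_KEYS = {"ip", "cookie", "authorization", "playback_token",
                  "signed_url", "latitude", "longitude"}


def edge_selected_event(fields, distance_km):
    """Serialize one rend_player_edge_selected event as a JSON line."""
    event = {"event": "rend_player_edge_selected", **fields}
    leaked = FORBIDDEN_KEYS & event.keys()
    if leaked:
        raise ValueError(f"refusing to log sensitive keys: {sorted(leaked)}")
    # Only a rounded distance is logged, never the raw coordinates.
    event["route_distance_km"] = None if distance_km is None else round(distance_km)
    return json.dumps(event, sort_keys=True)
```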
compose.yml mirrors the production roles with local services:
- Postgres on host port 5432
- Redis on host port 6379
- MinIO S3 API on host port 9100 and console on 9101
- ClickHouse HTTP on host port 8123
- rend-api on host port 4000
- default rend-edge on host port 4100
- optional rend-edge-us-east on host port 4101
- optional rend-edge-london on host port 4102
Container-to-container URLs use Docker service names: postgres, redis,
minio, clickhouse, rend-api, and rend-edge. REND_PLAYBACK_BASE_URL
is the local client-facing URL and defaults to http://127.0.0.1:4100. Edge
containers register API-reachable REND_EDGE_BASE_URL values in
rend.edge_nodes; API and worker fan out warm/purge calls to all healthy edges
with fresh heartbeats.
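The "healthy edges with fresh heartbeats" filter might look like the sketch below. The row field names (healthy, last_heartbeat_at, base_url) are hypothetical, and the window is assumed to correspond to REND_EDGE_ACTIVE_HEARTBEAT_WINDOW_SECS.

```python
from datetime import datetime, timedelta, timezone


def fanout_targets(nodes, now, heartbeat_window_secs):
    """Return base URLs of edges that are healthy and heartbeated within the window."""
    cutoff = now - timedelta(seconds=heartbeat_window_secs)
    return [
        node["base_url"]
        for node in nodes
        if node["healthy"] and node["last_heartbeat_at"] >= cutoff
    ]
```

Warm and purge calls are best-effort, so an edge that misses the window is simply skipped until it heartbeats again.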
Run the default single-edge stack:
bun run backend:docker:build
bun run backend:docker:up

Run the two-edge simulation:
docker compose --profile two-edge up -d rend-edge-us-east rend-edge-london
bun run backend:docker:two-edge-smoke

Deploy the same image targets:
- rend-api: Dockerfile target rend-api
- rend-media-worker: Dockerfile target rend-media-worker
- rend-edge: Dockerfile target rend-edge
The runtime image includes ffmpeg and ffprobe so the media worker can run
without host media tooling. In production, run API, worker, and edge as separate
services even when they share a repository and image lineage.
Canonical image repositories are rend-api, rend-media-worker, and
rend-edge. For a registry prefix such as registry.example.com/rend, the
release script builds registry.example.com/rend/rend-api,
registry.example.com/rend/rend-media-worker, and
registry.example.com/rend/rend-edge. Production compose variables should use
the manifest image_digest values, for example
registry.example.com/rend/rend-api@sha256:..., instead of mutable tags.
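One way to guard against a mutable tag slipping into a production compose variable is a small shape check. This is a sketch under the assumption that a digest ref always ends in @sha256: followed by 64 lowercase hex characters; it is not part of the release script.

```python
import re

# Matches refs such as registry.example.com/rend/rend-api@sha256:<64 hex chars>.
DIGEST_REF = re.compile(r"^[^\s@]+@sha256:[0-9a-f]{64}$")


def is_immutable_ref(image_ref):
    """True when the image reference is pinned to a sha256 digest."""
    return bool(DIGEST_REF.match(image_ref))
```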
API:
- REND_ENV=local|production
- DATABASE_URL
- REND_REDIS_URL
- CLICKHOUSE_URL
- CLICKHOUSE_DATABASE
- CLICKHOUSE_USER
- CLICKHOUSE_PASSWORD
- OBJECT_STORE_HEALTH_URL
- S3_ENDPOINT
- S3_REGION
- S3_BUCKET
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- REND_API_BIND_ADDR
- REND_API_CORS_ALLOWED_ORIGINS
- REND_API_AUTO_MIGRATE
- REND_SITE_INTERNAL_TOKEN
- REND_BILLING_MODE=local|autumn (autumn is required in production)
- AUTUMN_SECRET_KEY when REND_BILLING_MODE=autumn
- AUTUMN_API_URL
- AUTUMN_API_VERSION
- REND_BILLING_FEATURE_DELIVERY_720P
- REND_BILLING_FEATURE_DELIVERY_1080P
- REND_BILLING_FEATURE_DELIVERY_2K
- REND_BILLING_FEATURE_DELIVERY_4K
- REND_BILLING_FEATURE_STORAGE_720P
- REND_BILLING_FEATURE_STORAGE_1080P
- REND_BILLING_FEATURE_STORAGE_2K
- REND_BILLING_FEATURE_STORAGE_4K
- REND_BILLING_ENTITLEMENT_FAILURE_POLICY
- REND_BILLING_DELIVERY_SYNC_LAG_SECS
- REND_BILLING_DELIVERY_SYNC_MAX_WINDOW_SECS
- REND_BILLING_STORAGE_SYNC_LAG_SECS
- REND_BILLING_STORAGE_SYNC_MAX_WINDOW_SECS
- REND_PLAYBACK_BASE_URL
- REND_PLAYER_PLAYBACK_BASE_URL on the site deployment
- REND_PLAYER_EDGE_BASE_URLS on local/test site deployments only, or with REND_PLAYER_EDGE_BASE_URLS_MODE=override for emergency production override
- REND_PLAYBACK_COOKIE_DOMAIN
- REND_MAX_UPLOAD_BYTES
- REND_EDGE_ACTIVE_HEARTBEAT_WINDOW_SECS
- REND_EXPECTED_EDGES
- REND_ALLOW_INSECURE_EDGE_URLS
- REND_EDGE_INTERNAL_TOKEN
- REND_INTERNAL_TELEMETRY_TOKEN
- REND_PLAYBACK_SIGNING_KEY_ID
- REND_PLAYBACK_SIGNING_SECRET
- REND_PLAYBACK_TOKEN_TTL_SECS
REND_EDGE_WARM_URL and REND_EDGE_PURGE_URL are optional single-edge
fallbacks for local/dev or emergency debugging. Leave them unset in normal
production so warm/purge fanout uses registered healthy edges.
REND_EXPECTED_EDGES uses comma-separated
edge_id=region=base_url entries. In production, edge base URLs must be
HTTPS.
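A minimal parser for the REND_EXPECTED_EDGES format, written as a sketch rather than the service's own parsing code. The function name and the returned dictionary shape are assumptions.

```python
def parse_expected_edges(value, production):
    """Parse comma-separated edge_id=region=base_url entries."""
    edges = []
    for entry in (part.strip() for part in value.split(",")):
        if not entry:
            continue  # tolerate trailing commas
        # Split on the first two '=' only, so the URL keeps any '=' it contains.
        fields = entry.split("=", 2)
        if len(fields) != 3 or not all(fields):
            raise ValueError(f"expected edge_id=region=base_url, got {entry!r}")
        edge_id, region, base_url = fields
        if production and not base_url.startswith("https://"):
            raise ValueError(f"edge {edge_id}: base URL must be HTTPS in production")
        edges.append({"edge_id": edge_id, "region": region, "base_url": base_url})
    return edges
```

For example, `edge-us-east=us-east=https://edge-us-east.example.com,edge-london=london=https://edge-london.example.com` yields two entries, while an http:// base URL is rejected when production is true.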
Worker:
- all API dependency vars used for Postgres, Redis, S3, ClickHouse, playback signing, edge internal auth, and Autumn billing
- REND_API_AUTO_MIGRATE=false after the API migration step is deployed
The Release and Deploy Backend workflow syncs the deploy-managed allowlist
into the control-plane API and worker env files before deployment, including
CLICKHOUSE_*, REND_API_CORS_ALLOWED_ORIGINS, and billing keys. The
Production GitHub environment must include CLICKHOUSE_URL, CLICKHOUSE_USER,
CLICKHOUSE_PASSWORD, and AUTUMN_SECRET_KEY; the sync helper refuses to run
unless the Autumn key is visibly live, and logs only key names.
- REND_MEDIA_WORKER_ID
- REND_MEDIA_WORKER_POLL_INTERVAL_SECS
- REND_MEDIA_JOB_LOCK_TIMEOUT_SECS
- REND_MEDIA_PROCESS_TIMEOUT_SECS
- REND_FFMPEG_PATH
- REND_FFPROBE_PATH
Edge:
- REND_ENV=local|production
- REND_EDGE_BIND_ADDR
- REND_EDGE_ID
- REND_EDGE_REGION
- REND_EDGE_BASE_URL
- REND_EDGE_CORS_ALLOWED_ORIGINS
- REND_EXPECTED_EDGES
- REND_ALLOW_INSECURE_EDGE_URLS
- REND_CONTROL_PLANE_URL
- REND_EDGE_HEARTBEAT_INTERVAL_SECS
- REND_EDGE_CACHE_MAX_BYTES
- REND_EDGE_CACHE_MIN_FREE_BYTES
- REND_EDGE_CACHE_DIR
- REND_EDGE_ORIGIN_HEALTH_URL
- S3_ENDPOINT
- S3_REGION
- S3_BUCKET
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- REND_EDGE_INTERNAL_TOKEN
- REND_EDGE_WARM_MAX_ARTIFACTS (default 16; enough for the HLS master, four variant playlists, and the first two segments for each generated tier)
- REND_EDGE_MAX_IN_FLIGHT_FILLS
- REND_EDGE_MAX_ORIGIN_ARTIFACT_BYTES
- REND_INTERNAL_TELEMETRY_TOKEN
- REND_EDGE_TELEMETRY_ENABLED
- REND_EDGE_TELEMETRY_INGEST_URL
- REND_EDGE_TELEMETRY_QUEUE_CAPACITY
- REND_EDGE_TELEMETRY_BATCH_SIZE
- REND_EDGE_TELEMETRY_FLUSH_INTERVAL_SECS
- REND_EDGE_TELEMETRY_REQUEST_TIMEOUT_SECS
- REND_EDGE_TELEMETRY_SPOOL_DIR
- REND_EDGE_TELEMETRY_SPOOL_MAX_BYTES
- REND_PLAYBACK_SIGNING_KEY_ID
- REND_PLAYBACK_SIGNING_SECRET
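A quick arithmetic check of the REND_EDGE_WARM_MAX_ARTIFACTS default, assuming one variant playlist per generated tier and four tiers in total:

```python
DEFAULT_WARM_MAX_ARTIFACTS = 16

master_playlists = 1
variant_playlists = 4          # assumed: one per generated tier
first_segments_per_tier = 2

warmed = master_playlists + variant_playlists + variant_playlists * first_segments_per_tier
headroom = DEFAULT_WARM_MAX_ARTIFACTS - warmed
print(warmed, headroom)  # 13 3
```

Under those assumptions a full warm uses 13 artifacts, leaving a small margin under the default of 16.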
Use .env.local.example for host development and .env.docker.example for
Docker service-name defaults. Production secrets must come from
.env.production.local for local production-targeted checks or from the
deployment platform for real deploys, not from .env.local.
Production mode rejects empty required secrets, checked-in dev
defaults, and local service URLs such as localhost, 127.0.0.1, minio,
rend-api, or rend-edge. rend-edge streams cold playback misses while writing
atomic cache files and enforces cache size/free-space bounds with deterministic
priority eviction.
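The production-mode rejection rules can be approximated as below. This sketch covers only empty secrets and local service hosts; the check against checked-in dev defaults is omitted because those values are not listed here. The function name and arguments are hypothetical.

```python
from urllib.parse import urlsplit

# Hosts named above as local service URLs.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "minio", "rend-api", "rend-edge"}


def production_env_errors(env, required_secrets, url_vars):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name in required_secrets:
        if not env.get(name, "").strip():
            errors.append(f"{name}: required secret is empty")
    for name in url_vars:
        host = urlsplit(env.get(name, "")).hostname
        if host in LOCAL_HOSTS:
            errors.append(f"{name}: local service URL is not allowed in production")
    return errors
```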
Local validation and production-profile validation are separate:
bun run env:local
bun run env:production
bun run verify:production-local

Production-profile commands load .env.production and .env.production.local
or host/platform env vars. They do not load .env.local.
Local Compose uses persistent volumes for:
- rend-postgres-data
- rend-redis-data
- rend-minio-data
- rend-clickhouse-data
- rend-edge-cache
- rend-edge-telemetry-spool
- rend-edge-us-east-cache
- rend-edge-us-east-telemetry-spool
- rend-edge-london-cache
- rend-edge-london-telemetry-spool
In production, Postgres, Redis, object storage, and ClickHouse are managed externally. Each edge node keeps local cache and telemetry spool volumes. These edge volumes are node-local, not shared.
- Postgres: pg_isready
- Redis: redis-cli ping
- MinIO: /minio/health/ready
- ClickHouse: SELECT 1
- rend-api: GET /readyz
- rend-edge: GET /readyz
- rend-media-worker: process liveness check for rend-api worker media
API readiness checks Postgres, Redis, and object storage. Edge readiness checks the local cache directory and object-store origin. Worker readiness is process liveness because the worker has no HTTP listener.
Postgres migrations are applied by rend-api through SQLx when
REND_API_AUTO_MIGRATE=true. In local Compose, the worker waits for the API and
sets REND_API_AUTO_MIGRATE=false to avoid duplicate startup migration work.
For production, deploy or run the API migration step before starting workers.
ClickHouse schema is applied by the local clickhouse-init one-shot service on
every Compose startup. The schema uses CREATE DATABASE IF NOT EXISTS and
CREATE TABLE IF NOT EXISTS, so repeated runs are safe.
MinIO bucket creation is handled by the local-only minio-init one-shot
service. Production object storage should be provisioned outside this repo.
Use the checked-in operator scripts for first-host production deployments. They do not provision cloud resources, DNS, TLS, proxies, registry credentials, image signing, or SBOMs.
Validate production env files before deploy:
scripts/validate-production-env.sh --role control-plane
scripts/validate-production-env.sh --role edge-host

The validator requires vars to be present, rejects placeholder values, rejects
local/dev defaults unless --allow-dev-defaults is passed, and checks URL,
port, boolean, numeric, and path shapes. For local Docker example dry-runs:
scripts/validate-production-env.sh --role all --allow-dev-defaults \
--api-env .env.docker.example \
--worker-env .env.docker.example \
--edge-env .env.docker.example

Run host preflight before deploy. Production manifests must contain
image_digest refs and platform metadata for the required services. The host
expectation defaults to linux/amd64; pass --expected-platform only for an
intentional architecture change:
scripts/preflight-control-plane-host.sh \
--manifest .rend/releases/production-001.json
scripts/preflight-edge-host.sh \
--manifest .rend/releases/production-001.json

The control-plane preflight checks Docker/Compose, compose/env files, manifest
digest refs, manifest platform metadata, manifest image pull readiness, pulled
image OS/architecture, managed dependency connectivity where local tools allow
it, and host bind ports. The edge preflight checks Docker/Compose, edge env,
manifest digest ref, manifest platform metadata, manifest image pull readiness,
pulled image OS/architecture,
private-by-default direct port publishing, uid/gid 10001 cache and spool
writeability, object-store health, control-plane register/heartbeat
reachability, telemetry ingest reachability, and host bind ports.
Use deploy helpers in dry-run mode first to print the exact Compose commands with manifest image refs:
scripts/deploy-control-plane-host.sh \
--manifest .rend/releases/production-001.json \
--dry-run
scripts/deploy-edge-host.sh \
--manifest .rend/releases/production-001.json \
--dry-run

After deploy, verify the first-host path:
scripts/verify-first-host-deploy.sh \
--api-base https://api.rend.so \
--edge-base https://edge-us-east.example.com \
--edge-internal-base http://10.0.10.12:4100 \
--edge-base https://edge-london.example.com \
--edge-internal-base http://10.0.20.12:4100 \
--api-env /etc/rend/rend-api.env \
--edge-env /etc/rend/rend-edge.env \
--asset-id 00000000-0000-0000-0000-000000000000 \
--rewrite-playback-base

The verifier checks API /readyz, private edge /readyz, all expected edge
registrations, the public deny surface, warmed HIT signed playback on each
edge, playback analytics increasing after the smoke requests, no
dropped-telemetry increase, and telemetry spool bytes returning to 0. It reads
Postgres and ClickHouse settings from --api-env, or from explicit
--database-url, --clickhouse-url, --clickhouse-database,
--clickhouse-user, and --clickhouse-password flags for laptop or bastion
runs. For psql probes only, it normalizes hosted Postgres URLs by removing
sslrootcert=system; the service DATABASE_URL is not rewritten.
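The psql-only normalization could be sketched like this. It is an illustration of the described behavior, not the verifier's code, and the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_for_psql(database_url):
    """Drop sslrootcert=system from the query string; leave everything else intact."""
    parts = urlsplit(database_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not (key == "sslrootcert" and value == "system")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

The function returns a new string, so the DATABASE_URL the service itself reads is never modified.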
Run the synthetic playback readiness gate before and after production deploys that can affect upload ingest, media processing, playback bootstrap, edge cache behavior, telemetry, or deploy routing. The gate uploads generated test media only; it does not use customer media.
Local two-edge run:
bun run playback:readiness

The default target starts the local Docker stack plus the two-edge profile,
then verifies rend-edge-us-east on http://127.0.0.1:4101 and
rend-edge-london on http://127.0.0.1:4102.
Production-style run:
REND_API_BASE_URL=https://api.rend.so \
REND_READINESS_API_KEY=<api-key-with-upload-read-delete-analytics> \
REND_EDGE_INTERNAL_TOKEN=<edge-internal-token> \
REND_READINESS_EDGES='edge-us=us-east=https://edge-us.example.com=http://10.0.10.12:4100,edge-eu=london=https://edge-eu.example.com=http://10.0.20.12:4100' \
bun run playback:readiness -- --target configured --skip-local-stack

REND_READINESS_EDGES uses
edge_id=region=public_playback_base[=private_edge_base]. The public base is
used for signed playback fetches; the private base is used for /readyz,
/internal/warm, /internal/purge, and /metrics. If the private base is
omitted, the public base is used for both.
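A sketch of parsing REND_READINESS_EDGES, including the private-base fallback. The function name and result shape are assumptions, and the sketch assumes the URLs themselves contain no '=' characters.

```python
def parse_readiness_edges(value):
    """Parse edge_id=region=public_playback_base[=private_edge_base] entries."""
    edges = []
    for entry in (part.strip() for part in value.split(",")):
        if not entry:
            continue
        fields = entry.split("=")
        if len(fields) not in (3, 4) or not all(fields):
            raise ValueError(f"malformed readiness edge entry: {entry!r}")
        edge_id, region, public_base = fields[:3]
        # When the private base is omitted, the public base serves both roles.
        private_base = fields[3] if len(fields) == 4 else public_base
        edges.append({
            "edge_id": edge_id,
            "region": region,
            "public_base": public_base,
            "private_base": private_base,
        })
    return edges
```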
To include the gate in first-host verification:
scripts/verify-first-host-deploy.sh \
--api-base https://api.rend.so \
--edge-base https://edge-us-east.example.com \
--edge-internal-base http://10.0.10.12:4100 \
--edge-base https://edge-london.example.com \
--edge-internal-base http://10.0.20.12:4100 \
--api-env /etc/rend/rend-api.env \
--edge-env /etc/rend/rend-edge.env \
--asset-id 00000000-0000-0000-0000-000000000000 \
--rewrite-playback-base \
--run-readiness-gate

The gate writes a run artifact and updates
.rend/readiness/playback-readiness-latest.json for the private operator UI.
Set REND_READINESS_OUTPUT, REND_READINESS_LATEST_OUTPUT, or
REND_READINESS_ARTIFACT_PATH to place or read the latest result elsewhere.
Artifacts are redacted and are checked before write: they must not contain full
URLs, cookies, signed URL query tokens, authorization headers, bearer tokens,
configured API keys, edge internal tokens, or client IPs.
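A pre-write redaction check in this spirit might look like the sketch below. The patterns are heuristic assumptions that cover only some of the listed categories (full URLs, bearer tokens, IPv4 addresses, and configured secret values); the gate's real checks are not shown in this document.

```python
import json
import re

# Heuristic patterns; names and regexes are assumptions for illustration.
PATTERNS = {
    "full_url": re.compile(r"https?://", re.IGNORECASE),
    "bearer_token": re.compile(r"\bbearer\s+\S+", re.IGNORECASE),
    "ipv4_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redaction_violations(artifact, configured_secrets=()):
    """Return the names of checks the serialized artifact fails."""
    text = json.dumps(artifact)
    found = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    if any(secret and secret in text for secret in configured_secrets):
        found.append("configured_secret")
    return found
```

A writer would refuse to persist the artifact whenever the returned list is non-empty.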
The result status means:
- pass: correctness checks passed and all measured timings stayed under warn thresholds.
- warn: correctness checks passed, but one or more conservative performance warn thresholds were exceeded. Treat this as a deploy note unless the trend is regressing.
- fail: a correctness/safety check failed or a fail threshold was exceeded. Do not promote the deploy until the artifact's failures list is resolved.
Correctness failures include missing expected edges, non-200 upload/bootstrap/playback responses, non-tokenless playback URL shape, wrong content types, unexpected cache headers, telemetry visibility timeout, dropped telemetry increase, nonzero telemetry spool bytes after the run, unredacted artifact content, or synthetic cleanup failure.
Performance thresholds can be configured with env vars:
- REND_READINESS_WARN_UPLOAD_RESPONSE_MS, REND_READINESS_FAIL_UPLOAD_RESPONSE_MS
- REND_READINESS_WARN_UPLOAD_TO_OPENER_PLAYABLE_MS, REND_READINESS_FAIL_UPLOAD_TO_OPENER_PLAYABLE_MS
- REND_READINESS_WARN_UPLOAD_TO_HLS_READY_MS, REND_READINESS_FAIL_UPLOAD_TO_HLS_READY_MS
- REND_READINESS_WARN_PLAYBACK_BOOTSTRAP_MS, REND_READINESS_FAIL_PLAYBACK_BOOTSTRAP_MS
- REND_READINESS_WARN_EDGE_TTFB_MISS_MS, REND_READINESS_FAIL_EDGE_TTFB_MISS_MS
- REND_READINESS_WARN_EDGE_TTFB_HIT_MS, REND_READINESS_FAIL_EDGE_TTFB_HIT_MS
- REND_READINESS_WARN_EDGE_TTFB_WARMED_HIT_MS, REND_READINESS_FAIL_EDGE_TTFB_WARMED_HIT_MS
- REND_READINESS_WARN_TELEMETRY_VISIBILITY_MS, REND_READINESS_FAIL_TELEMETRY_VISIBILITY_MS
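How warn and fail thresholds combine into a single status can be sketched as follows. The metric names and threshold numbers in the usage note are made up, and treating a timing exactly equal to a threshold as "not exceeded" is an assumption.

```python
def classify_run(timings_ms, thresholds_ms, correctness_failures=()):
    """Return (status, warnings, failures) for one readiness run.

    thresholds_ms maps a metric name to a (warn_ms, fail_ms) pair.
    """
    failures = list(correctness_failures)
    warnings = []
    for metric, value in timings_ms.items():
        warn_ms, fail_ms = thresholds_ms[metric]
        if value > fail_ms:
            failures.append(f"{metric} exceeded fail threshold")
        elif value > warn_ms:
            warnings.append(f"{metric} exceeded warn threshold")
    if failures:
        status = "fail"
    elif warnings:
        status = "warn"
    else:
        status = "pass"
    return status, warnings, failures
```

For example, with a hypothetical `edge_ttfb_hit` threshold pair of (50, 200), a measured 120 ms gives `warn`, and any correctness failure gives `fail` regardless of timings.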
The bytes-per-delivered-minute value is a proxy from synthetic playback bytes and fixture duration. It is useful for deploy comparison, not billing-grade usage or watch accounting.
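The exact formula is not given here; one plausible reading, shown only as an assumption, divides the synthetic playback bytes by the fixture duration expressed in minutes.

```python
def bytes_per_delivered_minute(playback_bytes, fixture_duration_secs):
    """Proxy metric: synthetic playback bytes per minute of fixture duration."""
    if fixture_duration_secs <= 0:
        raise ValueError("fixture duration must be positive")
    return playback_bytes / (fixture_duration_secs / 60.0)


# 3,000,000 bytes fetched for a 30-second fixture.
print(bytes_per_delivered_minute(3_000_000, 30))  # 6000000.0
```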
- Provision managed Postgres, Redis, S3-compatible storage, and ClickHouse.
- Apply or confirm ClickHouse schema.
- From a clean git worktree, build and optionally push images with bun run release:images -- --tag production-001 --registry <registry-prefix> --platform linux/amd64 --push. Pushed releases require the git SHA to be reachable from a pushed branch or tag and copy the accepted manifest to docs/releases/.
- Copy production-style compose files, real env files, and the release manifest to the target hosts.
- Run scripts/validate-production-env.sh and the relevant preflight script on each host.
- Run the deploy helper with --dry-run, then run it without --dry-run.
- Deploy rend-api with REND_API_AUTO_MIGRATE=true for the migration step.
- Start rend-api serving traffic after /readyz passes.
- Start rend-edge nodes with unique REND_EDGE_ID, REND_EDGE_REGION, API-reachable REND_EDGE_BASE_URL, cache volume, and telemetry spool volume.
- Start rend-media-worker with REND_API_AUTO_MIGRATE=false.
- Run scripts/verify-first-host-deploy.sh with a provided hls_ready asset to confirm edge registration, signed playback, and telemetry analytics.
- Run bun run playback:readiness -- --target configured --skip-local-stack or pass --run-readiness-gate to the verifier before promoting traffic.
Roll back services in dependency order from the edge inward:
- Roll back rend-edge first if playback cache behavior regresses.
- Roll back rend-media-worker if artifact generation or warming regresses.
- Roll back rend-api last. Treat Postgres migrations as forward-only unless a tested rollback migration exists.
Edge cache can be purged or discarded during rollback. Telemetry spool files can be retained for replay or deleted if the ingest contract changed incompatibly.
US East and London edge nodes differ only by environment and attached volumes:
- REND_EDGE_ID
- REND_EDGE_REGION
- REND_EDGE_BASE_URL
- host port or load balancer target
- local cache volume
- local telemetry spool volume
The same rend-edge image and command run in both regions.