Rend Playback Edge V1 Deployment

This document defines the V1 production shape and the local Docker topology. It does not provision cloud resources.

For initial us-east and london production edge host deployments, use the operational runbook and production-style examples in docs/edge-host-runbook-v1.md. Use the image release workflow in docs/release-images-v1.md to build production images and deploy immutable digest refs from the release manifest. Public V1 billing uses Autumn; see docs/billing-autumn-v1.md for the customer mapping, feature IDs, failure policy, and usage tracking model. Run the public launch gate before deploy or promotion; see docs/launch-gate-v1.md.

Service Topology

rend-api: Rust API and control plane. It owns upload ingest, asset state, Postgres migrations, playback bootstrap, the rend.edge_nodes registry, best-effort edge warm/purge fanout, and telemetry ingestion into ClickHouse.
rend-media-worker: the same repo runtime, started as rend-api worker media. It claims queued media jobs, uses ffmpeg and ffprobe, writes artifacts to S3-compatible storage, and asks healthy registered edges to warm playback artifacts.
rend-edge: Rust playback edge. It validates signed playback URLs locally, serves playback artifacts, fills and coalesces local cache misses from object storage, exposes internal warm/purge endpoints, registers and heartbeats with rend-api when configured, and spools playback telemetry locally before sending it to rend-api.

Production dependencies are external managed services: Postgres, Redis, S3-compatible object storage, and ClickHouse.

Browser Media Path

Production browser playback should use the site route only for JSON bootstrap, then send hot media bytes directly to the selected public edge hostname:

browser
  -> https://www.rend.so/api/player/{assetId}
  -> Next.js route handler uses Vercel geo headers and shared route config
  -> route returns tokenless https://ash-1.play.rend.so/v/{assetId}/... URLs
  -> https://ash-1.play.rend.so/v/{assetId}/{artifactPath}
  -> edge ingress
  -> rend-edge
  -> edge-local cache or object-storage fill

Production edge selection comes from the checked-in @rend/playback-routing route table. The site route uses Vercel's @vercel/functions geolocation(request) helper to read latitude, longitude, country, and country-region from the incoming request. It selects the closest configured metal route by great-circle distance. If coordinates are not available, it falls back through country, optional continent header, then the default metal route. Current public metal routes:

ash-1  us-east    https://ash-1.play.rend.so
ams-1  amsterdam  https://ams-1.play.rend.so

REND_PLAYER_EDGE_BASE_URLS remains available as a non-production/local escape hatch, or as an explicit override with REND_PLAYER_EDGE_BASE_URLS_MODE=override. Production should not need a per-request control-plane lookup or a separate media load balancer just to choose the closest edge.

Keep REND_PLAYER_PLAYBACK_BASE_URL as an emergency fallback single-edge base. GET /api/player/{assetId} returns direct /v/{assetId}/... media URLs and sets an HttpOnly playback cookie scoped to /v/{assetId}/. The cookie may use Domain=rend.so only when both the site host and media host are trusted Rend domains.

The legacy same-origin artifact route /api/player/{assetId}/artifact/{artifactPath} remains as a fallback for local or unconfigured environments, but it should not carry production media bytes once direct edge delivery is configured. Edge headers such as x-rend-cache: HIT, x-rend-edge-id, and x-rend-region prove that the upstream rend-edge served the artifact from its local cache. The direct path removes Next/Vercel from the media byte path.

The direct path keeps playback tokens out of JavaScript-visible URLs, uses only configured public edge hosts, requires credentialed CORS from allowed Rend origins, preserves the HttpOnly playback credential boundary, avoids exposing /internal/*, and keeps private/authenticated media private.

The site route logs one structured rend_player_edge_selected event per playback bootstrap. The event includes Vercel request id, Vercel edge region from the geolocation helper, country, country-region, optional continent header, whether valid coordinates were present, selected metal route id/region, selection reason, playback host, and rounded route distance. It does not log IP addresses, cookies, auth headers, playback tokens, signed URLs, or raw coordinates.

Local Docker Topology

compose.yml mirrors the production roles with local services:

Postgres on host port 5432
Redis on host port 6379
MinIO S3 API on host port 9100 and console on 9101
ClickHouse HTTP on host port 8123
rend-api on host port 4000
default rend-edge on host port 4100
optional rend-edge-us-east on host port 4101
optional rend-edge-london on host port 4102

Container-to-container URLs use Docker service names: postgres, redis, minio, clickhouse, rend-api, and rend-edge. REND_PLAYBACK_BASE_URL is the local client-facing URL and defaults to http://127.0.0.1:4100. Edge containers register API-reachable REND_EDGE_BASE_URL values in rend.edge_nodes; API and worker fan out warm/purge calls to all healthy edges with fresh heartbeats.

Run the default single-edge stack:

bun run backend:docker:build
bun run backend:docker:up

Run the two-edge simulation:

docker compose --profile two-edge up -d rend-edge-us-east rend-edge-london
bun run backend:docker:two-edge-smoke

Production Topology

Deploy the same image targets:

rend-api: Dockerfile target rend-api
rend-media-worker: Dockerfile target rend-media-worker
rend-edge: Dockerfile target rend-edge

The runtime image includes ffmpeg and ffprobe so the media worker can run without host media tooling. In production, run API, worker, and edge as separate services even when they share a repository and image lineage.

Canonical image repositories are rend-api, rend-media-worker, and rend-edge. For a registry prefix such as registry.example.com/rend, the release script builds registry.example.com/rend/rend-api, registry.example.com/rend/rend-media-worker, and registry.example.com/rend/rend-edge. Production compose variables should use the manifest image_digest values, for example registry.example.com/rend/rend-api@sha256:..., instead of mutable tags.

Required Env Vars

API:

REND_ENV=local|production
DATABASE_URL
REND_REDIS_URL
CLICKHOUSE_URL
CLICKHOUSE_DATABASE
CLICKHOUSE_USER
CLICKHOUSE_PASSWORD
OBJECT_STORE_HEALTH_URL
S3_ENDPOINT
S3_REGION
S3_BUCKET
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
REND_API_BIND_ADDR
REND_API_CORS_ALLOWED_ORIGINS
REND_API_AUTO_MIGRATE
REND_SITE_INTERNAL_TOKEN
REND_BILLING_MODE=local|autumn (autumn is required in production)
AUTUMN_SECRET_KEY when REND_BILLING_MODE=autumn
AUTUMN_API_URL
AUTUMN_API_VERSION
REND_BILLING_FEATURE_DELIVERY_720P
REND_BILLING_FEATURE_DELIVERY_1080P
REND_BILLING_FEATURE_DELIVERY_2K
REND_BILLING_FEATURE_DELIVERY_4K
REND_BILLING_FEATURE_STORAGE_720P
REND_BILLING_FEATURE_STORAGE_1080P
REND_BILLING_FEATURE_STORAGE_2K
REND_BILLING_FEATURE_STORAGE_4K
REND_BILLING_ENTITLEMENT_FAILURE_POLICY
REND_BILLING_DELIVERY_SYNC_LAG_SECS
REND_BILLING_DELIVERY_SYNC_MAX_WINDOW_SECS
REND_BILLING_STORAGE_SYNC_LAG_SECS
REND_BILLING_STORAGE_SYNC_MAX_WINDOW_SECS
REND_PLAYBACK_BASE_URL
REND_PLAYER_PLAYBACK_BASE_URL on the site deployment
REND_PLAYER_EDGE_BASE_URLS on local/test site deployments only, or with REND_PLAYER_EDGE_BASE_URLS_MODE=override for emergency production override
REND_PLAYBACK_COOKIE_DOMAIN
REND_MAX_UPLOAD_BYTES
REND_EDGE_ACTIVE_HEARTBEAT_WINDOW_SECS
REND_EXPECTED_EDGES
REND_ALLOW_INSECURE_EDGE_URLS
REND_EDGE_INTERNAL_TOKEN
REND_INTERNAL_TELEMETRY_TOKEN
REND_PLAYBACK_SIGNING_KEY_ID
REND_PLAYBACK_SIGNING_SECRET
REND_PLAYBACK_TOKEN_TTL_SECS

REND_EDGE_WARM_URL and REND_EDGE_PURGE_URL are optional single-edge fallbacks for local/dev or emergency debugging. Leave them unset in normal production so warm/purge fanout uses registered healthy edges. REND_EXPECTED_EDGES uses comma-separated edge_id=region=base_url entries. In production, edge base URLs must be HTTPS.

Worker:

all API dependency vars used for Postgres, Redis, S3, ClickHouse, playback signing, edge internal auth, and Autumn billing
REND_API_AUTO_MIGRATE=false after the API migration step is deployed

The Release and Deploy Backend workflow syncs the deploy-managed allowlist into the control-plane API and worker env files before deployment, including CLICKHOUSE_*, REND_API_CORS_ALLOWED_ORIGINS, and billing keys. The Production GitHub environment must include CLICKHOUSE_URL, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD, and AUTUMN_SECRET_KEY; the sync helper refuses to run unless the Autumn key is visibly live, and logs only key names.

REND_MEDIA_WORKER_ID
REND_MEDIA_WORKER_POLL_INTERVAL_SECS
REND_MEDIA_JOB_LOCK_TIMEOUT_SECS
REND_MEDIA_PROCESS_TIMEOUT_SECS
REND_FFMPEG_PATH
REND_FFPROBE_PATH

Edge:

REND_ENV=local|production
REND_EDGE_BIND_ADDR
REND_EDGE_ID
REND_EDGE_REGION
REND_EDGE_BASE_URL
REND_EDGE_CORS_ALLOWED_ORIGINS
REND_EXPECTED_EDGES
REND_ALLOW_INSECURE_EDGE_URLS
REND_CONTROL_PLANE_URL
REND_EDGE_HEARTBEAT_INTERVAL_SECS
REND_EDGE_CACHE_MAX_BYTES
REND_EDGE_CACHE_MIN_FREE_BYTES
REND_EDGE_CACHE_DIR
REND_EDGE_ORIGIN_HEALTH_URL
S3_ENDPOINT
S3_REGION
S3_BUCKET
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
REND_EDGE_INTERNAL_TOKEN
REND_EDGE_WARM_MAX_ARTIFACTS (default 16; enough for the HLS master, four variant playlists, and the first two segments for each generated tier)
REND_EDGE_MAX_IN_FLIGHT_FILLS
REND_EDGE_MAX_ORIGIN_ARTIFACT_BYTES
REND_INTERNAL_TELEMETRY_TOKEN
REND_EDGE_TELEMETRY_ENABLED
REND_EDGE_TELEMETRY_INGEST_URL
REND_EDGE_TELEMETRY_QUEUE_CAPACITY
REND_EDGE_TELEMETRY_BATCH_SIZE
REND_EDGE_TELEMETRY_FLUSH_INTERVAL_SECS
REND_EDGE_TELEMETRY_REQUEST_TIMEOUT_SECS
REND_EDGE_TELEMETRY_SPOOL_DIR
REND_EDGE_TELEMETRY_SPOOL_MAX_BYTES
REND_PLAYBACK_SIGNING_KEY_ID
REND_PLAYBACK_SIGNING_SECRET

Use .env.local.example for host development and .env.docker.example for Docker service-name defaults. Production secrets must come from .env.production.local for local production-targeted checks or from the deployment platform for real deploys, not from .env.local.

Production mode rejects empty required secrets, checked-in dev defaults, and local service URLs such as localhost, 127.0.0.1, minio, rend-api, or rend-edge. rend-edge streams cold playback misses while writing atomic cache files and enforces cache size/free-space bounds with deterministic priority eviction.

Local validation and production-profile validation are separate:

bun run env:local
bun run env:production
bun run verify:production-local

Production-profile commands load .env.production and .env.production.local or host/platform env vars. They do not load .env.local.

Volumes

Local Compose uses persistent volumes for:

rend-postgres-data
rend-redis-data
rend-minio-data
rend-clickhouse-data
rend-edge-cache
rend-edge-telemetry-spool
rend-edge-us-east-cache
rend-edge-us-east-telemetry-spool
rend-edge-london-cache
rend-edge-london-telemetry-spool

In production, Postgres, Redis, object storage, and ClickHouse are managed externally. Each edge node keeps local cache and telemetry spool volumes. These edge volumes are node-local, not shared.

Healthchecks

Postgres: pg_isready
Redis: redis-cli ping
MinIO: /minio/health/ready
ClickHouse: SELECT 1
rend-api: GET /readyz
rend-edge: GET /readyz
rend-media-worker: process liveness check for rend-api worker media

API readiness checks Postgres, Redis, and object storage. Edge readiness checks the local cache directory and object-store origin. Worker readiness is process liveness because the worker has no HTTP listener.

Bootstrap And Migrations

Postgres migrations are applied by rend-api through SQLx when REND_API_AUTO_MIGRATE=true. In local Compose, the worker waits for the API and sets REND_API_AUTO_MIGRATE=false to avoid duplicate startup migration work. For production, deploy or run the API migration step before starting workers.

ClickHouse schema is applied by the local clickhouse-init one-shot service on every Compose startup. The schema uses CREATE DATABASE IF NOT EXISTS and CREATE TABLE IF NOT EXISTS, so repeated runs are safe.

MinIO bucket creation is handled by the local-only minio-init one-shot service. Production object storage should be provisioned outside this repo.

Operator Harness

Use the checked-in operator scripts for first-host production deployments. They do not provision cloud resources, DNS, TLS, proxies, registry credentials, image signing, or SBOMs.

Validate production env files before deploy:

scripts/validate-production-env.sh --role control-plane
scripts/validate-production-env.sh --role edge-host

The validator requires vars to be present, rejects placeholder values, rejects local/dev defaults unless --allow-dev-defaults is passed, and checks URL, port, boolean, numeric, and path shapes. For local Docker example dry-runs:

scripts/validate-production-env.sh --role all --allow-dev-defaults \
  --api-env .env.docker.example \
  --worker-env .env.docker.example \
  --edge-env .env.docker.example

Run host preflight before deploy. Production manifests must contain image_digest refs and platform metadata for the required services. The host expectation defaults to linux/amd64; pass --expected-platform only for an intentional architecture change:

scripts/preflight-control-plane-host.sh \
  --manifest .rend/releases/production-001.json

scripts/preflight-edge-host.sh \
  --manifest .rend/releases/production-001.json

The control-plane preflight checks Docker/Compose, compose/env files, manifest digest refs, manifest platform metadata, manifest image pull readiness, pulled image OS/architecture, managed dependency connectivity where local tools allow it, and host bind ports. The edge preflight checks Docker/Compose, edge env, manifest digest ref, manifest platform metadata, manifest image pull readiness, pulled image OS/architecture, private-by-default direct port publishing, uid/gid 10001 cache and spool writeability, object-store health, control-plane register/heartbeat reachability, telemetry ingest reachability, and host bind ports.

Use deploy helpers in dry-run mode first to print the exact Compose commands with manifest image refs:

scripts/deploy-control-plane-host.sh \
  --manifest .rend/releases/production-001.json \
  --dry-run

scripts/deploy-edge-host.sh \
  --manifest .rend/releases/production-001.json \
  --dry-run

After deploy, verify the first-host path:

scripts/verify-first-host-deploy.sh \
  --api-base https://api.rend.so \
  --edge-base https://edge-us-east.example.com \
  --edge-internal-base http://10.0.10.12:4100 \
  --edge-base https://edge-london.example.com \
  --edge-internal-base http://10.0.20.12:4100 \
  --api-env /etc/rend/rend-api.env \
  --edge-env /etc/rend/rend-edge.env \
  --asset-id 00000000-0000-0000-0000-000000000000 \
  --rewrite-playback-base

The verifier checks API /readyz, private edge /readyz, all expected edge registrations, the public deny surface, warmed HIT signed playback on each edge, playback analytics increasing after the smoke requests, no dropped-telemetry increase, and telemetry spool bytes returning to 0. It reads Postgres and ClickHouse settings from --api-env, or from explicit --database-url, --clickhouse-url, --clickhouse-database, --clickhouse-user, and --clickhouse-password flags for laptop or bastion runs. For psql probes only, it normalizes hosted Postgres URLs by removing sslrootcert=system; the service DATABASE_URL is not rewritten.

Playback Readiness Gate

Run the synthetic playback readiness gate before and after production deploys that can affect upload ingest, media processing, playback bootstrap, edge cache behavior, telemetry, or deploy routing. The gate uploads generated test media only; it does not use customer media.

Local two-edge run:

bun run playback:readiness

The default target starts the local Docker stack plus the two-edge profile, then verifies rend-edge-us-east on http://127.0.0.1:4101 and rend-edge-london on http://127.0.0.1:4102.

Production-style run:

REND_API_BASE_URL=https://api.rend.so \
REND_READINESS_API_KEY=<api-key-with-upload-read-delete-analytics> \
REND_EDGE_INTERNAL_TOKEN=<edge-internal-token> \
REND_READINESS_EDGES='edge-us=us-east=https://edge-us.example.com=http://10.0.10.12:4100,edge-eu=london=https://edge-eu.example.com=http://10.0.20.12:4100' \
bun run playback:readiness -- --target configured --skip-local-stack

REND_READINESS_EDGES uses edge_id=region=public_playback_base[=private_edge_base]. The public base is used for signed playback fetches; the private base is used for /readyz, /internal/warm, /internal/purge, and /metrics. If the private base is omitted, the public base is used for both.

To include the gate in first-host verification:

scripts/verify-first-host-deploy.sh \
  --api-base https://api.rend.so \
  --edge-base https://edge-us-east.example.com \
  --edge-internal-base http://10.0.10.12:4100 \
  --edge-base https://edge-london.example.com \
  --edge-internal-base http://10.0.20.12:4100 \
  --api-env /etc/rend/rend-api.env \
  --edge-env /etc/rend/rend-edge.env \
  --asset-id 00000000-0000-0000-0000-000000000000 \
  --rewrite-playback-base \
  --run-readiness-gate

The gate writes a run artifact and updates .rend/readiness/playback-readiness-latest.json for the private operator UI. Set REND_READINESS_OUTPUT, REND_READINESS_LATEST_OUTPUT, or REND_READINESS_ARTIFACT_PATH to place or read the latest result elsewhere. Artifacts are redacted and are checked before write: they must not contain full URLs, cookies, signed URL query tokens, authorization headers, bearer tokens, configured API keys, edge internal tokens, or client IPs.

The result status means:

pass: correctness checks passed and all measured timings stayed under warn thresholds.
warn: correctness checks passed, but one or more conservative performance warn thresholds were exceeded. Treat this as a deploy note unless the trend is regressing.
fail: a correctness/safety check failed or a fail threshold was exceeded. Do not promote the deploy until the artifact's failures list is resolved.

Correctness failures include missing expected edges, non-200 upload/bootstrap/ playback responses, non-tokenless playback URL shape, wrong content types, unexpected cache headers, telemetry visibility timeout, dropped telemetry increase, nonzero telemetry spool bytes after the run, unredacted artifact content, or synthetic cleanup failure.

Performance thresholds can be configured with env vars:

REND_READINESS_WARN_UPLOAD_RESPONSE_MS, REND_READINESS_FAIL_UPLOAD_RESPONSE_MS
REND_READINESS_WARN_UPLOAD_TO_OPENER_PLAYABLE_MS, REND_READINESS_FAIL_UPLOAD_TO_OPENER_PLAYABLE_MS
REND_READINESS_WARN_UPLOAD_TO_HLS_READY_MS, REND_READINESS_FAIL_UPLOAD_TO_HLS_READY_MS
REND_READINESS_WARN_PLAYBACK_BOOTSTRAP_MS, REND_READINESS_FAIL_PLAYBACK_BOOTSTRAP_MS
REND_READINESS_WARN_EDGE_TTFB_MISS_MS, REND_READINESS_FAIL_EDGE_TTFB_MISS_MS
REND_READINESS_WARN_EDGE_TTFB_HIT_MS, REND_READINESS_FAIL_EDGE_TTFB_HIT_MS
REND_READINESS_WARN_EDGE_TTFB_WARMED_HIT_MS, REND_READINESS_FAIL_EDGE_TTFB_WARMED_HIT_MS
REND_READINESS_WARN_TELEMETRY_VISIBILITY_MS, REND_READINESS_FAIL_TELEMETRY_VISIBILITY_MS

The bytes-per-delivered-minute value is a proxy from synthetic playback bytes and fixture duration. It is useful for deploy comparison, not billing-grade usage or watch accounting.

Deploy Order

Provision managed Postgres, Redis, S3-compatible storage, and ClickHouse.
Apply or confirm ClickHouse schema.
From a clean git worktree, build and optionally push images with bun run release:images -- --tag production-001 --registry <registry-prefix> --platform linux/amd64 --push. Pushed releases require the git SHA to be reachable from a pushed branch or tag and copy the accepted manifest to docs/releases/.
Copy production-style compose files, real env files, and the release manifest to the target hosts.
Run scripts/validate-production-env.sh and the relevant preflight script on each host.
Run the deploy helper with --dry-run, then run it without --dry-run.
Deploy rend-api with REND_API_AUTO_MIGRATE=true for the migration step.
Start rend-api serving traffic after /readyz passes.
Start rend-edge nodes with unique REND_EDGE_ID, REND_EDGE_REGION, API-reachable REND_EDGE_BASE_URL, cache volume, and telemetry spool volume.
Start rend-media-worker with REND_API_AUTO_MIGRATE=false.
Run scripts/verify-first-host-deploy.sh with a provided hls_ready asset to confirm edge registration, signed playback, and telemetry analytics.
Run bun run playback:readiness -- --target configured --skip-local-stack or pass --run-readiness-gate to the verifier before promoting traffic.

Rollback Basics

Roll back services in dependency order from the edge inward:

Roll back rend-edge first if playback cache behavior regresses.
Roll back rend-media-worker if artifact generation or warming regresses.
Roll back rend-api last. Treat Postgres migrations as forward-only unless a tested rollback migration exists.

Edge cache can be purged or discarded during rollback. Telemetry spool files can be retained for replay or deleted if the ingest contract changed incompatibly.

Edge Region Config

US East and London edge nodes differ only by environment and attached volumes:

REND_EDGE_ID
REND_EDGE_REGION
REND_EDGE_BASE_URL
host port or load balancer target
local cache volume
local telemetry spool volume

The same rend-edge image and command run in both regions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rend Playback Edge V1 Deployment

Service Topology

Browser Media Path

Local Docker Topology

Production Topology

Required Env Vars

Volumes

Healthchecks

Bootstrap And Migrations

Operator Harness

Playback Readiness Gate

Deploy Order

Rollback Basics

Edge Region Config

FilesExpand file tree

deployment-v1.md

Latest commit

History

deployment-v1.md

File metadata and controls

Rend Playback Edge V1 Deployment

Service Topology

Browser Media Path

Local Docker Topology

Production Topology

Required Env Vars

Volumes

Healthchecks

Bootstrap And Migrations

Operator Harness

Playback Readiness Gate

Deploy Order

Rollback Basics

Edge Region Config