A production-grade, 5-service distributed URL shortening and redirection system: solo-architected, horizontally scaled across 26 live replicas, load-tested to 17,843 RPS under 10,000 concurrent virtual users, and deployed at rdrt.dev.
Live endpoints:
- Frontend: https://rdrt.dev
- API: https://api.rdrt.dev
- System Architecture (HLD & LLD)
- Why This Is Hard
- Production Deployment
- Benchmark Results
- Tech Decision Rationale
- Service Breakdown & Repo Structure
- Local Setup
- Performance Profiling
- What I'd Do Next
The system relies on platform-native Layer 7 load balancing to route traffic to the API Gateway. The architecture intentionally isolates the write-heavy URL creation path from the read-heavy, latency-sensitive redirect path.
graph TD
Client([Client]) -->|HTTPS| LB[Railway Native Ingress L7 LB]
subgraph "Ingress & Gateway"
UI[url-frontend]
API[api-gateway x6]
end
LB -->|rdrt.dev| UI
LB -->|api.rdrt.dev| API
subgraph "Microservices"
Auth[auth-service x2]
URL[url-service x3]
Redirect[redirect-service x12]
Analytics[analytics-service x3]
end
API -->|Route| Auth
API -->|Route| URL
API -->|Route| Redirect
URL -->|Write| PG_Primary[(PostgreSQL Primary)]
Auth -->|Read/Write| PG_Primary
Redirect -->|1. Cache Check O 1| Redis[(Redis Shared Cache)]
Redirect -->|2. Fallback Read| PG_Replica[(PostgreSQL Read Replica)]
Redirect -.->|3. Async Event| Kafka[[Apache Kafka Stream]]
Kafka -.->|Consume| Analytics
Analytics -->|Batch Write| PG_Primary
To hold median latency at 111ms under 10,000 concurrent virtual users, the redirect sequence keeps analytics and database writes off the HTTP response path.
sequenceDiagram
participant C as Client
participant G as API Gateway
participant R as Redirect Service
participant Cache as Redis
participant DB as Postgres Replica
participant K as Kafka
C->>G: GET /r/{shortcode}
G->>R: Forward Request
rect rgb(30, 40, 50)
Note over R,DB: Critical Latency Path
R->>Cache: GET {shortcode}
alt Cache Hit (99.9%)
Cache-->>R: Return original_url
else Cache Miss (0.1%)
R->>DB: SELECT original_url FROM urls
DB-->>R: Return original_url
R->>Cache: SET {shortcode} (Background Goroutine)
end
end
R->>K: Publish ClickEvent (Async / Fire & Forget)
R-->>G: HTTP 302 Found (Location: original_url)
G-->>C: Redirect to Destination
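The sequence above can be sketched in Go as a cache-aside lookup behind an HTTP handler. This is a minimal illustration, not the service's actual code: `Lookup`, `MapStore`, `Resolver`, and `probe` are hypothetical names, and in-memory maps stand in for Redis and the Postgres replica so the example runs on its own.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// ErrNotFound is returned when a short code is unknown to a backend.
var ErrNotFound = errors.New("short code not found")

// Lookup is a hypothetical read interface; the Postgres replica would sit behind it.
type Lookup interface {
	Get(ctx context.Context, code string) (string, error)
}

// MapStore is an in-memory stand-in for Redis or Postgres (assumption for the sketch).
type MapStore struct {
	mu sync.RWMutex
	m  map[string]string
}

func NewMapStore() *MapStore { return &MapStore{m: map[string]string{}} }

func (s *MapStore) Get(_ context.Context, code string) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if u, ok := s.m[code]; ok {
		return u, nil
	}
	return "", ErrNotFound
}

func (s *MapStore) Set(code, url string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[code] = url
}

// Resolver checks the cache first and falls back to the database on a miss.
type Resolver struct {
	Cache   *MapStore
	DB      Lookup
	Publish func(code string) // click-event hook; must not block
}

// Resolve returns the original URL and whether it was served from the cache.
func (r *Resolver) Resolve(ctx context.Context, code string) (string, bool, error) {
	if u, err := r.Cache.Get(ctx, code); err == nil {
		return u, true, nil
	}
	// Any cache error (miss or outage) falls through to the database.
	u, err := r.DB.Get(ctx, code)
	if err != nil {
		return "", false, err
	}
	go r.Cache.Set(code, u) // repopulate in the background, off the response path
	return u, false, nil
}

func (r *Resolver) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	code := strings.TrimPrefix(req.URL.Path, "/r/")
	url, _, err := r.Resolve(req.Context(), code)
	if errors.Is(err, ErrNotFound) {
		http.NotFound(w, req)
		return
	}
	if err != nil {
		http.Error(w, "lookup failed", http.StatusInternalServerError)
		return
	}
	if r.Publish != nil {
		r.Publish(code)
	}
	http.Redirect(w, req, url, http.StatusFound)
}

// probe sends one in-process GET and returns the status code and Location header.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	db := NewMapStore()
	db.Set("abc123", "https://example.com/landing")
	r := &Resolver{Cache: NewMapStore(), DB: db, Publish: func(string) {}}
	status, location := probe(r, "/r/abc123")
	fmt.Println(status, location)
}
```

The first request for a code takes the fallback path. Later requests hit the cache once the background `Set` has completed.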
URL redirection sounds trivial: receive a short code, look it up, return 302. At 10 RPS, it is. At 17,843 RPS with 10,000 concurrent connections, the engineering surface explodes:
- The redirect hot-path is brutally latency-sensitive. Every millisecond of added overhead is multiplied across thousands of simultaneous connections. A naïve implementation (synchronous DB write per redirect plus in-band analytics) collapses under load because PostgreSQL cannot sustain 17K+ IOPS at sub-150ms while also accepting analytics writes.
- Shared cache invalidation across replicas is non-trivial. The `redirect-service` runs across 12 replicas. In-process caching breaks horizontal scaling: if Replica 1 caches a URL mapping, Replica 2 still has a cold cache and goes to the DB. Redis solves cross-replica cache coherence, but getting it wrong means a 12× DB read amplification on every cache miss.
- Analytics cannot block the critical path. If a click-analytics write takes 50ms and is synchronous with the redirect response, that 50ms is added to every redirect. Decoupling requires an event bus (Kafka), which introduces at-least-once delivery semantics.
- Nginx hop elimination was counterintuitive. Conventionally, Nginx sits in front of Go services as a reverse proxy. Under `pprof` profiling, each intra-cluster Nginx hop added context-switch overhead per request. Removing it (`Railway LB → Go app` instead of `Railway LB → Nginx → Go app`) cut per-request switching cost by 25%.
The system is live on Railway, heavily weighted toward the redirect read-path:
(Screenshot of live Railway deployment showing active replicas and DB instances)
| Component | Replicas | Responsibility |
|---|---|---|
| API Gateway | 6 active | api.rdrt.dev routing, rate limiting |
| Redirect Service | 12 active | The Hot Path. Redis lookup → PG fallback → Kafka publish → 302 |
| URL Service | 3 active | URL creation, ownership, PostgreSQL primary writes |
| Analytics Service | 3 active | Kafka consumer, click aggregation, stats |
| Auth Service | 2 active | JWT issuance, user management |
| Frontend | 1 active | rdrt.dev web UI |
Data Layer Isolation:
- PostgreSQL Primary: Handles all writes from URL creation, Auth, and batched Analytics.
- PostgreSQL Read Replica (`Postgres-7Ev1`): Dedicated exclusively to the `redirect-service` fallback reads. Zero write pressure.
- Redis: Shared global cache across all 12 redirect replicas.
All tests were run with k6 from a local MacBook Pro against the live api.rdrt.dev deployment over the public internet.
Raw k6 output, 10,000 max VUs:
k6 run redirect_load_test.js
scenarios: 10,000 max VUs, 4m0s, 4 stages
✓ is redirect (301/302/307/308)
✓ has location header
http_req_duration: avg=290ms med=111ms p(90)=333ms p(95)=1.19s max=8.41s
http_req_failed: 0.00% (2 i/o timeouts / 4,284,206 requests)
http_reqs: 4,284,206 total @ 17,843 RPS
Success rate: 99.9999%
Raw k6 output, 3,000 max VUs:
k6 run load-test.js
scenarios: 3,000 max VUs, 3m30s, 6 stages
✓ status is 302
http_req_duration: avg=119ms med=113ms p(90)=138ms p(95)=157ms max=614ms
http_req_failed: 0.00% (1 failure / 2,315,167 requests)
p(95) = 157ms ✓ (threshold: 200ms, PASSED)
| Concurrent VUs | Total Requests | RPS | p(50) | p(95) | Failures |
|---|---|---|---|---|---|
| 400 | 36,058 | 171 | 288ms | 335ms | 0 |
| 1,000 | 520,805 | 2,168 | 276ms | 341ms | 0 |
| 3,000 | 2,315,167 | 11,019 | 113ms | 157ms | 1 |
| 10,000 | 4,284,206 | 17,843 | 111ms | 1.19s | 2 |
| 5,000 (analytics) | 2,011,389 | 8,378 | 105ms | 886ms | 2 |
On the p(95) rise at 10K VUs: p(95) climbs to ~1.19s at 10,000 concurrent users. This is attributed to Railway's free-tier TCP connection ceiling, not Go application saturation: p(50) stays at 111ms even at peak, which indicates the Go runtime is not the bottleneck.
On the RPS jump from 2,168 to 11,019: This reflects Redis cache warming. Once the hot URL working set is fully cached in the Redis instance shared by the 12 redirect replicas, requests no longer reach PostgreSQL and every lookup is a single O(1) Redis GET.
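As a sanity check on the table, average throughput should be roughly total requests divided by test duration. The short Go sketch below recomputes it from the k6 summaries, assuming the configured scenario lengths (4m0s and 3m30s) as the durations. The helper names `avgRPS` and `successPct` are illustrative.

```go
package main

import "fmt"

// avgRPS returns mean throughput: total requests divided by wall-clock seconds.
func avgRPS(totalRequests int, seconds float64) float64 {
	return float64(totalRequests) / seconds
}

// successPct returns the percentage of requests that did not fail.
func successPct(total, failed int) float64 {
	return 100 * float64(total-failed) / float64(total)
}

func main() {
	// Request and failure counts are copied from the k6 output above.
	// Durations are the configured scenario lengths, not measured run times.
	fmt.Printf("10K VUs: %.0f RPS, %.5f%% success\n",
		avgRPS(4284206, 240), successPct(4284206, 2))
	fmt.Printf("3K VUs:  %.0f RPS, %.5f%% success\n",
		avgRPS(2315167, 210), successPct(2315167, 1))
}
```

Both computed rates land within about 0.1% of the reported 17,843 and 11,019 RPS. The small gap is what you would expect if the actual runs lasted slightly longer than the configured stages.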
| Decision | Chosen | Rejected | Why |
|---|---|---|---|
| Caching | Redis | In-memory map | Shared state across 12 replicas. Prevents 12x DB read amplification on cache misses. |
| Analytics pipeline | Kafka | Direct DB write | Eliminates I/O blocking on the redirect hot-path. Ensures 302 response is decoupled from click tracking. |
| Load balancing | Platform L7 LB | Nginx | Removes an unnecessary network hop and context switch inside the container. |
| Database | PG Primary + Replica | MongoDB / Single PG | Primary handles URL writes. Read Replica takes 100% of the redirect fallback reads, eliminating lock contention. |
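To illustrate the "Analytics pipeline" row, here is one way a redirect handler can hand off click events without ever waiting on the broker. It is a sketch under assumptions, not the repository's code: `ClickEvent` and `AsyncPublisher` are hypothetical, and the `sink` callback stands in for a real Kafka producer write. Dropping events when the buffer is full is a design choice made for this example only.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// ClickEvent is a hypothetical payload; the real schema is not shown in this README.
type ClickEvent struct {
	ShortCode string
}

// AsyncPublisher buffers events in a channel and forwards them to sink
// from a single background goroutine.
type AsyncPublisher struct {
	events  chan ClickEvent
	dropped atomic.Int64
	wg      sync.WaitGroup
}

// NewAsyncPublisher starts the background forwarder. sink stands in for a
// Kafka producer call (assumption).
func NewAsyncPublisher(buffer int, sink func(ClickEvent)) *AsyncPublisher {
	p := &AsyncPublisher{events: make(chan ClickEvent, buffer)}
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		for ev := range p.events {
			sink(ev)
		}
	}()
	return p
}

// Publish never blocks the caller. It reports false and counts a drop
// when the buffer is full.
func (p *AsyncPublisher) Publish(ev ClickEvent) bool {
	select {
	case p.events <- ev:
		return true
	default:
		p.dropped.Add(1)
		return false
	}
}

// Dropped returns how many events were discarded because the buffer was full.
func (p *AsyncPublisher) Dropped() int64 { return p.dropped.Load() }

// Close stops intake and waits until buffered events have reached sink.
// Publish must not be called after Close.
func (p *AsyncPublisher) Close() {
	close(p.events)
	p.wg.Wait()
}

func main() {
	p := NewAsyncPublisher(1024, func(ev ClickEvent) {
		fmt.Println("click:", ev.ShortCode)
	})
	p.Publish(ClickEvent{ShortCode: "abc123"})
	p.Close()
	fmt.Println("dropped:", p.Dropped())
}
```

The trade-off is explicit: a slow or unavailable broker costs analytics completeness rather than redirect latency, and the drop counter makes that loss observable.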
redirection-engine/
├── services/
│   ├── analytics-service/   # Kafka consumer, click aggregation, stats
│   ├── api-gateway/         # API Gateway: routing, rate-limiting
│   ├── auth-service/        # Auth Service: JWT, user management
│   ├── redirect-service/    # Redirect Service: hot path, Redis, Kafka producer
│   └── url-service/         # URL Service: URL creation, DB writes
├── pkg/
│   ├── cache/               # Redis client abstraction
│   ├── db/                  # PostgreSQL connection pool
│   ├── kafka/               # Producer/consumer helpers
│   └── middleware/          # Shared HTTP middleware
├── docker-compose.yaml
├── go.mod
└── go.sum
Prerequisites: Docker & Docker Compose, Go 1.22+
git clone https://github.com/ruthwikkakumani/redirection-engine
cd redirection-engine
docker compose up --build

| Service | Local Port |
|---|---|
| API Gateway | 8080 |
| Auth Service | 8081 |
| Redirect Service | 8082 |
| Analytics Service | 8083 |
| URL Service | 8084 |
# 1. Register and get a token
TOKEN=$(curl -s -X POST http://localhost:8080/api/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"you@example.com","password":"secret"}' | jq -r '.token')
# 2. Shorten a URL
curl -X POST http://localhost:8080/api/urls \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"original_url":"https://google.com"}'
# 3. Follow the redirect (check Location header)
curl -I http://localhost:8080/r/<shortcode>

# CPU profile: capture 30s under load
curl "http://localhost:8082/debug/pprof/profile?seconds=30" > cpu.prof
go tool pprof cpu.prof
# Goroutine snapshot: check for leaks
curl "http://localhost:8082/debug/pprof/goroutine?debug=2"

- No goroutine leaks: goroutine count stabilizes under 10K VU load.
- Structured logging: JSON logs with `request_id`, `service`, `short_code`, `cache_hit`, `latency_ms` on every request for distributed tracing.
- Graceful shutdown: OS signal listener triggers server drain with 30s timeout; in-flight requests complete before the process exits.
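The structured-logging and graceful-shutdown behaviour described above can be sketched with the standard library alone (`log/slog` and `http.Server.Shutdown`). The code below is an assumption-laden illustration, not the service's implementation: `withRequestLog`, `serve`, and `runDemo` are hypothetical names, the `X-Request-ID` header is an assumed source for `request_id`, and only a subset of the listed log fields is emitted. So that the example terminates, `runDemo` calls `stop()` itself where a deployed service would wait for SIGTERM.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"log/slog"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// withRequestLog writes one JSON log line per request.
func withRequestLog(log *slog.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		log.Info("request",
			slog.String("request_id", r.Header.Get("X-Request-ID")),
			slog.String("path", r.URL.Path),
			slog.Int64("latency_ms", time.Since(start).Milliseconds()),
		)
	})
}

// serve accepts connections on ln until ctx is cancelled, then stops accepting
// and waits up to drain for in-flight requests to finish.
func serve(ctx context.Context, srv *http.Server, ln net.Listener, drain time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()
	select {
	case err := <-errCh:
		return err // the server failed before shutdown was requested
	case <-ctx.Done():
	}
	shutdownCtx, cancel := context.WithTimeout(context.Background(), drain)
	defer cancel()
	return srv.Shutdown(shutdownCtx)
}

// runDemo boots the server on an ephemeral port, sends one request, triggers
// shutdown, and returns the captured JSON log output.
func runDemo() (string, error) {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	var logs bytes.Buffer
	log := slog.New(slog.NewJSONHandler(&logs, nil)).With("service", "redirect-service")

	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: withRequestLog(log, mux)}
	done := make(chan error, 1)
	go func() { done <- serve(ctx, srv, ln, 30*time.Second) }()

	resp, err := http.Get("http://" + ln.Addr().String() + "/healthz")
	if err == nil {
		resp.Body.Close()
	}
	stop() // stand-in for SIGTERM: cancels ctx exactly as a signal would
	if derr := <-done; derr != nil && !errors.Is(derr, http.ErrServerClosed) {
		return "", derr
	}
	return logs.String(), err
}

func main() {
	out, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

`Shutdown` closes the listener first and then waits for active connections to go idle, which is what lets in-flight redirects complete within the 30s window.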
- OpenTelemetry traces: End-to-end spans from API Gateway through Kafka to Analytics, exported to Jaeger or Tempo.
- Circuit breaker: `sony/gobreaker` on Redis and PostgreSQL clients; fail-fast on dependency degradation rather than piling up goroutines waiting on a dead connection.
- GitHub Actions CI: `go test -race ./...` + `golangci-lint` + Docker build verification on every PR.
- Canary deployments: Progressive rollout (5% → 25% → 100%) with automated rollback triggered by p99 spike detection.
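For the circuit-breaker item, the fail-fast idea can be shown with a small hand-rolled breaker. This sketch deliberately does not use or imitate the `sony/gobreaker` API. `Breaker`, `NewBreaker`, `ErrOpen`, and `errBackendDown` are hypothetical names, and the half-open handling is simplified: any caller may probe once the cooldown has elapsed.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrOpen is returned without calling the dependency while the circuit is open.
var ErrOpen = errors.New("circuit open: failing fast")

// errBackendDown simulates a dead Redis or PostgreSQL connection in the demo.
var errBackendDown = errors.New("backend down")

// Breaker opens after maxFailures consecutive errors and stays open for cooldown.
type Breaker struct {
	mu          sync.Mutex
	maxFailures int
	cooldown    time.Duration
	failures    int
	openedAt    time.Time
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the circuit is open. A success resets the failure count.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	open := b.failures >= b.maxFailures && time.Since(b.openedAt) < b.cooldown
	b.mu.Unlock()
	if open {
		return ErrOpen
	}

	err := fn() // run the dependency call outside the lock

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.maxFailures {
			b.openedAt = time.Now() // open, or re-open after a failed probe
		}
		return err
	}
	b.failures = 0
	return nil
}

func main() {
	b := NewBreaker(3, 5*time.Second)
	for i := 1; i <= 5; i++ {
		err := b.Call(func() error { return errBackendDown })
		fmt.Printf("call %d: %v\n", i, err)
	}
}
```

After the third consecutive failure, later calls return `ErrOpen` immediately instead of tying up a goroutine on the dead dependency.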
Ruthwik Kakumani, Backend & Distributed Systems
LinkedIn · GitHub · LeetCode · ruthwikkakumani@gmail.com