A robust, concurrent Load Balancer written in Go, designed to distribute traffic across multiple backend services using the Least-Connections algorithm (aware of backend load). This project demonstrates advanced Go concepts such as Goroutines, Atomic Operations, Mutexes for thread safety, and System Architecture.
The system sits in front of a pool of backend servers. It performs active health checks in the background to ensure traffic is only routed to healthy instances.
graph TD
Client(Client Requests) -->|HTTP| LB[Go Load Balancer :3030]
subgraph "Server Pool (Least Connections)"
LB -->|Route| App1[Backend 1]
LB -->|Route| App2[Backend 2]
LB -.->|❌ Detected Down| App3[Backend 3]
end
HC[Health Checker Worker] -.->|HTTP GET every 20s| App1
HC -.->|HTTP GET every 20s| App2
HC -.->|HTTP GET every 20s| App3
style App3 fill:#ffcccc,stroke:#ff0000
style LB fill:#d4edfc,stroke:#0052cc,stroke-width:2px
- ⚡ Least-Connections Selection: Traffic is routed to the backend with the fewest active connections (reduces latency imbalance).
- 🛡️ Active Health Checks: A background worker (Goroutine) pings backends periodically via HTTP. If a server fails (non-2xx status), it is automatically removed from the rotation.
- 🔒 Thread-Safe Design: Uses sync.RWMutex to manage concurrent reads/writes to the server pool status.
- 🚀 Atomic Operations: Uses sync/atomic for the request counter to avoid locking bottlenecks in the hot path.
- 🐳 Docker Native: Fully containerized with a Multi-Stage Build (Alpine based) for a lightweight production image.
- 📊 Real-time Stats: Exposes a /stats endpoint providing live metrics (uptime, memory usage) for each backend.
The easiest way to run the project is using Docker Compose. It will spin up the Load Balancer and 3 dummy backend services (traefik/whoami) to simulate a cluster.
- Docker & Docker Compose
- Go 1.22+ (optional, for local dev)
# 1. Clone the repository
git clone https://github.com/P4ST4S/go-load-balancer.git
cd go-load-balancer
# 2. Start the infrastructure
docker-compose up --build
The Load Balancer will start on http://localhost:3030.
Open a terminal and send multiple requests. You will see the response coming from different containers (observe the Hostname field):
curl http://localhost:3030
# Output: Hostname: f33e964c5fe9 (Server 1)
curl http://localhost:3030
# Output: Hostname: a82b12c5558d (Server 2)
Simulate a server crash by stopping one of the backend containers:
docker stop app2
Watch the Load Balancer logs. Within roughly one health-check interval (20 seconds), you will see:
Status change: http://app2:80 [down]
Now run curl again. Traffic is no longer routed to the stopped server.
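The health rule described above (any non-2xx status, or a failed request, marks a backend as down) can be sketched as follows. This is a minimal illustration, not the project's actual code: the names isHealthyStatus and checkBackend, the 2-second timeout, and the in-process test servers are all assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// isHealthyStatus reports whether an HTTP status code counts as healthy (2xx).
// Hypothetical helper name, used only for this sketch.
func isHealthyStatus(code int) bool {
	return code >= 200 && code < 300
}

// checkBackend issues one GET and applies the 2xx rule.
// Network errors and timeouts are treated as "down".
func checkBackend(client *http.Client, url string) bool {
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return isHealthyStatus(resp.StatusCode)
}

func main() {
	// Two in-process servers stand in for a healthy and a failing backend.
	up := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer up.Close()
	down := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	}))
	defer down.Close()

	// The timeout value is an assumption; it keeps a hung backend from stalling the check.
	client := &http.Client{Timeout: 2 * time.Second}
	fmt.Println("up:", checkBackend(client, up.URL))
	fmt.Println("down:", checkBackend(client, down.URL))
}
```

In a real checker, a call like checkBackend would typically run inside a time.Ticker loop in its own goroutine, once per interval for each backend.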
You can monitor the health and resource usage of your backends in real-time:
curl http://localhost:3030/stats | jq
Output:
[
{
"url": "http://app1:80",
"alive": true,
"uptime": "00h:05m:23s",
"memory_usage": "1.2 MB"
},
...
]
To validate the Least-Connections behavior, the backend servers expose a /sleep endpoint that waits 5 seconds before replying.
You can use the provided Python script to visualize the traffic distribution in real-time:
# from project root
docker-compose up --build
python3 scripts/visual_test.py
Benchmark:
The output demonstrates that the Load Balancer favors backends with fewer active connections. In this example, app2 is busy with slow requests (High Active Conns), so the LB routes the majority of new traffic to app1 and app3.
=== Load Balancer Distribution (Least Connections) ===
Backend http://app1:80 [Active Conns: 0] | Fast Req: 125 | Slow Req: 8
Backend http://app2:80 [Active Conns: 5] | Fast Req: 22 | Slow Req: 15 <-- Busy, receiving less traffic
Backend http://app3:80 [Active Conns: 1] | Fast Req: 118 | Slow Req: 9
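The selection behind this distribution can be sketched as a linear scan for the live backend with the fewest active connections. The Backend struct and pickLeastConnections below are hypothetical names for illustration and may differ from the project's real types. The sample pool reuses the connection counts from the output above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Backend is a simplified, hypothetical pool entry.
// Alive is a plain bool here to keep the sketch short; a real pool would guard it.
type Backend struct {
	URL         string
	Alive       bool
	ActiveConns int64 // read and written atomically
}

// pickLeastConnections returns the alive backend with the fewest active
// connections, or nil when no backend is alive. Ties go to the earliest entry.
func pickLeastConnections(backends []*Backend) *Backend {
	var best *Backend
	var bestConns int64
	for _, b := range backends {
		if !b.Alive {
			continue
		}
		n := atomic.LoadInt64(&b.ActiveConns)
		if best == nil || n < bestConns {
			best, bestConns = b, n
		}
	}
	return best
}

func main() {
	pool := []*Backend{
		{URL: "http://app1:80", Alive: true, ActiveConns: 0},
		{URL: "http://app2:80", Alive: true, ActiveConns: 5},
		{URL: "http://app3:80", Alive: true, ActiveConns: 1},
	}
	b := pickLeastConnections(pool)
	// The counter is incremented while the request is in flight
	// and decremented again when the proxied request finishes.
	atomic.AddInt64(&b.ActiveConns, 1)
	fmt.Println(b.URL) // prints http://app1:80
	atomic.AddInt64(&b.ActiveConns, -1)
}
```

A nil result is the case where a proxy would normally answer with 503 Service Unavailable.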
To handle high throughput, the ServerPool uses a Race-Condition Free design:
- Reads (IsAlive): Protected by RWMutex.RLock(), allowing multiple concurrent readers.
- Writes (SetAlive): Protected by RWMutex.Lock(), ensuring exclusive access during health updates.
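A minimal sketch of this read/write split is shown below. IsAlive and SetAlive are the method names used above. The Backend struct layout and its field names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend holds the health flag behind a read/write mutex.
// The field names are hypothetical.
type Backend struct {
	URL   string
	mu    sync.RWMutex
	alive bool
}

// IsAlive takes the read lock, so many request goroutines can check
// the flag at the same time without blocking each other.
func (b *Backend) IsAlive() bool {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.alive
}

// SetAlive takes the write lock, which excludes all readers and other
// writers while the health checker updates the flag.
func (b *Backend) SetAlive(alive bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.alive = alive
}

func main() {
	b := &Backend{URL: "http://app1:80"}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = b.IsAlive() // concurrent readers
		}()
	}
	b.SetAlive(true) // a single writer, as the health checker would be
	wg.Wait()

	fmt.Println(b.IsAlive()) // prints true
}
```

Running a program like this with go run -race is a quick way to confirm that the locking covers every access.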
For the shared request counter (the Round-Robin index), I chose atomic.AddUint64 instead of a standard Mutex.
Why? A mutex serializes every caller and can put goroutines to sleep when contended. In a high-load scenario (10k req/sec), locking the counter for every request creates a bottleneck. An atomic add is lock-free: it never blocks and is typically much cheaper than a lock/unlock pair under contention.
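The idea can be sketched as follows. The helper name nextIndex and the modulo mapping onto a backend index are assumptions for illustration. Only the use of atomic.AddUint64 comes from the text above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// nextIndex atomically increments the shared counter and maps the previous
// value onto [0, n). It is safe to call from many goroutines without a lock.
// n must be greater than zero.
func nextIndex(counter *uint64, n int) int {
	return int((atomic.AddUint64(counter, 1) - 1) % uint64(n))
}

func main() {
	var counter uint64

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			nextIndex(&counter, 3)
		}()
	}
	wg.Wait()

	// No increment is lost, even though no mutex is involved.
	fmt.Println(atomic.LoadUint64(&counter)) // prints 1000
}
```

With a plain counter++ the final value would usually fall short of 1000, and the race detector would flag the access.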
To prevent goroutine leaks when fetching statistics from potentially slow backends, I implemented a Worker Pool pattern.
- A fixed number of workers (3) consume update tasks from a buffered channel.
- If the channel is full (backpressure), new updates are skipped until workers are available.
- This ensures the main health check loop is never blocked by slow network calls.
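The three points above can be sketched with a buffered channel and a non-blocking send. The names tryEnqueue and startWorkers, the buffer size, and the task type (a URL string) are assumptions for the example. The worker count of 3 and the skip-when-full behavior come from the description.

```go
package main

import (
	"fmt"
	"sync"
)

// tryEnqueue attempts a non-blocking send. It returns false when the buffer
// is full, so the caller skips this update instead of waiting.
func tryEnqueue(tasks chan<- string, url string) bool {
	select {
	case tasks <- url:
		return true
	default:
		return false
	}
}

// startWorkers launches n goroutines that call fetch for every task
// until the channel is closed. The WaitGroup lets the caller wait for them.
func startWorkers(n int, tasks <-chan string, fetch func(string)) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for url := range tasks {
				fetch(url)
			}
		}()
	}
	return &wg
}

func main() {
	tasks := make(chan string, 3) // buffer size is an assumption

	var mu sync.Mutex
	processed := 0
	wg := startWorkers(3, tasks, func(url string) {
		// A real worker would fetch stats from the backend here.
		mu.Lock()
		processed++
		mu.Unlock()
	})

	skipped := 0
	for _, u := range []string{"http://app1:80", "http://app2:80", "http://app3:80"} {
		if !tryEnqueue(tasks, u) {
			skipped++ // backpressure: drop this update, never block the loop
		}
	}

	close(tasks)
	wg.Wait()
	fmt.Println("processed:", processed, "skipped:", skipped)
}
```

Because the number of goroutines is fixed, a slow backend can occupy at most one worker at a time, and the producer side never accumulates blocked goroutines.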
The project includes a comprehensive suite of unit tests covering both the core logic and the HTTP handlers, ensuring reliability and thread safety.
To run the tests:
go test -v ./...
Coverage Highlights:
- Core Logic: Validates atomic operations, concurrency safety (Race Detector), and the Least-Connections algorithm.
- HTTP Handlers: Mocks backend servers to verify routing, error handling (503), and health checks.
- Robustness: Tests edge cases like network errors, JSON parsing failures, and worker pool backpressure.
Made with ❤️ and Go.