diff --git a/README.md b/README.md
index a217ec3..28df19f 100644
--- a/README.md
+++ b/README.md
@@ -72,6 +72,7 @@ user, err := resile.Do(ctx, func(ctx context.Context) (*User, error) {
   - [Distributed Deadline Propagation](#23-distributed-deadline-propagation)
   - [Reliable File Downloads (HTTP Resumption)](#24-reliable-file-downloads-http-resumption)
   - [SQL Resilience](#25-sql-resilience)
+  - [Redis Resilience](#26-redis-resilience)
 - [Built on Hyperscaler Research](#built-on-hyperscaler-research)
 - [Configuration Reference](#configuration-reference)
 - [Architecture & Design](#architecture--design)
@@ -111,6 +112,7 @@ Want to learn more about the philosophy behind Resile and advanced resilience pa
 * [Self-Healing State Machines: Resilient State Transitions in Go](docs/articles/self-healing-state-machines.md)
 * [Resilience Beyond Counters: Sliding Window Circuit Breakers in Go](docs/articles/sliding-window-circuit-breakers.md)
 * [Stop the Domino Effect: Bulkhead Isolation in Go](docs/articles/bulkhead-isolation.md)
+* [Reliable Redis: Combining Retries and Bulkheads for Rock-Solid Caching](docs/articles/redis-resilience-with-go.md)
 * [Prioritize Your Traffic: Priority-Aware Bulkheads in Go](docs/articles/priority-aware-bulkheads.md)
 * [Respecting Boundaries: Precise Rate Limiting in Go](docs/articles/rate-limiting.md)
 * [Beyond Static Limits: Adaptive Concurrency with TCP-Vegas in Go](docs/articles/adaptive-concurrency.md)
@@ -142,6 +144,7 @@ The [examples/](examples/) directory contains standalone programs showing how to
 - **[Chaos Injection](examples/chaos/main.go)**: Simulating faults and latency to test your policies.
 - **[HTTP Resumption](examples/http_resume_stream/main.go)**: Resuming large file downloads using HTTP Range.
 - **[SQL Resilience](examples/sql/main.go)**: Using Resile with standard `database/sql`.
+- **[Redis Resilience](examples/redis/main.go)**: Adding resilience to Redis operations with shared bulkheads.
 
 ---
 
@@ -564,6 +567,27 @@ _, err := resile.Do(ctx, func(ctx context.Context) (sql.Result, error) {
 
 [Read more: Building Bulletproof Database Clients in Go: SQL Resilience with Resile](docs/articles/sql-resilience.md)
 
+### 26. Redis Resilience
+**The Problem**: Database connection pools (SQL or NoSQL like Redis) can be exhausted when the database slows down, leading to cascading failures.
+
+**The Recipe**:
+Combine retries for transient blips with a shared bulkhead to strictly limit the number of concurrent operations hitting the connection pool.
+
+```go
+// 1. Create a shared bulkhead matching your pool size
+redisBulkhead := resile.NewBulkhead(20)
+
+// 2. Wrap your Redis or SQL calls
+val, err := resile.Do(ctx, func(ctx context.Context) (string, error) {
+	return rdb.Get(ctx, "key").Result()
+},
+	resile.WithMaxAttempts(3),
+	resile.WithBulkheadInstance(redisBulkhead),
+)
+```
+
+[Read more: Reliable Redis: Combining Retries and Bulkheads for Rock-Solid Caching](docs/articles/redis-resilience-with-go.md)
+
 ---
 
 ## Built on Hyperscaler Research
diff --git a/docs/articles/bulkhead-isolation.md b/docs/articles/bulkhead-isolation.md
index 030ebb0..41ea1ca 100644
--- a/docs/articles/bulkhead-isolation.md
+++ b/docs/articles/bulkhead-isolation.md
@@ -79,6 +79,14 @@ If your infrastructure is highly dynamic, consider using the `AdaptiveLimiter` a
 
 ---
 
+## Practical Application: Database Connection Pools
+
+A common use case for shared bulkheads is protecting database connection pools (SQL or NoSQL like Redis). By using a bulkhead that matches your pool size, you ensure that your application never blocks indefinitely on the pool itself.
+
+[Read more: Reliable Redis: Combining Retries and Bulkheads for Rock-Solid Caching](redis-resilience-with-go.md)
+
+---
+
 ## Why "Fail-Fast" Matters
 
 When a bulkhead is full, Resile immediately returns `resile.ErrBulkheadFull`.
diff --git a/docs/articles/redis-resilience-with-go.md b/docs/articles/redis-resilience-with-go.md
new file mode 100644
index 0000000..d8e6b07
--- /dev/null
+++ b/docs/articles/redis-resilience-with-go.md
@@ -0,0 +1,126 @@
+# Reliable Redis: Combining Retries and Bulkheads for Rock-Solid Caching
+
+Redis is the bedrock of many high-performance Go applications. It's incredibly fast, but like any distributed component, it's not invincible. Network blips, Redis server restarts, or connection pool exhaustion can turn your lightning-fast cache into a source of application errors.
+
+In this article, we'll explore how to use [Resile](https://github.com/cinar/resile) to build a resilient Redis integration that handles transient failures gracefully and prevents connection pool saturation.
+
+---
+
+## The Problem: The Invisible Bottleneck
+
+Most Go developers use `go-redis` or `redigo`. While these clients are excellent, they often hide a critical bottleneck: the **Connection Pool**.
+
+When Redis slows down (e.g., during a BGSAVE or a complex `KEYS *` command), your Go application continues to spawn goroutines that attempt to acquire a connection from the pool. If the pool is exhausted, your goroutines block, waiting for a connection. This leads to:
+1. **Increased Latency**: Every call starts waiting for the pool.
+2. **Resource Leaks**: Goroutines pile up, consuming memory.
+3. **Cascading Failure**: Your application process eventually hits its limits, failing even non-Redis related tasks.
+
+---
+
+## The Solution: Layered Resilience
+
+To build a truly resilient Redis client, we need two layers of protection:
+1. **Retries**: To handle transient network blips.
+2. **Shared Bulkheads**: To strictly limit the number of concurrent operations hitting the connection pool.
+
+### Implementation with Resile
+
+Resile allows you to wrap Redis calls in a type-safe, declarative way. Here’s how you can implement a resilient GET operation:
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+	"time"
+
+	"github.com/cinar/resile"
+	"github.com/redis/go-redis/v9"
+)
+
+func main() {
+	ctx := context.Background()
+	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
+
+	// 1. Create a Shared Bulkhead.
+	// This ensures that across our ENTIRE application, we never
+	// have more than 20 concurrent Redis operations.
+	redisBulkhead := resile.NewBulkhead(20)
+
+	// 2. Define our Resilience Policy.
+	// We combine retries with the shared bulkhead.
+	opts := []resile.Option{
+		resile.WithMaxAttempts(3),
+		resile.WithBaseDelay(100 * time.Millisecond),
+		resile.WithBulkheadInstance(redisBulkhead),
+	}
+
+	// 3. Execute with Type Safety.
+	// resile.Do automatically infers the return type (string).
+	val, err := resile.Do(ctx, func(ctx context.Context) (string, error) {
+		return rdb.Get(ctx, "user:123").Result()
+	}, opts...)
+
+	if err != nil {
+		fmt.Printf("Redis operation failed: %v\n", err)
+		return
+	}
+	fmt.Printf("User: %s\n", val)
+}
+```
+
+---
+
+## Why Shared Bulkheads are Critical
+
+In the example above, `redisBulkhead` is created once and passed to `WithBulkheadInstance`.
+
+If you have 5 different services (User, Order, Catalog, etc.) all hitting the same Redis instance, you should use the **same bulkhead instance** for all of them. This creates a "global" limit for your process. If one service starts misbehaving and hammers Redis, the bulkhead will fill up and start shedding load *before* the `go-redis` connection pool is completely exhausted, keeping the rest of your application responsive.
+
+[Read more about Bulkhead Isolation](bulkhead-isolation.md)
+
+---
+
+## Type Safety with `resile.Do`
+
+One of the pain points of using generic resilience libraries in Go is losing type safety. Resile uses Go Generics (v1.18+) to ensure that `resile.Do` returns the exact type your Redis command returns.
+
+Whether you are fetching a `string`, a `struct` (via JSON), or a `map`, `resile.Do` preserves the types:
+
+```go
+// Returns (User, error) - no manual type casting required!
+user, err := resile.Do(ctx, func(ctx context.Context) (User, error) {
+	var u User
+	err := rdb.Get(ctx, "user:456").Scan(&u)
+	return u, err
+}, opts...)
+```
+
+---
+
+## Advanced: Adding a Circuit Breaker
+
+For even more protection, you can add a **Circuit Breaker**. If Redis goes down completely, the breaker will "trip" and stop all attempts for a cooldown period, preventing your application from wasting resources on retries that are guaranteed to fail.
+
+```go
+cb := resile.NewCircuitBreaker()
+
+resile.Do(ctx, action,
+	resile.WithCircuitBreakerInstance(cb),
+	resile.WithBulkheadInstance(redisBulkhead),
+	resile.WithMaxAttempts(3),
+)
+```
+
+[Learn about Sliding Window Circuit Breakers](sliding-window-circuit-breakers.md)
+
+---
+
+## Conclusion
+
+Redis is fast, but your application's resilience shouldn't rely on "hope." By combining shared bulkheads to protect your connection pool and retries to handle transient blips, you can build a Go application that remains stable even when your infrastructure isn't.
+
+**Explore Resile on GitHub:** [github.com/cinar/resile](https://github.com/cinar/resile)
+
+#golang #redis #resilience #microservices #backend #caching
diff --git a/examples/redis/README.md b/examples/redis/README.md
new file mode 100644
index 0000000..9c4373e
--- /dev/null
+++ b/examples/redis/README.md
@@ -0,0 +1,46 @@
+# Redis Resilience Example
+
+This example demonstrates how to use **Resile** to add resilience to a Redis client using the popular `github.com/redis/go-redis` package.
+
+## Features Covered
+
+1. **Retries:** Automatically retry failed Redis commands (e.g., due to transient connection issues).
+2. **Bulkhead:** Limit the number of concurrent operations to Redis to prevent overloading the database or the application.
+3. **Type Safety:** Using `resile.Do` to maintain type information for the Redis response.
+
+## Prerequisites
+
+- [Redis](https://redis.io/) server running on `localhost:6379` (optional, the example shows the pattern even if it fails to connect).
+- [Go](https://go.dev/) 1.18+
+
+## How to Run
+
+1. Initialize dependencies:
+   ```bash
+   go mod tidy
+   ```
+
+2. Run the example:
+   ```bash
+   go run main.go
+   ```
+
+## Key Pattern
+
+```go
+// Define a shared bulkhead for Redis operations.
+redisBulkhead := resile.NewBulkhead(10)
+
+// Common options.
+retryOpts := []resile.Option{
+	resile.WithMaxAttempts(3),
+	resile.WithBulkheadInstance(redisBulkhead),
+}
+
+// Execute command.
+val, err := resile.Do(ctx, func(ctx context.Context) (string, error) {
+	return rdb.Get(ctx, "key").Result()
+},
+	retryOpts...,
+)
+```
diff --git a/examples/redis/go.mod b/examples/redis/go.mod
new file mode 100644
index 0000000..98d4bfd
--- /dev/null
+++ b/examples/redis/go.mod
@@ -0,0 +1,16 @@
+module github.com/cinar/resile/examples/redis
+
+go 1.24.0
+
+replace github.com/cinar/resile => ../../
+
+require (
+	github.com/cinar/resile v0.0.0-00010101000000-000000000000
+	github.com/redis/go-redis/v9 v9.18.0
+)
+
+require (
+	github.com/cespare/xxhash/v2 v2.3.0 // indirect
+	github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
+	go.uber.org/atomic v1.11.0 // indirect
+)
diff --git a/examples/redis/go.sum b/examples/redis/go.sum
new file mode 100644
index 0000000..e25b1f4
--- /dev/null
+++ b/examples/redis/go.sum
@@ -0,0 +1,22 @@
+github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs=
+github.com/bsm/ginkgo/v2 v2.12.0/go.mod h1:SwYbGRRDovPVboqFv0tPTcG1sN61LM1Z4ARdbAV9g4c=
+github.com/bsm/gomega v1.27.10 h1:yeMWxP2pV2fG3FgAODIY8EiRE3dy0aeFYt4l7wh6yKA=
+github.com/bsm/gomega v1.27.10/go.mod h1:JyEr/xRbxbtgWNi8tIEVPUYZ5Dzef52k01W3YH0H+O0=
+github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
+github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
+github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
+github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
+github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f h1:lO4WD4F/rVNCu3HqELle0jiPLLBs70cWOduZpkS1E78=
+github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f/go.mod h1:cuUVRXasLTGF7a8hSLbxyZXjz+1KgoB3wDUb6vlszIc=
+github.com/klauspost/cpuid/v2 v2.0.9 h1:lgaqFMSdTdQYdZ04uHyN2d/eKdOMyi2YLSvlQIBFYa4=
+github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg=
+github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
+github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
+github.com/redis/go-redis/v9 v9.18.0 h1:pMkxYPkEbMPwRdenAzUNyFNrDgHx9U+DrBabWNfSRQs=
+github.com/redis/go-redis/v9 v9.18.0/go.mod h1:k3ufPphLU5YXwNTUcCRXGxUoF1fqxnhFQmscfkCoDA0=
+github.com/stretchr/testify v1.3.0 h1:TivCn/peBQ7UY8ooIcPgZFpTNSz0Q2U6UrFlUfqbe0Q=
+github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
+github.com/zeebo/xxh3 v1.0.2 h1:xZmwmqxHZA8AI603jOQ0tMqmBr9lPeFwGg6d+xy9DC0=
+github.com/zeebo/xxh3 v1.0.2/go.mod h1:5NWz9Sef7zIDm2JHfFlcQvNekmcEl9ekUZQQKCYaDcA=
+go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE=
+go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0=
diff --git a/examples/redis/main.go b/examples/redis/main.go
new file mode 100644
index 0000000..ccde710
--- /dev/null
+++ b/examples/redis/main.go
@@ -0,0 +1,68 @@
+// Copyright (c) 2026 Onur Cinar.
+// The source code is provided under MIT License.
+// https://github.com/cinar/resile
+
+package main
+
+import (
+	"context"
+	"fmt"
+	"time"
+
+	"github.com/cinar/resile"
+	"github.com/redis/go-redis/v9"
+)
+
+func main() {
+	ctx := context.Background()
+
+	// 1. Setup Redis Client
+	// For this example, we assume a local Redis server.
+	rdb := redis.NewClient(&redis.Options{
+		Addr: "localhost:6379",
+	})
+
+	fmt.Println("--- Redis Resilience Example ---")
+
+	// 2. Define shared resilience policies
+	// We create a shared bulkhead to limit total concurrent Redis operations across all calls.
+	redisBulkhead := resile.NewBulkhead(10)
+
+	// Define common retry options
+	retryOpts := []resile.Option{
+		resile.WithMaxAttempts(3),
+		resile.WithBaseDelay(100 * time.Millisecond),
+		resile.WithBulkheadInstance(redisBulkhead),
+	}
+
+	// 3. Wrap Redis Get command with Resile
+	// We combine retries (for transient network issues) and the shared bulkhead.
+	val, err := resile.Do(ctx, func(ctx context.Context) (string, error) {
+		fmt.Println("Attempting to GET 'my-key' from Redis...")
+		return rdb.Get(ctx, "my-key").Result()
+	},
+		retryOpts...,
+	)
+
+	if err != nil {
+		fmt.Printf("Redis GET failed: %v\n", err)
+	} else {
+		fmt.Printf("Redis GET succeeded: %s\n", val)
+	}
+
+	// 4. Wrap Redis Set command with Resile (error only)
+	err = resile.DoErr(ctx, func(ctx context.Context) error {
+		fmt.Println("Attempting to SET 'my-key' in Redis...")
+		return rdb.Set(ctx, "my-key", "resilient-value", 0).Err()
+	},
+		// We can reuse the same options or provide different ones.
+		// Using the same retryOpts ensures it shares the same bulkhead.
+		retryOpts...,
+	)
+
+	if err != nil {
+		fmt.Printf("Redis SET failed: %v\n", err)
+	} else {
+		fmt.Println("Redis SET succeeded")
+	}
+}