7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,12 @@
# Changelog

## 1.0.0 (2026-06-24)

### Changes
- **`caller_id` is now required and has no default: set it to a callable (for example a lambda).** The old default keyed on the `Authorization` header / session cookie, but **rotating tokens are unsafe to key on**, so that default was removed.
Configure `caller_id` with a callable that returns a stable, non-secret identifier (a user id, a JWT `sub`, an API-client id).
- When `caller_id` is unset or resolves to `nil`, de-duplication is now **skipped** for that request (it's allowed through) and a warning is logged — instead of treating all unidentified callers as one identity (which could wrongly 409 a different caller's identical request). Return a fixed string from `caller_id` to dedupe globally.

## 1.0.0.pre1 (2026-06-16)

Initial release. See the [README](README.md) for full usage and configuration.
87 changes: 69 additions & 18 deletions README.md
@@ -6,7 +6,7 @@ Automatic server-side de-duplication of inbound mutating Rails requests (POST /

When a client re-sends the same mutating request — because of a retry, a network timeout, a double-click, or a buggy client — a non-idempotent endpoint often turns the duplicate into a 5xx (the resource is already created or modified).

One go-to solution for this used to be to require the client to provide a idempotency key together with the request, and then reject duplicate requests (requests that use a previous idemptotency key).
One go-to solution for this used to be to require the client to provide an idempotency key together with the request, and then reject duplicate requests (requests that use a previous idempotency key).

`dedupe_requests` simplifies this, removing the requirement for providing an idempotency key, and instead auto-computes a fingerprint of each mutating request (effectively auto-generating the idempotency key on-the-fly), claims it atomically in Redis, and short-circuits a duplicate seen within a configurable window with a clean **409 Conflict** instead of letting it blow up your app.

@@ -32,6 +32,69 @@ GET and DELETE are never deduped. Time is not part of the fingerprint — the ti
gem "dedupe_requests"
```

## Configuration: Who's your caller?

There is one important setting we cannot decide for you: **what identifies your caller?**

APIs typically serve several callers, and `dedupe_requests` only works properly once you configure a `caller_id` that uniquely identifies each one.

If you have end users, the caller is an individual user.
If you have a B2B application, the caller is probably your business partner.

Make sure to configure the `caller_id` mechanism correctly.

**There is no default — you must set `caller_id`.** If it's unset (or your callable returns `nil` for a request), `dedupe_requests` **skips de-duplication for that request** (it's allowed through) and logs a warning. That's deliberate: with no caller identity, two *different* callers sending the same payload would collide and the second would get a wrong 409. So de-duplication only kicks in once `caller_id` resolves to a value.

> **⚠️ Do not use a raw bearer token, API key, or session id as the identity.** They are secret and they rotate — so the same caller would look like different callers (silently weakening de-duplication), and you'd be leaking a secret into the dedup layer. Derive a **stable, non-secret** identifier instead: a user id, a JWT `sub`, an API-client id.

`caller_id` is a callable given the **controller** (reach the request with `controller.request`):

```ruby
# config/initializers/dedupe_requests.rb
DedupeRequests.configure do |c|
  c.caller_id = ->(controller) { controller.current_user&.id }
end
```

Here are common ways to identify the caller — read any of them through the `controller` and return it from your `caller_id` lambda (e.g. `->(controller) { controller.request.headers['X-Client-ID'] }`):

### Directly:
* `current_user.id` in a customer-facing application

### Custom Headers: (only trustworthy if authenticated)
* `request.headers['X-Client-ID']`
* `request.headers['X-Organization-Id']`
* `request.headers['X-Partner-Id']`

### Indirectly: (tokens can rotate or have a nonce)
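* `request.headers["X-API-Key"]`: look up the client the key belongs to and return that record's stable id, not the key itself:
  `partner = ApiClient.find_by!(api_key: api_key)`, then return `partner.id` (assuming the record exposes an `id`)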

* `request.headers["Authorization"]` — decode the JWT and key on a stable claim:

```ruby
c.caller_id = ->(controller) do
  claims = decode_jwt(controller.request.headers["Authorization"])
  claims["sub"] # or claims["partner_id"]
end
```
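
For the API-key case, one possible shape is to resolve the key to the client it belongs to and return that client's id, so a rotated key still maps to the same caller. This is a minimal sketch, not the gem's own code: `CLIENT_ID_BY_API_KEY` is a hypothetical in-memory stand-in for a lookup such as `ApiClient.find_by(api_key: ...)`, and the keys and ids are made up.

```ruby
# Hypothetical stand-in for a database lookup (e.g. an ApiClient model).
# The old key and the rotated key both belong to the same client.
CLIENT_ID_BY_API_KEY = {
  "key-2025" => "client-42",
  "key-2026" => "client-42", # rotated key, same client
  "key-abcd" => "client-7"
}.freeze

# Returns the stable client id, or nil for a missing or unknown key
# (nil means dedupe_requests skips de-duplication for that request).
CALLER_ID_FROM_API_KEY = lambda do |controller|
  api_key = controller.request.headers["X-API-Key"]
  CLIENT_ID_BY_API_KEY[api_key]
end

# Wiring it up (assumes the configure block shown earlier):
# DedupeRequests.configure { |c| c.caller_id = CALLER_ID_FROM_API_KEY }
```

Because the lambda returns the client id rather than the key, a key rotation does not change the caller's identity, and the secret itself never reaches the dedup layer.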

### Infrastructure-Provided Identity

* `request.headers['X-Authenticated-User']`
* `request.headers['X-Forwarded-Client-Cert']`
* `request.headers['X-Amzn-Oidc-Identity']`
* `request.headers['X-Goog-Authenticated-User-Id']`
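
When a proxy or load balancer in front of the app sets one of these headers, a `caller_id` callable might simply return the first non-empty one. The sketch below is an assumption-laden illustration: the constant names are hypothetical, the header order is arbitrary, and it presumes your infrastructure strips any client-supplied copies of these headers. `X-Forwarded-Client-Cert` is left out because it typically carries certificate details rather than a plain id and would need parsing first.

```ruby
# Hypothetical list of proxy-set identity headers, checked in order.
INFRA_IDENTITY_HEADERS = %w[
  X-Authenticated-User
  X-Amzn-Oidc-Identity
  X-Goog-Authenticated-User-Id
].freeze

# Returns the first non-blank header value, or nil when none is present
# (nil means dedupe_requests skips de-duplication for that request).
CALLER_ID_FROM_PROXY = lambda do |controller|
  headers = controller.request.headers
  INFRA_IDENTITY_HEADERS.each do |name|
    value = headers[name].to_s.strip
    return value unless value.empty?
  end
  nil
end

# Wiring it up (assumes the configure block shown earlier):
# DedupeRequests.configure { |c| c.caller_id = CALLER_ID_FROM_PROXY }
```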

### Network-Based Identity: (rare and finicky)
* `caller_ips.include?(request.remote_ip)` # if you know the IP ranges for each caller

**Only one caller? Dedupe globally.** If your API has a single caller — or you want to de-duplicate across all callers regardless of who's calling — return a fixed value so every request shares one identity (this also suppresses the no-identity warning):

```ruby
c.caller_id = ->(_) { "global" }
```

## Usage

### 1. Global defaults — an initializer
@@ -98,29 +161,17 @@ You never specify HTTP verbs per action — the route already determines the ver

### 3. Per-caller identity (`caller_id`)

Dedup is scoped per caller, so two different users sending the same payload don't collide. `caller_id` is a callable given the **controller**, so it can read whatever identifies the caller:

```ruby
DedupeRequests.configure do |c|
c.caller_id = ->(controller) { controller.current_user&.id } # current_user
# c.caller_id = ->(controller) { controller.request.get_header("HTTP_X_API_KEY") } # a header
# c.caller_id = ->(controller) { controller.some_method } # any controller method
end
```

If you don't set it, the default derives identity from the `Authorization` header, falling back to a Rails session cookie — so token- and cookie-auth apps work with no configuration.

> **Note:** make sure you configure `caller_id` correctly for your API. If it can't derive an identity (no `Authorization` header and no session cookie), it falls back to `nil` — and then *different* callers sending the same payload to the same endpoint are treated as one request, so the second gets a 409. That's probably not what you want, so set `caller_id` to whatever identifies a caller in your app.
⚠️ `caller_id` scopes de-duplication per caller, and it **must be customized and properly configured for your application** — see the **Configuration** section above. There is no default; if it resolves to `nil`, that request is not de-duplicated (and a warning is logged).

## Modes and safe rollout

`mode` has three states:

- `:off` — disabled; no fingerprinting, no storage.
- `:observe` — **shadow mode**: compute and store fingerprints and fire the metrics hooks, but never return a 409. Duplicates are detected and reported only.
- `:observe` — **shadow mode**: compute and store fingerprints and fire `on_duplicate_detected`, but never return a 409. Duplicates are detected and reported only.
- `:enforce` — detect, store, and reject duplicates with a 409.

Recommended rollout on a live service: enable `:observe`, build a dashboard from the `duplicate_detected` hook, watch real volume for a week or two, then flip to `:enforce`.
Recommended rollout on a live service: enable `:observe`, build a dashboard from the `on_duplicate_detected` hook, watch real volume for a week or two, then flip to `:enforce`.
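
One way to make the flip from `:observe` to `:enforce` a deploy-time switch is to read the mode from the environment and fall back to the safe shadow mode on a typo. This is only a sketch: `dedupe_mode_from_env` and `VALID_MODES` are hypothetical helpers, and the `DEDUPE_MODE` variable name is borrowed from `examples/config.ru`.

```ruby
# The three modes described above.
VALID_MODES = %i[off observe enforce].freeze

# Hypothetical helper: reads DEDUPE_MODE and falls back to the default
# when the variable is missing or not one of the known modes.
def dedupe_mode_from_env(env = ENV, default: :observe)
  mode = env.fetch("DEDUPE_MODE", default.to_s).to_sym
  VALID_MODES.include?(mode) ? mode : default
end

# Wiring it up (assumes the configure block shown earlier):
# DedupeRequests.configure { |c| c.mode = dedupe_mode_from_env }
```

Defaulting to `:observe` rather than `:enforce` means a misconfigured environment reports duplicates instead of starting to reject requests.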

## Observability

@@ -133,7 +184,7 @@ DedupeRequests.configure do |c|
end
```

Each hook receives `{ fingerprint:, controller:, action:, verb:, path: }`. `duplicate_detected` fires in both `observe` and `enforce`; `duplicate_rejected` only when a 409 is actually returned.
Each hook receives `{ fingerprint:, controller:, action:, verb:, path: }`. `on_duplicate_detected` fires in both `observe` and `enforce`; `on_duplicate_rejected` only when a 409 is actually returned.

When tagging metrics, use only `controller`, `action`, and `verb` — these come from a small fixed set. Do **not** tag with `fingerprint` or `path`: the fingerprint is unique per request and the path usually contains record ids, so tagging with them creates a separate counter per request (a surprise bill on Datadog, or dropped series and broken dashboards). Log those instead if you need them.
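
The tagging advice can be sketched with a plain in-memory counter. This assumes the hook is called with a single Hash carrying the keys listed above; `DUPLICATE_COUNTS` and `COUNT_DUPLICATE` are hypothetical names, and a real setup would forward the tags to a metrics client instead of a Hash.

```ruby
# In-memory stand-in for a metrics backend: one counter per tag set.
DUPLICATE_COUNTS = Hash.new(0)

# Keeps only the low-cardinality keys as tags; fingerprint and path are
# deliberately dropped so each request does not create its own counter.
COUNT_DUPLICATE = lambda do |event|
  tags = event.slice(:controller, :action, :verb)
  DUPLICATE_COUNTS[tags] += 1
end

# Wiring it up (assumes the configure block shown earlier):
# DedupeRequests.configure { |c| c.on_duplicate_detected = COUNT_DUPLICATE }
```

Two duplicates hitting the same action with different paths and fingerprints then land in the same counter, which is the behavior a dashboard needs.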

@@ -169,7 +220,7 @@ A `409` is deliberate: well-behaved retrying clients do **not** loop on a 409 (t
| `ttl` | `90` | Dedup window, in seconds. |
| `digest` | `:sha256` | `:sha256` / `:sha512` / `:sha1` / `:md5`, or a callable. |
| `namespace` | `"dedupe_requests"` | Redis key prefix (`<namespace>:dedup:<hash>`). |
| `caller_id` | Authorization / session cookie | Callable **given the controller**, returns a per-caller identity (e.g. `->(c){ c.current_user&.id }`, a header via `c.request`, or any controller method). Default derives it from the Authorization header / session cookie. |
| `caller_id` | none (required) | Callable **given the controller**, returns a stable, non-secret per-caller identity (e.g. `->(c){ c.current_user&.id }`). No default — if unset or it returns `nil`, that request is not de-duplicated (and a warning is logged). |
| `fingerprint` | `nil` | Callable **given the request**, returns the fingerprint string — fully overriding the default computation. |
| `conflict_status` | `409` | Status returned for a rejected duplicate. |
| `conflict_body` | structured errors | JSON body for a rejected duplicate. |
1 change: 1 addition & 0 deletions dedupe_requests.gemspec
@@ -16,6 +16,7 @@ Gem::Specification.new do |spec|

spec.metadata["homepage_uri"] = spec.homepage
spec.metadata["source_code_uri"] = spec.homepage
spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
spec.metadata["rubygems_mfa_required"] = "false"

spec.files = Dir["lib/**/*.rb", "examples/**/*", "README.md", "CHANGELOG.md", "LICENSE.txt"]
16 changes: 9 additions & 7 deletions examples/config.ru
@@ -52,10 +52,12 @@ DedupeRequests.configure do |c|
c.redis = Redis.new(url: ENV.fetch("REDIS_URL", "redis://localhost:6379/15"))
c.mode = ENV.fetch("DEDUPE_MODE", "enforce").to_sym
c.ttl = GLOBAL_TTL
# caller_id is left at its default, which derives the caller identity from the
# request's Authorization header. The integration test sends a different
# `Authorization: Bearer <token>` per simulated caller, so the same payload from
# two different callers fingerprints differently and is NOT treated as a duplicate.

# ⚠️ DEMO ONLY — uses the raw Authorization header as the caller_id to keep the
# test simple. Do NOT do this in production; see the README "Configuration"
# section for how to set a stable, non-secret caller_id.
# (Overridden below when DEDUPE_CUSTOM_CALLER_ID is set.)
c.caller_id = ->(controller) { controller.request.get_header("HTTP_AUTHORIZATION") }

# Record the duplicate-notification hooks. on_duplicate_detected fires whenever a
# duplicate is seen (observe AND enforce); on_duplicate_rejected fires only when a
@@ -75,9 +77,9 @@ DedupeRequests.configure do |c|
end

# When asked, replace caller_id with a custom one that identifies the caller by
# an X-Api-Key header (ignoring the Authorization header the default would use),
# so the test can prove this callable is what drives the per-caller scoping. It
# also records that the hook was invoked.
# an X-Api-Key header (instead of the Authorization header above), so the test
# can prove this callable is what drives the per-caller scoping. It also records
# that the hook was invoked.
if ENV["DEDUPE_CUSTOM_CALLER_ID"] == "1"
c.caller_id = lambda do |controller|
key = controller.request.get_header("HTTP_X_API_KEY")
32 changes: 11 additions & 21 deletions lib/dedupe_requests/configuration.rb
@@ -12,28 +12,18 @@ class Configuration
}]
}.freeze

# Per-caller identity. The callable is given the CONTROLLER, so it can read
# anything the controller exposes — `current_user`, a helper method, or a
# header via `controller.request`. Examples:
# Per-caller identity. There is NO default — you MUST configure `caller_id`
# with a callable that returns a stable, non-secret identifier for the caller
# (a user id, a JWT `sub`, an API-client id). Do NOT use a raw bearer token or
# API key: it's secret and it rotates, so the same caller would look like
# different callers and de-duplication would silently weaken. The callable is
# given the CONTROLLER, so it can read `current_user`, a helper, or a header via
# `controller.request`. Examples:
# c.caller_id = ->(controller) { controller.current_user&.id }
# c.caller_id = ->(controller) { controller.request.get_header("HTTP_X_API_KEY") }
#
# The default derives identity from the request's Authorization header,
# falling back to a Rails-style session cookie (so token- and cookie-auth
# apps work with no configuration). It accepts either a controller or a bare
# request.
DEFAULT_CALLER_ID = lambda do |context|
request = context.respond_to?(:request) ? context.request : context
if request.respond_to?(:get_header)
auth = request.get_header("HTTP_AUTHORIZATION")
return auth if auth && !auth.to_s.empty?
end
if request.respond_to?(:cookies)
request.cookies.each { |name, value| return value if name.to_s =~ /\A_.*_session\z/i }
end
nil
end

# When `caller_id` is unset or returns nil, de-duplication is skipped for the
# request (and a warning is logged), rather than risk treating different callers
# as one.
attr_accessor :redis, :ttl, :digest, :namespace, :caller_id, :fingerprint,
:conflict_status, :logger,
:on_duplicate_detected, :on_duplicate_rejected
@@ -47,7 +37,7 @@ def initialize
@ttl = 90
@digest = :sha256
@namespace = "dedupe_requests"
@caller_id = DEFAULT_CALLER_ID
@caller_id = nil
@fingerprint = nil
@conflict_status = 409
@logger = nil
32 changes: 31 additions & 1 deletion lib/dedupe_requests/controller.rb
@@ -66,10 +66,28 @@ def dedupe_requests_around
return
end

# GET/DELETE are never deduped — bail out before resolving caller_id, so the
# caller_id callable only runs for the verbs we actually de-duplicate.
unless dedupe_requests_mutating_verb?
yield
return
end

caller_id = dedupe_requests_caller_id
# Without a caller identity, every unidentified caller would share one
# fingerprint, so two genuinely-different requests with the same body would
# collide and the second would be wrongly rejected. Skip de-duplication in
# that case (let the request through) and warn, rather than risk a false 409.
if caller_id.nil?
dedupe_requests_warn_missing_caller_id
yield
return
end

result = dedupe_requests_guard.claim(
request,
ttl: dedupe_requests_ttl_for(action_name),
caller_id: dedupe_requests_caller_id
caller_id: caller_id
)

case result.outcome
@@ -112,6 +130,18 @@ def dedupe_requests_caller_id
DedupeRequests.config.caller_id&.call(self)
end

def dedupe_requests_mutating_verb?
DedupeRequests::MUTATING_VERBS.include?(request.request_method.to_s)
end

# Loud on purpose: a missing caller identity silently weakens de-duplication,
# so we warn on every such request (via the configured logger, else stderr).
def dedupe_requests_warn_missing_caller_id
message = "[dedupe_requests] caller_id resolved to nil for #{controller_name}##{action_name} (#{request.request_method} #{request.path}); de-duplication skipped. Configure DedupeRequests.config.caller_id."
logger = DedupeRequests.config.logger
logger ? logger.warn(message) : warn(message)
end

def dedupe_requests_guard
@dedupe_requests_guard ||= DedupeRequests::Guard.new(DedupeRequests.config)
end
2 changes: 1 addition & 1 deletion lib/dedupe_requests/version.rb
@@ -1,5 +1,5 @@
# frozen_string_literal: true

module DedupeRequests
VERSION = "1.0.0.pre1"
VERSION = "1.0.0"
end
48 changes: 2 additions & 46 deletions spec/configuration_spec.rb
@@ -67,51 +67,7 @@ def with
expect(config.store.claim("fp", ttl: 1)).to eq(:error)
end

describe "DEFAULT_CALLER_ID" do
def request_with(headers: {}, cookies: {})
RequestDouble.new(
request_method: "POST", path: "/x", query_string: "", raw_post: "",
headers: headers, cookies: cookies
)
end

it "uses the Authorization header when present" do
id = described_class::DEFAULT_CALLER_ID.call(request_with(headers: { "HTTP_AUTHORIZATION" => "Bearer z" }))
expect(id).to eq("Bearer z")
end

it "falls back to a Rails-style session cookie" do
id = described_class::DEFAULT_CALLER_ID.call(request_with(cookies: { "_myapp_session" => "abc" }))
expect(id).to eq("abc")
end

it "is nil when neither identity signal is present" do
expect(described_class::DEFAULT_CALLER_ID.call(request_with)).to be_nil
end

it "skips the Authorization check when the request has no get_header" do
obj = Object.new
def obj.cookies
{ "_app_session" => "ck" }
end
expect(described_class::DEFAULT_CALLER_ID.call(obj)).to eq("ck")
end

it "returns nil when the request supports no cookies and has no auth" do
obj = Object.new
def obj.get_header(_name)
nil
end
expect(described_class::DEFAULT_CALLER_ID.call(obj)).to be_nil
end

it "ignores cookies that are not a session cookie" do
expect(described_class::DEFAULT_CALLER_ID.call(request_with(cookies: { "tracking" => "x" }))).to be_nil
end

it "reads from controller.request when given a controller" do
controller = Struct.new(:request).new(request_with(headers: { "HTTP_AUTHORIZATION" => "Bearer y" }))
expect(described_class::DEFAULT_CALLER_ID.call(controller)).to eq("Bearer y")
end
it "has no default caller_id (you must configure one)" do
expect(config.caller_id).to be_nil
end
end