
Fix segfault: stop storing NTLM cache item in c->data on idle upstream connections #4

Merged
matthias-lay merged 4 commits into master from copilot/fix-nginx-segfault-issue-again
May 6, 2026
Conversation


Copilot AI commented May 6, 2026

  • Read and understand full module source
  • Issue 1: Add c->data = NULL when caching a connection in free_ntlm_peer
  • Issue 2: Add ntlm_requests / ntlm_time directives and enforce c->requests / c->start_time limits in free_ntlm_peer
  • Issue 3: Save and pass through notify peer callback in init_ntlm_peer
  • Issue 5: Guard NULL deref on old_pc in cache-eviction path in free_ntlm_peer
  • Code review: restore c->read->data = c before c->data = NULL in get_ntlm_peer for clearer ordering
  • Code review: rename struct field requests → max_requests for clarity
Original prompt

Extend existing PR #3 in Securepoint/nginx-ntlm-module to fully fix segfaults by removing unsafe use of ngx_connection_t->data.

Context

  • Observed crashes with nginx 1.28.3 where ngx_http_upstream_handler assumes: c = ev->data; r = c->data (ngx_http_request_t*). If module repurposes c->data to a cache item, posted/ready events can crash.
  • The current PR #3 (“Fix segfaults caused by race condition between ntlm close handler and client conn cleanup”) only fixes a race between ngx_http_upstream_client_conn_cleanup posting events and ngx_http_upstream_ntlm_close_handler; it does not remove the c->data = item usage.

Requirements

  1. Work on the PR #3 branch (“Fix segfaults caused by race condition between ntlm close handler and client conn cleanup”; head is copilot/fix-nginx-segfault-issue).
  2. Remove all uses of setting upstream peer connection's c->data to ngx_http_upstream_ntlm_cache_t* item while cached/idle.
    • Do not rely on reading item via c->data in close handler.
  3. Implement a safe mapping from cached upstream ngx_connection_t* to its cache item.
    • Prefer a module-managed structure in conf (e.g., ngx_rbtree keyed by connection pointer value) or similar.
    • Must be safe under nginx worker single-thread model.
    • Must allow close_handler and cleanup logic to find the item for a given connection without touching c->data.
  4. Ensure cleanup and close handler are idempotent and cannot corrupt queues.
    • Keep/extend the in_cache ownership flag.
    • Always clear ownership before queue/close operations.
    • No ngx_post_event from pool cleanup.
  5. Ensure compatibility with nginx 1.28.3 and OpenSSL 3.2.6; protocol mix (HTTP/1.1, h2, h3) should not matter.
  6. Update PR description to mention the c->data hazard and how mapping fixes it.
  7. If tests exist under t/, update/add a regression test if feasible.

Deliverable

The following is the prior conversation context from the user's chat exploration (may be truncated):

User:

```c
static void
ngx_http_upstream_handler(ngx_event_t *ev)
{
    ngx_connection_t     *c;
    ngx_http_request_t   *r;
    ngx_http_upstream_t  *u;

    c = ev->data;
    r = c->data;

    u = r->upstream;
    c = r->connection;

    ngx_http_set_log_request(c->log, r);

    ngx_log_debug2(NGX_LOG_DEBUG_HTTP, c->log, 0,
                   "http upstream request: \"%V?%V\"", &r->uri, &r->args);

    if (ev->delayed && ev->timedout) {
        ev->delayed = 0;
        ev->timedout = 0;
    }

    if (ev->write) {
        u->write_event_handler(r, u);

    } else {
        u->read_event_handler(r, u);
    }

    ngx_http_run_posted_requests(c);
}
```


Assistant: The segfault in `ngx_http_upstream_handler()` is happening because **for that event, `c->data` is not a valid `ngx_http_request_t *` anymore**.

In upstream core, the handler assumes:

- `c = ev->data;` is an upstream `ngx_connection_t *`
- **`r = c->data;` must be a valid `ngx_http_request_t *`**
- then it immediately dereferences `r->upstream`, etc.

Your module violates this assumption when it caches a connection:

- it sets **`c->data = item`** (your `ngx_http_upstream_ntlm_cache_t *`) here:  
  https://github.com/gabihodoroaga/nginx-ntlm-module/blob/3da77b0cb30e517dfee01d7e7f7d649144d24051/ngx_http_upstream_ntlm_module.c#L353-L357

So if **any** posted event ends up calling `ngx_http_upstream_handler()` for a connection while `c->data` still points to your `item`, nginx will treat that pointer as `ngx_http_request_t *` and crash on `u = r->upstream` or `c = r->connection`.

That matches your backtrace perfectly.

## Why does nginx still call `ngx_http_upstream_handler()` if you changed handlers?
You do change handlers when caching:

- `c->write->handler = ngx_http_upstream_ntlm_dummy_handler;`
- `c->read->handler  = ngx_http_upstream_ntlm_close_handler;`

…but **posted events can already be enqueued** (or timers/ready events can race in) with the *old* handler path. Also, your cleanup explicitly posts events (`ngx_post_event(item->peer_connection->read, &ngx_posted_events)`), which increases the chance of “an event fires after state changed”.

Bottom line: **you can’t safely repurpose `c->data` on an upstream connection** unless you can *prove* nginx will never run upstream core handlers on it again. Your crash shows that proof doesn’t hold.

## The fix (what to change)
### 1) Stop storing `item` in `c->data`
This is the big one. Don’t do:

```c
c->data = item;
```

Instead, store item somewhere that won’t break upstream assumptions. Common patterns:

  • Store on the event objects:
    • c->read->data = item;
    • c->write->data = item;
      (and in your handlers, get item = ev->data or keep ev->data = c and put item in another field—depends on nginx struct layout/version)
  • Or keep a mapping keyed by ngx_connection_t * (rbtree/hash) managed by the module.
  • Or (simpler) create a small wrapper struct that keeps both:
    • original ngx_http_request_t *r (or NULL when idle)
    • ngx_http_upstream_ntlm_cache_t *item
      and only set c->data to that wrapper if you also ensure upstream handler will never see it (usua...

This pull request was created from Copilot chat.

…LM cache

nginx's ngx_http_upstream_handler() assumes r = c->data is ngx_http_request_t*.
Storing our cache item in c->data causes segfaults when posted/ready events
fire for cached upstream connections.

Fix: store the cache item in c->read->data instead. Since we replace
c->read->handler with our own handler, only our handler runs for idle
connections and retrieves item via ev->data. Restore c->read->data = c in
get_ntlm_peer before returning the connection to upstream core.

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-module/sessions/2e9fcf2a-dc92-4dda-831c-76cc616351af

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
Copilot AI changed the title from “[WIP] Fix segfaults by removing unsafe use of ngx_connection_t->data” to “Fix segfault: stop storing NTLM cache item in c->data on idle upstream connections” May 6, 2026
Copilot AI requested a review from matthias-lay May 6, 2026 08:36
Copilot AI and others added 2 commits May 6, 2026 09:14
…otify passthrough), 5 (eviction NULL guard)

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-module/sessions/672b9196-95fe-4861-bee8-50b1cc5ff159

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
…equests->max_requests

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-module/sessions/672b9196-95fe-4861-bee8-50b1cc5ff159

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
@matthias-lay matthias-lay marked this pull request as ready for review May 6, 2026 09:35
Copilot stopped work on behalf of matthias-lay due to an error May 6, 2026 09:48
@matthias-lay matthias-lay merged commit 58a4c6b into master May 6, 2026
5 of 6 checks passed
Copilot stopped work on behalf of matthias-lay due to an error May 6, 2026 10:06
Copilot stopped work on behalf of matthias-lay due to an error May 6, 2026 10:07