
[BUG]: asyncio create_task wrapping breaks anyio cancel scope handling (CancelledError leak) #17035

@fgeracitano

Tracer Version(s)

4.5.4 (also tested with 4.6.1 — same issue)

Python Version(s)

3.14.2

Pip Version(s)

uv 0.7.x (using uv instead of pip)

Bug Report

ddtrace's asyncio integration wraps coroutines passed to asyncio.BaseEventLoop.create_task() with a traced_coro wrapper in ddtrace/contrib/internal/asyncio/patch.py. This extra coroutine frame breaks anyio's CancelScope handling, causing CancelledError to leak out of task groups during normal operation.

The most visible symptom: the first HTTP request made through httpx after every deployment returns a 500. The call chain httpx → httpcore → anyio connect_tcp() uses the Happy Eyeballs algorithm (RFC 6555), which races connection attempts in a task group and cancels the losers through the group's cancel scope.

How it happens

  1. httpx makes a request → httpcore calls anyio.connect_tcp()
  2. connect_tcp() resolves DNS, gets multiple addresses (IPv4/IPv6 or multiple A records)
  3. Happy Eyeballs spawns connection attempts via tg.start_soon(), which internally calls asyncio.create_task()
  4. ddtrace's _wrapped_create_task wraps each coroutine in traced_coro (adds extra coroutine frame)
  5. When the first connection wins, tg.cancel_scope.cancel() cancels the losers
  6. The CancelledError does not propagate cleanly through the traced_coro wrapper — it leaks out of the task group
  7. Error propagates up through middleware → 500

Subsequent requests work because httpx reuses pooled connections, bypassing connect_tcp().
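The race in steps 3 to 5 can be sketched with plain asyncio. This is only an illustration of the pattern, assuming asyncio.wait() as a stand-in for anyio's task group and cancel scope; fake_connect, race_connections, run_demo and the documentation-range addresses are hypothetical and do not come from anyio or httpcore.

```python
import asyncio


async def fake_connect(address: str, delay: float) -> str:
    """Hypothetical stand-in for one TCP connection attempt."""
    await asyncio.sleep(delay)
    return address


async def race_connections(candidates: dict[str, float]) -> tuple[str, list[str]]:
    """Start every attempt, keep the first to finish, cancel the rest."""
    tasks = {
        asyncio.create_task(fake_connect(addr, delay)): addr
        for addr, delay in candidates.items()
    }
    done, pending = await asyncio.wait(
        list(tasks), return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # losers are cancelled as part of normal operation
    # Await the losers so their CancelledError is consumed here
    # instead of escaping to the caller.
    await asyncio.gather(*pending, return_exceptions=True)
    winner = done.pop().result()
    cancelled = sorted(tasks[t] for t in pending if t.cancelled())
    return winner, cancelled


def run_demo() -> tuple[str, list[str]]:
    # The "IPv6" attempt finishes first; the "IPv4" attempt is cancelled.
    return asyncio.run(
        race_connections({"2001:db8::1": 0.01, "192.0.2.1": 0.2})
    )
```

In this sketch the losers' CancelledError never reaches the caller; the report above describes the case where that containment fails.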

The wrapping code causing the issue

# ddtrace/contrib/internal/asyncio/patch.py, lines 62-66
async def traced_coro(*args_c, **kwargs_c):
    if dd_active != tracer.current_trace_context():
        tracer.context_provider.activate(dd_active)
    core.dispatch("asyncio.execute_task", (task_data,))
    return await coro

This wrapper adds an extra coroutine frame between the original coroutine and the task scheduler. When anyio cancels a task via cancel scope, the CancelledError doesn't propagate correctly through this extra frame.
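To make the extra frame concrete, here is a minimal sketch that keeps only the wrapping and drops all tracer logic. The wrap helper and inspect_tasks are hypothetical simplifications, not ddtrace code. It shows that a wrapped task's top-level coroutine is traced_coro rather than the original. It does not model anyio's cancel scopes, so it is not a reproduction of the leak: in plain asyncio both tasks below end up cancelled.

```python
import asyncio


async def original():
    await asyncio.sleep(10)


def wrap(coro):
    """Hypothetical simplification of the wrapper: only the extra frame is kept."""

    async def traced_coro():
        return await coro

    return traced_coro()


async def inspect_tasks():
    plain = asyncio.create_task(original())
    wrapped = asyncio.create_task(wrap(original()))
    await asyncio.sleep(0)  # let both tasks start running
    # The task scheduler sees the wrapper, not the original coroutine.
    names = (plain.get_coro().__name__, wrapped.get_coro().__name__)
    for task in (plain, wrapped):
        task.cancel()
    results = await asyncio.gather(plain, wrapped, return_exceptions=True)
    return names, [type(r).__name__ for r in results]
```

Running asyncio.run(inspect_tasks()) reports the coroutine names seen by each task and how each task ended, which may help when comparing wrapped and unwrapped behaviour under anyio.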

Reproduction

Reliably reproduces on ECS (Linux, dual-stack VPC networking where DNS returns multiple addresses). It is hard to reproduce locally because Docker Desktop does not provide real dual-stack DNS.

Confirmed fixes:

  • Removing ddtrace-run entirely → fixed (no create_task wrapping)
  • Removing the traced_coro wrapper from _wrapped_create_task → fixed (keeping ddtrace patch active but not wrapping coroutines)
  • Replacing httpx with aiohttp → fixed (aiohttp uses its own TCP stack, no anyio connect_tcp)
  • Disabling Happy Eyeballs in anyio connect_tcp() → fixed (no task group, no cancel scopes)

Did not fix:

  • DD_PATCH_MODULES=httpx:false — the httpx patch is not the culprit
  • Upgrading ddtrace to 4.6.1
  • Vendoring anyio with the CancelScope.__exit__ fix from anyio PR #1092

Impact

This affects any code path where asyncio.create_task() is called internally by a library that then cancels tasks as part of normal operation. Beyond anyio/httpx, this could affect asyncio.TaskGroup with cancellation, timeout patterns, and connection pools.

Suggested Fix

Consider one of:

  1. Skip wrapping internal library coroutines — detect coroutines from anyio/httpcore and pass through
  2. Use contextvars for trace propagation instead of wrapping the coroutine in an extra async def
  3. Ensure CancelledError is fully transparent through traced_coro
