Releases: true-async/server
v0.7.2 — optional per-request scope
Feature release: a new opt-in knob to drop the per-request async scope on hot paths.
Added
- `HttpServerConfig::setRequestScope(bool)` / `isRequestScope()`: opt out of the
  per-request child async scope (default on, behaviour unchanged). When off, each
  H1/H2/H3 handler coroutine reuses the connection scope directly instead of minting a
  fresh per-request child, saving two allocations (`emalloc`/`efree`) per request. The
  setter is chainable and locks once the config is handed to a server.
  Disabling it means `Async\request_context()` resolves to null for that request
  (use the `?->` operator); there is no per-request context subtree. Only disable it
  for handlers that never rely on per-request context (e.g. throughput benchmarks). The
  knob propagates correctly across `setWorkers(N > 1)` via the cross-thread shared-config
  snapshot.
Tests
- `server/core/049-request-scope-setter`: default / toggle / chainable / locked-guard,
  plus scope-OFF serving with `Async\request_context()` asserted null.
- `server/core/050-request-scope-workers`: the knob is honoured on worker threads
  (`setWorkers(2)`), guarding the shared-config propagation path.
Also folds in 0.7.1 (HTTP/3 bidi stream-credit fix, #79), which shipped tagged but
without a changelog entry.
v0.7.1 — HTTP/3 stream-credit fix
Patch release on top of 0.7.0 — a single focused fix to the HTTP/3 path.
Fixed
- HTTP/3 throughput collapse under sustained concurrency (#79). `stream_close_cb`
  closed each nghttp3 request stream but never replenished the QUIC bidi stream
  credit, so every connection was permanently capped at `initial_max_streams_bidi`
  (default 100). After 100 requests a client could not open another stream and the
  connection stalled: HttpArena `baseline-h3` / `static-h3` at c=64 collapsed to
  ~1277 req/s (≈20 req/s per connection) with the server otherwise idle. ngtcp2 does
  not auto-send `MAX_STREAMS` on close; the application must. The fix calls
  `ngtcp2_conn_extend_max_streams_bidi(conn, 1)` for each client-initiated bidi
  stream (`(id & 3) == 0`) in `stream_close_cb`. A/B at c=64: 6400 done/30s →
  60000 done/10s.
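The credit accounting behind this fix can be sketched in a few lines of C. This is a simplified model, not the server's code: `bidi_credit_t`, `try_open_stream` and `on_stream_close` are hypothetical names, and the `max_streams_bidi += 1` line stands in for the real `ngtcp2_conn_extend_max_streams_bidi(conn, 1)` call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-connection bidi stream budget that
 * the QUIC stack tracks; not the server's real data structure. */
typedef struct {
    uint64_t max_streams_bidi; /* cumulative limit advertised to the peer */
    uint64_t opened;           /* client bidi streams opened so far */
} bidi_credit_t;

/* QUIC encodes the stream type in the two low bits of the stream ID
 * (RFC 9000, section 2.1): 0x0 means client-initiated, bidirectional.
 * The parentheses matter: in C, `id & 3 == 0` parses as `id & (3 == 0)`. */
bool is_client_bidi(int64_t stream_id)
{
    return (stream_id & 0x3) == 0;
}

/* The peer may open a new stream only while below the advertised limit. */
bool try_open_stream(bidi_credit_t *c)
{
    if (c->opened >= c->max_streams_bidi) {
        return false; /* out of credit: the connection stalls here */
    }
    c->opened++;
    return true;
}

/* Model of the close callback after the fix: hand one unit of credit back
 * for every closed client-initiated bidi stream. */
void on_stream_close(bidi_credit_t *c, int64_t stream_id)
{
    if (is_client_bidi(stream_id)) {
        c->max_streams_bidi += 1;
    }
}
```

With an initial limit of 100, opening and closing 150 streams in sequence succeeds in this model only because each close returns one credit; without `on_stream_close` the 101st `try_open_stream` call fails, which mirrors the pre-fix stall.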
Tests
- New `036-h3-stream-credit-replenish`: 150 request streams over one connection
  (pre-fix stalls at 100, post-fix all 150 complete). Full H3 suite 36/36 green.
Full changelog: v0.7.0...v0.7.1
v0.7.0 — HTTP/3 over QUIC
Headline release: HTTP/3 over QUIC. Folds in everything tagged but not yet
documented since 0.6.7 (the 0.6.8 tag carried no changelog entry).
Added
- HTTP/3 / QUIC server (`HttpServerConfig::addHttp3Listener`): full request
  lifecycle over QUIC, covering end-to-end GET/POST with `awaitBody`, streaming `send()`,
  HEAD, `sendFile()` delivery, and `addStaticHandler` mount routing. Built on
  ngtcp2 + nghttp3 + OpenSSL ≥ 3.5; auto-detected (`--enable-http3` /
  `--disable-http3`).
- HTTP/3 production controls: connection migration / NAT rebinding (RFC 9000 §9),
  opt-in send pacing (`setHttp3Pacing`), per-peer connection budget with global
  cap and explicit refusal, configurable UDP socket buffer
  (`setHttp3SocketBufferBytes`), idle timeout, Alt-Svc advertisement, Retry token
  source-address validation, version negotiation, and stateless reset.
- `HttpServer::getHttp3Stats()`: handshake / ALPN / nghttp3 / send-error counters.
- `HttpServer::isHttp2()` / `isHttp3()` compile-time capability probes.
- `HttpServerConfig::setTlsBufferBytes`: tunable TLS clear-text-out BIO ring (#29).
- Shared-fd TCP listener path for workers on kernels without load-balancing
  `SO_REUSEPORT`, selectable at runtime.
Changed
- HTTP/3 send path coalesces outbound datagrams to once-per-tick and splits
  coalesced inbound datagrams via `UDP_GRO`; UDP socket buffers enlarged.
- HTTP/1 conformance hardening: `Date` header, HEAD sends no body, reject
  `CONNECT` and asterisk-form targets, validate `Host`, reject empty
  `Transfer-Encoding`, reject fragment/backslash in request-target, reject
  duplicate `Content-Type`.
- HTTP/2 over TLS parks the emit remainder when the clear-text-out BIO ring fills
  (backpressure instead of a write deadlock) (#29).
- `HttpServer::start()` now throws on listener bind failure instead of failing
  silently.
Fixed
- Drain in-flight per-request coroutines on server shutdown so `server_scope` is
  not disposed while handlers are still running (#74).
- HTTP/3: dirty-list use-after-free on connection free, dispatched-stream slot
  leak when a stream is rejected mid-`awaitBody`, and `arm_timer` NULL-`ngtcp2_conn`
  guard.
- `http_server`: use-after-free of the wait event on non-stop teardown.
- Windows MSVC build.
v0.6.6 — code audit + memory observability
Closes #37 (Code audit & refactoring) — Phases 1–6 rolled up.
Highlights
Refactor / cleanup
- `src/http_response.c` (2173 lines) split into three TUs (S7):
  - `src/http_response.c`: PHP class machinery only.
  - `src/http1/http1_format.c`: HTTP/1.x wire formatters.
  - `src/http_response_server_api.c`: server-side C-API used by static / h2 / compression paths.
- Dedup of repeated patterns across compression / h2 / parser / response code (X1–X14).
- Dead code & stale comment removal across Phase 2.
Observability
- `HttpServer::getRuntimeStats(): array`: lock-free snapshot of the server's internal allocators:
  - `conn_arena` (live / total / chunks / bytes): slab pool for `http_connection_t`.
  - `body_pool[]` (per-class LIFO of large request bodies) + `body_pool_total_bytes`.
- Pairs with `Async\runtime_stats()` and (debug builds) `zend_mm_dump_live_allocations()` to attribute live RSS down to a concrete subsystem.
Correctness / hygiene
- `send_file` engine `open()` uses `O_NOFOLLOW` on REJECT-mount so a symlink swapped in after the open-file-cache TTL still 404s (C2, new phpt `static/021`).
- DS2 assert on `http2_emit_record_t.body.len` bound.
- License headers added to compression / http3 / core TUs that were missing them.
Tests
- `034-config-tls-and-log.phpt`: drop the deprecated `curl_close($ch)` call (no-op since PHP 8.0; emits Deprecated on 8.5+). This and the previously baseline-fail `044` are now green: phpt 211/211.
Test plan
- phpt 211/211 PASS on PHP 8.6 dev.
- HttpArena validate: 57/0 PASS (true-async-server) · 43/0 PASS (symfony-spawn-tas, including `async-db`).
- h2load smoke (no docker overhead, c=64, 10s):
  - baseline H1 · 287 k req/s
  - baseline-h2 TLS · 158 k req/s
  - baseline-h2c · 220 k req/s
  - /json/1 · 300 k req/s
  - /async-db?limit=10 · 38.6 k req/s
- Stress `/async-db` (Symfony): c=256 m=20 / 60k req, 0 errors, RSS ≤ 114 MiB, no SEGV.
TODO file
A new Step 5 entry in TODO.md documents the Zend MM retention analysis and proposes a future setMaxRequestsPerWorker(N) knob for FPM-style worker recycle (RSS reclamation on long-running benches). No code in this release — design notes only.
v0.6.4
Fixed
- HTTP/1 pipelining crash under high connection count (HttpArena `pipelined/4096c`). A handler-coroutine spawn failure destroyed the connection (freeing its llhttp parser) synchronously from inside `llhttp_execute` (the dispatch callback fires from `on_headers_complete`), causing a use-after-free SIGSEGV in `on_message_complete`. Connection teardown now defers (`in_parser_feed` guard) while a parser feed is on the stack and is finalised once the feed unwinds.
- Stranded `Async\AsyncException` on I/O write submit failure. Fire-and-forget write submit failures (broken pipe / connection reset) left an exception in `EG(exception)` with no coroutine to receive it; it then aborted an unrelated `ZEND_ASYNC_NEW_COROUTINE`, which is exactly what produced the spawn failure above. The batched-send paths now log and clear the exception at the submission site.
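The deferred-teardown guard from the first fix follows a common re-entrancy pattern, sketched below in C. Only the `in_parser_feed` name comes from the entry above; `conn_t`, `destroy_pending`, `conn_feed` and the other helpers are hypothetical, and the real connection carries far more state.

```c
#include <stdbool.h>

/* Hypothetical, simplified connection: only the fields needed to show
 * the deferral. */
typedef struct {
    bool in_parser_feed;  /* a parser feed is currently on the stack */
    bool destroy_pending; /* teardown was requested during the feed */
    bool destroyed;       /* resources have been released */
} conn_t;

/* Stand-in for the real release path (freeing the parser, closing the fd). */
void conn_release(conn_t *c)
{
    c->destroyed = true;
}

/* Request teardown. While the parser is running, only record the request:
 * releasing now would free the parser underneath its own callbacks. */
void conn_destroy(conn_t *c)
{
    if (c->in_parser_feed) {
        c->destroy_pending = true;
        return;
    }
    conn_release(c);
}

/* Feed the parser. `on_data` stands in for parser callbacks, such as a
 * handler dispatch that can fail and ask for teardown. */
void conn_feed(conn_t *c, void (*on_data)(conn_t *))
{
    c->in_parser_feed = true;
    on_data(c); /* may call conn_destroy() re-entrantly */
    c->in_parser_feed = false;

    if (c->destroy_pending) { /* the feed has unwound: finalise now */
        c->destroy_pending = false;
        conn_release(c);
    }
}

/* Example callback: simulates a spawn failure in the middle of a feed. */
void failing_dispatch(conn_t *c)
{
    conn_destroy(c); /* deferred, so later callbacks still see a live conn */
}
```

Calling `conn_feed(&c, failing_dispatch)` leaves the connection intact for the remainder of the feed and releases it exactly once after the feed returns.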
v0.6.3
Added
- One-shot brotli compress with `BROTLI_PARAM_SIZE_HINT` (Step 4 of perf TODO). `apply_buffered` uses the stateless one-pass `BrotliEncoderCompress()` when the body is fully known. The size hint lets the encoder right-size its ring buffer / hash tables for the actual payload instead of for arbitrary streaming. New optional vtable slots `compress_oneshot` + `max_compressed_size`; streaming path stays for chunked / unknown-length responses. Closes the brotli encode gap vs Swoole's `BrotliEncoderCompress`-based path. C-side defaults stay production-typical (gzip 6, brotli 4); bench callers set `setCompressionLevel(1)` / `setBrotliLevel(1)` for Swoole-equivalent throughput.
- Loud stderr logging on unexpected worker thread exits in `pool_worker_handler`: covers uncaught `$server->start()` exceptions, clean returns while the await loop still expects workers, and server-transfer failure. Previously each case silently dropped 1/N of accept capacity with no operator signal.
Fixed
- `Connection: close` request header now produces `Connection: close` in the response (RFC 9112 §9.6). The parser already flipped `req->keep_alive = false` and the dispose path closed the FD, but the missing response header left clients unable to tell the TCP connection was not reusable until the next write hit `ECONNRESET`; wrk under `-H 'Connection: close'` counted every reply as a read error. Side effect on the local short-lived bench (wrk c=512 d=10s): 174k → 230k RPS, p50 14.5 ms → 2.5 ms, read-errors 2.0M → 0.
Changed
- Server-side codec preference order flipped to `zstd > gzip > brotli > identity`. Clients sending the common `gzip, br` Accept-Encoding now get gzip: the brotli pool can't reuse encoder state (libbrotli has no public reset API), so until the arena-allocator follow-up lands, gzip's `deflateReset` path is the better default. Clients that explicitly want brotli via q-values (`br;q=1.0, gzip;q=0.5`) still get it.
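One way to read the rule above: the client's q-value decides first, and the server's preference order only breaks ties. The C sketch below assumes that interpretation and works on already-parsed offers; `offer_t` and `pick_codec` are hypothetical names, and it simplifies the HTTP semantics by falling back to `identity` whenever nothing else matches.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name; /* content-coding token, e.g. "gzip" or "br" */
    double q;         /* client q-value in 0.0..1.0; 0 means "not acceptable" */
} offer_t;

/* Server preference order from the changelog entry, best first
 * ("br" is the Accept-Encoding token for brotli). */
static const char *const SERVER_PREF[] = { "zstd", "gzip", "br", "identity" };
#define N_PREF (sizeof SERVER_PREF / sizeof SERVER_PREF[0])

/* Pick the codec with the highest client q-value; among equal q-values the
 * server's preference order wins. Returns "identity" if nothing matches. */
const char *pick_codec(const offer_t *offers, size_t n)
{
    const char *best = "identity";
    double best_q = 0.0;

    for (size_t p = 0; p < N_PREF; p++) { /* walk in server order */
        for (size_t i = 0; i < n; i++) {
            /* Strict '>' keeps the earlier (more preferred) pick on a tie. */
            if (strcmp(offers[i].name, SERVER_PREF[p]) == 0 &&
                offers[i].q > best_q) {
                best = SERVER_PREF[p];
                best_q = offers[i].q;
            }
        }
    }
    return best;
}
```

Under this model, `gzip, br` (both at q=1) resolves to gzip, while `br;q=1.0, gzip;q=0.5` resolves to br, matching the two cases the entry describes.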
Bench delta vs Swoole (docker, /json/40, c=512, 5-run median, both with q=1)
| Accept-Encoding | TAS v0.6.3 | Swoole | Δ |
|---|---|---|---|
| br | 106k | 94k | +13% |
| gzip | 94k | 67k | +40% |
| gzip, br | 95k | 95k | parity |
v0.6.2 — H2 TLS hybrid emit selector
What's new
HTTP/2 over TLS now picks its emit path adaptively based on the in-flight body size — small responses take a low-overhead DRAIN path, large ones get amortised over a single `SSL_write_ex` via GATHER.
Hybrid emit selector (#30, #32)
Each HTTP/2 session pins a counter when it submits a response whose body exceeds 2 KiB (or whose total size is unknown — streaming). The emit pump picks per pass:
- DRAIN (counter == 0): `nghttp2_session_mem_send` into a 16 KiB stack buffer → `BIO_write` straight into the plaintext BIO → `tls_drain` encrypts. No `records[]` / `body_refs[]` allocation, no per-pass alloc churn. Wins on short responses where alloc / `zval_ptr_dtor` cost dominates.
- GATHER (counter > 0): drive nghttp2 via `session_send` + NO_COPY callbacks, fold frames into `records[]` (with `body_refs[]` keeping bodies alive), memcpy into stage[] and ship in one `SSL_write_ex`. Wins on bodies that fill at least one TLS record — amortises cipher setup; only one memcpy of the body instead of two.
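The selection rule itself is small enough to sketch in C. This is a model under stated assumptions, not the server's code: `h2_session_t`, `session_submit`, `session_complete` and `pick_emit_mode` are hypothetical names, and the point at which the counter is released (here, on response completion) is an assumption the notes above do not spell out.

```c
#include <stdbool.h>
#include <stddef.h>

#define LARGE_BODY_THRESHOLD 2048u /* 2 KiB, per the description above */

typedef enum { EMIT_DRAIN, EMIT_GATHER } emit_mode_t;

/* Hypothetical per-session state: number of in-flight "large" responses. */
typedef struct {
    unsigned large_inflight;
} h2_session_t;

/* A response counts as large when its body exceeds the threshold or its
 * total size is unknown (streaming). */
bool is_large_response(size_t body_len, bool length_known)
{
    return !length_known || body_len > LARGE_BODY_THRESHOLD;
}

/* Called on submit. Returns true if the response pinned the counter, so
 * the caller can release it when the response finishes. */
bool session_submit(h2_session_t *s, size_t body_len, bool length_known)
{
    if (is_large_response(body_len, length_known)) {
        s->large_inflight++;
        return true;
    }
    return false;
}

/* Assumed counterpart: unpin once a large response has completed. */
void session_complete(h2_session_t *s, bool pinned)
{
    if (pinned && s->large_inflight > 0) {
        s->large_inflight--;
    }
}

/* Per emit pass: DRAIN while no large body is in flight, else GATHER. */
emit_mode_t pick_emit_mode(const h2_session_t *s)
{
    return s->large_inflight == 0 ? EMIT_DRAIN : EMIT_GATHER;
}
```

A session serving only 3-byte dynamic responses stays on DRAIN for every pass; a single 16 KiB or streaming response switches it to GATHER until that response completes.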
Bench
Release PHP, h2 TLS, c=100 m=32, h2load -t 1, 10 s × N median.
| body | gather (old default) | drain | hybrid |
|---|---|---|---|
| dyn 3B | 162k | 235k | 243k |
| dyn 16K | 58k | 43k | 57k |
| dyn 64K | 18k | 11k | 18k |
| static 100B | 125k | 146k | 145k |
| static 16K | 55k | 40k | 61k |
| static 64K | 17k | 12k | 17k |
Override
Set `TRUE_ASYNC_H2_TLS_EMIT_MODE` to `drain`, `gather`, or `hybrid` (default) for A/B testing. Read once and cached.
Docs
`docs/H2_TLS_EMIT_STRATEGIES.md` walks through the three paths and the cross-over arithmetic.
Up next
Kernel TLS (kTLS) support is tracked in #31 on a separate branch.
Full diff: v0.6.1...v0.6.2
v0.6.1
Fixed
- H1 handler dispatch deferred to `on_message_complete` for buffered bodies. A TCP-fragmented request (slow client, small MTU) no longer runs the handler against a partial `$req->getBody()`; the handler always sees the complete body. Streaming handlers (`setBodyStreamingEnabled(true)`) still dispatch at headers-complete. Regression test: `h1/018-tcp-fragmentation.phpt` (5 split scenarios). Closes HttpArena `validate.sh` baseline TCP fragmentation failures.
- Request leak in the new deferred-dispatch path when a parse error fires between headers-complete and message-complete (chunked body cap hit). `parser->owns_request` is now flipped only on actual handoff.
Compatibility
- Requires TrueAsync ABI v0.15+ (unchanged from 0.6.0).
v0.6.0
Fixed
- Double-destroy in `conn_arena_free` under TLS load. A synchronous write-completion inside `tls_advance_state → tls_drain → libuv_io_write → tls_cipher_completion` could call `http_connection_destroy` and put the conn on the freelist; on stack-unwind the outer `tls_finalize_if_closing` then re-invoked destroy on freed memory, tripping `assert(arena->alive_head == conn)` and aborting the worker. Guarded by a new `conn->destroying` bit set past every defer gate.
Changed
- Asymmetric TLS BIO ring sizes: saves ~62 KiB per TLS connection (~248 MiB on 4096 conns) with no throughput impact:
  - CT-in (`network_bio` writebuf): 64 KiB → 17 KiB. Bounded by one TLS record (16 KiB) by spec.
  - PT-app back-channel (`tls_plaintext_bio_app` writebuf): 32 KiB → 17 KiB. Direction is unused.
  - CT-out (64 KiB) and PT-out (32 KiB) unchanged: hot paths for static/h2 multi-record batching.
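The quoted savings follow directly from the two ring resizes: (64 − 17) + (32 − 17) = 62 KiB per connection, and 62 KiB × 4096 = 248 MiB. A small C sketch of that arithmetic, with hypothetical helper names, in case other ring sizes or connection counts need to be plugged in:

```c
#include <stddef.h>

#define KIB ((size_t)1024)

/* Bytes saved per connection by shrinking one ring from old to new size. */
size_t ring_saving(size_t old_kib, size_t new_kib)
{
    return (old_kib - new_kib) * KIB;
}

/* Total per-connection saving for the two rings resized in this release:
 * CT-in 64 -> 17 KiB and the PT-app back-channel 32 -> 17 KiB. */
size_t per_conn_saving(void)
{
    return ring_saving(64, 17) + ring_saving(32, 17);
}

/* Aggregate saving across `conns` TLS connections, in bytes. */
size_t fleet_saving(size_t conns)
{
    return per_conn_saving() * conns;
}
```

`per_conn_saving()` evaluates to 62 KiB and `fleet_saving(4096)` to exactly 248 MiB, so the "~" in the entry is conservative rounding rather than an estimate.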
Added
- `HttpServerConfig::setBootloader(?Closure)` / `getBootloader()`: closure deep-copied into each worker, runs before its task loop. Requires TrueAsync ABI v0.15+. Test: `tests/phpt/server/core/021-bootloader.phpt`.
Known issues
- #29 — TLS write deadlock when CT-out BIO ring shrinks below the response body. Latent at the default 64 KiB; do not lower until fixed.
Also in this release
Windows MSVC build fixes from #28.
v0.5.1 — Win32 build fixes
Win32-only release. Linux / macOS behaviour is unchanged from v0.5.0.
Fixed
- Windows / MSVC build restored after the streaming request body merge (PR #27).
  - CMake: add `src/http_body_stream.c` and the HTTP/2 sources to the Win32 source list; guard TLS-only sources on `OpenSSL_FOUND`.
  - Unit tests: stop letting PHP's `win32/unistd.h` / `win32/time.h` shadow the CRT system headers; add the four sources that `http_parser.c` and `multipart_processor.c` now depend on (`http_body_stream.c`, `core/body_pool.c`, `http_rfc5987.c`, `http_param_parse.c`); add a lightweight compression-vtable stub for `test_compression_negotiate`; prepend `PHP_DLL_DIR` and `CMOCKA_DLL_DIR` to PATH for every CTest target so DLL loading no longer fails with `0xc0000135`.
See CHANGELOG.md for full history.