Replace httpc with vendored Mint HTTP/1+HTTP/2 client#1139
Draft
Adds scripts/vendor_mint.sh which vendors Mint and its dependency HPAX into lib/hex/mint/ with Hex.Mint.* module prefixes. The vendored code runs outside the :mint application so persistent_term keys are namespaced under :hex_mint and no application env is accessed.
Replaces :httpc with the vendored Hex.Mint.HTTP as the transport for
the :mix_hex_http adapter. Architecture follows Finch's protocol-split
strategy, adapted for Hex's no-deps constraint:
* Hex.HTTP.Pool - router + supervisor. Owns a Registry and a
DynamicSupervisor, plus an ETS table caching the ALPN-negotiated
protocol per {scheme, host, port}. First request for a host probes
with `protocols: [:http1, :http2]` and installs the result in ETS.
* Hex.HTTP.Pool.HTTP1 - pool of single-connection Worker GenServers
with checkout/checkin semantics. Default 8 workers per host
(matches former :httpc max_sessions). Idle workers auto-close
after 30s.
* Hex.HTTP.Pool.HTTP1.Worker - owns one Mint HTTP/1 connection,
handles one in-flight request at a time, translates Mint's async
message stream into a synchronous GenServer.call reply.
* Hex.HTTP.Pool.HTTP2 - :gen_statem per host (default 1 connection,
multiplexes many requests). States: :disconnected, :connecting,
:connected, :connected_read_only. Exponential backoff reconnect.
Server GOAWAY transitions to :connected_read_only so the router
can spin up a fresh connection while in-flight streams drain.
Hex.HTTP keeps its role as the :mix_hex_http behaviour and retains:
retry (with IPv4/IPv6 fallback on :nxdomain/:ehostunreach/etc),
redirect following, timeout wrapper, netrc auth, gzip decompression,
progress-callback body chunking, x-hex-message handling. Error
patterns now match %Hex.Mint.TransportError{}. Proxy config is
emitted as a per-connect Mint `:proxy` option instead of mutating
global :httpc state.
Application changes:
* Hex.HTTP.Pool added to supervision tree (before Hex.Server and
the fetcher pool).
* start_httpc/0 removed; :inets dropped from extra_applications.
* :httpc_profile removed from Hex.State.
* test/support/hexpm.ex wait_on_start uses :gen_tcp directly.
* test/hex/http_test.exs setup no longer calls :httpc.set_options;
proxy_config tests adapted to Mint :proxy tuple shape.
https://claude.ai/code/session_01DdkbbnQEW9WygTsj2Dq5KS
The previous pool design split HTTP/1 and HTTP/2 into separate modules
and probed ALPN in the caller process. That caller opened the socket,
then tried to hand it off to a worker via `controlling_process/2` — which
fails with `:not_owner` when called from anywhere except the current
socket owner. The "fix" in HTTP1.Worker.init/1 (calling
`controlling_process(conn, self())` from the worker) could never have
worked at runtime; the HTTP/2 path had the same latent bug.
Replaced the HTTP1/HTTP1.Worker/HTTP2 modules with a unified design:
* Hex.HTTP.Pool.Host (one GenServer per {scheme, host, port}) owns a
pool of Conn processes. On start it spawns two probe Conns; when the
first reports its protocol it scales up to 8 for :http1 or stays at
2 for :http2. Dispatches by least in-flight to the Conn with free
capacity; queues waiters otherwise.
* Hex.HTTP.Pool.Conn (one GenServer per Mint connection) connects
inside its own process, so the socket is owned from birth — no
controlling_process handoff ever needed. Reports negotiated protocol
and per-conn capacity (1 for HTTP/1, server's max_concurrent_streams
for HTTP/2) back to its host. Requests arrive as casts carrying the
caller's `from`; Conn replies directly to the caller and casts back
:req_done so the host decrements in-flight. Handles GOAWAY by
draining in-flight then reconnecting with exponential backoff.
* Hex.HTTP.Pool now just wires up the Registry + DynamicSupervisor and
routes requests to the Host for a given {scheme, host, port}. No
more probe phase, no ETS cache, no initial_conn plumbing.
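The least-in-flight dispatch described above can be illustrated with a pure selection function. This is a minimal sketch: the `{pid, in_flight, capacity}` tuple shape and the `DispatchSketch` module are assumptions for illustration, not the actual Host state layout.

```elixir
defmodule DispatchSketch do
  # Pick the conn with the fewest in-flight requests that still has
  # free capacity; return :queue when every conn is saturated.
  # Capacity is 1 for HTTP/1, the server's max_concurrent_streams
  # for HTTP/2 (hypothetical shape: {pid, in_flight, capacity}).
  def pick(conns) do
    case Enum.filter(conns, fn {_pid, in_flight, cap} -> in_flight < cap end) do
      [] ->
        :queue

      free ->
        {pid, _in_flight, _cap} =
          Enum.min_by(free, fn {_pid, in_flight, _cap} -> in_flight end)

        {:ok, pid}
    end
  end
end
```

Queued callers would then be drained in the same way as each `:req_done` cast decrements a conn's in-flight count.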
Also fixes 1xx informational response handling in the vendored Mint
HTTP/1 module, which previously emitted `:done` and popped the request
after a 100 Continue / 103 Early Hints etc., causing the real final
response to arrive with no active request and the connection to close.
Upstream fix submitted as elixir-mint/mint#479; the vendored copy marks
the change with `# HEX PATCH` comments for easy re-application after
re-vendoring.
Conn.process_response resets accumulated headers/data on each new
`:status` so informational headers (e.g. 103 Early Hints `link:`) don't
bleed into the final response. Added bypass-backed regression tests for
100 Continue round-trip, 100 Continue with early error response, and
103 Early Hints with headers. test/mix/tasks/hex.registry_test.exs now
uses `Hex.HTTP.config/0` instead of the raw httpc-defaulting config.
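The reset-on-`:status` behaviour can be sketched as a reducer over Mint-style response events. The accumulator shape and module name here are illustrative assumptions, not the actual `Conn` state:

```elixir
defmodule ResponseAcc do
  @empty %{status: nil, headers: [], data: []}

  def reduce(events), do: Enum.reduce(events, @empty, &step/2)

  # A new :status (e.g. the final response arriving after a
  # 100 Continue / 103 Early Hints) discards anything accumulated so
  # far, so informational headers never bleed into the final response.
  defp step({:status, code}, _acc), do: %{@empty | status: code}
  defp step({:headers, hs}, acc), do: %{acc | headers: acc.headers ++ hs}
  defp step({:data, chunk}, acc), do: %{acc | data: acc.data ++ [chunk]}
  defp step(:done, acc), do: acc
end
```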
Stream request bodies:
`mix hex.publish` shows an upload progress bar by wrapping the tarball
in a producer function and passing a `progress_callback`. Previously
`Hex.HTTP.build_mint_request` collected the full stream into a binary
before sending, so the callback fired at gigabyte-per-second (reading
from memory) and then the actual upload ran silently with no progress
shown. Now `Hex.HTTP` passes `{:stream, fun, offset}` through to
`Pool.Conn`, which calls `Mint.HTTP.stream_request_body/3` chunk by
chunk — the callback fires as bytes actually go out on the wire.
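The chunk-by-chunk loop can be sketched with the wire send abstracted out as a function argument. The producer contract (`producer.(offset)` returning `{:ok, chunk, next_offset} | :eof`) follows the description above, but the module and function names are assumptions; in the real code the send step is `Mint.HTTP.stream_request_body/3`.

```elixir
defmodule StreamBodySketch do
  # `send_chunk.(chunk)` stands in for Mint.HTTP.stream_request_body/3;
  # `progress.(bytes_sent)` is the upload progress callback. The
  # callback fires only after each chunk is actually handed to the
  # transport, not when the producer is read.
  def stream({:stream, producer, offset}, send_chunk, progress, sent \\ 0) do
    case producer.(offset) do
      :eof ->
        {:ok, sent}

      {:ok, chunk, next_offset} ->
        :ok = send_chunk.(chunk)
        sent = sent + byte_size(chunk)
        progress.(sent)
        stream({:stream, producer, next_offset}, send_chunk, progress, sent)
    end
  end
end
```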
IPv4/IPv6 fallback:
`Hex.HTTP.retry` swaps `:inet`/`:inet6` in `connect_opts` on
`:nxdomain`/`:ehostunreach` to recover from hosts that resolve on only
one IP family. The pool was keyed on `{scheme, host, port}` and reused
the first request's `connect_opts` forever, so the swapped opts never
reached a new connect. Include the inet variant in the Host registry
key so the fallback retry actually gets its own pool with the opposite
inet flags.
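The registry-key change amounts to deriving an inet-family component from the connect options and folding it into the key. A sketch, assuming a `transport_opts` keyword and a hypothetical `inet_family/1` helper; the real key derivation may differ:

```elixir
defmodule PoolKeySketch do
  # Include the inet family in the Host registry key so an IPv6
  # fallback retry gets a fresh Host (and fresh connects) instead of
  # being routed back to the pool built with the IPv4 opts.
  def host_key(scheme, host, port, connect_opts) do
    {scheme, host, port, inet_family(connect_opts)}
  end

  defp inet_family(connect_opts) do
    transport = Keyword.get(connect_opts, :transport_opts, [])

    cond do
      :inet6 in transport -> :inet6
      :inet in transport -> :inet
      true -> :default
    end
  end
end
```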
Vendor script:
Add a comment block at the top of `scripts/vendor_mint.sh` listing the
upstream Mint PRs that must be present in any source tree used for
re-vendoring (#478 Elixir ~> 1.12 support, already applied; #479 1xx
informational handling, still open), with links to the PRs and to the
fork branches if re-vendoring before merge.
Network tests:
Exercises the full Mint pipeline end-to-end: TLS handshake, certificate verification, ALPN protocol negotiation, the connection pool, and response decoding. Both hosts currently serve HTTP/2, so these also cover the HTTP/2 statem path that bypass-based tests don't reach (bypass speaks HTTP/1 only).
Six tests:
* GET /api/packages/phoenix on hex.pm
* GET /names and GET /packages/phoenix on repo.hex.pm
* ALPN-protocol assertions for both hosts (Host's stored protocol is `:http1` or `:http2`)
* pool reuse confirming a second request to the same host hits the same Host pid in the registry
Tagged `:network` so the suite can opt out with `mix test --exclude network` in offline environments, but the tag is not in the default exclude list, so plain `mix test` runs them.
Proxy routing:
Added a bypass-free test that spins up a bare TCP listener acting as an
HTTP proxy, configures `:http_proxy` with credentials, and asserts the
request arrives with the absolute-URI request line and a
`proxy-authorization: Basic ...` header. Bypass (Cowboy) rejects
absolute-URI requests with 400, so a bypass-based test isn't possible —
the tiny TCP proxy accepts in a loop to handle both probe connections
the pool opens.
Related fix: `proxy_connect_opts` now passes the proxy host as a binary
rather than a charlist. Mint's `Core.Util.hostname/2` only accepts
binaries when deriving a hostname from the address, so a charlist proxy
address raised an ArgumentError before any bytes went out. Updated the
corresponding `proxy_config` assertion to match the new shape.
Pool coverage:
`test/hex/http/pool_test.exs` adds three tests for the unified pool:
* Different inet variants (`{:inet4, true, :inet6, false}` vs the
reverse) get separate Host registrations under distinct registry
keys — validates the fix that made IPv4↔IPv6 retry fallback
actually take effect.
* Killing a `Conn` process causes `Host` (trap_exit) to spawn a
replacement and the pool remains functional for subsequent
requests — exercises the supervision path.
* Saturating the HTTP/1 pool with more concurrent requests than
`@http1_size` (8) queues the excess callers and drains them as
in-flight requests finish.
Benchmarked `rm -rf ~/.hex/cache.ets ~/.hex/packages deps; mix deps.get` on a real project, 10 runs each, total time:
* httpc (main): 29.2s (baseline)
* Mint, HTTP/1+HTTP/2, 2 probes: 32.7s (+12%)
* Mint, HTTP/2, 1 probe: 42.3s (+45%)
* Mint, HTTP/2, 8 probes: 30.5s (+4%)
* Mint, HTTP/1, 2→8 pool: 27.8s (-5%, fastest)
HTTP/2 consistently loses for hex's workload (many small tarball GETs). The multiplexing win doesn't offset Mint's per-stream HPACK/frame processing cost in pure Elixir. Fewer Conn processes also means fewer BEAM scheduler slots working in parallel on response parsing — HTTP/2 with 8 probe conns nearly matches HTTP/1 with 8 conns, which confirms the bottleneck is the single-Conn-process serialization of response events rather than anything on the wire.
HTTP/1 with 8 conns beats httpc by ~5% and matches the historical httpc behaviour (which was HTTP/1 only via `:inets`), so no semantic change for any caller. `Conn.do_connect` pins `protocols: [:http1]`. The HTTP/2 capacity logic is kept in place so a caller that overrides `:protocols` in `:connect_opts` still works; it's just not the default.
HTTP/2 defines a spec-mandated 64 KB initial connection-level receive window that cannot be changed via SETTINGS (only the per-stream window is settable that way). For bulk downloads like hex's tarballs — multi-MB bodies across many parallel streams sharing one connection — this window fills in milliseconds, and then every subsequent ~64 KB of data has to wait a full RTT for Mint's auto WINDOW_UPDATE to reach the server, capping throughput far below the TCP rate.
Added a `:connection_window_size` option to `Hex.Mint.HTTP2.initiate/5` that, when greater than the default 65_535, piggybacks a WINDOW_UPDATE frame on the client preface bumping the connection window up to the caller's target. `Conn.do_connect` now sets it (and the per-stream `initial_window_size`) to 8 MB so HTTP/2 doesn't stall mid-stream if a caller opts into HTTP/2 via `:connect_opts`.
Benchmarked `rm -rf ~/.hex/cache.ets ~/.hex/packages deps; mix deps.get`, 10 runs each, against a real hexpm project:
* HTTP/2, default windows, 2 probes: 32.7s (+12% vs httpc)
* HTTP/2, stream window 8 MB only: 31.9s
* HTTP/2, both windows 8 MB: 29.2s (= httpc)
* HTTP/1, default: 27.8s (-5% vs httpc, fastest)
Conn still pins `protocols: [:http1]` by default — HTTP/1 remains ~5% faster than HTTP/2 for hex's many-small-GETs workload because Mint's pure-Elixir HPACK and frame parsing costs more per request than HTTP/1 line parsing, and multiplexing doesn't recover that cost. But the patch plus the configured window options mean HTTP/2 is now at parity with httpc rather than a regression, so callers who need HTTP/2 (e.g. org repos that only speak it) don't take a throughput hit.
Marked with `HEX PATCH` in `lib/hex/mint/http2.ex` and documented in `scripts/vendor_mint.sh` alongside the other pending upstream items. TODO: upstream as a Mint PR.
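The stall mechanism is simple arithmetic: when the sender must wait one RTT for each window refill, throughput on a flow-controlled connection is capped at roughly window / RTT regardless of link speed. A small illustration (module name is hypothetical):

```elixir
defmodule WindowMath do
  # Max bytes/s a flow-controlled flow can sustain when each window's
  # worth of data requires one RTT before the refill arrives.
  def throughput_cap(window_bytes, rtt_ms), do: window_bytes * 1000 / rtt_ms
end

# 64 KB window at 50 ms RTT  -> ~1.3 MB/s, far below typical link rates.
# 8 MB window at 50 ms RTT   -> ~160 MB/s; flow control is no longer
# the bottleneck for multi-MB tarball bodies.
```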
Earlier commit defaulted to HTTP/1 based on measurements that showed HTTP/2 trailing HTTP/1 by ~5% in wall time. Re-running the benchmark on a quieter network revealed that gap was within noise — 28.0s–30.5s across a handful of 10-run cycles, with no systematic winner.
Comparing CPU use with `/usr/bin/time -lp` on a single `mix deps.get`:
* HTTP/1: real 3.11s, user 1.65s, sys 1.79s (CPU 3.44s)
* HTTP/2: real 2.62s, user 1.58s, sys 1.62s (CPU 3.20s)
HTTP/2 actually uses slightly *less* CPU (fewer TLS handshakes) and is equivalent or better on wall time. The earlier hypothesis that Mint's pure-Elixir HPACK/frame parsing costs enough to make HTTP/2 slower was wrong — the total CPU budget is dominated by TLS handshakes, tarball unpacking (zlib + crypto), and file I/O, not HTTP parsing.
Flip `do_connect` back to `protocols: [:http1, :http2]` so ALPN negotiates. The 8 MB window tuning stays — it's what makes HTTP/2 viable (default 64 KB windows stall multi-MB tarball downloads).
Refactored the Mint HTTP/2 connection-window patch from a narrow
`:connection_window_size` connect option into a proper public API,
`Mint.HTTP2.set_window_size(conn, target, new_size)`, that supports
both `:connection` and `{:request, ref}` and can be called at any point
after connect. Tracks the receive window in a new `receive_window_size`
field (connection and stream); grow-only; validated to `1..2^31-1`.
This is the function shape that fills a longstanding, well-known gap in
Mint's public API — upstream issue #357 (2022, closed) asked for exactly
this, #432 (2024, still open) is a related enhancement. Ready to submit
upstream as a PR.
On the hex side:
* Re-vendor from the integration branch (ericmj/hex-vendor-integration)
which has PR #478 (Elixir 1.12), PR #479 (HTTP/1 1xx handling) and
the new set_window_size/3 commit all stacked.
* `Conn.do_connect` now calls `Hex.Mint.HTTP2.set_window_size(conn,
:connection, 8_000_000)` immediately after a successful HTTP/2 connect.
No-op on HTTP/1. TCP ordering guarantees the WINDOW_UPDATE reaches the
server before any request HEADERS, so there's no extra RTT.
* `client_settings: [initial_window_size: 8_000_000]` still handles the
per-stream initial window via SETTINGS.
* Update `scripts/vendor_mint.sh` comment block to point at the
integration branch and list all three upstream Mint patches with
branch links.
Absorbs two new upstream Mint commits on top of the existing vendored
stack:
* `Raise default HTTP/2 receive windows` — bumps Mint's defaults from
the spec-mandated 64 KB to 4 MB per stream and 16 MB per connection.
At typical RTTs (10-150 ms) 64 KB caps throughput far below link
speed; 4/16 MB unlocks ~40 MB/s transcontinental and stream < conn
lets ~4 parallel streams run at full rate before the conn pool binds.
* `Batch HTTP/2 receive-window refills` — gates `WINDOW_UPDATE` on a
configurable threshold (`:receive_window_update_threshold`, default
160_000, ~10× the default max frame size) instead of refilling on
every DATA frame. Mitigates the amplification-DoS shape where a
malicious server sends many tiny DATA frames to force many
WINDOW_UPDATE responses.
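The batching commit's gating logic reduces to a threshold check on the consumed-but-unrefilled window. A simplified sketch of that idea, assuming the counter is tracked per window as a plain integer (module and function names are hypothetical):

```elixir
defmodule RefillSketch do
  # Matches the described default :receive_window_update_threshold.
  @threshold 160_000

  # Called after each DATA frame: only emit a WINDOW_UPDATE once the
  # consumed window crosses the threshold, instead of refilling on
  # every frame — so a server sending many tiny DATA frames can no
  # longer force one WINDOW_UPDATE per frame.
  def on_data(consumed, frame_size, threshold \\ @threshold) do
    consumed = consumed + frame_size

    if consumed >= threshold do
      # Refill by the consumed amount and reset the counter.
      {:send_window_update, consumed, 0}
    else
      {:no_update, consumed}
    end
  end
end
```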
On the hex side this means the explicit window tuning that the previous
re-vendor added to `Conn.do_connect` (setting `client_settings:
[initial_window_size: 8_000_000]` and calling `set_window_size(conn,
:connection, 8_000_000)`) is now redundant — Mint's 4/16 MB defaults
already cover hex's bulk-tarball workload. Drop the tuning block and the
`maybe_bump_connection_window/1` helper; `do_connect` now just sets
`protocols: [:http1, :http2]` and lets Mint's defaults handle the rest.
Also:
* Exclude the vendored `lib/hex/mint/**` tree from `.formatter.exs` so
`mix format --check-formatted` doesn't fight upstream formatting
every re-vendor.
* Format drift in `lib/hex/http.ex` and `lib/hex/http/pool.ex` that the
vendored formatter exclusion surfaced.
* Update `scripts/vendor_mint.sh` comment block to list the new
larger-default-windows branch alongside the other pending upstream
items.
The initial rebase resolution for `request_to_file/6` (from hex_core
0.15.0) just called the in-memory `Pool.request/5` and then did
`File.write/2` on the full binary. For tarballs that can be tens of MBs
this pays for the body twice: once in the `Conn` accumulator and once on
disk.
Add a real streaming path through the pool:
* `Pool.request_to_file/6` and `Host.request_to_file/7` — same dispatch
as `request/5` but carry the target filename.
* `Conn` opens the file in `[:write, :raw, :binary]` mode before
issuing the Mint request and writes each `:data` chunk straight to
disk. On `:done` the file is closed and the caller gets
`{:ok, status, headers, nil}`.
* On `{:status, ref, _}` mid-request (1xx → final, or a follow-up
response on the same ref) the sink is rewound + truncated so a 1xx
body never bleeds into the final payload.
* On transport `:error` or a write error the partial file is removed
so callers never see a half-written payload.
Redirects keep working as-is: each `do_request` opens the file fresh
with `[:write]` mode which truncates, so a 3xx response body is
overwritten by the follow.
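The sink operations above can be sketched with plain `File`/`:file` calls. This is illustrative only; the real `Conn` drives these from Mint response events, and the module name is hypothetical:

```elixir
defmodule FileSinkSketch do
  # Open the sink before issuing the request; :raw skips the IO server
  # for fast writes, :write truncates any existing file.
  def open(path), do: File.open(path, [:write, :raw, :binary])

  # Each :data chunk goes straight to disk, never into memory.
  def write_chunk(io, chunk), do: :file.write(io, chunk)

  # A fresh :status mid-request (1xx -> final response) rewinds and
  # truncates, so an informational body never bleeds into the payload.
  def reset(io) do
    {:ok, 0} = :file.position(io, 0)
    :file.truncate(io)
  end
end
```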
Added two pool tests: the 5 MB streaming-without-buffering test and a
redirect-truncation test that asserts a `302 → 200` chain leaves only
the final body on disk.
Replaces `:httpc` with a vendored copy of Mint + HPAX as the transport layer behind Hex's `hex_core` HTTP adapter. `:httpc` does not speak HTTP/2 (needed for `*.hexorgs.pm` private docs/packages), has long-standing connection-reuse quirks, and its error model forces Hex to paper over transport failures instead of handling them structurally.
What this does
* Vendors Mint + HPAX into `lib/hex/mint/` under a `Hex.Mint.*` module prefix (so we don't collide if a user's project also depends on Mint). A `scripts/vendor_mint.sh` helper handles the re-vendor (module renaming, `@moduledoc` stripping, version string inlining, Erlang shim renaming, `{:mint, …}` → `{:hex_mint, …}` tag renaming). Vendored tree is excluded from `.formatter.exs`.
* Adds a `hex_core` HTTP adapter (`Hex.HTTP.Pool`) that owns a two-level pool of GenServers: one `Host` per `(scheme, host, port, inet-family)` that dispatches requests to a set of `Conn` workers. Each `Conn` owns a single Mint connection; ALPN negotiates HTTP/1 vs HTTP/2 per connect and capacity is computed from the negotiated protocol (1 for HTTP/1, `SETTINGS_MAX_CONCURRENT_STREAMS` for HTTP/2).
* Treats `GOAWAY` as a drain: the `Conn` stops accepting new requests, finishes in-flight, then reconnects with capped backoff. Crashes in the connection process are caught and in-flight callers get `{:error, reason}` rather than hanging.
* Streams request bodies via `stream_request_body/3` so large package uploads don't buffer in memory.
* Fixes Mint's 1xx-informational bug (100 Continue / 103 Early Hints terminating the real response early) by resetting accumulated state on each new `:status` event.
* Adds network tests against `hex.pm` and `repo.hex.pm` so ALPN, TLS, and the `*.hexorgs.pm` HTTP/2-only endpoints are covered end-to-end.
* Supports `http_proxy` and `https_proxy` via Mint's `UnsafeProxy`/`TunnelProxy` transports, keyed into the pool correctly.
HTTP/2 by default
After benchmarking `rm -rf ~/.hex/{cache.ets,packages} deps; mix deps.get` against a real hexpm project over 10 runs, HTTP/2 is at parity with or better than `:httpc` (baseline), and uses slightly less CPU (fewer TLS handshakes). `Conn.do_connect` now sets `protocols: [:http1, :http2]` and lets ALPN decide.
Upstream Mint dependencies
The vendored tree includes four Mint changes, all filed upstream:
* PR #478 (Elixir ~> 1.12 support)
* PR #479 (HTTP/1 1xx informational handling)
* `Mint.HTTP2.set_window_size/3` — grow-only public API to raise connection or stream receive windows after connect, filling a longstanding gap (upstream issues #357, #432)
* Raised default HTTP/2 receive windows (4 MB per stream, 16 MB per connection) plus `WINDOW_UPDATE` batching (mitigates the amplification-DoS shape of refilling on every DATA frame). With these defaults Hex no longer needs to tune windows itself.
The integration branch that stacks all four is `elixir-mint/mint:ericmj/hex-vendor-integration`; `scripts/vendor_mint.sh` points at it and lists each PR/branch in its comment header.