Add DNS cache to deduplicate upstream lookups#610
Add DNS cache to deduplicate upstream lookups#610clemlesne wants to merge 1 commit intocontainers:mainfrom
Conversation
Introduce pkg/dnscache, a thread-safe LRU cache (map + doubly-linked list) with per-entry TTL and lazy expiry. The DNS handler now checks the cache before querying upstream for Type A lookups, and populates it on cache miss. This deduplicates repeated upstream lookups for clients without their own cache. The cache is created in addServices() with a 5 s TTL and 10 k entry capacity (~2.5 MB at full load, verified by a memory-usage test). Passing nil disables caching, keeping all existing call sites and tests unchanged in behaviour. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: clemlesne The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Have you considered using https://github.com/rs/dnscache which you mention in your PR description? Just curious, not saying you should change. You will also need to add a DCO to your commits https://github.com/containers/gvisor-tap-vsock/pull/610/checks?check_run_id=63405935692 |
Well this implementation is really simple, why use a lib? A lib inherently add risks. Don’t see the added values there. I would rather to strengthen tests if you want. What’s your opinion?
I will! |
Summary
pkg/dnscache, a thread-safe LRU cache (map + doubly-linked list) with per-entry TTL (5 s) and lazy expiry, capped at 10k entries (~2.5 MB at full load)addServices()and passed through todns.New(); passingnildisables cachingWhy
Go's
net.Resolverdoes not cache DNS lookups. EachLookupIPAddr()call goes through to the system resolver.On hosts with a local DNS cache daemon —
systemd-resolved(up to 4096 entries, honors upstream TTL up to 2 hours), macOSmDNSResponder(unlimited active entries, min 15 s TTL) — repeated lookups are already cheap. Our cache adds no meaningful value there.On hosts without a local cache daemon — glibc's stub resolver (minimal containers, Alpine, some RHEL/Debian setups) — every
LookupIPAddr()is a real network round trip to the upstream DNS server. Since gvisor-tap-vsock acts as the DNS server for VMs, bursts of repeated queries for the same domain (TLS handshakes, HTTP redirects, retries) each hit upstream. This is where the cache helps.Relationship with #609 (SNI-based outbound filtering)
This cache becomes more valuable when coupled with #609. The TCP forwarder's SNI allowlist check needs to correlate a TLS ClientHello hostname with previously resolved IPs. Without this cache, that would require a fresh DNS lookup on every TCP connection. With it, the forwarder calls
cache.Get()to retrieve the already-resolved IPs — no extra round trip per connection.Prior art
Application-level DNS caching with short TTLs is a common pattern when the runtime doesn't cache:
TCPConnectorenables DNS caching by default with a 10 s TTLDesign decisions
niland behave identically to beforeWhat is NOT included
cache.Get()for SNI allowlist checking) — wired in feat: add outbound filtering with SNI-based allowlist and security hardening #609Test plan
go build ./cmd/gvproxy/go test ./pkg/dnscache/ -count=1 -v(24 tests: correctness, TTL, LRU, concurrency, security, memory usage)go test ./pkg/services/dns/ -count=1 -v(all existing tests pass withnilcache)go vet ./pkg/...🤖 Generated with Claude Code