12 changes: 12 additions & 0 deletions dns-strict-resolver/Dockerfile
@@ -0,0 +1,12 @@
FROM golang:1.22-alpine AS builder
WORKDIR /src
COPY go.mod ./
RUN go mod download
COPY main.go ./
RUN CGO_ENABLED=0 go build -o /out/dns-strict-resolver .

FROM alpine:3.20
RUN apk add --no-cache ca-certificates
COPY --from=builder /out/dns-strict-resolver /usr/local/bin/dns-strict-resolver
EXPOSE 8086
ENTRYPOINT ["/usr/local/bin/dns-strict-resolver"]
75 changes: 75 additions & 0 deletions dns-strict-resolver/README.md
@@ -0,0 +1,75 @@
# dns-strict-resolver

Minimal Go HTTP server that exercises the **unconnected-UDP + RFC 5452
strict-source-validation** DNS client path. Used by Keploy's e2e CI as a
regression guard for the `cgroup/recvmsg{4,6}` SNAT fix.

- Tracking issue: https://github.com/keploy/keploy/issues/4092
- Keploy fix: https://github.com/keploy/keploy/pull/4093
- eBPF fix: https://github.com/keploy/ebpf/pull/97

## Why a raw UDP client?

`net.LookupHost` on glibc (cgo) uses connected UDP most of the time, and
connected-UDP clients are rescued by Keploy's existing
`cgroup/getpeername4` hook — so they never exposed this bug. The
production failure mode (`java.net.UnknownHostException: Temporary
failure in name resolution` / `EAI_AGAIN`) only surfaces on the
unconnected-UDP path, where the client is responsible for validating the
reply's source address itself.

This sample sends DNS A queries over **unconnected** UDP sockets, reads
replies with `ReadFromUDP`, and **discards any reply whose source does
not match the nameserver it queried**. The `/suite` endpoint also runs a
connected-UDP control and a same-socket multi-upstream check against
fixture-only `*.keploy.test` records, so the sample catches the broader
bug class: missing reply-source SNAT, broken transaction-id handling,
fixture DNS drift, and original-destination mixups when one socket talks
to more than one nameserver.

## Running

```bash
go run . &
curl -sS "http://localhost:8086/resolve?domain=google.com"
```

Expected shape (post-fix):
```json
{
"domain": "google.com",
"nameserver": "127.0.0.11:53",
"rcode": 0,
"ips": ["142.250.x.x", "..."],
"source_mismatches": 0,
"attempts": 1,
"elapsed_ms": 4
}
```

Under the **buggy** (pre-fix) Keploy, replies arrive from
`<agent_ip>:<keploy_dns_port>` instead of the configured nameserver, the
source check rejects them, and `/resolve` eventually returns HTTP 502
with a non-zero `source_mismatches` counter and no answers.

## Under Keploy

```bash
sudo -E env PATH=$PATH keploy record -c "./dns-strict-resolver"
# hit /resolve endpoints, then stop keploy

sudo -E env PATH=$PATH keploy test -c "./dns-strict-resolver" --delay 10
```

Review comment on lines +57 to +62 (Copilot AI, Apr 21, 2026):

The “Under Keploy” section runs `keploy record -c "./dns-strict-resolver"` / `keploy test -c "./dns-strict-resolver"` but doesn’t mention building the binary first. As written, these commands will fail unless the user has already run `go build .` (producing `./dns-strict-resolver`). Consider adding an explicit build step (or use `-c "go run ."`).

Both record and test must complete with `source_mismatches: 0` and a
non-empty `ips` list for the sample to pass. CI should prefer `/suite`
over one-off `/resolve` calls because it exercises the full regression
surface in one recorded request.
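
The pass criterion above could be checked mechanically along these lines. This is a sketch only: the struct mirrors the field names from the example `/resolve` response in this README, while the type `resolveResult` and the function `checkResolve` are hypothetical names, not part of the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resolveResult mirrors the JSON shape shown under "Running".
// The Go type itself is an assumption made for this sketch.
type resolveResult struct {
	Domain           string   `json:"domain"`
	Nameserver       string   `json:"nameserver"`
	Rcode            int      `json:"rcode"`
	IPs              []string `json:"ips"`
	SourceMismatches int      `json:"source_mismatches"`
	Attempts         int      `json:"attempts"`
	ElapsedMs        int64    `json:"elapsed_ms"`
}

// checkResolve returns nil only when the body reports zero source
// mismatches and at least one resolved IP.
func checkResolve(body []byte) error {
	var r resolveResult
	if err := json.Unmarshal(body, &r); err != nil {
		return fmt.Errorf("decode /resolve body: %w", err)
	}
	if r.SourceMismatches != 0 {
		return fmt.Errorf("%s: %d replies came from an unexpected source", r.Domain, r.SourceMismatches)
	}
	if len(r.IPs) == 0 {
		return fmt.Errorf("%s: no A records returned", r.Domain)
	}
	return nil
}
```

A CI script might run such a check on the body returned in both record and test mode and fail the job on the first non-nil error.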

## Endpoints

| Path | Description |
| --- | --- |
| `GET /health` | Liveness probe used by the CI script. |
| `GET /resolve?domain=<d>&nameserver=<ip:53>` | Single strict unconnected-UDP A-record lookup. `domain` defaults to `google.com`; `nameserver` defaults to the first entry in `/etc/resolv.conf`. |
| `GET /suite?nameserver=<ip:53>&secondary_nameserver=<ip:53>&fixture=1` | Full regression suite: strict unconnected lookups for all fixture domains, connected-UDP control, and optional same-socket multi-upstream validation. `fixture=1` also asserts the bundled CoreDNS fixture IPs. |
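
For callers that drive `/suite` from Go rather than from a shell script, the query string could be assembled roughly like this. The helper name `suiteURL` is hypothetical, and the rule "enable `fixture=1` only when a nameserver is given" is an assumption modelled on the traffic script in this change, not a requirement of the endpoint.

```go
package main

import "net/url"

// suiteURL builds a /suite request URL from the parameters listed in the
// endpoints table. Empty nameserver arguments are omitted from the query.
func suiteURL(base, nameserver, secondary string) string {
	q := url.Values{}
	q.Set("fixture", "0")
	if nameserver != "" {
		q.Set("nameserver", nameserver)
		q.Set("fixture", "1") // assumption: fixture assertions need a known nameserver
	}
	if secondary != "" {
		q.Set("secondary_nameserver", secondary)
	}
	// Encode sorts keys and percent-escapes values such as the ':' in ip:port.
	return base + "/suite?" + q.Encode()
}
```

Using `url.Values` means an `ip:port` value is escaped as `ip%3Aport`, which a standard Go HTTP handler decodes back transparently.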
5 changes: 5 additions & 0 deletions dns-strict-resolver/coredns-secondary/Corefile
@@ -0,0 +1,5 @@
. {
    file /etc/coredns/zone
    log
    errors
}
12 changes: 12 additions & 0 deletions dns-strict-resolver/coredns-secondary/zone
@@ -0,0 +1,12 @@
$ORIGIN .
$TTL 300
@ IN SOA ns2.keploy.test. admin.keploy.test. (2026042401 3600 600 86400 300)
  IN NS ns2.keploy.test.
ns2.keploy.test. IN A 172.30.0.11

google.com. IN A 142.250.80.46
cloudflare.com. IN A 104.16.132.229
example.com. IN A 93.184.215.14
alpha.keploy.test. IN A 10.42.0.11
beta.keploy.test. IN A 10.42.0.12
gamma.keploy.test. IN A 10.42.0.13
5 changes: 5 additions & 0 deletions dns-strict-resolver/coredns/Corefile
@@ -0,0 +1,5 @@
. {
    file /etc/coredns/zone
    log
    errors
}
12 changes: 12 additions & 0 deletions dns-strict-resolver/coredns/zone
@@ -0,0 +1,12 @@
$ORIGIN .
$TTL 300
@ IN SOA ns.keploy.test. admin.keploy.test. (2026042201 3600 600 86400 300)
  IN NS ns.keploy.test.
ns.keploy.test. IN A 172.30.0.10

google.com. IN A 142.250.80.46
cloudflare.com. IN A 104.16.132.229
example.com. IN A 93.184.215.14
alpha.keploy.test. IN A 10.42.0.11
beta.keploy.test. IN A 10.42.0.12
gamma.keploy.test. IN A 10.42.0.13
45 changes: 45 additions & 0 deletions dns-strict-resolver/curl.sh
@@ -0,0 +1,45 @@
#!/bin/bash
# Traffic generation for the dns-strict-resolver E2E test.
# Exercises the unconnected-UDP + RFC 5452 strict-source-validation path
# that surfaces keploy/keploy#4092.
#
# NAMESERVER (optional): ip:port of the DNS server to query explicitly.
# When set, it is passed through to /resolve so the sample does not
# have to rely on /etc/resolv.conf (which, under keploy's docker-mode
# --network=container:<keploy-v3> rewrite, may not be a useful
# address). The CI harness sets it to the fixture CoreDNS container.
#
# SECONDARY_NAMESERVER (optional): second ip:port used by /suite for the
# same-socket multi-upstream check.

set -euo pipefail

BASE="http://localhost:8086"
NS_QUERY=""
FIXTURE_QUERY="fixture=0"
if [[ -n "${NAMESERVER:-}" ]]; then
    NS_QUERY="&nameserver=${NAMESERVER}"
    FIXTURE_QUERY="fixture=1"
fi
SECONDARY_NS_QUERY=""
if [[ -n "${SECONDARY_NAMESERVER:-}" ]]; then
    SECONDARY_NS_QUERY="&secondary_nameserver=${SECONDARY_NAMESERVER}"
fi

echo "=== dns regression suite ==="
curl -sS --max-time 20 "$BASE/suite?${FIXTURE_QUERY}${NS_QUERY}${SECONDARY_NS_QUERY}"
echo

echo "=== strict resolve: google.com ==="
curl -sS --max-time 10 "$BASE/resolve?domain=google.com${NS_QUERY}"
echo

echo "=== strict resolve: cloudflare.com ==="
curl -sS --max-time 10 "$BASE/resolve?domain=cloudflare.com${NS_QUERY}"
echo

echo "=== strict resolve: example.com ==="
curl -sS --max-time 10 "$BASE/resolve?domain=example.com${NS_QUERY}"
echo

echo "=== Done ==="
3 changes: 3 additions & 0 deletions dns-strict-resolver/go.mod
@@ -0,0 +1,3 @@
module dns-strict-resolver

go 1.22.0