fix(tcp): drain send_buf on ACK — unstick sustained exchange (bidir_multi hang)#113

Closed
gspivey wants to merge 1 commit into development from agent/tcp-sustained-exchange-fix

Conversation

@gspivey

@gspivey gspivey commented Jul 2, 2026

Owner

This PR fixes the engine bug that the TCP smoke tiers (#111) isolated on real hardware: a single TCP round-trip works, but sustained exchange on one connection hangs (Graviton tier1-tcp-echo: bidir_echo ✅, bidir_multi ❌ hung to timeout).

Root cause

handle_established's ACK path prunes the retransmit queue but never drains the acknowledged bytes from the front of send_buf, and never clears has_unacked_data. After the first send+ACK, send_buf still holds the stale acked bytes while already_sent_offset (derived from the now-empty retransmit queue) resets to 0. The next small write is therefore misclassified as unsent data behind Nagle: send_buf.len() is still below one MSS and has_unacked_data == true, so Nagle withholds it forever. The peer never receives iteration ≥1, both sides go idle, and the client blocks until the harness kills it at ~60s. The receive window is fully open, so this is not flow control.
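
To make the stall condition concrete, here is a minimal sketch of a Nagle-style send check, using the condition given in the commit message (has_unacked_data==true, len<MSS). The `MSS` value and the `nagle_withholds` helper are assumptions for illustration; the engine's real check is not shown in this PR.

```rust
/// Hypothetical MSS; the PR does not state the engine's actual value.
const MSS: usize = 1460;

/// Simplified Nagle check (an assumption, not the engine's code):
/// hold a sub-MSS segment while earlier data is still unacknowledged.
fn nagle_withholds(pending_len: usize, has_unacked_data: bool) -> bool {
    has_unacked_data && pending_len < MSS
}

fn main() {
    // Buggy state after the first exchange was fully ACKed: the 64 acked
    // bytes still sit in send_buf and the flag was never cleared.
    let stale_acked = 64;
    let new_write = 64;
    let stale_flag = true; // nothing is really in flight
    assert!(nagle_withholds(stale_acked + new_write, stale_flag));

    // Drained state: only the new write is pending and the flag is clear,
    // so the segment goes out immediately.
    assert!(!nagle_withholds(new_write, false));
    println!("stale state withholds, drained state sends");
}
```

With no data in flight, no ACK will ever arrive to release the withheld segment, which is why the stale flag turns a normal Nagle delay into a permanent stall.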

Fix

On cumulative ACK: drain the acked bytes off the front of send_buf, rebase surviving retransmit-entry offsets, and clear has_unacked_data when nothing is in flight (snd_una == snd_nxt).
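
The three steps can be sketched as follows. This is an illustrative model, not the patch itself: `SendState`, `RetxEntry`, `on_ack`, and `with_in_flight` are hypothetical names, and only `send_buf`, `has_unacked_data`, `snd_una`, and `snd_nxt` come from the description above.

```rust
use std::collections::VecDeque;

/// Hypothetical retransmit entry: a sent range as an offset into send_buf.
struct RetxEntry {
    offset: usize,
    len: usize,
}

/// Hypothetical, simplified send-side state.
struct SendState {
    send_buf: VecDeque<u8>,
    retx_queue: VecDeque<RetxEntry>,
    snd_una: u32,
    snd_nxt: u32,
    has_unacked_data: bool,
}

impl SendState {
    /// Handle a cumulative ACK for sequence number `ack`.
    fn on_ack(&mut self, ack: u32) {
        // Wrapping distances stay valid across sequence-number wraparound.
        let acked = ack.wrapping_sub(self.snd_una) as usize;
        let in_flight = self.snd_nxt.wrapping_sub(self.snd_una) as usize;
        if acked == 0 || acked > in_flight || acked > self.send_buf.len() {
            return; // duplicate ACK, or ACK for bytes never sent
        }
        // Prune entries the ACK covers completely.
        self.retx_queue.retain(|e| e.offset + e.len > acked);
        // Rebase survivors, trimming an entry the ACK covers partially.
        for e in self.retx_queue.iter_mut() {
            let trimmed = acked.saturating_sub(e.offset);
            e.len -= trimmed;
            e.offset = e.offset.saturating_sub(acked);
        }
        // Drain the acknowledged bytes off the front of send_buf.
        self.send_buf.drain(..acked);
        self.snd_una = ack;
        // Nothing in flight: let the next small write go out immediately.
        if self.snd_una == self.snd_nxt {
            self.has_unacked_data = false;
        }
    }
}

/// Build a state with the given segment lengths sent but not yet ACKed.
fn with_in_flight(seg_lens: &[usize]) -> SendState {
    let mut s = SendState {
        send_buf: VecDeque::new(),
        retx_queue: VecDeque::new(),
        snd_una: 1000,
        snd_nxt: 1000,
        has_unacked_data: false,
    };
    for &len in seg_lens {
        s.retx_queue.push_back(RetxEntry { offset: s.send_buf.len(), len });
        s.send_buf.extend(std::iter::repeat(0u8).take(len));
        s.snd_nxt = s.snd_nxt.wrapping_add(len as u32);
        s.has_unacked_data = true;
    }
    s
}

fn main() {
    let mut s = with_in_flight(&[64]);
    s.on_ack(1064);
    assert!(s.send_buf.is_empty() && s.retx_queue.is_empty());
    assert!(!s.has_unacked_data);
    println!("send_buf drained, flag cleared");
}
```

Storing retransmit ranges as offsets into send_buf is what makes the rebase step necessary: once the front of the buffer is drained, every surviving offset has to shift down by the same amount.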

Verification (offline, no EC2)

New tests/loopback_stall_repro.rs wires a client TcpEngine ↔ server TcpEngine through in-memory frame queues and drives 20 sequential 64B echoes on one connection, mirroring tcp-test-client --mode bidir --count 20. It fails before the fix (stalls at exchange 2) and passes after. The 221 unit tests and all property/integration tests are green.
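
The shape of such a loopback harness can be sketched without the real engine. `ToyEngine`, `pump`, and `run_echoes` below are hypothetical stand-ins (the TcpEngine API is not shown in this PR). The point is the bounded polling loop, which turns a stall into a fast test failure that names the stuck exchange instead of a hang.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for an engine endpoint; not the real TcpEngine.
struct ToyEngine {
    outbox: VecDeque<Vec<u8>>, // frames waiting to be transmitted
    inbox: VecDeque<Vec<u8>>,  // payloads delivered to the application
    echo: bool,                // server role: send every frame straight back
}

impl ToyEngine {
    fn new(echo: bool) -> Self {
        ToyEngine { outbox: VecDeque::new(), inbox: VecDeque::new(), echo }
    }

    fn send(&mut self, data: &[u8]) {
        self.outbox.push_back(data.to_vec());
    }

    fn on_frame(&mut self, frame: Vec<u8>) {
        if self.echo {
            self.outbox.push_back(frame);
        } else {
            self.inbox.push_back(frame);
        }
    }
}

/// Deliver every queued frame from one engine to the other.
fn pump(from: &mut ToyEngine, to: &mut ToyEngine) {
    while let Some(frame) = from.outbox.pop_front() {
        to.on_frame(frame);
    }
}

/// Drive `count` sequential echoes; Err(i) names the exchange that stalled.
fn run_echoes(count: usize, payload_len: usize) -> Result<(), usize> {
    let mut client = ToyEngine::new(false);
    let mut server = ToyEngine::new(true);
    for i in 1..=count {
        let payload = vec![i as u8; payload_len];
        client.send(&payload);
        // Bounded polling: a stalled engine fails here rather than hanging.
        let mut reply = None;
        for _ in 0..8 {
            pump(&mut client, &mut server);
            pump(&mut server, &mut client);
            if let Some(r) = client.inbox.pop_front() {
                reply = Some(r);
                break;
            }
        }
        if reply.as_deref() != Some(payload.as_slice()) {
            return Err(i);
        }
    }
    Ok(())
}

fn main() {
    assert_eq!(run_echoes(20, 64), Ok(()));
    println!("20 sequential 64B echoes completed");
}
```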

Found via a focused diagnostic that reproduced the EC2 symptom in a <1ms unit test. Follow-up: the same send_buf drain is missing in handle_fin_wait_1/handle_close_wait (teardown send_buf leak, not a stall).

🤖 Generated with Claude Code

…xchange)

handle_established's ACK path pruned the retransmit queue but never drained the
acknowledged bytes from the front of send_buf, nor cleared has_unacked_data.
After the first send+ACK, send_buf still held the stale acked bytes while
already_sent_offset (derived from the now-empty retransmit queue) reset to 0 —
so the next small write was misclassified as unsent behind Nagle
(has_unacked_data==true, len<MSS) and never transmitted. The connection stalled
after the first exchange.

This is the sustained-exchange (bidir_multi) hang the TCP smoke tiers isolated
on real hardware (#111 Graviton: single round-trip PASS, 20-on-one-connection
HANG).

Fix: on cumulative ACK, drain acked bytes off the front of send_buf, rebase the
surviving retransmit-entry offsets, and clear has_unacked_data when nothing is
in flight (snd_una==snd_nxt).

Offline regression tests/loopback_stall_repro.rs wires client<->server engines
through in-memory queues and drives 20 sequential 64B echoes on one connection:
FAILS before (stalls at exchange 2), PASSES after. 221 unit + all property/
integration tests green.

Follow-up: the same send_buf drain is missing in handle_fin_wait_1 /
handle_close_wait (teardown states — a send_buf leak, not a stall).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 2, 2026

Synthetic Performance Results — Graviton (run)

Commit: b09b19b0

✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)

Synthetic UDP Performance Results

Measures framework overhead: sync dpdk_udp::UdpSocket vs async (std::sync::Mutex + try_recv_from).

IPv4 Baseline

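| Test | Payload | Sync PPS | Async PPS | Ratio (sync/async) | Async ns/op |
| --- | --- | --- | --- | --- | --- |
| TX send_to | 64B | 11.8M | 11.2M | 1.1x | 89 |
| RX recv_from | 64B | 3.4M | 4.1M | 0.8x | 242 |
| TX send_to | 1400B | 1.9M | 1.8M | 1.0x | 543 |
| RX recv_from | 1400B | 1.1M | 1.2M | 0.9x | 830 |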

IPv6

| Test | Payload | Sync PPS | Async PPS | Ratio (sync/async) | Async ns/op |
| --- | --- | --- | --- | --- | --- |
| TX send_to (IPv6) | 64B | 8.9M | 8.9M | 1.0x | 112 |
| RX recv_from (IPv6) | 64B | 4.0M | 5.4M | 0.7x | 183 |
| TX send_to (IPv6) | 1400B | 3.0M | 3.1M | 1.0x | 327 |
| RX recv_from (IPv6) | 1400B | 1.2M | 1.3M | 0.9x | 787 |

IPv6 vs IPv4 Comparison (sync path)

| Test | Payload | IPv4 PPS | IPv6 PPS | IPv4/IPv6 Ratio |
| --- | --- | --- | --- | --- |
| TX send_to (sync) | 64B | 11.8M | 8.9M | 1.32x |
| RX recv_from (sync) | 64B | 3.4M | 4.0M | 0.84x |
| TX send_to (sync) | 1400B | 1.9M | 3.0M | 0.61x |
| RX recv_from (sync) | 1400B | 1.1M | 1.2M | 0.94x |

IPv4 avg sync/async ratio: 1.0x, worst: 1.1x | IPv6 vs IPv4 worst ratio: 1.32x (OK)

OK: IPv6 is 32.0% slower than IPv4 — within acceptable threshold (<50%). Expected due to larger headers (40B vs 20B) and mandatory UDP checksum.

Good: Async wrapper is within 1.1x of sync — minimal framework overhead.
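
For readers checking the columns by hand, the sketch below shows how the derived figures appear to relate to the raw rates. It assumes ns/op is the reciprocal of PPS and that the threshold check compares the IPv4/IPv6 ratio minus one against 50%; the benchmark's own formulas are not shown here, and the table's PPS values are rounded, so results only approximate the reported numbers. The function names are hypothetical.

```rust
/// Nanoseconds per operation implied by a packets-per-second rate.
fn ns_per_op(pps: f64) -> f64 {
    1e9 / pps
}

/// IPv4/IPv6 throughput ratio; values above 1.0 mean IPv6 is slower.
fn v4_over_v6(ipv4_pps: f64, ipv6_pps: f64) -> f64 {
    ipv4_pps / ipv6_pps
}

/// Assumed threshold rule: the ratio must stay below 1.5x.
fn within_threshold(ratio: f64) -> bool {
    ratio - 1.0 < 0.50
}

fn main() {
    // Rounded inputs taken from the 64B TX rows above.
    let ns = ns_per_op(11.2e6);
    let ratio = v4_over_v6(11.8e6, 8.9e6);
    println!(
        "{ns:.1} ns/op, ratio {ratio:.3}x, within threshold: {}",
        within_threshold(ratio)
    );
}
```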

@github-actions

github-actions Bot commented Jul 2, 2026

Synthetic Performance Results (run)

Commit: b09b19b0

✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)
✅ synthetic UDP socket bound to [2001:db8::1]:9000 (MAC: 02:00:00:00:00:01)

Synthetic UDP Performance Results

Measures framework overhead: sync dpdk_udp::UdpSocket vs async (std::sync::Mutex + try_recv_from).

IPv4 Baseline

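| Test | Payload | Sync PPS | Async PPS | Ratio (sync/async) | Async ns/op |
| --- | --- | --- | --- | --- | --- |
| TX send_to | 64B | 11.7M | 11.2M | 1.0x | 89 |
| RX recv_from | 64B | 3.5M | 4.7M | 0.7x | 210 |
| TX send_to | 1400B | 1.8M | 1.7M | 1.1x | 585 |
| RX recv_from | 1400B | 1.1M | 1.2M | 0.9x | 802 |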

IPv6

| Test | Payload | Sync PPS | Async PPS | Ratio (sync/async) | Async ns/op |
| --- | --- | --- | --- | --- | --- |
| TX send_to (IPv6) | 64B | 9.0M | 8.8M | 1.0x | 113 |
| RX recv_from (IPv6) | 64B | 4.0M | 5.5M | 0.7x | 182 |
| TX send_to (IPv6) | 1400B | 3.1M | 3.0M | 1.0x | 328 |
| RX recv_from (IPv6) | 1400B | 1.2M | 1.3M | 0.9x | 765 |

IPv6 vs IPv4 Comparison (sync path)

| Test | Payload | IPv4 PPS | IPv6 PPS | IPv4/IPv6 Ratio |
| --- | --- | --- | --- | --- |
| TX send_to (sync) | 64B | 11.7M | 9.0M | 1.30x |
| RX recv_from (sync) | 64B | 3.5M | 4.0M | 0.88x |
| TX send_to (sync) | 1400B | 1.8M | 3.1M | 0.60x |
| RX recv_from (sync) | 1400B | 1.1M | 1.2M | 0.95x |

IPv4 avg sync/async ratio: 0.9x, worst: 1.1x | IPv6 vs IPv4 worst ratio: 1.30x (OK)

OK: IPv6 is 30.4% slower than IPv4 — within acceptable threshold (<50%). Expected due to larger headers (40B vs 20B) and mandatory UDP checksum.

Good: Async wrapper is within 1.1x of sync — minimal framework overhead.

@github-actions

github-actions Bot commented Jul 2, 2026

[CI] Stage: Deploy

Infrastructure ready.

  • Sender: i-001d2c7c957bb86b4 (DPDK ENI: 10.0.1.26)
  • Receiver: i-07acca8bad58a8fdc (DPDK ENI: 10.0.1.217)
  • Both instances SSM-ready.

@github-actions

github-actions Bot commented Jul 2, 2026

[CI] Stage: Deploy

Infrastructure ready.

  • Sender: i-0ac642b1736db9cfd (DPDK ENI: 10.0.1.35)
  • Receiver: i-07324334c7ff4ee7c (DPDK ENI: 10.0.1.157)
  • Both instances SSM-ready.

@gspivey

gspivey commented Jul 2, 2026

Owner Author

Consolidated into #111. The send_buf-drain engine fix + loopback repro are now on agent/tcp-smoke-tiers.

@gspivey gspivey closed this Jul 2, 2026
@gspivey gspivey deleted the agent/tcp-sustained-exchange-fix branch July 2, 2026 11:38
@github-actions

github-actions Bot commented Jul 2, 2026

[CI] Stage: Summary

All tests PASSED.

ARP seeding: kernel /proc/net/arp (automatic)

  • tier1-dpdk-echo: 6 tests, 0 failures
  • tier2-kernel-interop: 4 tests, 0 failures
  • tier3-iperf-interop: 1 tests, 0 failures
  • tier3-iperf-interop: 1 tests, 0 failures

@github-actions

github-actions Bot commented Jul 2, 2026

✅ Integration Tests Passed — Graviton (run)

Branch: 113/merge | Commit: b09b19b0

Test Results

  • tier1-dpdk-echo.xml: 6 tests, 0 failures
  • tier2-kernel-interop.xml: 4 tests, 0 failures
  • tier3-iperf-sends.xml: 1 tests, 0 failures
  • tier3-our-app-sends.xml: 1 tests, 0 failures

Application Logs (last 20 lines)

receiver-echo-server.log

EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.217:9000 (MAC: 02:f2:4d:90:2d:a1)
echo listening on 10.0.1.217:9000 (MTU=9001, max_udp_payload=8973)
Shutting down gracefully...
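
The max_udp_payload figure in the log above is consistent with subtracting the fixed IPv4 and UDP header sizes from the MTU. The sketch below shows that arithmetic; it assumes an IPv4 header without options, and the function name is hypothetical since the echo server's own calculation is not shown here.

```rust
/// IPv4 header length without options.
const IPV4_HEADER_LEN: usize = 20;
/// UDP header length.
const UDP_HEADER_LEN: usize = 8;

/// Largest UDP payload that fits in one unfragmented IPv4 packet.
fn max_udp_payload_v4(mtu: usize) -> usize {
    mtu.saturating_sub(IPV4_HEADER_LEN + UDP_HEADER_LEN)
}

fn main() {
    // MTU=9001 as logged by the echo server.
    assert_eq!(max_udp_payload_v4(9001), 8973);
    // The 8000-byte jumbo echo payload therefore fits in a single packet.
    assert!(8000 <= max_udp_payload_v4(9001));
    println!("max_udp_payload = {}", max_udp_payload_v4(9001));
}
```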

sender-echo-server.log

EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.26:9000 (MAC: 02:c0:5f:fd:60:b3)
echo listening on 10.0.1.26:9000 (MTU=9001, max_udp_payload=8973)
Shutting down gracefully...

sender-test-client.log

EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
DPDK bind failed (Port init failed: Invalid port ID: 0), falling back to tokio
Backend: tokio
Sending packets...
Sent 12 bytes: 'arp-probe #1'
Received 12 bytes from 10.0.1.217:9000: 'arp-probe #1'
Test complete
[2026-07-02T11:30:51Z] INFO: ARP resolution succeeded (got response from peer)
[2026-07-02T11:30:51Z] INFO: Test: udp_send_receive
[2026-07-02T11:30:52Z] INFO: UDP send/receive succeeded
[2026-07-02T11:30:52Z] INFO: Test: echo_roundtrip
[2026-07-02T11:30:53Z] INFO: Echo roundtrip succeeded: 5/5 responses received
[2026-07-02T11:30:53Z] INFO: Test: payload_integrity
[2026-07-02T11:30:53Z] INFO: Response received, checking payload match...
[2026-07-02T11:30:53Z] INFO: Payload integrity verified (found in response)
[2026-07-02T11:30:53Z] INFO: JUnit XML written to: /tmp/test-results/tier2-kernel-interop.xml
[2026-07-02T11:30:53Z] INFO: Tier 2 sender tests complete. Results: /tmp/test-results/tier2-kernel-interop.xml

@github-actions

github-actions Bot commented Jul 2, 2026

[CI] Stage: Summary

All tests PASSED.

ARP seeding: kernel /proc/net/arp (automatic)

  • tier1-dpdk-echo: 6 tests, 0 failures
  • tier2-kernel-interop: 4 tests, 0 failures
  • tier3-iperf-interop: 1 tests, 0 failures
  • tier3-iperf-interop: 1 tests, 0 failures

@github-actions

github-actions Bot commented Jul 2, 2026

✅ Integration Tests Passed (Run 28585060010)

Branch: 113/merge | Commit: b09b19b0

Test Results

  • tier1-dpdk-echo.xml: 6 tests, 0 failures, skipped
  • tier2-kernel-interop.xml: 4 tests, 0 failures, skipped
  • tier3-iperf-sends.xml: 1 tests, 0 failures, skipped
  • tier3-our-app-sends.xml: 1 tests, 0 failures, skipped

Application Logs (last 20 lines)

receiver-echo-server.log

EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.157:9000 (MAC: 02:23:07:00:17:03)
echo listening on 10.0.1.157:9000 (MTU=9001, max_udp_payload=8973)
Shutting down gracefully...

sender-echo-server.log

EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.35:9000 (MAC: 02:9b:fb:52:c4:c9)
echo listening on 10.0.1.35:9000 (MTU=9001, max_udp_payload=8973)
Shutting down gracefully...

sender-test-client.log

EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
DPDK bind failed (Port init failed: Invalid port ID: 0), falling back to tokio
Backend: tokio
Sending packets...
Sent 12 bytes: 'arp-probe #1'
Received 12 bytes from 10.0.1.157:9000: 'arp-probe #1'
Test complete
[2026-07-02T11:33:06Z] INFO: ARP resolution succeeded (got response from peer)
[2026-07-02T11:33:06Z] INFO: Test: udp_send_receive
[2026-07-02T11:33:07Z] INFO: UDP send/receive succeeded
[2026-07-02T11:33:07Z] INFO: Test: echo_roundtrip
[2026-07-02T11:33:07Z] INFO: Echo roundtrip succeeded: 5/5 responses received
[2026-07-02T11:33:07Z] INFO: Test: payload_integrity
[2026-07-02T11:33:08Z] INFO: Response received, checking payload match...
[2026-07-02T11:33:08Z] INFO: Payload integrity verified (found in response)
[2026-07-02T11:33:08Z] INFO: JUnit XML written to: /tmp/test-results/tier2-kernel-interop.xml
[2026-07-02T11:33:08Z] INFO: Tier 2 sender tests complete. Results: /tmp/test-results/tier2-kernel-interop.xml

receiver-test-client-iperf.log

[2026-07-02T11:40:13Z] INFO: iperf-sends: sent 10 packets, received 10 responses
[2026-07-02T11:40:13Z] INFO: iperf-sends: PASS (sent >= 5 packets)
[2026-07-02T11:40:13Z] INFO: JUnit XML written to: /tmp/test-results/tier3-iperf-sends.xml
[2026-07-02T11:40:13Z] INFO: iperf-sends test complete

sender-test-client-iperf.log

Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #3'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #4'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #4'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #5'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #5'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #6'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #6'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #7'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #7'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #8'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #8'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #9'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #9'
Sent 31 bytes: 'dpdk-to-kernel-test-payload #10'
Received 31 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #10'
Test complete
[2026-07-02T11:39:15Z] INFO: our-app-sends: sent 10 packets, received 10 responses
[2026-07-02T11:39:15Z] INFO: our-app-sends: PASS (sent >= 5 packets)
[2026-07-02T11:39:15Z] INFO: JUnit XML written to: /tmp/test-results/tier3-our-app-sends.xml
[2026-07-02T11:39:15Z] INFO: our-app-sends test complete
Full Application Logs (last 200 lines each)

receiver-echo-server.log

EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.157:9000 (MAC: 02:23:07:00:17:03)
echo listening on 10.0.1.157:9000 (MTU=9001, max_udp_payload=8973)
Shutting down gracefully...

sender-echo-server.log

EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.35:9000 (MAC: 02:9b:fb:52:c4:c9)
echo listening on 10.0.1.35:9000 (MTU=9001, max_udp_payload=8973)
Shutting down gracefully...

sender-test-client.log

[2026-07-02T11:29:09Z] INFO: Test: arp_resolution
UDP Test Client
Target: 10.0.1.157:9000
Bind address: 10.0.1.35:0
Message: 'arp-probe'
Count: 1
EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.35:32768 (MAC: 02:9b:fb:52:c4:c9)
Backend: dpdk
Sending packets...
Sent 12 bytes: 'arp-probe #1'
Received 12 bytes from 10.0.1.157:9000: 'arp-probe #1'
Test complete
[2026-07-02T11:29:10Z] INFO: ARP resolution succeeded (got response from peer)
[2026-07-02T11:29:10Z] INFO: Test: udp_send_receive
[2026-07-02T11:29:12Z] INFO: UDP send/receive succeeded
[2026-07-02T11:29:12Z] INFO: Test: echo_roundtrip
[2026-07-02T11:29:14Z] INFO: Echo roundtrip succeeded: 5/5 responses received
[2026-07-02T11:29:14Z] INFO: Test: payload_integrity
[2026-07-02T11:29:14Z] INFO: Response received, checking payload match...
[2026-07-02T11:29:14Z] INFO: Payload integrity verified (found in response)
[2026-07-02T11:29:14Z] INFO: Test: jumbo_diagnostics
[2026-07-02T11:29:14Z] INFO: === JUMBO FRAME DIAGNOSTICS ===
[2026-07-02T11:29:14Z] INFO: Interface MTU:
  9001
  65536
[2026-07-02T11:29:14Z] INFO:   ens5: MTU=9001
[2026-07-02T11:29:14Z] INFO:   lo: MTU=65536
[2026-07-02T11:29:14Z] INFO: Routing table MTU column:
Iface	Destination	Gateway 	Flags	RefCnt	Use	Metric	Mask		MTU	Window	IRTT                                                       
ens5	00000000	0101000A	0003	0	0	512	00000000	0	0	0                                                                             
ens5	0200000A	0101000A	0007	0	0	512	FFFFFFFF	0	0	0                                                                             
ens5	0001000A	00000000	0001	0	0	512	00FFFFFF	0	0	0                                                                             
ens5	0101000A	00000000	0005	0	0	512	FFFFFFFF	0	0	0                                                                             
[2026-07-02T11:29:14Z] INFO: DPDK port config (from echo server log):
[2026-07-02T11:29:14Z] INFO:   (no MTU info in echo log)
[2026-07-02T11:29:14Z] INFO: === END JUMBO DIAGNOSTICS ===
[2026-07-02T11:29:14Z] INFO: Test: jumbo_echo_8000
[2026-07-02T11:29:16Z] INFO: Jumbo output: UDP Test Client
Target: 10.0.1.157:9000
Bind address: 10.0.1.35:0
Payload size: 8000 bytes
Count: 3
EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.35:32768 (MAC: 02:9b:fb:52:c4:c9)
Backend: dpdk
Sending packets...
Sent 8000 bytes (binary payload)
Received 8000 bytes from 10.0.1.157:9000 (expected 8000, OK)
Sent 8000 bytes (binary payload)
Received 8000 bytes from 10.0.1.157:9000 (expected 8000, OK)
Sent 8000 bytes (binary payload)
Received 8000 bytes from 10.0.1.157:9000 (expected 8000, OK)
Test complete
[2026-07-02T11:29:16Z] INFO: Jumbo frame echo succeeded: 3/3 responses with correct size
[2026-07-02T11:29:16Z] INFO: JUnit XML written to: /tmp/test-results/tier1-dpdk-echo.xml
[2026-07-02T11:29:16Z] INFO: Tier 1 sender tests complete. Results: /tmp/test-results/tier1-dpdk-echo.xml
[2026-07-02T11:33:05Z] INFO: Test: arp_resolution
UDP Test Client
Target: 10.0.1.157:9000
Bind address: 0.0.0.0:0
Message: 'arp-probe'
Count: 1
EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
DPDK bind failed (Port init failed: Invalid port ID: 0), falling back to tokio
Backend: tokio
Sending packets...
Sent 12 bytes: 'arp-probe #1'
Received 12 bytes from 10.0.1.157:9000: 'arp-probe #1'
Test complete
[2026-07-02T11:33:06Z] INFO: ARP resolution succeeded (got response from peer)
[2026-07-02T11:33:06Z] INFO: Test: udp_send_receive
[2026-07-02T11:33:07Z] INFO: UDP send/receive succeeded
[2026-07-02T11:33:07Z] INFO: Test: echo_roundtrip
[2026-07-02T11:33:07Z] INFO: Echo roundtrip succeeded: 5/5 responses received
[2026-07-02T11:33:07Z] INFO: Test: payload_integrity
[2026-07-02T11:33:08Z] INFO: Response received, checking payload match...
[2026-07-02T11:33:08Z] INFO: Payload integrity verified (found in response)
[2026-07-02T11:33:08Z] INFO: JUnit XML written to: /tmp/test-results/tier2-kernel-interop.xml
[2026-07-02T11:33:08Z] INFO: Tier 2 sender tests complete. Results: /tmp/test-results/tier2-kernel-interop.xml
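
The routing-table dump in the log above prints destinations, gateways, and masks as hexadecimal words in the kernel's /proc/net/route format. A small sketch for decoding them follows, assuming a little-endian host (true for both x86_64 and aarch64 Linux); `decode_route_addr` is a hypothetical helper, not part of the test scripts.

```rust
use std::net::Ipv4Addr;

/// Decode one /proc/net/route hex field into a dotted-quad address.
/// Assumes the file was produced on a little-endian host.
fn decode_route_addr(hex: &str) -> Option<Ipv4Addr> {
    let raw = u32::from_str_radix(hex, 16).ok()?;
    // The field is a network-order address printed as a host-order u32,
    // so the little-endian byte sequence is the address in reading order.
    Some(Ipv4Addr::from(raw.to_le_bytes()))
}

fn main() {
    // Values taken from the routing table in the log.
    for field in ["0101000A", "0001000A", "00FFFFFF"] {
        match decode_route_addr(field) {
            Some(addr) => println!("{field} -> {addr}"),
            None => println!("{field} -> not a valid hex field"),
        }
    }
}
```

Decoded this way, the gateway field 0101000A reads as 10.0.1.1, which matches the gateway address in the ARP table dumps elsewhere in these logs.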

receiver-test-client-iperf.log

[2026-07-02T11:40:13Z] INFO: iperf-sends: sent 10 packets, received 10 responses
[2026-07-02T11:40:13Z] INFO: iperf-sends: PASS (sent >= 5 packets)
[2026-07-02T11:40:13Z] INFO: JUnit XML written to: /tmp/test-results/tier3-iperf-sends.xml
[2026-07-02T11:40:13Z] INFO: iperf-sends test complete

sender-test-client-iperf.log

[2026-07-02T11:39:13Z] INFO: Pre-flight: checking DPDK state and ARP cache...
[2026-07-02T11:39:13Z] INFO: Local IP: 10.0.1.35, Peer IP: 10.0.1.157, Port: 9000
[2026-07-02T11:39:13Z] INFO: /proc/net/arp contents:
IP address       HW type     Flags       HW address            Mask     Device
10.0.1.157       0x1         0x2         02:23:07:00:17:03     *        ens5
10.0.1.224       0x1         0x2         02:03:52:f6:a0:95     *        ens5
10.0.1.95        0x1         0x2         02:cf:82:37:07:69     *        ens5
10.0.1.1         0x1         0x2         02:53:5c:0e:ec:59     *        ens5
10.0.1.35        0x1         0x2         02:9b:fb:52:c4:c9     *        ens5
[2026-07-02T11:39:13Z] INFO: DPDK runtime state:
No /var/run/dpdk/ directory
[2026-07-02T11:39:13Z] INFO: vfio-pci bindings:
0000:00:06.0
bind
module
new_id
remove_id
uevent
unbind
[2026-07-02T11:39:13Z] INFO: Test binary: /opt/dpdk-stdlib/target/release/test-client
-rwxr-xr-x. 2 root root 1950760 Jul  2 11:25 /opt/dpdk-stdlib/target/release/test-client
[2026-07-02T11:39:13Z] INFO: Launching test-client: /opt/dpdk-stdlib/target/release/test-client --target 10.0.1.157 --port 9000 --bind-ip 10.0.1.35 --count 10 --delay 200
[2026-07-02T11:39:15Z] INFO: Test client output: UDP Test Client
Target: 10.0.1.157:9000
Bind address: 10.0.1.35:0
Message: 'dpdk-to-kernel-test-payload'
Count: 10
EAL: Detected CPU lcores: 2
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Warning: Some RX offloads not supported by device (flags: 0x1)
Warning: Some TX offloads not supported by device (flags: 0x1)
✅ DPDK UDP socket bound to 10.0.1.35:32768 (MAC: 02:9b:fb:52:c4:c9)
Backend: dpdk
Sending packets...
Sent 30 bytes: 'dpdk-to-kernel-test-payload #1'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #1'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #2'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #2'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #3'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #3'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #4'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #4'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #5'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #5'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #6'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #6'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #7'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #7'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #8'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #8'
Sent 30 bytes: 'dpdk-to-kernel-test-payload #9'
Received 30 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #9'
Sent 31 bytes: 'dpdk-to-kernel-test-payload #10'
Received 31 bytes from 10.0.1.157:9000: 'dpdk-to-kernel-test-payload #10'
Test complete
[2026-07-02T11:39:15Z] INFO: our-app-sends: sent 10 packets, received 10 responses
[2026-07-02T11:39:15Z] INFO: our-app-sends: PASS (sent >= 5 packets)
[2026-07-02T11:39:15Z] INFO: JUnit XML written to: /tmp/test-results/tier3-our-app-sends.xml
[2026-07-02T11:39:15Z] INFO: our-app-sends test complete
⚠️ SSM Command Failures (receiver-ssm-failure.log)
=== Polling timeout after 30s ===
Status: InProgress
Instance: i-07324334c7ff4ee7c (receiver)
Command ID: e8986f50-57bf-4014-b72d-0eabe569449e

=== STDOUT ===


=== STDERR ===


=== Polling timeout after 30s ===
Status: InProgress
Instance: i-07324334c7ff4ee7c (receiver)
Command ID: 3daa37f1-3453-44af-8a3d-35022785f807

=== STDOUT ===


=== STDERR ===


=== Polling timeout after 30s ===
Status: InProgress
Instance: i-07324334c7ff4ee7c (receiver)
Command ID: 7c6290e1-cf3e-43bf-90b8-468cfa6232f7

=== STDOUT ===


=== STDERR ===


=== Polling timeout after 30s ===
Status: InProgress
Instance: i-07324334c7ff4ee7c (receiver)
Command ID: 50adb89a-09c9-40b7-8be6-e3129b43bb6d

=== STDOUT ===


=== STDERR ===


⚠️ SSM Command Failures (sender-ssm-failure.log)
=== Polling timeout after 30s ===
Status: InProgress
Instance: i-0ac642b1736db9cfd (sender)
Command ID: 2748ab50-2c56-48c2-9a29-fd1ff34caa24

=== STDOUT ===


=== STDERR ===


Network & PCI State

receiver-network-interfaces.log

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:bf:dd:d6:25:af brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname eni-0c96af3ef15917ec2
    altname device-number-0.0
    inet 10.0.1.140/24 metric 512 brd 10.0.1.255 scope global dynamic ens5
       valid_lft 2106sec preferred_lft 2106sec
    inet6 fe80::bf:ddff:fed6:25af/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever

sender-network-interfaces.log

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:07:31:ab:66:15 brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname eni-0092d6f9b40c2b05f
    altname device-number-0.0
    inet 10.0.1.39/24 metric 512 brd 10.0.1.255 scope global dynamic ens5
       valid_lft 2127sec preferred_lft 2127sec
    inet6 fe80::7:31ff:feab:6615/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever

receiver-networking-diag-baseline.txt

=== NETWORKING DIAGNOSTICS ===
timestamp: 2026-07-02T11:28:32Z
hostname: ip-10-0-1-140.ec2.internal
kernel: 6.18.35-68.129.amzn2023.x86_64

=== DPDK PORT STATUS ===

Network devices using DPDK-compatible driver
============================================
0000:00:06.0 'Elastic Network Adapter (ENA) ec20' drv=vfio-pci unused=ena

Network devices using kernel driver
===================================
0000:00:05.0 'Elastic Network Adapter (ENA) ec20' if=ens5 drv=ena unused=vfio-pci *Active*

No 'Baseband' devices detected
==============================

No 'Crypto' devices detected
============================

No 'DMA' devices detected
=========================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================

No 'Regex' devices detected
===========================

=== IP ADDRESSES ===
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:bf:dd:d6:25:af brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname eni-0c96af3ef15917ec2
    altname device-number-0.0
    inet 10.0.1.140/24 metric 512 brd 10.0.1.255 scope global dynamic ens5
       valid_lft 3106sec preferred_lft 3106sec
    inet6 fe80::bf:ddff:fed6:25af/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever

=== ARP TABLE ===
10.0.1.1 dev ens5 lladdr 02:53:5c:0e:ec:59 REACHABLE 
10.0.1.95 dev ens5 lladdr 02:cf:82:37:07:69 REACHABLE 

=== ROUTE TABLE ===
default via 10.0.1.1 dev ens5 proto dhcp src 10.0.1.140 metric 512 
10.0.0.2 via 10.0.1.1 dev ens5 proto dhcp src 10.0.1.140 metric 512 
10.0.1.0/24 dev ens5 proto kernel scope link src 10.0.1.140 metric 512 
10.0.1.1 dev ens5 proto dhcp scope link src 10.0.1.140 metric 512 

=== IMDS: ENI INFORMATION ===
ENI MACs found: 02:23:07:00:17:03/ 02:bf:dd:d6:25:af/ 

--- ENI: 02:23:07:00:17:03/ ---
  device-number: 1
  local-ipv4s: 10.0.1.157
  subnet-id: subnet-0c4c14096af4191a7
  subnet-cidr: 10.0.1.0/24

--- ENI: 02:bf:dd:d6:25:af/ ---
  device-number: 0
  local-ipv4s: 10.0.1.140
  subnet-id: subnet-0c4c14096af4191a7
  subnet-cidr: 10.0.1.0/24


=== GATEWAY ARP TEST ===
Gateway IP: 10.0.1.1
Gateway ARP entry:
10.0.1.1 dev ens5 lladdr 02:53:5c:0e:ec:59 REACHABLE 

arping result:
ARPING 10.0.1.1 from 10.0.1.140 ens5
Unicast reply from 10.0.1.1 [02:53:5C:0E:EC:59]  0.532ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

=== HUGEPAGE STATUS ===
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:     51200 kB
HugePages_Total:    1024
HugePages_Free:     1024
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2097152 kB

=== VFIO STATUS ===
total 0
drwxr-xr-x.  2 root root       80 Jul  2 11:28 .
drwxr-xr-x. 14 root root     3100 Jul  2 11:27 ..
crw-------.  1 root root 243,   0 Jul  2 11:28 noiommu-0
crw-rw-rw-.  1 root root  10, 196 Jul  2 11:20 vfio

noiommu mode:
Y

=== DPDK SHARED MEMORY ===
no /var/run/dpdk/ directory (clean state)

=== DPDK-RELATED DMESG (last 30 lines) ===
[    0.054669] printk: legacy console [ttyS0] enabled
[    0.055692] x2apic enabled
[    0.059877] mitigations: Enabled attack vectors: user_kernel, user_user, guest_host, guest_guest, SMT mitigations: auto
[    0.059984] x86/fpu: Enabled xstate features 0x2ff, context size is 2568 bytes, using 'compacted' format.
[    0.069746] audit: type=2000 audit(1782991212.538:1): state=initialized audit_enabled=0 res=1
[    0.089231] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    0.141711] ACPI: Interpreter enabled
[    0.141711] ACPI: Enabled 2 GPEs in block 00 to 0F
[    0.159036] pci 0000:00:05.0: enabling Extended Tags
[    0.224366] SGI XFS with ACLs, security attributes, quota, no debug enabled
[    0.236970] ACPI: \_SB_.LNKD: Enabled at IRQ 11
[    0.348516] IPI shorthand broadcast: enabled
[    3.099044] systemd[1]: Mounting dev-hugepages.mount - Huge Pages File System...
[    3.139741] systemd[1]: Mounted dev-hugepages.mount - Huge Pages File System.
[    3.201770] VFIO - User Level meta-driver version: 0.3
[    3.641394] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.17.1g
[    3.654906] ena 0000:00:05.0: ENA device version: 0.10
[    3.655617] ena 0000:00:05.0: ENA controller version: 0.0.1 implementation version 1
[    3.754651] ena 0000:00:05.0: ENA Large LLQ is disabled
[    3.767018] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem c0500000, mac addr 02:bf:dd:d6:25:af
[    3.784868] ena 0000:00:05.0 ens5: renamed from eth0
[  452.150283] pci 0000:00:06.0: enabling Extended Tags
[  452.154239] ena 0000:00:06.0: enabling device (0000 -> 0002)
[  452.167354] ena 0000:00:06.0: ENA device version: 0.10
[  452.168103] ena 0000:00:06.0: ENA controller version: 0.0.1 implementation version 1
[  452.267578] ena 0000:00:06.0: ENA Large LLQ is disabled
[  452.280009] ena 0000:00:06.0: Elastic Network Adapter (ENA) found at mem c0508000, mac addr 02:23:07:00:17:03
[  452.287397] ena 0000:00:06.0 ens6: renamed from eth0
[  487.182036] vfio-pci 0000:00:06.0: Adding to iommu group 0
[  487.183446] vfio-pci 0000:00:06.0: Adding kernel taint for vfio-noiommu group on device

=== DPDK-RELATED PROCESSES ===
no DPDK processes running

=== END DIAGNOSTICS ===

sender-networking-diag-baseline.txt

=== NETWORKING DIAGNOSTICS ===
timestamp: 2026-07-02T11:28:24Z
hostname: ip-10-0-1-39.ec2.internal
kernel: 6.18.35-68.129.amzn2023.x86_64

=== DPDK PORT STATUS ===

Network devices using DPDK-compatible driver
============================================
0000:00:06.0 'Elastic Network Adapter (ENA) ec20' drv=vfio-pci unused=ena

Network devices using kernel driver
===================================
0000:00:05.0 'Elastic Network Adapter (ENA) ec20' if=ens5 drv=ena unused=vfio-pci *Active*

No 'Baseband' devices detected
==============================

No 'Crypto' devices detected
============================

No 'DMA' devices detected
=========================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================

No 'Regex' devices detected
===========================

=== IP ADDRESSES ===
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:07:31:ab:66:15 brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname eni-0092d6f9b40c2b05f
    altname device-number-0.0
    inet 10.0.1.39/24 metric 512 brd 10.0.1.255 scope global dynamic ens5
       valid_lft 3113sec preferred_lft 3113sec
    inet6 fe80::7:31ff:feab:6615/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever

=== ARP TABLE ===
10.0.1.224 dev ens5 lladdr 02:03:52:f6:a0:95 STALE 
10.0.1.95 dev ens5 lladdr 02:cf:82:37:07:69 REACHABLE 
10.0.1.1 dev ens5 lladdr 02:53:5c:0e:ec:59 REACHABLE 

=== ROUTE TABLE ===
default via 10.0.1.1 dev ens5 proto dhcp src 10.0.1.39 metric 512 
10.0.0.2 via 10.0.1.1 dev ens5 proto dhcp src 10.0.1.39 metric 512 
10.0.1.0/24 dev ens5 proto kernel scope link src 10.0.1.39 metric 512 
10.0.1.1 dev ens5 proto dhcp scope link src 10.0.1.39 metric 512 

=== IMDS: ENI INFORMATION ===
ENI MACs found: 02:07:31:ab:66:15/ 02:9b:fb:52:c4:c9/ 

--- ENI: 02:07:31:ab:66:15/ ---
  device-number: 0
  local-ipv4s: 10.0.1.39
  subnet-id: subnet-0c4c14096af4191a7
  subnet-cidr: 10.0.1.0/24

--- ENI: 02:9b:fb:52:c4:c9/ ---
  device-number: 1
  local-ipv4s: 10.0.1.35
  subnet-id: subnet-0c4c14096af4191a7
  subnet-cidr: 10.0.1.0/24


=== GATEWAY ARP TEST ===
Gateway IP: 10.0.1.1
Gateway ARP entry:
10.0.1.1 dev ens5 lladdr 02:53:5c:0e:ec:59 REACHABLE 

arping result:
ARPING 10.0.1.1 from 10.0.1.39 ens5
Unicast reply from 10.0.1.1 [02:53:5C:0E:EC:59]  0.533ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

=== HUGEPAGE STATUS ===
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:     51200 kB
HugePages_Total:    1024
HugePages_Free:     1024
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2097152 kB

=== VFIO STATUS ===
total 0
drwxr-xr-x.  2 root root       80 Jul  2 11:28 .
drwxr-xr-x. 14 root root     3100 Jul  2 11:26 ..
crw-------.  1 root root 243,   0 Jul  2 11:28 noiommu-0
crw-rw-rw-.  1 root root  10, 196 Jul  2 11:20 vfio

noiommu mode:
Y

=== DPDK SHARED MEMORY ===
no /var/run/dpdk/ directory (clean state)

=== DPDK-RELATED DMESG (last 30 lines) ===
[    0.053867] printk: legacy console [ttyS0] enabled
[    0.054874] x2apic enabled
[    0.058991] mitigations: Enabled attack vectors: user_kernel, user_user, guest_host, guest_guest, SMT mitigations: auto
[    0.059100] x86/fpu: Enabled xstate features 0x2ff, context size is 2568 bytes, using 'compacted' format.
[    0.068861] audit: type=2000 audit(1782991212.437:1): state=initialized audit_enabled=0 res=1
[    0.088336] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    0.140778] ACPI: Interpreter enabled
[    0.140778] ACPI: Enabled 2 GPEs in block 00 to 0F
[    0.158054] pci 0000:00:05.0: enabling Extended Tags
[    0.224320] SGI XFS with ACLs, security attributes, quota, no debug enabled
[    0.239102] ACPI: \_SB_.LNKD: Enabled at IRQ 11
[    0.348709] IPI shorthand broadcast: enabled
[    3.685318] systemd[1]: Mounting dev-hugepages.mount - Huge Pages File System...
[    3.726481] systemd[1]: Mounted dev-hugepages.mount - Huge Pages File System.
[    3.782672] VFIO - User Level meta-driver version: 0.3
[    4.131723] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.17.1g
[    4.144622] ena 0000:00:05.0: ENA device version: 0.10
[    4.145333] ena 0000:00:05.0: ENA controller version: 0.0.1 implementation version 1
[    4.244552] ena 0000:00:05.0: ENA Large LLQ is disabled
[    4.256947] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem c0500000, mac addr 02:07:31:ab:66:15
[    4.274741] ena 0000:00:05.0 ens5: renamed from eth0
[  392.131418] pci 0000:00:06.0: enabling Extended Tags
[  392.135275] ena 0000:00:06.0: enabling device (0000 -> 0002)
[  392.148561] ena 0000:00:06.0: ENA device version: 0.10
[  392.149310] ena 0000:00:06.0: ENA controller version: 0.0.1 implementation version 1
[  392.250383] ena 0000:00:06.0: ENA Large LLQ is disabled
[  392.262709] ena 0000:00:06.0: Elastic Network Adapter (ENA) found at mem c0508000, mac addr 02:9b:fb:52:c4:c9
[  392.269440] ena 0000:00:06.0 ens6: renamed from eth0
[  480.893062] vfio-pci 0000:00:06.0: Adding to iommu group 0
[  480.894484] vfio-pci 0000:00:06.0: Adding kernel taint for vfio-noiommu group on device

=== DPDK-RELATED PROCESSES ===
no DPDK processes running

=== END DIAGNOSTICS ===

⚠️ Crash Diagnostics

receiver-dmesg-crashes.log

[    0.069131] pid_max: default: 32768 minimum: 301
[    0.162902] iommu: Default domain type: Translated
[    0.162928] NetLabel:  unlabeled traffic allowed by default
[    0.179241] PCI: CLS 0 bytes, default 64
[    0.253677] nvme nvme0: 2/0/0 default/read/poll queues
[    0.418997] systemd[1]: systemd 252.23-12.amzn2023 running in system mode (+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP -GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 -BZIP2 -LZ4 +XZ +ZLIB -ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[    0.419120] systemd[1]: No hostname configured, using default hostname.
[    0.486790] systemd[1]: Queued start job for default target initrd.target.
[    2.440592] systemd[1]: systemd 252.23-12.amzn2023 running in system mode (+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP -GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 -BZIP2 -LZ4 +XZ +ZLIB -ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[  522.139666] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:8331)
[  757.935911] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:8815)
[ 1365.006217] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:9987)

sender-dmesg-crashes.log

[    0.068240] pid_max: default: 32768 minimum: 301
[    0.161835] iommu: Default domain type: Translated
[    0.161860] NetLabel:  unlabeled traffic allowed by default
[    0.178987] PCI: CLS 0 bytes, default 64
[    0.254882] nvme nvme0: 2/0/0 default/read/poll queues
[    0.429281] systemd[1]: systemd 252.23-12.amzn2023 running in system mode (+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP -GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 -BZIP2 -LZ4 +XZ +ZLIB -ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[    0.429408] systemd[1]: No hostname configured, using default hostname.
[    0.490311] systemd[1]: Queued start job for default target initrd.target.
[    3.156197] systemd[1]: systemd 252.23-12.amzn2023 running in system mode (+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP -GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 -BZIP2 -LZ4 +XZ +ZLIB -ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[ 1183.454990] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:9662)

Kernel Console (dmesg)

receiver-console-output.log (PCI/driver events only)

[  522.138132] vfio-pci 0000:00:06.0: reset done
[  522.139666] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:8331)
[  522.141058] vfio-pci 0000:00:06.0: resetting
[  522.357999] vfio-pci 0000:00:06.0: reset done
[  599.827298] vfio-pci 0000:00:06.0: Removing from iommu group 0
[  600.840505] ena 0000:00:06.0: ENA device version: 0.10
[  600.841245] ena 0000:00:06.0: ENA controller version: 0.0.1 implementation version 1
[  600.935209] ena 0000:00:06.0: ENA Large LLQ is disabled
[  600.947963] ena 0000:00:06.0: Elastic Network Adapter (ENA) found at mem c0508000, mac addr 02:23:07:00:17:03
[  600.957869] ena 0000:00:06.0 ens6: renamed from eth0 (while UP)
[  620.319154] vfio-pci 0000:00:06.0: Adding to iommu group 0
[  620.320519] vfio-pci 0000:00:06.0: Adding kernel taint for vfio-noiommu group on device
[  757.710119] vfio-pci 0000:00:06.0: resetting
[  757.934362] vfio-pci 0000:00:06.0: reset done
[  757.935911] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:8815)
[  757.937309] vfio-pci 0000:00:06.0: resetting
[  758.154226] vfio-pci 0000:00:06.0: reset done
[  830.263848] vfio-pci 0000:00:06.0: Removing from iommu group 0
[  831.276656] ena 0000:00:06.0: ENA device version: 0.10
[  831.277392] ena 0000:00:06.0: ENA controller version: 0.0.1 implementation version 1
[  831.371529] ena 0000:00:06.0: ENA Large LLQ is disabled
[  831.383809] ena 0000:00:06.0: Elastic Network Adapter (ENA) found at mem c0508000, mac addr 02:23:07:00:17:03
[  831.392191] ena 0000:00:06.0 ens6: renamed from eth0
[ 1280.639290] vfio-pci 0000:00:06.0: Adding to iommu group 0
[ 1280.640710] vfio-pci 0000:00:06.0: Adding kernel taint for vfio-noiommu group on device
[ 1364.787200] vfio-pci 0000:00:06.0: resetting
[ 1365.004666] vfio-pci 0000:00:06.0: reset done
[ 1365.006217] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (echo:9987)
[ 1365.007640] vfio-pci 0000:00:06.0: resetting
[ 1365.224537] vfio-pci 0000:00:06.0: reset done

sender-console-output.log (PCI/driver events only)

[ 1183.456394] vfio-pci 0000:00:06.0: resetting
[ 1183.673349] vfio-pci 0000:00:06.0: reset done
[ 1243.979783] vfio-pci 0000:00:06.0: Removing from iommu group 0
[ 1244.991112] ena 0000:00:06.0: ENA device version: 0.10
[ 1244.991825] ena 0000:00:06.0: ENA controller version: 0.0.1 implementation version 1
[ 1245.091120] ena 0000:00:06.0: ENA Large LLQ is disabled
[ 1245.103168] ena 0000:00:06.0: Elastic Network Adapter (ENA) found at mem c0508000, mac addr 02:9b:fb:52:c4:c9
[ 1245.113960] ena 0000:00:06.0 ens6: renamed from eth0 (while UP)
[ 1274.204817] vfio-pci 0000:00:06.0: Adding to iommu group 0
[ 1274.206155] vfio-pci 0000:00:06.0: Adding kernel taint for vfio-noiommu group on device
[ 1380.752329] vfio-pci 0000:00:06.0: resetting
[ 1380.981307] vfio-pci 0000:00:06.0: reset done
[ 1380.982829] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (tokio-rt-worker:10104)
[ 1380.984383] vfio-pci 0000:00:06.0: resetting
[ 1381.201104] vfio-pci 0000:00:06.0: reset done
[ 1381.488188] vfio-pci 0000:00:06.0: resetting
[ 1381.701110] vfio-pci 0000:00:06.0: reset done
[ 1381.702647] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (tokio-rt-worker:10124)
[ 1381.704180] vfio-pci 0000:00:06.0: resetting
[ 1381.921122] vfio-pci 0000:00:06.0: reset done
[ 1383.210155] vfio-pci 0000:00:06.0: resetting
[ 1383.441146] vfio-pci 0000:00:06.0: reset done
[ 1383.442659] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (tokio-rt-worker:10147)
[ 1383.444221] vfio-pci 0000:00:06.0: resetting
[ 1383.661156] vfio-pci 0000:00:06.0: reset done
[ 1384.950537] vfio-pci 0000:00:06.0: resetting
[ 1385.171144] vfio-pci 0000:00:06.0: reset done
[ 1385.172678] vfio-pci 0000:00:06.0: vfio-noiommu device opened by user (tokio-rt-worker:10170)
[ 1385.174240] vfio-pci 0000:00:06.0: resetting
[ 1385.391061] vfio-pci 0000:00:06.0: reset done
