chore(test): add gateway peering NAT tests for all cases in 26.01#1579
chore(test): add gateway peering NAT tests for all cases in 26.01#1579
Conversation
There was a problem hiding this comment.
Pull request overview
Adds comprehensive gateway peering NAT coverage to the hhfab release tests, expanding beyond the existing static/masquerade cases to include port-forward and combined NAT modes across internal VPC peerings and gateway externals.
Changes:
- Extend gateway peering spec helpers to support port-forward NAT and combined masquerade+port-forward (via split expose entries).
- Add new NAT test cases for internal VPC peerings (port-forward only, masquerade+port-forward).
- Add a new external NAT test suite covering BGP and static externals, and wire it into the multi-VPC single-subnet suite; improve external selection/skip signaling.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/hhfab/testing.go | Removes a hard “unsupported” check so peering presence logic can tolerate expose.As (needed for NAT cases). |
| pkg/hhfab/rt_utils.go | Adds NAT-mode helpers (port-forward + combined), external NAT annotation helpers, and an inverted external port-forward peering builder. |
| pkg/hhfab/rt_nat_tests.go | Adds internal gateway peering tests for port-forward NAT and masquerade+port-forward NAT, including a new inbound port-forward connectivity helper. |
| pkg/hhfab/rt_nat_external_tests.go | Introduces a new external NAT test matrix for BGP and static externals (static/masq/port-forward/combined) with convergence handling. |
| pkg/hhfab/rt_multi_vpc_single_subnet_suite.go | Registers the external NAT test cases into the multi-VPC single-subnet suite. |
| pkg/hhfab/rt_base.go | Adds findExternals and enhances skip-flag handling for viable BGP vs static externals. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
170a1dd to
8e2b5dc
Compare
7844d6a to
bf540ad
Compare
bf540ad to
1b5a833
Compare
|
The tbr commit (1b5a833) can be removed once githedgehog/lab-ci#17 is merged |
|
This passes on And There are some issues on I may add |
1b5a833 to
c275a9d
Compare
|
Hit this while testing hlab job: #1593 |
edipascale
left a comment
There was a problem hiding this comment.
thanks @pau-hedgehog, this mostly looks good to me, I'm only concerned about the proliferation of testing methods for the various NAT types and the need to maintain slightly different copies of the same code. FWIW I think it's fine to keep the external related functions separate (and thanks for reusing the checkCurl method there!)
I'm also fine with merging this as is and iterating later on a separate PR, I see no major blocker here
6235ab2 to
5e167fd
Compare
|
After the hlab issues I could run a whole tests harness and we have 2 failures we don't observe in env-1/env-5: |
dc6b610 to
32e95de
Compare
Frostman
left a comment
There was a problem hiding this comment.
Per the discussion on the fabric sync, @pau-hedgehog, please make sure that ALL gateway-related tests have "Gateway" (with capital G) in the name so we can run them with a simple regex and enable it on the dataplane repo
32e95de to
da96854
Compare
|
@pau-hedgehog I've removed |
da96854 to
e8a6f34
Compare
OK. Will re-add once ready to see this PR pass (hopefully) BTW, I added a commit to fail the tests under https://github.com/githedgehog/internal/issues/348 The only reliable way I found was to ping the External IP at least 10 times |
|
After env-ci-1 swap we still face failures on this branch (which passess clean on env-1/env-5):
|
e8a6f34 to
ec9a23e
Compare
ec9a23e to
ad1a036
Compare
bc4896c to
f4897b5
Compare
a52679e to
5c2fe59
Compare
|
This branch is passing consistently now after githedgehog/lab-ci#20 All tests involving Gateway have the word on the test name so we can filter through release test regex |
Add tests for the full NAT matrix (no NAT, static, masquerade, port-forward, masquerade+port-forward) across all three peering target types: internal VPC, BGP external, and static external Signed-off-by: Pau Capdevila <pau@githedgehog.com>
Signed-off-by: Pau Capdevila <pau@githedgehog.com>
5c2fe59 to
357e6a5
Compare

Adds a full NAT test matrix for gateway peerings with external networks, covering all combinations of external type × NAT mode. Also refactors the external selection and NAT spec-building helpers to support the new test variants cleanly.
Part of https://github.com/githedgehog/internal/issues/305
Background
Gateway peering connects VPCs (or a VPC to an upstream network) through a dedicated gateway device that can apply NAT rules. This is different from fabric-native VPC peering: the gateway sits in the data path and can rewrite IP addresses before forwarding.
Externals are upstream network devices (routers/ISPs) that VPCs can reach through a gateway peering. Two types exist:
NAT modes tested (per external type):
10.50.1.5→192.168.85.5)Why per-environment NAT pool CIDRs (the annotations)
All test environments share a single physical edge device (DS2000). If two environments used the same NAT pool CIDR, DS2000 could not route return traffic to the correct environment's gateway — it would go to whichever environment last advertised the route.
Each environment's
Externalobject carries an annotation with its assigned pool:Tests read the annotation at runtime and skip if it's absent, so environments without the annotation configured are not affected.
The "inverted" port-forward tests
Port-forward is an inbound feature — an external client initiates a connection to a NATed IP, and the gateway forwards it to a server behind NAT. In our lab, "the external client" would be DS2000, but we have no SSH access to DS2000 to trigger connections from it.
The workaround: flip the NAT onto the external side. The external device's IP is exposed behind a virtual IP in the NAT pool (
.200by convention), with a port-forward rule5201→15201. A VPC server then acts as the iperf3 client connecting to<natpool>.200:15201, and the gateway forwards that to<ds2000-ip>:5201where an iperf3 server is listening. This exercises the port-forward dataplane without needing outbound access from DS2000.Why MasqueradePortForward produces two expose entries
The gateway API schema does not allow a single
exposeentry to carry bothNAT.MasqueradeandNAT.PortForwardblocks simultaneously. The workaround (buildExposes) emits two expose entries with identicalIPs/Asfields but differentNATblocks. The gateway processes both and applies both rule types to traffic.Why BGP tests retry for up to 2 minutes after WaitReady
WaitReadyconfirms that fabric switches have programmed their forwarding tables. But for BGP externals, there is additional work: the gateway must advertise the NAT pool CIDR to DS2000 via BGP, and DS2000 must accept and install that route. This propagation takes extra time that is not visible from the fabric side. Static externals skip the wait because DS2000 already has a static route for the pool — there is nothing to propagate.Changes
rt_nat_external_tests.go(new) — 10 new test cases covering the full BGP × static × NAT mode matrix, plus connectivity helpers (testNATExternalConnectivity,testIperfToExternal)rt_utils.go— new NAT modes (NATModePortForward,NATModeMasqueradePortForward), newbuildExposeshelper,appendGwExtPeeringSpecWithNAT,appendGwExtInvertedPortForwardPeeringSpec;appendGwPeeringSpecnow returns an errorrt_base.go—findExternalsreplaces inline external-selection loops; separately tracksextName(BGP) andstaticExtName;NoStaticExternalsskip flag is now wired intoselectAndRunSuitert_multi_vpc_single_subnet_suite.go/rt_single_vpc_suite.go/rt_nat_tests.go— updated callers ofappendGwPeeringSpecto handle the new error returntesting.go— removedexpose.Asguard that blocked NAT peerings from being verified.github/workflows/run-vlab.yaml— pinlab-cicheckout topau/nat-tests-annotations(contains per-environment NAT pool annotation configuration)