6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -12,12 +12,12 @@ All notable changes to this project will be documented in this file.

- Telemetry
- Force IPv4-only connections for gNMI tunnel client and fix TLS credential handling

## [v0.8.3](https://github.com/malbeclabs/doublezero/compare/client/v0.8.2...client/v0.8.3) – 2026-01-22

- Client
- Add NAT support for IBRL mode (NAT is already supported in IBRLAllocatedIP mode)
- Data
- Add indexer that syncs serviceability and telemetry data to ClickHouse and Neo4J

## [v0.8.3](https://github.com/malbeclabs/doublezero/compare/client/v0.8.2...client/v0.8.3) – 2026-01-22
### Breaking

- None for this release
18 changes: 14 additions & 4 deletions client/doublezero/src/command/connect.rs
@@ -383,7 +383,7 @@ impl ProvisioningCliCommand {
             pubkey
         ))?;

-        if user_type == UserType::IBRLWithAllocatedIP {
+        if user_type == UserType::IBRL || user_type == UserType::IBRLWithAllocatedIP {
             tunnel_src = resolve_tunnel_src(controller, device).await?;
         }

@@ -399,7 +399,7 @@ impl ProvisioningCliCommand {
         spinner.println(format!(" Device selected: {} ", device.code));
         spinner.inc(1);

-        if user_type == UserType::IBRLWithAllocatedIP {
+        if user_type == UserType::IBRL || user_type == UserType::IBRLWithAllocatedIP {
             tunnel_src = resolve_tunnel_src(controller, &device).await?;
         }

@@ -1157,9 +1157,14 @@ mod tests {
         let (device1_pk, device1) = fixture.add_device(DeviceType::Hybrid, 100, true);
         let user = fixture.create_user(UserType::IBRL, device1_pk, "1.2.3.4");
         fixture.expect_create_user(Pubkey::new_unique(), &user);
-        fixture.expected_provisioning_request(
+
+        let resolved_src = Ipv4Addr::new(192, 168, 1, 100);
+        fixture.expect_resolve_route(device1.public_ip, resolved_src);
+
+        fixture.expected_provisioning_request_with_tunnel_src(
             UserType::IBRL,
             user.client_ip.to_string().as_str(),
+            resolved_src.to_string().as_str(),
             device1.public_ip.to_string().as_str(),
             None,
             None,
@@ -1213,9 +1218,14 @@ mod tests {
         let (device1_pk, device1) = fixture.add_device(DeviceType::Edge, 100, true);
         let user = fixture.create_user(UserType::IBRL, device1_pk, "1.2.3.4");
         fixture.expect_create_user(Pubkey::new_unique(), &user);
-        fixture.expected_provisioning_request(
+
+        let resolved_src = Ipv4Addr::new(192, 168, 1, 101);
+        fixture.expect_resolve_route(device1.public_ip, resolved_src);
+
+        fixture.expected_provisioning_request_with_tunnel_src(
             UserType::IBRL,
             user.client_ip.to_string().as_str(),
+            resolved_src.to_string().as_str(),
             device1.public_ip.to_string().as_str(),
             None,
             None,
2 changes: 1 addition & 1 deletion client/doublezerod/internal/services/ibrl.go
@@ -42,7 +42,7 @@ func (s *IBRLService) Setup(p *api.ProvisionRequest) error {
 	var noUninstall bool
 	switch p.UserType {
 	case api.UserTypeIBRL:
-		err = createBaseTunnel(s.nl, tun)
+		err = createTunnelWithIP(s.nl, tun, p.DoubleZeroIP)
Contributor:

I'm concerned about solving this by just applying the detected public IP to the tunnel interface. If their app is only listening on the RFC1918/whatever address behind the NAT, they're going to bring up their tunnel and we'll potentially break them, because traffic will now come in destined to an address they're not listening on.

Contributor Author:

Yes, it requires applications to bind to 0.0.0.0, but isn't that the nature of the beast? Also I think "break them" is too strong, because wouldn't they have to make some other change external to this client update to send traffic to the public IP?

I can't think of a way to solve for that other than documentation and support. Any ideas?

Contributor:

> Yes, it requires applications to bind to 0.0.0.0, but isn't that the nature of the beast?

What do you mean nature of the beast? In the current IBRL mode, we don't require users to do that.

> Also I think "break them" is too strong, because wouldn't they have to make some other change external to this client update to send traffic to the public IP?

The situation I'm worried about is:

  1. An app behind a 1:1 NAT is listening on its RFC1918 address, not all interfaces
  2. There are 10 other hosts connected to DZ communicating with this app via its public IP behind 1:1 NAT
  3. The app gets connected to DZ, and its public IP is advertised into DZ
  4. The 10 other hosts now attempt to reach the app via DZ
  5. Traffic ends up getting dropped in the kernel where this app is running, because the app's listening socket is not bound to this address
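The dropped-traffic step can be demonstrated in isolation with plain sockets. This is a hypothetical sketch, not part of the DZ codebase: the loopback addresses 127.0.0.1 and 127.0.0.2 stand in for the app's RFC1918 bind address and the 1:1 NAT public IP that shows up on the tunnel.

```go
package main

import (
	"fmt"
	"net"
)

// dialOK reports whether a TCP connection to addr succeeds.
func dialOK(addr string) bool {
	c, err := net.Dial("tcp", addr)
	if err != nil {
		return false
	}
	c.Close()
	return true
}

func main() {
	// App bound to one specific local address (stand-in for its RFC1918 address).
	specific, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer specific.Close()
	port := specific.Addr().(*net.TCPAddr).Port

	// A connection arriving for a different local address (127.0.0.2, stand-in
	// for the NAT public IP) is refused by the kernel: nothing listens there.
	fmt.Println("specific bind, reachable via alternate addr:", dialOK(fmt.Sprintf("127.0.0.2:%d", port)))

	// The same app bound to the wildcard address accepts on any local address.
	wildcard, err := net.Listen("tcp", "0.0.0.0:0")
	if err != nil {
		panic(err)
	}
	defer wildcard.Close()
	wport := wildcard.Addr().(*net.TCPAddr).Port
	fmt.Println("wildcard bind, reachable via alternate addr:", dialOK(fmt.Sprintf("127.0.0.2:%d", wport)))
}
```

On Linux the second dial succeeds only because the wildcard socket accepts on every local address, which is exactly the bind-to-0.0.0.0 requirement under discussion.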

Contributor Author (@nikw9944, Jan 27, 2026):

> What do you mean nature of the beast?

I should have said: "Yes, it requires applications to bind to 0.0.0.0, but is there any other way to do multihoming properly?"

I get the concern, but it doesn't seem like an unreasonable caveat to me. Am I missing something?

Contributor:

> "Yes, it requires applications to bind to 0.0.0.0, but is there any other way to do multihoming properly?"

We could handle the 1:1 NAT for the user on the doublezero0 interface via something like tc. This way it doesn't matter what address the app is bound to and there's no risk we break their application when they connect to DZ.

 	case api.UserTypeIBRLWithAllocatedIP:
 		err = createTunnelWithIP(s.nl, tun, p.DoubleZeroIP)
 		noUninstall = true
2 changes: 1 addition & 1 deletion client/doublezerod/internal/services/services_test.go
@@ -166,7 +166,7 @@ func TestServices(t *testing.T) {
 				RemoteOverlay: net.IPv4(169, 254, 0, 0),
 				MTU:           routing.GREMTU,
 			},
-			wantTunAddrAdded: []MockTunAddr{{IP: "169.254.0.1/31"}},
+			wantTunAddrAdded: []MockTunAddr{{IP: "169.254.0.1/31"}, {IP: "192.168.1.1/32"}},
 			wantTunUp:        true,
 			wantRulesAdded:   nil,
 			wantRoutesAdded:  nil,
224 changes: 224 additions & 0 deletions e2e/client_behind_nat_test.go
@@ -0,0 +1,224 @@
//go:build e2e

package e2e_test

import (
	"os"
	"path/filepath"
	"testing"
	"time"

	"github.com/malbeclabs/doublezero/e2e/internal/devnet"
	"github.com/malbeclabs/doublezero/e2e/internal/random"
	"github.com/stretchr/testify/require"
)

// TestE2E_ClientBehindNAT verifies that IBRL mode works for clients behind a NAT gateway.
// This proves that the tunnel can be established and the BGP session comes up when the client's
// public IP differs from its private IP (i.e., the client is behind NAT).
//
// The test also verifies that IBRL with --allocate-addr works alongside the NAT client.
func TestE2E_ClientBehindNAT(t *testing.T) {
	t.Parallel()

	deployID := "dz-e2e-" + t.Name() + "-" + random.ShortID()
	log := logger.With("test", t.Name(), "deployID", deployID)

	currentDir, err := os.Getwd()
	require.NoError(t, err)
	serviceabilityProgramKeypairPath := filepath.Join(currentDir, "data", "serviceability-program-keypair.json")

	dn, err := devnet.New(devnet.DevnetSpec{
		DeployID:  deployID,
		DeployDir: t.TempDir(),

		CYOANetwork: devnet.CYOANetworkSpec{
			CIDRPrefix: subnetCIDRPrefix,
		},
		Manager: devnet.ManagerSpec{
			ServiceabilityProgramKeypairPath: serviceabilityProgramKeypairPath,
		},
	}, log, dockerClient, subnetAllocator)
	require.NoError(t, err)

	log.Info("==> Starting devnet")
	err = dn.Start(t.Context(), nil)
	require.NoError(t, err)
	log.Info("--> Devnet started")

	// Add a single device.
	deviceCode := "ewr1-dz01"
	log.Info("==> Adding device", "deviceCode", deviceCode)
	device, err := dn.AddDevice(t.Context(), devnet.DeviceSpec{
		Code:                         deviceCode,
		Location:                     "ewr",
		Exchange:                     "xewr",
		CYOANetworkIPHostID:          16,
		CYOANetworkAllocatablePrefix: 29,
		Interfaces: map[string]string{
			"Ethernet2": "physical",
		},
		LoopbackInterfaces: map[string]string{
			"Loopback255": "vpnv4",
			"Loopback256": "ipv4",
		},
	})
	require.NoError(t, err)
	devicePK := device.ID
	log.Info("--> Device added", "deviceCode", deviceCode, "devicePK", devicePK)

	// Wait for device to exist onchain.
	log.Info("==> Waiting for device to exist onchain")
	serviceabilityClient, err := dn.Ledger.GetServiceabilityClient()
	require.NoError(t, err)
	require.Eventually(t, func() bool {
		data, err := serviceabilityClient.GetProgramData(t.Context())
		if err != nil {
			return false
		}
		return len(data.Devices) == 1
	}, 30*time.Second, 1*time.Second)
	log.Info("--> Device exists onchain")

	// Create NAT infrastructure.
	log.Info("==> Creating NAT infrastructure")
	behindNATNetwork := devnet.NewBehindNATNetwork(dn, log, "nat1")
	_, err = behindNATNetwork.CreateIfNotExists(t.Context())
	require.NoError(t, err)
	log.Info("--> Behind-NAT network created", "name", behindNATNetwork.Name, "subnet", behindNATNetwork.SubnetCIDR)

	// Create and start NAT gateway.
	natGateway := &devnet.NATGateway{
		Spec: &devnet.NATGatewaySpec{
			Code:                     "gw1",
			BehindNATNetworkIPHostID: 2,
			CYOANetworkIPHostID:      130,
		},
		BehindNATNetwork: behindNATNetwork,
	}
	natGateway.SetDevnet(dn, log)
	_, err = natGateway.StartIfNotRunning(t.Context())
	require.NoError(t, err)
	log.Info("--> NAT gateway started", "behindNATIP", natGateway.BehindNATNetworkIP, "cyoaIP", natGateway.CYOANetworkIP)

	// Add an IBRL client with --allocate-addr (not behind NAT).
	log.Info("==> Adding allocate-addr client")
	allocateAddrNatClient, err := dn.AddClient(t.Context(), devnet.ClientSpec{
		CYOANetworkIPHostID: 100,
	})
	require.NoError(t, err)
	log.Info("--> Allocate-addr client added", "pubkey", allocateAddrNatClient.Pubkey, "ip", allocateAddrNatClient.CYOANetworkIP)

	// Add an IBRL client behind NAT.
	log.Info("==> Adding client behind NAT")
	ibrlNatClient, err := dn.AddClient(t.Context(), devnet.ClientSpec{
		BehindNATGateway:         natGateway,
		BehindNATNetworkIPHostID: 10,
	})
	require.NoError(t, err)
	log.Info("--> NAT client added", "pubkey", ibrlNatClient.Pubkey, "privateIP", ibrlNatClient.PrivateIP, "publicIP", ibrlNatClient.CYOANetworkIP)

	// Verify the NAT client's public IP is the NAT gateway's CYOA IP.
	require.Equal(t, natGateway.CYOANetworkIP, ibrlNatClient.CYOANetworkIP)

	// Configure NAT rules for the client.
	log.Info("==> Configuring NAT for client")
	err = natGateway.ConfigureNATForClient(t.Context(), ibrlNatClient.PrivateIP)
	require.NoError(t, err)
	log.Info("--> NAT configured for client")

	// Wait for latency results.
	log.Info("==> Waiting for client latency results")
	err = allocateAddrNatClient.WaitForLatencyResults(t.Context(), devicePK, 90*time.Second)
	require.NoError(t, err)
	err = ibrlNatClient.WaitForLatencyResults(t.Context(), devicePK, 90*time.Second)
	require.NoError(t, err)
	log.Info("--> Latency results received")

	// Add clients to Access Pass.
	log.Info("==> Adding clients to Access Pass")
	_, err = dn.Manager.Exec(t.Context(), []string{"bash", "-c", "doublezero access-pass set --accesspass-type prepaid --client-ip " + allocateAddrNatClient.CYOANetworkIP + " --user-payer " + allocateAddrNatClient.Pubkey})
	require.NoError(t, err)
	_, err = dn.Manager.Exec(t.Context(), []string{"bash", "-c", "doublezero access-pass set --accesspass-type prepaid --client-ip " + ibrlNatClient.CYOANetworkIP + " --user-payer " + ibrlNatClient.Pubkey})
	require.NoError(t, err)
	log.Info("--> Clients added to Access Pass")

	// Run connect subtest.
	if !t.Run("connect", func(t *testing.T) {
		log.Info("==> Connecting allocate-addr client in IBRL mode with --allocate-addr")
		_, err = allocateAddrNatClient.Exec(t.Context(), []string{"doublezero", "connect", "ibrl", "--client-ip", allocateAddrNatClient.CYOANetworkIP, "--device", deviceCode, "--allocate-addr"})
		require.NoError(t, err)
		log.Info("--> Allocate-addr client connected")

		log.Info("==> Connecting NAT client in IBRL mode")
		_, err = ibrlNatClient.Exec(t.Context(), []string{"doublezero", "connect", "ibrl", "--client-ip", ibrlNatClient.CYOANetworkIP, "--device", deviceCode})
		require.NoError(t, err)
		log.Info("--> NAT client connected")

		log.Info("==> Waiting for tunnels to come up")
		err = allocateAddrNatClient.WaitForTunnelUp(t.Context(), 90*time.Second)
		require.NoError(t, err)
		log.Info("--> Allocate-addr client tunnel up")

		err = ibrlNatClient.WaitForTunnelUp(t.Context(), 90*time.Second)
		require.NoError(t, err)
		log.Info("--> NAT client tunnel up (BGP session established)")

		log.Info("==> Verifying tunnel status")

		allocateAddrStatus, err := allocateAddrNatClient.GetTunnelStatus(t.Context())
		require.NoError(t, err)
		require.Len(t, allocateAddrStatus, 1)
		allocateAddrDZIP := allocateAddrStatus[0].DoubleZeroIP.String()
		log.Info("--> Allocate-addr client DZ IP", "ip", allocateAddrDZIP)
		// Allocate-addr client should get an IP from the device's allocatable range (not its CYOA IP).
		require.NotEqual(t, allocateAddrNatClient.CYOANetworkIP, allocateAddrDZIP, "allocate-addr client should get allocated IP, not CYOA IP")

		natStatus, err := ibrlNatClient.GetTunnelStatus(t.Context())
		require.NoError(t, err)
		require.Len(t, natStatus, 1)
		natDZIP := natStatus[0].DoubleZeroIP.String()
		log.Info("--> NAT client DZ IP", "ip", natDZIP)
		require.Equal(t, ibrlNatClient.CYOANetworkIP, natDZIP, "NAT client's DZ IP should be NAT gateway's CYOA IP")

		log.Info("--> Verified: allocate-addr client got allocated IP, NAT client uses gateway's public IP")

		log.Info("==> Testing connectivity between clients")

		// Allocate-addr client pings NAT client.
		_, err = allocateAddrNatClient.Exec(t.Context(), []string{"ping", "-c", "3", natDZIP, "-W", "1"})
		require.NoError(t, err)
		log.Info("--> Allocate-addr client can ping NAT client", "src", allocateAddrDZIP, "dst", natDZIP)

		// NAT client pings allocate-addr client.
		_, err = ibrlNatClient.Exec(t.Context(), []string{"ping", "-c", "3", allocateAddrDZIP, "-W", "1"})
		require.NoError(t, err)
		log.Info("--> NAT client can ping allocate-addr client", "src", natDZIP, "dst", allocateAddrDZIP)

		log.Info("--> Connectivity verified: bidirectional ping works with ibrl and ibrl -a clients behind NAT")
	}) {
		t.Fail()
		return
	}

	// Run disconnect subtest.
	if !t.Run("disconnect", func(t *testing.T) {
		log.Info("==> Disconnecting clients")
		_, err = allocateAddrNatClient.Exec(t.Context(), []string{"doublezero", "disconnect", "--client-ip", allocateAddrNatClient.CYOANetworkIP})
		require.NoError(t, err)
		_, err = ibrlNatClient.Exec(t.Context(), []string{"doublezero", "disconnect", "--client-ip", ibrlNatClient.CYOANetworkIP})
		require.NoError(t, err)
		log.Info("--> Clients disconnected")

		log.Info("==> Waiting for tunnels to disconnect")
		err = allocateAddrNatClient.WaitForTunnelDisconnected(t.Context(), 60*time.Second)
		require.NoError(t, err)
		err = ibrlNatClient.WaitForTunnelDisconnected(t.Context(), 60*time.Second)
		require.NoError(t, err)
		log.Info("--> Tunnels disconnected")
	}) {
		t.Fail()
	}

	log.Info("==> Test completed successfully - IBRL mode works via NAT and with --allocate-addr")
}