fix: zerohop inflates traceroute hop count and shows 'Meshtastic ffff (ffff)' ghost hops #46

@eric-becker

Description

When a traceroute response (or request) traverses an MQTT gateway on a channel where floodgate zero-hops the packet, the receiving device renders the route with several Meshtastic ffff (ffff) entries between the real endpoints and inflates the total hop count to the protocol maximum (typically 7).

The MQTT gateway leg should appear as a single hop (named correctly, with synthetic 0 dB SNR), not as a run of unknown filler nodes.

Suspected root cause

zerohop_protobuf / zerohop_json set hop_limit = 0 but leave hop_start untouched. Meshtastic firmware computes hops_taken = hop_start - hop_limit for display purposes (and traceroute rendering). With a typical hop_start = 7 and our forced hop_limit = 0, every receiver concludes the packet has already taken 7 hops, and the traceroute UI fills the gap between known endpoints with Meshtastic ffff (ffff) placeholders (0xffffffff = unknown/broadcast).

Effectively: zero-hop correctly prevents re-broadcast, but the on-the-wire packet now lies about the number of hops it took to get to the receiver.
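The arithmetic behind the suspected root cause can be illustrated with a minimal sketch. It assumes only the formula quoted above (hops_taken = hop_start - hop_limit); the helper name hops_taken is hypothetical and is not taken from the firmware or from floodgate.

```python
def hops_taken(hop_start: int, hop_limit: int) -> int:
    """Hop count a receiver would derive from the two header fields."""
    return hop_start - hop_limit


# Packet as originally sent: no hops consumed yet.
as_sent = hops_taken(hop_start=7, hop_limit=7)

# Same packet after zero-hop forces hop_limit to 0 but leaves hop_start alone.
after_zerohop = hops_taken(hop_start=7, hop_limit=0)

print(as_sent, after_zerohop)  # prints "0 7"
```

The second value is what the receiver sees: a packet that claims to have used its entire hop budget, even though it only crossed the MQTT leg.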

Reproduction

  1. Run floodgate with default config (zerohop_enabled: true, all 8 standard channels in zerohop_channels) in front of a public-channel EMQX broker.
  2. From a Meshtastic client, run a traceroute against any MQTT-attached gateway node on a zerohopped channel (e.g. LongFast).
  3. Observe the traceroute output: 6+ Meshtastic ffff (ffff) entries appear between the local node and the gateway, in both directions, with the displayed hop count maxed out.

Screenshots from production (Beanfield Gator Lite tracing several MQTT gateways) — to be attached to this issue:

  • Bombero MQTT node traceroute (request and response): 6× Meshtastic ffff (ffff) before the real endpoint.
  • LWS ARC Area 51 Gateway (MQTT) traceroute: same pattern.
  • LWS ARC Altamonte Gateway (MQTT) traceroute: same pattern.
  • LWS ARC Shaun Gateway (MQTT) traceroute: same pattern.

Expected behavior

A packet that arrived via an MQTT gateway (zerohopped or not) should render in traceroute as a single hop with the gateway's resolved name and 0 dB SNR. No ghost ffff entries, and the total hop count should reflect the actual delivered hop count, not the original hop_start.

Code pointers

Proposed fix (to validate)

When forcing hop_limit = 0, also rewrite hop_start so that hop_start - hop_limit matches the delivered hop count: hop_start = 0 if the MQTT leg should count as zero hops, or hop_start = 1 if it should count as the single hop described under Expected behavior. The two fields should always be modified together so receivers compute hops_taken correctly. This needs verifying against firmware behaviour (see related issue for the MCP-server-driven test harness): there may be additional fields (e.g. relay_node, next_hop, via_mqtt) that the firmware also consults for traceroute rendering.
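A rough sketch of the paired rewrite on a JSON-style packet, for discussion only. The function name zerohop_fields and the delivered_hops parameter are hypothetical; this is not floodgate's actual zerohop_json, and whether the right value is 0 or 1 is exactly what still needs validating against the firmware.

```python
from typing import Any, Dict


def zerohop_fields(packet: Dict[str, Any], delivered_hops: int = 0) -> Dict[str, Any]:
    """Return a copy of `packet` with hop_limit and hop_start rewritten together.

    `delivered_hops` is an assumption: the hop count receivers should derive
    from hop_start - hop_limit once the packet has been zero-hopped.
    """
    if delivered_hops < 0:
        raise ValueError("delivered_hops must be non-negative")
    patched = dict(packet)                # shallow copy, leave the input untouched
    patched["hop_limit"] = 0              # still prevents re-broadcast
    patched["hop_start"] = delivered_hops # keeps hop_start - hop_limit coherent
    return patched


packet = {"from": 1, "to": 2, "hop_start": 7, "hop_limit": 3}
patched = zerohop_fields(packet, delivered_hops=1)
print(patched["hop_start"] - patched["hop_limit"])  # prints "1"
```

Returning a copy rather than mutating in place keeps the original values available for logging or metrics, which may help when comparing before/after behaviour in the test harness.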

Tests to add alongside the fix

  • Unit: zerohop_protobuf / zerohop_json produce coherent (hop_start, hop_limit) pairs.
  • Integration: end-to-end traceroute through a zerohopped MQTT gateway renders without ffff ghost hops (gated on the upstream Meshtastic MCP server work).
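One possible shape for the unit test, as a sketch. The is_coherent predicate and its max_delivered bound are assumptions about what "coherent" should mean here (hop_limit is 0 and the derived hop count is at most one); the real test would feed it the fields produced by zerohop_protobuf / zerohop_json rather than literal values.

```python
import unittest


def is_coherent(hop_start: int, hop_limit: int, max_delivered: int = 1) -> bool:
    """True if a zero-hopped packet's fields imply at most `max_delivered` hops."""
    return hop_limit == 0 and 0 <= hop_start - hop_limit <= max_delivered


class CoherentPairTest(unittest.TestCase):
    def test_untouched_hop_start_is_rejected(self):
        # Current behaviour: hop_limit forced to 0, hop_start left at 7.
        self.assertFalse(is_coherent(hop_start=7, hop_limit=0))

    def test_paired_rewrite_is_accepted(self):
        # Either candidate from the proposed fix should pass.
        for hop_start in (0, 1):
            with self.subTest(hop_start=hop_start):
                self.assertTrue(is_coherent(hop_start, 0))

    def test_nonzero_hop_limit_is_rejected(self):
        # A packet that can still be re-broadcast is not zero-hopped.
        self.assertFalse(is_coherent(hop_start=3, hop_limit=3))


if __name__ == "__main__":
    unittest.main()
```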

Environment

  • floodgate version: current main / feat/helm-chart
  • EMQX version: 6.2.0
  • Deployment: production (k3s) and docker-compose.test
