
feat(rtps): Improve implementation to talk to FastRTPS and implement reliable QoS (HEARTBEAT/ACKNACK) #660

Merged: finger563 merged 2 commits into main from feat/rtps-reliability-and-improvements on Jun 30, 2026
finger563 commented Jun 29, 2026

Contributor

Description

Takes the RTPS participant from best-effort-only to a fully interoperable
reliable RTPS endpoint, validated on hardware against Fast DDS/RTPS (ESP32-P4
over Ethernet publishing to a Fast RTPS subscriber on another machine).

Reliable QoS — full HEARTBEAT / ACKNACK / GAP, both directions:

  • Submessage codecs: HEARTBEAT, ACKNACK, GAP, INFO_DST, and a
    SequenceNumberSet (bitmapBase + up to 256-bit bitmap), all endian-aware.
  • Reliable writer: per-writer history cache (bounded by
    WriterConfig.history_depth, KEEP_LAST); sample cached on publish();
    HEARTBEAT emitted after publish and periodically (Config.heartbeat_period);
    on ACKNACK, resends the NACKed samples still in history to the requesting
    reader, and emits a GAP for samples already evicted so the reader stops
    re-NACKing them.
  • Reliable reader: per-(reader, remote-writer) state with duplicate
    suppression and in-order delivery via a bounded reorder buffer
    (Config.reliable_reorder_depth); replies to HEARTBEATs with an ACKNACK
    (missing-SN bitmap or positive ack); honors incoming GAP by advancing its
    frontier past irrelevant SNs (and releasing anything buffered behind them).
  • One shared build_directed_data_message (INFO_DST + DATA addressed to the
    target reader) drives all retransmission.
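The writer-side ACKNACK handling described above can be sketched roughly as follows. This is a minimal illustration, not the component's actual code: `WriterHistory`, `NackPlan`, `Sample`, and the method names are hypothetical, and the NACKed sequence numbers are assumed to have already been decoded from the ACKNACK's SequenceNumberSet into a plain list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not the component's real API.
using SequenceNumber = std::int64_t;

struct Sample {
  SequenceNumber sn;
  std::vector<std::uint8_t> payload;
};

// Result of processing one ACKNACK.
struct NackPlan {
  std::vector<SequenceNumber> resend; // still cached: retransmit as directed DATA
  std::vector<SequenceNumber> gap;    // already evicted: report via GAP
};

class WriterHistory {
public:
  explicit WriterHistory(std::size_t depth) : depth_(depth) { assert(depth > 0); }

  // Cache a sample on publish. KEEP_LAST: once the depth is exceeded,
  // the oldest sample is evicted.
  SequenceNumber add(std::vector<std::uint8_t> payload) {
    SequenceNumber sn = ++last_sn_;
    samples_.push_back(Sample{sn, std::move(payload)});
    if (samples_.size() > depth_) samples_.pop_front();
    return sn;
  }

  // Range a HEARTBEAT would advertise. An empty history is reported as
  // first == last + 1.
  SequenceNumber first_available() const {
    return samples_.empty() ? last_sn_ + 1 : samples_.front().sn;
  }
  SequenceNumber last_written() const { return last_sn_; }

  // Split the NACKed sequence numbers into "resend" and "gap".
  NackPlan plan(const std::vector<SequenceNumber>& nacked) const {
    NackPlan p;
    for (SequenceNumber sn : nacked) {
      if (sn < 1 || sn > last_sn_) continue; // never written; ignore
      if (sn < first_available()) {
        p.gap.push_back(sn);    // evicted, so the reader should stop NACKing it
      } else {
        p.resend.push_back(sn); // still in history
      }
    }
    return p;
  }

private:
  std::size_t depth_;
  SequenceNumber last_sn_ = 0;
  std::deque<Sample> samples_;
};
```

With a depth of 2 and three samples published, a NACK for SNs 1 through 4 would put SN 1 in `gap`, SNs 2 and 3 in `resend`, and drop SN 4 as not yet written.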

Builtin discovery (SEDP) reliability:

  • We now ACKNACK the HEARTBEATs a reliable peer sends for its builtin SEDP
    writers, so Fast DDS (re)sends us the endpoint data we need to match it (fixes
    the original "No send destinations" failure).
  • Our builtin SEDP writers now use stable, index-based sequence numbers (each
    local endpoint's SN is its 1-based index) and emit SEDP HEARTBEATs, so a
    reliable peer ACKNACKs and we retransmit any announcement it missed. Discovery
    is now recoverable on a lossy link, not just via periodic re-announce.
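The index-based numbering can be illustrated with a small sketch. It assumes local endpoints are kept in a stable, ordered list; `LocalEndpoint`, `HeartbeatRange`, and the helper names are hypothetical and only show how a fixed SN-per-endpoint mapping lets a NACKed SN be resolved back to the announcement to retransmit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
using SequenceNumber = std::int64_t;

struct LocalEndpoint {
  std::string topic;
};

struct HeartbeatRange {
  SequenceNumber first;
  SequenceNumber last;
};

// Each local endpoint's SEDP sequence number is its 1-based index, so the
// same endpoint is always announced under the same SN.
inline SequenceNumber sedp_sn_for_index(const std::vector<LocalEndpoint>& eps,
                                        std::size_t index) {
  assert(index < eps.size());
  return static_cast<SequenceNumber>(index) + 1;
}

// A SEDP HEARTBEAT would then advertise [1, N] for N local endpoints.
// With no endpoints this yields first == last + 1, i.e. an empty range.
inline HeartbeatRange sedp_heartbeat(const std::vector<LocalEndpoint>& eps) {
  return HeartbeatRange{1, static_cast<SequenceNumber>(eps.size())};
}

// Resolve a NACKed SN back to the endpoint whose announcement should be
// retransmitted; returns nullptr when the SN is out of range.
inline const LocalEndpoint* endpoint_for_sn(const std::vector<LocalEndpoint>& eps,
                                            SequenceNumber sn) {
  if (sn < 1 || sn > static_cast<SequenceNumber>(eps.size())) return nullptr;
  return &eps[static_cast<std::size_t>(sn - 1)];
}
```

Because the mapping is stable, a periodic re-announce and a NACK-driven retransmission of the same endpoint carry the same SN, which lets the peer's duplicate handling treat them as one sample.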

Interoperability fixes:

  • Unicast DATA now carries the matched reader's readerId (multicast stays
    ENTITYID_UNKNOWN). Fast RTPS reliable readers require this.
  • SEDP is addressed to the peer's advertised metatraffic unicast locator.
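The readerId rule for DATA can be sketched as a small selection function. The names `Destination` and `reader_id_for` are hypothetical; the only value taken from the RTPS specification is ENTITYID_UNKNOWN, which is all zeros.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical types for illustration only.
using EntityId = std::array<std::uint8_t, 4>;

// ENTITYID_UNKNOWN is all zeros in the RTPS specification.
constexpr EntityId ENTITYID_UNKNOWN{{0x00, 0x00, 0x00, 0x00}};

enum class Destination { Unicast, Multicast };

// Unicast DATA names the matched reader; multicast DATA (or unicast with no
// matched reader known yet) falls back to ENTITYID_UNKNOWN.
inline EntityId reader_id_for(Destination dest,
                              const std::optional<EntityId>& matched_reader) {
  if (dest == Destination::Unicast && matched_reader) return *matched_reader;
  return ENTITYID_UNKNOWN;
}
```

The fallback for a unicast send with no matched reader is an assumption of this sketch; the description above only specifies the matched-unicast and multicast cases.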

Discovery storage refactor (DiscoveryDb):

  • Participants and endpoints live in a GUID-keyed DiscoveryDb (own mutex)
    instead of vectors scanned with find_if.
  • Discovery updates merge the fields present in each announcement into the
    existing record, so a later/trimmed announce no longer erases learned
    locators/QoS/names.
  • ParticipantProxy stores the full metatraffic/default unicast+multicast
    locators.
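The merge-on-update behaviour can be sketched as below. This is a simplified illustration under stated assumptions, not the real DiscoveryDb: the record fields, `EndpointUpdate`, the `merge`/`find` method names, and the use of `std::map` are all hypothetical. The point it shows is that a field absent from an announcement leaves the stored value untouched.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
using Guid = std::array<std::uint8_t, 16>;

struct Locator {
  std::string address;
  std::uint16_t port;
};

// One parsed announcement. An unset field means "not present in this
// announcement", which is different from "present but empty".
struct EndpointUpdate {
  std::optional<std::string> topic_name;
  std::optional<bool> reliable;
  std::optional<std::vector<Locator>> unicast_locators;
};

struct EndpointRecord {
  std::string topic_name;
  bool reliable = false;
  std::vector<Locator> unicast_locators;
};

class DiscoveryDb {
public:
  // Merge only the fields present in the update into the stored record,
  // creating the record on first sight of the GUID.
  void merge(const Guid& guid, const EndpointUpdate& u) {
    std::lock_guard<std::mutex> lock(mutex_);
    EndpointRecord& rec = endpoints_[guid];
    if (u.topic_name) rec.topic_name = *u.topic_name;
    if (u.reliable) rec.reliable = *u.reliable;
    if (u.unicast_locators) rec.unicast_locators = *u.unicast_locators;
  }

  // Keyed lookup; returns a copy so the caller holds no reference past the lock.
  std::optional<EndpointRecord> find(const Guid& guid) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = endpoints_.find(guid);
    if (it == endpoints_.end()) return std::nullopt;
    return it->second;
  }

private:
  mutable std::mutex mutex_; // the database owns its own lock
  std::map<Guid, EndpointRecord> endpoints_;
};
```

In this sketch, a full announcement followed by a trimmed one that carries only the reliability flag leaves the previously learned topic name and locators in place.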

Diagnostics: DEBUG logging across discovery and the reliable paths
(advertised address/ports, parsed SPDP/SEDP locators, retransmit/GAP
destinations, unknown-writer DATA).

See components/rtps/RELIABLE_RTPS_PLAN.md for the phase-by-phase design.

Motivation and Context

The participant could discover peers and send best-effort data but could not
interoperate with reliable DDS/ROS 2 endpoints: it never ACKNACK'd discovery or
data HEARTBEATs (so reliable peers withheld discovery data → "No send
destinations"), sent DATA with an unknown readerId, never retransmitted on
NACK, lost locators on re-announcement, and steered multicast out the wrong
interface on multi-homed hosts. This change makes the ESP32 a reliable RTPS
publisher/subscriber that Fast RTPS accepts and that recovers from packet loss.

How has this been tested?

  • rtps example builds clean (esp32 family); socket changes build via the rtps
    dependency.
  • On hardware: ESP32-P4 over Ethernet publishing to a Fast RTPS subscriber on a
    separate machine — confirmed SPDP/SEDP discovery, the builtin + user
    HEARTBEAT/ACKNACK exchange (Wireshark + new DEBUG logs), and reliable sample
    delivery. SEDP HEARTBEAT/ACKNACK confirmed working on hardware.
  • Verified multicast egress/join on the configured wired interface on a
    multi-homed macOS host without OS interface-priority changes.
  • Pending: automated induced-packet-loss tests and Cyclone interop (tracked in
    the plan doc, Phase 5).

Screenshots (if appropriate):

N/A

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change
  • Documentation Update
  • Software change

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly (component README + RELIABLE_RTPS_PLAN.md).

Software

  • I have added tests to cover my changes.
  • I have updated .github/workflows/build.yml to add my new test to the cloud build.
  • All new and existing tests passed.
  • My code follows the code style of this project.

github-actions Bot commented Jun 29, 2026


✅ Static analysis result - no issues found! ✅

finger563 self-assigned this on Jun 29, 2026
finger563 added the enhancement and rtps labels on Jun 29, 2026
finger563 changed the title from "feat(rtps): Improve implementation to talk to FastRTPS and implement HEARTBEAT/ACKNACK" to "feat(rtps): Improve implementation to talk to FastRTPS and implement reliable QoS (HEARTBEAT/ACKNACK)" on Jun 29, 2026
finger563 merged commit a721ea4 into main on Jun 30, 2026
229 of 230 checks passed
finger563 deleted the feat/rtps-reliability-and-improvements branch on June 30, 2026 03:25