Skip to content

RadioLibInterface: drop "caught missed RX_DONE" from WARN to DEBUG#10255

Open
nightjoker7 wants to merge 1 commit into
meshtastic:developfrom
nightjoker7:fix/missed-rxdone-log-level
Open

RadioLibInterface: drop "caught missed RX_DONE" from WARN to DEBUG#10255
nightjoker7 wants to merge 1 commit into
meshtastic:developfrom
nightjoker7:fix/missed-rxdone-log-level

Conversation

@nightjoker7
Copy link
Copy Markdown
Contributor

Context

PR #9658 added pollMissedIrqs() as a 1 Hz backup to catch RX_DONE events where the GPIO edge happened while the ISR was still disabled from processing the previous packet. That's a real scenario on back-to-back receptions:

  1. Packet A's edge fires DIO1 → ISR runs → disableInterrupt()
  2. Worker thread reads + processes A via SPI (takes milliseconds)
  3. Packet B completes during step 2; its DIO1 transition happens while our GPIO interrupt is disabled
  4. Worker re-enables interrupt, but DIO1 is already high — no new edge to trigger on
  5. pollMissedIrqs() at 1 Hz notices the RX_DONE flag is set, kicks the handler → packet B gets processed

This is the polling safety net working exactly as intended. Not a warning condition.

Problem

Current log level is LOG_WARN:

if (iface->checkIrq(RADIOLIB_IRQ_RX_DONE)) {
    LOG_WARN("caught missed RX_DONE");
    notify(ISR_RX, true);
}

On a Station G2 running in ROUTER_LATE in an active urban mesh, this fires ~200 times per day under sustained RX load — clustered during dense-RX bursts (e.g., phone-app reconnect storms, neighbor floods). It buries actual warnings in the log.

Same pattern as PR #10252's CRC_MISMATCH fix (same file): a normal radio recovery event logged at the wrong level.

Fix

Drop to LOG_DEBUG and expand the message so it's self-documenting at DEBUG level:

LOG_DEBUG("Caught RX_DONE via poll (ISR-missed edge, normal under dense RX)");

Alignment with existing file conventions

This matches how RadioLibInterface.cpp logs other "we compensated for a normal radio oddity" cases:

Line Current level Condition
123 LOG_DEBUG "Ignore false preamble detection" (receiver timed out on PREAMBLE_DETECTED without follow-up HEADER_VALID)
130 LOG_DEBUG "Ignore false header detection" (similar — HEADER_VALID but no RX_DONE follow-up)
558 (was) LOG_WARN "caught missed RX_DONE" (our robustness-poll catching a back-to-back RX)

The new level brings #558 into line with the other two.

Risk

Minimal — log level only, no behavior change. Identical commit pattern to the CRC_MISMATCH drop in #10252.

Observation data

On my Station G2 fleet over 24 hours, captured:

177 LOG_WARN "caught missed RX_DONE"

Clustered at high-stress periods:

  • 71 events during a 17:00 UTC reconnect-storm window
  • 36 during morning mesh activity
  • 50 during an early-morning reconnect event

In every case the poll successfully recovered the missed IRQ and the packet was processed normally. Zero observed lost packets attributable to missed RX_DONE — the backup poll is doing its job, it just shouldn't be shouting while it does.

This log line fires in the normal recovery path for back-to-back RX
events: a second packet's DIO1 edge arrives while the GPIO interrupt
is still disabled from processing the previous packet (between ISR
disable and handler re-enable), so no edge event queues for when we
re-enable. The 1 Hz `pollMissedIrqs()` backup (added in meshtastic#9658) then
sees the RX_DONE IRQ flag is set, logs this line, and kicks the
handler.

This is the polling safety-net working as designed — not a warning
condition. On a ROUTER_LATE Station G2 in an active urban mesh, it
fires ~200 times per day under sustained RX load, burying actual
warnings in noise.

The file already uses LOG_DEBUG for analogous "we compensated for a
normal radio oddity" messages:

- LOG_DEBUG("Ignore false preamble detection")  (line 123)
- LOG_DEBUG("Ignore false header detection")    (line 130)

Align this one with that convention. Also expand the message slightly
so anyone still watching DEBUG logs can understand what the line means
without cross-referencing meshtastic#9658.

No behavior change — only log level.
if (iface->checkIrq(RADIOLIB_IRQ_RX_DONE)) {
LOG_WARN("caught missed RX_DONE");
// This is the normal recovery path for back-to-back RX events where the
// second packet's DIO1 edge happens while the GPIO interrupt was still
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the log level changes, but this comment block is a novel 😓

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants