fix UdpWriter log flooding on thread interrupt #1227

Merged
brharrington merged 1 commit into Netflix:main from brharrington:excessive-logging
Mar 11, 2026

@brharrington
Contributor

In Spark, executor threads are interrupted to cancel tasks. A write on an interrupted thread closes the UDP channel and throws ClosedByInterruptException, and the thread's interrupt flag stays set. The reconnect attempt then always fails because the thread is still interrupted, so a WARN is logged on every write.

  • Catch ClosedByInterruptException separately and skip reconnection since it will always fail while the thread is interrupted

  • Remove WARN logs from writeImpl that bypassed the suppressWarnings mechanism in SidecarWriter.write(), which could cause flooding on any repeated reconnection failure (e.g., FD exhaustion)
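
The first bullet can be illustrated with a minimal sketch. This is not the actual UdpWriter source: the class name `InterruptAwareUdpWriter`, the `reconnectAttempts` counter, and the port number are hypothetical and exist only for illustration. It relies on `ClosedByInterruptException` being a subclass of `ClosedChannelException`, so the more specific catch has to come first.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a UDP writer; names are assumptions, not the real code. */
final class InterruptAwareUdpWriter implements AutoCloseable {
  private final SocketAddress address;
  private DatagramChannel channel;
  private int reconnectAttempts; // illustration only, lets callers observe the behavior

  InterruptAwareUdpWriter(SocketAddress address) throws IOException {
    this.address = address;
    connect();
  }

  private void connect() throws IOException {
    channel = DatagramChannel.open();
    channel.connect(address);
  }

  void write(String line) throws IOException {
    ByteBuffer buffer = ByteBuffer.wrap(line.getBytes(StandardCharsets.UTF_8));
    try {
      channel.write(buffer);
    } catch (ClosedByInterruptException e) {
      // The interrupt flag is still set, so skip the reconnect and
      // let the caller decide whether and how to log.
      throw e;
    } catch (ClosedChannelException e) {
      // Closed for some other reason: try to reconnect once, without logging here.
      reconnectAttempts++;
      try {
        connect();
      } catch (IOException ex) {
        e.addSuppressed(ex);
      }
      throw e;
    }
  }

  int reconnectAttempts() {
    return reconnectAttempts;
  }

  boolean isOpen() {
    return channel.isOpen();
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }

  public static void main(String[] args) throws IOException {
    // Port 1234 is arbitrary; no listener is needed for this demo.
    SocketAddress addr = new InetSocketAddress(InetAddress.getLoopbackAddress(), 1234);
    try (InterruptAwareUdpWriter writer = new InterruptAwareUdpWriter(addr)) {
      Thread.currentThread().interrupt(); // simulate task cancellation
      try {
        writer.write("example line");
      } catch (ClosedByInterruptException e) {
        System.out.println("reconnects attempted: " + writer.reconnectAttempts());
      } finally {
        Thread.interrupted(); // clear the flag so the demo exits cleanly
      }
    }
  }
}
```

Rethrowing instead of logging inside the writer leaves the decision about warnings to whatever wraps the write call, which matches the intent of the second bullet.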

@brharrington brharrington added this to the 1.9.5 milestone Mar 11, 2026
@brharrington brharrington requested a review from jzhuge March 11, 2026 19:18
Contributor

@jzhuge jzhuge left a comment

IIUC, the new flow is

writeImpl → ClosedByInterruptException → re-thrown (no WARN in UdpWriter)
    → caught by SidecarWriter.write()'s catch (IOException e)
        → 1st time: logs WARN, suppressWarnings = true
        → 2nd+ time: suppressed ✓

This should fix the flooding.
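
For readers unfamiliar with the suppression step in that flow, here is a rough sketch of the pattern. It is not the real SidecarWriter: the class name `SuppressingWriter` is hypothetical, and warnings are collected in a list rather than sent to a logger so the behavior is easy to observe.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class showing warn-once suppression; not the real SidecarWriter. */
abstract class SuppressingWriter {
  // Stand-in for a logger: each entry represents one WARN that would be emitted.
  private final List<String> warnings = new ArrayList<>();
  private boolean suppressWarnings = false;

  /** Subclasses do the actual I/O and simply throw on failure, without logging. */
  abstract void writeImpl(String line) throws IOException;

  final void write(String line) {
    try {
      writeImpl(line);
    } catch (IOException e) {
      // Record the first failure only; later failures are dropped silently.
      if (!suppressWarnings) {
        warnings.add("write failed: " + line + " (" + e + ")");
        suppressWarnings = true;
      }
    }
  }

  List<String> warnings() {
    return warnings;
  }

  public static void main(String[] args) {
    SuppressingWriter writer = new SuppressingWriter() {
      @Override
      void writeImpl(String line) throws IOException {
        throw new IOException("simulated failure");
      }
    };
    for (int i = 0; i < 3; i++) {
      writer.write("line " + i);
    }
    System.out.println("warnings recorded: " + writer.warnings().size());
  }
}
```

The pattern only works if `writeImpl` does no logging of its own, which is why the WARN calls inside it had to go.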

@brharrington brharrington merged commit 8a421e1 into Netflix:main Mar 11, 2026
1 check passed
@brharrington brharrington deleted the excessive-logging branch March 11, 2026 21:02