[SPARK-56466][CONNECT] Make SparkSession.close() idempotent #55333

Closed
zhengruifeng wants to merge 1 commit into apache:master from zhengruifeng:fix-double-stop

@zhengruifeng zhengruifeng commented Apr 14, 2026

What changes were proposed in this pull request?

Make spark.stop() idempotent on the client side so that calling it twice (e.g., user code calls it, then the platform calls it during cleanup) does not hang or fail.

Client side (SparkSession.close()):

  • Add a releaseLock and a released flag. The flag is set once releaseSession() completes, whether it succeeds or fails with a caught exception. A second call checks the flag and skips the RPC, because receiving a ReleaseSessionResponse already confirms server-side cleanup.
  • Wrap client.shutdown() and allocator.close() in try-catch inside the synchronized block.
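
The guard described in the two bullets above can be sketched as follows. This is a minimal Python illustration, not the code in this PR: `FakeClient`, `Session`, and every method name are hypothetical stand-ins, and returning early on the second call (rather than re-running the guarded cleanup) is an assumption about what "gates the method" means.

```python
import threading


class FakeClient:
    """Hypothetical stand-in for the Connect client; not a Spark API."""

    def __init__(self):
        self.release_calls = 0
        self.shutdown_calls = 0

    def release_session(self):
        self.release_calls += 1

    def shutdown(self):
        self.shutdown_calls += 1


class Session:
    """Sketch of a session whose close() is safe to call more than once."""

    def __init__(self, client):
        self._client = client
        self._release_lock = threading.Lock()  # plays the role of releaseLock
        self._released = False                 # plays the role of the released flag

    def close(self):
        with self._release_lock:
            if self._released:
                return  # second call: skip the RPC (assumed to skip cleanup too)
            try:
                self._client.release_session()
            except Exception:
                pass  # a caught failure still marks the session as released
            finally:
                self._released = True
            try:
                self._client.shutdown()
            except Exception:
                pass  # cleanup errors must not escape close()


client = FakeClient()
session = Session(client)
session.close()
session.close()  # returns immediately; no second release_session() call
```

Holding the lock for the whole method means two threads that call `close()` at the same time cannot both pass the flag check.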

Server side: already handles duplicate release requests gracefully — closeSession() is a no-op when the session does not exist.

Root cause of the hang: Without the released guard, the second close() calls releaseSession() on the already-shutdown gRPC channel. The channel returns UNAVAILABLE, which the default retry policy retries up to 15 times with exponential backoff (~10 minutes total).
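
The "~10 minutes" figure can be checked with a short calculation. Only the 15 retries come from the description above; the 50 ms initial backoff, the multiplier of 4, and the 60 s cap are assumptions about the default policy, and jitter is ignored.

```python
def backoff_schedule_ms(max_retries=15, initial_ms=50, multiplier=4, cap_ms=60_000):
    """Wait before each retry, in milliseconds (assumed parameters, no jitter)."""
    return [min(initial_ms * multiplier ** i, cap_ms) for i in range(max_retries)]


waits = backoff_schedule_ms()
total_ms = sum(waits)

# The first six waits grow geometrically (50 ms up to 51.2 s);
# the remaining nine are all capped at 60 s.
print(waits[:7])
print(round(total_ms / 60_000, 1), "minutes")  # 10.1 minutes under these assumptions
```

Under these assumptions the capped waits dominate: nine retries at 60 s each account for 9 of the roughly 10 minutes.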

Why are the changes needed?

When spark.stop() is called twice (which happens in real-world cleanup scenarios), the second call hangs for ~10 minutes due to gRPC retries on the already-shutdown channel. This makes close() idempotent and safe to call multiple times.

Does this PR introduce any user-facing change?

No. This is a bug fix — spark.stop() now completes immediately on a second call instead of hanging.

How was this patch tested?

Added SparkSessionCloseE2ESuite with test "double stop should not retry releaseSession" — creates a session with a gRPC interceptor that counts ReleaseSession calls, stops it twice, and verifies the second stop does not issue another releaseSession RPC.
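
The shape of that test can be sketched in a few lines. This is a Python illustration of the counting idea only, not the actual suite: `CountingInterceptor`, `GuardedSession`, and `fake_server` are hypothetical names, and the real test uses a gRPC interceptor against a live session.

```python
from collections import Counter


class CountingInterceptor:
    """Counts calls per RPC name before forwarding them (hypothetical helper)."""

    def __init__(self, forward):
        self._forward = forward
        self.counts = Counter()

    def __call__(self, method, *args):
        self.counts[method] += 1
        return self._forward(method, *args)


class GuardedSession:
    """Minimal stand-in for a session whose stop() is guarded by a flag."""

    def __init__(self, call):
        self._call = call
        self._released = False

    def stop(self):
        if self._released:
            return  # already released: do not issue the RPC again
        self._call("ReleaseSession")
        self._released = True


def fake_server(method, *args):
    """Pretend server that accepts every call."""
    return "ok"


interceptor = CountingInterceptor(fake_server)
session = GuardedSession(interceptor)

session.stop()
first = interceptor.counts["ReleaseSession"]
session.stop()
second = interceptor.counts["ReleaseSession"]
```

The check is that `first` and `second` are both 1: the count does not change after the second stop.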

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-6)

@HyukjinKwon

Merged to master.

@zhengruifeng zhengruifeng deleted the fix-double-stop branch April 15, 2026 00:03
