Skip to content

Conversation

@ok2c
Copy link
Member

@ok2c ok2c commented Jan 17, 2026

  • Corrected sleep time calculation in IdleConnectionEvictor
  • Use 1 minute sleep time by default if maxIdleTime has not been specified

@ok2c ok2c requested a review from arturobernalg January 17, 2026 14:21
Copy link
Contributor

@rschmitt rschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been meaning to ask you about IdleConnectionEvictor. Is it still necessary and relevant now that we have the idle timeout setting in ConnectionConfig?

@rschmitt
Copy link
Contributor

What was the actual bug here? I'm staring at the code and I can't spot it. If there's a bug in sleep time calculation here, then there's a very high chance I hit it in production not two weeks ago and failed to realize it (I thought I just misconfigured the client).

localSleepTime.sleep();
connectionManager.closeExpired();
if (maxIdleTime != null) {
connectionManager.closeIdle(maxIdleTime);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a suggestion here. An issue with this code is that it never evicts connections until they've already been sitting expired in the pool, which creates a window of opportunity for stale connection reuse.

Instead of passing in maxIdleTime, I would pass in maxIdleTime - localSleepTime. This will cause all connections to be closed that will exceed their maxIdleTime during the next "tick" (where localSleepTime is the duration of a tick). This preserves the invariant that the conn pool never leases a connection that has been idle longer than maxIdleTime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to rationalize evictIdleConnections and ConnectionConfig.idleTimeout in the client builders. The former could be used as the default value for the latter. This would provide a much stronger guarantee than my previous suggestion, which is still prone to race conditions since we don't have real-time guarantees about when IdleConnectionEvictor runs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a suggestion here. An issue with this code is that it never evicts connections until they've already been sitting expired in the pool, which creates a window of opportunity for stale connection reuse.

Instead of passing in maxIdleTime, I would pass in maxIdleTime - localSleepTime. This will cause all connections to be closed that will exceed their maxIdleTime during the next "tick" (where localSleepTime is the duration of a tick). This preserves the invariant that the conn pool never leases a connection that has been idle longer than maxIdleTime.

@rschmitt What should happen if the user chooses sleep interval greater than maxIdleTime?

The purpose of this class is not to eliminate the possibility of stale connection reuse but to ensure connections do not stay stuck in the pool if there is no normal activity. That is it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to rationalize evictIdleConnections and ConnectionConfig.idleTimeout in the client builders. The former could be used as the default value for the latter. This would provide a much stronger guarantee than my previous suggestion, which is still prone to race conditions since we don't have real-time guarantees about when IdleConnectionEvictor runs.

@rschmitt Could we fix the immediate problem and work on evictIdleConnections and ConnectionConfig.idleTimeout harmonization in a new pull request?

@ok2c
Copy link
Member Author

ok2c commented Jan 18, 2026

What was the actual bug here? I'm staring at the code and I can't spot it. If there's a bug in sleep time calculation here, then there's a very high chance I hit it in production not two weeks ago and failed to realize it (I thought I just misconfigured the client).

@rschmitt The actual bug was 1 second sleep interval used by default. That made the evictor thread grab the pool lock way too often causing unnecessary lock contention with worker threads

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants