-
Notifications
You must be signed in to change notification settings - Fork 983
Bug fix: corrected sleep time calculation in IdleConnectionEvictor #789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…se 1 minute sleep time by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been meaning to ask you about IdleConnectionEvictor. Is it still necessary and relevant now that we have the idle timeout setting in ConnectionConfig?
|
What was the actual bug here? I'm staring at the code and I can't spot it. If there's a bug in sleep time calculation here, then there's a very high chance I hit it in production not two weeks ago and failed to realize it (I thought I just misconfigured the client). |
| localSleepTime.sleep(); | ||
| connectionManager.closeExpired(); | ||
| if (maxIdleTime != null) { | ||
| connectionManager.closeIdle(maxIdleTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a suggestion here. An issue with this code is that it never evicts connections until they've already been sitting expired in the pool, which creates a window of opportunity for stale connection reuse.
Instead of passing in maxIdleTime, I would pass in maxIdleTime - localSleepTime. This will cause all connections to be closed that will exceed their maxIdleTime during the next "tick" (where localSleepTime is the duration of a tick). This preserves the invariant that the conn pool never leases a connection that has been idle longer than maxIdleTime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to rationalize evictIdleConnections and ConnectionConfig.idleTimeout in the client builders. The former could be used as the default value for the latter. This would provide a much stronger guarantee than my previous suggestion, which is still prone to race conditions since we don't have real-time guarantees about when IdleConnectionEvictor runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a suggestion here. An issue with this code is that it never evicts connections until they've already been sitting expired in the pool, which creates a window of opportunity for stale connection reuse.
Instead of passing in
maxIdleTime, I would pass inmaxIdleTime - localSleepTime. This will cause all connections to be closed that will exceed theirmaxIdleTimeduring the next "tick" (wherelocalSleepTimeis the duration of a tick). This preserves the invariant that the conn pool never leases a connection that has been idle longer thanmaxIdleTime.
@rschmitt What should happen if the user chooses sleep interval greater than maxIdleTime?
The purpose of this class is not to eliminate the possibility of stale connection reuse but to ensure connections do not stay stuck in the pool if there is no normal activity. That is it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to rationalize
evictIdleConnectionsandConnectionConfig.idleTimeoutin the client builders. The former could be used as the default value for the latter. This would provide a much stronger guarantee than my previous suggestion, which is still prone to race conditions since we don't have real-time guarantees about whenIdleConnectionEvictorruns.
@rschmitt Could we fix the immediate problem and work on evictIdleConnections and ConnectionConfig.idleTimeout harmonization in a new pull request?
@rschmitt The actual bug was 1 second sleep interval used by default. That made the evictor thread grab the pool lock way too often causing unnecessary lock contention with worker threads |
IdleConnectionEvictormaxIdleTimehas not been specified