Fix the wrong backoff computation when retrying #296
Merged
BewareMyPower merged 5 commits on Jul 7, 2023
Contributor
Author
### Motivation

All the retryable operations share the same `Backoff` object in `RetryableLookupService`, so after several reconnections the retry delay stays at the maximum value (30 seconds).

### Modifications

Refactor the design of the `RetryableLookupService`:

- Add a `RetryableOperation` class to represent a retryable operation. Each instance has its own `Backoff` object, and the operation can only be executed once.
- Add a `RetryableOperationCache` class that maps a specific name to its associated operation. This is an optimization: if an operation (e.g. finding the owner broker of topic A) has not completed when the same operation is requested again, the pending future is reused.
- In `RetryableLookupService`, just maintain one cache for each kind of operation.
- Add `RetryableOperationCacheTest` to verify the behaviors.
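To make the idea concrete, here is a minimal sketch of per-operation backoff plus a name-keyed cache of pending operations. It is not the code from this PR: `SimpleBackoff`, `PendingOperation`, `OperationCache`, and `Listener` are hypothetical names, the 100 ms initial delay is an assumption (only the 30-second cap comes from the description above), and listeners stand in for the shared future.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Millis = std::chrono::milliseconds;
using Listener = std::function<void(const std::string&)>;

// Hypothetical, simplified exponential backoff; not the client's real Backoff class.
class SimpleBackoff {
   public:
    SimpleBackoff(Millis initial, Millis max) : max_(max), next_(initial) {}

    // Returns the delay to wait now, then doubles the following delay up to max_.
    Millis next() {
        const Millis current = next_;
        next_ = std::min<Millis>(next_ * 2, max_);
        return current;
    }

   private:
    Millis max_;
    Millis next_;
};

// Hypothetical stand-in for one retryable operation: it owns its backoff state.
struct PendingOperation {
    // The 30 s cap comes from the PR text; the 100 ms initial delay is an assumption.
    SimpleBackoff backoff{Millis(100), Millis(30000)};
    std::vector<Listener> listeners;
};

// Hypothetical cache keyed by operation name. Not thread-safe; a real
// implementation would need a lock around the map.
class OperationCache {
   public:
    // Attaches `listener` to the in-flight operation for `name`, creating the
    // operation (with a fresh backoff) only when none is pending.
    std::shared_ptr<PendingOperation> run(const std::string& name, Listener listener) {
        auto it = pending_.find(name);
        if (it == pending_.end()) {
            it = pending_.emplace(name, std::make_shared<PendingOperation>()).first;
        }
        it->second->listeners.push_back(std::move(listener));
        return it->second;
    }

    // Removes the operation first, then notifies every listener that joined it.
    void complete(const std::string& name, const std::string& result) {
        auto it = pending_.find(name);
        if (it == pending_.end()) return;
        auto op = it->second;
        pending_.erase(it);
        for (const auto& listener : op->listeners) listener(result);
    }

    std::size_t size() const { return pending_.size(); }

   private:
    std::unordered_map<std::string, std::shared_ptr<PendingOperation>> pending_;
};
```

In this sketch, two concurrent requests for the same name share one `PendingOperation`, while a request for a different name, or a new request after completion, starts again from the initial delay instead of inheriting a delay that has already reached the cap.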
1146d02 to
78bb0be
Compare
RobertIndie
reviewed
Jul 5, 2023
Comment on lines +70 to +72

    size_t getNumberOfPendingTasks() const {
        return lookupCache_->size() + partitionLookupCache_->size() + namespaceLookupCache_->size() +
               getSchemaCache_->size();
Member
It's only used for testing. Do we need to expose it here?
Contributor
Author
I removed them. PTAL again.
shibd
approved these changes
Jul 6, 2023
Member
shibd
left a comment
Nice catch! Left some small comments.
Contributor
Author
It seems some tests failed after merging the main branch. Marking this as a draft for now.
Contributor
Author
@shibd @RobertIndie All tests pass now, PTAL again.
shibd
approved these changes
Jul 7, 2023
RobertIndie
approved these changes
Jul 7, 2023
BewareMyPower
added a commit
to BewareMyPower/pulsar-client-cpp
that referenced
this pull request
Sep 8, 2023
### Motivation

apache#296 introduced a regression for GCC <= 7.

> lib/RetryableOperation.h:109:66: error: 'pulsar::RetryableOperation<T>::runImpl(pulsar::TimeDuration)::<lambda(pulsar::Result, const T&)> [with T = pulsar::LookupService::LookupResult]::<lambda(const boost::system::error_code&)>' declared with greater visibility than the type of its field 'pulsar::RetryableOperation<T>::runImpl(pulsar::TimeDuration)::<lambda(pulsar::Result, const T&)> [with T = pulsar::LookupService::LookupResult]::<lambda(const boost::system::error_code&)>::<this capture>' [-Werror=attributes]

It seems to be a bug in GCC <= 7: the visibility of the lambda expression might not be affected by the `-fvisibility=hidden` option.

### Modifications

Add `__attribute__((visibility("hidden")))` to `RetryableOperation::runImpl` explicitly.
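As a rough illustration of where such an attribute can go, the sketch below marks a member function that returns nested lambdas capturing `this` as hidden. The `Retrier` class and the `SKETCH_HIDDEN` macro are hypothetical, this is not the actual `RetryableOperation` code, and the snippet does not try to reproduce the GCC <= 7 warning itself.

```cpp
#include <functional>

// Hypothetical helper macro: expands to the GCC/Clang visibility attribute,
// and to nothing on compilers that do not define __GNUC__.
#if defined(__GNUC__)
#define SKETCH_HIDDEN __attribute__((visibility("hidden")))
#else
#define SKETCH_HIDDEN
#endif

// Hypothetical class template standing in for a retryable operation.
template <typename T>
class Retrier {
   public:
    explicit Retrier(T value) : value_(value) {}

    // Public entry point that forwards to the hidden implementation.
    std::function<T()> makeTask() { return runImpl(); }

   private:
    // Marked hidden explicitly, so the visibility of the nested lambdas below
    // does not rely on how the compiler applies -fvisibility=hidden to them.
    SKETCH_HIDDEN std::function<T()> runImpl() {
        return [this] {
            // Inner lambda that also captures `this`, similar in shape to the
            // nested callbacks named in the error message.
            auto inner = [this] { return value_; };
            return inner();
        };
    }

    T value_;
};
```

The returned task captures `this`, so in this sketch the `Retrier` object has to outlive any task obtained from `makeTask()`.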
BewareMyPower
added a commit
that referenced
this pull request
Sep 9, 2023