Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 7 additions & 20 deletions test/library/encryption/helpers.go
Original file line number Diff line number Diff line change
Expand Up @@ -201,29 +201,16 @@ func GetLastKeyMeta(t testing.TB, kubeClient kubernetes.Interface, namespace, la

// in theory the max time we tolerate disruption on an SNO cluster is 60 seconds
// so we set the timeout to 5 min just in case
pollTimeout := time.Minute * 5

// set the number of step to high value
// we should stop on timeout otherwise the backoff returns after 5 steps
// and we never wait the timeout value
backOff := retry.DefaultBackoff
backOff.Steps = 9999

// in theory the max time we tolerate disruption on an SNO cluster is 60 seconds
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to preserve this comment ? (it explains why 5m)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment is still there

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, ok.

// so we set the timeout to 5 min just in case
err := onErrorWithTimeout(pollTimeout, backOff, func(err error) bool {
if !transientAPIError(err) {
t.Logf("error = %v is not retriable, failed to get the metadata from the last encryption key", err)
return false
}
return true // retry
}, func() (err error) {
selectedSecrets, err = secretsClient.List(context.TODO(), metav1.ListOptions{LabelSelector: labelSelector})
err := wait.PollUntilContextTimeout(t.Context(), 30*time.Second, time.Minute*5, true, func(ctx context.Context) (result bool, err error) {
selectedSecrets, err = secretsClient.List(ctx, metav1.ListOptions{LabelSelector: labelSelector})
if err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use transientAPIError - makes the test less brittle

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned earlier on slack and on the other thread with the ai rabbit: it doesn't cover all transient errors

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, but right now we have a mix. Some tests use transientAPIError and some don’t.

Are there any known transient errors we’re currently missing? Would it make sense to add them to transientAPIError?

In general, do we currently have issues with transient errors?

I think it would be better not to hide any errors. At the moment, we treat all errors as transient. Using transientAPIError instead would raise the bar for the entire platform, and I think we’re already doing a good job since the tests are generally stable :)

WDYT?

t.Logf("failed to list secrets, err = %v", err.Error())
t.Logf("GetLastKeyMeta: Failed to list secrets: %v", err)
return false, nil
}
return

return true, nil
})

if err != nil {
return EncryptionKeyMeta{}, err
}
Expand Down