Storage - Fix Flaky Stress Tests#48359
Conversation
- read functions had FAIL_FAST which would throw an error when the stream had reached then end and we wanted to read from the stream again. So we removed from both reads. - refactor code so that the exit criteria is a tthe beginning - refactor the emitContentInfo for dry
- changed emitValue to tryEmitValue - remove Sinks.EmitFailureHandler.FAIL_FAST so that multiple closes does not cause an error to be thrown
- opentelemetry-runtime-telemetry-java8 from 2.24.0-alpha -> 2.15.0-alpha - opentelemetry-logback-appender-1.0 from 2.24.0-alpha -> 2.15.0-alpha
There was a problem hiding this comment.
Pull request overview
This PR aims to reduce flakiness in Storage stress tests by making cleanup more resilient, making CRC telemetry streams tolerate re-subscription/double-close behaviors, and aligning dependencies with the chosen OpenTelemetry runtime metrics version.
Changes:
- Replace unconditional deletes with
deleteIfExists()across multiple stress scenarios to avoid cleanup failures when resources are already gone. - Add retry/timeout-based global cleanup logic in scenario base classes and add retry logic to async runs.
- Adjust CRC stream emission behavior to avoid failures on repeated terminal events; downgrade OpenTelemetry instrumentation dependencies to
2.15.0-alpha.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 20 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/TelemetryHelper.java | Adjust JVM runtime metrics registration and make timeout/cancellation detection null-safe. |
| sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java | Switch sink emission to tryEmitValue to tolerate double-close. |
| sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcInputStream.java | Refactor EOF emission, add resubscription state reset, and switch to tryEmitValue. |
| sdk/storage/azure-storage-stress/pom.xml | Downgrade OTel runtime telemetry + logback appender to 2.15.0-alpha. |
| sdk/storage/azure-storage-file-share-stress/src/main/java/com/azure/storage/file/share/stress/UploadFromFile.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java | Add retrying global cleanup + async retry behavior and new logging. |
| sdk/storage/azure-storage-file-share-stress/pom.xml | Downgrade OTel runtime telemetry + logback appender to 2.15.0-alpha. |
| sdk/storage/azure-storage-file-datalake-stress/src/main/java/com/azure/storage/file/datalake/stress/UploadFromFile.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-file-datalake-stress/src/main/java/com/azure/storage/file/datalake/stress/Upload.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-file-datalake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java | Add retrying global cleanup + async retry behavior and new logging. |
| sdk/storage/azure-storage-file-datalake-stress/pom.xml | Downgrade OTel runtime telemetry + logback appender to 2.15.0-alpha. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/UploadPages.java | Use deleteIfExists() and swallow delete errors during cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/Upload.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/StageBlock.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/PageBlobScenarioBase.java | Add retrying global cleanup + async retry behavior and new logging. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/PageBlobOutputStream.java | Use deleteIfExists() and swallow delete errors during cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/CommitBlockList.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlockBlobUpload.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlockBlobOutputStream.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java | Add retrying global cleanup + async retry behavior and structured logging. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/AppendBlock.java | Use deleteIfExists() and swallow delete errors during cleanup. |
| sdk/storage/azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/AppendBlobOutputStream.java | Use deleteIfExists() during per-test cleanup. |
| sdk/storage/azure-storage-blob-stress/pom.xml | Downgrade OTel runtime telemetry + logback appender to 2.15.0-alpha. |
| sdk/parents/azure-client-sdk-parent/pom.xml | Downgrade io.clientcore:linting-extensions used by checkstyle plugin from beta.2 to beta.1. |
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Outdated
Show resolved
Hide resolved
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java
Show resolved
Hide resolved
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
...re-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/PageBlobScenarioBase.java
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcInputStream.java
Outdated
Show resolved
Hide resolved
...re-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/PageBlobOutputStream.java
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java
Show resolved
Hide resolved
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Show resolved
Hide resolved
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Show resolved
Hide resolved
ibrandes
left a comment
There was a problem hiding this comment.
i resolved all comments that i don't think need to be addressed right now, but there are still a couple lingering ones (make sure you expand hidden conversations). not sure how important the ones about globalCleanupAsync are, but i think we should address the ones about doOnError and deleteAllFilesInFileSystem, thoughts?
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java
Show resolved
Hide resolved
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Outdated
Show resolved
Hide resolved
...re-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/PageBlobScenarioBase.java
Outdated
Show resolved
Hide resolved
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Show resolved
Hide resolved
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Outdated
Show resolved
Hide resolved
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Outdated
Show resolved
Hide resolved
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
...re-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/PageBlobScenarioBase.java
Outdated
Show resolved
Hide resolved
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Show resolved
Hide resolved
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Outdated
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java
Show resolved
Hide resolved
… retries when ultimately successful
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Show resolved
Hide resolved
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcInputStream.java
Show resolved
Hide resolved
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Outdated
Show resolved
Hide resolved
| <dependency> | ||
| <groupId>io.opentelemetry.instrumentation</groupId> | ||
| <artifactId>opentelemetry-runtime-telemetry-java8</artifactId> | ||
| <version>2.24.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> | ||
| <version>2.15.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>io.opentelemetry.instrumentation</groupId> | ||
| <artifactId>opentelemetry-logback-appender-1.0</artifactId> | ||
| <version>2.24.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-logback-appender-1.0;external_dependency} --> | ||
| <version>2.15.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-logback-appender-1.0;external_dependency} --> | ||
| </dependency> |
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
...e-file-share-stress/src/main/java/com/azure/storage/file/share/stress/ShareScenarioBase.java
Outdated
Show resolved
Hide resolved
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Show resolved
Hide resolved
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Outdated
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcOutputStream.java
Show resolved
Hide resolved
sdk/storage/azure-storage-stress/src/main/java/com/azure/storage/stress/CrcInputStream.java
Show resolved
Hide resolved
...talake-stress/src/main/java/com/azure/storage/file/datalake/stress/DataLakeScenarioBase.java
Show resolved
Hide resolved
| <groupId>io.opentelemetry.instrumentation</groupId> | ||
| <artifactId>opentelemetry-runtime-telemetry-java8</artifactId> | ||
| <version>2.24.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> | ||
| <version>2.15.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> |
.../azure-storage-blob-stress/src/main/java/com/azure/storage/blob/stress/BlobScenarioBase.java
Show resolved
Hide resolved
| <groupId>io.opentelemetry.instrumentation</groupId> | ||
| <artifactId>opentelemetry-runtime-telemetry-java8</artifactId> | ||
| <version>2.24.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> | ||
| <version>2.15.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> |
| <groupId>io.opentelemetry.instrumentation</groupId> | ||
| <artifactId>opentelemetry-runtime-telemetry-java8</artifactId> | ||
| <version>2.24.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> | ||
| <version>2.15.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> |
| <groupId>io.opentelemetry.instrumentation</groupId> | ||
| <artifactId>opentelemetry-runtime-telemetry-java8</artifactId> | ||
| <version>2.24.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> | ||
| <version>2.15.0-alpha</version> <!-- {x-version-update;io.opentelemetry.instrumentation:opentelemetry-runtime-telemetry-java8;external_dependency} --> |
… failures upon success
This is a fix to fix the issues we've been having with the stress tests.