
Westphalrafael/investigate 3.0 flakes #2367

Draft
rafaelwestphal wants to merge 1 commit into ops_agent_3.0 from westphalrafael/investigate-3.0-flakes



@rafaelwestphal

Contributor

Description

Related issue

How has this been tested?

Checklist:

  • Unit tests
    • Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

During nightly E2E integration tests, TestRestartVM frequently flakes when restarting the VM if the zone is out of capacity (stockout), because the underlying gce-testing-internal library treats stockouts as permanent errors and aborts retries.

This CL implements a local helper restartInstanceWithRetries in the test suite that performs StopInstance followed by StartInstance inside a custom retry loop, specifically retrying on ZONE_RESOURCE_POOL_EXHAUSTED and other transient capacity errors.
@rafaelwestphal rafaelwestphal changed the base branch from master to ops_agent_3.0 June 26, 2026 15:48
