
[Fix] Fail RayService if StepSizePercent exceeds MaxSurgePercent #4663

Open
JiangJiaWei1103 wants to merge 1 commit into ray-project:master from JiangJiaWei1103:hard-validate-stepsize

@JiangJiaWei1103
Contributor

Why are these changes needed?

Previously, when StepSizePercent exceeded MaxSurgePercent, the migrated traffic volume was silently capped by targetCapacity, with no user-facing warning. This PR adds strict validation: the RayService fails immediately if StepSizePercent exceeds MaxSurgePercent.
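
For readers who want the check spelled out, here is a minimal sketch of the idea. It is not the code in this PR: `UpgradeOptions`, `validateStepSize`, and `cappedStep` are hypothetical names, and `cappedStep` is only a simplified model of the old capping behavior, assuming traffic to the new cluster can never exceed its target capacity.

```go
package main

import "fmt"

// UpgradeOptions is a hypothetical stand-in for the incremental-upgrade
// settings discussed in this PR; it is not the actual KubeRay type.
type UpgradeOptions struct {
	MaxSurgePercent int32 // extra capacity added per step, in percent
	StepSizePercent int32 // traffic shifted per step, in percent
}

// validateStepSize rejects a configuration whose traffic step exceeds the
// capacity surge, instead of letting the step be capped silently.
func validateStepSize(o UpgradeOptions) error {
	if o.StepSizePercent > o.MaxSurgePercent {
		return fmt.Errorf(
			"stepSizePercent (%d) must not exceed maxSurgePercent (%d)",
			o.StepSizePercent, o.MaxSurgePercent,
		)
	}
	return nil
}

// cappedStep is a simplified model (an assumption, not KubeRay code) of the
// previous behavior: the traffic percentage after one step is clamped to
// the new cluster's target capacity.
func cappedStep(currentTraffic, targetCapacity, stepSize int32) int32 {
	next := currentTraffic + stepSize
	if next > targetCapacity {
		next = targetCapacity
	}
	return next
}
```

In this model, a configuration with MaxSurgePercent 20 and StepSizePercent 50 moves only 20% of traffic in the first step (`cappedStep(0, 20, 50)` yields 20), which is the silent capping described above; `validateStepSize` returns an error for the same configuration so the mismatch is reported up front.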

Related issue number

Closes #4636

Test Results

The following shows local unit test results:

[Image: Screenshot 2026-03-31 at 10:49:13 AM]

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

Signed-off-by: JiangJiaWei1103 <waynechuang97@gmail.com>
@JiangJiaWei1103
Contributor Author

JiangJiaWei1103 commented Mar 31, 2026

cc @ryanaoleary @win5923 @machichima to take a look if you have time, thanks!

Collaborator

@machichima left a comment

LGTM! Could you please check the failed CI?

@JiangJiaWei1103
Contributor Author

JiangJiaWei1103 commented Mar 31, 2026

Hi @machichima,

Thanks for the review. I've observed this e2e failure over the past week (issue); it is unrelated to this PR.

I think @rueian is also trying to fix it. I'll take some time to figure out what's going on. Thanks!

Collaborator

@ryanaoleary left a comment

LGTM!

Development

Successfully merging this pull request may close these issues.

[Bug] [RayService] Silent capping of stepSize by maxSurge during traffic migration

3 participants