Skip to content

fix: use milli-units for scalar in-queue resources#5391

Open
WHOIM1205 wants to merge 4 commits into
volcano-sh:masterfrom
WHOIM1205:fix-scalar-inqueue-resource
Open

fix: use milli-units for scalar in-queue resources#5391
WHOIM1205 wants to merge 4 commits into
volcano-sh:masterfrom
WHOIM1205:fix-scalar-inqueue-resource

Conversation

@WHOIM1205

Copy link
Copy Markdown

Summary

Fix a scalar resource accounting inconsistency in GetInqueueResource by replacing Value() with MilliValue() for scalar resource calculations.

Previously, scalar resources such as GPUs were recorded in whole units while the rest of the scheduler stored and compared them in milli-units. This caused in-queue GPU reservations to be undercounted and could allow jobs to be admitted beyond a queue's configured GPU quota.

This change aligns in-queue scalar resource accounting with the scheduler's internal resource representation, ensuring accurate quota enforcement across proportion, capacity, and overcommit plugins.

Fixes:

  • Consistent milli-unit accounting for scalar resources.
  • Correct reservation calculation when allocated resources are present.
  • Prevents GPU/scalar quota over-admission caused by unit mismatches.

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
Copilot AI review requested due to automatic review settings June 5, 2026 20:03
@volcano-sh-bot

Copy link
Copy Markdown
Contributor

Welcome @WHOIM1205! It looks like this is your first PR to volcano-sh/volcano 🎉

@volcano-sh-bot volcano-sh-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jun 5, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adjusts how scalar resource quantities are converted when computing a job’s in-queue (reserved) resources, switching from whole-unit values to milli-unit values.

Changes:

  • Convert rQuantity from Value() to MilliValue() when populating inqueue.ScalarResources
  • Apply the same milli-unit conversion when computing reservedScalarRes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/scheduler/plugins/util/util.go Outdated
Comment on lines +130 to +132
inqueue.ScalarResources[rName] = float64(rQuantity.MilliValue())
} else {
reservedScalarRes := float64(rQuantity.Value()) - allocatedMount
reservedScalarRes := float64(rQuantity.MilliValue()) - allocatedMount

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can ignore this I think!🙂

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the GetInqueueResource function in pkg/scheduler/plugins/util/util.go to use MilliValue() instead of Value() when calculating scalar resource quantities, ensuring higher precision. There are no review comments, and I have no feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@Aman-Cool

Copy link
Copy Markdown
Contributor

@WHOIM1205, Good catch on the bug. The fix is correct.., NewResource() stores scalar resources using .MilliValue(), so allocatedMount is already in milli-units. Both operands are now consistent.

One thing needed before this can merge: a unit test for GetInqueueResource. There's no util_test.go in this package. It should cover at least:

  • A job requesting 2 GPUs with nothing allocated -> inqueue.ScalarResources[gpu] should be 2000
  • A job requesting 2 GPUs with 1 already allocated -> inqueue.ScalarResources[gpu] should be 1000

Without this, the mismatch is easy to reintroduce silently since it doesn't panic ; it just computes the wrong number.

@Aman-Cool

Copy link
Copy Markdown
Contributor

@hajnalmt, WDYT?

@hajnalmt hajnalmt left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice catch thanks for the change! I feel though that this is still not right.

The fix is correct.., NewResource() stores scalar resources using .MilliValue()

This doesn't stand for pods unfortunately: https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/api/resource_info.go#L97

So this will screw up the inqueue metric for pods. It will be some bogus number in the end.

To fix this correctly we need a proper converter, which we barely have as I am looking in the ResQuantity2Float64 function:

func ResQuantity2Float64(resName v1.ResourceName, quantity resource.Quantity) float64 {

This function definition seems to be lacking every other resource than CPU currently, so a proper fix would envolve fixing the two ResFloat642Quantity and ResQuantity2Float64 functions too to have proper switch bodys (which are capable of converting properly) and switching to this function.

Do you have time to do this @WHOIM1205 ?

Comment thread pkg/scheduler/plugins/util/util.go Outdated
}
if allocatedMount, ok := allocated.ScalarResources[rName]; !ok {
inqueue.ScalarResources[rName] = float64(rQuantity.Value())
inqueue.ScalarResources[rName] = float64(rQuantity.MilliValue())

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Aman-Cool

Copy link
Copy Markdown
Contributor

Good point, I missed the pods case. NewResource() stores pods using .Value() (whole units) not .MilliValue(), so the blanket MilliValue() change would inflate the pods reservation by 1000x.

Agreed with @hajnalmt, the right fix is to extend ResQuantity2Float64 and ResFloat642Quantity with proper per-resource handling and use those converters here instead.

…lar units

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 8, 2026
@WHOIM1205

Copy link
Copy Markdown
Author

Thanks for the feedback @Aman-Cool @hajnalmt

I reworked the fix to avoid changing the shared conversion helpers and NUMA-related paths.

Instead of using Value()/MilliValue() directly, GetInqueueResource now converts MinResources through api.NewResource(), which is the same constructor used to build job.Allocated. This ensures both operands use the same internal representation without introducing new conversion rules.

I also added regression tests covering:

  • no GPU allocation
  • partial GPU allocation
  • fully allocated GPU requests

This keeps the change scoped to GetInqueueResource while preventing the scalar accounting mismatch from reappearing.

@hajnalmt hajnalmt left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change! This is code-wise okay. Please find my remarks below.

Comment thread pkg/scheduler/plugins/util/util.go Outdated

// GetInqueueResource returns reserved resource for running job whose part of pods have not been allocated resource.
func GetInqueueResource(job *api.JobInfo, allocated *api.Resource) *api.Resource {
// Convert MinResources through api.NewResource so it shares the exact unit

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hacky, but I get it that this was easier. 😄 I would have been happier if you modify the resource functions. You don't need the comments here if you choose this route, the code speaks for itself.

// must be converted to milli-units to be consistent with job.Allocated. The
// previous implementation used Quantity.Value() for scalars, which under-counted
// reserved GPUs by a factor of 1000 once any pod had been allocated.
func TestGetInqueueResource(t *testing.T) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test here for the "pods" too please?

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
@WHOIM1205

Copy link
Copy Markdown
Author

hey @hajnalmt
Thanks for the review.

I've removed the explanatory comments and added coverage for the pods path as requested.

The new test verifies the current behavior of GetInqueueResource when MinResources contains v1.ResourcePods and ensures pods are not accidentally included in the reserved resources while other resources are still calculated correctly.

@hajnalmt hajnalmt left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes and the UT, but you didn't get my review point properly. "pods" is our only scalar resource that's not converted to milli quantities, that's what needs to be verified.

Comment thread pkg/scheduler/plugins/util/util_test.go Outdated
expected: &api.Resource{
MilliCPU: 3000,
},
absentScalars: []v1.ResourceName{v1.ResourcePods},

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 "pods" is a scalar resource but its not converted to millis, thats the point of the test. Why is it absent here? Is this really passing? I hope not, then we are in trouble 😄

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this passed the CI, something is fishy here. It shouldn't.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it i think i misunderstood what the test should be verifying

after tracing it again i found that pods is handled differently from the other scalar resources and my test was only exercising the skip path

just to make sure im targeting the right behavior are you looking for a test that verifies pods stays in whole units rather than milli units or should getinqueueresource actually account for pods as well

@hajnalmt hajnalmt Jun 10, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this. I think it should be both:

  1. pods should stay in whole units, not milli-units.
  2. GetInqueueResource() should still account for the remaining pods
    reservation when MinResources contains pods.

api.NewResource() already records pods in ScalarResources, using whole-unit Value():
https://github.com/volcano-sh/volcano/blob/v1.15.0-alpha.0/pkg/scheduler/api/resource_info.go#L95-L97

pods must not go through the generic milli-unit scalar comparison. But if this unit test passes as written, then pods currently disappears from inqueue accounting, which looks like a separate bug.

I think the current skip happens because GetInqueueResource() only allows generic scalar resources after this guard:

default:
if api.IsCountQuota(rName) || !v1helper.IsScalarResourceName(rName) {
continue
}

and Kubernetes' scheduler helper intentionally treats IsScalarResourceName as extended resources, hugepages, prefixed native resources, and attachable volumes:
https://github.com/kubernetes/kubernetes/blob/v1.35.3/pkg/scheduler/util/utils.go#L141-L145

Unprefixed pods does not match that helper; it is only native through IsNativeResource, not IsPrefixedNativeResource:
https://github.com/kubernetes/kubernetes/blob/v1.35.3/pkg/apis/core/v1/helper/helpers.go#L51-L63

But that Kubernetes helper classification is not the same thing as Volcano's internal accounting semantics. Volcano's api.NewResource() explicitly keeps pods in ScalarResources, so GetInqueueResource() should mirror that special case.

Concretely, for this test case:

MinResources["pods"] = 10
allocated.ScalarResources["pods"] = 4

I would expect the returned inqueue resource to contain:

ScalarResources: map[v1.ResourceName]float64{
    v1.ResourcePods: 6,
}

The fix I would suggest is to keep the api.NewResource() conversion for unit consistency, but add an explicit case v1.ResourcePods before the generic IsScalarResourceName branch:

case v1.ResourcePods:
    reservedPods := minResources.ScalarResources[rName]
    if allocatedPods, ok := allocated.ScalarResources[rName]; ok {
        reservedPods -= allocatedPods
    }
    if reservedPods > 0 {
        if inqueue.ScalarResources == nil {
            inqueue.ScalarResources = make(map[v1.ResourceName]float64)
        }
        inqueue.ScalarResources[rName] = reservedPods
    }

That keeps pods in whole units, while still preserving the remaining pod reservation in inqueue accounting.

Comment thread pkg/scheduler/plugins/util/util_test.go Outdated
},
},
{
// "pods" is neither a count quota nor a scalar resource, so it is

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"pods" is a scalar resource.

@WHOIM1205

Copy link
Copy Markdown
Author

Thanks, I misunderstood the intent of the test.

I was focusing on whether pods was included in the inqueue calculation, while the real thing to verify is that pods keeps its special non-milli representation. I'll update the test to cover that specifically.

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
@WHOIM1205

Copy link
Copy Markdown
Author

thanks for the explanation

you were right i was looking at it through the kubernetes scalar classification instead of volcanos internal accounting

i added an explicit pods path in getinqueueresource and updated the test to verify pods stays in whole units and that the remaining reservation is accounted for

@hajnalmt

hajnalmt commented Jun 10, 2026

Copy link
Copy Markdown
Member

/lgtm

Thanks for the update, this is the shape I was looking for.

@hajnalmt hajnalmt left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from my side. Thanks!

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 10, 2026
@volcano-sh-bot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hajnalmt
Once this PR has been reviewed and has the lgtm label, please assign thor-wl for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants