Skip to content

lxcfs: fix wrong cpu count when setting cfs in hierarchy#690

Merged
stgraber merged 3 commits into
lxc:mainfrom
okhowang:recursive
Apr 2, 2026
Merged

lxcfs: fix wrong cpu count when setting cfs in hierarchy#690
stgraber merged 3 commits into
lxc:mainfrom
okhowang:recursive

Conversation

@okhowang
Copy link
Copy Markdown
Contributor

Resolve #688

@stgraber
Copy link
Copy Markdown
Member

@mihalicyn can you take a look?

@stgraber stgraber requested a review from mihalicyn October 16, 2025 23:42
Copy link
Copy Markdown
Member

@mihalicyn mihalicyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, reply on my comment in #688 (comment)

Copy link
Copy Markdown
Member

@mihalicyn mihalicyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for preparing and submitting this.

I believe that this is on the right direction but we need to slightly rework the code.

Also, please look #688 (comment)

Kind regards,
Alex

Comment thread src/bindings.h Outdated
Comment thread src/proc_cpuview.c Outdated
@mihalicyn
Copy link
Copy Markdown
Member

Please, also look on #712

It looks like we have two independent contributors who have stepped on the same issue but from a different directions ;-)

@okhowang okhowang force-pushed the recursive branch 2 times, most recently from d075f17 to 5f87d5b Compare March 20, 2026 09:26
@okhowang okhowang changed the title lxcfs: add --recursive flag lxcfs: fix wrong cpu count when setting cfs in hierarchy Mar 20, 2026
@okhowang okhowang force-pushed the recursive branch 2 times, most recently from dd4903b to fdef56c Compare March 23, 2026 02:05
@okhowang
Copy link
Copy Markdown
Contributor Author

@mihalicyn I couldn't reproduce psi timeout local, it maybe some lag on github actions, can you rerun the CI?

Comment thread src/proc_cpuview.c
Comment thread src/proc_cpuview.c Outdated
return 0;

if (!read_cpu_cfs_param(cg, "period", &cfs_period))
if (!read_cpu_count_cfs(cg, &rv))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've started to use get_min_cpu_count_cfs in max_cpu_count, at the same time you still rely on read_cpu_count_cfs here, why?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's my fault, I modified here at first, and forgot to updatee here later.

@mihalicyn
Copy link
Copy Markdown
Member

I couldn't reproduce psi timeout local, it maybe some lag on github actions, can you rerun the CI?

yeah, I'll take care of this problem. It is not connected with this PR for sure.

Comment thread src/proc_cpuview.c
Comment thread src/proc_cpuview.c Outdated
Comment thread src/proc_cpuview.c Outdated
@mihalicyn
Copy link
Copy Markdown
Member

@okhowang the more I look and find issues here, the more I feel that we need to add a test for --enable-cfs. You can take some inspiration from https://github.com/lxc/lxcfs/blob/main/tests/test_proc.in. I believe we should spawn a separate LXCFS daemon with --enable-cfs parameter passed and run tests. You can do a draft implementation of tests to cover your use-case and validate that without this PR tests fails and with fix from this PR test passes. I'll then clean this up and figure out what is the best way to combine it with our CI pipelines.

@okhowang
Copy link
Copy Markdown
Contributor Author

sorry for my miss, there are lot of month since my first work on lxcfs.
and I forgot meson install before testing, so I tested with old liblxcfs.so.

I added a commit for test_cfs to test read cpu/memory in child cgroup. you can checkout the commit and test will failed.
the fixes worked after the testing commit.

@okhowang okhowang force-pushed the recursive branch 3 times, most recently from 6f3e129 to 0ee1e0b Compare March 24, 2026 13:00
@okhowang okhowang requested a review from mihalicyn March 24, 2026 13:06
@okhowang okhowang force-pushed the recursive branch 2 times, most recently from ad2f699 to 8d651df Compare March 24, 2026 13:17
@mihalicyn
Copy link
Copy Markdown
Member

Hi @okhowang,

please can you rebase your branch on top of recent main. I've added a bunch of new tests (#722), specifically https://github.com/lxc/lxcfs/blob/1881408808ac822867be56dc5b998645d415ac3a/tests/test_cpu_cfs_hierarchy.sh.in

Also, please, remove:

# TODO: remove after fixing https://github.com/lxc/lxcfs/issues/688
echo '300000 100000' > "/sys/fs/cgroup/${cg1}/${cg2}/${cg3}/cpu.max"

chunk from (

# TODO: remove after fixing https://github.com/lxc/lxcfs/issues/688
) and put this change into a separate commit.

This will allow us to validate if this PR fixes everything properly.

Kind regards,
Alex

Copy link
Copy Markdown
Member

@mihalicyn mihalicyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebase is needed

Signed-off-by: okhowang(王沛文) <okhowang@tencent.com>
@okhowang
Copy link
Copy Markdown
Contributor Author

okhowang commented Apr 2, 2026

rebase is needed需要变基

done.
BTY, is my test commit duplicated? or should I remove it from PR?

I noticed that cgroup v1 has been dropped.
should I remove branch for cgroup v1 in the PR?

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 47.50000% with 21 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/proc_cpuview.c 48.64% 11 Missing and 8 partials ⚠️
src/proc_fuse.c 33.33% 0 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

@mihalicyn
Copy link
Copy Markdown
Member

mihalicyn commented Apr 2, 2026

Hi @okhowang,

yes, please remove the test you've added, because the version we have in main is more advanced and covers this stuff better.

okhowang added 2 commits April 2, 2026 20:16
Signed-off-by: okhowang(王沛文) <okhowang@tencent.com>
Signed-off-by: okhowang(王沛文) <okhowang@tencent.com>
@okhowang
Copy link
Copy Markdown
Contributor Author

okhowang commented Apr 2, 2026

Hi @okhowang,

yes, please remove the test you've added, because the version we have in main is more advanced and covers this stuff better.

test commit removed

Copy link
Copy Markdown
Member

@mihalicyn mihalicyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks, @okhowang !

@stgraber stgraber merged commit 02391e1 into lxc:main Apr 2, 2026
26 of 29 checks passed
@okhowang okhowang deleted the recursive branch April 3, 2026 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

Wrong cpu number reported, when only CFS (cpu.max) cgroup2 limit is set

3 participants