walkup to root when getting cpu.max in cgroup v2 #712
Signed-off-by: weihong.xu <xuweihong.cn@gmail.com>
Force-pushed from de18d5f to f3f045b
  return false;

- if (!cgroup_ops->get(cgroup_ops, "cpu", cg, file, &str))
+ if (!cgroup_ops->get_walkup_to_root(cgroup_ops, "cpu", cg, file, &str))
Have you tested this? I have a gut feeling that this is not correct, because for example strcmp("max 100000", "max") is not zero.
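To make the strcmp concern concrete: a cgroup v2 cpu.max file holds two fields, a quota (a number of microseconds or the literal "max") and a period, so comparing the whole string against "max" never matches. The following is only a minimal sketch of field-wise parsing. The helper name parse_cpu_max and the -1 "no limit" convention are assumptions made for illustration and are not taken from LXCFS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not LXCFS code): parse the contents of a cgroup v2
 * cpu.max file, which has the form "<quota> <period>", where <quota> is
 * either a number of microseconds or the literal "max".
 * Returns true on success; *quota is set to -1 when no limit is set.
 */
static bool parse_cpu_max(const char *buf, int64_t *quota, int64_t *period)
{
	char first[32];
	long long q, p;

	assert(buf && quota && period);

	/* Split into the quota token and the numeric period. */
	if (sscanf(buf, "%31s %lld", first, &p) != 2)
		return false;
	*period = p;

	/* Compare only the first field, not the whole line. */
	if (strcmp(first, "max") == 0) {
		*quota = -1;
		return true;
	}

	if (sscanf(first, "%lld", &q) != 1)
		return false;
	*quota = q;
	return true;
}
```

With this sketch, "max 100000" parses as "no quota, period 100000", while strcmp("max 100000", "max") on the raw string is still non-zero.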
> Have you tested this?
Yes, it is already deployed in our production environment.
> because for example strcmp("max 100000", "max") is not zero.
Replacing cgroup_ops->get with cgroup_ops->get_walkup_to_root only ensures that when lxcfs cannot find cpu.max in /user.slice/user-1008.slice/session-219.scope, it looks for it in the parent cgroup instead, i.e. /user.slice/user-1008.slice/cpu.max. The parsing of cpu.max itself is unchanged.
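The lookup order described here can be sketched roughly as follows. This is an illustration of the idea only, not the get_walkup_to_root implementation in LXCFS: the function name find_walkup_to_root, the exists_fn callback, and demo_exists are all hypothetical, and the cgroup path is assumed to be absolute and normalized. A real version would check the filesystem under the cgroup mount point instead of using a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns true if `path` exists. */
typedef bool (*exists_fn)(const char *path);

/* Stand-in for a filesystem check: only the parent slice has cpu.max. */
static bool demo_exists(const char *path)
{
	return strcmp(path, "/user.slice/user-1008.slice/cpu.max") == 0;
}

/*
 * Illustrative sketch: starting at cgroup `cg`, look for `file`; if it is
 * missing, retry in each parent cgroup up to the root "/". On success the
 * full path of the file that was found is written to `out`.
 */
static bool find_walkup_to_root(const char *cg, const char *file,
				exists_fn exists, char *out, size_t outlen)
{
	char dir[4096];

	assert(cg && file && exists && out);
	if (strlen(cg) >= sizeof(dir))
		return false;
	strcpy(dir, cg);

	for (;;) {
		/* Avoid a double slash when we are at the root. */
		int n = snprintf(out, outlen, "%s/%s",
				 strcmp(dir, "/") == 0 ? "" : dir, file);
		if (n < 0 || (size_t)n >= outlen)
			return false;
		if (exists(out))
			return true;

		if (strcmp(dir, "/") == 0)
			return false; /* reached the root without a match */

		/* Drop the last path component to move to the parent. */
		char *slash = strrchr(dir, '/');
		if (!slash)
			return false;
		if (slash == dir)
			dir[1] = '\0'; /* parent is the root */
		else
			*slash = '\0';
	}
}
```

Called with "/user.slice/user-1008.slice/session-219.scope" and "cpu.max", the sketch first tries the session scope, then finds the file one level up in /user.slice/user-1008.slice.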
> Replacing cgroup_ops->get with cgroup_ops->get_walkup_to_root only ensure when lxcfs cannot find cpu.max in /user.slice/user-1008.slice/session-219.scope, it will search in the parent directory /user.slice/user-1008.slice/cpu.max.
Yes, this makes sense. Another question is why LXCFS even looks at the /user.slice/user-1008.slice/session-219.scope cgroup in your case. LXCFS always tries to determine the "init" process of the container; see, for example, proc_stat_read. It will go to /user.slice/user-1008.slice/session-219.scope only if the requesting process is running in the same PID namespace as LXCFS itself, i.e. there is no LXC/Docker container. Am I right in understanding that you don't use any kind of container in your setup?
> And the parsing of cpu.max remains the same without any change.
Thanks, this makes sense. It would have been awesome if you had described all of this in your commit message and issue ;-) Otherwise I have to guess what kind of setup you have and why this fix helps.
> Another question is that why LXCFS even looks on /user.slice/user-1008.slice/session-219.scope cgroup in your case. LXCFS always tries to determine a "init" process of the container.
I am using LXCFS without containers. As an HPC admin, I found that I can use LXCFS together with pam_namespace to override /proc/cpuinfo, so that programs written in Python and other languages become cgroup-aware.
Looks like a duplicate of #690
Yes. In my case I didn't anticipate that situation when I solved my own problem. But the patch looks a little overcomplicated to me.
No, there is no need to merge values, but you need to figure out that at the current level there is no information about the limit. It means that
No worries. This stuff is quite complicated and it is hard to figure out everything at once. Thanks for your work on this. Let me know if you are interested in continuing.
I'm closing this PR for now because we have another one with more activity, #690, which will hopefully be ready to merge soon.
close #711