lxcfs: : Don't register .poll at all unless --enable-psi-poll is passed#726
lxcfs: : Don't register .poll at all unless --enable-psi-poll is passed#726tomponline wants to merge 1 commit into
Conversation
Without .poll in lxcfs_ops, the kernel handles poll entirely in-kernel (returning DEFAULT_POLLMASK without ever contacting userspace), so there is no FUSE round-trip and no D state when LXCFS is suspended. However this only fixes the problem as long as --enable-psi-poll is not passed. Partial fix for canonical/lxd#17983 Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
|
Hi @mihalicyn what are your thoughts on this one? I realise this is only really a band-aid solution, so I wondered if you had a more full-proof solution? Thanks! |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
Hi Tom, thank you for the patch. WDYT about the following: |
At this point no. I need to dive into suspend code to figure out why it can't just cancel request. A long ago I've added this #594 to deal with cases like this one and avoid D-states when possible. |
Documentation for poll() callback says that: > The callee is responsible for destroying ph with > fuse_pollhandle_destroy() when no longer in use. In fuse_lib_poll() ((struct fuse_lowlevel_ops*)->poll) we need to be more careful: 1. If get_path_nullok() fails, we need to free fuse_pollhandle 2. If we passed execution down to fuse_fs_poll(), then it must release fuse_pollhandle resources when no (struct fuse_operations*)->poll provided by the filesystem driver. Found this by myself while reading the code as a part of [1] review. This is not critical at all, because once we return ENOSYS once, kernel never sends FUSE_POLL again. So memleak is unnoticable in practice. Alternatively, we can leak if get_path_nullok() fails all the time, but then we are in a much more serious troubles... Link: lxc/lxcfs#726 [1] Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
Thanks @mihalicyn ! This works well too, would you like me to update the PR or do you want to push your own commit to it? Thanks |
yes, please update the PR. You don't need to mention me in the commit anyhow, it's just my review suggestion ;-) |
|
I realised my local branch was |
Documentation for poll() callback says that: > The callee is responsible for destroying ph with > fuse_pollhandle_destroy() when no longer in use. In fuse_lib_poll() ((struct fuse_lowlevel_ops*)->poll) we need to be more careful: 1. If get_path_nullok() fails, we need to free fuse_pollhandle 2. If we passed execution down to fuse_fs_poll(), then it must release fuse_pollhandle resources when no (struct fuse_operations*)->poll provided by the filesystem driver. Found this by myself while reading the code as a part of [1] review. This is not critical at all, because once we return ENOSYS once, kernel never sends FUSE_POLL again. So memleak is unnoticable in practice. Alternatively, we can leak if get_path_nullok() fails all the time, but then we are in a much more serious troubles... Link: lxc/lxcfs#726 [1] Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
Without .poll in lxcfs_ops, the kernel handles poll entirely in-kernel (returning DEFAULT_POLLMASK without ever contacting userspace), so there is no FUSE round-trip and no D state when LXCFS is suspended.
However this only fixes the problem as long as
--enable-psi-pollis not passed.Partial fix for canonical/lxd#17983