CMP-4007: Fix aide-worker memory growth caused by cgroup page cache accumulation #877
Vincent056 wants to merge 1 commit into openshift:master
Force-pushed from 717303f to fe75940.
@Vincent056: This pull request references CMP-4007, which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Vincent056
On a cluster with 3 masters and 3 workers, I can see a huge difference without and with the PR. With the PR #877 fix applied, I can see: The problem is that only the first scan result is logged in the aide pod logs. I'm not sure whether this is an environment issue. I will double-check tomorrow:
AIDE scans the entire host filesystem, and the resulting kernel page cache is charged to the container's cgroup. Without reclamation, reported memory grows toward the resource limit after each scan cycle.

Use cgroup v2 memory.reclaim to evict file-backed page cache after each AIDE scan and database initialization. This reduced aide-worker memory from ~570 MiB to ~11 MiB in testing on OCP 4.18.22.

Use raw syscalls (syscall.Open/Write/Close) for memory.reclaim instead of os.OpenFile, because Go's runtime registers fds with its epoll poller and the cgroup v2 file's poll support causes the goroutine to hang waiting for write-readiness that never arrives.

Additional fixes:
- Close leaked file descriptor in getNonEmptyFile when file is empty
- Pre-compile regex patterns used in log parsing
- Handle AlreadyExists on ConfigMap creation to avoid unnecessary retries
- Call runtime.GC and debug.FreeOSMemory after scan to return heap to OS
- Update outdated GODEBUG comment (madvdontneed=1 is default since Go 1.16)
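The raw-syscall write to memory.reclaim described in the commit message could look roughly like the sketch below. It is not the PR's code: the helper names cgroupV2Path and writeReclaimRequest, the "1G" request size, and the /sys/fs/cgroup mount point are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
)

// cgroupV2Path (hypothetical helper) extracts the cgroup path from the
// content of /proc/self/cgroup. On cgroup v2 the entry has the form "0::<path>".
func cgroupV2Path(procCgroup string) (string, bool) {
	for _, line := range strings.Split(procCgroup, "\n") {
		if strings.HasPrefix(line, "0::") {
			return strings.TrimSpace(strings.TrimPrefix(line, "0::")), true
		}
	}
	return "", false
}

// writeReclaimRequest (hypothetical helper) writes amount, for example "1G",
// to path using raw syscalls only. Unlike os.OpenFile, syscall.Open does not
// register the descriptor with the Go runtime's epoll poller.
func writeReclaimRequest(path, amount string) error {
	fd, err := syscall.Open(path, syscall.O_WRONLY|syscall.O_CLOEXEC, 0)
	if err != nil {
		return fmt.Errorf("open %s: %w", path, err)
	}
	defer syscall.Close(fd)

	if _, err := syscall.Write(fd, []byte(amount)); err != nil {
		return fmt.Errorf("write %s: %w", path, err)
	}
	return nil
}

func main() {
	raw, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("cannot read /proc/self/cgroup:", err)
		return
	}
	cg, ok := cgroupV2Path(string(raw))
	if !ok {
		fmt.Println("no cgroup v2 entry found")
		return
	}
	// Assumption: the cgroup v2 hierarchy is mounted at /sys/fs/cgroup.
	target := filepath.Join("/sys/fs/cgroup", cg, "memory.reclaim")
	// Assumption: "1G" is an arbitrary upper bound chosen for this sketch.
	// An error here is reported and otherwise ignored, since reclaim is
	// best-effort.
	if err := writeReclaimRequest(target, "1G"); err != nil {
		fmt.Println("reclaim not performed:", err)
	}
}
```

Treating a failed write as non-fatal keeps the scan loop running on kernels or cgroup setups where memory.reclaim is unavailable.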
Force-pushed from fe75940 to 81bd8e5.
With this update, the scan no longer gets stuck. The scan can be triggered successfully and runs as expected. With the PR #877 fix applied, I can see: More details:
@Vincent056: all tests passed!
Summary
- Use cgroup v2 memory.reclaim to evict file-backed page cache after each AIDE scan and DB init, reducing reported memory from ~570 MiB to ~11 MiB.
- Fix file descriptor leak in getNonEmptyFile, pre-compile regex patterns, handle AlreadyExists on ConfigMap creation, and call runtime.GC/debug.FreeOSMemory after scans.
Root Cause Analysis
The aide-worker container runs AIDE as a privileged process scanning the host root filesystem mounted at /hostroot. Every file AIDE reads generates kernel page cache entries that are charged to the container's cgroup memory. oc adm top pods reports container_memory_working_set_bytes, which includes this page cache, causing reported memory to grow toward the limit after each scan cycle. Increasing resource limits only causes memory to grow to the new limit.
Cgroup memory breakdown before fix:
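A breakdown of this kind can be read from the cgroup's memory.stat and memory.current files. The sketch below assumes cgroup v2 mounted at /sys/fs/cgroup and uses hypothetical helper names. It follows the way cAdvisor derives the working set (usage minus inactive_file), which is why active file-backed page cache still counts toward container_memory_working_set_bytes.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMemoryStat (hypothetical helper) parses the "key value" lines of a
// cgroup v2 memory.stat file into a map. Malformed lines are skipped.
func parseMemoryStat(content string) map[string]uint64 {
	stats := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
			stats[fields[0]] = v
		}
	}
	return stats
}

// workingSet (hypothetical helper) subtracts inactive_file from usage and
// floors the result at zero.
func workingSet(usage uint64, stats map[string]uint64) uint64 {
	inactive := stats["inactive_file"]
	if inactive > usage {
		return 0
	}
	return usage - inactive
}

func main() {
	// Assumption: cgroup v2 is mounted at /sys/fs/cgroup and these files
	// describe the current container.
	statRaw, err := os.ReadFile("/sys/fs/cgroup/memory.stat")
	if err != nil {
		fmt.Println("cannot read memory.stat:", err)
		return
	}
	curRaw, err := os.ReadFile("/sys/fs/cgroup/memory.current")
	if err != nil {
		fmt.Println("cannot read memory.current:", err)
		return
	}
	usage, err := strconv.ParseUint(strings.TrimSpace(string(curRaw)), 10, 64)
	if err != nil {
		fmt.Println("cannot parse memory.current:", err)
		return
	}
	stats := parseMemoryStat(string(statRaw))
	fmt.Printf("usage=%d anon=%d active_file=%d inactive_file=%d working_set=%d\n",
		usage, stats["anon"], stats["active_file"], stats["inactive_file"],
		workingSet(usage, stats))
}
```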
Test Results
Tested on OCP 4.18.22 with FIO 1.3.8 (6 nodes, 3 masters + 3 workers):
Changes
- cmd/manager/daemon_util.go: Add reclaimCgroupPageCache() using cgroup v2 memory.reclaim, getOwnCgroupPath() to discover the container cgroup, and releaseMemoryAfterScan() for explicit GC. Fix file descriptor leak in getNonEmptyFile().
- cmd/manager/loops.go: Call reclaim and GC after each AIDE scan in aideLoop and after DB initialization in handleAIDEInit.
- cmd/manager/logcollector_util.go: Pre-compile regex patterns at package level. Handle AlreadyExists error on ConfigMap creation with delete-and-recreate.
- pkg/controller/fileintegrity/fileintegrity_controller.go: Update outdated GODEBUG comment.