docs: restructure topology-aware documentation. #628
klihub wants to merge 2 commits into containers:main
0d1a829 to d108a06
d108a06 to 4ef44cf
    ## Background
    ## 1. Overview

    ### What Problems Does the Topology-Aware Policy Solve?
I really like this introduction/background on why resource policies matter in the first place. Let's take it in.
In a later phase, if we add a separate "Why to use a resource policy and how to choose the right one" (sub)document, perhaps we can move this background information there and link to it from the topology-aware and balloons policy documents.
e144b3c to 720e3f5
ada0163 to 93d7449
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
93d7449 to 2424fdb
    ## Controlling Topology Hints Via Annotations

    ### Topology Hints
The next paragraph used to be a general introduction to topology hints, covering the whole subsection. Now it has become a sub-subsection of its own. I still assume we don't want to go to 4th-level sections, so perhaps these two subsections ("topology hints" and "enabling or disabling...") could be merged.
@askervin Shouldn't we also move the earlier 'Implicit Hardware Topology Hints' section here and combine it with the rest, so that everything related to hints is described together?
Sounds good. It's easier to find the information under only one topic/section.
I ended up moving the first section about (Implicit Hardware) Topology Hints together with the rest later, turning them into a single chapter and using bold paragraph headers as topic separators.
    # Prefer isolated CPUs when available
    prefer-isolated-cpus.resource-policy.nri.io/pod: "true"
Perhaps we should not recommend isolcpus for generic HPC?
I removed the HPC workload example altogether. It did not make sense with the isolated allocation attempt of multiple CPUs.
      name: default
    spec:
      preferSharedCPUs: false
      preferIsolatedCPUs: true
(same as above, isolcpus might be difficult...)
      name: default
    spec:
      preferSharedCPUs: false
      preferIsolatedCPUs: true
Suspecting that isolcpus for any Guaranteed container as a default is not a safe choice.
Yeah, this is a bad/hallucinated example. Usually the shared preference should be set to true, or left at its default of true, so that isolated CPUs are preferred only for Guaranteed containers with a CPU request of < 2. Plus, globally preferring isolated CPUs for all exclusive allocations requires an explicitly annotated opt-out from shared allocation. We need to check and rethink the examples more carefully.
I updated this Guaranteed realtime app to ask for a single CPU, to let the isolated CPU preference make more sense.
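For reference, a minimal sketch of what such a single-CPU Guaranteed pod could look like. This is not the example from the PR. The pod name, container name, image, and memory size are placeholders I made up for illustration; only the annotation key is taken from the diff context quoted earlier in this thread.

```yaml
# Hypothetical pod: names, image, and memory size are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: realtime-app
  annotations:
    # Annotation key as quoted in the diff above; it asks the policy
    # to prefer isolated CPUs for this pod's exclusive allocation.
    prefer-isolated-cpus.resource-policy.nri.io/pod: "true"
spec:
  containers:
    - name: realtime
      image: example.org/realtime-app:latest
      resources:
        # Requests equal to limits for both CPU and memory put the
        # pod in the Guaranteed QoS class. A single full CPU keeps the
        # request below the 2-CPU threshold discussed above.
        requests:
          cpu: "1"
          memory: "1Gi"
        limits:
          cpu: "1"
          memory: "1Gi"
```

With a single-CPU request the isolated preference applies to exactly one exclusive CPU, which is the case the review comments above consider sensible.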
    cpu: "4"
    memory: "8Gi"

    # Best-effort background task
Could we have one more workload in this cookbook? I'm thinking of a highly responsive, relatively low-latency Burstable container. We could give it a slightly elevated scheduling priority (yet not realtime) and set its memory locality to NUMA level. Again, that would clearly be something that is not achievable with the stock managers.
Okay, I added a prioritized app like that, and changed the realtime app to take a single exclusive CPU and explicitly prefer isolated CPUs for it.
6579de9 to 0da59cc
Reorganize topology-aware.md taking inspiration from a similar update to balloons and using this table of contents:
- Overview with problem statement and key features
- Installation and Configuration
- Configuration Options (8 subsections)
- Cookbook with practical examples
- Troubleshooting

Co-authored-by: GitHub Copilot <noreply@github.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
0da59cc to 8c29598
This PR is an attempt to restructure the topology-aware documentation. I threw Copilot at it, asking it to reorganize the content using the ToC from the corresponding balloons PR #627.