Create blog post on AKS NAP disruption management #5685
base: master
Changes from all commits
0b0b088
b96d793
da6a66d
a8c6416
6b77abb
a912f3a
d4e6d8b
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,304 @@ | ||||||
| --- | ||||||
| title: "Managing Disruption with AKS Node Auto-Provisioning" | ||||||
| description: "Learn AKS best practices to control NAP disruption with Pod Disruption Budgets (PDBs), node pool disruption budgets, consolidation, and maintenance windows." | ||||||
| date: 2026-04-12 | ||||||
| authors: ["wilson-darko"] | ||||||
| tags: | ||||||
| - node-auto-provisioning | ||||||
| --- | ||||||
|
|
||||||
| Azure Kubernetes Service (AKS) Node Auto-Provisioning (NAP) keeps your clusters efficient: it provisions nodes for pending pods, and it continuously *removes* nodes when it's safe to do so (for example, when nodes are empty or underutilized). That node-removal **disruption** is where many production surprises happen. | ||||||
|
|
||||||
| When operating Kubernetes clusters, users commonly ask: | ||||||
|
|
||||||
| - How do I control when scale-downs happen, and where they shouldn't happen? | ||||||
| - How do I control workload disruption so it happens predictably (and not in the middle of business hours)? | ||||||
| - Why won’t NAP scale down, even though I have lots of underused capacity? | ||||||
| - Why do upgrades get “stuck” on certain nodes? | ||||||
|
|
||||||
| This post focuses on **NAP disruption best practices**, not workload scheduling (tools like topology spread constraints, node affinity, and taints). For scheduling best practices, see the NAP scheduling fundamentals post (link TBD). | ||||||
|
||||||
| This post focuses on **NAP disruption best practices**, not workload scheduling (tools like topology spread constraints, node affinity, and taints). For scheduling best practices, see the NAP scheduling fundamentals post (link TBD). | |
| This post focuses on **NAP disruption best practices**, not workload scheduling (tools like topology spread constraints, node affinity, and taints). |
Copilot
AI
Apr 2, 2026
The hero image referenced here is very large (~1.7 MB). Please compress/resize it (ideally <500 KB) to reduce page weight and improve load performance.
|  | |
|  |
Copilot
AI
Mar 27, 2026
Per the repo’s blog post structure, add a hero image immediately after <!-- truncate -->. The post directory currently contains only index.md, so readers won’t get a hero/social image unless you add one (for example ./hero-image.png) and reference it here with descriptive alt text.
This bullet is grammatically incomplete ("where it shouldn't?"). Consider rephrasing to include the missing verb/object (for example, "where it shouldn't happen").