OCPSTRAT-1677: AWS spot instance support for HyperShift #1951
enxebre wants to merge 1 commit into openshift:master
Conversation
@enxebre: This pull request references OCPSTRAT-1677, which is a valid Jira issue.
[APPROVALNOTIFIER] This PR is NOT APPROVED.
/lgtm
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@enxebre: all tests passed!
/lgtm
/lgtm
> - Add `Spot` as a value to the existing `MarketType` enum and add a `marketType` field to `PlacementOptions` on `AWSNodePoolPlatform`.
> - Add a `SpotOptions` struct with an optional `maxPrice` field, referenced from `PlacementOptions` via a `spot` field.
> - Add a `terminationHandlerQueueURL` field to `AWSPlatformSpec` on the HostedCluster to configure the NTH SQS queue as a proper API field.
> - Ensure that spot instances are automatically labeled with `hypershift.openshift.io/interruptible-instance` so that workloads can use node affinity and anti-affinity rules to control placement.
nit: Nodes backed by spot instances
```yaml
aws:
  region: us-east-1
  rolesRef: ...
  terminationHandlerQueueURL: "https://sqs.us-east-1.amazonaws.com/123456789012/my-nth-queue"
```
What kind of validation do we do here? Could a misconfigured or malicious user pointing this at an arbitrary URL cause a DoS of the NTH on the control plane?
Update: I now see validation in the API definition below.
> 3. CAPA creates EC2 spot instances with the specified market options.
> 4. The NTH (deployed by the control-plane-operator when `terminationHandlerQueueURL` is set on the HostedCluster) watches for spot interruption and rebalance recommendation events, and cordons/drains nodes before termination.
> 5. The spot remediation controller in the HCCO watches Nodes tainted with the `aws-node-termination-handler/` prefix and triggers Machine deletion for immediate replacement.
This is a new responsibility for the HCCO (RBAC?), and Machines exist in the management cluster, which the HCCO does not normally operate against.
I don't know this area well, but what prevents us from using the MHC to trigger Machine replacement in both cases, whether or not `terminationHandlerQueueURL` is specified? I.e., the MHC could gain the ability to replace Machines when the Node is tainted with the `aws-node-termination-handler` taint.
```go
// Supports both standard and FIFO queues (FIFO queues end with .fifo suffix).
//
// +optional
// +kubebuilder:validation:Pattern=`^https://sqs\.[a-z0-9-]+\.amazonaws\.com/[0-9]{12}/[a-zA-Z0-9_-]+(\.fifo)?$`
```
Ah ok, so there is some limitation on the URL 👍
> This controller provides faster machine replacement than the MHC alone. The MHC waits for the Machine to become `Failed`, which only happens after the instance is already deleted. The spot remediation controller acts immediately upon receiving the NTH taint, before the instance is actually terminated.
>
> The MHC remains as a fallback for cases where the spot remediation controller or NTH is unavailable.
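For reference, the MHC fallback described above could be expressed as a CAPI `MachineHealthCheck` along these lines. All names and the timeout value here are illustrative assumptions; only the `maxUnhealthy: 100%` setting comes from the PR summary:

```yaml
# Illustrative sketch of the safety-net MachineHealthCheck.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: nodepool-mhc        # hypothetical name
spec:
  clusterName: example       # hypothetical cluster
  maxUnhealthy: 100%         # per the PR: MHC kept as a fallback with maxUnhealthy: 100%
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: example
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 300s          # assumed value
```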
Again, I'm out of my depth here, but it seems possible to enhance the MHC to do this and leave the HCCO unchanged. I really would like to keep the HCCO out of the business of operating against CAPI in the HCP.
> #### IAM Permissions for SQS
>
> The NTH deployment uses the NodePoolManagement IAM role credentials (via IRSA/STS) to poll and acknowledge messages from the SQS queue. This requires adding `sqs:ReceiveMessage` and `sqs:DeleteMessage` permissions to the NodePoolManagement role's IAM policy:
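A minimal sketch of the policy statement such an addition implies, using only the two actions named above; the queue ARN is a placeholder, not taken from the PR:

```json
{
  "Effect": "Allow",
  "Action": [
    "sqs:ReceiveMessage",
    "sqs:DeleteMessage"
  ],
  "Resource": "arn:aws:sqs:*:*:*"
}
```

In practice the `Resource` would be scoped to the specific NTH queue ARN rather than a wildcard.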
I understand that adding a new role for NTH is heavyweight. Just want to call out that we will be unable to distinguish between AWS calls from CAPA and NTH on the AWS side if we reuse the role/service account.
Hi @enxebre: as part of the work on adding …, could the IAM permissions section of the enhancement be updated to reflect this? Specifically, the current block:

{ …

should become: …
Summary
- `marketType` (OnDemand/Spot/CapacityBlocks) and `spot` (`SpotOptions` with `maxPrice`) fields added to `PlacementOptions` on `AWSNodePoolPlatform`
- `terminationHandlerQueueURL` field added to `AWSPlatformSpec` on HostedCluster for NTH SQS queue configuration
- `aws-node-termination-handler` image in the OCP release payload
- MHC with `maxUnhealthy: 100%` as a safety net fallback

Test plan
- `isSpotEnabled()`, CAPA template mapping, resource tags, CEL validation rules

🤖 Generated with Claude Code