Delete OCP cluster leftovers using openshift-installer #166
Open: oharan2 wants to merge 11 commits into RedHatQE:master from oharan2:delete_installer
Commits (11, all by oharan2):

- 56f3344 Delete clusters trough openshift-installer
- 0101795 Use strict filtering for all associated resources
- 322f026 Filter cluster resources as group
- cd55d5f Print openshift-installer cli output only during failures
- 2c59933 Update note
- b2bd7e6 Add user confirmation and -y --yes flags to the cli
- e802971 Make filtering more strict
- a69e100 Set clusters as LeftoverAWSOcp object. Update metadata
- 24dccae Move AWS properties to CleanAWSOcps constructor
- 2cffbe6 Move openshift-install under /usr/local/bin
- 61607a6 Remove debug commands from Dockerfile
```diff
@@ -1,63 +1,215 @@
 import tempfile

 from cloudwash.config import settings
+from cloudwash.constants import CLUSTER_EXP_DATE_TAG
+from cloudwash.constants import CLUSTER_ID_TAGS
+from cloudwash.constants import CLUSTER_NAME_TAGS
+from cloudwash.constants import OCP_TAG_SUBSTR
 from cloudwash.entities.resources.base import OCPsCleanup
-from cloudwash.utils import calculate_time_threshold
 from cloudwash.logger import logger
+from cloudwash.utils import check_installer_exists
+from cloudwash.utils import destroy_ocp_cluster_wrapper
 from cloudwash.utils import dry_data
 from cloudwash.utils import filter_resources_by_time_modified
-from cloudwash.utils import group_ocps_by_cluster
-from cloudwash.utils import OCP_TAG_SUBSTR
+from cloudwash.utils import write_metadata_file


+class LeftoverAWSOcp:
+    def __init__(self, infra_id: str, region: str):
+        self.infra_id = infra_id
+        self.region = region
+        self.associated_resources = {"Resources": [], "Instances": []}
+        self._cluster_name = ""  # Extract using resources tags
+        self._cluster_id = ""  # Extract using resources tags
+        self._expiration_date = ""  # Extract using resources tags
+
+    def __repr__(self):
+        return (
+            f'{self.infra_id}, Region: {self.region}, Instances: '
+            f'{len(self.associated_resources.get("Instances"))}, other resources: '
+            f'{len(self.associated_resources.get("Resources"))})'
+        )
+
+    def get_cluster_info(self):
+        for resources_types in self.associated_resources.values():
+            for resource in resources_types:
+                if all([self._cluster_id, self._cluster_name, self._expiration_date]):
+                    break
+                if not self._expiration_date:
+                    exp_date = resource.get_tag_value(key=CLUSTER_EXP_DATE_TAG)
+                    if exp_date:
+                        self._expiration_date = exp_date
+                for name in CLUSTER_NAME_TAGS:
+                    if not self._cluster_name:
+                        name_tag = resource.get_tag_value(key=name)
+                        if name_tag:
+                            self._cluster_name = name_tag
+                for id in CLUSTER_ID_TAGS:
+                    if not self._cluster_id:
+                        id_tag = resource.get_tag_value(key=id)
+                        if id_tag:
+                            self._cluster_id = id_tag
+
+    def get_cluster_metadata(self):
+        """
+        TODO Complete
+        TODO Check if we can extract HostedZoneRole, clusterDomain
+        """
+        # Prepare the data
+        infraID = self.infra_id
+        clusterName = self._cluster_name or infraID
+        clusterID = self._cluster_id or infraID
+
+        logger.info(f"\nPreparing metadata for cluster: {infraID}")
+
+        cluster_metadata = {
+            "clusterName": f"{clusterName}",
+            "clusterID": f"{clusterID}",
+            "infraID": f"{infraID}",
+            "aws": {
+                "region": self.region,
+                "identifier": [{f"{OCP_TAG_SUBSTR}{infraID}": "owned"}],
+            },
+        }
+        return cluster_metadata
+
+
 class CleanOCPs(OCPsCleanup):
-    def __init__(self, client):
-        self.client = client
-        self._delete = []
+    def __init__(self):
+        self._deletable = {"ocp_clusters": [], "filtered_leftovers": []}
```
Collaborator: I think it's good to be consistent with var names, that is …
```diff
+        self._cluster_map = {}
         self.list()

     def _set_dry(self):
-        dry_data['OCPS']['delete'] = self._delete
+        def _make_printable(resources: list):
+            return {
+                ocp.resource_type: [
+                    r.name for r in resources if r.resource_type == ocp.resource_type
+                ]
+                for ocp in resources
+            }
+
+        dry_data['OCPS']['delete'] = _make_printable(self._deletable["filtered_leftovers"])
+        dry_data['OCPS']['clusters'] = self._deletable["ocp_clusters"]

     def list(self):
         pass

     def remove(self):
         pass

-    def cleanup(self):
+    def cleanup(self, user_validation=False):
         if not settings.dry_run:
-            self.remove()
+            check_installer_exists()
+            with tempfile.TemporaryDirectory() as tmpdir:
+                for cluster_name in self._deletable["ocp_clusters"]:
+                    cluster = self._cluster_map[cluster_name]
+                    cluster.get_cluster_info()
+                    cluster.metadata = cluster.get_cluster_metadata()
+                    metadata_path = write_metadata_file(
+                        cluster_metadata=cluster.metadata, cleanup_dir=tmpdir
+                    )
+                    destroy_ocp_cluster_wrapper(
+                        metadata_path=metadata_path,
+                        cluster_name=cluster_name,
+                        user_validation=user_validation,
+                    )


 class CleanAWSOcps(CleanOCPs):
-    def list(self):
-        resources = []
-        time_threshold = calculate_time_threshold(time_ref=settings.aws.criteria.ocps.get("SLA"))
+    def __init__(self, client):
+        self.client = client
+        self.cleaning_region = self.client.cleaning_region
+        super().__init__()
```
Collaborator: Doesn't seem to be required.
```diff
-        ocp_prefix = list(settings.aws.criteria.ocps.get("OCP_PREFIXES") or [""])
-        for prefix in ocp_prefix:
-            query = " ".join(
-                [f"tag.key:{OCP_TAG_SUBSTR}{prefix}*", f"region:{self.client.cleaning_region}"]
-            )
-            resources.extend(self.client.list_resources(query=query))
-
-        # Prepare resources to be filtered before deletion
-        cluster_map = group_ocps_by_cluster(resources=resources)
-        for cluster_name in cluster_map.keys():
-            cluster_resources = cluster_map[cluster_name].get("Resources")
-            instances = cluster_map[cluster_name].get("Instances")
+    def group_ocps_by_cluster(self, resources: list = None) -> dict:
+        """Group different types of AWS resources under their original OCP clusters
+
+        :param list resources: AWS resources collected by defined region and SLA
+        :return: A dictionary with the clusters as keys and the associated resources as values
+        """
+        if resources is None:
+            resources = []
+        clusters_map = {}
+
+        for resource in resources:
+            for key in resource.get_tags(regex=OCP_TAG_SUBSTR):
+                cluster_infra_id = key.get("Key")
+                if OCP_TAG_SUBSTR in cluster_infra_id:
+                    # Considering the following format: "kubernetes.io/cluster/<CLUSTER_INFRA_ID>"
+                    cluster_infra_id = cluster_infra_id.split(OCP_TAG_SUBSTR)[1]
+                    if cluster_infra_id not in clusters_map.keys():
+                        clusters_map[cluster_infra_id] = LeftoverAWSOcp(
+                            infra_id=cluster_infra_id, region=self.cleaning_region
+                        )
+
+                    # Set cluster's EC2 instances
+                    if hasattr(resource, 'ec2_instance'):
+                        clusters_map[cluster_infra_id].associated_resources["Instances"].append(
+                            resource
+                        )
+                    # Set resource under cluster
+                    else:
+                        clusters_map[cluster_infra_id].associated_resources["Resources"].append(
+                            resource
+                        )
+        return clusters_map
+
+    def _filter_deletable(self):
+        time_threshold = settings.aws.criteria.ocps.get("SLA")
+        for cluster in self._cluster_map.keys():
+            resources = self._cluster_map[cluster].associated_resources.get("Resources")
+            instances = self._cluster_map[cluster].associated_resources.get("Instances")
+            leftover_ocp = False

             if instances:
                 # For resources with associated EC2 Instances, filter by Instances SLA
-                if not filter_resources_by_time_modified(
+                if filter_resources_by_time_modified(
                     time_threshold,
                     resources=instances,
                 ):
-                    self._delete.extend(cluster_resources)
+                    leftover_ocp = True
+                    # If cluster is not selected due to other resources being used,
+                    # the instances will only be printed in dry run
+                    self._deletable["filtered_leftovers"].extend(instances)
             else:
-                # For resources with no associated EC2 Instances, identify as leftovers
-                self._delete.extend(
-                    filter_resources_by_time_modified(time_threshold, resources=cluster_resources)
-                )
+                # For resources with no associated EC2 Instances, consider as leftovers
+                leftover_ocp = True
+
+            if leftover_ocp:
+                # Filter all cluster resources by SLA to avoid deletion of resources that are
+                # in use, like EBS volumes or security groups
+                if filter_resources_by_time_modified(time_threshold, resources=resources):
+                    # Will not collect resources recorded during the SLA time
+                    self._deletable["ocp_clusters"].append(cluster)
+                    self._deletable["filtered_leftovers"].extend(resources)
+                else:
+                    logger.info(
+                        f"Found resources in use, skipping the deletion of cluster {cluster}"
+                    )
+
+    def list(self):
+        resources = []
+
+        ocp_prefixes = list(settings.aws.criteria.ocps.get("OCP_PREFIXES") or [""])
+        for prefix in ocp_prefixes:
+            query = " ".join(
+                [f"tag.key:{OCP_TAG_SUBSTR}{prefix}*", f"region:{self.cleaning_region}"]
+            )
+            resources.extend(self.client.list_resources(query=query))
+
+        # Filter resources by SLA before deletion
+        self._cluster_map = self.group_ocps_by_cluster(resources=resources)
+        self._filter_deletable()

-        # Sort resources by type
-        self._delete = sorted(self._delete, key=lambda x: x.resource_type)
+        # Sort resources by type and clusters by name
+        self._deletable["filtered_leftovers"] = sorted(
+            self._deletable["filtered_leftovers"], key=lambda x: x.resource_type
+        )
+        self._deletable["ocp_clusters"] = sorted(self._deletable["ocp_clusters"])
+        self._set_dry()
```
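The grouping in `group_ocps_by_cluster` keys clusters off the infra ID embedded in the `kubernetes.io/cluster/<CLUSTER_INFRA_ID>` tag. A standalone sketch of that idea, using plain dicts as simplified stand-ins for cloudwash's resource objects:

```python
OCP_TAG_SUBSTR = "kubernetes.io/cluster/"

def group_by_infra_id(resources: list) -> dict:
    """Bucket resources by the infra ID embedded in their cluster tag key."""
    clusters = {}
    for resource in resources:
        for tag_key in resource.get("tags", []):
            if OCP_TAG_SUBSTR in tag_key:
                # "kubernetes.io/cluster/<CLUSTER_INFRA_ID>" -> "<CLUSTER_INFRA_ID>"
                infra_id = tag_key.split(OCP_TAG_SUBSTR)[1]
                clusters.setdefault(infra_id, []).append(resource)
    return clusters

# Hypothetical leftover resources from two clusters
resources = [
    {"name": "i-0abc", "tags": ["kubernetes.io/cluster/prod-x7k9"]},
    {"name": "vol-1", "tags": ["kubernetes.io/cluster/prod-x7k9"]},
    {"name": "sg-2", "tags": ["kubernetes.io/cluster/dev-m3t1"]},
]
grouped = group_by_infra_id(resources)
# grouped maps "prod-x7k9" to two resources and "dev-m3t1" to one
```

The PR's version additionally splits each bucket into "Instances" and "Resources" so that SLA filtering can treat EC2 instances as the signal for whether the whole cluster is a leftover.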
If we don't have the choice to say no, or if we shouldn't provide the choice, let's remove the option here.
The alternative, `-d` (say no to all cluster deletions), would make the deletion mode redundant. In deletion mode, if you don't pass `-y` (yes to all prompts), you can go cluster by cluster and decide which ones to clean up. So you're in deletion mode, but you can safely exclude some of the clusters.
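The behavior described above can be sketched as a small helper; this is an illustrative stand-in, not the PR's actual CLI wiring, and the function and parameter names are assumptions:

```python
def confirm_destroy(cluster_name: str, assume_yes: bool = False, ask=input) -> bool:
    """Return True if the given cluster should be destroyed.

    With assume_yes (the -y/--yes flag), skip prompting entirely;
    otherwise ask the user once per cluster. `ask` is injectable
    so the prompt can be tested without real stdin.
    """
    if assume_yes:
        return True
    answer = ask(f"Destroy cluster {cluster_name}? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

A cleanup loop would then call the destroy wrapper only for clusters where `confirm_destroy` returns True, which is what lets a user stay in deletion mode while excluding individual clusters.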
I think the mechanism Cloudwash provides for excluding resources is to put them in an exceptions list. That means the user has to pre-choose the clusters they don't want resources deleted from.
Can we move to exceptions mode? How would that impact your use case?