Full parallel content guard test#75
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces batch processing capabilities for container-related resources in Pulp, including distributions, remotes, repositories, and sync operations. New Ansible modules were added to handle these operations concurrently using Python's ThreadPoolExecutor, and existing roles were refactored to utilize these batch modules. Feedback identifies critical bugs in the container_syncs module where missing existence checks for repositories and remotes could lead to runtime errors, and the sync API is incorrectly passed a context object instead of an HREF string. Additionally, there is a recommendation to refactor Ansible tasks that build lists within loops using set_fact to avoid O(N^2) performance degradation.
| repository_ctx = PulpContainerRepositoryContext( | ||
| pulp_ctx, entity={"name": sync_item["repository"]} | ||
| ) | ||
| repository = repository_ctx.entity |
There was a problem hiding this comment.
The code does not check if the repository exists before accessing its attributes. If repository_ctx.entity is None (e.g., if the repository name provided in sync_item is not found), subsequent calls like repository.get("remote") at line 170 or repository.get("latest_version_href") at line 180 will raise an AttributeError.
| repository = repository_ctx.entity | |
| repository = repository_ctx.entity | |
| if not repository: | |
| result["failed"] = True | |
| result["msg"] = f"Repository '{sync_item['repository']}' not found." | |
| return result |
| remote_ctx = PulpContainerRemoteContext( | ||
| pulp_ctx, entity={"name": remote_name} | ||
| ) | ||
| payload["remote"] = remote_ctx |
There was a problem hiding this comment.
There are two issues here: 1) The code does not check if the remote exists before using it. 2) The sync API expects a string (HREF) for the remote field, but the context object itself is being passed. This should be updated to use the pulp_href from the found entity, consistent with how it's handled in container_distributions.py (line 204).
| remote_ctx = PulpContainerRemoteContext( | |
| pulp_ctx, entity={"name": remote_name} | |
| ) | |
| payload["remote"] = remote_ctx | |
| remote_ctx = PulpContainerRemoteContext( | |
| pulp_ctx, entity={"name": remote_name} | |
| ) | |
| if not remote_ctx.entity: | |
| result["failed"] = True | |
| result["msg"] = f"Remote '{remote_name}' not found." | |
| return result | |
| payload["remote"] = remote_ctx.entity["pulp_href"] |
| - name: Build container remotes list | ||
| set_fact: | ||
| container_remotes_list: "{{ container_remotes_list + [{ | ||
| 'name': item.name + '-remote', | ||
| 'upstream_name': item.get('upstream_name', item.name), | ||
| 'url': item.get('url'), | ||
| 'ca_cert': item.get('ca_cert'), | ||
| 'client_cert': item.get('client_cert'), | ||
| 'client_key': item.get('client_key'), | ||
| 'download_concurrency': item.get('download_concurrency'), | ||
| 'exclude_tags': item.get('exclude_tags'), | ||
| 'include_tags': item.get('include_tags'), | ||
| 'policy': item.get('policy'), | ||
| 'proxy_url': item.get('proxy_url'), | ||
| 'proxy_username': item.get('proxy_username'), | ||
| 'proxy_password': item.get('proxy_password'), | ||
| 'remote_username': item.get('remote_username'), | ||
| 'remote_password': item.get('remote_password'), | ||
| 'tls_validation': item.get('tls_validation'), | ||
| 'state': item.get('state') } | dict2items | rejectattr('value', 'none') | items2dict ] }}" | ||
| loop: "{{ (pulp_repository_container_repos | selectattr('state', 'equalto', 'absent') + pulp_repository_container_repos | selectattr('url', 'defined')) | unique }}" | ||
| loop_control: | ||
| loop_var: item | ||
| label: "{{ item.name }}" |
There was a problem hiding this comment.
Building a list using set_fact inside a loop is an anti-pattern in Ansible due to its O(N^2) performance impact as the list grows (each set_fact creates a full copy of the list). It is recommended to use a single set_fact with a Jinja2 template or a complex filter chain to construct the list in one operation.
| - name: Build container syncs list | ||
| set_fact: | ||
| container_syncs_list: "{{ container_syncs_list + [{'repository': item.name, 'remote': item.name + '-remote'}] }}" | ||
| loop: "{{ pulp_repository_container_repos | selectattr('url', 'defined') | selectattr('state', 'equalto', 'present') }}" | ||
| loop_control: | ||
| loop_var: item | ||
| label: "{{ item.name }}" |
Signed-off-by: Bartosz Bezak <bartosz@stackhpc.com>
05794cb to
e350ac8
Compare
No description provided.