Feature branch sync - staging to pub/q2_upgrade (#4784)
Merged
Feature branch sync - pub/q2_dev to staging
* offline installation of custom volume exporter (#4442)
* change calico images registry from docker.io to quay.io (#4439)
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
---------
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
Co-authored-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: Katakam Rakesh Naga Sai <125246792+Katakam-Rakesh@users.noreply.github.com>
Feature branch sync - pub/q2_dev to staging
* no slot matching for single nic
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* Merge pull request #4456 from jagadeeshnv/pub/q2_upgrade
Upgrade tasks for slurm
* ldms image tag changes in service_k8s.json and values.yaml (#4511)
* ldms image tag changes in service_k8s.json and values.yaml
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
---------
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
* Merge pull request #4516 from abhishek-sa1/pub/q2_upgrade
Image build fix for buildstream
* Perform gitlab config and pipeline upgrade as part of upgrade_build_stream.yml playbook. (#4513)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Merge pull request #4519 from mithileshreddy04/pub/q2_upgrade_fix4
Fix for undefined variable error in pre upgrade check
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook (#4521)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Upgrade fixes for tag component and approval file update (#4522)
* Update upgrade_oim.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update build_image_common.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* upgrade fixes
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update upgrade.yml
* upgrade approval
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update bss.yaml.j2
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Merge pull request #4523 from mithileshreddy04/pub/q2_upgrade_fix4
Fix telemetry upgrade transform when ldms enabled
* Update create_k8s_config_nfs.yml
Signed-off-by: snarthan <narthan.s@dell.com>
* Update ci-group-service_kube_control_plane_first_x86_64.yaml.j2
Signed-off-by: snarthan <narthan.s@dell.com>
* Update ci-group-service_kube_control_plane_x86_64.yaml.j2
Signed-off-by: snarthan <narthan.s@dell.com>
* Upgrade for slurm with reboot tracking complete
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Upgrade input parameters new
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* csi powerscale changes for version directory
* updating the folder paths
* correcting folder path
* change the path
* rearrangement of telemetry files for easier upgrades (#4527)
* rearrangement of telemetry files for easier upgrades
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* fix for UT issues
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* keep old timeout
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
---------
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* correcting the path for csi driver powerscale
* Rollback WIP - added code similar to upgrade_slurm
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* input validation for telemetry_storage_config.yml (#4529)
* input validation for telemetry_storage_config.yml
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert wrongly generated code changes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* upgrade template update
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* add schema json
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* schema validation
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* add replica count
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* refer from telemetry_storage_config.yml
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* refer from telemetry_storage_config.yml
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update telemetry.sh
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* UT fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* pylint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
---------
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* Rollback working v1.0
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* changes
* removing the hardcoded nfs directory creation
* correcting install code
* csi 2.16 installation directory changes for upgrade use case
* ansible lint fix
* lint fixes
* ACL permission for build_image_x86 and aarch_64 (#4525)
* ACL permission
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
* delegate task and verbosity
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
* cloudinit status check
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
* Revert "cloudinit status check"
This reverts commit fd168a2a006de16faae91817d24bcf236b8fa40e.
---------
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
* Update main.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update get_powerscale_dependencies.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update get_powerscale_dependencies.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update get_powerscale_dependencies.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* telemetry backup during provision upgrade and victoria upgrade (#4531)
* telemetry backup during provision
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* delegate to required component
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* correct delegation
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* addressing review comments
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* victoria_upgrade
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* commenting out execure_telemetry_sh.j2
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* allow for single control plane
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* UT fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* read k8s version dynamically
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* read k8s version dynamically
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
---------
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* input validation for slot hex values
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* Merge pull request #4524 from Venu-p1/dev/upgrade-buildstream-1
feature: Upgrade BuildStream 2.1 to 2.2
* Removed emojis from ansible output
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Update main.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update get_powerscale_dependencies.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Added reboot tracking for successful and failed nodes in upgrade and rollback
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* etcd on local disk support
Signed-off-by: Narthan_S <narthan.s@dell.com>
* shell scripts modified to mount etcd on disk
Signed-off-by: Narthan_S <narthan.s@dell.com>
* fix(checkmarx): resolve sensitive-log and info-exposure findings in ome_server_inventory
Addresses 7 Checkmarx findings (IDs 33-37, 39, 42) in ome_server_inventory.py:
Filtering_Sensitive_Logs (IDs 33-37):
- Clear self.password from memory immediately after authentication POST
- Pass ome_username/ome_password directly from module.params to OMEClient
constructor without intermediate local variables
- Breaks Checkmarx data-flow chain from ome_password to logger calls
Information_Exposure_Through_an_Error_Message (IDs 39, 42):
- Log only exception class name (type(exc).__name__) instead of full
exception string in retry and inventory-fetch warning messages
- Prevents potential exposure of internal paths, IPs, or connection
details in log output
Tested: prepare_oim and discovery flows re-tested successfully.
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
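The two mitigations described in the commit above can be sketched roughly as follows. This is an illustration only, not the code in ome_server_inventory.py: `OMEClientSketch`, `authenticate`, the `post` callable, the payload keys, and `safe_warning` are hypothetical names. Note that setting the attribute to `None` only drops the object's reference to the password; it does not guarantee the string is erased from process memory.

```python
import logging

logger = logging.getLogger("ome_inventory_sketch")  # hypothetical logger name


class OMEClientSketch:
    """Hypothetical stand-in for the OMEClient named in the commit message."""

    def __init__(self, username, password):
        self.username = username
        self.password = password

    def authenticate(self, post):
        """Send credentials through `post`, then drop the stored password."""
        try:
            return post({"UserName": self.username, "Password": self.password})
        finally:
            # Clear the credential as soon as the authentication POST returns,
            # whether it succeeded or raised.
            self.password = None


def safe_warning(action, exc):
    """Log and return a message that names only the exception class."""
    message = f"{action} failed: {type(exc).__name__}"
    logger.warning(message)
    return message
```

Logging `type(exc).__name__` keeps the failure category visible while leaving out the exception text, which is where paths, IPs, or connection details would normally appear.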
* Fail if ochami env setting failed
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Added enhanced error message if cloud-init fails
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Feature branch sync - pub/vast_telemetry - pub/q2_dev (#4539)
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
Co-authored-by: Nagachandan-P <Nagachandan.p@dell.com>
Co-authored-by: snarthan <narthan.s@dell.com>
Co-authored-by: Rajeshkumar-s2 <rajeshkumar.s2@dell.com>
Co-authored-by: sujit-jadhav <sujit.jadhav@dell.com>
Co-authored-by: Kratika Patidar <Kratika.Patidar@dell.com>
* VAST telemetry/syslogs enabled (#4501) (#4540)
* VAST telemetry/syslogs enabled
* Added VAST credentials rules
* updated vast telemetry config in upgrade template
---------
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
Co-authored-by: Nagachandan-P <Nagachandan.p@dell.com>
Co-authored-by: snarthan <narthan.s@dell.com>
Co-authored-by: Rajeshkumar-s2 <rajeshkumar.s2@dell.com>
Co-authored-by: sujit-jadhav <sujit.jadhav@dell.com>
Co-authored-by: Kratika Patidar <Kratika.Patidar@dell.com>
Co-authored-by: pullan1 <sudha.pullalaravu@dell.com>
* Increased retry interval to 15 seconds
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Updated retries and interval
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
* Merge pull request #4544 from Kratika-P/pub/q2_upgrade_latest
upgrade telemetry complete changes along with integration
* template format for csi install
* template format for csi install
* lint fixes
* lint fixes
* Template version of csi installation (#4546)
* Update main.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update get_powerscale_dependencies.yml
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* template format for csi install
* template format for csi install
* lint fixes
* lint fixes
---------
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
Co-authored-by: Narthan_S <narthan.s@dell.com>
* Modification to Upgrade K8s Logic (#4541)
* rollback for k8s upgrade
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* fixes after UT on rollback k8s
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* logic updated for rollback status file
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* lint fixes
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* fix remaining lint issues
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* reversing rollback changes
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
---------
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* Cloud-init failure message enhanced (#4538)
* Fail if ochami env setting failed
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Added enhanced error message if cloud-init fails
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Increased retry interval to 15 seconds
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Updated retries and interval
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
---------
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
* Fix integ issues during upgrade. (#4545)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Buildstream skip in upgrade playbook (#4547)
* Update upgrade_build_stream.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update upgrade_build_stream.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Upgrade and rollback fix
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update upgrade.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* external victoria config update (#4536)
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* upgrade powerscale
* removing redundant code
* command fix
Signed-off-by: Narthan_S <narthan.s@dell.com>
* removing pause
Signed-off-by: Narthan_S <narthan.s@dell.com>
* Openchami rollback from omnia 2.2 to 2.1 (#4542)
* Add Openchami rollback changes from omnia 2.2 to 2.1
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Fix ansible lint issues
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Input transform fixes
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Update omnia_config.j2
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Update ansible-lint.yml
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
---------
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
Signed-off-by: Mithilesh Reddy <mithilesh.reddy@dell.com>
* Fix/catalog k8s (#4548)
* Update service_k8s to 1.35.1 and csi_driver_powerscale to v2.16.0 in catalog software configs
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Handle optional K8S and Slurm configs, fix versioned bundle filename parsing
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Handle ldms packages in os_* functional layers when those roles exist in pxe.
Infrastructure packages populated only from csi_driver_powerscale.json
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Catalogs regenerated with below changes.
os_x86_64/os_aarch64 functional layers now populated from ldms.json
Infrastructure packages only from csi_driver_powerscale.json
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Remove name-based infrastructure package classification fallback
Infrastructure packages now determined solely by bundle membership in csi_driver_powerscale. Removed _is_infra_package_name() fallback logic and updated catalogs accordingly.
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
---------
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* final commit for upgrade
* scratch mount for login nodes
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* drain and calico fix
Signed-off-by: Narthan_S <narthan.s@dell.com>
* lint fixes
Signed-off-by: Narthan_S <narthan.s@dell.com>
* removing tmp dir mount
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* slurm files reverted
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* login compiler nodes for tmp mount
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* Fix to use haproxy-based endpoint resolution (#4550)
* Fix to use haproxy-based endpoint resolution
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Update telemetry_storage_config.j2
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
---------
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Fix/k8s catalog 2 (#4552)
* Fix: Updated logic to handle missing K8s and other packages.
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Add 'v' prefix to service_k8s versioned filename format
Changed service_k8s filename format from service_k8s_{version}.json to service_k8s_v{version}.json for consistency with versioning conventions.
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
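As a rough illustration of the filename convention described above, the sketch below builds and parses the `service_k8s_v{version}.json` form. The helper names and the regular expression are assumptions made for this example; the catalog code in the repository may parse versions differently.

```python
import re

# Assumed pattern: dotted numeric version after the 'v' prefix.
_SERVICE_K8S_PATTERN = re.compile(
    r"^service_k8s_v(?P<version>\d+(?:\.\d+)*)\.json$"
)


def versioned_filename(version):
    """Build the 'v'-prefixed filename for a given version string."""
    return f"service_k8s_v{version}.json"


def parse_version(filename):
    """Return the version from a 'v'-prefixed filename, or None if it does not match."""
    match = _SERVICE_K8S_PATTERN.match(filename)
    return match.group("version") if match else None
```

With this pattern, a filename in the older `service_k8s_{version}.json` form does not match and yields `None`, so a caller can detect it explicitly rather than misparse it.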
---------
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* modify step drain PDB logic
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* modify drain logic in workers
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* Add int to step_drain.yml
Signed-off-by: Vrinda Marwah <vrinda.marwah@dell.com>
* Update step_drain.yml
Signed-off-by: Vrinda Marwah <vrinda.marwah@dell.com>
* test commit on step_drain.yml
Signed-off-by: Vrinda Marwah <vrinda.marwah@dell.com>
* fix in step_drain.yml
Signed-off-by: Vrinda Marwah <vrinda.marwah@dell.com>
* Added mounts to login and login compiler
fixed vast_rdma mnt opts in input dir
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* BuildStream Feature Bug Fixes and Upgrade Stabilisation. (#4556)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update set_pxe_boot.yml (#4559)
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
* Update openchami_auth.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Storage_config doc update
ochami error fix
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* cloud-init enhancement on gpu nodes
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* Update set_pxe_boot.yml
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
* Refactor functional layer generation to use PXE-only architecture mapping (#4565)
* hpc_tools from vast
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Added vast to mount on oim
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* fix(gitlab): Replace YAML folded URLs with single-line to fix InvalidURL error (#4567)
* Update deploy_openchami.yml
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
* Update main.yml
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
* Update push_single_example_catalog.yml
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
* Update main.yml
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
---------
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
* Update omnia.sh for selinux fix for external NFS (#4564)
* Update omnia.sh
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Feature branch update (#70)
* cloud-init enhancement on gpu nodes
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* Update set_pxe_boot.yml
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
* Refactor functional layer generation to use PXE-only architecture mapping (#4565)
* fix(gitlab): Replace YAML folded URLs with single-line to fix InvalidURL error (#4567)
* Update deploy_openchami.yml
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
* Update main.yml
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
* Update push_single_example_catalog.yml
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
* Update main.yml
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
---------
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
---------
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: Nagachandan-P <Nagachandan.p@dell.com>
Co-authored-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Co-authored-by: Venu-p1 <236371043+Venu-p1@users.noreply.github.com>
Co-authored-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: Nagachandan-P <Nagachandan.p@dell.com>
Co-authored-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Co-authored-by: Venu-p1 <236371043+Venu-p1@users.noreply.github.com>
Co-authored-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
* Merge pull request #4568 from abhishek-sa1/pub/q2_dev
Update telemetry_storage_config.yml validation
* Update read_software_config.yml (#4570)
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* fix(discovery): detect IB NIC when iDRAC reports Unknown LinkStatus (OMN01D-2442)
iDRAC may report InfiniBand NIC LinkStatus as 'Unknown' even when the
interface is connected and active at the OS level. The previous logic
in extract_server_info() strictly required LinkStatus=='Up', silently
skipping these NICs and leaving IB_NIC_NAME and IB_IP empty in the
PXE mapping file.
Add prioritized fallback selection for IB NIC detection:
1. Up — selected immediately (no change)
2. Unknown — preferred fallback when no Up port exists
3. Down/other — last resort, first port at this tier wins
This mirrors the existing ethernet NIC fallback pattern (lines 436-465)
and ensures InfiniBand NICs are populated in the PXE mapping even when
iDRAC misreports their status.
Add 12 unit tests covering all priority combinations.
Fixes: OMN01D-2442
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
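The prioritized fallback described in this commit can be sketched as below. This is a minimal illustration, not the actual body of extract_server_info(): the function name `pick_ib_nic`, the list-of-dicts input shape, and the `Name` key are assumptions; only the `LinkStatus` values and their ordering come from the commit message.

```python
def pick_ib_nic(ports):
    """Return the preferred port dict, or None when `ports` is empty.

    Priority: LinkStatus 'Up' first, then 'Unknown', then any other status.
    Within the 'Unknown' and 'other' tiers, the first port seen wins.
    """
    first_unknown = None
    first_other = None
    for port in ports:
        status = port.get("LinkStatus")
        if status == "Up":
            return port  # tier 1: selected immediately
        if status == "Unknown":
            if first_unknown is None:
                first_unknown = port  # tier 2: preferred fallback
        elif first_other is None:
            first_other = port  # tier 3: Down or anything else
    return first_unknown if first_unknown is not None else first_other
```

For example, given ports reporting `Down`, `Unknown`, and `Up` in that order, the `Up` port is returned; if the `Up` port is removed, the `Unknown` port is chosen over the `Down` one.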
* adding secrets and values file
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* Fix/missing csi and docker.io/victoriametrics/operator (#4572)
* Fix image package deduplication by using tag-aware keys and update CSI filter type to substring.
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Refactor package deduplication to use tag/version-aware keys and merge duplicate entries
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
---------
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
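A tag/version-aware deduplication key, as described in the two commits above, might look like the following sketch. The field names (`package`, `type`, `tag`, `version`) and the merge policy (keep existing fields, fill in missing ones) are assumptions for illustration and may not match the catalog generator's actual data model.

```python
def dedupe_packages(packages):
    """Merge entries that share the same (package, type, tag-or-version) key."""
    merged = {}
    for pkg in packages:
        # Including the tag or version in the key keeps two tags of the same
        # image as separate entries instead of collapsing them into one.
        key = (
            pkg.get("package"),
            pkg.get("type"),
            pkg.get("tag") or pkg.get("version"),
        )
        if key in merged:
            # Duplicate entry: keep existing fields, add any that were missing.
            for field, value in pkg.items():
                merged[key].setdefault(field, value)
        else:
            merged[key] = dict(pkg)
    return list(merged.values())
```

Keying on the package name alone would drop every tag but the first, which is one plausible way an image such as docker.io/victoriametrics/operator could go missing from a generated catalog.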
* Fix checkmarx issues
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* vector-ome changes (#4569)
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* Update deploy_powerscale_csi.sh.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update credential_rules.json (#4577)
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
* update template
Signed-off-by: mcas <sakshi.s@dell.com>
* Update omnia_config.json
Signed-off-by: snarthan <narthan.s@dell.com>
* powerscale changes
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* powerscale backup changes
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
* Fix to execute rhel subscription validation when 'local_repo' is explicitly present in omnia_run_tags (#4574)
* Update deploy_powerscale_csi.sh.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update deploy_powerscale_csi.sh.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Upgrade and Rollback changes
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Update ci-group-slurm_node_x86_64.yaml.j2
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
* Update mount_on_oim.yml
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
* signoff last commit (#4589)
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* Update ci-group-slurm_node_x86_64.yaml.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update ci-group-slurm_node_x86_64.yaml.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Update ci-group-slurm_node_aarch64.yaml.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Refactor Gitlab configuration in Upgrade (#4576)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Refactor Gitlab configuration in Upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix upgrade validation, summary paths, lint issues, and validate stage log_file_path
- Fix validation: only abort if enable_build_stream was true in 2.1 backup AND false in current config
- Fix summary: resolve backup_dir to actual path using source_version instead of Jinja template
- Fix 3 yaml line-length lint failures (gitlab_ci_file.yml, gitlab_example_catalog.yml, gitlab_input_file.yml)
- Fix missing log_file_path for validate stage: add NFS log file creation in playbook watcher execute_molecule() to match other stages
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
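The validation rule in the first bullet above reduces to a single boolean condition. The sketch below assumes both configurations have already been loaded into dictionaries; the function name and the default of `False` for a missing key are hypothetical, and the real check lives in the upgrade playbooks rather than in Python.

```python
def should_abort_upgrade(backup_config, current_config):
    """Abort only when BuildStream was enabled in the 2.1 backup and is disabled now."""
    was_enabled = bool(backup_config.get("enable_build_stream", False))
    is_enabled = bool(current_config.get("enable_build_stream", False))
    return was_enabled and not is_enabled
```

Every other combination (never enabled, still enabled, or newly enabled) lets the upgrade proceed.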
* Remove buildstream directories as part of cleanup
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Address review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix documentation for discover.yml
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Upgrade and rollback more error scenarios
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Skipping validation and upgrade for service_k8s if not present in software_config json (#4597)
* Fix to execute rhel subscription validation when 'local_repo' is explicitly present in omnia_run_tags
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
* Added vast_telemetry run tag
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
* Skipping validation and upgrade for service_k8s if not present in software_config json
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
---------
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
* Sonarqube scan fixes (#4596)
* SonarQube update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Fixing sonarqube issue
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Merge pull request #4598 from mithileshreddy04/q2_rollback_fix
Idempotency fixes and backup modification for openCHAMI upgrade and rollback
* Add rollback code for k8s
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* lint fixes
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* fix: vector-ome and idrac pods not coming up issue (#4591)
* vector-ome changes
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* defect fix for vector-ome and idrac pods not initialized
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* lint-fix
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
---------
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
Signed-off-by: Kratika Patidar <Kratika.Patidar@dell.com>
* resource limits for idrac telemetry containers (#4590)
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* updating pdb code and version for csi
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* updated
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* skip k8s rollback when service k8s is not enabled
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* Increase job wait timeout for build image to 2 hours (#4594)
* Increase job wait timeout
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* build image fix
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* If no slurm nodes - ending play instead of fail
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Changes to mount hpc_tools to vast
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Added comments about vast storage
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Cleanup stale file handle error
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Fix checkmarx vulnerabilities (#4611)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Refactor Gitlab configuration in Upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix upgrade validation, summary paths, lint issues, and validate stage log_file_path
- Fix validation: only abort if enable_build_stream was true in 2.1 backup AND false in current config
- Fix summary: resolve backup_dir to actual path using source_version instead of Jinja template
- Fix 3 yaml line-length lint failures (gitlab_ci_file.yml, gitlab_example_catalog.yml, gitlab_input_file.yml)
- Fix missing log_file_path for validate stage: add NFS log file creation in playbook watcher execute_molecule() to match other stages
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
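Note: the validation rule in the commit above (abort only if enable_build_stream was true in the 2.1 backup AND false in the current config) can be sketched as a small predicate. This is an illustrative sketch, not the playbook's actual code: the function name and the dict-shaped configs are assumptions, and only the key name enable_build_stream comes from the commit message.

```python
def should_abort_upgrade(backup_cfg: dict, current_cfg: dict) -> bool:
    """Return True only when BuildStream was enabled in the backed-up
    config but is disabled in the current config.

    Hypothetical helper: both arguments are assumed to be already-parsed
    config mappings; a missing key is treated as False.
    """
    was_enabled = bool(backup_cfg.get("enable_build_stream", False))
    is_enabled = bool(current_cfg.get("enable_build_stream", False))
    return was_enabled and not is_enabled
```

Every other combination (never enabled, still enabled, newly enabled) lets the upgrade proceed.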
* Remove buildstream directories as part of cleanup
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Address review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix documentation for discover.yml
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues3
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Feature - BuildStream Rollback (2.2 -> 2.1) (#4610)
* Fix checkmarx issues. (#4612)
* Merge pull request #4618 from VrindaMarwah/pub/q2_upgrade
Increase pod status timeout in K8s rollback
* Create readable pkg ids in Catalog (#4617)
* Replace sequential package IDs with human-readable identifiers in catalog generation
using format: {sanitized_name}_{version}_{type} or {sanitized_name}_{version}_{type}_{counter}
- Implement collision detection and handling with counter suffix when duplicate IDs occur
- Extract version information from package names for pip modules (name==version format) and RPMs
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* use 'na' if version is not available.
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
---------
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
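Note: a minimal sketch of the ID scheme described in #4617, assuming the {sanitized_name}_{version}_{type} format, a counter suffix on collision, and 'na' when no version is available. The helper names, the sanitization rule (lowercase, non-alphanumerics collapsed to underscores), and the counter starting at 1 are assumptions for illustration. Only the pip name==version case is parsed here; RPM version extraction is left out.

```python
import re


def sanitize(name: str) -> str:
    # Assumed rule: lowercase, collapse runs of non-alphanumerics to "_".
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def split_version(package: str):
    # pip modules use the name==version form; anything else gets "na".
    if "==" in package:
        name, version = package.split("==", 1)
        return name, version or "na"
    return package, "na"


def make_package_id(package: str, pkg_type: str, used: set) -> str:
    """Build a readable ID and record it in `used` (hypothetical helper)."""
    name, version = split_version(package)
    base = f"{sanitize(name)}_{version}_{pkg_type}"
    candidate, counter = base, 0
    # Collision handling: append an increasing counter until the ID is free.
    while candidate in used:
        counter += 1
        candidate = f"{base}_{counter}"
    used.add(candidate)
    return candidate
```

Tracking issued IDs in a set keeps the collision check independent of the order in which packages are processed.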
* Add input config diff test & tool for comparing expected vs actual build stream (adapter) outputs (#4586)
* Add input config diff test & tool for comparing expected vs actual build stream (adapter) outputs
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Updated example path
Signed-off-by: Venu-p1 <236371043+Venu-p1@users.noreply.github.com>
---------
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
Signed-off-by: Venu-p1 <236371043+Venu-p1@users.noreply.github.com>
* Merge pull request #4614 from balajikumaran-c-s/pub/q2_dev
fix: Use NFS_SHARE_PATH for artifact_dir validation and rename discovery scenario to provision
* feat(discovery): add BMC discovery report with NIC link statuses
Generate a BMC discovery report CSV alongside the PXE mapping file during
OME server discovery. The report is written to /opt/omnia/discovery/ with
the same timestamp as the PXE mapping file.
Report columns:
SERVICE_TAG, BMC_MAC, BMC_IP, BMC_NIC_STATUS,
ETHERNET_NIC_MAC, ETHERNET_NIC_LINK_STATUS,
IB_NIC_NAME, IB_NIC_LINK_STATUS
Changes:
- ome_server_inventory.py: enrich extract_server_info() to return
idrac_link_status, first_nic_link_status, and ib_nic_link_status
- generate_discovery_report.py: new Ansible module that writes the
discovery report CSV from the enriched server inventory data
- generate_discovery_report.yml: new task file that creates the
/opt/omnia/discovery/ directory and invokes the report module
- main.yml: add generate_discovery_report task after PXE mapping
- vars/main.yml: add discovery_report_dir and discovery_report_file vars
- generate_pxe_mapping.yml: move completion messages to
generate_discovery_report.yml so the report path is available
- test_generate_discovery_report.py: 7 unit tests covering headers,
empty servers, all fields, missing IB, multiple servers, nested dirs,
and return values
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
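Note: a rough sketch of how a report with the columns listed above could be written. It is not the generate_discovery_report.py module itself; the function name and the assumption that each server is a dict keyed by column name are illustrative only.

```python
import csv
import os

# Column order taken from the commit message above.
REPORT_COLUMNS = [
    "SERVICE_TAG", "BMC_MAC", "BMC_IP", "BMC_NIC_STATUS",
    "ETHERNET_NIC_MAC", "ETHERNET_NIC_LINK_STATUS",
    "IB_NIC_NAME", "IB_NIC_LINK_STATUS",
]


def write_discovery_report(servers, report_path):
    """Write one CSV row per server; missing fields (e.g. no IB NIC)
    are written as empty strings. Returns the number of rows written."""
    # Create nested parent directories if they do not exist yet.
    os.makedirs(os.path.dirname(report_path) or ".", exist_ok=True)
    with open(report_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=REPORT_COLUMNS)
        writer.writeheader()
        for server in servers:
            writer.writerow({col: server.get(col, "") for col in REPORT_COLUMNS})
    return len(servers)
```

An empty server list still produces a header-only file, which matches the "empty servers" case named in the unit-test list.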
* Merge pull request #4620 from mithileshreddy04/q2_upgrade_fixes
Update prepare_upgrade.yml
* telemetry rollback and upgrade changes (#4613)
* telemetry rollback and upgrade changes
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* rollback and upgrade comments fixes
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* changing component and tag name from k8s -> k8s-telemetry
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* lint
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
---------
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* Updating oim host group in upgrade local repo flow (#4625)
* Updating podman executable path
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* increase retries
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* local repo oim host fix
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Revert "Updating podman executable path"
This reverts commit 9024f8a795aa62ecdd2bb06c9eaf5cc990760e26.
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Merge pull request #4624 from pullan1/pub/q2_dev
pulp publication issue fix
* Latest input parameter transformation changes for prepare_upgrade.yml flow (#4628)
* Update input file migration logic to latest input file format
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Update prepare_upgrade.yml
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* Update omnia_config.j2
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
---------
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
* mpi jobs with doca compatible ucx
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
* changing the sdk installation
Signed-off-by: mcas <sakshi.s@dell.com>
* Upgrade omnia_config import
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* changing the export
Signed-off-by: mcas <sakshi.s@dell.com>
* Fix incorrect pipeline execution by fixing the order of trigger files followed by pipeline yml files. (#4632)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Refactor Gitlab configuration in Upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix upgrade validation, summary paths, lint issues, and validate stage log_file_path
- Fix validation: only abort if enable_build_stream was true in 2.1 backup AND false in current config
- Fix summary: resolve backup_dir to actual path using source_version instead of Jinja template
- Fix 3 yaml line-length lint failures (gitlab_ci_file.yml, gitlab_example_catalog.yml, gitlab_input_file.yml)
- Fix missing log_file_path for validate stage: add NFS log file creation in playbook watcher execute_molecule() to match other stages
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Remove buildstream directories as part of cleanup
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Address review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix documentation for discover.yml
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues3
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the checkmarx issues4
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues5
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the order of trigger files followed by pipeline to avoid incorrect pipelines execution
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Add ci skip tags
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Input doc update
upgrade slurm backup updated
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* slurm_backup location update
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Slurm backup of conf and mysql
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* fix(discovery): exclude IB NIC from ADMIN_MAC and default empty LinkStatus to Unknown
Bug 1: ADMIN_MAC picking InfiniBand MAC
- Root cause: Ethernet NIC search only excluded iDRAC NICs, not InfiniBand NICs
- When all Ethernet NICs had UNKNOWN status, the fallback picked an IB NIC MAC
- Fix: Add INFINIBAND filter to both primary and deviceNics fallback search
Bug 2: Missing LINK STATUS for UNKNOWN NICs in discovery report
- Root cause: null/empty LinkStatus defaulted to empty string
- Fix: Default to 'Unknown' in all 5 LinkStatus fallback expressions
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
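Note: the two fixes above can be sketched as follows. The dict keys (Type, MacAddress, LinkStatus) and function names are hypothetical stand-ins rather than the OME inventory schema; the sketch only shows the filtering and defaulting logic the commit message describes.

```python
def link_status(nic: dict) -> str:
    # Bug 2 fix: a null or empty LinkStatus is reported as "Unknown".
    return nic.get("LinkStatus") or "Unknown"


def pick_admin_mac(nics: list):
    """Pick the admin MAC from Ethernet NICs only (hypothetical helper)."""

    def is_candidate(nic: dict) -> bool:
        kind = (nic.get("Type") or "").upper()
        # Bug 1 fix: exclude InfiniBand NICs as well as iDRAC NICs.
        return "IDRAC" not in kind and "INFINIBAND" not in kind

    candidates = [nic for nic in nics if is_candidate(nic)]
    # Primary search: prefer a candidate whose link is up.
    for nic in candidates:
        if link_status(nic).upper() == "UP":
            return nic.get("MacAddress")
    # Fallback: the same filter applies, so an IB MAC is never returned.
    return candidates[0].get("MacAddress") if candidates else None
```

Applying one filter to both the primary and the fallback search is what prevents the fallback from picking an IB MAC when every Ethernet NIC reports an unknown status.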
* Create setup_doca_mpi_env.sh.j2 for environment setup of mpi
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-login_compiler_node_aarch64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-login_compiler_node_x86_64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-slurm_node_aarch64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-slurm_node_x86_64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-login_compiler_node_aarch64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update main.yml (#4637)
Signed-off-by: Kratika Patidar <Kratika.Patidar@dell.com>
* Better logging visibility (#4639)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Refactor Gitlab configuration in Upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix upgrade validation, summary paths, lint issues, and validate stage log_file_path
- Fix validation: only abort if enable_build_stream was true in 2.1 backup AND false in current config
- Fix summary: resolve backup_dir to actual path using source_version instead of Jinja template
- Fix 3 yaml line-length lint failures (gitlab_ci_file.yml, gitlab_example_catalog.yml, gitlab_input_file.yml)
- Fix missing log_file_path for validate stage: add NFS log file creation in playbook watcher execute_molecule() to match other stages
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Remove buildstream directories as part of cleanup
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Address review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix documentation for discover.yml
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues3
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the checkmarx issues4
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues5
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the order of trigger files followed by pipeline to avoid incorrect pipelines execution
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Add ci skip tags
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Better logging visibility
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* rollback and upgrade formatting and skipped issues fixed
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Telemetry upgrade check updated based on service_k8s presence (#4638)
* telemetry rollback and upgrade changes
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* rollback and upgrade comments fixes
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* changing component and tag name from k8s -> k8s-telemetry
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* lint
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* upgrade telemetry refinement changes
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* skip and end play tasks updated
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
---------
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
* build image selinux fix for aarch64 and x86_64 (#4640)
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Fix for bss and cloudinit update for service_kube_control_plane_x86_64 and display banner message (#4645)
* add code for banner fix
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* Fix incorrect pipeline execution by fixing the order of trigger files followed by pipeline yml files. (#4632)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Refactor Gitlab configuration in Upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix upgrade validation, summary paths, lint issues, and validate stage log_file_path
- Fix validation: only abort if enable_build_stream was true in 2.1 backup AND false in current config
- Fix summary: resolve backup_dir to actual path using source_version instead of Jinja template
- Fix 3 yaml line-length lint failures (gitlab_ci_file.yml, gitlab_example_catalog.yml, gitlab_input_file.yml)
- Fix missing log_file_path for validate stage: add NFS log file creation in playbook watcher execute_molecule() to match other stages
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Remove buildstream directories as part of cleanup
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Address review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix documentation for discover.yml
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues3
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the checkmarx issues4
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues5
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the order of trigger files followed by pipeline to avoid incorrect pipelines execution
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Add ci skip tags
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* fix(discovery): exclude IB NIC from ADMIN_MAC and default empty LinkStatus to Unknown
Bug 1: ADMIN_MAC picking InfiniBand MAC
- Root cause: Ethernet NIC search only excluded iDRAC NICs, not InfiniBand NICs
- When all Ethernet NICs had UNKNOWN status, the fallback picked an IB NIC MAC
- Fix: Add INFINIBAND filter to both primary and deviceNics fallback search
Bug 2: Missing LINK STATUS for UNKNOWN NICs in discovery report
- Root cause: null/empty LinkStatus defaulted to empty string
- Fix: Default to 'Unknown' in all 5 LinkStatus fallback expressions
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
* Upgrade omnia_config import
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Input doc update
upgrade slurm backup updated
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* slurm_backup location update
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Slurm backup of conf and mysql
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
* Update main.yml (#4637)
Signed-off-by: Kratika Patidar <Kratika.Patidar@dell.com>
* Create setup_doca_mpi_env.sh.j2 for environment setup of mpi
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-login_compiler_node_aarch64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-login_compiler_node_x86_64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-slurm_node_aarch64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-slurm_node_x86_64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Update ci-group-login_compiler_node_aarch64.yaml.j2
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
* Better logging visibility (#4639)
* Upgrade gitlab config and Pipeline as part of BuildStream 2.1 to 2.2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Update the upgrade and upgrade_build_stream playbooks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fixing lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Modularize the gitlab tasks under upgrade_build_stream.yml playbook
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Move relevant tasks
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix integration issues during upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* BuildStream Feature Bug fixes and Upgrade Stabilisation
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix lint issues and review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Removed unused methods
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Refactor Gitlab configuration in Upgrade
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix upgrade validation, summary paths, lint issues, and validate stage log_file_path
- Fix validation: only abort if enable_build_stream was true in 2.1 backup AND false in current config
- Fix summary: resolve backup_dir to actual path using source_version instead of Jinja template
- Fix 3 yaml line-length lint failures (gitlab_ci_file.yml, gitlab_example_catalog.yml, gitlab_input_file.yml)
- Fix missing log_file_path for validate stage: add NFS log file creation in playbook watcher execute_molecule() to match other stages
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Remove buildstream directories as part of cleanup
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Address review comments
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix documentation for discover.yml
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues2
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues3
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the checkmarx issues4
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix checkmarx issues5
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Fix the order of trigger files followed by pipeline to avoid incorrect pipelines execution
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Add ci skip tags
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* Better logging visibility
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
---------
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
* fix for setting the bss and cloudinit for service kube control plane when additional cp are present
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* fix: resolve yaml linting errors in upgrade.yml
- Fix line length issue on line 502 (split long line)
- Remove trailing spaces on line 524
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
---------
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
Signed-off-by: Kratika Patidar <Kratika.Patidar@dell.com>
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
Co-authored-by: Rajeshkumar-s2 <rajeshkumar.s2@dell.com>
Co-authored-by: sujit-jadhav <sujit.jadhav@dell.com>
Co-authored-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
Co-authored-by: Kratika Patidar <Kratika.Patidar@dell.com>
Co-authored-by: Nagachandan P <Nagachandan.p@dell.com>
---------
Signed-off-by: Nagachandan-P <Nagachandan.p@dell.com>
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
Signed-off-by: mithileshreddy04 <mithilesh.reddy@dell.com>
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Signed-off-by: Rajeshkumar S <rajeshkumar.s2@dell.com>
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
Signed-off-by: snarthan <narthan.s@dell.com>
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Signed-off-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Signed-off-by: SOWJANYAJAGADISH123 <sowjanya.jagadish@dell.com>
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
Signed-off-by: Narthan_S <narthan.s@dell.com>
Signed-off-by: sujit-jadhav <sujit.jadhav@dell.com>
Signed-off-by: Vrinda_Marwah <Vrinda.Marwah@dell.com>
Signed-off-by: Mithilesh Reddy <mithilesh.reddy@dell.com>
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
Signed-off-by: Vrinda Marwah <vrinda.marwah@dell.com>
Signed-off-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Signed-off-by: Kratika.Patidar <kratika.patidar@dell.com>
Signed-off-by: mcas <sakshi.s@dell.com>
Signed-off-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
Signed-off-by: Kratika Patidar <Kratika.Patidar@dell.com>
Signed-off-by: Venu-p1 <236371043+Venu-p1@users.noreply.github.com>
Signed-off-by: Nagachandan P <Nagachandan.p@dell.com>
Co-authored-by: Nagachandan-P <Nagachandan.p@dell.com>
Co-authored-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Co-authored-by: Kratika Patidar <Kratika.Patidar@dell.com>
Co-authored-by: Sujit Jadhav <sujit.jadhav@dell.com>
Co-authored-by: sakshi-singla-1735 <sakshi.s@dell.com>
Co-authored-by: Mithilesh Reddy <mithilesh.reddy@dell.com>
Co-authored-by: Rajeshkumar-s2 <rajeshkumar.s2@dell.com>
Co-authored-by: Katakam Rakesh Naga Sai <125246792+Katakam-Rakesh@users.noreply.github.com>
Co-authored-by: snarthan <narthan.s@dell.com>
Co-authored-by: pullan1 <sudha.pullalaravu@dell.com>
Co-authored-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Co-authored-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: SOWJANYAJAGADISH123 <Sowjanya.Jagadish@dell.com>
Co-authored-by: Venu-p1 <236371043+Venu-p1@users.noreply.github.com>
Co-authored-by: Vrinda Marwah <vrinda.marwah@dell.com>
Co-authored-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: balajikumaran.cs <balajikumaran.c.s@gmail.com>
Co-authored-by: Jagadeesh N V <jagadeesh.n.v@dell.com>
Co-authored-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* Update OME Discovery completion message for BuildStream workflow
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
* Added skip logic for rollback_slurm
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Update install_dcgm.sh.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* upgrade defect fixes (#4641)
* upgrade defect fixes and fix for CrashLoopBackOff on pod restart
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* remove stale services and deployments for victoria
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert changes as it is taken care of in another PR
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert idrac terminationgraceperiod
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
---------
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* Update install_dcgm.sh.j2
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
* Delegate manifest to localhost
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Merge pull request #4649 from mithileshreddy04/q2_upgrade_rollback_fixes
Upgrade and rollback fixes
---------
Signed-off-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: balajikumaran.cs <balajikumaran.cs@dellteam.com>
Co-authored-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Co-authored-by: sakshi-singla-1735 <sakshi.s@dell.com>
Co-authored-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: Mithilesh Reddy <mithilesh.reddy@dell.com>
Co-authored-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
Feature branch sync - pub/q2_upgrade to staging
Feature branch sync - pub/q2_upgrade to staging
Feature branch sync - pub/q2_upgrade to staging
* task failure ansible.cfg update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Revert "task failure ansible.cfg update"
This reverts commit 7b2a70b.
callback plugin update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Update omnia_default.py
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Update omnia_default.py
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Update omnia_default.py
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Update omnia_default.py
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Update omnia_default.py
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* fix(validation): validate HA VIP against service_kube_control_plane subnet (OMN01D-2534)
In multi-subnet deployments, service K8s control plane nodes may reside in
an additional_subnet (e.g. 10.40.2.0/24) rather than the primary admin
subnet (e.g. 10.40.1.0/24). The VIP for K8s HA must be in the same subnet
as the control plane nodes, not the OIM admin NIC subnet.
The fix:
1. In validate_service_k8s_cluster_ha(), extract control plane node IPs
from PXE mapping (FUNCTIONAL_GROUP_NAME starts with
service_kube_control_plane) and determine their subnet by checking the
primary admin subnet and additional_subnets.
2. Pass the control plane subnet (kcp_subnet_ip, kcp_subnet_bits) to
validate_vip_address().
3. In validate_vip_address(), validate the VIP against the control plane
subnet if provided, otherwise fall back to the primary admin subnet for
backward compatibility.
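Note: a minimal sketch of the subnet logic in the three steps above, using Python's standard ipaddress module. The function names and argument shapes are assumptions for illustration and do not mirror the real validate_vip_address() signature.

```python
import ipaddress


def find_subnet(ip, subnets):
    """Return the first network in `subnets` (CIDR strings) containing `ip`,
    or None if no subnet matches."""
    addr = ipaddress.ip_address(ip)
    for cidr in subnets:
        net = ipaddress.ip_network(cidr, strict=False)
        if addr in net:
            return net
    return None


def validate_vip(vip, admin_subnet, control_plane_ips=(), additional_subnets=()):
    """Check the VIP against the control plane subnet when one can be
    determined; otherwise fall back to the primary admin subnet."""
    target = None
    if control_plane_ips:
        # Search the primary admin subnet first, then any additional subnets.
        target = find_subnet(control_plane_ips[0], [admin_subnet, *additional_subnets])
    if target is None:
        target = ipaddress.ip_network(admin_subnet, strict=False)
    return ipaddress.ip_address(vip) in target
```

With control plane nodes in 10.40.2.0/24, a VIP of 10.40.2.50 passes and a VIP in the OIM's 10.40.1.0/24 is rejected; with no control plane IPs supplied, the check reverts to the primary admin subnet.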
Fixes: OMN01D-2534
Signed-off-by: Sujit Jadhav <sujit.jadhav@dell.com>
* fix: wait for kube-controller-manager pod before checking readiness
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* Fix nodes not ready logic
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* update logic
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* Update MinIO S3 credential variable mapping to use s3_access_id and s3_secret_key
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
* Set PXE boot: replace LC check module with POST call
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* fix(provision): fix DNS resolution on slurm/login nodes when dns_enabled is true
Two issues prevent nid hostname resolution on slurm and login nodes:
1. OIM firewall blocks port 53 (DNS) for external access
CoreDNS on the OIM binds to admin_nic_ip:53, but firewalld only opens
ports for DHCP/TFTP/HTTP/etc. Nodes querying 10.x.x.x:53 get their
packets dropped. From the OIM itself, DNS works because podman
interfaces are in the trusted zone (local traffic bypasses the firewall).
Fix: Open port 53/tcp and 53/udp in the OIM firewall when dns_enabled
is true.
2. NetworkManager overwrites /etc/resolv.conf after cloud-init
set-ssh.sh runs nmcli con add/up, which triggers NetworkManager to
overwrite /etc/resolv.conf with DHCP-provided DNS servers, removing the
CoreDNS nameserver entry.
Fix: After set-ssh.sh completes, restore /etc/resolv.conf and lock it
with chattr +i. Matches existing K8s template protection.
Files changed:
- prepare_oim/.../openchami/tasks/configs/firewall.yml (port 53)
- ci-group-slurm_control_node_x86_64.yaml.j2
- ci-group-slurm_node_x86_64.yaml.j2
- ci-group-slurm_node_aarch64.yaml.j2
- ci-group-login_node_x86_64.yaml.j2
- ci-group-login_node_aarch64.yaml.j2
- ci-group-login_compiler_node_x86_64.yaml.j2
- ci-group-login_compiler_node_aarch64.yaml.j2
Only active when dns_enabled is true (no impact on non-DNS deployments).
Signed-off-by: Sujit Jadhav <sujit.jadhav@dell.com> * csi version change from 2.16 to 2.17 Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com> * minor fix * fix(provision): fix Python script embedding in CoreDNS cloud-init template (OMN01D-2533) (#4729) The cloud-init template has two YAML literal block scalar levels: 1. Outer content: | (base indent 6sp) - strips 6 spaces 2. Inner runcmd - | (base indent 4sp after outer) - strips 4 spaces Total: 10 spaces stripped from template lines. Previous heredoc fix used 12sp indent with spaces embedded in the delimiter string (' PYEOF'). After YAML stripping, the terminator line became ' PYEOF' (2sp) but the shell expected ' PYEOF' (12sp literal) — heredoc never terminated. Fix: Place Python code and PYEOF terminator at 10sp in the template. After both YAML levels strip their indentation, these lines land at column 0 in the shell script. The simple delimiter 'PYEOF' matches the column-0 terminator exactly. Python receives column-0 code with correct relative indentation for with/if/else blocks. All lines >= 10sp > 6sp, so the outer YAML content: | block stays intact (lines at < 6sp would prematurely terminate it). Signed-off-by: Sujit Jadhav <sujit.jadhav@dell.com> * fix(provision): use control plane nodes subnet for Calico IP autodetection (OMN01D-2532) (#4724) In multi-subnet deployments, service K8s control plane nodes may reside in an additional_subnet (e.g. 10.40.2.0/24) while the OIM admin NIC is in the primary subnet (e.g. 10.40.1.0/24). Calico's IP_AUTODETECTION_METHOD was hardcoded to admin_nic_cidr (the OIM subnet), causing Calico to fail IP auto-detection on nodes in different subnets with: 'Unable to auto-detect an IPv4 address using interface cidr [10.40.1.0/24]: no valid IPv4 addresses found' The fix: 1. In create_k8s_config_nfs.yml, read the PXE mapping to find the first service_kube_control_plane node's ADMIN_IP and determine which subnet (primary or additional) it belongs to. 
Set calico_cidr to that subnet's CIDR.
2. Update the cloud-init template to use calico_cidr instead of admin_nic_cidr for Calico's IP_AUTODETECTION_METHOD.
The upgrade path is intentionally left unchanged (uses admin_nic_cidr) since multi-subnet is a fresh deployment feature and changing the upgrade flow could impact existing deployments.
Fixes: OMN01D-2532
Signed-off-by: Sujit Jadhav <sujit.jadhav@dell.com>
* defect fix for input validation and pxe mapping check
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
* adding connection for task
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
* Merge pull request #4739 from Rajeshkumar-s2/pub/q2_upgrade
Push software_config.json from artifacts during deploy
* Update image builder container tag for vulnerability (#4728)
* Update container tag for vulnerability
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update requirements.txt
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* tag update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Buildstream upgrade validation
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update upgrade.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update main.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update catalog_rhel.json
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update requirements.txt
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update provision_config.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update provision_config.j2
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* feat(provision): Add user-defined cloud-init config support with centralized Python L2 validation (#4735)
* add cloud init
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update container tag for vulnerability
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* additional cloud init group
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update validate_additional_cloud_init.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* logic update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update requirements.txt
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update provision config
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update validate_additional_cloud_init_section.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* cloud init update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* tag update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* minimal os group update
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Buildstream upgrade validation
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update upgrade.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* moving packages as prohibited
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update main.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update catalog_rhel.json
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update requirements.txt
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update provision_config.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update provision_config.j2
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* fix rollback: Check upgrade manifest status before skipping components during rollback (#4738)
* fix rollback
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* fix(rollback): remove 'skipped' from build_stream_terminal condition
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
* Rollback conditions for slurm and k8s
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
* Update rollback.yml
Signed-off-by: Katakam Rakesh Naga Sai <125246792+Katakam-Rakesh@users.noreply.github.com>
* Lint fixes
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
---------
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Signed-off-by: Katakam Rakesh Naga Sai <125246792+Katakam-Rakesh@users.noreply.github.com>
Co-authored-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
---------
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Signed-off-by: Sujit Jadhav <sujit.jadhav@dell.com>
Signed-off-by: Katakam-Rakesh <katakam.rakesh@dell.com>
Signed-off-by: venu <236371043+Venu-p1@users.noreply.github.com>
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
Signed-off-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Signed-off-by: Katakam Rakesh Naga Sai <125246792+Katakam-Rakesh@users.noreply.github.com>
Co-authored-by: Sujit Jadhav <sujit.jadhav@dell.com>
Co-authored-by: Katakam-Rakesh <katakam.rakesh@dell.com>
Co-authored-by: Katakam Rakesh Naga Sai <125246792+Katakam-Rakesh@users.noreply.github.com>
Co-authored-by: venu <236371043+Venu-p1@users.noreply.github.com>
Co-authored-by: Jagadeesh N V <jagadeesh_n_v@dell.com>
Co-authored-by: sakshi-singla-1735 <sakshi.s@dell.com>
Co-authored-by: snarthan <narthan.s@dell.com>
Co-authored-by: Kratika_Patidar <Kratika.Patidar@dell.com>
Co-authored-by: Rajeshkumar-s2 <rajeshkumar.s2@dell.com>
Co-authored-by: Jagadeesh N V <39791839+jagadeeshnv@users.noreply.github.com>
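The subnet selection described in the Calico IP autodetection fix (OMN01D-2532) above can be sketched as follows. This is a minimal sketch, assuming the control plane node's ADMIN_IP has already been read from the PXE mapping. The actual change lives in create_k8s_config_nfs.yml, and the function name `pick_calico_cidr`, its argument shapes, and the error raised when no subnet matches are hypothetical, not taken from the change.

```python
import ipaddress


def pick_calico_cidr(admin_ip, primary_cidr, additional_cidrs):
    """Return the CIDR of the subnet that contains admin_ip.

    Hypothetical helper mirroring the commit's description: the result is
    what would be assigned to calico_cidr.
    """
    ip = ipaddress.ip_address(admin_ip)
    # Check the primary (OIM admin) subnet first, then each additional subnet.
    for cidr in [primary_cidr, *additional_cidrs]:
        if ip in ipaddress.ip_network(cidr):
            return cidr
    # Assumption: fail loudly instead of silently falling back to the
    # primary subnet, which is the behaviour that caused the original error.
    raise ValueError(f"{admin_ip} is not in any configured subnet")


# Using the example subnets from the commit message; the host address is made up.
calico_cidr = pick_calico_cidr("10.40.2.15", "10.40.1.0/24", ["10.40.2.0/24"])
```

With a control plane node in the additional subnet, the sketch selects 10.40.2.0/24 rather than the OIM's 10.40.1.0/24, which is the case the original hardcoded admin_nic_cidr could not handle.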
Feature branch sync - pub/q2_upgrade to staging
* provision playbook fix when telemetry disabled but service_k8s is true (#4766)
* provision playbook fix when telemetry disabled but service_k8s is true
* removing unused commented task
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
---------
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
* Update validate_additional_cloud_init.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Update validate_additional_cloud_init.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* Fix to handle the intermittent missing header issue in localrepo
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
* Update validate_additional_cloud_init.yml
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
* powerscale example files and loadbalancer IP preserved (#4762)
* upgrade defect fixes and fix for CrashLoopBackOff on pod restart
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* remove stale services and deployments for victoria
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert changes as it is taken care of in another PR
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert idrac terminationgraceperiod
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* rescue block for upgrade telemetry
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert upgrade telemetry
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* default size of idrac telemetry containers
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* add new line
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* input/config/x86_64/rhel/10.0/service_k8s_v1.35.1.json
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update values in upgrade path
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* updating values to integer instead of decimal
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert service k8s json file
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* powerscale telemetry upgrade and preserve loadbalancer IP for Victoria
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* powerscale telemetry version upgrade
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update software_config with updated csi driver version
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* upgrade powerscale values.yml
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert kafka patch variables
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update delegation as mount_on_oim can be false also
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update vars
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* example files for powerscale
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* remove old files
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Fix for victoria loadbalancer IP preservation
* Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
address review comments
* adding shell script
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* address ansible lint issues
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
ansible lint fixes
---------
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* "Update the status for failed deploy and restart stages to ensure cleanup" (#4771)
---------
Signed-off-by: Kratika_Patidar <Kratika.Patidar@dell.com>
Signed-off-by: Abhishek S A <abhishek.sa3@dell.com>
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: Kratika Patidar <Kratika.Patidar@dell.com>
Co-authored-by: pullan1 <sudha.pullalaravu@dell.com>
Co-authored-by: snarthan <narthan.s@dell.com>
Co-authored-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: Rajeshkumar-s2 <rajeshkumar.s2@dell.com>
Signed-off-by: sakshi-singla-1735 <sakshi.s@dell.com>
CSI Version change in software_config.json
* rhel subscription validation fix (#4772)
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
* fix when stdout is not defined (#4774)
* upgrade defect fixes and fix for CrashLoopBackOff on pod restart
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* remove stale services and deployments for victoria
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert changes as it is taken care of in another PR
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert idrac terminationgraceperiod
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* rescue block for upgrade telemetry
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert upgrade telemetry
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* default size of idrac telemetry containers
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* add new line
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* input/config/x86_64/rhel/10.0/service_k8s_v1.35.1.json
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update values in upgrade path
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* updating values to integer instead of decimal
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert service k8s json file
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* powerscale telemetry upgrade and preserve loadbalancer IP for Victoria
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* powerscale telemetry version upgrade
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* ansible lint fixes
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update software_config with updated csi driver version
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* upgrade powerscale values.yml
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* revert kafka patch variables
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update delegation as mount_on_oim can be false also
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* update vars
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* example files for powerscale
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* remove old files
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Fix for victoria loadbalancer IP preservation
* Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
address review comments
* adding shell script
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* address ansible lint issues
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
ansible lint fixes
* update until condition
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* fixed stdout check
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* merge
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
merge
* resolve merge conflict
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
* resolve merge conflict
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
---------
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
---------
Signed-off-by: pullan1 <sudha.pullalaravu@dell.com>
Signed-off-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
Co-authored-by: pullan1 <sudha.pullalaravu@dell.com>
Co-authored-by: priti-parate <140157516+priti-parate@users.noreply.github.com>
sync pub/q2_upgrade with staging