Add OVNDBBackup and OVNDBRestore CRDs for managed backup/restore #559
lmiccini wants to merge 4 commits into
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: lmiccini. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files: Approvers can indicate their approval by writing |
Introduces two new Custom Resource Definitions for automated OVN database backup and restore operations:

OVNDBBackup:
- Scheduled backups via CronJob using ovsdb-client backup
- Configurable retention policy
- TLS support for database connections
- Persistent storage for backup files (survives CR deletion)

OVNDBRestore:
- Phase-based state machine: Validating → ScalingDown → Restoring → ScalingUp → Completed
- Annotation-based replica override prevents higher-level operators from interfering during restore
- Force-deletes pods during scale-down (preStop hooks hang when all RAFT members terminate simultaneously)
- Deletes non-pod-0 PVCs to prevent stale RAFT membership state
- Copies standalone backup to pod-0 PVC and lets ovn-ctl handle the standalone-to-clustered conversion on startup
- Controlled scale-up: pod-0 first, then remaining replicas
- Post-restore DB verification via exec

Also includes:
- Functional tests for both controllers
- Webhooks with validation
- Documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
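For readers following the restore flow, the phase progression in the description can be sketched as a simple ordered state machine. This is only an illustration: the phase names come from the description, while the `Phase` type, the `order` slice, and `NextPhase` are hypothetical names and are not taken from the patch.

```go
package main

import "fmt"

// Phase is a hypothetical type; only the phase names come from the PR description.
type Phase string

const (
	PhaseValidating  Phase = "Validating"
	PhaseScalingDown Phase = "ScalingDown"
	PhaseRestoring   Phase = "Restoring"
	PhaseScalingUp   Phase = "ScalingUp"
	PhaseCompleted   Phase = "Completed"
)

// order lists the phases in the sequence given in the description.
var order = []Phase{
	PhaseValidating,
	PhaseScalingDown,
	PhaseRestoring,
	PhaseScalingUp,
	PhaseCompleted,
}

// NextPhase returns the phase that follows p once p's work is done.
// Completed is treated as terminal and maps to itself (an assumption).
// An unknown phase yields an error.
func NextPhase(p Phase) (Phase, error) {
	for i, cur := range order {
		if cur != p {
			continue
		}
		if i == len(order)-1 {
			return p, nil
		}
		return order[i+1], nil
	}
	return "", fmt.Errorf("unknown phase %q", p)
}

func main() {
	// Walk the machine from the first phase to the terminal one.
	p := PhaseValidating
	for p != PhaseCompleted {
		next, err := NextPhase(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", p, next)
		p = next
	}
}
```

A real controller would advance a phase only after checking cluster state on each reconcile; the sketch leaves that out and shows only the ordering.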
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1f189e72fde444b2908cea46d7183592 ✔️ openstack-k8s-operators-content-provider SUCCESS in 44m 52s |
Tests the full backup/restore lifecycle: deploy OVN with 3 replicas, seed test data, create backup, trigger manual backup job, restore from backup, and verify data survives the restore. Force-deletes pods during cleanup to avoid preStop hook hangs when the entire cluster is torn down. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Delete pod-0's PVC in phaseScaleDown and recreate it in phaseRestore before creating the restore Job. With local-storage, pod-0's PVC may be bound to a PV on a different node than the backup PVC, causing a volume node affinity conflict that prevents the restore Job pod from scheduling. Recreating the PVC lets WaitForFirstConsumer bind it to a PV on the same node as the backup PVC. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
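The scheduling conflict described in this commit can be illustrated with a toy model. It does not use the Kubernetes API; `Volume` and `commonNode` are hypothetical names, and the model assumes each bound volume is pinned to exactly one node while an unbound (WaitForFirstConsumer) volume can follow the pod.

```go
package main

import "fmt"

// Volume is a hypothetical, simplified stand-in for a PVC.
// Node is the node its bound PV is pinned to; an empty Node means the
// claim is still unbound and will bind wherever the first consumer lands.
type Volume struct {
	Name string
	Node string
}

// commonNode reports whether a pod mounting all of vols could be scheduled.
// It returns the single node that every bound volume agrees on, or false
// if two bound volumes are pinned to different nodes. If no volume is
// bound yet, it returns an empty node and true (any node would do).
func commonNode(vols []Volume) (string, bool) {
	node := ""
	for _, v := range vols {
		if v.Node == "" {
			continue // unbound: imposes no constraint
		}
		if node == "" {
			node = v.Node
			continue
		}
		if v.Node != node {
			return "", false // volume node affinity conflict
		}
	}
	return node, true
}

func main() {
	// Before the change: pod-0's PVC is still bound on another node.
	before := []Volume{
		{Name: "backup", Node: "node-a"},
		{Name: "pod-0-data", Node: "node-b"},
	}
	// After the change: pod-0's PVC was deleted and recreated, so it is unbound.
	after := []Volume{
		{Name: "backup", Node: "node-a"},
		{Name: "pod-0-data", Node: ""},
	}

	n, ok := commonNode(before)
	fmt.Printf("before: node=%q schedulable=%v\n", n, ok)
	n, ok = commonNode(after)
	fmt.Printf("after:  node=%q schedulable=%v\n", n, ok)
}
```

In this model the recreated claim places no constraint, so the restore Job pod can land on the backup PVC's node and the new claim binds there.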
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/fac4823f5d0443d2b4f5d73efcc02a72 ✔️ openstack-k8s-operators-content-provider SUCCESS in 46m 27s |
|
recheck |
Allow specifying a BACKUP_TIMESTAMP on backup jobs and a backupTimestamp field on OVNDBRestore so that OVN DB backups can participate in a coordinated full-environment backup/restore workflow alongside Galera and OADP using a single shared timestamp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
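One way a backup job might honor a shared timestamp is sketched below. Only the `BACKUP_TIMESTAMP` variable name comes from the commit message; the `resolveTimestamp` helper, the `exampleNow` value, and the `20060102-150405` layout are assumptions for illustration, and the patch may use a different format.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timestampLayout is an assumed format; the PR does not state the real one.
const timestampLayout = "20060102-150405"

// exampleNow is a fixed instant used so the demonstration is deterministic.
var exampleNow = time.Date(2024, time.January, 2, 3, 4, 5, 0, time.UTC)

// resolveTimestamp returns the externally supplied timestamp when one is
// given, so several backups can share it; otherwise it derives a timestamp
// from now (in UTC) using the assumed layout.
func resolveTimestamp(override string, now time.Time) string {
	if override != "" {
		return override
	}
	return now.UTC().Format(timestampLayout)
}

func main() {
	// BACKUP_TIMESTAMP is the variable named in the commit message.
	ts := resolveTimestamp(os.Getenv("BACKUP_TIMESTAMP"), exampleNow)
	fmt.Println("backup timestamp:", ts)
}
```

Passing the clock value in as a parameter keeps the helper easy to test and makes the fallback behavior explicit.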
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c55d7e0e04dc4eeb99e749d2c1bb0a04 ✔️ openstack-k8s-operators-content-provider SUCCESS in 44m 06s |
|
recheck |
|
I have not yet read the patch, but I'm not sure what the use case is for scheduled ovsdb backups. We already have an active-active cluster, and aside from non-changing initial configuration data, the entire database can be generated from the neutron db and ovn-northd. In addition, any backup will be out of date and will need to be synced with the neutron db, which is basically the same process as not having a backup. |
|
PR needs rebase. |