-
Notifications
You must be signed in to change notification settings - Fork 175
Fix workspace deletion race condition in AMPLS private endpoint #4764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…PLS private endpoints Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
|
/test-extended c32a87a |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19553208613 (with refid (in response to this comment from @marrobi) |
…o fix deletion ordering Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
The test revealed a different root cause. The error was that Private DNS Zones couldn't be deleted because their Virtual Network Links still existed. I've added |
Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
|
/test-extended d66e33f |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19571335811 (with refid (in response to this comment from @marrobi) |
|
/test-extended d66e33f |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19576408604 (with refid (in response to this comment from @marrobi) |
Unit Test Results0 tests 0 ✅ 0s ⏱️ Results for commit e722447. ♻️ This comment has been updated with latest results. |
|
@copilot please resolve: Code update without version change |
Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
|
/test-extended 50d36ae |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19730219587 (with refid (in response to this comment from @marrobi) |
|
/test-extended 50d36ae |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19733426740 (with refid (in response to this comment from @marrobi) |
|
/test-extended 5941585 |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19763804503 (with refid (in response to this comment from @marrobi) |
|
/test-extended 5941585 |
|
🤖 pr-bot 🤖 (in response to this comment from @marrobi) |
|
/test-extended fc12afe |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/20336363140 (with refid (in response to this comment from @marrobi) |
1 similar comment
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/20336363140 (with refid (in response to this comment from @marrobi) |
|
/test-extended f12f533 |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21595816510 (with refid (in response to this comment from @marrobi) |
|
@copilot can you try: Also fix the linting issues. |
…gress errors Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
... Implemented in commit ae93bc4. The |
|
/test-extended ae93bc4 |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21597746061 (with refid (in response to this comment from @marrobi) |
|
/test-extended e722447 |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21603027184 (with refid (in response to this comment from @marrobi) |
|
@copilot we have some linting failing: Failed to load configurations; network.tf:128,1-53: "terraform_azurerm_environment_configuration" module is not found. Did you run "terraform init"?; : Error: "terraform_azurerm_environment_configuration" module is not found. Did you run "terraform init"? 128: module "terraform_azurerm_environment_configuration" { We should not be validating remote modules as we don't run init. We did this for workspaces, but the workflow much use a differnt configuration file for core? |
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21603027184 (with refid (in response to this comment from @marrobi) |
1 similar comment
|
🤖 pr-bot 🤖 🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21603027184 (with refid (in response to this comment from @marrobi) |
Resolves #3194
What is being addressed
Workspace deletions fail intermittently with dependency ordering errors when Terraform attempts to delete Azure Monitor resources. The issue manifests as
AnotherOperationInProgresserrors when deleting the AMPLS private endpoint and its private DNS zone group simultaneously.How is this addressed
private_dns_zone_groupblock from within theazurerm_private_endpointresource and created it as a separateazapi_resourceusing the Azure Resource Manager API (Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-11-01). This provides explicit control over deletion ordering and avoids race conditions.ampls_app_insightsandampls_log_anaytics) to the private endpoint'sdepends_onlist to ensure proper creation and deletion orderingTechnical Details
The fix addresses the root cause by separating the DNS zone group configuration from the private endpoint resource, allowing Terraform to manage the deletion order explicitly:
Before (inline block causing race conditions):
After (separate resource with explicit dependencies):
The explicit
depends_onrelationships ensure:Implementation Note
This solution uses the AzAPI provider (already required for AMPLS resources) to create the private DNS zone group as a separate resource. This approach provides the same benefits as the recommended
azurerm_private_endpoint_private_dns_zone_groupresource (which is not available in the azurerm provider version used in the porter bundle build environment) while maintaining compatibility with existing provider versions.References
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.