Contributor

Copilot AI commented Nov 20, 2025

Resolves #3194

What is being addressed

Workspace deletions fail intermittently with dependency ordering errors when Terraform attempts to delete Azure Monitor resources. The issue manifests as AnotherOperationInProgress errors when deleting the AMPLS private endpoint and its private DNS zone group simultaneously.

How is this addressed

  • Extracted the DNS zone group into a separate azapi_resource: removed the inline private_dns_zone_group block from the azurerm_private_endpoint resource and recreated it as a standalone azapi_resource of type Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-11-01. This gives Terraform explicit control over deletion ordering and avoids the race condition.
  • Enhanced dependency management: added both AMPLS scoped services (ampls_app_insights and ampls_log_anaytics) to the private endpoint's depends_on list to ensure proper creation and deletion ordering.
  • Updated CHANGELOG.md with a bug fix entry.
  • Incremented the workspace base template version from 2.7.1 to 2.7.2.

Technical Details

The fix addresses the root cause by separating the DNS zone group configuration from the private endpoint resource, allowing Terraform to manage the deletion order explicitly:

Before (inline block causing race conditions):

resource "azurerm_private_endpoint" "azure_monitor_private_endpoint" {
  # ... resource configuration ...
  
  private_dns_zone_group {
    name = "azure-monitor-private-dns-zone-group"
    private_dns_zone_ids = [...]
  }
  
  depends_on = [
    azurerm_monitor_private_link_scoped_service.ampls_app_insights,
  ]
}

After (separate resource with explicit dependencies):

resource "azurerm_private_endpoint" "azure_monitor_private_endpoint" {
  # ... resource configuration without private_dns_zone_group ...
  
  lifecycle { ignore_changes = [tags] }
  
  depends_on = [
    azurerm_monitor_private_link_scoped_service.ampls_app_insights,
    azurerm_monitor_private_link_scoped_service.ampls_log_anaytics,
  ]
}

resource "azapi_resource" "azure_monitor_dns_zone_group" {
  type      = "Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-11-01"
  name      = "azure-monitor-private-dns-zone-group"
  parent_id = azurerm_private_endpoint.azure_monitor_private_endpoint.id
  
  body = {
    properties = {
      privateDnsZoneConfigs = [
        {
          name = "privatelink-monitor-azure-com"
          properties = {
            privateDnsZoneId = var.azure_monitor_dns_zone_id
          }
        },
        # ... additional DNS zone configs ...
      ]
    }
  }
  
  depends_on = [
    azurerm_private_endpoint.azure_monitor_private_endpoint,
  ]
}

The explicit depends_on relationships ensure:

  • Creation order: Scoped services are fully configured → Private endpoint is created → DNS zone group is configured
  • Deletion order: DNS zone group is deleted → Private endpoint is deleted → Scoped services are removed
  • No race conditions: Azure operations on the private endpoint and DNS zone group are properly sequenced

Implementation Note

This solution uses the AzAPI provider (already required for AMPLS resources) to create the private DNS zone group as a separate resource. It provides the same benefits as the recommended azurerm_private_endpoint_private_dns_zone_group resource, which is not available in the azurerm provider version used in the Porter bundle build environment, while remaining compatible with the existing provider versions.
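As a rough illustration of the provider requirement mentioned above, a module using azapi_resource might declare the provider as sketched here. The version constraint is an assumption and is not taken from this PR; passing body as a native HCL object, as in the snippet above, is assumed to need a recent AzAPI release, since older releases expected a jsonencode string.

```hcl
terraform {
  required_providers {
    # Assumption: the constraint below is illustrative only; check the
    # versions actually pinned in the bundle before changing anything.
    azapi = {
      source  = "Azure/azapi"
      version = ">= 2.0.0"
    }
  }
}
```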

…PLS private endpoints

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix workspace deletion failure due to AnotherOperationInProgress error Fix workspace deletion race condition in AMPLS private endpoint Nov 20, 2025
Copilot AI requested a review from marrobi November 20, 2025 22:15
Member

marrobi commented Nov 20, 2025

/test-extended c32a87a

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19553208613 (with refid d81faf5c)

(in response to this comment from @marrobi)

…o fix deletion ordering

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 21, 2025

@copilot this failed, same issue, see https://productionresultssa1.blob.core.windows.net/actions-results/792780c6-7f37-45d1-8de6-7062b4b8358e/workflow-job-run-9bd45ba8-ed52-5b06-ac86-bd5f0db51320/log...

The test revealed a different root cause. The error was that Private DNS Zones couldn't be deleted because their Virtual Network Links still existed. I've added create_before_destroy = true to all Azure Monitor DNS zones and their VNet links to ensure proper deletion ordering. Changes are in commit 5d9bb7d.
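A minimal sketch of what such a lifecycle setting could look like on one zone and its link follows. All resource names, variables, and values here are hypothetical; this is not the code from commit 5d9bb7d.

```hcl
# Assumed inputs, declared so the fragment is self-contained.
variable "resource_group_name" {
  type = string
}

variable "virtual_network_id" {
  type = string
}

resource "azurerm_private_dns_zone" "azure_monitor" {
  name                = "privatelink.monitor.azure.com"
  resource_group_name = var.resource_group_name

  lifecycle {
    create_before_destroy = true
  }
}

resource "azurerm_private_dns_zone_virtual_network_link" "azure_monitor" {
  name                  = "azure-monitor-link" # illustrative name
  resource_group_name   = var.resource_group_name
  private_dns_zone_name = azurerm_private_dns_zone.azure_monitor.name
  virtual_network_id    = var.virtual_network_id
  registration_enabled  = false

  lifecycle {
    create_before_destroy = true
  }
}
```

Because the link refers to the zone by reference, Terraform already infers that the link must be destroyed before the zone. create_before_destroy mainly affects what happens when a resource is replaced.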

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Copilot AI changed the title Fix workspace deletion race condition in AMPLS private endpoint Fix workspace deletion race condition in AMPLS private endpoint and DNS zones Nov 21, 2025
Member

marrobi commented Nov 21, 2025

/test-extended d66e33f

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19571335811 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 21, 2025

/test-extended d66e33f

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19576408604 (with refid d81faf5c)

(in response to this comment from @marrobi)


github-actions bot commented Nov 21, 2025

Unit Test Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit e722447.

♻️ This comment has been updated with latest results.

Member

marrobi commented Nov 21, 2025

@copilot please resolve: Code update without version change

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 21, 2025

@copilot please resolve: Code update without version change

Version updates applied in commit 38788f5:

  • Workspace base template: 2.7.1 → 2.7.2
  • Core: 0.16.10 → 0.16.11
  • CHANGELOG.md updated with component version entries

Member

marrobi commented Nov 21, 2025

Member

marrobi commented Nov 27, 2025

/test-extended 50d36ae

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19730219587 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 27, 2025

/test-extended 50d36ae

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19733426740 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 28, 2025

/test-extended 5941585

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19763804503 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Dec 17, 2025

/test-extended 5941585

@github-actions

🤖 pr-bot 🤖

⚠️ Cannot run tests as PR is not mergeable. Ensure that the PR is open and doesn't have any conflicts.

(in response to this comment from @marrobi)

Member

marrobi commented Dec 18, 2025

/test-extended fc12afe

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/20336363140 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Feb 2, 2026

/test-extended f12f533


github-actions bot commented Feb 2, 2026

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21595816510 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Feb 2, 2026

@copilot can you try:

resource "azurerm_private_endpoint" "azure_monitor_private_endpoint" {
  name                = "pe-ampls-${var.tre_id}-ws-${local.short_workspace_id}"
  # ... other config ...
  
  # REMOVE the private_dns_zone_group block from here
  
  lifecycle { ignore_changes = [tags] }
}

resource "azapi_resource" "azure_monitor_dns_zone_group" {
  type      = "Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-11-01"
  name      = "azure-monitor-private-dns-zone-group"
  parent_id = azurerm_private_endpoint.azure_monitor_private_endpoint.id
  
  # ... DNS zone configs ...
}

Also fix the linting issues.

…gress errors

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 2, 2026

@copilot can you try:

...

Implemented in commit ae93bc4. The private_dns_zone_group has been extracted from the inline block and created as a separate azapi_resource using the Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-11-01 API. This provides explicit control over deletion ordering while using the already-required azapi provider. Code review completed with no blocking issues.

Member

marrobi commented Feb 2, 2026

/test-extended ae93bc4


github-actions bot commented Feb 2, 2026

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21597746061 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Feb 2, 2026

/test-extended e722447


github-actions bot commented Feb 2, 2026

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21603027184 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Feb 2, 2026

@copilot we have some linting failing:

Failed to load configurations; network.tf:128,1-53: "terraform_azurerm_environment_configuration" module is not found. Did you run "terraform init"?; :

Error: "terraform_azurerm_environment_configuration" module is not found. Did you run "terraform init"?

on network.tf line 128, in module "terraform_azurerm_environment_configuration":

128: module "terraform_azurerm_environment_configuration" {

We should not be validating remote modules as we don't run init. We did this for workspaces, but the workflow must use a different configuration file for core?
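One way this could be configured, assuming the linter is TFLint 0.50 or later (the error format suggests TFLint, but the version and the config file location are assumptions), is a config that limits module inspection to local modules:

```hcl
# .tflint.hcl (hypothetical location for the core Terraform lint config)
config {
  # "local" inspects only modules referenced by local paths, so remote
  # modules do not have to be downloaded by `terraform init` first.
  # "none" would skip module inspection entirely.
  call_module_type = "local"
}
```

Older TFLint releases used a boolean module setting instead of call_module_type, so the exact key depends on the version the workflow installs.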


github-actions bot commented Feb 3, 2026

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/21603027184 (with refid d81faf5c)

(in response to this comment from @marrobi)
