Duplicate AuthorizationModels cause Store failures when multiple APIExports include same resource types #317

@ifdotpy

Summary

When multiple APIExports include the same resource types (e.g., private-llm re-exporting postgres resources), the security-operator creates duplicate AuthorizationModels in different provider workspaces. This causes the organization's FGA Store to fail with "duplicate type definition" errors, breaking ALL user authorization for that organization.

Impact

  • Severity: High
  • Affected environments: cc-d2, cc-two (both had showroom Store broken)
  • User impact: Users get Forbidden with NoOpinion on ALL resource operations (create, list, watch) until the Store is manually fixed

Symptoms

Store shows Ready: False with errors like:

transformation error at line=12, column=5: duplicate type definition postgresql_cnpg_io_cluster
transformation error at line=12, column=5: duplicate type definition ui_privatellms_msp_chatuiinstance
transformation error at line=7, column=9: relation create_postgresql_cnpg_io_clusters already exists on type core_namespace
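For triage, the duplicated type names can be pulled out of the Store errors mechanically. The following is a minimal sketch, assuming the messages keep the "duplicate type definition <type>" wording quoted above; `duplicateTypes` is a hypothetical helper, not part of the security-operator.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// dupTypeRe matches the "duplicate type definition <type>" part of a Store
// error message, in the format quoted in the Symptoms section.
var dupTypeRe = regexp.MustCompile(`duplicate type definition (\S+)`)

// duplicateTypes returns the sorted, de-duplicated FGA type names reported as
// duplicates. Messages that do not match (e.g. relation errors) are ignored.
func duplicateTypes(messages []string) []string {
	seen := map[string]bool{}
	for _, msg := range messages {
		if m := dupTypeRe.FindStringSubmatch(msg); m != nil {
			seen[m[1]] = true
		}
	}
	types := make([]string, 0, len(seen))
	for t := range seen {
		types = append(types, t)
	}
	sort.Strings(types)
	return types
}

func main() {
	messages := []string{
		"transformation error at line=12, column=5: duplicate type definition postgresql_cnpg_io_cluster",
		"transformation error at line=12, column=5: duplicate type definition ui_privatellms_msp_chatuiinstance",
		"transformation error at line=7, column=9: relation create_postgresql_cnpg_io_clusters already exists on type core_namespace",
	}
	for _, t := range duplicateTypes(messages) {
		fmt.Println(t)
	}
}
```

Each reported type name points at a resource for which more than one AuthorizationModel exists.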

Root Cause

File: internal/subroutine/authorization_model_generation.go:243-289

The Process function creates AuthorizationModels for ALL resources in an APIExport without checking if another provider already created one for the same resource type:

for _, latestResourceSchema := range apiExport.Spec.Resources {
    // ...
    model := securityv1alpha1.AuthorizationModel{
        ObjectMeta: metav1.ObjectMeta{
            // Name is just <resource>-<org>, no provider identifier
            Name: fmt.Sprintf("%s-%s", resourceSchema.Spec.Names.Plural, accountInfo.Spec.Organization.Name),
        },
    }
    // Creates in the provider workspace - no duplicate check
    _, err = controllerutil.CreateOrUpdate(ctx, apiExportCluster.GetClient(), &model, ...)
}

Why duplicates occur

| Provider | APIExport includes | Creates AuthorizationModels |
|---|---|---|
| cnpg-postgres | clusters, databases, secrets | clusters-showroom, databases-showroom, secrets-showroom |
| private-llm | llminstances, clusters, databases, secrets (re-exported) | llminstances-showroom, clusters-showroom, databases-showroom, secrets-showroom |
| chat-ui | chatuiinstances | chatuiinstances-showroom |
| cnpg-postgres | clusters, chatuiinstances (unclear why) | chatuiinstances-showroom (duplicate) |

When private-llm re-exports postgres resources as dependencies, duplicate AuthorizationModels are created. The Store merges all models with the same storeRef and fails due to duplicate FGA type definitions.
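The collision can be illustrated with a small sketch that groups AuthorizationModels by name across provider workspaces. `modelRef` and `findDuplicates` are hypothetical, simplified stand-ins (the real objects carry far more fields); the sample data comes from the cc-d2 findings in this issue.

```go
package main

import (
	"fmt"
	"sort"
)

// modelRef is a hypothetical, reduced view of an AuthorizationModel:
// only the provider workspace and the object name are kept.
type modelRef struct {
	Workspace string
	Name      string
}

// findDuplicates groups models by name and returns every name that occurs
// more than once, mapped to the sorted list of workspaces holding it.
func findDuplicates(models []modelRef) map[string][]string {
	byName := map[string][]string{}
	for _, m := range models {
		byName[m.Name] = append(byName[m.Name], m.Workspace)
	}
	dups := map[string][]string{}
	for name, workspaces := range byName {
		if len(workspaces) > 1 {
			sort.Strings(workspaces)
			dups[name] = workspaces
		}
	}
	return dups
}

func main() {
	models := []modelRef{
		{Workspace: "cnpg-postgres", Name: "clusters-showroom"},
		{Workspace: "cnpg-postgres", Name: "databases-showroom"},
		{Workspace: "cnpg-postgres", Name: "secrets-showroom"},
		{Workspace: "private-llm", Name: "llminstances-showroom"},
		{Workspace: "private-llm", Name: "clusters-showroom"},
		{Workspace: "private-llm", Name: "databases-showroom"},
		{Workspace: "private-llm", Name: "secrets-showroom"},
		{Workspace: "chat-ui", Name: "chatuiinstances-showroom"},
		{Workspace: "cnpg-postgres", Name: "chatuiinstances-showroom"},
	}
	dups := findDuplicates(models)
	names := make([]string, 0, len(dups))
	for name := range dups {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %v\n", name, dups[name])
	}
}
```

Every name reported here corresponds to an FGA type that the Store would receive twice when it merges models with the same storeRef.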

Evidence

cc-d2 (before fix)

Store showroom: Ready=False
17 errors including:
- duplicate type definition postgresql_cnpg_io_cluster
- duplicate type definition postgresql_cnpg_io_database
- duplicate type definition secrets_postgresql_cnpg_io_secret
- duplicate type definition ui_privatellms_msp_chatuiinstance
- duplicate type definition camaraproject_org_deviceconnectivity

Duplicates found:

  • private-llm workspace: clusters-showroom, databases-showroom, secrets-showroom (duplicates of cnpg-postgres)
  • cnpg-postgres workspace: chatuiinstances-showroom (duplicate of chat-ui)
  • tsystems workspace: deviceconnectivity-showroom AND deviceconnectivities-showroom (singular vs plural)

cc-two (before fix)

Store showroom: Ready=False
4 errors:
- duplicate type definition ui_privatellms_msp_chatuiinstance
- relation create/list/watch_ui_privatellms_msp_chatuiinstances already exists

Duplicate found:

  • cnpg-postgres workspace: chatuiinstances-showroom (duplicate of chat-ui)

Manual Fix Applied

  1. Delete the duplicate AuthorizationModels from the wrong provider workspaces
  2. Remove finalizers from stuck deletions: kubectl patch authorizationmodel <name> --type=json -p='[{"op": "remove", "path": "/metadata/finalizers"}]'
  3. Force Store reconciliation: kubectl annotate store <org> reconcile.platform-mesh.io/requestedAt="$(date +%s)" --overwrite

Suggested Fixes

Option 1: Skip re-exported resources (Recommended)

Only create AuthorizationModels for resources whose API group is contained in the APIExport's name (a substring match):

for _, latestResourceSchema := range apiExport.Spec.Resources {
    // Skip if this resource belongs to a different provider
    if !strings.Contains(apiExport.Name, resourceSchema.Spec.Group) {
        continue
    }
    // ... create AuthorizationModel
}
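One caveat with the substring check: in Go, `strings.Contains(s, "")` is always true, so a resource with an empty API group would be treated as owned by every APIExport. The sketch below isolates the check and adds a stricter variant. `ownsResource` and `ownsResourceStrict` are hypothetical helpers, and the strict variant assumes that APIExport names equal the API group they own, which this issue only hints at through the example names under Option 2.

```go
package main

import (
	"fmt"
	"strings"
)

// ownsResource mirrors the substring check proposed in Option 1.
func ownsResource(exportName, group string) bool {
	return strings.Contains(exportName, group)
}

// ownsResourceStrict is a hypothetical stricter variant: it rejects an empty
// group and requires an exact match. This assumes APIExport names equal the
// API group they own, which is not confirmed by the source.
func ownsResourceStrict(exportName, group string) bool {
	return group != "" && exportName == group
}

func main() {
	cases := []struct{ export, group string }{
		{"postgresql.cnpg.io", "postgresql.cnpg.io"},  // provider's own resource
		{"llm.privatellms.msp", "postgresql.cnpg.io"}, // re-exported resource
		{"llm.privatellms.msp", ""},                   // empty (core) group
	}
	for _, c := range cases {
		fmt.Printf("export=%q group=%q contains=%v strict=%v\n",
			c.export, c.group,
			ownsResource(c.export, c.group),
			ownsResourceStrict(c.export, c.group))
	}
}
```

Whichever variant is used, the check only helps if exactly one APIExport claims each group; otherwise no AuthorizationModel would be created for that resource at all.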

Option 2: Include provider in AuthorizationModel name

Change naming to avoid collisions:

Name: fmt.Sprintf("%s-%s-%s", resourceSchema.Spec.Names.Plural, apiExport.Name, accountInfo.Spec.Organization.Name)
// e.g., "clusters-postgresql.cnpg.io-showroom" vs "clusters-llm.privatellms.msp-showroom"

Option 3: Check for existing AuthorizationModel before creating

Before creating an AuthorizationModel, query all provider workspaces for an existing AuthorizationModel with the same resource and organization, and skip creation if one is found.

Option 4: Deduplicate in Store controller

When merging AuthorizationModels, detect and skip duplicates instead of failing.
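A minimal sketch of such a merge step, keeping the first definition of each type and recording the rest. `typeDef` and `mergeTypes` are hypothetical; the real Store controller works on full FGA models, and skipping is only safe under the assumption that duplicate definitions are semantically identical.

```go
package main

import "fmt"

// typeDef is a hypothetical, reduced view of one FGA type definition.
type typeDef struct {
	Name   string // FGA type name, e.g. "postgresql_cnpg_io_cluster"
	Source string // AuthorizationModel the definition came from
}

// mergeTypes keeps the first definition seen for each type name and returns
// the later ones separately so they can be logged or surfaced in status.
func mergeTypes(defs []typeDef) (merged, skipped []typeDef) {
	seen := map[string]bool{}
	for _, d := range defs {
		if seen[d.Name] {
			skipped = append(skipped, d)
			continue
		}
		seen[d.Name] = true
		merged = append(merged, d)
	}
	return merged, skipped
}

func main() {
	defs := []typeDef{
		{Name: "postgresql_cnpg_io_cluster", Source: "cnpg-postgres/clusters-showroom"},
		{Name: "ui_privatellms_msp_chatuiinstance", Source: "chat-ui/chatuiinstances-showroom"},
		{Name: "postgresql_cnpg_io_cluster", Source: "private-llm/clusters-showroom"},
	}
	merged, skipped := mergeTypes(defs)
	for _, d := range merged {
		fmt.Printf("keep %s (from %s)\n", d.Name, d.Source)
	}
	for _, d := range skipped {
		fmt.Printf("skip %s (from %s)\n", d.Name, d.Source)
	}
}
```

Returning the skipped definitions rather than dropping them silently lets the controller report the duplicates as a warning while the Store stays Ready.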

Additional Issues Found

Finalizer stuck on deletion

When AuthorizationModels are deleted, the finalizer core.platform-mesh.io/fga-tuples doesn't get removed, leaving objects stuck in terminating state indefinitely.

tsystems singular vs plural naming

The tsystems workspace had both deviceconnectivity-showroom and deviceconnectivities-showroom, which suggests inconsistent resource naming or a bug in how the resource name is determined.

Steps to Reproduce

  1. Have two APIExports that include the same resource type (e.g., private-llm re-exporting postgres clusters)
  2. Create an account in an organization
  3. Bind to both APIExports from the account
  4. Check Store status: kubectl get store <org> -o wide
  5. Store will show Ready: False with duplicate type definition errors

Metadata

  • Assignees: No one assigned
  • Labels: bug
  • Status: ForRefinement
  • Milestone: No milestone
  • Relationships: None yet
  • Development: No branches or pull requests