Deployment Guide: 3DGS Video Processor

Production deployment patterns and operational best practices.

Container Deployment
Azure Container Apps Job (GPU) — azd
- Reusing Existing Azure Resources
- Full Quickstart and Teardown
Batch Mode (Azure SDK)
Resource Requirements
Storage Configuration
Azure Blob Storage Setup
Security Hardening
Monitoring and Logging

Container Deployment

Docker Compose (Recommended for Single Server)

Create docker-compose.yml:

version: '3.8'

services:
  3dgs-processor:
    image: 3dgs-processor:gpu
    container_name: 3dgs-processor
    restart: unless-stopped
    
    # Resource limits
    deploy:
      resources:
        limits:
          cpus: '8'
          memory: 16G
        reservations:
          cpus: '4'
          memory: 8G
    
    # Volume mounts
    volumes:
      - ./input:/data/input
      - ./output:/data/output
      - ./processed:/data/processed
      - ./error:/data/error
      - ./config.yaml:/app/config.yaml:ro
    
    # Environment configuration
    environment:
      INPUT_PATH: /data/input
      OUTPUT_PATH: /data/output
      PROCESSED_PATH: /data/processed
      ERROR_PATH: /data/error
      BACKEND: gsplat
      LOG_LEVEL: info
      STABILITY_TIMEOUT_SECS: 30
      MAX_RETRIES: 3
      RETENTION_DAYS: 30
      MIN_DISK_SPACE_GB: 50
    
    # Logging configuration
    logging:
      driver: json-file
      options:
        max-size: "100m"
        max-file: "5"
    
    # Health check
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

Deploy:

docker-compose up -d
docker-compose logs -f 3dgs-processor

Kubernetes Deployment

Create k8s/deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 3dgs-processor
  namespace: 3dgs
spec:
  replicas: 1  # Single instance (sequential processing)
  selector:
    matchLabels:
      app: 3dgs-processor
  template:
    metadata:
      labels:
        app: 3dgs-processor
    spec:
      containers:
      - name: processor
        image: 3dgs-processor:gpu
        imagePullPolicy: Always
        
        resources:
          requests:
            cpu: "4"
            memory: "8Gi"
          limits:
            cpu: "8"
            memory: "16Gi"
        
        env:
        - name: INPUT_PATH
          value: "/data/input"
        - name: OUTPUT_PATH
          value: "/data/output"
        - name: PROCESSED_PATH
          value: "/data/processed"
        - name: ERROR_PATH
          value: "/data/error"
        - name: BACKEND
          valueFrom:
            configMapKeyRef:
              name: 3dgs-config
              key: backend
        
        volumeMounts:
        - name: data
          mountPath: /data
        - name: config
          mountPath: /app/config.yaml
          subPath: config.yaml
          readOnly: true
        
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
        
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
      
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: 3dgs-pvc
      - name: config
        configMap:
          name: 3dgs-config

Create k8s/pvc.yaml for persistent storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: 3dgs-pvc
  namespace: 3dgs
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 500Gi

Deploy:

kubectl create namespace 3dgs
kubectl apply -f k8s/pvc.yaml
kubectl apply -f k8s/deployment.yaml
kubectl logs -f deployment/3dgs-processor -n 3dgs

Azure Container Instances

Deploy serverless containers with Azure Files or Blobfuse2:

Option 1: Using Azure Files (Simpler)

# Set variables
RESOURCE_GROUP="<your-resource-group>"
LOCATION="eastus"
CONTAINER_NAME="3dgs-processor"
IMAGE="youracr.azurecr.io/3dgs-processor:gpu"
STORAGE_ACCOUNT="<your-storage-account>"
STORAGE_KEY="<your-storage-key>"

# Create container with Azure Files mount
az container create \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --image $IMAGE \
  --cpu 8 \
  --memory 16 \
  --os-type Linux \
  --restart-policy Always \
  --environment-variables \
    INPUT_PATH=/data/input \
    OUTPUT_PATH=/data/output \
    PROCESSED_PATH=/data/processed \
    ERROR_PATH=/data/error \
    BACKEND=gsplat \
    LOG_LEVEL=info \
  --azure-file-volume-account-name $STORAGE_ACCOUNT \
  --azure-file-volume-account-key "$STORAGE_KEY" \
  --azure-file-volume-share-name input \
  --azure-file-volume-mount-path /data/input

Option 2: Using Managed Identity with Azure Blob Storage

Create aci-deployment.yaml:

apiVersion: '2021-09-01'
location: eastus
name: 3dgs-processor-group
properties:
  containers:
  - name: 3dgs-processor
    properties:
      image: youracr.azurecr.io/3dgs-processor:gpu
      resources:
        requests:
          cpu: 8
          memoryInGB: 16
      environmentVariables:
      - name: INPUT_PATH
        value: /data/input
      - name: OUTPUT_PATH
        value: /data/output
      - name: PROCESSED_PATH
        value: /data/processed
      - name: ERROR_PATH
        value: /data/error
      - name: BACKEND
        value: gsplat
      - name: AZURE_STORAGE_ACCOUNT
        value: <your-storage-account>
      volumeMounts:
      - name: input-volume
        mountPath: /data/input
      - name: output-volume
        mountPath: /data/output
  
  volumes:
  - name: input-volume
    azureFile:
      shareName: input
      storageAccountName: <your-storage-account>
      storageAccountKey: <key>
  - name: output-volume
    azureFile:
      shareName: output
      storageAccountName: <your-storage-account>
      storageAccountKey: <key>
  
  osType: Linux
  restartPolicy: Always
  
  identity:
    type: SystemAssigned
type: Microsoft.ContainerInstance/containerGroups

Deploy:

az container create --resource-group $RESOURCE_GROUP --file aci-deployment.yaml

Monitor ACI deployment:

# View logs
az container logs --resource-group $RESOURCE_GROUP --name $CONTAINER_NAME --follow

# Check status
az container show --resource-group $RESOURCE_GROUP --name $CONTAINER_NAME --query instanceView.state

# Execute commands inside container
az container exec --resource-group $RESOURCE_GROUP --name $CONTAINER_NAME --exec-command "/bin/sh"

# Delete when done
az container delete --resource-group $RESOURCE_GROUP --name $CONTAINER_NAME

Cost Optimization:

Use Azure Container Instances for variable workloads (pay per second)
Use AKS for consistent high-volume processing
Set --restart-policy OnFailure for one-time jobs
Use spot instances for non-critical workloads

Azure Container Apps Job (GPU) — azd

Deploy the 3DGS processor as a serverless GPU job on Azure Container Apps using the Azure Developer CLI (azd). The job runs in batch mode: download videos from Azure Blob Storage → extract frames → COLMAP reconstruction → gsplat GPU training → export PLY/SPLAT → upload results → exit.

Key characteristics:

Serverless GPU — NVIDIA T4 (16 GB VRAM) via Consumption-GPU-NC8as-T4 workload profile
No local Docker required — images are built remotely via ACR Tasks
Batch mode — single job execution, no long-running container
Managed Identity — user-assigned MI for RBAC-based access to ACR and Storage
RBAC separation — infrastructure provisioning and role assignments are separate steps
Existing resource preservation — optionally reuse pre-existing Resource Groups, ACR, Storage Accounts, and Container Apps Environments. These resources are referenced via Bicep's existing keyword and are not deleted by azd down

Prerequisites

Requirement	Purpose
Azure CLI (`az`)	Azure resource management
Azure Developer CLI (`azd`)	Infrastructure-as-code orchestration
Azure subscription	With GPU quota in your target region
Test data	Download via `./scripts/e2e/01-download-testdata.sh`

No local Docker daemon is needed. The GPU image is built remotely on Azure Container Registry Tasks.

Quick Start

# 1. Initialize the azd environment
azd init

# 2. Configure environment
azd env set AZURE_LOCATION swedencentral
azd env set USE_GPU true

# 3. (Optional) Reuse existing Azure resources instead of creating new ones
#    See "Reusing Existing Azure Resources" below for details.
./scripts/configure-existing-resources.sh

# 4. Provision infrastructure (also builds the GPU image on ACR — ~40 min)
azd provision

# 5. (Privileged user) Assign RBAC roles to the Managed Identity
./infra/scripts/assign-rbac.sh

# 6. Verify RBAC assignments
./infra/scripts/verify-rbac.sh

# 7. Upload test data to Azure Blob Storage
./infra/scripts/upload-testdata.sh

# 8. Run the GPU job
./infra/scripts/run-job.sh --logs

What Gets Provisioned

azd provision creates the following Azure resources (all in a single resource group). If you have configured existing resources via ./scripts/configure-existing-resources.sh, those resources are referenced (Bicep existing keyword) rather than created — their lifecycle is not managed by this deployment, and azd down will not delete them.

Resource	Bicep Module	Purpose	Can Reuse Existing?
Resource Group	`main.bicep`	`rg-<env-name>`	✅ Yes
User-Assigned Managed Identity	`modules/managed-identity.bicep`	Authenticate to ACR and Storage	No (always created)
Azure Container Registry (Basic)	`modules/acr.bicep`	Store GPU Docker images	✅ Yes
Storage Account + 4 containers	`modules/storage.bicep`	`input`, `output`, `processed`, `error`	✅ Yes
Log Analytics Workspace	`modules/monitoring.bicep`	Container log aggregation	✅ Yes
Container Apps Environment	`modules/container-apps-env.bicep`	GPU workload profile (T4)	✅ Yes
Container Apps Job (Manual trigger)	`modules/container-apps-job.bicep`	The processor job itself	No (always created)

After provisioning, the postprovision hook automatically:

Builds the GPU Docker image via ACR Tasks (infra/scripts/hooks/acr-build.sh)
Updates the Container Apps Job with the new image

RBAC Requirements — Read This Carefully

RBAC role assignments are intentionally separated from infrastructure provisioning. This is because in many organizations, the person deploying infrastructure does not have permission to assign IAM roles. The two operations may require different privilege levels.

Permissions Required by the Deployer (the person running `azd provision`)

The deployer needs these permissions on the Azure subscription or resource group:

Permission	Why
`Microsoft.Resources/subscriptions/resourceGroups/write`	Create the resource group
`Microsoft.ContainerRegistry/registries/*`	Create ACR and push images
`Microsoft.Storage/storageAccounts/*`	Create storage account and containers
`Microsoft.App/managedEnvironments/*`	Create Container Apps Environment
`Microsoft.App/jobs/*`	Create Container Apps Job
`Microsoft.ManagedIdentity/userAssignedIdentities/*`	Create managed identity
`Microsoft.OperationalInsights/workspaces/*`	Create Log Analytics workspace

Typically the Contributor built-in role on the subscription is sufficient.

The preprovision hook (infra/scripts/hooks/preprovision.sh) also attempts to assign deployer-level RBAC (AcrPush, Storage Blob Data Contributor) so the deployer can push images and upload test data. This requires the deployer to have User Access Administrator or Owner on the resource group. If the deployer lacks this permission, the hook continues (it's non-blocking) — a privileged user can assign these roles later.

Permissions Required for the Managed Identity (assigned after provisioning)

The Managed Identity needs two role assignments so the container job can pull images and read/write blobs at runtime:

Role	Scope	Role Definition ID	Purpose
AcrPull	Container Registry	`7f951dda-4ed3-4680-a7ca-43fe172d538d`	Pull GPU image from ACR
Storage Blob Data Contributor	Storage Account	`ba92f5b4-2d11-453d-a403-e96b0029c9fe`	Download input videos, upload outputs, move blobs

Who Can Assign These Roles?

A user with one of these roles on the target scope:

Owner (full control, including RBAC)
User Access Administrator (RBAC management only)
A custom role with Microsoft.Authorization/roleAssignments/write permission

If the deployer does not have these permissions, the azd provision step will succeed but the job will fail at runtime with authentication errors (HTTP 403 on Storage or image pull failures on ACR). Have a privileged user run the RBAC scripts.

Assigning RBAC Roles

# Assign roles via Azure CLI (reads values from azd env automatically)
./infra/scripts/assign-rbac.sh

# Or assign via Bicep deployment (alternative)
./infra/scripts/assign-rbac.sh --use-bicep

Verifying RBAC Roles

# Check that both roles are assigned
./infra/scripts/verify-rbac.sh

Expected output when roles are correctly assigned:

🔍 Verifying RBAC role assignments for Managed Identity...
   Principal ID: <managed-identity-principal-id>

  ✅ AcrPull on Container Registry
  ✅ Storage Blob Data Contributor on Storage Account

✅ All RBAC role assignments are in place.

If any roles are missing:

  ❌ AcrPull on Container Registry — MISSING

⚠️  1 RBAC role assignment(s) missing.
   Run './infra/scripts/assign-rbac.sh' as a privileged user to fix.

What Fails Without RBAC

Missing Role	Symptom
AcrPull	Job execution fails immediately — Container Apps cannot pull the image. The execution shows `Failed` status with no container logs (image never starts).
Storage Blob Data Contributor	Container starts but fails with `Failed to download input blobs from Azure Blob Storage` — the managed identity token is rejected with HTTP 403.

Uploading Test Data

The South Building dataset (128 multi-view images from UNC Chapel Hill) is used for testing. The upload script downloads it if needed, creates 3 test videos, and uploads them:

# Download test data + upload to blob storage (default prefix: south_building/)
./infra/scripts/upload-testdata.sh

# Custom blob prefix
./infra/scripts/upload-testdata.sh --prefix my_scene/

# Download only (no upload)
./infra/scripts/upload-testdata.sh --download-only

Running the Job

# Start job and return immediately
./infra/scripts/run-job.sh

# Start job, wait for completion, show status
./infra/scripts/run-job.sh --wait

# Start job, wait, then show container logs
./infra/scripts/run-job.sh --logs

A successful run produces these outputs in the output blob container:

File	Description	Typical Size
`south_building/south_building.ply`	3D Gaussian point cloud (real GPU-trained geometry)	~65 KB
`south_building/south_building.splat`	Web-optimized format for real-time rendering	~53 KB
`south_building/manifest.json`	Video metadata (resolution, duration, codec, frame count)	~7 KB
`south_building/.checkpoint.json`	Pipeline progress tracking	~11 KB

Input videos are moved from input/ to processed/ on success, or error/ on failure.

Redeploying Code Changes

After modifying the processor code, rebuild and redeploy without re-provisioning:

# Build new image on ACR + update the job
./infra/scripts/deploy-job.sh

# Or skip the build and just redeploy the existing image
./infra/scripts/deploy-job.sh --skip-build

Configuration Variables

Set these via azd env set <NAME> <VALUE> before provisioning:

Variable	Default	Description
`AZURE_LOCATION`	(required)	Azure region. Must support GPU T4 (e.g., `swedencentral`, `eastus`, `westus`)
`USE_GPU`	`true`	Enable GPU workload profile
`GPU_PROFILE_TYPE`	`Consumption-GPU-NC8as-T4`	GPU type (`Consumption-GPU-NC8as-T4` or `Consumption-GPU-NC24-A100`)
`PROCESSOR_BACKEND`	`gsplat`	3DGS backend (`gsplat`, `gaussian-splatting`, `mock`)
`INCLUDE_RBAC`	`true`	Include RBAC assignments in Bicep deployment
`USE_STORAGE_KEYS`	`false`	Use storage account keys instead of RBAC (fallback)
`EXISTING_RESOURCE_GROUP`	(empty)	Name of an existing Resource Group to deploy into (see Reusing Existing Resources)
`EXISTING_ACR_NAME`	(empty)	Name of an existing Azure Container Registry to reuse
`EXISTING_STORAGE_ACCOUNT_NAME`	(empty)	Name of an existing Storage Account to reuse
`EXISTING_CONTAINER_APPS_ENV_NAME`	(empty)	Name of an existing Container Apps Environment to reuse
`EXISTING_LOG_ANALYTICS_NAME`	(empty)	Name of an existing Log Analytics workspace to reuse
`FORCE_DELETE`	(empty)	Set to `true` to allow `azd down` when existing resources are configured

Tip: Use ./scripts/configure-existing-resources.sh to interactively set the EXISTING_* variables instead of setting them manually.

GPU Region Availability

Serverless GPU (T4) Container Apps is available in these regions:

swedencentral, eastus, westus, canadacentral, brazilsouth, australiaeast, italynorth, francecentral, centralindia, japaneast, northcentralus, southcentralus, southeastasia, southindia, westeurope, westus2, westus3

If you get a deployment error about workload profiles, ensure your subscription has GPU quota in the selected region. Check quota at: Azure Portal → Quotas

Cleaning Up RBAC Before Teardown

Before running azd down, remove the RBAC assignments:

./infra/scripts/cleanup-rbac.sh
azd down --purge --force

⚠️ Existing resource protection: If your environment is configured to reuse existing Azure resources, azd down will be blocked by the predown hook to prevent accidental deletion of shared infrastructure. See Reusing Existing Azure Resources for details.

Scripts Reference

All infrastructure scripts are in infra/scripts/:

Script	Purpose	Requires Privilege?
`hooks/preprovision.sh`	Captures deployer identity, runs RBAC preflight check	No (auto-run by azd)
`hooks/postprovision.sh`	Builds GPU image on ACR, updates job	No (auto-run by azd)
`hooks/predown.sh`	Blocks `azd down` when existing resources are configured (override with `FORCE_DELETE=true`)	No (auto-run by azd)
`hooks/acr-build.sh`	Creates minimal staging dir, runs `az acr build` for GPU target	No
`assign-rbac.sh`	Assigns AcrPull + Storage Blob Data Contributor to MI	Yes — Owner or User Access Admin
`verify-rbac.sh`	Checks if required RBAC roles are assigned	No
`cleanup-rbac.sh`	Removes RBAC role assignments	Yes — Owner or User Access Admin
`run-job.sh`	Starts a job execution with `--wait`/`--logs` options	No
`deploy-job.sh`	Rebuilds image on ACR + updates job	No
`upload-testdata.sh`	Downloads South Building dataset + uploads videos to blob storage	No (needs Storage Blob Data Contributor on deployer)

Additional scripts in scripts/:

Script	Purpose	Requires Privilege?
`configure-existing-resources.sh`	Interactive wizard to select existing Azure resources for reuse (sets azd env vars)	No (needs Azure CLI login)

Troubleshooting

"MANAGED_IDENTITY_PRINCIPAL_ID is not set" Run azd provision first — this creates the managed identity and saves its principal ID.

Image pull failure (no container logs) The AcrPull role is missing. Run ./infra/scripts/assign-rbac.sh.

"Failed to download input blobs from Azure Blob Storage" Either: (a) Storage Blob Data Contributor is missing — run ./infra/scripts/assign-rbac.sh, or (b) no blobs exist at the BATCH_INPUT_PREFIX — run ./infra/scripts/upload-testdata.sh.

"Reconstruction quality too low" Increase FRAME_RATE (e.g., from 2 to 3) in the job env vars to extract more frames.

ACR build timeout The GPU image build compiles COLMAP from source (~30 min). The default timeout is 3600s. If it still times out, check ACR Tasks quotas.

COLMAP matching timeout Ensure COLMAP_MATCHER=sequential (not exhaustive) in the job env vars.

azd down blocked with "EXISTING RESOURCES DETECTED" The predown hook prevents accidental deletion of shared resources. Either reset the existing resource configuration or explicitly opt in:

# Option 1: Remove existing resource configuration and tear down normally
./scripts/configure-existing-resources.sh --reset
azd down --purge --force

# Option 2: Force delete (will destroy the entire resource group)
azd env set FORCE_DELETE true
azd down --purge --force

Reusing Existing Azure Resources

When deploying into an environment with pre-existing Azure infrastructure (e.g., a shared resource group, a team ACR, or a storage account with data), you can configure azd to reference those resources instead of creating new ones. Referenced resources use Bicep's existing keyword — their lifecycle is not managed by this deployment and they are not deleted by azd down.

Which Resources Can Be Reused?

Resource	Env Variable	What Happens When Set
Resource Group	`EXISTING_RESOURCE_GROUP`	Deploy into this RG instead of creating `rg-<env-name>`
Azure Container Registry	`EXISTING_ACR_NAME`	Reference existing ACR; skip `modules/acr.bicep`
Storage Account	`EXISTING_STORAGE_ACCOUNT_NAME`	Reference existing storage; skip `modules/storage.bicep`. Required blob containers (`input`, `output`, `processed`, `error`) must already exist
Container Apps Environment	`EXISTING_CONTAINER_APPS_ENV_NAME`	Reference existing env; skip `modules/container-apps-env.bicep`
Log Analytics Workspace	`EXISTING_LOG_ANALYTICS_NAME`	Reference existing workspace; skip `modules/monitoring.bicep`. When reusing an existing Container Apps Environment, this is not needed (the existing env already has logging configured)

Resources that are always created by the deployment, regardless of configuration:

Managed Identity — a new user-assigned identity is always created
Container Apps Job — the processor job itself is always created
RBAC role assignments — AcrPull and Storage Blob Data Contributor (when INCLUDE_RBAC=true)

Configuring Existing Resources (Interactive)

The easiest way to configure resource reuse is the interactive wizard:

./scripts/configure-existing-resources.sh

The script:

Lists available Resource Groups in your subscription
Scans the selected RG for ACRs, Storage Accounts, and Container Apps Environments
Lets you pick which resources to reuse (or skip to create new ones)
Validates that required blob containers exist on a selected Storage Account
Sets the EXISTING_* environment variables in the azd environment
Sets AZURE_LOCATION to match the existing resource group's region

Configuring Existing Resources (Manual)

You can also set the variables directly:

azd env set EXISTING_RESOURCE_GROUP "my-shared-rg"
azd env set EXISTING_ACR_NAME "myteamacr"
azd env set EXISTING_STORAGE_ACCOUNT_NAME "mystorageaccount"
azd env set EXISTING_CONTAINER_APPS_ENV_NAME "my-cae"
azd env set EXISTING_LOG_ANALYTICS_NAME "my-law"

Leave a variable empty (or omit it) to create that resource fresh.

Resetting to Fresh Resources

To go back to creating all resources from scratch:

./scripts/configure-existing-resources.sh --reset

This clears all EXISTING_* variables and FORCE_DELETE from the azd environment.

Teardown Protection (`predown` Hook)

When any EXISTING_* variable is set, azd down is blocked by the predown hook (infra/scripts/hooks/predown.sh). This prevents accidental deletion of the resource group (and everything inside it, including the existing shared resources).

The hook prints which existing resources are configured and exits with an error:

╔══════════════════════════════════════════════════════╗
║  ⚠️  EXISTING RESOURCES DETECTED                     ║
╚══════════════════════════════════════════════════════╝

This environment references pre-existing Azure resources:
   • Resource Group:             my-shared-rg
   • Storage Account:            mystorageaccount

Running 'azd down' would delete the resource group and ALL
resources inside it, including those listed above.

To proceed with teardown despite existing resources:

# Explicitly opt in to deletion
azd env set FORCE_DELETE true
azd down --purge --force

⚠️ Warning: FORCE_DELETE deletes the entire resource group, including any existing resources you configured for reuse. Use with extreme caution.

How It Works (Bicep)

The infra/main.bicep template uses conditional deployment based on whether EXISTING_* parameters are empty:

var useExistingAcr = !empty(existingAcrName)

// New ACR — only created when not reusing
module acr 'modules/acr.bicep' = if (!useExistingAcr) { ... }

// Existing ACR — only referenced when reusing
module existingAcrRef 'modules/existing-acr.bicep' = if (useExistingAcr) { ... }

// Outputs resolve to whichever was used
output AZURE_CONTAINER_REGISTRY_NAME string = useExistingAcr
  ? existingAcrRef.outputs.name
  : acr.outputs.name

The same pattern applies to Storage Account, Container Apps Environment, and Resource Group.

Full Quickstart and Teardown

Step-by-step instructions covering the complete lifecycle — from initial setup through a successful job run to clean teardown.

Prerequisites

Ensure the following are installed and configured:

# Azure CLI — logged in
az login
az account show  # verify correct subscription

# Azure Developer CLI
azd version      # requires azd ≥ 1.5

# Test data (South Building dataset)
./scripts/e2e/01-download-testdata.sh

Quickstart: Fresh Deployment (New Resources)

# ── 1. Initialize ────────────────────────────────────────────────────────
azd init
# Choose an environment name (e.g., "3dgs-dev")

# ── 2. Configure ─────────────────────────────────────────────────────────
azd env set AZURE_LOCATION swedencentral    # Must support GPU T4
azd env set USE_GPU true

# ── 3. Provision infrastructure ──────────────────────────────────────────
# Creates: RG, ACR, Storage, Container Apps Env, MI, Job
# Also builds the GPU Docker image on ACR (~40 min first time)
azd provision

# ── 4. Assign RBAC roles ─────────────────────────────────────────────────
# Requires Owner or User Access Administrator on the resource group
./infra/scripts/assign-rbac.sh
./infra/scripts/verify-rbac.sh

# ── 5. Upload test data ──────────────────────────────────────────────────
./infra/scripts/upload-testdata.sh

# ── 6. Run the job ────────────────────────────────────────────────────────
./infra/scripts/run-job.sh --logs

# ── 7. Verify outputs ────────────────────────────────────────────────────
# Check the output blob container for PLY, SPLAT, manifest, checkpoint files
STORAGE_ACCOUNT=$(azd env get-value AZURE_STORAGE_ACCOUNT_NAME)
az storage blob list --account-name "$STORAGE_ACCOUNT" -c output --auth-mode login -o table

Quickstart: Deployment with Existing Resources

# ── 1. Initialize ────────────────────────────────────────────────────────
azd init

# ── 2. Configure existing resources ──────────────────────────────────────
# Interactive: select which existing resources to reuse
./scripts/configure-existing-resources.sh

# The script sets AZURE_LOCATION automatically from the resource group.
# Optionally configure additional settings:
azd env set USE_GPU true

# ── 3. Provision ─────────────────────────────────────────────────────────
# Only creates resources that aren't being reused (MI, Job, and any
# resources you chose to create fresh)
azd provision

# ── 4. Assign RBAC + upload data + run ────────────────────────────────────
./infra/scripts/assign-rbac.sh
./infra/scripts/verify-rbac.sh
./infra/scripts/upload-testdata.sh
./infra/scripts/run-job.sh --logs

Teardown: Fresh Deployment

When all resources were created by azd, teardown is straightforward:

# ── 1. Clean up RBAC assignments ─────────────────────────────────────────
./infra/scripts/cleanup-rbac.sh

# ── 2. Destroy all resources ─────────────────────────────────────────────
azd down --purge --force
# Deletes: Resource Group and all resources inside it
# --purge: permanently deletes soft-delete-enabled resources (Key Vault, etc.)
# --force: skip confirmation prompts

Teardown: Deployment with Existing Resources

When existing resources are configured, azd down is blocked to prevent accidental deletion. You have two options:

Option A — Remove only deployment-managed resources (recommended):

This approach leaves existing shared resources intact and only removes what the deployment created (Managed Identity, Container Apps Job, RBAC assignments):

# 1. Clean up RBAC assignments
./infra/scripts/cleanup-rbac.sh

# 2. Reset existing resource config so azd down only targets managed resources
./scripts/configure-existing-resources.sh --reset

# 3. Tear down
azd down --purge --force

Note: Because azd down deletes the entire resource group, this option still deletes the RG and all resources in it when the RG was created by the deployment. If you deployed into an existing RG, be aware that azd down will attempt to delete that RG too. In that case, use Option B or manually delete only the managed resources.

Option B — Force delete everything (use with caution):

# 1. Clean up RBAC assignments
./infra/scripts/cleanup-rbac.sh

# 2. Explicitly opt in to deletion
azd env set FORCE_DELETE true

# 3. Destroy the entire resource group (including existing resources!)
azd down --purge --force

⚠️ Warning: This deletes the entire resource group and all resources inside it, including any pre-existing ACR, Storage Account, or Container Apps Environment you configured for reuse.

Batch Mode (Azure SDK)

Batch mode processes a single job using the Azure Blob Storage SDK directly — no BlobFuse2 FUSE mounts, no privileged containers, no continuous watching. The processor downloads input videos from blob storage, runs the pipeline locally, uploads outputs, and exits.

How It Works

┌─────────────────┐     ┌──────────────────────────────────┐     ┌──────────────────┐
│  Azure Blob     │     │  3DGS Processor (batch mode)     │     │  Azure Blob      │
│  input container│────>│  1. Download MP4s from blob      │     │  output container │
│  scene_001/     │     │  2. FFmpeg frame extraction       │     │                  │
│    view1.mp4    │     │  3. FFprobe metadata             │────>│  scene_001.ply   │
│    view2.mp4    │     │  4. Manifest generation          │     │  scene_001.splat  │
│    view3.mp4    │     │  5. COLMAP reconstruction        │     │  manifest.json    │
│                 │     │  6. 3DGS training                │     │                  │
│                 │     │  7. Export PLY/SPLAT              │     │                  │
└─────────────────┘     │  8. Upload outputs to blob       │     └──────────────────┘
                        │  9. Move inputs → processed/error │
                        │ 10. Exit (code 0 or 1)           │
                        └──────────────────────────────────┘

Step-by-Step Setup

1. Create Azure Storage Resources

# Set variables
RESOURCE_GROUP=<your-resource-group>
STORAGE_ACCOUNT=my3dgsdata    # Must be globally unique, lowercase, 3-24 chars
LOCATION=eastus

# Create resource group (skip if exists)
az group create --name $RESOURCE_GROUP --location $LOCATION

# Create storage account
az storage account create \
  --name $STORAGE_ACCOUNT \
  --resource-group $RESOURCE_GROUP \
  --location $LOCATION \
  --sku Standard_LRS \
  --kind StorageV2

# Create the four containers
for CONTAINER in input output processed error; do
  az storage container create \
    --account-name $STORAGE_ACCOUNT \
    --name $CONTAINER
done

2. Upload Input Videos

# Upload MP4 files to the input container under a scene prefix
az storage blob upload-batch \
  --account-name $STORAGE_ACCOUNT \
  --destination input \
  --source ./my-scene-videos/ \
  --destination-path scene_001/

# Verify upload
az storage blob list \
  --account-name $STORAGE_ACCOUNT \
  --container-name input \
  --prefix scene_001/ \
  --output table

The input container should look like:

input/
  scene_001/
    view1.mp4
    view2.mp4
    view3.mp4

3. Choose Authentication Method

See the Authentication section below for details on each method.

4. Run the Processor

RUN_MODE=batch \
  AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  BATCH_INPUT_PREFIX=scene_001/ \
  BACKEND=mock \
  cargo run --release

5. Check Outputs

# List output blobs
az storage blob list \
  --account-name $STORAGE_ACCOUNT \
  --container-name output \
  --output table

# Download results
az storage blob download-batch \
  --account-name $STORAGE_ACCOUNT \
  --source output \
  --destination ./results/

After success, inputs are moved to the processed container:

processed/
  scene_001/
    view1.mp4
    view2.mp4
    view3.mp4

On failure, inputs are moved to the error container instead.

Required Environment Variables

# Mode selection
RUN_MODE=batch                              # Required: enables batch mode

# Azure storage account
AZURE_STORAGE_ACCOUNT=mystorageaccount      # Required: the storage account name

# Batch job input
BATCH_INPUT_PREFIX=scene_001/               # Required: blob prefix to process
BATCH_JOB_ID=my-job-123                     # Optional: custom job ID for logging

# Processing backend
BACKEND=mock                                # Or: gsplat, gaussian-splatting, 3dgs-cpp

Optional Environment Variables

# Container names (defaults shown)
AZURE_BLOB_CONTAINER_INPUT=input            # Where MP4s are uploaded
AZURE_BLOB_CONTAINER_OUTPUT=output          # Where PLY/SPLAT/manifest are written
AZURE_BLOB_CONTAINER_PROCESSED=processed    # Archive after success
AZURE_BLOB_CONTAINER_ERROR=error            # Archive after failure

# Auth: see Authentication section below
AZURE_USE_MANAGED_IDENTITY=true             # Use Managed Identity (Azure VMs, ACI, AKS)
AZURE_STORAGE_SAS_TOKEN="sv=2022-11-02&..." # Use SAS token auth

# Paths default to /tmp/3dgs-work/* in batch mode, but can be overridden
OUTPUT_PATH=/custom/output/dir
TEMP_PATH=/custom/temp/dir

Authentication

Batch mode supports three authentication methods, checked in priority order:

Priority	Method	Env Variable	Best For
1	SAS Token	`AZURE_STORAGE_SAS_TOKEN`	CI/CD pipelines, time-limited access, cross-tenant
2	Managed Identity	`AZURE_USE_MANAGED_IDENTITY=true`	Azure VMs, ACI, AKS (production)
3	Azure CLI	(default)	Local development (`az login`)

Note: Storage account keys (connection strings) are not supported in batch mode. Use SAS tokens for key-based scenarios — they provide time-limited access and can be scoped to specific containers and operations.

SAS Token Authentication

Generate a SAS token with the required permissions and set it as an environment variable. The token is appended to the storage endpoint URL — no credential object is needed.

# Generate an account-level SAS token (read, write, delete, list on blob service)
SAS_TOKEN=$(az storage account generate-sas \
  --account-name $STORAGE_ACCOUNT \
  --permissions rwdlac \
  --services b \
  --resource-types sco \
  --expiry $(date -u -d '+24 hours' +%Y-%m-%dT%H:%MZ) \
  --output tsv)

# Run with SAS token
RUN_MODE=batch \
  AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  AZURE_STORAGE_SAS_TOKEN="$SAS_TOKEN" \
  BATCH_INPUT_PREFIX=scene_001/ \
  BACKEND=mock \
  cargo run --release

You can also generate a container-level SAS token for tighter scoping:

# Container-level SAS (input container: read+list, output container: write)
# Note: you'll need separate tokens per container, or use account-level SAS
SAS_TOKEN=$(az storage container generate-sas \
  --account-name $STORAGE_ACCOUNT \
  --name input \
  --permissions rl \
  --expiry $(date -u -d '+24 hours' +%Y-%m-%dT%H:%MZ) \
  --output tsv)

Azure CLI Authentication (Local Development)

# Log in to Azure (one-time setup)
az login

# Assign yourself Storage Blob Data Contributor role
az role assignment create \
  --assignee $(az ad signed-in-user show --query id -o tsv) \
  --role "Storage Blob Data Contributor" \
  --scope /subscriptions/$(az account show --query id -o tsv)/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT

# Run — Azure CLI credentials are used automatically
RUN_MODE=batch \
  AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  BATCH_INPUT_PREFIX=scene_001/ \
  BACKEND=mock \
  cargo run --release

Important: az login alone is not enough — you also need the Storage Blob Data Contributor RBAC role on the storage account. Without it, you'll get a 403 Forbidden error even though authentication succeeds.

Managed Identity Authentication (Azure VMs, ACI, AKS)

# Enable system-assigned managed identity on a VM
az vm identity assign --name myVM --resource-group $RESOURCE_GROUP

# Get the VM's identity principal ID
PRINCIPAL_ID=$(az vm show --name myVM --resource-group $RESOURCE_GROUP \
  --query identity.principalId -o tsv)

# Assign Storage Blob Data Contributor role
az role assignment create \
  --assignee $PRINCIPAL_ID \
  --role "Storage Blob Data Contributor" \
  --scope /subscriptions/$(az account show --query id -o tsv)/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT

# On the VM, run with managed identity
RUN_MODE=batch \
  AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  AZURE_USE_MANAGED_IDENTITY=true \
  BATCH_INPUT_PREFIX=scene_001/ \
  BACKEND=gsplat \
  ./3dgs-processor

Complete Examples

Local Development (Azure CLI)

# Prerequisites: az login, RBAC role assigned
az login

RUN_MODE=batch \
  AZURE_STORAGE_ACCOUNT=my3dgsdata \
  BATCH_INPUT_PREFIX=scene_001/ \
  BACKEND=mock \
  RECONSTRUCTION_BACKEND=colmap \
  FRAME_RATE=2 \
  cargo run --release

Docker with SAS Token

# Generate SAS token
SAS_TOKEN=$(az storage account generate-sas \
  --account-name my3dgsdata \
  --permissions rwdlac --services b --resource-types sco \
  --expiry $(date -u -d '+24 hours' +%Y-%m-%dT%H:%MZ) \
  --output tsv)

# Run in Docker — no privileged mode needed
docker run --rm \
  -e RUN_MODE=batch \
  -e AZURE_STORAGE_ACCOUNT=my3dgsdata \
  -e AZURE_STORAGE_SAS_TOKEN="$SAS_TOKEN" \
  -e BATCH_INPUT_PREFIX=scene_001/ \
  -e BACKEND=mock \
  youracr.azurecr.io/3dgs-processor:cpu

Docker with Managed Identity (Azure VM)

docker run --rm \
  -e RUN_MODE=batch \
  -e AZURE_STORAGE_ACCOUNT=my3dgsdata \
  -e AZURE_USE_MANAGED_IDENTITY=true \
  -e BATCH_INPUT_PREFIX=scene_001/ \
  -e BACKEND=gsplat \
  youracr.azurecr.io/3dgs-processor:gpu

Azure Container Instances (Managed Identity)

az container create \
  --resource-group mygroup \
  --name 3dgs-batch-job \
  --image myregistry.azurecr.io/3dgs-processor:gpu \
  --assign-identity \
  --environment-variables \
    RUN_MODE=batch \
    AZURE_STORAGE_ACCOUNT=my3dgsdata \
    BATCH_INPUT_PREFIX=scene_001/ \
    BACKEND=gsplat \
    AZURE_USE_MANAGED_IDENTITY=true \
    RECONSTRUCTION_BACKEND=colmap \
    FRAME_RATE=2 \
  --restart-policy Never \
  --cpu 8 --memory 16

Azure Container Instances (SAS Token)

# Use --secure-environment-variables for SAS tokens
az container create \
  --resource-group mygroup \
  --name 3dgs-batch-job \
  --image myregistry.azurecr.io/3dgs-processor:gpu \
  --environment-variables \
    RUN_MODE=batch \
    AZURE_STORAGE_ACCOUNT=my3dgsdata \
    BATCH_INPUT_PREFIX=scene_001/ \
    BACKEND=gsplat \
  --secure-environment-variables \
    AZURE_STORAGE_SAS_TOKEN="$SAS_TOKEN" \
  --restart-policy Never \
  --cpu 8 --memory 16

Batch Mode vs Watch Mode

Feature	Watch Mode	Batch Mode
Azure access	BlobFuse2 (FUSE mount)	Azure SDK (direct API)
Privileged container	Yes (`--device /dev/fuse`)	No
Lifecycle	Continuous service	Single job → exit
Trigger	File watcher detects new folders	External orchestration
`INPUT_PATH` etc.	Required local paths	Optional (defaults to /tmp)
Auth: Azure CLI	✗	✓ (default for local dev)
Auth: SAS Token	Via BlobFuse2 config	✓ `AZURE_STORAGE_SAS_TOKEN`
Auth: Managed Identity	Via BlobFuse2 config	✓ `AZURE_USE_MANAGED_IDENTITY=true`
Auth: Connection String	Via BlobFuse2 config	✗ (use SAS token instead)

Troubleshooting Batch Mode

Symptom	Cause	Fix
`AZURE_STORAGE_ACCOUNT is required`	Missing env var	Set `AZURE_STORAGE_ACCOUNT=<name>`
`No blobs found in container 'input' with prefix 'X'`	Wrong prefix or empty container	Check `az storage blob list --container-name input --prefix X`
`403 Forbidden` with Azure CLI auth	Missing RBAC role	Assign Storage Blob Data Contributor role
`403 Forbidden` with SAS token	Token expired or wrong permissions	Regenerate SAS with `rwdlac` permissions on blob service
`401 Unauthorized` with Managed Identity	Identity not assigned or wrong scope	Verify `az vm identity show` and RBAC assignment
Process exits with code 1	Pipeline failure	Check stderr logs; inputs moved to error container
`Failed to create DeveloperToolsCredential`	Not logged in	Run `az login`

Resource Requirements

Minimum Requirements

CPU: 4 cores (8+ cores recommended)
RAM: 8GB (16GB+ recommended for large scenes)
Disk: 100GB (500GB+ recommended for production)
GPU: Optional (required for gaussian-splatting and gsplat backends)

Sizing Guidelines

Resource needs scale with:

Video Resolution: 4K requires 2-3x resources vs. 1080p
Video Count: 5+ videos per job require more memory
Training Iterations: Higher iterations = longer GPU time
Concurrent Jobs: Sequential processing (1 at a time)

Example Profiles:

Profile	Videos	Resolution	CPU	RAM	Disk	GPU
Small	2-3	720p-1080p	4	8GB	100GB	Optional
Medium	3-5	1080p-1440p	8	16GB	250GB	Recommended
Large	5+	1440p-4K	16	32GB	500GB	Required

Disk I/O Optimization

Use SSDs for input/output paths (10x faster frame extraction)
Separate volumes for input, output, and processed (parallel I/O)
Local storage preferred over network mounts for performance
NVMe drives ideal for high-throughput scenarios

Storage Configuration

Local Filesystem

Recommended directory layout:

/data/
├── input/          # Active uploads (fast SSD)
├── output/         # Processing outputs (fast SSD)
├── processed/      # Archive (slower HDD acceptable)
└── error/          # Failed jobs (manual review)

Mount with appropriate permissions:

docker run -d \
  -v /data/input:/data/input \
  -v /data/output:/data/output \
  -v /data/processed:/data/processed:ro \  # Read-only for safety
  -v /data/error:/data/error \
  ...

Network File Systems

NFS Configuration:

volumes:
  - type: nfs
    source: nfs-server.example.com:/export/3dgs/input
    target: /data/input
    volume:
      nocopy: true
      o: "addr=nfs-server.example.com,rw,sync"

Note: Use poll_watcher for NFS mounts (set WATCHER_MODE=poll).

Azure Files

# Create file share
az storage share create --name 3dgs-input --quota 1024

# Mount in container
docker run -d \
  -v 3dgs-input:/data/input \
  --mount type=volume,dst=/data/input,volume-driver=azure,volume-opt=share=3dgs-input \
  ...

Azure Blob Storage Setup

Prerequisites

Azure Subscription - Active subscription (free tier works for testing)
Azure Storage Account - General Purpose v2 with Blob Storage
Azure CLI - For setup and management (az login required)
Authentication - One of: Connection String, SAS Token, Azure AD, or Managed Identity
Linux Container Host - Blobfuse2 requires Linux kernel (not available on macOS)

Quick Setup (For Testing)

Use the provided setup script to create test resources:

# Clone repository
cd 3DGS-accelerator

# Create Azure resources (auto-generates unique storage account name)
./scripts/azure-setup.sh

# Load credentials
source azure-test-config.env

# Run test to validate
./scripts/azure-test.sh

# Cleanup when done
./scripts/azure-cleanup.sh --delete-all

For production deployment, continue with the sections below.

Azure Storage Account Setup

1. Create Storage Account

# Set variables
RESOURCE_GROUP="<your-resource-group>"
LOCATION="eastus"
STORAGE_ACCOUNT="3dgsprodstorage"  # Must be globally unique, lowercase, 3-24 chars

# Create resource group
az group create \
  --name $RESOURCE_GROUP \
  --location $LOCATION

# Create storage account
az storage account create \
  --name $STORAGE_ACCOUNT \
  --resource-group $RESOURCE_GROUP \
  --location $LOCATION \
  --sku Standard_LRS \
  --kind StorageV2 \
  --https-only true \
  --min-tls-version TLS1_2

2. Create Blob Containers

# Option A: Using connection string (if shared key enabled)
CONN_STRING=$(az storage account show-connection-string \
  --name $STORAGE_ACCOUNT \
  --resource-group $RESOURCE_GROUP \
  --query connectionString -o tsv)

for container in input output processed error; do
  az storage container create \
    --name $container \
    --connection-string "$CONN_STRING"
done

# Option B: Using Azure AD authentication (enterprise environments)
for container in input output processed error; do
  az storage container create \
    --name $container \
    --account-name $STORAGE_ACCOUNT \
    --auth-mode login
done

Container Purpose:

input/ - Active uploads waiting for processing
output/ - Generated PLY/SPLAT models and manifests
processed/ - Archive of successfully processed jobs
error/ - Failed jobs for manual review

Authentication Methods

The processor supports four authentication methods. Choose based on your security requirements:

1. Connection String (Development/Testing)

Best for: Local testing, simple setups
Security: ⚠️ Highest privilege - protect carefully
Availability: ❌ Not available if shared key access disabled

# Get connection string
az storage account show-connection-string \
  --name $STORAGE_ACCOUNT \
  --resource-group $RESOURCE_GROUP \
  --query connectionString -o tsv

# Use in container
docker run -d --privileged \
  --name 3dgs-processor \
  -e AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=3dgsprodstorage;AccountKey=abc123...;EndpointSuffix=core.windows.net" \
  -e AZURE_BLOB_CONTAINER_INPUT=input \
  -e AZURE_BLOB_CONTAINER_OUTPUT=output \
  -e AZURE_BLOB_CONTAINER_PROCESSED=processed \
  -e AZURE_BLOB_CONTAINER_ERROR=error \
  youracr.azurecr.io/3dgs-processor:gpu

2. SAS Token (Time-Limited)

Best for: Temporary access, third-party integrations
Security: ✅ Time-limited, scoped permissions
Availability: ⚠️ Account SAS requires shared keys; User delegation SAS works with Azure AD

Generate User Delegation SAS (Azure AD-based):

# Requires Storage Blob Data Contributor role
EXPIRY="2026-12-31"

# Get user delegation SAS (works even with shared key disabled)
az storage container generate-sas \
  --name input \
  --account-name $STORAGE_ACCOUNT \
  --permissions racwdl \
  --expiry $EXPIRY \
  --auth-mode login \
  --as-user

# Output: se=2026-12-31&sp=racwdl&sv=2021-06-08&sr=c&sig=...

# Use in container
docker run -d --privileged \
  --name 3dgs-processor \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  -e AZURE_STORAGE_SAS_TOKEN="se=2026-12-31&sp=racwdl&sv=2021-06-08&sr=c&sig=..." \
  -e AZURE_BLOB_CONTAINER_INPUT=input \
  youracr.azurecr.io/3dgs-processor:gpu

Best Practices:

Set expiry to shortest necessary duration (e.g., 7 days for testing, 90 days for production)
Use user delegation SAS (Azure AD-based) instead of account SAS
Rotate SAS tokens regularly
Store in secrets manager (Azure Key Vault), not in code

3. Azure AD Authentication (Enterprise) ⭐ Recommended

Best for: Enterprise environments, security-first organizations
Security: ✅ No shared keys, identity-based, full audit trail
Availability: ✅ Works when shared key access is disabled

Setup:

# 1. Assign role to user/service principal/managed identity
USER_ID=$(az ad signed-in-user show --query id -o tsv)

az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee $USER_ID \
  --scope "/subscriptions/<subscription-id>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT"

# 2. Login to Azure (for local testing)
az login

# 3. Run container with Azure AD
docker run -d --privileged \
  --name 3dgs-processor \
  -v $HOME/.azure:/root/.azure:ro \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  -e AZURE_USE_AZURE_AD=true \
  youracr.azurecr.io/3dgs-processor:gpu

For CI/CD (Service Principal):

# Create service principal
SP=$(az ad sp create-for-rbac --name "3dgs-processor-sp" --role "Storage Blob Data Contributor" --scopes "/subscriptions/<sub>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT")

# Use credentials in container
docker run -d --privileged \
  -e AZURE_TENANT_ID=$(echo $SP | jq -r .tenant) \
  -e AZURE_CLIENT_ID=$(echo $SP | jq -r .appId) \
  -e AZURE_CLIENT_SECRET=$(echo $SP | jq -r .password) \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  youracr.azurecr.io/3dgs-processor:gpu

4. Managed Identity (Production Azure Deployment) ⭐ Most Secure

Best for: Azure VM, Azure Container Instances, AKS
Security: ✅✅ No credentials, Azure-managed, most secure
Availability: Only in Azure environments (VM/ACI/AKS)

Azure Container Instances Example:

# 1. Create container with managed identity
az container create \
  --resource-group $RESOURCE_GROUP \
  --name 3dgs-processor \
  --image youracr.azurecr.io/3dgs-processor:gpu \
  --cpu 8 --memory 16 \
  --assign-identity [system] \
  --environment-variables \
    AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
    AZURE_USE_MANAGED_IDENTITY=true

# 2. Get managed identity ID
IDENTITY_ID=$(az container show \
  --resource-group $RESOURCE_GROUP \
  --name 3dgs-processor \
  --query identity.principalId -o tsv)

# 3. Grant storage access
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee-object-id $IDENTITY_ID \
  --scope "/subscriptions/<sub>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT"

No credentials needed in container! Azure manages authentication automatically.

Enterprise Environments (Shared Key Disabled)

Many organizations disable shared key access for compliance (SOC 2, ISO 27001, etc.):

# Check if shared key is disabled
az storage account show \
  --name $STORAGE_ACCOUNT \
  --resource-group $RESOURCE_GROUP \
  --query allowSharedKeyAccess -o tsv
# Output: false

If shared key is disabled:

✅ Azure AD authentication - Works (use --auth-mode login)
✅ User delegation SAS - Works (Azure AD-based)
✅ Managed Identity - Works (Azure AD-based)
❌ Connection string - Blocked
❌ Account SAS - Blocked

Required Setup:

# 1. Assign required role
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee <user-or-sp-object-id> \
  --scope "/subscriptions/<sub>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT"

# 2. All Azure CLI commands must use --auth-mode login
az storage blob upload \
  --account-name $STORAGE_ACCOUNT \
  --container-name input \
  --name test.mp4 \
  --file test.mp4 \
  --auth-mode login  # Required!

# 3. Container uses Azure AD or managed identity (see methods 3 & 4 above)

Blobfuse2 Container Mount (Linux Only)

Requirements:

Linux kernel (not available on macOS/Windows)
Privileged container (requires FUSE kernel module access)
AMD64 architecture (blobfuse2 not available for ARM64)

Basic Mount:

docker run -d --privileged \
  --name 3dgs-processor \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  -e AZURE_STORAGE_CONNECTION_STRING="..." \
  -e INPUT_PATH=/mnt/blobfuse/input \
  -e OUTPUT_PATH=/mnt/blobfuse/output \
  -e PROCESSED_PATH=/mnt/blobfuse/processed \
  -e ERROR_PATH=/mnt/blobfuse/error \
  youracr.azurecr.io/3dgs-processor:gpu

Performance Tuning:

# Configure caching for better performance
docker run -d --privileged \
  --name 3dgs-processor \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  -e AZURE_STORAGE_CONNECTION_STRING="..." \
  -e BLOBFUSE_CACHE_SIZE_MB=10240 \
  -e BLOBFUSE_FILE_CACHE_TIMEOUT_SECS=120 \
  -e BLOBFUSE_ATTR_CACHE_TIMEOUT_SECS=60 \
  -e BLOBFUSE_ENABLE_STREAMING=true \
  youracr.azurecr.io/3dgs-processor:gpu

Cache Configuration:

BLOBFUSE_CACHE_SIZE_MB: Local disk cache size (default: 1024, recommend: 10240 for production)
BLOBFUSE_FILE_CACHE_TIMEOUT_SECS: File cache validity (default: 120)
BLOBFUSE_ATTR_CACHE_TIMEOUT_SECS: Attribute cache (default: 60)
BLOBFUSE_ENABLE_STREAMING: Stream large files (default: false)

Why Privileged? Blobfuse2 requires CAP_SYS_ADMIN to mount FUSE filesystems.

Production Deployment Examples

Azure Container Instances (Recommended for Production)

Deployment with Managed Identity (Most Secure):

# Set variables
RESOURCE_GROUP="<your-resource-group>"
LOCATION="eastus"
CONTAINER_NAME="3dgs-processor"
ACR_NAME="yourcontainerregistry"
IMAGE="$ACR_NAME.azurecr.io/3dgs-processor:gpu"
STORAGE_ACCOUNT="3dgsprodstorage"

# Create container with managed identity
az container create \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --image $IMAGE \
  --cpu 8 \
  --memory 16 \
  --os-type Linux \
  --restart-policy Always \
  --assign-identity [system] \
  --acr-identity [system] \
  --environment-variables \
    INPUT_PATH=/mnt/blobfuse/input \
    OUTPUT_PATH=/mnt/blobfuse/output \
    PROCESSED_PATH=/mnt/blobfuse/processed \
    ERROR_PATH=/mnt/blobfuse/error \
    AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
    AZURE_USE_MANAGED_IDENTITY=true \
    BACKEND=gsplat \
    LOG_LEVEL=info \
    HEALTH_CHECK_ENABLED=true \
    HEALTH_CHECK_PORT=8080 \
    RETENTION_DAYS=30 \
    MAX_RETRIES=3 \
  --ports 8080

# Get managed identity principal ID
IDENTITY_ID=$(az container show \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --query identity.principalId -o tsv)

# Grant storage access to managed identity
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee-object-id $IDENTITY_ID \
  --scope "/subscriptions/<subscription-id>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT"

# Grant ACR pull access (if using private ACR)
az role assignment create \
  --role "AcrPull" \
  --assignee-object-id $IDENTITY_ID \
  --scope "/subscriptions/<subscription-id>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.ContainerRegistry/registries/$ACR_NAME"

echo "Waiting 2 minutes for role assignments to propagate..."
sleep 120

# Restart container to apply permissions
az container restart --resource-group $RESOURCE_GROUP --name $CONTAINER_NAME

Monitor Deployment:

# View logs (follow mode)
az container logs \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --follow

# Check health endpoint
CONTAINER_IP=$(az container show \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --query ipAddress.ip -o tsv)

curl http://$CONTAINER_IP:8080/health

# Check container status
az container show \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --query '{state:instanceView.state,cpu:containers[0].resources.requests.cpu,memory:containers[0].resources.requests.memoryInGB,restartCount:instanceView.currentState.restartCount}'

# Execute command inside container
az container exec \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --exec-command "/bin/bash"

Update Deployment:

# Delete old container
az container delete \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME \
  --yes

# Deploy new version (repeat create command with new image tag)

Cost Optimization:

# Use spot instances for development (60-90% cheaper)
az container create \
  --resource-group $RESOURCE_GROUP \
  --name $CONTAINER_NAME-dev \
  --image $IMAGE \
  --priority Spot \
  --cpu 4 \
  --memory 8 \
  ...

# Set restart policy for one-time jobs
--restart-policy OnFailure  # Only restart on failure, not on success

Azure Virtual Machine (For GPU Workloads)

Create GPU-enabled VM:

RESOURCE_GROUP="<your-resource-group>"
VM_NAME="3dgs-gpu-vm"
LOCATION="eastus"
STORAGE_ACCOUNT="3dgsprodstorage"

# Create VM with GPU (NC-series for CUDA)
az vm create \
  --resource-group $RESOURCE_GROUP \
  --name $VM_NAME \
  --location $LOCATION \
  --size Standard_NC6s_v3 \
  --image Ubuntu2204 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --assign-identity [system]

# Get managed identity
IDENTITY_ID=$(az vm show \
  --resource-group $RESOURCE_GROUP \
  --name $VM_NAME \
  --query identity.principalId -o tsv)

# Grant storage access
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee-object-id $IDENTITY_ID \
  --scope "/subscriptions/<sub>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT"

# SSH into VM
VM_IP=$(az vm show \
  --resource-group $RESOURCE_GROUP \
  --name $VM_NAME \
  --show-details \
  --query publicIps -o tsv)

ssh azureuser@$VM_IP

Setup VM:

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER

# Install NVIDIA drivers (for GPU)
sudo apt update
sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers autoinstall
sudo reboot

# Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo systemctl restart docker

# Verify GPU
nvidia-smi
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

Run Container on VM:

# Login to Azure from VM (for managed identity)
az login --identity

# Run container with GPU
docker run -d --privileged \
  --name 3dgs-processor \
  --gpus all \
  --restart unless-stopped \
  -v /mnt/blobfuse:/mnt/blobfuse \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  -e AZURE_USE_MANAGED_IDENTITY=true \
  -e INPUT_PATH=/mnt/blobfuse/input \
  -e OUTPUT_PATH=/mnt/blobfuse/output \
  -e PROCESSED_PATH=/mnt/blobfuse/processed \
  -e ERROR_PATH=/mnt/blobfuse/error \
  -e BACKEND=gsplat \
  -e LOG_LEVEL=info \
  youracr.azurecr.io/3dgs-processor:gpu

# Monitor
docker logs -f 3dgs-processor

Auto-start on VM reboot:

# Create systemd service
sudo tee /etc/systemd/system/3dgs-processor.service <<EOF
[Unit]
Description=3DGS Video Processor
Requires=docker.service
After=docker.service

[Service]
Restart=always
ExecStart=/usr/bin/docker start -a 3dgs-processor
ExecStop=/usr/bin/docker stop -t 10 3dgs-processor

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable 3dgs-processor
sudo systemctl start 3dgs-processor
sudo systemctl status 3dgs-processor

Kubernetes / AKS (For Enterprise Scale)

See Kubernetes Deployment section above for full deployment manifests.

AKS-specific quick start:

# Create AKS cluster with managed identity
az aks create \
  --resource-group $RESOURCE_GROUP \
  --name 3dgs-aks-cluster \
  --node-count 2 \
  --node-vm-size Standard_D8s_v3 \
  --enable-managed-identity \
  --generate-ssh-keys

# Get credentials
az aks get-credentials \
  --resource-group $RESOURCE_GROUP \
  --name 3dgs-aks-cluster

# Enable Azure AD workload identity (for pod-level managed identity)
az aks update \
  --resource-group $RESOURCE_GROUP \
  --name 3dgs-aks-cluster \
  --enable-oidc-issuer \
  --enable-workload-identity

# Deploy application
kubectl create namespace 3dgs
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/pvc.yaml

# Monitor
kubectl logs -f deployment/3dgs-processor -n 3dgs
kubectl get pods -n 3dgs

Scale down when idle:

# Manual scale
kubectl scale deployment/3dgs-processor --replicas=0 -n 3dgs  # Stop
kubectl scale deployment/3dgs-processor --replicas=1 -n 3dgs  # Start

# Auto-scale based on CPU (future enhancement)
kubectl autoscale deployment/3dgs-processor \
  --cpu-percent=50 \
  --min=0 \
  --max=3 \
  -n 3dgs

Security Hardening

Container Security

Run as Non-Root User (when possible):

USER 1000:1000

Read-Only Root Filesystem:

security_opt:
  - no-new-privileges:true
read_only: true
tmpfs:
  - /tmp
  - /var/tmp

Limit Capabilities:

cap_drop:
  - ALL
cap_add:
  - SYS_ADMIN  # Required for Blobfuse2 mounting

Secret Management

Never hardcode credentials. Use secrets management:

# Docker Compose with secrets
secrets:
  azure_connection_string:
    file: ./secrets/azure_conn.txt

services:
  3dgs-processor:
    secrets:
      - azure_connection_string
    environment:
      AZURE_STORAGE_CONNECTION_STRING_FILE: /run/secrets/azure_connection_string

Kubernetes Secrets:

kubectl create secret generic azure-storage \
  --from-literal=connection-string="DefaultEndpointsProtocol=..." \
  -n 3dgs

# Reference in deployment
env:
- name: AZURE_STORAGE_CONNECTION_STRING
  valueFrom:
    secretKeyRef:
      name: azure-storage
      key: connection-string

Network Security

Restrict Ingress: Only health check port (8080) if needed
Use Private Endpoints: Azure Storage private endpoints
Firewall Rules: Whitelist container IPs in storage account

Credential Redaction

Logs automatically redact Azure credentials:

[INFO] Mounting Azure Blob Storage: account=mystorageaccount, container=3dgs-input, auth=[REDACTED]

Monitoring and Logging

Structured Logging

Logs are JSON-formatted for easy parsing:

{
  "timestamp": "2024-01-01T12:00:00Z",
  "level": "INFO",
  "target": "three_dgs_processor::processor::job",
  "message": "Job completed successfully",
  "job_id": "scene_001",
  "duration_secs": 245.3,
  "frames_extracted": 450,
  "gaussian_count": 150000
}

Log Aggregation

ELK Stack:

logging:
  driver: "fluentd"
  options:
    fluentd-address: "fluentd.example.com:24224"
    tag: "3dgs-processor"

Azure Monitor:

docker run -d \
  --log-driver=azuremonitor \
  --log-opt WorkspaceId=<workspace-id> \
  --log-opt WorkspaceKey=<workspace-key> \
  ...

Metrics

Key metrics to monitor:

Job Success Rate: processed/ vs. error/ folder counts
Processing Duration: Time from upload to output
Disk Usage: /data partition utilization
Memory Usage: Container RSS
Frame Extraction Rate: Frames/second
Training Loss: Final loss value per job

Prometheus Integration (future):

# Expose metrics endpoint
-e METRICS_PORT=9090

# Scrape config
scrape_configs:
  - job_name: '3dgs-processor'
    static_configs:
      - targets: ['processor:9090']

Health Checks

HTTP health endpoint:

curl http://localhost:8080/health

# Response
{
  "status": "healthy",
  "uptime_secs": 86400,
  "jobs_processed": 42,
  "disk_free_gb": 150.5
}

Backup and Disaster Recovery

Backup Strategy

Configuration: Version control config.yaml and env files
Processed Data: Archive to cold storage after retention period
Error Folder: Backup before manual deletion
Logs: Retain for 30-90 days

Disaster Recovery

Restart Resilience:

Service checks processed/ and error/ on startup
Duplicate detection prevents reprocessing
In-progress jobs resume from last checkpoint

Data Loss Prevention:

Use Azure Blob Storage geo-redundant replication
Regular snapshots of persistent volumes
Separate processed/error archives from active processing

Scaling Considerations

Horizontal Scaling: Not recommended (sequential processing model)

Vertical Scaling: Increase CPU/RAM/GPU for faster processing

Alternative Approach: Deploy multiple instances with different input folders

# Instance 1
-e INPUT_PATH=/data/input-1

# Instance 2
-e INPUT_PATH=/data/input-2

Route uploads to different instances based on load balancing.

Azure Troubleshooting

Authentication Errors

Error: Key based authentication is not permitted on this storage account

Cause: Shared key access is disabled (enterprise security policy)

Solution: Use Azure AD authentication or managed identity:

# Option 1: Azure AD
docker run -d --privileged \
  -v $HOME/.azure:/root/.azure:ro \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  -e AZURE_USE_AZURE_AD=true \
  youracr.azurecr.io/3dgs-processor:gpu

# Option 2: Service Principal
docker run -d --privileged \
  -e AZURE_TENANT_ID=$TENANT_ID \
  -e AZURE_CLIENT_ID=$CLIENT_ID \
  -e AZURE_CLIENT_SECRET=$CLIENT_SECRET \
  -e AZURE_STORAGE_ACCOUNT=$STORAGE_ACCOUNT \
  youracr.azurecr.io/3dgs-processor:gpu

Error: AuthorizationFailed or This request is not authorized to perform this operation

Cause: Missing role assignment

Solution: Grant Storage Blob Data Contributor role:

# For user
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee $(az ad signed-in-user show --query id -o tsv) \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"

# For service principal
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee <sp-object-id> \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"

# For managed identity (from ACI/VM)
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee-object-id <managed-identity-principal-id> \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"

Wait 5 minutes for role propagation before retrying.

Error: SAS token expired or Server failed to authenticate the request

Cause: SAS token expired or malformed

Solution: Generate new SAS token:

# User delegation SAS (works with shared key disabled)
az storage container generate-sas \
  --name input \
  --account-name $STORAGE_ACCOUNT \
  --permissions racwdl \
  --expiry 2026-12-31 \
  --auth-mode login \
  --as-user

Blobfuse2 Mount Errors

Error: fusermount: command not found or FUSE device not found

Cause: Running on macOS or Windows (FUSE requires Linux kernel)

Solution: Deploy to Linux environment:

Azure Container Instances (Linux)
Azure VM (Ubuntu/Debian)
Local Linux machine via Docker/Podman

Error: operation not permitted when mounting

Cause: Container not running in privileged mode

Solution: Add --privileged flag:

docker run -d --privileged ...  # Required for FUSE

Error: blobfuse2: No such file or directory

Cause: Running ARM64 image (blobfuse2 only available on AMD64)

Solution: Use AMD64 image:

docker pull --platform linux/amd64 youracr.azurecr.io/3dgs-processor:cpu

Error: Slow mount performance or timeouts

Cause: Insufficient cache configuration

Solution: Increase cache size:

# Add caching environment variables
-e BLOBFUSE_CACHE_SIZE_MB=10240 \
-e BLOBFUSE_FILE_CACHE_TIMEOUT_SECS=120

Container Deployment Issues

Error: No space left on device

Cause: Insufficient disk space in container or host

Solution:

# 1. Check host disk space
df -h

# 2. Clean old containers/images
docker system prune -a

# 3. Increase container disk quota (if using overlay2)
docker run -d --storage-opt size=100G ...

# 4. Check retention policy removes old jobs
-e RETENTION_DAYS=7 \
-e MIN_DISK_SPACE_GB=50

Error: Container exits immediately or processor crashes on startup

Cause: Missing required environment variables

Solution: Check logs and verify minimum required variables:

# View logs
docker logs 3dgs-processor

# Minimum required variables
-e INPUT_PATH=/mnt/blobfuse/input \
-e OUTPUT_PATH=/mnt/blobfuse/output \
-e PROCESSED_PATH=/mnt/blobfuse/processed \
-e ERROR_PATH=/mnt/blobfuse/error \
-e AZURE_STORAGE_ACCOUNT=<account> \
-e AZURE_STORAGE_CONNECTION_STRING=<connection-string>  # OR other auth method

Network and Connectivity

Error: Connection timeout or Unable to connect to Azure

Cause: Firewall, private endpoint, or network policy

Solution:

# 1. Check firewall rules on storage account
az storage account show \
  --name $STORAGE_ACCOUNT \
  --query networkRuleSet

# 2. Add container IP to whitelist (if restricted)
az storage account network-rule add \
  --account-name $STORAGE_ACCOUNT \
  --ip-address <container-public-ip>

# 3. Enable service endpoint (for ACI/AKS)
az storage account update \
  --name $STORAGE_ACCOUNT \
  --default-action Deny \
  --bypass AzureServices

# 4. Use private endpoint (VNet deployment)
az network private-endpoint create \
  --name 3dgs-private-endpoint \
  --resource-group $RESOURCE_GROUP \
  --vnet-name <vnet> --subnet <subnet> \
  --private-connection-resource-id <storage-account-id> \
  --group-id blob \
  --connection-name 3dgs-blob-connection

Error: DNS resolution failed for storage account

Cause: Private endpoint DNS not configured

Solution: Configure private DNS zone:

az network private-dns zone create \
  --resource-group $RESOURCE_GROUP \
  --name privatelink.blob.core.windows.net

az network private-dns link vnet create \
  --resource-group $RESOURCE_GROUP \
  --zone-name privatelink.blob.core.windows.net \
  --name 3dgs-dns-link \
  --virtual-network <vnet> \
  --registration-enabled false

Performance Issues

Slow blob upload/download

Causes & Solutions:

Geographic distance: Deploy container in same region as storage account

# Check storage account location
az storage account show --name $STORAGE_ACCOUNT --query location

# Deploy to same region
az container create --location <same-region> ...

Insufficient bandwidth: Upgrade storage account tier

# Upgrade to Premium (better IOPS)
az storage account update \
  --name $STORAGE_ACCOUNT \
  --sku Premium_LRS

Cache not configured: Add blobfuse2 cache settings (see above)

Concurrent operations throttled: Azure storage limits operations per second

# Monitor throttling
az monitor metrics list \
  --resource <storage-account-id> \
  --metric "Throttling"

Debugging Commands

Check container status:

# Container logs
docker logs 3dgs-processor --tail 100 --follow

# Container exec for debugging
docker exec -it 3dgs-processor /bin/bash

# Check mounted blobfuse
df -h  # Inside container
mount | grep blobfuse

# Test Azure connectivity
az storage blob list \
  --account-name $STORAGE_ACCOUNT \
  --container-name input \
  --auth-mode login

Check Azure resources:

# Storage account status
az storage account show \
  --name $STORAGE_ACCOUNT \
  --query '{name:name, location:location, sku:sku.name, allowSharedKey:allowSharedKeyAccess}'

# List containers
az storage container list \
  --account-name $STORAGE_ACCOUNT \
  --auth-mode login

# Check role assignments
az role assignment list \
  --scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT"

# View activity logs (recent operations)
az monitor activity-log list \
  --resource-group $RESOURCE_GROUP \
  --offset 1h

Getting Help

If issues persist:

Review Documentation:
- TROUBLESHOOTING.md - General troubleshooting
- scripts/AZURE_TESTING.md - Azure test scripts
- USER_GUIDE.md - Configuration reference

Test with Scripts:

# Validate Azure setup
./scripts/azure-setup.sh
source azure-test-config.env
./scripts/azure-test.sh

Check Logs:
- Container logs: docker logs 3dgs-processor
- Azure Activity Logs: Azure Portal → Storage Account → Activity log
- Health endpoint: curl http://localhost:8080/health
File Issue:
- GitHub Issues: Include logs, configuration (redact credentials), and error messages

FilesExpand file tree

DEPLOYMENT.md

Latest commit

History

DEPLOYMENT.md

File metadata and controls

Deployment Guide: 3DGS Video Processor

Table of Contents

Container Deployment

Docker Compose (Recommended for Single Server)

Kubernetes Deployment

Azure Container Instances

Azure Container Apps Job (GPU) — azd

Prerequisites

Quick Start

What Gets Provisioned

RBAC Requirements — Read This Carefully

Permissions Required by the Deployer (the person running azd provision)

Permissions Required for the Managed Identity (assigned after provisioning)

Who Can Assign These Roles?

Assigning RBAC Roles

Verifying RBAC Roles

What Fails Without RBAC

Uploading Test Data

Running the Job

Redeploying Code Changes

Configuration Variables

GPU Region Availability

Cleaning Up RBAC Before Teardown

Scripts Reference

Troubleshooting

Reusing Existing Azure Resources

Which Resources Can Be Reused?

Configuring Existing Resources (Interactive)

Configuring Existing Resources (Manual)

Resetting to Fresh Resources

Teardown Protection (predown Hook)

How It Works (Bicep)

Full Quickstart and Teardown

Prerequisites

Quickstart: Fresh Deployment (New Resources)

Quickstart: Deployment with Existing Resources

Teardown: Fresh Deployment

Teardown: Deployment with Existing Resources

Batch Mode (Azure SDK)

How It Works

Step-by-Step Setup

1. Create Azure Storage Resources

2. Upload Input Videos

3. Choose Authentication Method

4. Run the Processor

5. Check Outputs

Required Environment Variables

Optional Environment Variables

Authentication

SAS Token Authentication

Azure CLI Authentication (Local Development)

Managed Identity Authentication (Azure VMs, ACI, AKS)

Complete Examples

Local Development (Azure CLI)

Docker with SAS Token

Docker with Managed Identity (Azure VM)

Azure Container Instances (Managed Identity)

Azure Container Instances (SAS Token)

Batch Mode vs Watch Mode

Troubleshooting Batch Mode

Resource Requirements

Minimum Requirements

Sizing Guidelines

Disk I/O Optimization

Storage Configuration

Local Filesystem

Network File Systems

Azure Files

Azure Blob Storage Setup

Prerequisites

Quick Setup (For Testing)

Azure Storage Account Setup

1. Create Storage Account

2. Create Blob Containers

Permissions Required by the Deployer (the person running `azd provision`)

Teardown Protection (`predown` Hook)