diff --git a/README.md b/README.md index 1c66db11..ac63cadd 100644 --- a/README.md +++ b/README.md @@ -62,9 +62,11 @@ The folder `examples` contains the following Terraform implementation examples : | AWS | [aws-databricks-uc-bootstrap](examples/aws-databricks-uc-bootstrap/) | AWS UC | | AWS | [aws-remote-backend-infra](examples/aws-remote-backend-infra/) | Simple example on remote backend | | AWS | [aws-workspace-config](examples/aws-workspace-config/) | Configure workspace objects | -| GCP | [gcp-sa-provisionning](examples/gcp-sa-provisionning/) | Provisionning of the identity with the permissions required to deploy on GCP. | -| GCP | [gcp-basic](examples/gcp-basic/) | Workspace Deployment with managed vpc | -| GCP | [gcp-byovpc](examples/gcp-byovpc/) | Workspace Deployment with customer-managed vpc | +| GCP | [gcp-sa-provisioning](examples/gcp-sa-provisioning/) | Provisioning the identity (service account) with permissions required to deploy on GCP | +| GCP | [gcp-basic](examples/gcp-basic/) | Workspace deployment with Databricks-managed VPC | +| GCP | [gcp-byovpc](examples/gcp-byovpc/) | Workspace deployment with customer-managed VPC (Terraform creates the VPC) | +| GCP | [gcp-existing-vpc](examples/gcp-existing-vpc/) | Workspace deployment into a pre-existing VPC | +| GCP | [gcp-with-psc-exfiltration-protection](examples/gcp-with-psc-exfiltration-protection/) | Workspace with PrivateLink (PSC), private DNS, and restricted egress (hub-and-spoke topology) | ### Modules The folder `modules` contains the following Terraform modules : @@ -89,9 +91,13 @@ The folder `modules` contains the following Terraform modules : | AWS | [aws-workspace-with-firewall](modules/aws-workspace-with-firewall/) | Provisioning AWS Databricks E2 with an AWS Firewall | | AWS | [aws-exfiltration-protection](modules/aws-exfiltration-protection/) | An implementation of [Data Exfiltration Protection on 
AWS](https://www.databricks.com/blog/2021/02/02/data-exfiltration-protection-with-databricks-on-aws.html) | | AWS | aws-workspace-with-private-link | Coming soon | -| GCP | [gcp-sa-provisionning](modules/gcp-sa-provisionning/) | Provisions the identity (SA) with the correct permissions | -| GCP | [gcp-workspace-basic](modules/gcp-workspace-basic/) | Provisions a workspace with managed VPC | -| GCP | [gcp-workspace-byovpc](modules/gcp-workspace-byovpc/) | Workspace with customer-managed VPC. | +| GCP | [gcp/databricks-workspace](modules/gcp/databricks-workspace/) | Composer that orchestrates network, PSC, account, and DNS submodules based on scenario flags | +| GCP | [gcp/network](modules/gcp/network/) | VPC, subnet, router, NAT, peering, and shared-VPC binding (create or data-source lookup) | +| GCP | [gcp/private-connectivity](modules/gcp/private-connectivity/) | PSC endpoints (frontend, backend, hub-transit) and restricted-egress firewall rules | +| GCP | [gcp/account](modules/gcp/account/) | All databricks_mws_* resources: networks, workspaces, vpc_endpoint, private_access_settings | +| GCP | [gcp/dns](modules/gcp/dns/) | Private DNS zones (gcp.databricks.com, gcr.io, googleapis.com, pkg.dev) for restricted-egress workspaces | +| GCP | [gcp/service-account](modules/gcp/service-account/) | Service account with the IAM permissions required to provision Databricks workspaces | +| GCP | [gcp/unity-catalog](modules/gcp/unity-catalog/) | Metastore, GCS bucket, storage credential, external location, and default catalog | ### CI/CD pipelines The `cicd-pipelines` folder contains the following implementation examples of pipeline: diff --git a/docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md b/docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md new file mode 100644 index 00000000..9c31deeb --- /dev/null +++ b/docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md @@ -0,0 +1,3822 @@ +# GCP Modules Refactor Implementation Plan + +> **For agentic workers:** 
REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Replace three duplicated GCP workspace modules with one composer (`modules/gcp/databricks-workspace`) that orchestrates four focused submodules (`network`, `private-connectivity`, `account`, `dns`), and relocate `service-account` and `unity-catalog` under `modules/gcp/`. Migrate existing GCP examples one at a time onto the composer and add a new "existing VPC" example. + +**Architecture:** Composer reads orthogonal feature flags (`vpc_source`, `private_link_frontend`, `private_link_backend`, `private_access_only`, `restricted_egress`) and conditionally instantiates submodules via `count`. Dependency graph is linear: `network → private-connectivity → account → dns`. All `databricks_mws_*` resources live in `account`; all GCP-side PSC resources live in `private-connectivity`; DNS is split out because it depends on `account.workspace_url`. + +**Tech Stack:** Terraform >= 1.5, `hashicorp/google` provider, `databricks/databricks` provider, `terraform-docs`, `pre-commit`. No new tooling. + +**Spec reference:** `docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md` + +**Branch:** `feature/gcp-modules-refactor` (already created; spec committed as `2bfd9bd`). 
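+
+The flag-driven gating described under **Architecture** can be sketched as follows. This is a hedged, standalone illustration, not the composer's actual `main.tf`: the `terraform_data` resource is a hypothetical stand-in for a `module "private_connectivity"` block, and the local and output names are assumptions. Only the flag names and the two rules come from this plan: the PSC submodule is instantiated when at least one PSC flag is true, and `restricted_egress` is passed to the network submodule as `create_hub`.
+
+```hcl
+terraform {
+  required_version = ">= 1.5"
+}
+
+variable "private_link_frontend" {
+  type    = bool
+  default = false
+}
+
+variable "private_link_backend" {
+  type    = bool
+  default = false
+}
+
+variable "restricted_egress" {
+  type    = bool
+  default = false
+}
+
+locals {
+  # Assumed local name: true when at least one PSC flag is set.
+  enable_private_connectivity = var.private_link_frontend || var.private_link_backend
+
+  # The plan states that the composer passes restricted_egress as create_hub.
+  create_hub = var.restricted_egress
+}
+
+# Hypothetical stand-in for `module "private_connectivity"`; a module block
+# accepts the same `count` meta-argument.
+resource "terraform_data" "private_connectivity" {
+  count = local.enable_private_connectivity ? 1 : 0
+  input = "private-connectivity enabled"
+}
+
+output "private_connectivity_instances" {
+  description = "0 when no PSC flag is set, 1 otherwise"
+  value       = length(terraform_data.private_connectivity)
+}
+
+output "create_hub" {
+  description = "Value the composer would pass to the network submodule"
+  value       = local.create_hub
+}
+```
+
+In the composer the same `count` expression would sit on the `module` block, and any downstream reference would need the `[0]` index guarded by the same condition.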
+ +--- + +## File Structure + +This plan creates the following new tree (incremental — each task creates one slice): + +``` +docs/superpowers/ # already exists + ├── specs/2026-05-14-gcp-modules-refactor-design.md # already committed + └── plans/2026-05-14-gcp-modules-refactor.md # this file + +modules/gcp/ + ├── Makefile # Task 1 — recursive docs/test_docs + ├── databricks-workspace/ # Task 14–17 — composer + │ ├── main.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md # terraform-docs generates + │ ├── Makefile + │ └── tests/ # plan-time validation fixtures + │ ├── basic/main.tf + │ ├── byovpc/main.tf + │ ├── existing-vpc/main.tf + │ ├── psc-isolated/main.tf + │ └── negative-*/main.tf # expect plan failure + ├── network/ # Task 3–5 + │ ├── main.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md + │ ├── Makefile + │ └── tests/ + │ ├── create/main.tf + │ ├── existing/main.tf + │ └── create-with-hub/main.tf + ├── private-connectivity/ # Task 6–8 + │ ├── psc.tf + │ ├── firewall.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── locals.tf # regional PSC + hive metastore maps + │ ├── README.md + │ ├── Makefile + │ └── tests/ + │ ├── frontend-only/main.tf + │ ├── full-isolated/main.tf + │ └── no-egress/main.tf + ├── account/ # Task 9–13 + │ ├── main.tf # mws_networks + mws_workspaces + │ ├── vpc-endpoints.tf # mws_vpc_endpoint + │ ├── pas.tf # mws_private_access_settings + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md + │ ├── Makefile + │ └── tests/ + │ ├── databricks-managed/main.tf + │ ├── byovpc/main.tf + │ └── psc-with-pas/main.tf + ├── dns/ # Task 18–19 + │ ├── hub.tf + │ ├── spoke.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md + │ ├── Makefile + │ └── tests/hub-and-spoke/main.tf + ├── service-account/ # Task 20 (git mv from modules/gcp-sa-provisioning) + └── unity-catalog/ # Task 21 (git mv from modules/gcp-unity-catalog) 
+ +modules/gcp-sa-provisioning/ # Task 20 — replaced with deprecation README + └── README.md + +modules/gcp-unity-catalog/ # Task 21 — replaced with deprecation README + └── README.md + +examples/gcp-basic/ # Task 24 — migrated +examples/gcp-byovpc/ # Task 25 — migrated +examples/gcp-with-psc-exfiltration-protection/ # Task 26 — migrated +examples/gcp-existing-vpc/ # Task 27 — NEW +examples/gcp-sa-provisioning/ # Task 28 — repoint to relocated module + +# Deletions (Task 29 onward, PR 6) +modules/gcp-workspace-basic/ # DELETE +modules/gcp-workspace-byovpc/ # DELETE +modules/gcp-with-psc-exfiltration-protection/ # DELETE +modules/gcp-sa-provisioning/ # DELETE (stub) +modules/gcp-unity-catalog/ # DELETE (stub) +examples/gcp-sa-provisionning/ # DELETE (typo dir) +examples/gcp-test-modules/ # DELETE (state-only) +``` + +**Testing approach for each module task:** Each submodule gets `tests//main.tf` fixtures that call the module with mock vars. The "test" is `terraform init -backend=false && terraform validate && terraform plan -refresh=false` against the fixture. We don't apply — we verify the configuration is valid and the planned resource graph matches expectations. + +**Conventions to follow** (observed in existing repo): +- `versions.tf` declares required_providers and required terraform version +- `Makefile` per module has `docs:` and `test_docs:` targets calling `terraform-docs -c ../../.terraform-docs.yml .` (note: for nested `modules/gcp//`, the path becomes `../../../.terraform-docs.yml`) +- README sections between `` and `` are managed by `terraform-docs` +- Resource names use `${var.prefix}--${random_string.suffix.result}` pattern +- `random_string.suffix` is declared **only in the composer**, then passed to submodules via `suffix` input + +--- + +## PR 1 — Foundation + +This PR adds all new modules under `modules/gcp/` and relocates `service-account` + `unity-catalog`. No example is touched. 
The deliverable at the end of PR 1 is: a complete new module tree that passes `terraform validate` for every fixture, with no example consuming it yet. + +### Task 1: Repo scaffolding — `modules/gcp/Makefile` and `tests/` convention + +**Files:** +- Create: `modules/gcp/Makefile` + +- [ ] **Step 1: Inspect existing Makefile pattern** + +Read `modules/Makefile` and `modules/gcp-workspace-basic/Makefile` to confirm conventions. + +Run: `cat modules/Makefile modules/gcp-workspace-basic/Makefile` + +Expected: top-level discovers projects via `*/README.md`, each module Makefile invokes `terraform-docs -c ../../.terraform-docs.yml .`. + +- [ ] **Step 2: Create `modules/gcp/Makefile`** + +Write: + +```makefile +PROJECTS := $(dir $(wildcard */README.md)) + +docs: $(PROJECTS) + +$(PROJECTS): + $(MAKE) -C $@ docs + +.PHONY: $(PROJECTS) docs +``` + +- [ ] **Step 3: Update top-level `modules/Makefile` to recurse into `gcp/`** + +Read current `modules/Makefile`. It only iterates `*/README.md`. Since `modules/gcp/` has no README of its own, we add an explicit recursion. + +Edit `modules/Makefile`: + +```makefile +PROJECTS := $(dir $(wildcard */README.md)) + +docs: $(PROJECTS) gcp-recursive + +$(PROJECTS): + $(MAKE) -C $@ docs + +gcp-recursive: + $(MAKE) -C gcp docs + +.PHONY: $(PROJECTS) docs gcp-recursive +``` + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/Makefile modules/Makefile +git commit -m "$(cat <<'EOF' +build: add Makefile recursion for modules/gcp/ submodules + +Adds modules/gcp/Makefile mirroring the modules/ pattern (discover +sub-projects via */README.md) and updates modules/Makefile to recurse +into the gcp/ subdir for terraform-docs generation. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 2: `modules/gcp/network` — skeleton + variables + versions + +**Files:** +- Create: `modules/gcp/network/variables.tf` +- Create: `modules/gcp/network/main.tf` +- Create: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/versions.tf` +- Create: `modules/gcp/network/Makefile` +- Create: `modules/gcp/network/README.md` (placeholder, terraform-docs fills it) + +- [ ] **Step 1: Write `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +- [ ] **Step 2: Write `variables.tf`** + +```hcl +variable "prefix" { + type = string + description = "Prefix for generated resource names" +} + +variable "suffix" { + type = string + description = "Random suffix passed by the composer for uniqueness" +} + +variable "google_region" { + type = string + description = "GCP region for all network resources" +} + +variable "vpc_source" { + type = string + description = "Either 'create' (Terraform creates a VPC) or 'existing' (data-source lookup)" + validation { + condition = contains(["create", "existing"], var.vpc_source) + error_message = "vpc_source must be 'create' or 'existing'." 
+ } +} + +# Spoke project always required +variable "spoke_vpc_google_project" { + type = string + description = "GCP project hosting the spoke VPC" +} + +# === Used when vpc_source = "create" ==================================== +variable "spoke_vpc_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet primary range (required when vpc_source=create)" +} + +variable "subnet_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet (required when vpc_source=create)" +} + +variable "subnet_name" { + type = string + default = null + description = "Override for spoke subnet name (default: \"${prefix}-subnet-${suffix}\")" +} + +variable "pod_cidr" { + type = string + default = null + description = "GKE secondary range for pods (optional)" +} + +variable "svc_cidr" { + type = string + default = null + description = "GKE secondary range for services (optional)" +} + +# === Used when vpc_source = "existing" ================================== +variable "existing_vpc_name" { + type = string + default = null + description = "Name of pre-existing VPC (required when vpc_source=existing)" +} + +variable "existing_subnet_name" { + type = string + default = null + description = "Name of pre-existing subnet (required when vpc_source=existing)" +} + +# === Hub configuration (only when create_hub = true) ==================== +variable "create_hub" { + type = bool + default = false + description = "Create a hub VPC + subnet + peering with the spoke. Composer passes restricted_egress here." 
+} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC (required when create_hub=true)" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR for the hub subnet (required when create_hub=true)" +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false + description = "If true, bind the spoke VPC's project as a Shared-VPC host and the workspace project as a service project" +} + +variable "workspace_google_project" { + type = string + default = null + description = "Workspace project (used for Shared-VPC service binding)" +} +``` + +- [ ] **Step 3: Write empty `main.tf` and `outputs.tf`** + +`main.tf`: + +```hcl +# Resources added in Tasks 3, 4, 5 +``` + +`outputs.tf`: + +```hcl +output "spoke_vpc_id" { + value = null + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = null + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = null + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = null + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = null + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = null + description = "Self-link of the spoke subnet" +} + +output "hub_vpc_id" { + value = null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = null + description = "Name of the hub subnet (null when create_hub=false)" +} + +output "nat_id" { + value = null + description = "ID of the Cloud NAT (null when vpc_source=existing)" +} +``` + +(Outputs are wired to real resources in Tasks 3–5.) 
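+
+The "required when ..." notes in the Step 2 variable descriptions are documentation only; with `default = null`, Terraform accepts a missing value. One possible way to fail at plan time instead is sketched below. It is an assumption, not part of this plan: the resource name `input_guard` is hypothetical, and the approach uses `precondition` blocks on a `terraform_data` resource because a variable `validation` block can only reference other variables from Terraform 1.9 onward, while this plan targets `>= 1.5`.
+
+```hcl
+terraform {
+  required_version = ">= 1.5"
+}
+
+variable "vpc_source" {
+  type = string
+}
+
+variable "subnet_cidr" {
+  type    = string
+  default = null
+}
+
+variable "existing_vpc_name" {
+  type    = string
+  default = null
+}
+
+# Hypothetical guard resource: creates nothing in GCP, but its preconditions
+# are evaluated during plan and abort it when an input combination is invalid.
+resource "terraform_data" "input_guard" {
+  lifecycle {
+    precondition {
+      condition     = var.vpc_source != "create" || var.subnet_cidr != null
+      error_message = "subnet_cidr must be set when vpc_source is \"create\"."
+    }
+
+    precondition {
+      condition     = var.vpc_source != "existing" || var.existing_vpc_name != null
+      error_message = "existing_vpc_name must be set when vpc_source is \"existing\"."
+    }
+  }
+}
+```
+
+A guard like this adds one managed resource to the plan, so any fixture expectation that counts planned resources would grow by one. Attaching the same `precondition` blocks to a resource the module already creates avoids that.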
+ +- [ ] **Step 4: Write `Makefile`** + +```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +- [ ] **Step 5: Write `README.md` placeholder** + +```markdown +# modules/gcp/network + +VPC, subnet, router, NAT, peering, and Shared-VPC binding for the Databricks GCP composer. + + + +``` + +- [ ] **Step 6: Validate** + +Run: +```bash +cd modules/gcp/network && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +- [ ] **Step 7: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): scaffold module with variables and outputs + +Adds modules/gcp/network with variable declarations, empty outputs, +versions.tf, Makefile, and README placeholder. Resources to be added +in subsequent tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 3: `modules/gcp/network` — create-vpc path + fixture + +**Files:** +- Modify: `modules/gcp/network/main.tf` +- Modify: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/tests/create/main.tf` + +- [ ] **Step 1: Write the test fixture `tests/create/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} +``` + +- [ ] **Step 2: Run the fixture, expect plan to show no resources (no implementation yet)** + +```bash +cd modules/gcp/network/tests/create +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: validate passes, plan shows `No changes. 
Your infrastructure matches the configuration.` (no resources defined in module yet). + +- [ ] **Step 3: Implement the create-vpc path in `modules/gcp/network/main.tf`** + +```hcl +locals { + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + subnet_name = coalesce(var.subnet_name, "${var.prefix}-subnet-${var.suffix}") +} + +# === Spoke VPC (created) ================================================ +resource "google_compute_network" "spoke_vpc" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-spoke-vpc-${var.suffix}" + project = var.spoke_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +resource "google_compute_subnetwork" "spoke_subnet" { + count = local.create_vpc ? 1 : 0 + + name = local.subnet_name + project = var.spoke_vpc_google_project + network = google_compute_network.spoke_vpc[0].id + region = var.google_region + ip_cidr_range = var.subnet_cidr + private_ip_google_access = true + + dynamic "secondary_ip_range" { + for_each = var.pod_cidr != null ? [1] : [] + content { + range_name = "pods" + ip_cidr_range = var.pod_cidr + } + } + + dynamic "secondary_ip_range" { + for_each = var.svc_cidr != null ? [1] : [] + content { + range_name = "services" + ip_cidr_range = var.svc_cidr + } + } +} + +resource "google_compute_router" "router" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-router-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = google_compute_network.spoke_vpc[0].id +} + +resource "google_compute_router_nat" "nat" { + count = local.create_vpc ? 
1 : 0 + + name = "${var.prefix}-nat-${var.suffix}" + project = var.spoke_vpc_google_project + router = google_compute_router.router[0].name + region = var.google_region + nat_ip_allocate_option = "AUTO_ONLY" + source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" +} +``` + +- [ ] **Step 4: Wire outputs in `outputs.tf`** + +Replace the `null` placeholders: + +```hcl +output "spoke_vpc_id" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].id : null + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].name : null + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : null + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].id : null + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].name : null + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].self_link : null + description = "Self-link of the spoke subnet" +} + +output "nat_id" { + value = local.create_vpc ? google_compute_router_nat.nat[0].id : null + description = "ID of the Cloud NAT (null when vpc_source=existing)" +} + +# hub_* outputs still null at this point; updated in Task 5. 
+output "hub_vpc_id" { + value = null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = null + description = "Name of the hub subnet (null when create_hub=false)" +} +``` + +- [ ] **Step 5: Re-run fixture and verify resource count** + +```bash +cd modules/gcp/network/tests/create +terraform plan -refresh=false +``` + +Expected: `Plan: 4 to add, 0 to change, 0 to destroy.` (network + subnet + router + nat). + +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): implement create-vpc path + +Adds google_compute_network/subnetwork/router/router_nat resources +gated on vpc_source="create". Outputs wired to real resources. +Fixture in tests/create/ asserts 4 resources are planned. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 4: `modules/gcp/network` — existing-vpc path + fixture + +**Files:** +- Modify: `modules/gcp/network/main.tf` +- Modify: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/tests/existing/main.tf` + +- [ ] **Step 1: Write fixture `tests/existing/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "existing" + spoke_vpc_google_project = "fixture-project" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} +``` + +- [ ] **Step 2: Add data sources to `main.tf`** + +Append to `modules/gcp/network/main.tf`: + +```hcl +# === Spoke VPC (data lookup) ============================================ +data "google_compute_network" "existing_spoke" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_vpc_name + project = var.spoke_vpc_google_project +} + +data "google_compute_subnetwork" "existing_spoke_subnet" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_subnet_name + project = var.spoke_vpc_google_project + region = var.google_region +} +``` + +- [ ] **Step 3: Update outputs to merge create and existing paths** + +In `outputs.tf` replace the six spoke outputs: + +```hcl +# Each conditional is wrapped in parentheses so that it can span multiple lines. +output "spoke_vpc_id" { + value = ( + local.create_vpc ? google_compute_network.spoke_vpc[0].id : + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].id : null + ) + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = ( + local.create_vpc ? google_compute_network.spoke_vpc[0].name : + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].name : null + ) + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = ( + local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].self_link : null + ) + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = ( + local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].id : + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].id : null + ) + description = "ID of the spoke subnet" +} + 
+output "spoke_subnet_name" { + value = ( + local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].name : + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].name : null + ) + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = ( + local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].self_link : + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].self_link : null + ) + description = "Self-link of the spoke subnet" +} +``` + +- [ ] **Step 4: Run fixture, expect plan with zero resources (data sources only)** + +```bash +cd modules/gcp/network/tests/existing +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: validate passes; plan shows `No changes` (data sources are not counted as planned resources, so the pass criterion here is `validate` succeeding without errors). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): implement existing-vpc path + +Adds data.google_compute_network and data.google_compute_subnetwork +lookups gated on vpc_source="existing". Spoke outputs now resolve +from either created resources or data sources. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 5: `modules/gcp/network` — hub & peering & shared-VPC + fixture + +**Files:** +- Modify: `modules/gcp/network/main.tf` +- Modify: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/tests/create-with-hub/main.tf` + +- [ ] **Step 1: Write fixture `tests/create-with-hub/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-spoke-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + create_hub = true + hub_vpc_google_project = "fixture-hub-project" + hub_vpc_cidr = "10.1.0.0/24" + is_spoke_vpc_shared = true + workspace_google_project = "fixture-workspace-project" +} +``` + +- [ ] **Step 2: Append hub + peering + shared-VPC resources to `main.tf`** + +```hcl +# === Hub VPC ============================================================ +resource "google_compute_network" "hub_vpc" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-vpc-${var.suffix}" + project = var.hub_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +resource "google_compute_subnetwork" "hub_subnet" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-subnet-${var.suffix}" + project = var.hub_vpc_google_project + network = google_compute_network.hub_vpc[0].id + region = var.google_region + ip_cidr_range = var.hub_vpc_cidr + private_ip_google_access = true +} + +# === Peering ============================================================ +resource "google_compute_network_peering" "hub_to_spoke" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-spoke-${var.suffix}" + network = google_compute_network.hub_vpc[0].self_link + peer_network = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link +} + +resource "google_compute_network_peering" "spoke_to_hub" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-spoke-hub-${var.suffix}" + network = local.create_vpc ? 
google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link + peer_network = google_compute_network.hub_vpc[0].self_link +} + +# === Shared VPC ========================================================= +resource "google_compute_shared_vpc_host_project" "host" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + project = var.spoke_vpc_google_project +} + +resource "google_compute_shared_vpc_service_project" "service" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + host_project = google_compute_shared_vpc_host_project.host[0].project + service_project = var.workspace_google_project +} +``` + +- [ ] **Step 3: Wire hub outputs in `outputs.tf`** + +```hcl +output "hub_vpc_id" { + value = var.create_hub ? google_compute_network.hub_vpc[0].id : null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = var.create_hub ? google_compute_network.hub_vpc[0].name : null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = var.create_hub ? google_compute_network.hub_vpc[0].self_link : null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = var.create_hub ? google_compute_subnetwork.hub_subnet[0].name : null + description = "Name of the hub subnet (null when create_hub=false)" +} +``` + +- [ ] **Step 4: Validate and plan** + +```bash +cd modules/gcp/network/tests/create-with-hub +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: `Plan: 8 to add, 0 to change, 0 to destroy.` (4 spoke + 2 hub + 2 peering + 2 shared-vpc — wait, 4+2+2+2=10. Let me recount: spoke_vpc, spoke_subnet, router, nat = 4. hub_vpc, hub_subnet = 2. 
hub_to_spoke peering, spoke_to_hub peering = 2. shared_vpc_host, shared_vpc_service = 2. Total = 10). + +Expected: `Plan: 10 to add`. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): add hub VPC, peering, and Shared-VPC binding + +Adds hub VPC + subnet + bidirectional peering with spoke + optional +shared-VPC host/service binding, all gated on create_hub. Composer +passes restricted_egress -> create_hub. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 6: `modules/gcp/private-connectivity` — scaffold + locals (regional maps) + +**Files:** +- Create: `modules/gcp/private-connectivity/versions.tf` +- Create: `modules/gcp/private-connectivity/variables.tf` +- Create: `modules/gcp/private-connectivity/locals.tf` +- Create: `modules/gcp/private-connectivity/outputs.tf` +- Create: `modules/gcp/private-connectivity/psc.tf` (empty) +- Create: `modules/gcp/private-connectivity/firewall.tf` (empty) +- Create: `modules/gcp/private-connectivity/Makefile` +- Create: `modules/gcp/private-connectivity/README.md` placeholder + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "prefix" { type = string } +variable "suffix" { type = string } +variable "google_region" { type = string } + +# Spoke network refs +variable "spoke_vpc_id" { type = string } +variable "spoke_vpc_self_link" { type = string } +variable "spoke_vpc_google_project" { type = string } +variable "spoke_vpc_cidr" { type = string } + +# Hub network refs (nullable when no hub) +variable "hub_vpc_id" { type = string default = null } +variable "hub_vpc_self_link" { type = string default = null } +variable "hub_vpc_google_project" { type = string default = null } +variable "hub_subnet_name" { type = string default = null } +variable "hub_vpc_cidr" { 
+ type = string + default = null +} + +# Feature flags +variable "enable_frontend" { + type = bool + default = false +} +variable "enable_backend" { + type = bool + default = false +} +variable "restrict_egress" { + type = bool + default = false +} + +# PSC subnet CIDR (always required when this module is invoked because +# the composer only instantiates it when at least one PSC flag is true) +variable "psc_subnet_cidr" { + type = string + description = "CIDR for the dedicated PSC subnet in the spoke VPC" +} + +# Optional hive metastore IP override; falls back to regional map +variable "hive_metastore_ip" { + type = string + default = null + description = "Regional Hive metastore IP (looked up via internal map if null)" +} +``` + +- [ ] **Step 3: `locals.tf` — regional PSC service-attachment + hive metastore maps** + +Copy the maps verbatim from `modules/gcp-with-psc-exfiltration-protection/main.tf`. The plan reproduces them in full so that future region additions land in one place. + +```hcl +locals { + google_frontend_psc_targets = { + "asia-northeast1" = "projects/general-prod-asianortheast1-01/regions/asia-northeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "asia-south1" = "projects/gen-prod-asias1-01/regions/asia-south1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "asia-southeast1" = "projects/general-prod-asiasoutheast1-01/regions/asia-southeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "australia-southeast1" = "projects/general-prod-ausoutheast1-01/regions/australia-southeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "europe-west1" = "projects/general-prod-europewest1-01/regions/europe-west1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "europe-west2" = "projects/general-prod-europewest2-01/regions/europe-west2/serviceAttachments/plproxy-psc-endpoint-all-ports" + "europe-west3" = "projects/general-prod-europewest3-01/regions/europe-west3/serviceAttachments/plproxy-psc-endpoint-all-ports" + 
"northamerica-northeast1" = "projects/general-prod-nanortheast1-01/regions/northamerica-northeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "southamerica-east1" = "projects/gen-prod-saeast1-01/regions/southamerica-east1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-central1" = "projects/gcp-prod-general/regions/us-central1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-east1" = "projects/general-prod-useast1-01/regions/us-east1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-east4" = "projects/general-prod-useast4-01/regions/us-east4/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-west1" = "projects/general-prod-uswest1-01/regions/us-west1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-west4" = "projects/general-prod-uswest4-01/regions/us-west4/serviceAttachments/plproxy-psc-endpoint-all-ports" + } + + google_backend_psc_targets = { + "asia-northeast1" = "projects/prod-gcp-asia-northeast1/regions/asia-northeast1/serviceAttachments/ngrok-psc-endpoint" + "asia-south1" = "projects/prod-gcp-asia-south1/regions/asia-south1/serviceAttachments/ngrok-psc-endpoint" + "asia-southeast1" = "projects/prod-gcp-asia-southeast1/regions/asia-southeast1/serviceAttachments/ngrok-psc-endpoint" + "australia-southeast1" = "projects/prod-gcp-australia-southeast1/regions/australia-southeast1/serviceAttachments/ngrok-psc-endpoint" + "europe-west1" = "projects/prod-gcp-europe-west1/regions/europe-west1/serviceAttachments/ngrok-psc-endpoint" + "europe-west2" = "projects/prod-gcp-europe-west2/regions/europe-west2/serviceAttachments/ngrok-psc-endpoint" + "europe-west3" = "projects/prod-gcp-europe-west3/regions/europe-west3/serviceAttachments/ngrok-psc-endpoint" + "northamerica-northeast1" = "projects/prod-gcp-na-northeast1/regions/northamerica-northeast1/serviceAttachments/ngrok-psc-endpoint" + "southamerica-east1" = "projects/gen-prod-saeast1-01/regions/southamerica-east1/serviceAttachments/ngrok-psc-endpoint" + "us-central1" = 
"projects/prod-gcp-us-central1/regions/us-central1/serviceAttachments/ngrok-psc-endpoint" + "us-east1" = "projects/prod-gcp-us-east1/regions/us-east1/serviceAttachments/ngrok-psc-endpoint" + "us-east4" = "projects/prod-gcp-us-east4/regions/us-east4/serviceAttachments/ngrok-psc-endpoint" + "us-west1" = "projects/prod-gcp-us-west1/regions/us-west1/serviceAttachments/ngrok-psc-endpoint" + "us-west4" = "projects/prod-gcp-us-west4/regions/us-west4/serviceAttachments/ngrok-psc-endpoint" + } + + # Regional default Hive Metastore IPs per Databricks docs: + # https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore + # NOTE: keep this list curated. When no IP resolves (empty string), the firewall rule omits the + # managed-hive allowance (acceptable when customers run their own metastore). + default_hive_metastore_ips = { + # Filled in by ops; leave empty initially. Override via var.hive_metastore_ip. + } + + # coalesce() raises an error when every argument is null or "", so use an explicit conditional + # that can fall through to "" when neither the override nor the regional map has a value. + hive_metastore_ip = var.hive_metastore_ip != null ? var.hive_metastore_ip : try(local.default_hive_metastore_ips[var.google_region], "") + + hub_present = var.hub_vpc_id != null +} +``` + +- [ ] **Step 4: Empty `psc.tf`, `firewall.tf`, `Makefile`, `README.md`, `outputs.tf`** + +`outputs.tf`: + +```hcl +# Placeholder outputs; multi-line because a single-line HCL block may hold only one argument. +output "psc_subnet_self_link" { + value = null + description = "Self-link of the PSC subnet" +} +output "frontend_psc_fr_id" { + value = null + description = "Name of the frontend PSC forwarding rule (null when enable_frontend=false)" +} +output "backend_psc_fr_id" { + value = null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} +output "hub_frontend_psc_fr_id" { + value = null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} +output "frontend_psc_ip_spoke" { + value = null + description = "IP address of the spoke-side frontend PSC endpoint" +} +output "backend_psc_ip_spoke" { + value = null + description = "IP address of the spoke-side backend PSC endpoint" +} +output "frontend_psc_ip_hub" { + value = null + description = "IP address of the hub-side frontend PSC endpoint (null when no hub)" +} +``` + +`psc.tf` and `firewall.tf` are empty for now (filled in next tasks). + +`Makefile`: + +```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +`README.md`: + +```markdown +# modules/gcp/private-connectivity + +GCP-side PSC endpoints + restricted-egress firewall for the Databricks GCP composer. + + + +``` + +- [ ] **Step 5: Validate** + +```bash +cd modules/gcp/private-connectivity && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/private-connectivity/ +git commit -m "$(cat <<'EOF' +feat(gcp/private-connectivity): scaffold with regional PSC maps + +Adds modules/gcp/private-connectivity with variables, regional PSC +service-attachment + hive metastore maps in locals.tf, empty psc.tf +and firewall.tf, null outputs. Resources added in follow-up tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 7: `modules/gcp/private-connectivity` — PSC subnet + endpoints + fixture + +**Files:** +- Modify: `modules/gcp/private-connectivity/psc.tf` +- Modify: `modules/gcp/private-connectivity/outputs.tf` +- Create: `modules/gcp/private-connectivity/tests/full-isolated/main.tf` + +- [ ] **Step 1: Write fixture `tests/full-isolated/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "pc" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + spoke_vpc_cidr = "10.0.0.0/16" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + hub_subnet_name = "fixture-hub-subnet-abc123" + hub_vpc_cidr = "10.1.0.0/24" + + enable_frontend = true + enable_backend = true + restrict_egress = true + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +- [ ] **Step 2: Implement `psc.tf`** + +```hcl +# === PSC Subnet (spoke) ================================================= +resource "google_compute_subnetwork" "psc_subnet" { + name = "${var.prefix}-psc-subnet-${var.suffix}" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_id + region = var.google_region + ip_cidr_range = var.psc_subnet_cidr + private_ip_google_access = true +} + +# === Backend (SCC) PSC endpoint — spoke ================================= +resource "google_compute_address" "backend_address" { + count = var.enable_backend ? 1 : 0 + + name = "${var.prefix}-psc-scc-ip-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + subnetwork = google_compute_subnetwork.psc_subnet.name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "backend_fr" { + count = var.enable_backend ? 
1 : 0 + + name = "${var.prefix}-psc-scc-ep-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = var.spoke_vpc_id + ip_address = google_compute_address.backend_address[0].id + target = local.google_backend_psc_targets[var.google_region] + load_balancing_scheme = "" +} + +# === Frontend PSC endpoint — spoke ====================================== +resource "google_compute_address" "frontend_address_spoke" { + count = var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-psc-ws-ip-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + subnetwork = google_compute_subnetwork.psc_subnet.name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "frontend_fr_spoke" { + count = var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-psc-ws-ep-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = var.spoke_vpc_id + ip_address = google_compute_address.frontend_address_spoke[0].id + target = local.google_frontend_psc_targets[var.google_region] + load_balancing_scheme = "" +} + +# === Frontend PSC endpoint — hub (transit) ============================== +resource "google_compute_address" "frontend_address_hub" { + count = local.hub_present && var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-hub-psc-ws-ip-${var.suffix}" + project = var.hub_vpc_google_project + region = var.google_region + subnetwork = var.hub_subnet_name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "frontend_fr_hub" { + count = local.hub_present && var.enable_frontend ? 
1 : 0 + + name = "${var.prefix}-hub-psc-ws-ep-${var.suffix}" + project = var.hub_vpc_google_project + region = var.google_region + network = var.hub_vpc_id + ip_address = google_compute_address.frontend_address_hub[0].id + target = local.google_frontend_psc_targets[var.google_region] + load_balancing_scheme = "" +} +``` + +- [ ] **Step 3: Wire PSC outputs in `outputs.tf`** + +```hcl +output "psc_subnet_self_link" { + value = google_compute_subnetwork.psc_subnet.self_link + description = "Self-link of the PSC subnet" +} + +output "frontend_psc_fr_id" { + value = var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_spoke[0].name : null + description = "Name of the frontend PSC forwarding rule (null when enable_frontend=false)" +} + +output "backend_psc_fr_id" { + value = var.enable_backend ? google_compute_forwarding_rule.backend_fr[0].name : null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} + +output "hub_frontend_psc_fr_id" { + value = local.hub_present && var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_hub[0].name : null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} + +output "frontend_psc_ip_spoke" { + value = var.enable_frontend ? google_compute_address.frontend_address_spoke[0].address : null + description = "IP address of the spoke-side frontend PSC endpoint" +} + +output "backend_psc_ip_spoke" { + value = var.enable_backend ? google_compute_address.backend_address[0].address : null + description = "IP address of the spoke-side backend PSC endpoint" +} + +output "frontend_psc_ip_hub" { + value = local.hub_present && var.enable_frontend ? 
google_compute_address.frontend_address_hub[0].address : null + description = "IP address of the hub-side frontend PSC endpoint (null when no hub)" +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/private-connectivity/tests/full-isolated +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: `Plan: 7 to add` (PSC subnet + 2 addresses + 2 forwarding rules for spoke + 1 address + 1 forwarding rule for hub). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/private-connectivity/ +git commit -m "$(cat <<'EOF' +feat(gcp/private-connectivity): add PSC subnet, addresses, forwarding rules + +PSC subnet (spoke); backend (SCC) endpoint gated on enable_backend; +frontend endpoint (spoke) gated on enable_frontend; frontend endpoint +(hub) gated on hub_present AND enable_frontend. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 8: `modules/gcp/private-connectivity` — egress firewall rules + fixture + +**Files:** +- Modify: `modules/gcp/private-connectivity/firewall.tf` +- Modify: `modules/gcp/private-connectivity/tests/full-isolated/main.tf` (no change to assertions; the plan resource count grows) +- Create: `modules/gcp/private-connectivity/tests/no-egress/main.tf` (variant with `restrict_egress = false`) + +- [ ] **Step 1: Write `firewall.tf`** + +```hcl +# Egress firewall stack — only emitted when restrict_egress = true. +# Names follow the existing pattern from modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf +# and firewall-hub.tf to keep operator familiarity. + +# === Spoke deny-egress ================================================== +resource "google_compute_firewall" "spoke_default_deny_egress" { + count = var.restrict_egress ? 
1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-default-deny-egress" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1100 + destination_ranges = ["0.0.0.0/0"] + source_ranges = [] + + deny { + protocol = "all" + } +} + +# === Spoke allow Google APIs ============================================ +resource "google_compute_firewall" "spoke_allow_google_apis" { + count = var.restrict_egress ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-google-apis" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "199.36.153.4/30", + "199.36.153.8/30", + "34.126.0.0/18" + ] + + allow { + protocol = "all" + } +} + +# === Spoke allow Databricks control plane (to PSC IPs) ================== +resource "google_compute_firewall" "spoke_allow_ctl_plane" { + count = var.restrict_egress && var.enable_frontend && var.enable_backend ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-databricks-control-plane" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "${google_compute_forwarding_rule.backend_fr[0].ip_address}/32", + "${google_compute_forwarding_rule.frontend_fr_spoke[0].ip_address}/32" + ] + + allow { + protocol = "tcp" + ports = ["443"] + } +} + +# === Spoke allow managed Hive (conditional on hive_metastore_ip) ======== +resource "google_compute_firewall" "spoke_allow_hive" { + count = var.restrict_egress && local.hive_metastore_ip != "" ? 
1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-${var.google_region}-managed-hive" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = ["${local.hive_metastore_ip}/32"] + + allow { + protocol = "tcp" + ports = ["3306"] + } +} + +# === Hub ingress from spoke ============================================= +resource "google_compute_firewall" "hub_ingress" { + count = var.restrict_egress && local.hub_present ? 1 : 0 + + name = "${var.prefix}-hub-${var.suffix}-ingress" + project = var.hub_vpc_google_project + network = var.hub_vpc_self_link + + direction = "INGRESS" + priority = 1000 + destination_ranges = [] + source_ranges = [var.spoke_vpc_cidr] + + allow { + protocol = "all" + } +} +``` + +- [ ] **Step 2: Write `tests/no-egress/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "pc" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + spoke_vpc_cidr = "10.0.0.0/16" + + enable_frontend = true + enable_backend = false + restrict_egress = false + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +- [ ] **Step 3: Validate both fixtures** + +```bash +cd modules/gcp/private-connectivity/tests/full-isolated && terraform init -backend=false && terraform validate && terraform plan -refresh=false +``` + +Expected: `Plan: 11 to add` (7 from Task 7 + 4 firewall rules: deny + google-apis + ctl-plane + hub-ingress; the hive rule is not emitted because `hive_metastore_ip` is unset in the fixture. If `hive_metastore_ip` is set in the fixture, expect 12). 
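
The resource-count expectation above can be checked mechanically instead of by eye. The following is a minimal sketch, not part of the plan itself: `plan_add_count` is a hypothetical helper name, and it assumes the summary line has the usual `Plan: N to add, M to change, K to destroy.` shape that `terraform plan -no-color` prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract N from a Terraform plan summary line such as
# "Plan: 11 to add, 0 to change, 0 to destroy."
# Prints nothing when the input has no such line (e.g. "No changes.").
plan_add_count() {
  printf '%s\n' "$1" | sed -n 's/^Plan: \([0-9][0-9]*\) to add.*/\1/p' | head -n 1
}

# Assumed usage from inside a fixture directory (not run here):
#   summary="$(terraform plan -refresh=false -no-color | grep '^Plan:')"
#   [ "$(plan_add_count "$summary")" = "11" ] || echo "unexpected resource count" >&2
```

For the `full-isolated` fixture with no `hive_metastore_ip` set, the value to compare against is 11 (7 PSC resources + 4 firewall rules); for `no-egress` it is 3.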
+ +Note: the fixture leaves `hive_metastore_ip` unset and the `default_hive_metastore_ips` map is empty, so `local.hive_metastore_ip = ""` and the hive firewall rule is NOT emitted. The fixture should therefore plan 7 PSC + 4 firewall = 11 resources. + +```bash +cd ../no-egress && terraform init -backend=false && terraform validate && terraform plan -refresh=false +``` + +Expected: `Plan: 3 to add` (PSC subnet + 1 frontend address + 1 frontend forwarding rule). + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/private-connectivity/ +git commit -m "$(cat <<'EOF' +feat(gcp/private-connectivity): add egress firewall stack + +Spoke deny-egress (priority 1100), allow-google-apis, allow control +plane (to PSC IPs), allow managed-hive (conditional on metastore IP), +and hub ingress from spoke CIDR. All gated on restrict_egress. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 9: `modules/gcp/account` — scaffold + variables + versions + +**Files:** +- Create: `modules/gcp/account/versions.tf` +- Create: `modules/gcp/account/variables.tf` +- Create: `modules/gcp/account/main.tf` +- Create: `modules/gcp/account/vpc-endpoints.tf` +- Create: `modules/gcp/account/pas.tf` +- Create: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/Makefile` +- Create: `modules/gcp/account/README.md` + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +# NOTE: blocks with more than one argument are multi-line; HCL single-line blocks allow only one. +variable "prefix" { type = string } +variable "suffix" { type = string } +variable "workspace_name" { + type = string + default = null +} +variable "databricks_account_id" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } + +variable "vpc_source" { + type = string + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one 
of: databricks_managed, create, existing." + } +} + +variable "spoke_vpc_name" { + type = string + default = null +} +variable "spoke_subnet_name" { + type = string + default = null +} +variable "spoke_vpc_google_project" { + type = string + default = null +} +variable "hub_vpc_google_project" { + type = string + default = null +} + +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_psc_fr_id" { + type = string + default = null +} +variable "backend_psc_fr_id" { + type = string + default = null +} +variable "hub_frontend_psc_fr_id" { + type = string + default = null +} + +variable "enable_frontend" { + type = bool + default = false +} +variable "enable_backend" { + type = bool + default = false +} +variable "private_access_only" { + type = bool + default = false +} + +variable "nat_dependency" { + type = any + default = null + description = "Opaque value used as depends_on for the workspace to ensure NAT readiness" +} +``` + +- [ ] **Step 3: Empty `main.tf`, `vpc-endpoints.tf`, `pas.tf`, `outputs.tf` placeholders** + +`main.tf`: + +```hcl +locals { + workspace_name = coalesce(var.workspace_name, "${var.prefix}-ws-${var.suffix}") + emit_mws_networks = var.vpc_source != "databricks_managed" + emit_vpc_endpoints = var.frontend_psc_fr_id != null && var.backend_psc_fr_id != null + emit_pas = var.private_access_only +} +``` + +`outputs.tf`: + +```hcl +output "workspace_id" { + value = null + description = "Databricks workspace ID" +} +output "workspace_url" { + value = null + description = "Databricks workspace URL" +} +output "network_id" { + value = null + description = "mws_networks ID (null when databricks_managed)" +} +output "frontend_endpoint_id" { + value = null + description = "Frontend mws_vpc_endpoint ID (null when no PSC)" +} +output "backend_endpoint_id" { + value = null + description = "Backend mws_vpc_endpoint ID (null when no PSC)" +} +output "transit_endpoint_id" { + value = null + description = "Hub-side mws_vpc_endpoint ID (null when no hub)" +} +``` + +`Makefile`: + 
+```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +`README.md`: + +```markdown +# modules/gcp/account + +All `databricks_mws_*` resources for the GCP composer: `mws_networks`, `mws_workspaces`, `mws_vpc_endpoint`, `mws_private_access_settings`. + + + +``` + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/account && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): scaffold module + +Adds modules/gcp/account with variable declarations, locals for +derived flags, empty main.tf/vpc-endpoints.tf/pas.tf, null outputs. +Resources added in follow-up tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 10: `modules/gcp/account` — databricks-managed workspace shape + fixture + +**Files:** +- Modify: `modules/gcp/account/main.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/tests/databricks-managed/main.tf` + +- [ ] **Step 1: Write fixture** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "databricks_managed" +} +``` + +- [ ] **Step 2: Add `databricks_mws_workspaces` to `main.tf`** + +Append to `modules/gcp/account/main.tf`: + +```hcl +resource "databricks_mws_workspaces" "this" { + account_id = var.databricks_account_id + workspace_name = local.workspace_name + location = var.google_region + + cloud_resource_container { + gcp { + project_id = var.google_project + } + } + + network_id = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + private_access_settings_id = local.emit_pas ? databricks_mws_private_access_settings.this[0].private_access_settings_id : null + + token { + comment = "Terraform" + } + + depends_on = [var.nat_dependency] +} +``` + +- [ ] **Step 3: Wire workspace outputs** + +```hcl +output "workspace_id" { + value = databricks_mws_workspaces.this.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = databricks_mws_workspaces.this.workspace_url + description = "Databricks workspace URL" +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/account/tests/databricks-managed +terraform init -backend=false +terraform validate +``` + +Expected: validate passes once `databricks_mws_networks.this` (Task 11) and `databricks_mws_private_access_settings.this` (Task 13) are declared; until then `terraform validate` reports `Reference to undeclared resource`, because references are checked statically even in the untaken branch of a conditional. (Plan cannot run without real Databricks credentials; validate is sufficient for this fixture.) + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add databricks_mws_workspaces resource + +Workspace resource with conditional network_id and +private_access_settings_id (both null when databricks_managed). 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 11: `modules/gcp/account` — mws_networks (customer VPC) + fixture + +**Files:** +- Modify: `modules/gcp/account/main.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/tests/byovpc/main.tf` + +- [ ] **Step 1: Write fixture** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" +} +``` + +- [ ] **Step 2: Append `databricks_mws_networks` to `main.tf`** + +```hcl +resource "databricks_mws_networks" "this" { + count = local.emit_mws_networks ? 1 : 0 + + account_id = var.databricks_account_id + network_name = "${var.prefix}-ntw-${var.suffix}" + + gcp_network_info { + network_project_id = var.spoke_vpc_google_project + vpc_id = var.spoke_vpc_name + subnet_id = var.spoke_subnet_name + subnet_region = var.google_region + } + + dynamic "vpc_endpoints" { + for_each = local.emit_vpc_endpoints ? [1] : [] + content { + dataplane_relay = [databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id] + rest_api = [databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id] + } + } +} +``` + +- [ ] **Step 3: Wire `network_id` output** + +```hcl +output "network_id" { + value = local.emit_mws_networks ? 
databricks_mws_networks.this[0].network_id : null + description = "mws_networks ID (null when databricks_managed)" +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/account/tests/byovpc +terraform init -backend=false +terraform validate +``` + +Expected: validate passes once `databricks_mws_vpc_endpoint.backend` and `databricks_mws_vpc_endpoint.frontend` are declared (Task 12). `terraform validate` checks references statically, so until then it reports `Reference to undeclared resource` even though the references sit inside a `dynamic` block; the `for_each` guard only keeps the `[0]` index from being evaluated when `emit_vpc_endpoints` is false. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add databricks_mws_networks for customer VPC + +mws_networks emitted when vpc_source != databricks_managed; the +vpc_endpoints block is conditionally populated via dynamic when both +frontend and backend forwarding-rule IDs are provided. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 12: `modules/gcp/account` — mws_vpc_endpoint resources + fixture + +**Files:** +- Modify: `modules/gcp/account/vpc-endpoints.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/tests/psc-with-pas/main.tf` + +- [ ] **Step 1: Write `vpc-endpoints.tf`** + +```hcl +resource "databricks_mws_vpc_endpoint" "frontend" { + count = var.enable_frontend && var.frontend_psc_fr_id != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-ws-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.frontend_psc_fr_id + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "backend" { + count = var.enable_backend && var.backend_psc_fr_id != null ? 
1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-scc-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.backend_psc_fr_id + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "transit" { + count = var.enable_frontend && var.hub_frontend_psc_fr_id != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-hub-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.hub_vpc_google_project + psc_endpoint_name = var.hub_frontend_psc_fr_id + endpoint_region = var.google_region + } +} +``` + +- [ ] **Step 2: Wire endpoint outputs** + +```hcl +output "frontend_endpoint_id" { + value = var.enable_frontend && var.frontend_psc_fr_id != null ? databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id : null + description = "Frontend mws_vpc_endpoint ID (null when no PSC)" +} + +output "backend_endpoint_id" { + value = var.enable_backend && var.backend_psc_fr_id != null ? databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id : null + description = "Backend mws_vpc_endpoint ID (null when no PSC)" +} + +output "transit_endpoint_id" { + value = var.enable_frontend && var.hub_frontend_psc_fr_id != null ? databricks_mws_vpc_endpoint.transit[0].vpc_endpoint_id : null + description = "Hub-side mws_vpc_endpoint ID (null when no hub)" +} +``` + +- [ ] **Step 3: Write fixture `tests/psc-with-pas/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + + frontend_psc_fr_id = "fixture-psc-ws-ep-abc123" + backend_psc_fr_id = "fixture-psc-scc-ep-abc123" + hub_frontend_psc_fr_id = "fixture-hub-psc-ws-ep-abc123" + + enable_frontend = true + enable_backend = true + private_access_only = true +} +``` + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/account/tests/psc-with-pas && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add databricks_mws_vpc_endpoint resources + +Frontend, backend (SCC), and transit (hub) mws_vpc_endpoints, each +gated on its enable_* flag and the presence of the corresponding +forwarding-rule name from private-connectivity. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 13: `modules/gcp/account` — private access settings + +**Files:** +- Modify: `modules/gcp/account/pas.tf` + +- [ ] **Step 1: Write `pas.tf`** + +```hcl +resource "databricks_mws_private_access_settings" "this" { + count = local.emit_pas ? 1 : 0 + + account_id = var.databricks_account_id + private_access_settings_name = "${var.prefix}-pas-${var.suffix}" + region = var.google_region + public_access_enabled = false + private_access_level = "ACCOUNT" +} +``` + +- [ ] **Step 2: Validate (reuse `tests/psc-with-pas` fixture)** + +```bash +cd modules/gcp/account/tests/psc-with-pas && terraform validate +``` + +Expected: validate passes. 
+ +- [ ] **Step 3: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add mws_private_access_settings + +Emitted when private_access_only=true; public_access_enabled=false, +private_access_level=ACCOUNT. Workspace references via +private_access_settings_id. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 14: `modules/gcp/dns` — scaffold + variables + +**Files:** +- Create: `modules/gcp/dns/versions.tf` +- Create: `modules/gcp/dns/variables.tf` +- Create: `modules/gcp/dns/hub.tf` (empty) +- Create: `modules/gcp/dns/spoke.tf` (empty) +- Create: `modules/gcp/dns/outputs.tf` +- Create: `modules/gcp/dns/Makefile` +- Create: `modules/gcp/dns/README.md` + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "prefix" { type = string } +variable "google_region" { type = string } + +# Hub +variable "hub_vpc_id" { type = string } +variable "hub_vpc_self_link" { type = string } +variable "hub_vpc_google_project" { type = string } + +# Spoke +variable "spoke_vpc_id" { type = string } +variable "spoke_vpc_self_link" { type = string } +variable "spoke_vpc_google_project" { type = string } + +# Workspace +variable "workspace_url" { type = string } + +# PSC IPs +variable "frontend_psc_ip_spoke" { type = string } +# Multi-line because a single-line HCL block may hold only one argument. +variable "frontend_psc_ip_hub" { + type = string + default = null +} +variable "backend_psc_ip_spoke" { type = string } +``` + +- [ ] **Step 3: Empty `outputs.tf`, `Makefile`, `README.md`** + +`outputs.tf`: + +```hcl +# This module has no outputs; DNS records are terminal. +``` + +`Makefile`: same template as Task 6. + +`README.md`: + +```markdown +# modules/gcp/dns + +Private DNS zones (hub + spoke) used with restricted-egress workspaces. 
+ +<!-- BEGIN_TF_DOCS --> +<!-- END_TF_DOCS --> +``` + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/dns && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/dns/ +git commit -m "$(cat <<'EOF' +feat(gcp/dns): scaffold module with variables + +Variable declarations for hub + spoke DNS zones. Resources added in +follow-up task. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 15: `modules/gcp/dns` — hub + spoke zones and records + fixture + +**Files:** +- Modify: `modules/gcp/dns/hub.tf` +- Modify: `modules/gcp/dns/spoke.tf` +- Create: `modules/gcp/dns/tests/hub-and-spoke/main.tf` + +- [ ] **Step 1: Write fixture** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "dns" { + source = "../.." + + prefix = "fixture" + google_region = "us-central1" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + + workspace_url = "https://1234567890123456.7.gcp.databricks.com" + + frontend_psc_ip_spoke = "10.0.255.4" + frontend_psc_ip_hub = "10.1.0.10" + backend_psc_ip_spoke = "10.0.255.5" +} +``` + +- [ ] **Step 2: Write `hub.tf`** + +```hcl +locals { + # Regex extracts the workspace DNS id (numeric.numeric) from the URL. + # Matches the behavior of the legacy gcp-with-psc-exfiltration-protection module.
+ workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url) +} + +# === gcp.databricks.com (hub) ============================================ +resource "google_dns_managed_zone" "hub_dbx" { + name = "${var.prefix}-hub-gcp-databricks-com" + project = var.hub_vpc_google_project + dns_name = "gcp.databricks.com." + description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "hub_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_psc_auth" { + name = "${var.google_region}.psc-auth.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +# === gcr.io ============================================================== +resource "google_dns_managed_zone" "gcr" { + name = "${var.prefix}-gcr-io" + project = var.hub_vpc_google_project + dns_name = "gcr.io." 
+ description = "Private DNS zone for GCR private resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "gcr_cname" { + name = "*.${google_dns_managed_zone.gcr.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "CNAME" + ttl = 300 + rrdatas = ["gcr.io."] +} + +resource "google_dns_record_set" "gcr_a" { + name = google_dns_managed_zone.gcr.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} + +# === googleapis.com ====================================================== +resource "google_dns_managed_zone" "google_apis" { + name = "${var.prefix}-google-apis" + project = var.hub_vpc_google_project + dns_name = "googleapis.com." + description = "Private DNS zone for Google APIs resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "google_apis_cname" { + name = "*.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "CNAME" + ttl = 300 + rrdatas = ["restricted.googleapis.com."] +} + +resource "google_dns_record_set" "google_apis_a" { + name = "restricted.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7"] +} + +# === pkg.dev ============================================================= +resource "google_dns_managed_zone" "pkg_dev" { + name = "${var.prefix}-pkg-dev" + project = var.hub_vpc_google_project + dns_name = "pkg.dev." 
+ description = "Private DNS zone for Go Packages resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "pkg_dev_cname" { + name = "*.${google_dns_managed_zone.pkg_dev.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.pkg_dev.name + type = "CNAME" + ttl = 300 + rrdatas = ["pkg.dev."] +} + +resource "google_dns_record_set" "pkg_dev_a" { + name = google_dns_managed_zone.pkg_dev.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.pkg_dev.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} +``` + +- [ ] **Step 3: Write `spoke.tf`** + +```hcl +# === gcp.databricks.com (spoke) ========================================== +resource "google_dns_managed_zone" "spoke_dbx" { + name = "${var.prefix}-spoke-gcp-databricks-com" + project = var.spoke_vpc_google_project + dns_name = "gcp.databricks.com." 
+ description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.spoke_vpc_id + } + } +} + +resource "google_dns_record_set" "spoke_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_tunnel" { + name = "tunnel.${var.google_region}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.backend_psc_ip_spoke] +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/dns/tests/hub-and-spoke && terraform init -backend=false && terraform validate && terraform plan -refresh=false +``` + +Expected: `Plan: 17 to add` (5 zones + 12 record sets: hub_dbx zone + 3 records = 4, gcr zone + 2 records = 3, google_apis zone + 2 records = 3, pkg_dev zone + 2 records = 3, spoke_dbx zone + 3 records = 4). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/dns/ +git commit -m "$(cat <<'EOF' +feat(gcp/dns): add hub and spoke private DNS zones + +Hub: gcp.databricks.com, gcr.io, googleapis.com, pkg.dev. +Spoke: gcp.databricks.com with workspace/dp/tunnel records. +workspace_dns_id is regex-extracted from workspace_url.
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 16: `modules/gcp/databricks-workspace` — composer scaffold + variables + +**Files:** +- Create: `modules/gcp/databricks-workspace/versions.tf` +- Create: `modules/gcp/databricks-workspace/variables.tf` +- Create: `modules/gcp/databricks-workspace/main.tf` +- Create: `modules/gcp/databricks-workspace/outputs.tf` +- Create: `modules/gcp/databricks-workspace/Makefile` +- Create: `modules/gcp/databricks-workspace/README.md` + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + random = { + source = "hashicorp/random" + version = ">= 3.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf` — full composer API as specified** + +```hcl +# Blocks with more than one argument are written multi-line; HCL rejects +# single-line blocks that contain several arguments. +# === Identity =========================================================== +variable "prefix" { type = string } +variable "databricks_account_id" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } +variable "workspace_name" { + type = string + default = null +} +variable "tags" { + type = map(string) + default = {} +} + +# === VPC source ========================================================= +variable "vpc_source" { + type = string + default = "databricks_managed" + description = "One of: databricks_managed, create, existing" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing."
+ } +} + +# When vpc_source = "create" +variable "spoke_vpc_cidr" { + type = string + default = null +} +variable "subnet_cidr" { + type = string + default = null +} +variable "pod_cidr" { + type = string + default = null +} +variable "svc_cidr" { + type = string + default = null +} + +# When vpc_source = "existing" +variable "existing_vpc_name" { + type = string + default = null +} +variable "existing_subnet_name" { + type = string + default = null +} + +# === Connectivity feature flags ========================================= +variable "private_link_frontend" { + type = bool + default = false +} +variable "private_link_backend" { + type = bool + default = false +} +variable "private_access_only" { + type = bool + default = false +} +variable "restricted_egress" { + type = bool + default = false +} + +# === Required when restricted_egress = true ============================= +variable "hub_vpc_google_project" { + type = string + default = null +} +variable "spoke_vpc_google_project" { + type = string + default = null +} +variable "is_spoke_vpc_shared" { + type = bool + default = false +} +variable "hub_vpc_cidr" { + type = string + default = null +} +variable "psc_subnet_cidr" { + type = string + default = null +} +variable "hive_metastore_ip" { + type = string + default = null +} +``` + +- [ ] **Step 3: `main.tf` — locals, random suffix, preconditions (no submodule wiring yet)** + +```hcl +locals { + databricks_managed = var.vpc_source == "databricks_managed" + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + any_private_link = var.private_link_frontend || var.private_link_backend + spoke_project = coalesce(var.spoke_vpc_google_project, var.google_project) +} + +resource "random_string" "suffix" { + length = 6 + special = false + upper = false + + lifecycle { + ignore_changes = [special, upper] + } +} + +# Cross-variable preconditions. Terraform releases before 1.9 don't support +# cross-variable validation in variable blocks, and this module allows >= 1.5, +# so we use a null_resource lifecycle.precondition stack instead.
+resource "null_resource" "preconditions" { + lifecycle { + precondition { + condition = !var.restricted_egress || local.create_vpc + error_message = "restricted_egress=true requires vpc_source=\"create\" (hub-spoke topology needs us to own both VPCs)." + } + precondition { + condition = !var.restricted_egress || local.any_private_link + error_message = "restricted_egress=true requires at least one of private_link_frontend or private_link_backend." + } + precondition { + condition = !var.restricted_egress || (var.hub_vpc_google_project != null && var.hub_vpc_cidr != null && var.psc_subnet_cidr != null) + error_message = "restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, and psc_subnet_cidr." + } + precondition { + condition = !local.create_vpc || (var.spoke_vpc_cidr != null && var.subnet_cidr != null) + error_message = "vpc_source=\"create\" requires spoke_vpc_cidr and subnet_cidr." + } + precondition { + condition = !local.use_existing_vpc || (var.existing_vpc_name != null && var.existing_subnet_name != null) + error_message = "vpc_source=\"existing\" requires existing_vpc_name and existing_subnet_name." + } + precondition { + condition = !local.databricks_managed || (!var.private_link_frontend && !var.private_link_backend && !var.restricted_egress) + error_message = "vpc_source=\"databricks_managed\" forbids private_link_frontend, private_link_backend, and restricted_egress." + } + } +} +``` + +Note: `null_resource` requires the `hashicorp/null` provider; add it to `versions.tf`. 
Update `versions.tf`: + +```hcl + null = { + source = "hashicorp/null" + version = ">= 3.0" + } +``` + +- [ ] **Step 4: Placeholder `outputs.tf`** + +```hcl +# Placeholder values until the submodules are wired in Task 17. +output "workspace_id" { + value = null + description = "Databricks workspace ID" +} +output "workspace_url" { + value = null + description = "Databricks workspace URL" +} +output "network_id" { + value = null + description = "mws_networks ID (null when databricks_managed)" +} +output "vpc_id" { + value = null + description = "Spoke VPC ID (null when databricks_managed)" +} +output "spoke_vpc_id" { + value = null + description = "Spoke VPC ID (null when databricks_managed)" +} +output "hub_vpc_id" { + value = null + description = "Hub VPC ID (null when not restricted_egress)" +} +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names" +} +``` + +- [ ] **Step 5: Makefile + README placeholder** (same template as previous modules) + +- [ ] **Step 6: Validate** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 7: Commit** + +```bash +git add modules/gcp/databricks-workspace/ +git commit -m "$(cat <<'EOF' +feat(gcp/databricks-workspace): scaffold composer with preconditions + +Composer module with full variable API, random_string suffix, locals +for derived flags, and null_resource.preconditions stack enforcing all +cross-variable rules from the spec. Submodule wiring follows in next +tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 17: Composer — wire `network`, `private-connectivity`, `account`, `dns` submodules + +**Files:** +- Modify: `modules/gcp/databricks-workspace/main.tf` +- Modify: `modules/gcp/databricks-workspace/outputs.tf` + +- [ ] **Step 1: Append submodule blocks to `main.tf`** + +```hcl +module "network" { + source = "../network" + count = local.databricks_managed ?
0 : 1 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + vpc_source = var.vpc_source + spoke_vpc_google_project = local.spoke_project + + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr + + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name + + create_hub = var.restricted_egress + hub_vpc_google_project = var.hub_vpc_google_project + hub_vpc_cidr = var.hub_vpc_cidr + is_spoke_vpc_shared = var.is_spoke_vpc_shared + workspace_google_project = var.google_project +} + +module "private_connectivity" { + source = "../private-connectivity" + count = local.any_private_link ? 1 : 0 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + spoke_vpc_cidr = var.spoke_vpc_cidr + + hub_vpc_id = var.restricted_egress ? module.network[0].hub_vpc_id : null + hub_vpc_self_link = var.restricted_egress ? module.network[0].hub_vpc_self_link : null + hub_vpc_google_project = var.hub_vpc_google_project + hub_subnet_name = var.restricted_egress ? module.network[0].hub_subnet_name : null + hub_vpc_cidr = var.hub_vpc_cidr + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + restrict_egress = var.restricted_egress + psc_subnet_cidr = var.psc_subnet_cidr + + hive_metastore_ip = var.hive_metastore_ip +} + +module "account" { + source = "../account" + + prefix = var.prefix + suffix = random_string.suffix.result + workspace_name = var.workspace_name + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + vpc_source = var.vpc_source + + spoke_vpc_name = local.databricks_managed ? 
null : module.network[0].spoke_vpc_name + spoke_subnet_name = local.databricks_managed ? null : module.network[0].spoke_subnet_name + spoke_vpc_google_project = local.spoke_project + hub_vpc_google_project = var.hub_vpc_google_project + + frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].frontend_psc_fr_id : null + backend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].backend_psc_fr_id : null + hub_frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].hub_frontend_psc_fr_id : null + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + private_access_only = var.private_access_only + + nat_dependency = local.databricks_managed ? null : module.network[0].nat_id +} + +module "dns" { + source = "../dns" + count = var.restricted_egress ? 1 : 0 + + prefix = var.prefix + google_region = var.google_region + + hub_vpc_id = module.network[0].hub_vpc_id + hub_vpc_self_link = module.network[0].hub_vpc_self_link + hub_vpc_google_project = var.hub_vpc_google_project + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity[0].frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity[0].frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity[0].backend_psc_ip_spoke +} +``` + +- [ ] **Step 2: Wire composer outputs** + +Replace `outputs.tf`: + +```hcl +output "workspace_id" { + value = module.account.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.account.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = module.account.network_id + description = "mws_networks ID (null when databricks_managed)" +} + +output "vpc_id" { + value = 
try(module.network[0].spoke_vpc_id, null) + description = "Spoke VPC ID (null when databricks_managed)" +} + +output "spoke_vpc_id" { + value = try(module.network[0].spoke_vpc_id, null) + description = "Spoke VPC ID (null when databricks_managed)" +} + +output "hub_vpc_id" { + value = try(module.network[0].hub_vpc_id, null) + description = "Hub VPC ID (null when not restricted_egress)" +} + +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names" +} +``` + +- [ ] **Step 3: Validate** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/databricks-workspace/ +git commit -m "$(cat <<'EOF' +feat(gcp/databricks-workspace): wire submodules in composer + +Conditional module blocks for network, private-connectivity, dns +(each gated by appropriate flags) and always-on account module. +Composer outputs wired to module outputs with try() for nullable +network outputs. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 18: Composer — positive fixtures (basic / byovpc / existing / psc-isolated) + +**Files:** +- Create: `modules/gcp/databricks-workspace/tests/basic/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/byovpc/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/existing-vpc/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/psc-isolated/main.tf` + +Each fixture follows this pattern: + +- [ ] **Step 1: Write `tests/basic/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" +} +``` + +- [ ] **Step 2: Write `tests/byovpc/main.tf`** + +Same provider block, then: + +```hcl +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} +``` + +- [ ] **Step 3: Write `tests/existing-vpc/main.tf`** + +```hcl +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "existing" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} +``` + +- [ ] **Step 4: Write `tests/psc-isolated/main.tf`** + +```hcl +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + is_spoke_vpc_shared = true + hub_vpc_cidr = "10.1.0.0/24" + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +- [ ] **Step 5: Validate every fixture** + +```bash +# Each fixture runs in a subshell so a failure cannot leave the loop in the wrong directory. +for d in basic byovpc existing-vpc psc-isolated; do + echo "=== $d ===" + (cd "modules/gcp/databricks-workspace/tests/$d" && \ + terraform init -backend=false && terraform validate) || exit 1 +done +``` + +Expected: all four fixtures pass validation. + +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/databricks-workspace/tests/ +git commit -m "$(cat <<'EOF' +test(gcp/databricks-workspace): positive fixtures for 4 scenarios + +basic (databricks_managed), byovpc (create), existing-vpc (existing), +and psc-isolated (create + all PSC flags + restricted_egress). +Each fixture validates the full module graph.
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 19: Composer — negative fixtures (precondition failures) + +**Files:** +- Create: `modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf` + +- [ ] **Step 1: Write each fixture** + +`negative-restricted-egress-managed/main.tf` (expect: precondition error `restricted_egress=true requires vpc_source="create"`): + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +# Multi-line on purpose: a single-line block may hold only one argument. +provider "google" { + project = "f" + region = "us-central1" +} +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "databricks_managed" + restricted_egress = true +} +``` + +`negative-restricted-egress-missing-hub/main.tf`: + +```hcl +# same provider/header +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + private_link_frontend = true + private_link_backend = true + restricted_egress = true + # hub_vpc_google_project, hub_vpc_cidr, psc_subnet_cidr all null -> precondition fail +} +``` + +`negative-existing-missing-name/main.tf`: + +```hcl +module "workspace" { + source = "../.."
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "existing" + # existing_vpc_name / existing_subnet_name null -> precondition fail +} +``` + +`negative-managed-with-psc/main.tf`: + +```hcl +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "databricks_managed" + private_link_frontend = true # forbidden with databricks_managed +} +``` + +- [ ] **Step 2: Verify each fixture fails at plan time** + +```bash +# Each fixture runs in a subshell; an init failure is reported instead of being skipped silently. +for d in negative-restricted-egress-managed negative-restricted-egress-missing-hub negative-existing-missing-name negative-managed-with-psc; do + echo "=== $d ===" + ( + cd "modules/gcp/databricks-workspace/tests/$d" || exit 1 + terraform init -backend=false || { echo "FAIL: $d init failed"; exit 1; } + if terraform plan -refresh=false; then + echo "FAIL: $d should have failed plan"; exit 1 + else + echo "OK: $d failed plan as expected" + fi + ) || exit 1 +done +``` + +Expected: each fixture fails at plan time with a precondition error message matching the spec table. + +- [ ] **Step 3: Commit** + +```bash +git add modules/gcp/databricks-workspace/tests/ +git commit -m "$(cat <<'EOF' +test(gcp/databricks-workspace): negative fixtures for preconditions + +Four fixtures, each violating one precondition rule from the spec. +Each fixture must fail `terraform plan` with a clear error message.
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 20: Relocate `modules/gcp-sa-provisioning` → `modules/gcp/service-account` + +**Files:** +- Move: `modules/gcp-sa-provisioning/` → `modules/gcp/service-account/` +- Create: `modules/gcp-sa-provisioning/README.md` (deprecation stub) + +- [ ] **Step 1: `git mv` the directory** + +```bash +git mv modules/gcp-sa-provisioning modules/gcp/service-account +``` + +- [ ] **Step 2: Update the Makefile path inside the relocated module** + +Read `modules/gcp/service-account/Makefile`. The relative path to `.terraform-docs.yml` needs to deepen by one level: `../../.terraform-docs.yml` → `../../../.terraform-docs.yml`. Edit: + +```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +- [ ] **Step 3: Create deprecation stub at the old path** + +```bash +mkdir -p modules/gcp-sa-provisioning +``` + +Write `modules/gcp-sa-provisioning/README.md`: + +````markdown +# DEPRECATED — moved to `modules/gcp/service-account/` + +This module has been relocated to [`../gcp/service-account/`](../gcp/service-account/). + +All variables, outputs, and resource addresses are unchanged. Update your +module `source` from: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp-sa-provisioning" +``` + +to: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp/service-account" +``` + +This stub will be removed in PR 6 of the GCP modules refactor. +```` + +- [ ] **Step 4: Validate the relocated module** + +```bash +cd modules/gcp/service-account && terraform init -backend=false && terraform validate +``` + +Expected: validate passes (no functional changes).
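
Because the move leaves only a README at the old path, any `source` that still points at `modules/gcp-sa-provisioning` will stop resolving. Before committing, a quick search can surface such references. The following is a minimal sketch, assuming a `grep` that supports `--include` (GNU and BSD both do); `find_stale_refs` is a hypothetical helper name, not something that exists in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): list .tf files under a directory
# that still mention the pre-move module path.
#   $1 = old path fragment, e.g. modules/gcp-sa-provisioning
#   $2 = directory to search (defaults to the current directory)
find_stale_refs() {
  local old_path="$1" root="${2:-.}"
  # -r: recurse, -l: print matching file names only, -F: fixed-string match.
  # grep exits 1 when nothing matches; "|| true" keeps that from looking like an error.
  grep -rlF --include='*.tf' -- "$old_path" "$root" || true
}

# Usage, from the repository root:
#   find_stale_refs 'modules/gcp-sa-provisioning'
# Empty output means no Terraform file still points at the old location.
```

Only `*.tf` files are searched, so the deprecation README itself (which names the old path on purpose) does not show up as a false positive.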
+ +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/service-account/ modules/gcp-sa-provisioning/README.md +git commit -m "$(cat <<'EOF' +refactor(gcp/service-account): relocate from modules/gcp-sa-provisioning + +git mv only; no functional changes. Old path has a deprecation README +pointing to the new location. Makefile updated for new depth. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 21: Relocate `modules/gcp-unity-catalog` → `modules/gcp/unity-catalog` + +Same pattern as Task 20. + +- [ ] **Step 1: `git mv`** + +```bash +git mv modules/gcp-unity-catalog modules/gcp/unity-catalog +``` + +- [ ] **Step 2: Update `modules/gcp/unity-catalog/Makefile`** to use `../../../.terraform-docs.yml`. + +- [ ] **Step 3: Write `modules/gcp-unity-catalog/README.md`** (deprecation stub, same template as Task 20). + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/unity-catalog && terraform init -backend=false && terraform validate +``` + +Expected: validate passes (no functional changes). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/unity-catalog/ modules/gcp-unity-catalog/README.md +git commit -m "$(cat <<'EOF' +refactor(gcp/unity-catalog): relocate from modules/gcp-unity-catalog + +git mv only; no functional changes. Old path has a deprecation README. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 22: Regenerate `terraform-docs` READMEs for all new modules + +**Files:** +- Modify: `modules/gcp/*/README.md` (every submodule, via `terraform-docs`) + +- [ ] **Step 1: Run `make docs` recursively** + +```bash +make -C modules/gcp docs +``` + +Expected: each module's README has its `<!-- BEGIN_TF_DOCS -->` ... `<!-- END_TF_DOCS -->` block populated with inputs/outputs tables. + +- [ ] **Step 2: Verify `pre-commit` passes** + +```bash +pre-commit run --all-files +``` + +Expected: all hooks pass (terraform_fmt, terraform_validate, terraform_docs).
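
If a README comes out of `make docs` without any tables, one thing worth checking is whether it contains the markers that `terraform-docs` injects between. The sketch below assumes the repository's `.terraform-docs.yml` uses inject mode with the default `<!-- BEGIN_TF_DOCS -->` / `<!-- END_TF_DOCS -->` markers; `modules_missing_doc_markers` is a hypothetical helper name, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): print every module directory under
# $1 whose README.md is missing or lacks the terraform-docs injection markers.
# Assumption: the markers are the terraform-docs defaults, BEGIN_TF_DOCS / END_TF_DOCS.
modules_missing_doc_markers() {
  local root="$1" dir readme
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    readme="${dir}README.md"
    if [ ! -f "$readme" ] \
      || ! grep -q 'BEGIN_TF_DOCS' "$readme" \
      || ! grep -q 'END_TF_DOCS' "$readme"; then
      echo "${dir%/}"
    fi
  done
}

# Usage, from the repository root:
#   modules_missing_doc_markers modules/gcp
# Empty output means every module README has both markers.
```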
+ +- [ ] **Step 3: Commit** + +```bash +git add modules/gcp/ +git commit -m "$(cat <<'EOF' +docs(gcp): regenerate terraform-docs for all new submodules + +Generated README content for network, private-connectivity, account, +dns, and databricks-workspace via `make -C modules/gcp docs`. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 23: Open PR 1 (draft) + +- [ ] **Step 1: Push branch** + +```bash +git push -u origin feature/gcp-modules-refactor +``` + +- [ ] **Step 2: Open draft PR** + +```bash +gh pr create --draft --title "feat(gcp): add modules/gcp/ composer + submodules (PR 1 of 6)" --body "$(cat <<'EOF' +## Summary + +First PR of the GCP modules refactor described in `docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md`. Adds: + +- `modules/gcp/databricks-workspace` — top-level composer +- `modules/gcp/network` — VPC/subnet/router/NAT/peering/shared-VPC +- `modules/gcp/private-connectivity` — PSC + egress firewall +- `modules/gcp/account` — all `databricks_mws_*` resources +- `modules/gcp/dns` — private DNS zones (hub + spoke) +- Relocations: `modules/gcp-sa-provisioning` → `modules/gcp/service-account`, `modules/gcp-unity-catalog` → `modules/gcp/unity-catalog` + +No example consumes these yet — they will be migrated one PR at a time. + +## Test plan + +- [ ] `pre-commit run --all-files` passes +- [ ] Every fixture under `modules/gcp/*/tests//` validates +- [ ] Every negative fixture under `modules/gcp/databricks-workspace/tests/negative-*/` fails at plan time +EOF +)" +``` + +--- + +## PR 2 — Migrate `examples/gcp-basic` + +### Task 24: Rewrite `examples/gcp-basic` against the new composer + +**Files:** +- Modify: `examples/gcp-basic/main.tf` +- Modify: `examples/gcp-basic/variables.tf` +- Modify: `examples/gcp-basic/outputs.tf` +- Modify: `examples/gcp-basic/README.md` +- Modify: `examples/gcp-basic/terraform.tfvars` +- (Leave `init.tf` and `Makefile` unchanged.) 
+ +- [ ] **Step 1: Rewrite `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "databricks_managed" +} +``` + +- [ ] **Step 2: Trim `variables.tf` to only what this example needs** + +```hcl +variable "databricks_account_id" { + type = string + description = "Databricks Account ID" +} + +variable "databricks_google_service_account" { + type = string + description = "Service account email used for Databricks provider authentication" +} + +variable "google_project" { + type = string + description = "GCP project where the workspace will be created" +} + +variable "google_region" { + type = string + description = "GCP region for workspace deployment" +} + +variable "google_zone" { + type = string + description = "GCP zone (used by the google provider)" +} + +variable "prefix" { + type = string + description = "Prefix used to name generated resources" +} + +variable "workspace_name" { + type = string + description = "Workspace name" +} +``` + +(Drop `delegate_from` — that variable belongs to SA-provisioning, not basic.) + +- [ ] **Step 3: Rewrite `outputs.tf`** + +```hcl +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" +} +``` + +- [ ] **Step 4: Update `terraform.tfvars` skeleton** + +```hcl +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" +``` + +- [ ] **Step 5: Rewrite `README.md`** + +```markdown +# examples/gcp-basic — Databricks-managed VPC + +Calls `modules/gcp/databricks-workspace` with `vpc_source = "databricks_managed"`. 
+The Databricks platform provisions the workspace VPC; you provide only the GCP +project, region, and prefix. + +## Prerequisites + +- A GCP project with the Databricks platform onboarded +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID + +## Apply + +```bash +terraform init +terraform apply +``` + +## Migrating from the old example + +This example previously called `modules/gcp-workspace-basic`. State from the +old apply does **not** migrate cleanly to the new composer because the +`databricks_mws_workspaces` resource address differs. Re-apply on clean state. + + + +``` + +- [ ] **Step 6: Regenerate docs and validate** + +```bash +cd examples/gcp-basic && make docs && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 7: Sandbox apply (manual)** + +The author runs `terraform apply` against a sandbox project and confirms the workspace is reachable. Capture plan output as a PR comment. Run `terraform destroy` after. + +- [ ] **Step 8: Commit + open PR** + +```bash +git add examples/gcp-basic/ +git commit -m "$(cat <<'EOF' +refactor(examples/gcp-basic): migrate to modules/gcp/databricks-workspace + +Replaces the call to modules/gcp-workspace-basic with the new composer +using vpc_source="databricks_managed". Variables trimmed to scenario +inputs; README documents the migration caveat. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "refactor(examples/gcp-basic): migrate to new composer (PR 2 of 6)" --body "$(cat <<'EOF' +## Summary + +Migrates `examples/gcp-basic` to call `modules/gcp/databricks-workspace`. Old +`modules/gcp-workspace-basic` remains untouched (deleted in PR 6). 
+ +## Test plan + +- [ ] Sandbox `terraform apply` succeeds; workspace reachable +- [ ] Fresh `terraform plan` shows zero drift after apply +- [ ] `terraform destroy` cleans up without orphans +EOF +)" +``` + +--- + +## PR 3 — Migrate `examples/gcp-byovpc` + +### Task 25: Rewrite `examples/gcp-byovpc` against the new composer + +Same pattern as Task 24, with these differences: + +- [ ] **Step 1: `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "databricks_account_id" { type = string } +variable "databricks_google_service_account" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } +variable "google_zone" { type = string } +variable "prefix" { type = string } +variable "workspace_name" { type = string } + +variable "spoke_vpc_cidr" { type = string } +variable "subnet_cidr" { type = string } + +# Optional secondary ranges. Blocks with more than one argument must use the multi-line form. +variable "pod_cidr" { + type = string + default = null +} +variable "svc_cidr" { + type = string + default = null +} +``` + +- [ ] **Steps 3–8:** Same as Task 24 (outputs, tfvars, README, docs, validate, sandbox apply, commit + PR). + +Note: variable names changed (`subnet_ip_cidr_range` → `subnet_cidr`, `pod_ip_cidr_range` → `pod_cidr`, `svc_ip_cidr_range` → `svc_cidr`, etc.). The README must explicitly call this out: + +> **Breaking change for migrating users:** variable names changed to match the new composer (`subnet_ip_cidr_range` → `subnet_cidr`, etc.). Update your tfvars accordingly.
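
For users migrating an existing `terraform.tfvars`, the three renames named in the note could be applied mechanically. The following is a minimal sketch, not part of the plan: `rename_byovpc_tfvars` is a hypothetical helper, and it covers only the three mappings spelled out above (the "etc." in the note may hide others, which would need to be added by hand).

```bash
# Hypothetical helper: rewrite the three variable names listed in the
# migration note. Reads the named files (or stdin) and writes to stdout,
# so the original tfvars file is never modified in place.
rename_byovpc_tfvars() {
  sed -E \
    -e 's/^([[:space:]]*)subnet_ip_cidr_range([[:space:]]*=)/\1subnet_cidr\2/' \
    -e 's/^([[:space:]]*)pod_ip_cidr_range([[:space:]]*=)/\1pod_cidr\2/' \
    -e 's/^([[:space:]]*)svc_ip_cidr_range([[:space:]]*=)/\1svc_cidr\2/' \
    "$@"
}
```

A typical use would be `rename_byovpc_tfvars terraform.tfvars > terraform.tfvars.new`, followed by a manual diff before replacing the old file.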
+ +Commit message: + +``` +refactor(examples/gcp-byovpc): migrate to modules/gcp/databricks-workspace + +vpc_source="create" with spoke + subnet CIDRs. Variable names changed +to match the composer; README documents the migration. +``` + +--- + +## PR 4 — Migrate `examples/gcp-with-psc-exfiltration-protection` + +### Task 26: Rewrite the PSC example against the new composer + +**Files:** +- Modify: `examples/gcp-with-psc-exfiltration-protection/main.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/variables.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/outputs.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/README.md` +- Modify: `examples/gcp-with-psc-exfiltration-protection/terraform.tfvars` + +- [ ] **Step 1: Rewrite `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.workspace_google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = var.spoke_vpc_google_project + hub_vpc_google_project = var.hub_vpc_google_project + is_spoke_vpc_shared = var.is_spoke_vpc_shared + hub_vpc_cidr = var.hub_vpc_cidr + psc_subnet_cidr = var.psc_subnet_cidr + hive_metastore_ip = var.hive_metastore_ip + + tags = var.tags +} +``` + +- [ ] **Step 2: Rewrite `unity-catalog.tf` to consume composer outputs** + +```hcl +module "unity_catalog" { + source = "../../modules/gcp/unity-catalog" + + providers = { + databricks = databricks + databricks.workspace = databricks.workspace + } + + databricks_workspace_id = module.workspace.workspace_id + databricks_workspace_url = module.workspace.workspace_url + 
google_project = var.workspace_google_project + google_region = var.google_region + prefix = var.prefix + metastore_name = var.metastore_name + catalog_name = var.catalog_name +} +``` + +- [ ] **Step 3: Trim `variables.tf`** + +Drop variables that no longer apply (none — all current vars still map). Rename `subnet_cidr_var` references if any. + +Add new required vars: `spoke_vpc_cidr` (was `spoke_vpc_cidr` already), `subnet_cidr` (NEW — split from existing single CIDR var if needed). Refer to the current `terraform.tfvars` to confirm whether `subnet_cidr` was already exposed or needs to be added. + +Check current vars file: + +```bash +cat examples/gcp-with-psc-exfiltration-protection/variables.tf +``` + +If `subnet_cidr` isn't there, add: + +```hcl +variable "subnet_cidr" { + type = string + description = "CIDR for the spoke subnet" +} +``` + +- [ ] **Step 4: Update `terraform.tfvars`** to include `subnet_cidr` and remove any orphaned vars. + +- [ ] **Step 5: Update README** + +Document the migration. Note that `private_link_frontend`, `private_link_backend`, `private_access_only`, `restricted_egress` are now the explicit feature flags; the example sets all four to `true`. 
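
The claim that the example sets all four feature flags to `true` can be spot-checked with a small grep wrapper. This is a sketch under assumptions: `psc_flags_all_true` is a hypothetical helper, and it assumes each flag is assigned a literal `true` on its own line, as in the `main.tf` from Step 1.

```bash
# Hypothetical helper: succeed only if every PSC feature flag is set to a
# literal `true` in the given Terraform file.
psc_flags_all_true() {
  local file="$1" flag
  for flag in private_link_frontend private_link_backend private_access_only restricted_egress; do
    # Match "<flag> = true" with arbitrary spacing, followed by whitespace or end of line.
    if ! grep -Eq "^[[:space:]]*${flag}[[:space:]]*=[[:space:]]*true([[:space:]]|\$)" "$file"; then
      printf 'missing or not true: %s\n' "$flag" >&2
      return 1
    fi
  done
}
```

Running `psc_flags_all_true examples/gcp-with-psc-exfiltration-protection/main.tf` from the repo root should exit zero once Step 1 is done.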
+ +- [ ] **Step 6: Regenerate docs and validate** + +```bash +cd examples/gcp-with-psc-exfiltration-protection && make docs && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 7: Sandbox apply (manual, with extra care)** + +- Snapshot state before apply +- Apply against sandbox +- Verify workspace reachable through PSC +- Verify UC catalog accessible +- Fresh plan — confirm zero drift +- `terraform destroy` — confirm PSC + DNS teardown is clean (no orphans) +- Capture all plan/apply/destroy output in the PR description + +- [ ] **Step 8: Commit + open PR** + +```bash +git add examples/gcp-with-psc-exfiltration-protection/ +git commit -m "$(cat <<'EOF' +refactor(examples/gcp-with-psc): migrate to new composer + +Single module call to modules/gcp/databricks-workspace with all four +connectivity flags enabled and restricted_egress=true. Unity Catalog +wired separately via modules/gcp/unity-catalog. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "refactor(examples/gcp-with-psc): migrate to new composer (PR 4 of 6)" --body "$(cat <<'EOF' +## Summary + +Migrates the PSC + exfiltration-protection example to the new composer. +The most complex of the migration PRs. 
+ +## Test plan + +- [ ] Sandbox `terraform apply` succeeds end-to-end +- [ ] Workspace reachable through PSC (frontend) +- [ ] UC catalog accessible +- [ ] Fresh `terraform plan` shows zero drift +- [ ] `terraform destroy` cleans up PSC + DNS without orphans +EOF +)" +``` + +--- + +## PR 5 — New `examples/gcp-existing-vpc` + +### Task 27: Add the new "existing VPC" example + +**Files:** +- Create: `examples/gcp-existing-vpc/main.tf` +- Create: `examples/gcp-existing-vpc/init.tf` +- Create: `examples/gcp-existing-vpc/variables.tf` +- Create: `examples/gcp-existing-vpc/outputs.tf` +- Create: `examples/gcp-existing-vpc/terraform.tfvars` +- Create: `examples/gcp-existing-vpc/README.md` +- Create: `examples/gcp-existing-vpc/Makefile` + +- [ ] **Step 1: Copy `init.tf` and `Makefile` from `examples/gcp-basic`** (identical provider setup). + +- [ ] **Step 2: Write `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "existing" + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name +} +``` + +- [ ] **Step 3: Write `variables.tf`** + +```hcl +variable "databricks_account_id" { type = string } +variable "databricks_google_service_account" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } +variable "google_zone" { type = string } +variable "prefix" { type = string } +variable "workspace_name" { type = string } +variable "existing_vpc_name" { type = string } +variable "existing_subnet_name" { type = string } +``` + +- [ ] **Step 4: Write `outputs.tf`, `terraform.tfvars`, `README.md`** (same templates as Task 24, scenario-appropriate). 
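
The `terraform.tfvars` skeleton in Task 24 lists every declared variable with an empty string, so the equivalent file for this example could be derived from `variables.tf` rather than typed by hand. A rough sketch, assuming every variable is a string and is declared as `variable "name"` at the start of a line; `tfvars_skeleton` is a hypothetical name, not an existing tool:

```bash
# Hypothetical helper: emit `name = ""` for each variable declared in the
# given variables.tf, in declaration order.
tfvars_skeleton() {
  sed -nE 's/^[[:space:]]*variable[[:space:]]+"([^"]+)".*/\1 = ""/p' "$1"
}
```

For instance, `tfvars_skeleton examples/gcp-existing-vpc/variables.tf > examples/gcp-existing-vpc/terraform.tfvars` would produce an unaligned skeleton, which `terraform fmt` can tidy afterwards.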
+ +- [ ] **Step 5: Validate** + +```bash +cd examples/gcp-existing-vpc && make docs && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 6: Sandbox apply (requires a pre-existing VPC and subnet in the sandbox project)** + +- [ ] **Step 7: Commit + PR** + +```bash +git add examples/gcp-existing-vpc/ +git commit -m "$(cat <<'EOF' +feat(examples/gcp-existing-vpc): new example using existing VPC + +New scenario unsupported by the legacy modules. Calls the composer +with vpc_source="existing" and looks up the pre-existing VPC and +subnet via the network submodule's data sources. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "feat(examples/gcp-existing-vpc): new example (PR 5 of 6)" --body "..." +``` + +--- + +## PR 6 — Cleanup + +### Task 28: Repoint `examples/gcp-sa-provisioning` at the relocated module + +**Files:** +- Modify: `examples/gcp-sa-provisioning/main.tf` + +- [ ] **Step 1: Update the `source` line** + +In `examples/gcp-sa-provisioning/main.tf` change: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp-sa-provisioning" +``` + +to: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp/service-account" +``` + +(Or the relative path `../../modules/gcp/service-account` if the example uses relative sources — match existing convention.) 
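
Because the old and new paths differ only in the trailing module segment, the same substitution works whichever source convention the example uses. A minimal sketch, with `repoint_sa_module_source` as a hypothetical helper that writes the rewritten file to stdout instead of editing in place:

```bash
# Hypothetical helper: swap the legacy module path for the relocated one.
# Works for both the github.com/... form and the relative ../../ form,
# since only the "modules/..." segment is rewritten.
repoint_sa_module_source() {
  sed -e 's#modules/gcp-sa-provisioning#modules/gcp/service-account#g' "$@"
}
```

Reviewing the output of `repoint_sa_module_source examples/gcp-sa-provisioning/main.tf` before overwriting the file keeps the change easy to inspect.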
+ +- [ ] **Step 2: Validate** + +```bash +cd examples/gcp-sa-provisioning && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 3: Commit** + +```bash +git add examples/gcp-sa-provisioning/ +git commit -m "$(cat <<'EOF' +refactor(examples/gcp-sa-provisioning): repoint to modules/gcp/service-account + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 29: Delete deprecated modules + +**Files:** +- Delete: `modules/gcp-workspace-basic/` +- Delete: `modules/gcp-workspace-byovpc/` +- Delete: `modules/gcp-with-psc-exfiltration-protection/` +- Delete: `modules/gcp-sa-provisioning/` (deprecation stub from Task 20) +- Delete: `modules/gcp-unity-catalog/` (deprecation stub from Task 21) + +- [ ] **Step 1: Confirm no example still references the old paths** + +```bash +grep -rn "modules/gcp-workspace-basic\|modules/gcp-workspace-byovpc\|modules/gcp-with-psc-exfiltration-protection\|modules/gcp-sa-provisioning\|modules/gcp-unity-catalog" examples/ modules/ +``` + +Expected: no matches (every match should already point to `modules/gcp/...`). + +- [ ] **Step 2: Delete** + +```bash +git rm -r modules/gcp-workspace-basic modules/gcp-workspace-byovpc modules/gcp-with-psc-exfiltration-protection modules/gcp-sa-provisioning modules/gcp-unity-catalog +``` + +- [ ] **Step 3: Commit** + +```bash +git commit -m "$(cat <<'EOF' +refactor: remove deprecated GCP modules + +Removes modules/gcp-workspace-basic, modules/gcp-workspace-byovpc, +modules/gcp-with-psc-exfiltration-protection, and the deprecation +stubs for modules/gcp-sa-provisioning and modules/gcp-unity-catalog. +All examples now point at modules/gcp/*. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 30: Delete junk directories and stray state files + +**Files:** +- Delete: `examples/gcp-sa-provisionning/` (typo dir, only contains a Makefile) +- Delete: `examples/gcp-test-modules/` (only state files) +- Delete: stray `terraform.tfstate*` files under `examples/gcp-*/` (verify `.gitignore` first) + +- [ ] **Step 1: Check `.gitignore`** + +```bash +grep -n "tfstate" .gitignore +``` + +Expected: tfstate files should already be gitignored. If not, add patterns and stage that change. + +- [ ] **Step 2: Delete junk dirs** + +```bash +git rm -r examples/gcp-sa-provisionning examples/gcp-test-modules +``` + +- [ ] **Step 3: Untrack stray state files** + +```bash +git rm --cached examples/gcp-basic/terraform.tfstate* 2>/dev/null || true +git rm --cached examples/gcp-byovpc/terraform.tfstate* 2>/dev/null || true +git rm --cached examples/gcp-with-psc-exfiltration-protection/terraform.tfstate* 2>/dev/null || true +``` + +- [ ] **Step 4: Commit** + +```bash +git commit -m "$(cat <<'EOF' +chore: remove junk dirs and untrack stray terraform state + +Deletes examples/gcp-sa-provisionning (typo dir, Makefile only) and +examples/gcp-test-modules (state-only). Untracks accidentally-committed +terraform.tfstate files under examples/gcp-*. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 31: Update top-level README + +**Files:** +- Modify: `README.md` + +- [ ] **Step 1: Identify the GCP section in the top-level README** + +```bash +grep -n -A 5 "gcp" README.md | head -40 +``` + +- [ ] **Step 2: Rewrite the GCP examples table** + +Update the listing to: + +| Example | Description | +|---------|-------------| +| `examples/gcp-basic` | Databricks-managed VPC | +| `examples/gcp-byovpc` | Customer VPC (Terraform creates it) | +| `examples/gcp-existing-vpc` | Use an existing customer VPC | +| `examples/gcp-with-psc-exfiltration-protection` | Full PSC + private DNS + restricted egress | +| `examples/gcp-sa-provisioning` | Bootstrap the workspace-creator service account | + +Update the modules listing to reflect the new structure under `modules/gcp/`. + +- [ ] **Step 3: Commit and open final PR** + +```bash +git add README.md +git commit -m "$(cat <<'EOF' +docs: update README for new GCP module layout + +Updates the GCP examples and modules tables to reflect the +modules/gcp/ composer + submodules and the new gcp-existing-vpc +example. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "chore: delete legacy GCP modules and dirs (PR 6 of 6)" --body "..." +``` + +--- + +## Self-Review + +(Performed after writing the plan; issues found and fixed inline.) + +**1. 
Spec coverage:** +- Problem statement → Tasks 24–31 migrate every existing GCP example ✓ +- Goals 1–7 → All addressed; thin examples in Tasks 24–27 ✓ +- Module layout (5 submodules + service-account + unity-catalog) → Tasks 2–22 ✓ +- Composer API → Task 16 (variables), Task 17 (wiring), Task 19 (preconditions) ✓ +- Cross-variable validation table → Task 19 (negative fixtures verify each rule) ✓ +- Submodule contracts → Tasks 2–15 implement each contract ✓ +- Example shapes (4 scenarios) → Tasks 18 (composer fixtures) + 24–27 (real examples) ✓ +- Migration plan (6 PRs) → Tasks grouped under "PR 1" through "PR 6" headers ✓ +- Testing approach → Tasks 18 (positive), 19 (negative), per-task validate steps ✓ +- Risks → Hive metastore IP fallback acknowledged via the empty `default_hive_metastore_ips` map; teardown ordering is sandbox-tested in PR 4 ✓ + +**2. Placeholders:** Scanned for "TBD", "TODO", "implement later", vague "add error handling". Found none. The Hive metastore IP map is intentionally empty initially (variable falls back to "" and gates the hive firewall rule); this is a documented behavior, not a placeholder. + +**3. Type consistency:** +- `frontend_psc_fr_id` / `backend_psc_fr_id` / `hub_frontend_psc_fr_id` used consistently across `private-connectivity` outputs (Task 7), `account` inputs (Task 9), and composer wiring (Task 17) ✓ +- `spoke_vpc_self_link`, `hub_vpc_self_link` consistent between `network` outputs (Tasks 3, 5), `private-connectivity` inputs (Task 6), and `dns` inputs (Task 14) ✓ +- `workspace_url` flows from `account` (Task 10) → `dns` (Task 14, 15) → composer outputs (Task 17) ✓ +- `nat_dependency` is `type = any` in `account` (Task 9), wired to `module.network[0].nat_id` (Task 17) — matches ✓ + +**4. Spec requirements with no task:** None found. + +--- + +## Execution Handoff + +Plan complete and saved to `docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md`. + +Two execution options: + +**1. 
Subagent-Driven (recommended)** — fresh subagent per task, review between tasks, fast iteration. + +**2. Inline Execution** — execute tasks in this session using executing-plans, batch execution with checkpoints. + +Which approach? diff --git a/docs/superpowers/plans/2026-05-26-gcp-best-practices-refactor.md b/docs/superpowers/plans/2026-05-26-gcp-best-practices-refactor.md new file mode 100644 index 00000000..54bf4fc1 --- /dev/null +++ b/docs/superpowers/plans/2026-05-26-gcp-best-practices-refactor.md @@ -0,0 +1,2257 @@ +# GCP Best-Practices Refactor Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Polish the GCP modules and examples landed in PR #233: fill ~70 missing variable descriptions, split large `main.tf` files by concern, standardize `versions.tf` placement, expand composer outputs, fix output misnomers, add region validations, and add README usage examples — without changing runtime behavior. + +**Architecture:** Eight focused commits, each leaving the tree validating. The composer-side rename (`*_psc_fr_id` → `*_forwarding_rule_name`) is coordinated across `private-connectivity` outputs, `account` inputs, and composer wiring in a single commit to avoid an intermediate broken state. + +**Tech Stack:** Terraform >= 1.5, existing `hashicorp/google` + `databricks/databricks` + `hashicorp/random` + `hashicorp/null` providers, `terraform-docs`, `pre-commit`. + +**Spec reference:** `docs/superpowers/specs/2026-05-26-gcp-best-practices-refactor-design.md` + +**Branch:** `feature/gcp-modules-refactor` (continues on draft PR #233; do not push, do not open new PRs) + +--- + +## File Structure + +This plan operates on existing files. 
The summary of what each task touches: + +``` +modules/gcp/ +├── network/ +│ ├── main.tf DELETE (contents move to per-concern files) +│ ├── vpc.tf NEW (Task 1) +│ ├── subnets.tf NEW (Task 1) +│ ├── nat.tf NEW (Task 1) +│ ├── peering.tf NEW (Task 1) +│ ├── shared-vpc.tf NEW (Task 1) +│ ├── data.tf NEW (Task 1) +│ ├── locals.tf NEW (Task 1) +│ ├── variables.tf (unchanged — already documented) +│ ├── outputs.tf (unchanged) +│ └── versions.tf (unchanged) +├── private-connectivity/ +│ ├── psc.tf (unchanged) +│ ├── firewall.tf (unchanged) +│ ├── locals.tf (unchanged) +│ ├── variables.tf MODIFY (Task 4 — descriptions, Task 5 — region validation) +│ ├── outputs.tf MODIFY (Task 2 — rename *_fr_id → *_forwarding_rule_name) +│ ├── versions.tf (unchanged) +│ └── README.md MODIFY (Task 7 — Usage block; Task 8 — terraform-docs regen) +├── account/ +│ ├── main.tf DELETE (contents move to workspace.tf + networks.tf + locals.tf) +│ ├── workspace.tf NEW (Task 1) +│ ├── networks.tf NEW (Task 1) +│ ├── locals.tf NEW (Task 1) +│ ├── vpc-endpoints.tf (unchanged) +│ ├── pas.tf (unchanged) +│ ├── variables.tf MODIFY (Task 2 — rename inputs; Task 4 — descriptions) +│ ├── outputs.tf MODIFY (Task 3 — add private_access_settings_id) +│ ├── versions.tf (unchanged) +│ └── README.md MODIFY (Task 7; Task 8) +├── dns/ +│ ├── hub.tf MODIFY (Task 1 — extract workspace_dns_id into locals.tf) +│ ├── spoke.tf (unchanged) +│ ├── locals.tf NEW (Task 1) +│ ├── variables.tf MODIFY (Task 4 — descriptions) +│ ├── outputs.tf (unchanged) +│ ├── versions.tf (unchanged) +│ └── README.md MODIFY (Task 7; Task 8) +├── databricks-workspace/ +│ ├── main.tf REWRITE (Task 1 — only module blocks remain) +│ ├── locals.tf NEW (Task 1) +│ ├── preconditions.tf NEW (Task 1 — also adds region precondition in Task 5) +│ ├── random.tf NEW (Task 1) +│ ├── variables.tf MODIFY (Task 4 — descriptions) +│ ├── outputs.tf REWRITE (Task 3 — drop vpc_id, add 14 outputs, explicit ternaries) +│ ├── versions.tf (unchanged) +│ └── README.md 
MODIFY (Task 7; Task 8) +├── service-account/ +│ ├── init.tf DELETE (Task 6 — split into versions.tf; provider block removed) +│ ├── versions.tf NEW (Task 6) +│ ├── main.tf (unchanged) +│ ├── variables.tf MODIFY (Task 4 — descriptions if missing) +│ ├── outputs.tf (unchanged) +│ └── README.md MODIFY (Task 7; Task 8) +└── unity-catalog/ + ├── terraform.tf RENAME → versions.tf (Task 6) + ├── databricks-cloud-resources.tf (unchanged) + ├── gcs.tf (unchanged) + ├── variables.tf MODIFY (Task 4 if any are undocumented) + └── README.md MODIFY (Task 7; Task 8) + +examples/ +├── gcp-basic/ +│ ├── init.tf DELETE (Task 6 — split) +│ ├── versions.tf NEW (Task 6) +│ ├── providers.tf NEW (Task 6) +│ └── (other files unchanged) +├── gcp-byovpc/ (same pattern) +├── gcp-existing-vpc/ (same pattern) +├── gcp-with-psc-exfiltration-protection/ +│ └── terraform.tf RENAME → versions.tf (Task 6) +└── gcp-sa-provisioning/ + ├── init.tf DELETE (Task 6 — split; google provider block now lives here, not in the module) + ├── versions.tf NEW (Task 6) + └── providers.tf NEW (Task 6) +``` + +The 8 tasks below correspond to the 8 commits in the spec's "Implementation phasing" section, in order: + +1. **Task 1:** Split module files by concern (no behavioral change) +2. **Task 2:** Rename forwarding-rule outputs in `private-connectivity` and inputs in `account` (coordinated) +3. **Task 3:** Expand and rename composer outputs (drop `vpc_id`, add 14 outputs, switch `try()` to explicit ternaries) +4. **Task 4:** Add `description` to ~70 variables across 5 modules +5. **Task 5:** Add region validations on `private-connectivity` and composer +6. **Task 6:** Standardize provider/versions placement across modules and examples +7. **Task 7:** Add `## Usage` sections to module READMEs +8. 
**Task 8:** Regenerate terraform-docs READMEs across all modules + +--- + +## Conventions for every task + +- **No `.terraform/` cleanup needed**: existing `.gitignore` already excludes `.terraform/` and `.terraform.lock.hcl`. They won't appear in commits. +- **`make docs` must only be run from inside a specific module directory**, never from the repo root or `modules/`. Running it higher up regenerates non-GCP READMEs and creates drift. +- **Validate after every code change**: `cd && terraform init -backend=false && terraform validate`. Required to pass before commit. +- **Commit messages use the Co-authored-by: Isaac trailer** per the project commit template. +- **Do not push the branch.** Pushing/PR-management happens after all tasks complete and the user reviews. + +--- + +## Task 1: Split module files by concern + +**Goal:** Reorganize `network`, `account`, `dns`, and `databricks-workspace` (composer) into focused single-concern files. No resource attributes change; only file boundaries. + +**Files:** + +In `modules/gcp/network/`: +- Delete: `main.tf` +- Create: `vpc.tf`, `subnets.tf`, `nat.tf`, `peering.tf`, `shared-vpc.tf`, `data.tf`, `locals.tf` + +In `modules/gcp/account/`: +- Delete: `main.tf` +- Create: `workspace.tf`, `networks.tf`, `locals.tf` + +In `modules/gcp/dns/`: +- Modify: `hub.tf` (remove the `locals { workspace_dns_id = ... 
}` block) +- Create: `locals.tf` + +In `modules/gcp/databricks-workspace/`: +- Rewrite: `main.tf` (only the 4 module blocks remain) +- Create: `locals.tf`, `preconditions.tf`, `random.tf` + +- [ ] **Step 1: Create `modules/gcp/network/locals.tf`** + +```hcl +locals { + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + subnet_name = coalesce(var.subnet_name, "${var.prefix}-subnet-${var.suffix}") +} +``` + +- [ ] **Step 2: Create `modules/gcp/network/vpc.tf`** + +```hcl +# === Spoke VPC (created) ================================================ +resource "google_compute_network" "spoke_vpc" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-spoke-vpc-${var.suffix}" + project = var.spoke_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +# === Hub VPC ============================================================ +resource "google_compute_network" "hub_vpc" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-vpc-${var.suffix}" + project = var.hub_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} +``` + +- [ ] **Step 3: Create `modules/gcp/network/subnets.tf`** + +```hcl +# === Spoke subnet ======================================================= +resource "google_compute_subnetwork" "spoke_subnet" { + count = local.create_vpc ? 1 : 0 + + name = local.subnet_name + project = var.spoke_vpc_google_project + network = google_compute_network.spoke_vpc[0].id + region = var.google_region + ip_cidr_range = var.subnet_cidr + private_ip_google_access = true + + dynamic "secondary_ip_range" { + for_each = var.pod_cidr != null ? [1] : [] + content { + range_name = "pods" + ip_cidr_range = var.pod_cidr + } + } + + dynamic "secondary_ip_range" { + for_each = var.svc_cidr != null ? 
[1] : [] + content { + range_name = "services" + ip_cidr_range = var.svc_cidr + } + } +} + +# === Hub subnet ========================================================= +resource "google_compute_subnetwork" "hub_subnet" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-subnet-${var.suffix}" + project = var.hub_vpc_google_project + network = google_compute_network.hub_vpc[0].id + region = var.google_region + ip_cidr_range = var.hub_vpc_cidr + private_ip_google_access = true +} +``` + +- [ ] **Step 4: Create `modules/gcp/network/nat.tf`** + +```hcl +resource "google_compute_router" "router" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-router-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = google_compute_network.spoke_vpc[0].id +} + +resource "google_compute_router_nat" "nat" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-nat-${var.suffix}" + project = var.spoke_vpc_google_project + router = google_compute_router.router[0].name + region = var.google_region + nat_ip_allocate_option = "AUTO_ONLY" + source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" +} +``` + +- [ ] **Step 5: Create `modules/gcp/network/peering.tf`** + +```hcl +resource "google_compute_network_peering" "hub_to_spoke" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-spoke-${var.suffix}" + network = google_compute_network.hub_vpc[0].self_link + peer_network = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link +} + +resource "google_compute_network_peering" "spoke_to_hub" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-spoke-hub-${var.suffix}" + network = local.create_vpc ? 
google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link + peer_network = google_compute_network.hub_vpc[0].self_link +} +``` + +- [ ] **Step 6: Create `modules/gcp/network/shared-vpc.tf`** + +```hcl +resource "google_compute_shared_vpc_host_project" "host" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + project = var.spoke_vpc_google_project +} + +resource "google_compute_shared_vpc_service_project" "service" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + host_project = google_compute_shared_vpc_host_project.host[0].project + service_project = var.workspace_google_project +} +``` + +- [ ] **Step 7: Create `modules/gcp/network/data.tf`** + +```hcl +data "google_compute_network" "existing_spoke" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_vpc_name + project = var.spoke_vpc_google_project +} + +data "google_compute_subnetwork" "existing_spoke_subnet" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_subnet_name + project = var.spoke_vpc_google_project + region = var.google_region +} +``` + +- [ ] **Step 8: Delete `modules/gcp/network/main.tf`** + +```bash +git rm modules/gcp/network/main.tf +``` + +- [ ] **Step 9: Validate the network module** + +```bash +cd modules/gcp/network && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +Also validate each fixture: + +```bash +for d in tests/create tests/existing tests/create-with-hub; do + cd modules/gcp/network/$d && terraform init -backend=false && terraform validate && cd - +done +``` + +Each should print `Success!`. 
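
One caveat with the fixture loop: it relies on `cd ... && cd -`, so a failed `terraform validate` leaves the shell inside the fixture directory, and the relative paths assume the loop starts at the repo root. A subshell-based variant avoids both issues. This is only a sketch; `run_in_each_dir` is a hypothetical helper, not something the repo provides.

```bash
# Hypothetical helper: run a command string inside each listed directory.
# The subshell keeps the caller's working directory untouched, and the
# function returns non-zero if any directory fails.
run_in_each_dir() {
  local cmd="$1" dir failed=0
  shift
  for dir in "$@"; do
    if ( cd "$dir" && sh -c "$cmd" ); then
      printf 'ok:   %s\n' "$dir"
    else
      printf 'FAIL: %s\n' "$dir" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

From the repo root, the fixture check would then look like `run_in_each_dir 'terraform init -backend=false && terraform validate' modules/gcp/network/tests/create modules/gcp/network/tests/existing modules/gcp/network/tests/create-with-hub`.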
+ +- [ ] **Step 10: Create `modules/gcp/account/locals.tf`** + +```hcl +locals { + workspace_name = coalesce(var.workspace_name, "${var.prefix}-ws-${var.suffix}") + emit_mws_networks = var.vpc_source != "databricks_managed" + emit_vpc_endpoints = var.frontend_psc_fr_id != null && var.backend_psc_fr_id != null + emit_pas = var.private_access_only +} +``` + +(Note: variable references here are pre-rename — they get updated in Task 2.) + +- [ ] **Step 11: Create `modules/gcp/account/workspace.tf`** + +```hcl +resource "databricks_mws_workspaces" "this" { + account_id = var.databricks_account_id + workspace_name = local.workspace_name + location = var.google_region + + cloud_resource_container { + gcp { + project_id = var.google_project + } + } + + network_id = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + private_access_settings_id = local.emit_pas ? databricks_mws_private_access_settings.this[0].private_access_settings_id : null + + token { + comment = "Terraform" + } + + depends_on = [var.nat_dependency] +} +``` + +- [ ] **Step 12: Create `modules/gcp/account/networks.tf`** + +```hcl +resource "databricks_mws_networks" "this" { + count = local.emit_mws_networks ? 1 : 0 + + account_id = var.databricks_account_id + network_name = "${var.prefix}-ntw-${var.suffix}" + + gcp_network_info { + network_project_id = var.spoke_vpc_google_project + vpc_id = var.spoke_vpc_name + subnet_id = var.spoke_subnet_name + subnet_region = var.google_region + } + + dynamic "vpc_endpoints" { + for_each = local.emit_vpc_endpoints ? 
[1] : [] + content { + dataplane_relay = [databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id] + rest_api = [databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id] + } + } +} +``` + +- [ ] **Step 13: Delete `modules/gcp/account/main.tf`** + +```bash +git rm modules/gcp/account/main.tf +``` + +- [ ] **Step 14: Validate the account module** + +```bash +cd modules/gcp/account && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +Also validate the three fixtures: + +```bash +for d in tests/databricks-managed tests/byovpc tests/psc-with-pas; do + cd modules/gcp/account/$d && terraform init -backend=false && terraform validate && cd - +done +``` + +- [ ] **Step 15: Create `modules/gcp/dns/locals.tf`** + +```hcl +locals { + # Regex extracts the workspace DNS id (numeric.numeric) from the URL. + workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url) +} +``` + +- [ ] **Step 16: Remove the `locals` block from `modules/gcp/dns/hub.tf`** + +Use the Edit tool on `modules/gcp/dns/hub.tf` to remove lines 1-4 (the `locals` block at the top). The file should now start with the first DNS managed zone resource. Concretely, delete: + +```hcl +locals { + # Regex extracts the workspace DNS id (numeric.numeric) from the URL. + workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url) +} + +``` + +(Including the blank line after the closing brace.) + +- [ ] **Step 17: Validate the dns module** + +Run both commands from the repository root; each returns to it with `cd -`: + +```bash +cd modules/gcp/dns && terraform init -backend=false && terraform validate && cd - +cd modules/gcp/dns/tests/hub-and-spoke && terraform init -backend=false && terraform validate && cd - +``` + +Expected: both `Success!`. 
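To make the `workspace_dns_id` pattern from Step 15 concrete, the sketch below applies the same expression with `grep -oE` to a sample URL. The URL is made up for illustration, and `head -n1` approximates Terraform's `regex`, which returns only the first match. This is a shell stand-in for checking the pattern, not how the module evaluates it.

```bash
# Hypothetical helper: print the first "digits.digits" run in a URL,
# mirroring regex("[0-9]+\\.[0-9]+", var.workspace_url) from dns/locals.tf.
extract_workspace_dns_id() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+' | head -n1
}

# Illustrative URL only; the real value comes from the account module's workspace_url.
extract_workspace_dns_id "https://1234567890123456.7.gcp.databricks.com"
# prints: 1234567890123456.7
```

If a URL contained no `digits.digits` run, this helper would print nothing, whereas Terraform's `regex` raises an error on no match.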
+ +- [ ] **Step 18: Create `modules/gcp/databricks-workspace/locals.tf`** + +```hcl +locals { + databricks_managed = var.vpc_source == "databricks_managed" + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + any_private_link = var.private_link_frontend || var.private_link_backend + spoke_project = coalesce(var.spoke_vpc_google_project, var.google_project) +} +``` + +- [ ] **Step 19: Create `modules/gcp/databricks-workspace/random.tf`** + +```hcl +resource "random_string" "suffix" { + length = 6 + special = false + upper = false + + lifecycle { + ignore_changes = [special, upper] + } +} +``` + +- [ ] **Step 20: Create `modules/gcp/databricks-workspace/preconditions.tf`** + +```hcl +# Cross-variable preconditions. +resource "null_resource" "preconditions" { + lifecycle { + precondition { + condition = !var.restricted_egress || local.create_vpc + error_message = "restricted_egress=true requires vpc_source=\"create\" (hub-spoke topology needs us to own both VPCs)." + } + precondition { + condition = !var.restricted_egress || local.any_private_link + error_message = "restricted_egress=true requires at least one of private_link_frontend or private_link_backend." + } + precondition { + condition = !var.restricted_egress || (var.hub_vpc_google_project != null && var.hub_vpc_cidr != null && var.psc_subnet_cidr != null) + error_message = "restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, and psc_subnet_cidr." + } + precondition { + condition = !local.create_vpc || (var.spoke_vpc_cidr != null && var.subnet_cidr != null) + error_message = "vpc_source=\"create\" requires spoke_vpc_cidr and subnet_cidr." + } + precondition { + condition = !local.use_existing_vpc || (var.existing_vpc_name != null && var.existing_subnet_name != null) + error_message = "vpc_source=\"existing\" requires existing_vpc_name and existing_subnet_name." 
+ } + precondition { + condition = !local.databricks_managed || (!var.private_link_frontend && !var.private_link_backend && !var.restricted_egress) + error_message = "vpc_source=\"databricks_managed\" forbids private_link_frontend, private_link_backend, and restricted_egress." + } + } +} +``` + +- [ ] **Step 21: Rewrite `modules/gcp/databricks-workspace/main.tf` to only contain module blocks** + +Replace the entire file content with: + +```hcl +module "network" { + source = "../network" + count = local.databricks_managed ? 0 : 1 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + vpc_source = var.vpc_source + spoke_vpc_google_project = local.spoke_project + + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr + + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name + + create_hub = var.restricted_egress + hub_vpc_google_project = var.hub_vpc_google_project + hub_vpc_cidr = var.hub_vpc_cidr + is_spoke_vpc_shared = var.is_spoke_vpc_shared + workspace_google_project = var.google_project +} + +module "private_connectivity" { + source = "../private-connectivity" + count = local.any_private_link ? 1 : 0 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + spoke_vpc_cidr = var.spoke_vpc_cidr + + hub_vpc_id = var.restricted_egress ? module.network[0].hub_vpc_id : null + hub_vpc_self_link = var.restricted_egress ? module.network[0].hub_vpc_self_link : null + hub_vpc_google_project = var.hub_vpc_google_project + hub_subnet_name = var.restricted_egress ? 
module.network[0].hub_subnet_name : null + hub_vpc_cidr = var.hub_vpc_cidr + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + restrict_egress = var.restricted_egress + psc_subnet_cidr = var.psc_subnet_cidr + + hive_metastore_ip = var.hive_metastore_ip +} + +module "account" { + source = "../account" + + prefix = var.prefix + suffix = random_string.suffix.result + workspace_name = var.workspace_name + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + vpc_source = var.vpc_source + + spoke_vpc_name = local.databricks_managed ? null : module.network[0].spoke_vpc_name + spoke_subnet_name = local.databricks_managed ? null : module.network[0].spoke_subnet_name + spoke_vpc_google_project = local.spoke_project + hub_vpc_google_project = var.hub_vpc_google_project + + frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].frontend_psc_fr_id : null + backend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].backend_psc_fr_id : null + hub_frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].hub_frontend_psc_fr_id : null + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + private_access_only = var.private_access_only + + nat_dependency = local.databricks_managed ? null : module.network[0].nat_id +} + +module "dns" { + source = "../dns" + count = var.restricted_egress ? 
1 : 0 + + prefix = var.prefix + google_region = var.google_region + + hub_vpc_id = module.network[0].hub_vpc_id + hub_vpc_self_link = module.network[0].hub_vpc_self_link + hub_vpc_google_project = var.hub_vpc_google_project + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity[0].frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity[0].frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity[0].backend_psc_ip_spoke +} +``` + +- [ ] **Step 22: Validate the composer module** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +Also validate each fixture (4 positive + 4 negative): + +```bash +for d in tests/basic tests/byovpc tests/existing-vpc tests/psc-isolated; do + cd modules/gcp/databricks-workspace/$d && terraform init -backend=false && terraform validate && cd - +done +``` + +Each should print `Success!`. + +The 4 negative fixtures don't need validation here — they're for plan-time precondition failures and we haven't changed precondition logic. + +- [ ] **Step 23: Commit** + +```bash +git add modules/gcp/network/ modules/gcp/account/ modules/gcp/dns/ modules/gcp/databricks-workspace/ +git commit -m "$(cat <<'EOF' +refactor(gcp): split module files by concern + +Reorganizes network, account, dns, and databricks-workspace modules +so each .tf file has one clear responsibility. No resource attributes +change; only file boundaries. 
+ +- network/main.tf -> vpc.tf, subnets.tf, nat.tf, peering.tf, + shared-vpc.tf, data.tf, locals.tf +- account/main.tf -> workspace.tf, networks.tf, locals.tf +- dns/hub.tf: workspace_dns_id local extracted to dns/locals.tf +- databricks-workspace/main.tf -> main.tf (module blocks only), + locals.tf, preconditions.tf, random.tf + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 2: Rename forwarding-rule outputs and account inputs + +**Goal:** Fix the `*_psc_fr_id` misnomer. These are forwarding-rule **names**, not IDs. Rename outputs in `private-connectivity`, inputs in `account`, and the wiring in the composer — all in one commit so no intermediate state is broken. + +**Files:** +- Modify: `modules/gcp/private-connectivity/outputs.tf` +- Modify: `modules/gcp/account/variables.tf` +- Modify: `modules/gcp/account/locals.tf` +- Modify: `modules/gcp/account/networks.tf` +- Modify: `modules/gcp/account/vpc-endpoints.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Modify: `modules/gcp/databricks-workspace/main.tf` +- Modify: `modules/gcp/account/tests/psc-with-pas/main.tf` + +- [ ] **Step 1: Rename outputs in `modules/gcp/private-connectivity/outputs.tf`** + +Replace the three `*_psc_fr_id` outputs. Use the Edit tool with these exact replacements: + +Old (lines 6-19): +```hcl +output "frontend_psc_fr_id" { + value = var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_spoke[0].name : null + description = "Name of the frontend PSC forwarding rule (null when enable_frontend=false)" +} + +output "backend_psc_fr_id" { + value = var.enable_backend ? google_compute_forwarding_rule.backend_fr[0].name : null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} + +output "hub_frontend_psc_fr_id" { + value = local.hub_present && var.enable_frontend ? 
google_compute_forwarding_rule.frontend_fr_hub[0].name : null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} +``` + +New: +```hcl +output "frontend_forwarding_rule_name" { + value = var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_spoke[0].name : null + description = "Name of the spoke-side frontend PSC forwarding rule (null when enable_frontend=false)" +} + +output "backend_forwarding_rule_name" { + value = var.enable_backend ? google_compute_forwarding_rule.backend_fr[0].name : null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} + +output "hub_frontend_forwarding_rule_name" { + value = local.hub_present && var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_hub[0].name : null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} +``` + +- [ ] **Step 2: Rename inputs in `modules/gcp/account/variables.tf`** + +Use Edit to rename the three variables. 
Old: + +```hcl +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_psc_fr_id" { + type = string + default = null +} + +variable "backend_psc_fr_id" { + type = string + default = null +} + +variable "hub_frontend_psc_fr_id" { + type = string + default = null +} +``` + +New: + +```hcl +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_forwarding_rule_name" { + type = string + default = null +} + +variable "backend_forwarding_rule_name" { + type = string + default = null +} + +variable "hub_frontend_forwarding_rule_name" { + type = string + default = null +} +``` + +- [ ] **Step 3: Update references in `modules/gcp/account/locals.tf`** + +Old line: +```hcl + emit_vpc_endpoints = var.frontend_psc_fr_id != null && var.backend_psc_fr_id != null +``` + +New: +```hcl + emit_vpc_endpoints = var.frontend_forwarding_rule_name != null && var.backend_forwarding_rule_name != null +``` + +- [ ] **Step 4: Update references in `modules/gcp/account/vpc-endpoints.tf`** + +The three `databricks_mws_vpc_endpoint` resources each reference one of the renamed variables. Use Edit with `replace_all=true` to replace `var.frontend_psc_fr_id` → `var.frontend_forwarding_rule_name`. Then do the same for `var.backend_psc_fr_id` → `var.backend_forwarding_rule_name` and `var.hub_frontend_psc_fr_id` → `var.hub_frontend_forwarding_rule_name`. + +- [ ] **Step 5: Update references in `modules/gcp/account/outputs.tf`** + +Three outputs reference these variables in their conditional. Same rename treatment: +- `var.frontend_psc_fr_id` → `var.frontend_forwarding_rule_name` +- `var.backend_psc_fr_id` → `var.backend_forwarding_rule_name` +- `var.hub_frontend_psc_fr_id` → `var.hub_frontend_forwarding_rule_name` + +- [ ] **Step 6: Update composer wiring in `modules/gcp/databricks-workspace/main.tf`** + +In the `module "account" { ... 
}` block, rename the three input arguments: + +Old (within `module "account"`): +```hcl + frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].frontend_psc_fr_id : null + backend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].backend_psc_fr_id : null + hub_frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].hub_frontend_psc_fr_id : null +``` + +New: +```hcl + frontend_forwarding_rule_name = local.any_private_link ? module.private_connectivity[0].frontend_forwarding_rule_name : null + backend_forwarding_rule_name = local.any_private_link ? module.private_connectivity[0].backend_forwarding_rule_name : null + hub_frontend_forwarding_rule_name = local.any_private_link ? module.private_connectivity[0].hub_frontend_forwarding_rule_name : null +``` + +- [ ] **Step 7: Update the test fixture `modules/gcp/account/tests/psc-with-pas/main.tf`** + +The fixture passes `frontend_psc_fr_id`, `backend_psc_fr_id`, `hub_frontend_psc_fr_id`. Rename to match: + +Old: +```hcl + frontend_psc_fr_id = "fixture-psc-ws-ep-abc123" + backend_psc_fr_id = "fixture-psc-scc-ep-abc123" + hub_frontend_psc_fr_id = "fixture-hub-psc-ws-ep-abc123" +``` + +New: +```hcl + frontend_forwarding_rule_name = "fixture-psc-ws-ep-abc123" + backend_forwarding_rule_name = "fixture-psc-scc-ep-abc123" + hub_frontend_forwarding_rule_name = "fixture-hub-psc-ws-ep-abc123" +``` + +- [ ] **Step 8: Validate the affected modules** + +```bash +for m in private-connectivity account databricks-workspace; do + cd modules/gcp/$m && terraform init -backend=false && terraform validate && cd - +done +``` + +Each should print `Success!`. + +Also validate the account fixture that uses the renamed inputs: + +```bash +cd modules/gcp/account/tests/psc-with-pas && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. 
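`terraform validate` should flag a stale variable or output reference, but a leftover `*_psc_fr_id` name in a comment or description string would still pass. A minimal sketch of a complementary text search follows, assuming it runs from the repository root. `find_stale_fr_ids` is a hypothetical helper name.

```bash
# Hypothetical helper: list any remaining psc_fr_id identifiers in .tf files
# under the given directory. Returns 0 when the tree is clean, 1 otherwise.
find_stale_fr_ids() {
  local root="${1:-modules/gcp}"
  if grep -rn --include='*.tf' 'psc_fr_id' "$root"; then
    echo "stale *_psc_fr_id references remain under $root" >&2
    return 1
  fi
  echo "no *_psc_fr_id references under $root"
}
```

Any hit is only a prompt to look at the file: the search is textual and does not know whether the match is a live reference or prose.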
+ +Also validate the composer's psc-isolated fixture: + +```bash +cd modules/gcp/databricks-workspace/tests/psc-isolated && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 9: Commit** + +```bash +git add modules/gcp/private-connectivity/ modules/gcp/account/ modules/gcp/databricks-workspace/ +git commit -m "$(cat <<'EOF' +refactor(gcp): rename *_psc_fr_id to *_forwarding_rule_name + +These outputs/inputs hold the GCP forwarding-rule name (an attribute +named `.name`, not `.id`). Renaming for accuracy: + +- private-connectivity outputs: frontend_psc_fr_id -> frontend_forwarding_rule_name, + backend_psc_fr_id -> backend_forwarding_rule_name, + hub_frontend_psc_fr_id -> hub_frontend_forwarding_rule_name +- account inputs renamed to match +- composer wiring updated; psc-with-pas fixture updated + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 3: Expand and rename composer outputs + +**Goal:** Drop redundant `vpc_id`, add 14 new outputs, switch `try(module.network[0].*)` to explicit `local.databricks_managed ?` ternaries. Also add the missing `private_access_settings_id` output on the `account` module. + +**Files:** +- Modify: `modules/gcp/account/outputs.tf` +- Rewrite: `modules/gcp/databricks-workspace/outputs.tf` +- Modify: `examples/gcp-byovpc/outputs.tf` (replace `vpc_id` reference with `spoke_vpc_id`) +- Modify: `examples/gcp-with-psc-exfiltration-protection/outputs.tf` (same) + +- [ ] **Step 1: Add `private_access_settings_id` output to `modules/gcp/account/outputs.tf`** + +Append (after the existing `transit_endpoint_id` output): + +```hcl + +output "private_access_settings_id" { + value = local.emit_pas ? 
databricks_mws_private_access_settings.this[0].private_access_settings_id : null + description = "databricks_mws_private_access_settings ID (null when private_access_only=false)" +} +``` + +- [ ] **Step 2: Validate the account module** + +```bash +cd modules/gcp/account && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 3: Rewrite `modules/gcp/databricks-workspace/outputs.tf`** + +Replace entire file with: + +```hcl +# === Workspace =========================================================== +output "workspace_id" { + value = module.account.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.account.workspace_url + description = "Databricks workspace URL (https://..gcp.databricks.com)" +} + +output "network_id" { + value = module.account.network_id + description = "databricks_mws_networks ID (null when vpc_source=databricks_managed)" +} + +output "private_access_settings_id" { + value = module.account.private_access_settings_id + description = "databricks_mws_private_access_settings ID (null when private_access_only=false)" +} + +# === mws_vpc_endpoint IDs (Databricks-side PSC registration) ============ +output "frontend_endpoint_id" { + value = module.account.frontend_endpoint_id + description = "Frontend mws_vpc_endpoint ID (null when private_link_frontend=false)" +} + +output "backend_endpoint_id" { + value = module.account.backend_endpoint_id + description = "Backend (SCC) mws_vpc_endpoint ID (null when private_link_backend=false)" +} + +output "transit_endpoint_id" { + value = module.account.transit_endpoint_id + description = "Hub-side mws_vpc_endpoint ID (null when no hub or no frontend PSC)" +} + +# === Network ============================================================= +output "spoke_vpc_id" { + value = local.databricks_managed ? 
null : module.network[0].spoke_vpc_id + description = "Spoke VPC ID (null when vpc_source=databricks_managed)" +} + +output "spoke_vpc_self_link" { + value = local.databricks_managed ? null : module.network[0].spoke_vpc_self_link + description = "Spoke VPC self-link (null when vpc_source=databricks_managed)" +} + +output "spoke_subnet_id" { + value = local.databricks_managed ? null : module.network[0].spoke_subnet_id + description = "Spoke subnet ID (null when vpc_source=databricks_managed)" +} + +output "spoke_subnet_self_link" { + value = local.databricks_managed ? null : module.network[0].spoke_subnet_self_link + description = "Spoke subnet self-link (null when vpc_source=databricks_managed)" +} + +output "hub_vpc_id" { + value = var.restricted_egress ? module.network[0].hub_vpc_id : null + description = "Hub VPC ID (null when restricted_egress=false)" +} + +output "hub_vpc_self_link" { + value = var.restricted_egress ? module.network[0].hub_vpc_self_link : null + description = "Hub VPC self-link (null when restricted_egress=false)" +} + +output "nat_id" { + value = local.create_vpc ? module.network[0].nat_id : null + description = "Cloud NAT ID (null when vpc_source != create)" +} + +# === Private connectivity =============================================== +output "frontend_psc_ip_spoke" { + value = local.any_private_link ? module.private_connectivity[0].frontend_psc_ip_spoke : null + description = "IP address of the spoke-side frontend PSC endpoint (null when no PSC)" +} + +output "backend_psc_ip_spoke" { + value = local.any_private_link ? module.private_connectivity[0].backend_psc_ip_spoke : null + description = "IP address of the spoke-side backend PSC endpoint (null when no PSC)" +} + +output "frontend_psc_ip_hub" { + value = var.restricted_egress ? 
module.private_connectivity[0].frontend_psc_ip_hub : null + description = "IP address of the hub-side frontend PSC endpoint (null when restricted_egress=false)" +} + +# === Identifiers ======================================================== +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names (useful when wiring downstream modules)" +} + +output "google_region" { + value = var.google_region + description = "Region the workspace was deployed to (echo of input; convenient for downstream modules)" +} +``` + +- [ ] **Step 4: Validate the composer module** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +Also validate all 4 positive fixtures: + +```bash +for d in tests/basic tests/byovpc tests/existing-vpc tests/psc-isolated; do + cd modules/gcp/databricks-workspace/$d && terraform init -backend=false && terraform validate && cd - +done +``` + +Each should print `Success!`. + +- [ ] **Step 5: Update `examples/gcp-byovpc/outputs.tf`** + +Current file references `module.workspace.vpc_id`. Replace with `module.workspace.spoke_vpc_id`. Use Edit: + +Old: +```hcl +output "vpc_id" { + value = module.workspace.vpc_id + description = "ID of the spoke VPC created by the module" +} +``` + +New: +```hcl +output "vpc_id" { + value = module.workspace.spoke_vpc_id + description = "ID of the spoke VPC created by the module" +} +``` + +(The example's output name `vpc_id` stays — only the composer's removed `vpc_id` is the breaking change; consumers update their reference.) + +- [ ] **Step 6: Update `examples/gcp-with-psc-exfiltration-protection/outputs.tf`** + +Same treatment if there's a `module.workspace.vpc_id` reference. Inspect first: + +```bash +grep -n "vpc_id" examples/gcp-with-psc-exfiltration-protection/outputs.tf +``` + +If a reference exists, replace `module.workspace.vpc_id` with `module.workspace.spoke_vpc_id` in the same way. 
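Steps 5 and 6 apply the same one-token substitution. The sketch below does it idempotently from the shell, assuming the only composer reference to rewrite is the literal `module.workspace.vpc_id`. `repoint_vpc_id` is a hypothetical helper; the plan itself uses the Edit tool.

```bash
# Hypothetical helper: rewrite module.workspace.vpc_id to
# module.workspace.spoke_vpc_id in one file; do nothing if it is absent.
repoint_vpc_id() {
  local file="$1"
  if grep -q 'module\.workspace\.vpc_id' "$file"; then
    # -i.bak is accepted by both GNU and BSD sed; the backup is removed afterwards.
    sed -i.bak 's/module\.workspace\.vpc_id/module.workspace.spoke_vpc_id/g' "$file"
    rm -f "$file.bak"
    echo "updated: $file"
  else
    echo "no change: $file"
  fi
}
```

Because the pattern does not match `module.workspace.spoke_vpc_id`, running the helper a second time reports `no change`, and the example's own `output "vpc_id"` name is left untouched.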
 + +- [ ] **Step 7: Validate the two examples** + +```bash +cd examples/gcp-byovpc && terraform init -backend=false && terraform validate && cd - +cd examples/gcp-with-psc-exfiltration-protection && terraform init -backend=false && terraform validate && cd - +``` + +Each should print `Success!`. + +- [ ] **Step 8: Commit** + +```bash +git add modules/gcp/account/outputs.tf modules/gcp/databricks-workspace/outputs.tf examples/gcp-byovpc/outputs.tf examples/gcp-with-psc-exfiltration-protection/outputs.tf +git commit -m "$(cat <<'EOF' +feat(gcp/databricks-workspace): expand and rename composer outputs + +- Drop redundant vpc_id (was an alias of spoke_vpc_id) +- Add 14 outputs: private_access_settings_id, frontend/backend/transit + _endpoint_id, spoke_vpc_self_link, spoke_subnet_id/self_link, + hub_vpc_self_link, nat_id, frontend/backend_psc_ip_spoke, + frontend_psc_ip_hub, google_region +- Replace try(module.network[0].*) with explicit local.databricks_managed + / var.restricted_egress ternaries (same behavior, intent visible) +- Add private_access_settings_id output on the account module +- Update gcp-byovpc and gcp-with-psc examples to consume spoke_vpc_id + instead of the removed vpc_id + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 4: Add descriptions to all module variables + +**Goal:** Ensure every variable in every module has a `description`. Current coverage (variables with a description / total variables): network 17/17, private-connectivity 2/17, account 1/18, dns 0/12, composer 1/23. The service-account and unity-catalog modules still need to be checked. 
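The coverage figures in the goal can be re-derived with a rough count. The sketch below assumes the formatting used throughout this plan: each `variable` block opens at column 0, and a `description` argument appears at most once per variable, indented on its own line. `description_coverage` is a hypothetical helper, and the count is textual, not an HCL parse.

```bash
# Hypothetical helper: print "<with description>/<total>" for one variables.tf.
description_coverage() {
  local file="$1"
  local total described
  # grep -c prints 0 but exits 1 when nothing matches, hence the "|| true".
  total=$(grep -cE '^variable "' "$file" || true)
  described=$(grep -cE '^[[:space:]]+description[[:space:]]*=' "$file" || true)
  echo "${described}/${total}"
}

# Example use from the repository root (commented out; needs the real files):
# for m in network private-connectivity account dns databricks-workspace; do
#   echo "$m: $(description_coverage "modules/gcp/$m/variables.tf")"
# done
```

After Task 4 the two numbers should match for every module; a mismatch points at a variable that still lacks a description, or at a file whose layout breaks the assumptions above.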
+ +**Files:** +- Modify: `modules/gcp/private-connectivity/variables.tf` +- Modify: `modules/gcp/account/variables.tf` +- Modify: `modules/gcp/dns/variables.tf` +- Modify: `modules/gcp/databricks-workspace/variables.tf` +- Modify: `modules/gcp/service-account/variables.tf` (only if any lack descriptions) +- Modify: `modules/gcp/unity-catalog/variables.tf` (only if any lack descriptions) + +- [ ] **Step 1: Rewrite `modules/gcp/private-connectivity/variables.tf`** + +Replace entire file with: + +```hcl +variable "prefix" { + type = string + description = "Prefix used to name generated resources" +} + +variable "suffix" { + type = string + description = "Random suffix appended to resource names for uniqueness (passed by the composer)" +} + +variable "google_region" { + type = string + description = "GCP region for PSC and firewall resources (must be one of the regions in the regional PSC service-attachment maps)" +} + +# Spoke network refs +variable "spoke_vpc_id" { + type = string + description = "ID of the spoke VPC (output from the network module)" +} + +variable "spoke_vpc_self_link" { + type = string + description = "Self-link of the spoke VPC (used as the network reference for firewall rules)" +} + +variable "spoke_vpc_google_project" { + type = string + description = "GCP project that hosts the spoke VPC" +} + +variable "spoke_vpc_cidr" { + type = string + description = "CIDR of the spoke VPC address space (used as source_ranges for the hub ingress firewall)" +} + +# Hub network refs (nullable when no hub) +variable "hub_vpc_id" { + type = string + default = null + description = "ID of the hub VPC (null when no hub is created)" +} + +variable "hub_vpc_self_link" { + type = string + default = null + description = "Self-link of the hub VPC (null when no hub is created)" +} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project that hosts the hub VPC (null when no hub is created)" +} + +variable "hub_subnet_name" { + 
type = string + default = null + description = "Name of the hub subnet (used as the subnetwork reference for the hub-side PSC address)" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR of the hub VPC address space (reserved for future use)" +} + +# Feature flags +variable "enable_frontend" { + type = bool + default = false + description = "Create the frontend (workspace UI/API) PSC endpoint on the spoke and, if hub exists, the hub side" +} + +variable "enable_backend" { + type = bool + default = false + description = "Create the backend (SCC, data plane) PSC endpoint on the spoke" +} + +variable "restrict_egress" { + type = bool + default = false + description = "Create the egress firewall stack: deny-egress, allow Google APIs, allow control plane, allow managed Hive (conditional), hub ingress" +} + +# PSC subnet CIDR +variable "psc_subnet_cidr" { + type = string + description = "CIDR for the dedicated PSC subnet in the spoke VPC" +} + +variable "hive_metastore_ip" { + type = string + default = null + description = "Regional Hive metastore IP used by the managed-hive allow rule. Looked up via internal map when null; firewall rule is skipped if the lookup also yields empty" +} +``` + +- [ ] **Step 2: Validate private-connectivity** + +```bash +cd modules/gcp/private-connectivity && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 3: Rewrite `modules/gcp/account/variables.tf`** + +(Note: variable renames from Task 2 are assumed in place — `*_forwarding_rule_name`.) + +Replace entire file with: + +```hcl +variable "prefix" { + type = string + description = "Prefix used to name generated resources" +} + +variable "suffix" { + type = string + description = "Random suffix appended to resource names for uniqueness (passed by the composer)" +} + +variable "workspace_name" { + type = string + default = null + description = "Optional workspace name override. 
Defaults to \"${prefix}-ws-${suffix}\" when null" +} + +variable "databricks_account_id" { + type = string + description = "Databricks account ID (GUID) where this workspace will be registered" +} + +variable "google_project" { + type = string + description = "GCP project ID hosting the workspace data plane" +} + +variable "google_region" { + type = string + description = "GCP region where the workspace will be deployed" +} + +variable "vpc_source" { + type = string + description = "One of: databricks_managed (no mws_networks), create (we built the VPC), existing (data-source lookup)" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." + } +} + +variable "spoke_vpc_name" { + type = string + default = null + description = "Name of the spoke VPC used in databricks_mws_networks.gcp_network_info.vpc_id (null when vpc_source=databricks_managed)" +} + +variable "spoke_subnet_name" { + type = string + default = null + description = "Name of the spoke subnet used in databricks_mws_networks.gcp_network_info.subnet_id (null when vpc_source=databricks_managed)" +} + +variable "spoke_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the spoke VPC (used in databricks_mws_networks.gcp_network_info.network_project_id)" +} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC (used for the transit databricks_mws_vpc_endpoint when restricted_egress is enabled)" +} + +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_forwarding_rule_name" { + type = string + default = null + description = "Name of the frontend PSC forwarding rule from private-connectivity; gates frontend mws_vpc_endpoint creation" +} + +variable "backend_forwarding_rule_name" { + type = string + default = null + description 
= "Name of the backend (SCC) PSC forwarding rule from private-connectivity; gates backend mws_vpc_endpoint creation" +} + +variable "hub_frontend_forwarding_rule_name" { + type = string + default = null + description = "Name of the hub-side frontend PSC forwarding rule from private-connectivity; gates transit mws_vpc_endpoint creation" +} + +variable "enable_frontend" { + type = bool + default = false + description = "Create the frontend mws_vpc_endpoint (and, if hub_frontend_forwarding_rule_name is set, the transit endpoint)" +} + +variable "enable_backend" { + type = bool + default = false + description = "Create the backend (SCC) mws_vpc_endpoint" +} + +variable "private_access_only" { + type = bool + default = false + description = "Create databricks_mws_private_access_settings with public_access_enabled=false and attach it to the workspace" +} + +variable "nat_dependency" { + type = any + default = null + description = "Opaque value (typically the Cloud NAT ID) used as depends_on for the workspace to ensure NAT readiness before workspace creation" +} +``` + +- [ ] **Step 4: Validate account** + +```bash +cd modules/gcp/account && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. 
+ +- [ ] **Step 5: Rewrite `modules/gcp/dns/variables.tf`** + +Replace entire file with: + +```hcl +variable "prefix" { + type = string + description = "Prefix used to name generated DNS managed zones" +} + +variable "google_region" { + type = string + description = "GCP region (used in the spoke tunnel DNS record name)" +} + +# Hub +variable "hub_vpc_id" { + type = string + description = "ID of the hub VPC (DNS zones with this VPC's visibility)" +} + +variable "hub_vpc_self_link" { + type = string + description = "Self-link of the hub VPC" +} + +variable "hub_vpc_google_project" { + type = string + description = "GCP project hosting the hub VPC (used for the hub DNS zones)" +} + +# Spoke +variable "spoke_vpc_id" { + type = string + description = "ID of the spoke VPC (DNS zone with this VPC's visibility)" +} + +variable "spoke_vpc_self_link" { + type = string + description = "Self-link of the spoke VPC" +} + +variable "spoke_vpc_google_project" { + type = string + description = "GCP project hosting the spoke VPC (used for the spoke DNS zone)" +} + +# Workspace +variable "workspace_url" { + type = string + description = "Workspace URL from databricks_mws_workspaces; used to extract the workspace DNS ID via regex" +} + +# PSC IPs +variable "frontend_psc_ip_spoke" { + type = string + description = "Spoke-side frontend PSC endpoint IP (used in the spoke gcp.databricks.com A records)" +} + +variable "frontend_psc_ip_hub" { + type = string + default = null + description = "Hub-side frontend PSC endpoint IP (used in the hub gcp.databricks.com A records)" +} + +variable "backend_psc_ip_spoke" { + type = string + description = "Spoke-side backend (SCC) PSC endpoint IP (used in the spoke tunnel..gcp.databricks.com A record)" +} +``` + +- [ ] **Step 6: Validate dns** + +```bash +cd modules/gcp/dns && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. 
+ +- [ ] **Step 7: Rewrite `modules/gcp/databricks-workspace/variables.tf`** + +Replace entire file with: + +```hcl +# === Identity ============================================================ +variable "prefix" { + type = string + description = "Prefix used to name generated resources (e.g. \"acme\" produces \"acme-spoke-vpc-\")" +} + +variable "databricks_account_id" { + type = string + description = "Databricks account ID (GUID) where this workspace will be registered" +} + +variable "google_project" { + type = string + description = "GCP project ID hosting the workspace data plane" +} + +variable "google_region" { + type = string + description = "GCP region where the workspace will be deployed. When any private_link_* flag or restricted_egress is true, the region must be supported by Databricks PSC (see preconditions.tf)" +} + +variable "workspace_name" { + type = string + default = null + # "$${" escapes template interpolation; a variable description must be a literal string + description = "Optional workspace name override. Defaults to \"$${prefix}-ws-$${suffix}\" when null" +} + +variable "tags" { + type = map(string) + default = {} + description = "Map of tags. Currently not propagated to child resources; reserved for future use" +} + +# === VPC source ========================================================== +variable "vpc_source" { + type = string + default = "databricks_managed" + description = "Where the workspace VPC comes from. One of: databricks_managed (no networking module called), create (Terraform creates VPC + subnet + NAT), existing (data-source lookup)" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." + } +} + +# When vpc_source = "create" +variable "spoke_vpc_cidr" { + type = string + default = null + description = "CIDR of the spoke VPC address space (e.g. 10.0.0.0/16). 
Required when vpc_source=create; ignored otherwise" +} + +variable "subnet_cidr" { + type = string + default = null + description = "CIDR of the spoke subnet primary range (e.g. 10.0.0.0/22). Required when vpc_source=create" +} + +variable "pod_cidr" { + type = string + default = null + description = "Optional CIDR for the GKE pods secondary range. Adds a secondary_ip_range to the spoke subnet when set" +} + +variable "svc_cidr" { + type = string + default = null + description = "Optional CIDR for the GKE services secondary range. Adds a secondary_ip_range to the spoke subnet when set" +} + +# When vpc_source = "existing" +variable "existing_vpc_name" { + type = string + default = null + description = "Name of the pre-existing VPC to use. Required when vpc_source=existing" +} + +variable "existing_subnet_name" { + type = string + default = null + description = "Name of the pre-existing subnet to use (must be in google_region). Required when vpc_source=existing" +} + +# === Connectivity feature flags ========================================== +variable "private_link_frontend" { + type = bool + default = false + description = "Create the frontend (workspace UI/API) PSC endpoint and a frontend databricks_mws_vpc_endpoint" +} + +variable "private_link_backend" { + type = bool + default = false + description = "Create the backend (SCC, data plane) PSC endpoint and a backend databricks_mws_vpc_endpoint" +} + +variable "private_access_only" { + type = bool + default = false + description = "Create databricks_mws_private_access_settings with public_access_enabled=false. Workspace becomes reachable only through PSC endpoints" +} + +variable "restricted_egress" { + type = bool + default = false + description = "Create hub VPC + bidirectional peering + deny-egress firewall + private DNS zones. 
Requires vpc_source=create and at least one private_link_* flag" +} + +# === Required when restricted_egress = true ============================== +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC. Required when restricted_egress=true" +} + +variable "spoke_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the spoke VPC. Defaults to google_project when null" +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false + description = "If true, bind the spoke VPC project as a Shared-VPC host and the workspace project as a service project. Only takes effect when restricted_egress=true and the two projects differ" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR of the hub subnet (e.g. 10.1.0.0/24). Required when restricted_egress=true" +} + +variable "psc_subnet_cidr" { + type = string + default = null + description = "CIDR of the dedicated PSC subnet in the spoke VPC (e.g. 10.0.255.0/28). Required when restricted_egress=true or any private_link_* flag is true" +} + +variable "hive_metastore_ip" { + type = string + default = null + description = "Regional Hive metastore IP used by the managed-hive allow rule. When null, the regional default is looked up internally; if no default exists for the region, the rule is skipped" +} +``` + +- [ ] **Step 8: Validate composer** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 9: Inspect and update `modules/gcp/service-account/variables.tf` if needed** + +```bash +grep -c "description" modules/gcp/service-account/variables.tf +grep -c "^variable" modules/gcp/service-account/variables.tf +``` + +If the description count equals the variable count, skip. Otherwise add descriptions. 
The current file: + +```bash +cat modules/gcp/service-account/variables.tf +``` + +If any variable lacks a description, add one based on its name (e.g. `google_project` → "GCP project ID where the service account will be created", `prefix` → "Prefix used to name the service account and custom role", `delegate_from` → "Identities (user/group/serviceAccount) that may impersonate the created service account"). + +- [ ] **Step 10: Inspect and update `modules/gcp/unity-catalog/variables.tf` if needed** + +Same pattern as Step 9. Inspect: + +```bash +grep -c "description" modules/gcp/unity-catalog/variables.tf +grep -c "^variable" modules/gcp/unity-catalog/variables.tf +``` + +If gaps exist, add descriptions. + +- [ ] **Step 11: Validate everything again to be safe** + +```bash +for m in private-connectivity account dns databricks-workspace service-account unity-catalog; do + cd modules/gcp/$m && terraform init -backend=false && terraform validate && cd - +done +``` + +All should print `Success!`. + +- [ ] **Step 12: Commit** + +```bash +git add modules/gcp/private-connectivity/variables.tf modules/gcp/account/variables.tf modules/gcp/dns/variables.tf modules/gcp/databricks-workspace/variables.tf modules/gcp/service-account/variables.tf modules/gcp/unity-catalog/variables.tf +git commit -m "$(cat <<'EOF' +docs(gcp): add descriptions to all module variables + +Adds the description attribute to every variable across +private-connectivity, account, dns, databricks-workspace (composer), +and any gaps in service-account / unity-catalog. ~70 descriptions +total. Surfaces in generated terraform-docs READMEs and IDE hovers. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 5: Add region validations + +**Goal:** Validate `google_region` against the supported-region list at two layers: as a variable validation in `private-connectivity` (always enforced), and as a precondition in the composer (only when PSC or restricted_egress is requested). 
+ +**Files:** +- Modify: `modules/gcp/private-connectivity/variables.tf` +- Modify: `modules/gcp/databricks-workspace/preconditions.tf` + +- [ ] **Step 1: Add validation block to `private-connectivity/variables.tf`** + +The `google_region` variable was rewritten in Task 4. Edit it to add a `validation` block: + +Old (after Task 4): +```hcl +variable "google_region" { + type = string + description = "GCP region for PSC and firewall resources (must be one of the regions in the regional PSC service-attachment maps)" +} +``` + +New: +```hcl +variable "google_region" { + type = string + description = "GCP region for PSC and firewall resources (must be one of the regions in the regional PSC service-attachment maps)" + validation { + condition = contains([ + "asia-northeast1", "asia-south1", "asia-southeast1", "australia-southeast1", + "europe-west1", "europe-west2", "europe-west3", "northamerica-northeast1", + "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-west1", "us-west4" + ], var.google_region) + error_message = "google_region must be one of the regions in the regional PSC service-attachment maps. See locals.tf in modules/gcp/private-connectivity." + } +} +``` + +- [ ] **Step 2: Validate private-connectivity passes (with a region that satisfies the new rule)** + +```bash +# Run from the repository root; each subshell keeps its cd from affecting the next command. +(cd modules/gcp/private-connectivity && terraform init -backend=false && terraform validate) +(cd modules/gcp/private-connectivity/tests/full-isolated && terraform validate) +(cd modules/gcp/private-connectivity/tests/no-egress && terraform validate) +``` + +All three should print `Success!`. (Both fixtures use `us-central1`, which is in the list.) + +- [ ] **Step 3: Add region precondition to composer `preconditions.tf`** + +Insert a 7th `precondition { ... }` block inside the existing `null_resource.preconditions.lifecycle` block. 
Use Edit on `modules/gcp/databricks-workspace/preconditions.tf` to add this BEFORE the closing `}` of the `lifecycle` block: + +```hcl + precondition { + condition = ( + !local.any_private_link && !var.restricted_egress + ) || contains([ + "asia-northeast1", "asia-south1", "asia-southeast1", "australia-southeast1", + "europe-west1", "europe-west2", "europe-west3", "northamerica-northeast1", + "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-west1", "us-west4" + ], var.google_region) + error_message = "google_region must be a region supported by Databricks PSC when any private_link_* flag or restricted_egress is true." + } +``` + +- [ ] **Step 4: Validate the composer** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +Validate all positive fixtures still work (they all use us-central1): + +```bash +for d in tests/basic tests/byovpc tests/existing-vpc tests/psc-isolated; do + cd modules/gcp/databricks-workspace/$d && terraform init -backend=false && terraform validate && cd - +done +``` + +Each: `Success!`. + +- [ ] **Step 5: Spot-check that the existing 4 negative fixtures still fail plan as expected** + +The new precondition shouldn't change the behavior of the existing negatives (they fail other rules first). Verify: + +```bash +for d in tests/negative-restricted-egress-managed tests/negative-restricted-egress-missing-hub tests/negative-existing-missing-name tests/negative-managed-with-psc; do + cd modules/gcp/databricks-workspace/$d + terraform init -backend=false 2>/dev/null + if terraform plan -refresh=false 2>&1 | grep -q "Error:"; then + echo "OK: $d still fails" + else + echo "FAIL: $d unexpectedly passed plan" + fi + cd - +done +``` + +All 4 should print "OK: ... still fails". 
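The four existing negatives fail on other rules, so none of them exercises the new region check directly. A fixture along these lines could cover it. This is a sketch under assumptions: the directory name `tests/negative-unsupported-region` is hypothetical, the account ID and project are placeholder values, and provider configuration is omitted (copy it from an existing fixture).

```hcl
# Hypothetical fixture:
# modules/gcp/databricks-workspace/tests/negative-unsupported-region/main.tf
module "workspace" {
  source = "../.."

  prefix                = "test"
  databricks_account_id = "00000000-0000-0000-0000-000000000000" # placeholder
  google_project        = "test-project"                         # placeholder

  # A real GCP region that is not in the 14-region list above.
  google_region = "europe-north1"

  # Requesting PSC is what makes the composer's region precondition apply.
  vpc_source            = "create"
  spoke_vpc_cidr        = "10.0.0.0/16"
  subnet_cidr           = "10.0.0.0/22"
  psc_subnet_cidr       = "10.0.255.0/28"
  private_link_frontend = true
}
```

Planning this fixture would be expected to fail with one of the two `google_region` error messages, depending on whether the variable validation in private-connectivity or the composer precondition is evaluated first.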
+ +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/private-connectivity/variables.tf modules/gcp/databricks-workspace/preconditions.tf +git commit -m "$(cat <<'EOF' +feat(gcp): add region validations on private-connectivity and composer + +private-connectivity now validates google_region against the 14 +regions present in the regional PSC service-attachment maps. The +composer adds the same validation as a precondition, but only when +any private_link_* flag or restricted_egress is true. Bad regions +now fail plan with a clear message instead of a confusing +map-lookup error at apply time. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 6: Standardize provider/versions placement + +**Goal:** Every module uses a single `versions.tf` containing only the terraform block; no provider configuration blocks live inside modules. Every example uses `versions.tf` + `providers.tf` split. + +**Files:** +- Modify: `modules/gcp/service-account/init.tf` (split: remove `provider "google" {}`, move `terraform {}` block to `versions.tf`) +- Create: `modules/gcp/service-account/versions.tf` +- Delete: `modules/gcp/service-account/init.tf` +- Rename: `modules/gcp/unity-catalog/terraform.tf` → `modules/gcp/unity-catalog/versions.tf` (git mv) +- For each of `examples/gcp-basic`, `examples/gcp-byovpc`, `examples/gcp-existing-vpc`, `examples/gcp-sa-provisioning`: split `init.tf` into `versions.tf` (terraform block) + `providers.tf` (provider blocks) +- Rename `examples/gcp-with-psc-exfiltration-protection/terraform.tf` → `examples/gcp-with-psc-exfiltration-protection/versions.tf` + +- [ ] **Step 1: Inspect the current service-account init.tf** + +```bash +cat modules/gcp/service-account/init.tf +``` + +The file should contain a `terraform { required_providers { ... } }` block and a `provider "google" {}` block. + +- [ ] **Step 2: Create `modules/gcp/service-account/versions.tf`** + +Write only the `terraform` block (copy from init.tf). 
For example: + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +(Adjust to match exactly what's in `init.tf` minus the `provider "google" {}` block. If the current `init.tf` lacks `required_version`, add `required_version = ">= 1.5"`.) + +- [ ] **Step 3: Delete `modules/gcp/service-account/init.tf`** + +```bash +git rm modules/gcp/service-account/init.tf +``` + +- [ ] **Step 4: Validate service-account** + +```bash +cd modules/gcp/service-account && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 5: Rename `modules/gcp/unity-catalog/terraform.tf` to `versions.tf`** + +```bash +git mv modules/gcp/unity-catalog/terraform.tf modules/gcp/unity-catalog/versions.tf +``` + +- [ ] **Step 6: Validate unity-catalog** + +```bash +cd modules/gcp/unity-catalog && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 7: Split `examples/gcp-basic/init.tf`** + +Inspect: + +```bash +cat examples/gcp-basic/init.tf +``` + +The file has both `terraform { required_providers { ... } }` and `provider "google" {}` + `provider "databricks" {}` blocks. 
+ +Create `examples/gcp-basic/versions.tf` with just the `terraform` block: + +```hcl +terraform { + required_providers { + databricks = { + source = "databricks/databricks" + } + google = { + source = "hashicorp/google" + } + } +} +``` + +Create `examples/gcp-basic/providers.tf` with the provider blocks: + +```hcl +provider "google" { + project = var.google_project + region = var.google_region + zone = var.google_zone +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + google_service_account = var.databricks_google_service_account + account_id = var.databricks_account_id +} +``` + +Delete `init.tf`: + +```bash +git rm examples/gcp-basic/init.tf +``` + +Validate: + +```bash +cd examples/gcp-basic && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +- [ ] **Step 8: Repeat the split for `examples/gcp-byovpc/`** + +Same operation as Step 7. Inspect first, then split `init.tf` into `versions.tf` + `providers.tf`, then delete `init.tf`. Validate. + +- [ ] **Step 9: Repeat the split for `examples/gcp-existing-vpc/`** + +Same operation. Inspect, split, delete, validate. + +- [ ] **Step 10: Repeat the split for `examples/gcp-sa-provisioning/`** + +Same operation. Note: this example's `init.tf` may not declare the databricks provider (since the example only provisions GCP resources, no Databricks API calls). Inspect carefully and split accordingly. The `provider "google" {}` block that lived inside the module (deleted in Step 3) should be reflected here — make sure the example's `provider "google"` is configured with the user's `google_project`, `google_region`, and `google_zone`. 
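For the `gcp-sa-provisioning` split in Step 10, the result might look like the following if the example turns out to use only the Google provider. This is a sketch modeled on the `gcp-basic` split above, not the confirmed contents of the existing `init.tf`; it assumes `google_project`, `google_region`, and `google_zone` are declared in the example's `variables.tf`.

```hcl
# --- examples/gcp-sa-provisioning/versions.tf (sketch) ---
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
  }
}

# --- examples/gcp-sa-provisioning/providers.tf (sketch) ---
# Replaces the provider "google" {} block that used to live inside the module.
provider "google" {
  project = var.google_project
  region  = var.google_region
  zone    = var.google_zone
}
```

If inspection shows the existing `init.tf` also declares the databricks provider, keep that declaration rather than dropping it.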
+ +- [ ] **Step 11: Rename `examples/gcp-with-psc-exfiltration-protection/terraform.tf` to `versions.tf`** + +```bash +git mv examples/gcp-with-psc-exfiltration-protection/terraform.tf examples/gcp-with-psc-exfiltration-protection/versions.tf +``` + +Validate: + +```bash +cd examples/gcp-with-psc-exfiltration-protection && terraform init -backend=false && terraform validate +``` + +Expected: `Success!`. + +(`providers.tf` already exists in this example; no other changes needed.) + +- [ ] **Step 12: Final validate of all 5 examples** + +```bash +for d in examples/gcp-basic examples/gcp-byovpc examples/gcp-existing-vpc examples/gcp-sa-provisioning examples/gcp-with-psc-exfiltration-protection; do + cd $d && terraform init -backend=false && terraform validate && cd - +done +``` + +All 5: `Success!`. + +- [ ] **Step 13: Commit** + +```bash +git add modules/gcp/service-account/ modules/gcp/unity-catalog/ examples/gcp-basic/ examples/gcp-byovpc/ examples/gcp-existing-vpc/ examples/gcp-sa-provisioning/ examples/gcp-with-psc-exfiltration-protection/ +git commit -m "$(cat <<'EOF' +refactor(gcp): standardize provider/versions placement + +Modules now contain only the terraform { required_providers } block in +a file named versions.tf. No provider configuration blocks inside +modules (the service-account module previously carried provider +"google" {}; moved to its example). + +Examples standardize on versions.tf (terraform block) + providers.tf +(provider blocks): +- gcp-basic, gcp-byovpc, gcp-existing-vpc, gcp-sa-provisioning: init.tf + split into versions.tf + providers.tf +- gcp-with-psc-exfiltration-protection: terraform.tf renamed to + versions.tf (providers.tf already existed) +- unity-catalog module: terraform.tf renamed to versions.tf + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 7: Add `## Usage` sections to module READMEs + +**Goal:** Each module README has a `## Usage` block above the terraform-docs marker showing a minimal call example. 
This mirrors the terraform-google-modules convention. + +**Files:** +- Modify: `modules/gcp/databricks-workspace/README.md` +- Modify: `modules/gcp/network/README.md` +- Modify: `modules/gcp/private-connectivity/README.md` +- Modify: `modules/gcp/account/README.md` +- Modify: `modules/gcp/dns/README.md` +- Modify: `modules/gcp/service-account/README.md` +- Modify: `modules/gcp/unity-catalog/README.md` + +The pattern: take the existing README (a title and description followed by the `<!-- BEGIN_TF_DOCS -->` ... `<!-- END_TF_DOCS -->` block), and insert a `## Usage` section between the description and the BEGIN_TF_DOCS marker. + +- [ ] **Step 1: Update `modules/gcp/databricks-workspace/README.md`** + +Read the current README, then use Edit to insert this block immediately before `<!-- BEGIN_TF_DOCS -->`: + +```markdown +## Usage + +```hcl +module "workspace" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/databricks-workspace" + + prefix = "acme" + databricks_account_id = var.databricks_account_id + google_project = "my-workspace-project" + google_region = "us-central1" + + vpc_source = "databricks_managed" # or "create" / "existing" +} +``` + +See `examples/gcp-basic`, `examples/gcp-byovpc`, `examples/gcp-existing-vpc`, and `examples/gcp-with-psc-exfiltration-protection` for the four supported scenarios. + +``` + +- [ ] **Step 2: Update `modules/gcp/network/README.md`** + +Same pattern. Insert before BEGIN_TF_DOCS: + +```markdown +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer). Direct consumption is supported but unusual; you'll need to wire the outputs yourself. 
+ +```hcl +module "network" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/network" + + prefix = "acme" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "my-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} +``` + +``` + +- [ ] **Step 3: Update `modules/gcp/private-connectivity/README.md`** + +```markdown +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer). Direct consumption is supported but unusual. + +```hcl +module "private_connectivity" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/private-connectivity" + + prefix = "acme" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = module.network.spoke_vpc_id + spoke_vpc_self_link = module.network.spoke_vpc_self_link + spoke_vpc_google_project = "my-spoke-project" + spoke_vpc_cidr = "10.0.0.0/16" + + enable_frontend = true + enable_backend = true + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +``` + +- [ ] **Step 4: Update `modules/gcp/account/README.md`** + +```markdown +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer). Direct consumption is supported but unusual. + +```hcl +module "account" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/account" + + prefix = "acme" + suffix = "abc123" + databricks_account_id = var.databricks_account_id + google_project = "my-workspace-project" + google_region = "us-central1" + vpc_source = "databricks_managed" +} +``` + +``` + +- [ ] **Step 5: Update `modules/gcp/dns/README.md`** + +```markdown +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer) when `restricted_egress=true`. Direct consumption is unusual; this module is terminal (no outputs). 
+ +```hcl +module "dns" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/dns" + + prefix = "acme" + google_region = "us-central1" + + hub_vpc_id = module.network.hub_vpc_id + hub_vpc_self_link = module.network.hub_vpc_self_link + hub_vpc_google_project = "my-hub-project" + + spoke_vpc_id = module.network.spoke_vpc_id + spoke_vpc_self_link = module.network.spoke_vpc_self_link + spoke_vpc_google_project = "my-spoke-project" + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity.frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity.frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity.backend_psc_ip_spoke +} +``` + +``` + +- [ ] **Step 6: Update `modules/gcp/service-account/README.md`** + +```markdown +## Usage + +Run once per GCP project to provision the service account Databricks uses to deploy workspaces. + +```hcl +module "service_account" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/service-account" + + google_project = "my-project" + prefix = "acme" + delegate_from = ["user:alice@example.com"] +} +``` + +The consumer must also configure `provider "google" {}` (project + region/zone) — this module no longer carries its own provider configuration. + +``` + +- [ ] **Step 7: Update `modules/gcp/unity-catalog/README.md`** + +```markdown +## Usage + +Called after `modules/gcp/databricks-workspace` to create a metastore, GCS bucket, storage credential, external location, and default catalog. 
+ +```hcl +module "unity_catalog" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/unity-catalog" + + providers = { + databricks = databricks + databricks.workspace = databricks.workspace + } + + databricks_workspace_id = module.workspace.workspace_id + databricks_workspace_url = module.workspace.workspace_url + google_project = "my-workspace-project" + google_region = "us-central1" + prefix = "acme" + metastore_name = "main-metastore" + catalog_name = "main" +} +``` + +The consumer must declare a `databricks.workspace` provider alias pointing at the workspace URL. + +``` + +- [ ] **Step 8: Validate that README changes didn't break anything** + +The README changes are documentation-only and have no Terraform impact, so skip explicit validation. Confirm git status looks correct: + +```bash +git status --short +``` + +Expected: 7 modified README.md files under modules/gcp/. + +- [ ] **Step 9: Commit** + +```bash +git add modules/gcp/databricks-workspace/README.md modules/gcp/network/README.md modules/gcp/private-connectivity/README.md modules/gcp/account/README.md modules/gcp/dns/README.md modules/gcp/service-account/README.md modules/gcp/unity-catalog/README.md +git commit -m "$(cat <<'EOF' +docs(gcp): add Usage sections to module READMEs + +Each module README now has a Usage block above the terraform-docs +marker showing a minimal calling example. Mirrors the convention +used by terraform-google-modules and the HashiCorp registry. + +Submodule READMEs note "Typically called by modules/gcp/databricks- +workspace (the composer); consume directly only if you have a reason +to". The composer README points to the four scenario examples. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Task 8: Regenerate terraform-docs READMEs + +**Goal:** Refresh the auto-generated `<!-- BEGIN_TF_DOCS -->`...`<!-- END_TF_DOCS -->` blocks across all 7 GCP modules to reflect every change made in Tasks 1-7 (descriptions, renames, new outputs). The additional precondition added in Task 5 is irrelevant to the generated docs. 
+ +**Files:** +- Modify: every `modules/gcp/*/README.md` (terraform-docs regen) + +- [ ] **Step 1: Regenerate docs for each module** + +Run from inside each module directory to avoid touching unrelated module READMEs: + +```bash +for m in databricks-workspace network private-connectivity account dns service-account unity-catalog; do + cd modules/gcp/$m && make docs && cd - +done +``` + +Each invocation should print `README.md updated successfully`. + +- [ ] **Step 2: Verify only the 7 GCP modules' READMEs changed** + +```bash +git status --short +``` + +Expected: 7 modified `modules/gcp/*/README.md` files and nothing else. If any other files appear (especially under `modules/adb-*/` or `modules/aws-*/`), STOP; those should NOT be touched. Reset those specific files with `git checkout -- <path>` before continuing. + +- [ ] **Step 3: Spot-check that the renamed outputs / new outputs / new descriptions all appear in the regenerated tables** + +```bash +grep "frontend_forwarding_rule_name" modules/gcp/private-connectivity/README.md modules/gcp/account/README.md +grep "private_access_settings_id" modules/gcp/databricks-workspace/README.md modules/gcp/account/README.md +grep -c "google_region" modules/gcp/databricks-workspace/README.md +``` + +The first two commands should each print at least one matching line; the count from the third should be at least 2 (the inputs table, plus possibly a mention in the outputs table). + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/databricks-workspace/README.md modules/gcp/network/README.md modules/gcp/private-connectivity/README.md modules/gcp/account/README.md modules/gcp/dns/README.md modules/gcp/service-account/README.md modules/gcp/unity-catalog/README.md +git commit -m "$(cat <<'EOF' +docs(gcp): regenerate terraform-docs READMEs after refactor + +Final regeneration sweep after the best-practices refactor pass. +Reflects: variable descriptions (~70), output renames (*_psc_fr_id -> +*_forwarding_rule_name), expanded composer outputs (14 new), and the +provider/versions reorganization. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +## Self-Review + +(Performed after writing the plan; issues found fixed inline.) + +**1. Spec coverage:** + +| Spec requirement | Task | +|------------------|------| +| Variable descriptions (Goal 1) | Task 4 | +| File organization by concern (Goal 2) | Task 1 | +| `versions.tf` everywhere, no provider configs in modules (Goal 3) | Task 6 | +| Same file shape across examples (Goal 4) | Task 6 | +| Composer outputs cover everything + explicit ternaries (Goal 5) | Task 3 | +| Output/variable name accuracy (Goal 6) | Tasks 2 and 3 | +| `google_region` validation (Goal 7) | Task 5 | +| `## Usage` blocks in every module README (Goal 8) | Task 7 | +| terraform-docs regen | Task 8 | + +Coverage is complete. + +**2. Placeholder scan:** Searched for "TBD", "TODO", "implement later", "appropriate", "as needed". None found. + +**3. Type consistency:** +- `frontend_forwarding_rule_name` / `backend_forwarding_rule_name` / `hub_frontend_forwarding_rule_name` used consistently across Task 2 (renames), Task 4 (descriptions referencing the new names), and Task 7 (Usage block doesn't reference these directly). +- `private_access_settings_id` introduced as account output in Task 3 Step 1, then consumed by the composer output in Task 3 Step 3. +- `local.databricks_managed`, `local.create_vpc`, `local.use_existing_vpc`, `local.any_private_link`, `local.spoke_project` defined in Task 1 Step 18 (composer locals.tf), referenced in Task 3 Step 3 (outputs) and Task 5 Step 3 (precondition). Matches. +- The 14-region list appears in two places (Task 5 Step 1 in private-connectivity, Task 5 Step 3 in composer). Lists are identical. + +**4. Spec requirements with no task:** None. + +--- + +## Execution Handoff + +Plan complete and saved to `docs/superpowers/plans/2026-05-26-gcp-best-practices-refactor.md`. + +Two execution options: + +**1. 
Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration. + +**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints. + +Which approach? diff --git a/docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md b/docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md new file mode 100644 index 00000000..bf70ad3a --- /dev/null +++ b/docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md @@ -0,0 +1,355 @@ +# GCP Modules Refactor — Design Spec + +**Date:** 2026-05-14 +**Author:** Michele Daddetta +**Status:** Approved (pending implementation plan) + +## Problem + +Today the repo ships six GCP modules and six GCP example directories. Each example wraps its own dedicated module: + +- `examples/gcp-basic` → `modules/gcp-workspace-basic` +- `examples/gcp-byovpc` → `modules/gcp-workspace-byovpc` +- `examples/gcp-with-psc-exfiltration-protection` → `modules/gcp-with-psc-exfiltration-protection` +- `examples/gcp-sa-provisioning` → `modules/gcp-sa-provisioning` +- `examples/gcp-test-modules` (orphan, contains only state files) +- `examples/gcp-sa-provisionning` (typo dir, contains only a Makefile) + +The three workspace modules duplicate `databricks_mws_workspaces`, `databricks_mws_networks`, `google_compute_network` + subnet + router + NAT, and `random_string.suffix`. A change to any shared piece (e.g. a new GCP region added to the regional PSC service-attachment map, a workspace argument added by the Databricks provider) needs to land in 2–3 places. + +The user's northstar: an example provides only the **basic information about the desired scenario** — does a VPC already exist, what's its name, is the workspace using frontend PrivateLink, is private access enforced — and the module figures out the rest. + +There is no "existing VPC" example today; we add one as part of this refactor. + +## Goals + +1. 
Eliminate cross-module duplication for GCP workspace deployment. +2. Single top-level composer that takes scenario inputs and conditionally instantiates submodules. +3. Submodules are organized by concern (network, private connectivity, Databricks account resources) and consumed only by the composer. +4. Each example becomes a thin caller — main.tf is ~20 lines and varies only the inputs that matter for that scenario. +5. Variable names describe what they do, not what they protect against. No marketing language. +6. New scenario "existing VPC" is supported on day one. +7. Old modules and examples remain functional during migration; we ship the new modules alongside and migrate one example per PR. + +## Non-Goals + +- No Unity Catalog redesign. UC remains a separate module that the example wires up directly. The user has separate work in flight for this area. +- No service-account-provisioning redesign. SA-provisioning is a one-time bootstrap with a different lifecycle than the workspace; it remains a separate module called by its own example. +- No state migration tooling for existing applies of old examples. Users re-apply on clean state. +- No new CI test harness (terratest, GitHub Actions matrix). We rely on the existing `pre-commit` config plus per-PR manual sandbox apply. +- No changes to AWS or Azure modules. 
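To make Goals 4 and 6 concrete, a thin caller for the "existing VPC" scenario might look roughly like this. It is a sketch only: the input names `vpc_source`, `existing_vpc_name`, and `existing_subnet_name` follow the accompanying implementation plan rather than anything fixed in this spec, and the example-level variables are assumed to be declared in the example's own `variables.tf`.

```hcl
# Hypothetical examples/gcp-existing-vpc/main.tf
module "workspace" {
  source = "../../modules/gcp/databricks-workspace"

  prefix                = var.prefix
  databricks_account_id = var.databricks_account_id
  google_project        = var.google_project
  google_region         = var.google_region

  # Scenario inputs: the VPC already exists, so only its names are supplied.
  vpc_source           = "existing"
  existing_vpc_name    = var.existing_vpc_name
  existing_subnet_name = var.existing_subnet_name
}

output "workspace_url" {
  value = module.workspace.workspace_url
}
```

The other scenarios would differ only in the scenario inputs, which is what keeps each example's `main.tf` short.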
+ +## Architecture + +### Module layout + +``` +modules/gcp/ +├── databricks-workspace/ # top-level composer; the one examples call +├── network/ # all google_compute_network/subnet/router/nat/peering +│ # for both hub & spoke; shared-VPC host/service binding +├── private-connectivity/ # GCP-side: PSC subnet + addresses + forwarding rules +│ # + egress firewall rules (deny-egress, google-apis, ctl-plane, hive) +├── account/ # ALL databricks_mws_* resources: +│ # mws_networks + mws_workspaces + mws_vpc_endpoint +│ # + mws_private_access_settings +├── dns/ # private DNS zones + records (hub + spoke) +│ # split from private-connectivity because DNS needs workspace_url +│ # which is only available after account creates the workspace +├── service-account/ # relocated from modules/gcp-sa-provisioning (git mv) +└── unity-catalog/ # relocated from modules/gcp-unity-catalog (git mv) +``` + +The five-submodule split (rather than the three-submodule grouping originally discussed) is required to keep the dependency graph acyclic. `account` cannot live before `private-connectivity` because `databricks_mws_vpc_endpoint` references the PSC forwarding rules created in GCP. `dns` cannot live before `account` because DNS records embed the `workspace_dns_id` regex-extracted from `databricks_mws_workspaces.workspace_url`. Keeping all `databricks_mws_*` resources together in `account` (the user's chosen concern-based grouping) requires DNS to be its own submodule. + +### Data flow + +``` +example + └── modules/gcp/databricks-workspace (composer) + ├── modules/gcp/network (count = vpc_source != "databricks_managed" ? 1 : 0) + │ outputs: spoke_vpc_*, spoke_subnet_*, hub_vpc_* (nullable), nat_id (nullable) + │ + ├── modules/gcp/private-connectivity (count = any private_link_* flag is true ? 
1 : 0) + │ consumes: network outputs + │ outputs: frontend_psc_fr_id, backend_psc_fr_id, hub_frontend_psc_fr_id (nullable), + │ frontend_psc_ip_spoke, backend_psc_ip_spoke, frontend_psc_ip_hub (nullable), + │ psc_subnet_self_link + │ + ├── modules/gcp/account (always) + │ consumes: network outputs + private-connectivity outputs + │ outputs: workspace_id, workspace_url, network_id (nullable), + │ frontend_endpoint_id, backend_endpoint_id, transit_endpoint_id (nullable) + │ + └── modules/gcp/dns (count = restricted_egress ? 1 : 0) + consumes: network outputs + private-connectivity PSC IPs + account.workspace_url + outputs: none + +example optionally also calls: + └── modules/gcp/unity-catalog (wired with workspace_id + workspace_url) +``` + +The composer declares `random_string.suffix` once and passes it to each submodule, eliminating the per-module duplication that exists today. + +The dependency graph is linear: `network → private-connectivity → account → dns`. No back-references between modules. `databricks_mws_vpc_endpoint` is created inside `account` (rather than `private-connectivity`) so that `account` owns every `databricks_mws_*` resource and so that the cycle "`account` needs endpoint IDs / DNS needs workspace_url" is decomposed into a linear chain. + +## Composer API + +```hcl +# === Identity ============================================================= +prefix : string required +databricks_account_id : string required +google_project : string required # workspace google project +google_region : string required +workspace_name : string default = null # default "${prefix}-ws-${suffix}" +tags : map default = {} + +# === Where does the VPC come from? 
======================================= +vpc_source : string default = "databricks_managed" + # one of: "databricks_managed", "create", "existing" + +# Used when vpc_source = "create" +spoke_vpc_cidr : string default = null +subnet_cidr : string default = null +pod_cidr : string default = null # GKE secondary range +svc_cidr : string default = null # GKE secondary range + +# Used when vpc_source = "existing" +existing_vpc_name : string default = null +existing_subnet_name : string default = null + +# === Connectivity (orthogonal flags, each defaults false) ================ +private_link_frontend : bool default = false # frontend PSC endpoint + frontend mws_vpc_endpoint +private_link_backend : bool default = false # SCC PSC endpoint + backend mws_vpc_endpoint +private_access_only : bool default = false # mws_private_access_settings; public_access_enabled = false +restricted_egress : bool default = false # hub VPC + deny-egress firewall + private DNS + +# === Required when restricted_egress = true ============================== +hub_vpc_google_project : string default = null +spoke_vpc_google_project : string default = null # falls back to google_project +is_spoke_vpc_shared : bool default = false +hub_vpc_cidr : string default = null +psc_subnet_cidr : string default = null +hive_metastore_ip : string default = null # else looked up via internal regional map +``` + +### Composer outputs + +```hcl +workspace_id = module.account.workspace_id +workspace_url = module.account.workspace_url +network_id = module.account.network_id # null when vpc_source = "databricks_managed" +vpc_id = try(module.network[0].spoke_vpc_id, null) +spoke_vpc_id = try(module.network[0].spoke_vpc_id, null) +hub_vpc_id = try(module.network[0].hub_vpc_id, null) +suffix = random_string.suffix.result # useful for downstream modules (UC, etc.) 
+``` + +### Cross-variable validation (preconditions in composer's `main.tf`) + +| Rule | Reason | +|------|--------| +| `restricted_egress = true` ⇒ `vpc_source = "create"` | Hub-spoke + egress firewall + private DNS require the module to own both VPCs | +| `restricted_egress = true` ⇒ `private_link_frontend OR private_link_backend = true` | Egress-restricted workspace without PSC is unreachable | +| `restricted_egress = true` ⇒ `hub_vpc_google_project`, `hub_vpc_cidr`, `psc_subnet_cidr` set | Hub topology needs these | +| `vpc_source = "create"` ⇒ `spoke_vpc_cidr`, `subnet_cidr` set | Need CIDRs | +| `vpc_source = "existing"` ⇒ `existing_vpc_name`, `existing_subnet_name` set | Need names to look up | +| `vpc_source = "databricks_managed"` ⇒ `private_link_frontend`, `private_link_backend`, `restricted_egress` all false | Cannot attach PSC or firewalls to a VPC we don't own | + +## Submodule contracts + +### `modules/gcp/network` + +**Inputs:** `prefix`, `suffix`, `google_region`, `vpc_source`, `spoke_vpc_google_project`, `spoke_vpc_cidr`, `subnet_cidr`, `subnet_name`, `pod_cidr`, `svc_cidr`, `existing_vpc_name`, `existing_subnet_name`, `create_hub` (bool — composer passes `restricted_egress`), `hub_vpc_google_project`, `hub_vpc_cidr`, `is_spoke_vpc_shared`, workspace project. + +**Behavior:** Spoke VPC + subnet + router + NAT (when `vpc_source = "create"`) or `data` lookups (when `"existing"`). Optional hub VPC + subnet + bidirectional peering + optional shared-VPC host/service binding (when `create_hub`). + +**Outputs:** `spoke_vpc_id`, `spoke_vpc_name`, `spoke_vpc_self_link`, `spoke_subnet_id`, `spoke_subnet_name`, `spoke_subnet_self_link`, `hub_vpc_id` (nullable), `hub_vpc_name` (nullable), `hub_vpc_self_link` (nullable), `hub_subnet_name` (nullable), `nat_id` (nullable). 
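The create-or-lookup behavior described for `network` can be sketched roughly as below. This is an illustration under assumptions, not the module's actual code: the local names, the `existing` data-source label, and the VPC naming pattern are hypothetical; only the input names are taken from the contract above.

```hcl
# Inputs named in the network contract; inside this module vpc_source is
# either "create" or "existing" (the composer skips the module otherwise).
variable "vpc_source" { type = string }
variable "prefix" { type = string }
variable "suffix" { type = string }
variable "spoke_vpc_google_project" { type = string }
variable "existing_vpc_name" {
  type    = string
  default = null
}

locals {
  # Hypothetical helper locals for readability.
  create_vpc       = var.vpc_source == "create"
  use_existing_vpc = var.vpc_source == "existing"
}

# Created only when the module owns the VPC.
resource "google_compute_network" "spoke_vpc" {
  count                   = local.create_vpc ? 1 : 0
  project                 = var.spoke_vpc_google_project
  name                    = "${var.prefix}-spoke-vpc-${var.suffix}" # assumed naming pattern
  auto_create_subnetworks = false
}

# Looked up only when the caller points at a pre-existing VPC.
data "google_compute_network" "existing" {
  count   = local.use_existing_vpc ? 1 : 0
  project = var.spoke_vpc_google_project
  name    = var.existing_vpc_name
}

# Both branches feed the same output, so downstream modules do not need to
# know which path produced the VPC.
output "spoke_vpc_id" {
  value = local.create_vpc ? google_compute_network.spoke_vpc[0].id : data.google_compute_network.existing[0].id
}
```

Exposing a single output for both paths is what would let `private-connectivity` and `account` consume the network without branching on `vpc_source` themselves.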
+ +### `modules/gcp/private-connectivity` + +**Inputs:** `prefix`, `suffix`, `google_region`, spoke VPC refs + project, hub VPC refs + project (nullable), `enable_frontend`, `enable_backend`, `restrict_egress`, `psc_subnet_cidr`, spoke CIDR (for firewall source ranges), hub CIDR (for hub ingress firewall), `hive_metastore_ip` (nullable; falls back to regional map keyed by `google_region`). + +**Behavior, file-organized:** +- `psc.tf`: PSC subnet (in spoke); frontend address + forwarding rule when `enable_frontend`; backend address + forwarding rule when `enable_backend`; hub-side frontend address + forwarding rule when hub exists AND `enable_frontend`. Owns the regional PSC service-attachment maps (`google_frontend_psc_targets` and `google_backend_psc_targets`). +- `firewall.tf`: when `restrict_egress`, creates spoke deny-egress (priority 1100) + allow-google-apis + allow-databricks-control-plane (targeting PSC IPs) + allow-managed-hive (using regional `hive_metastore_ip`); hub ingress from spoke CIDR. + +`databricks_mws_vpc_endpoint` resources are NOT created here — they live in `account` so that all `databricks_mws_*` resources are colocated and so the dependency graph stays linear. + +**Outputs:** `psc_subnet_self_link`, `frontend_psc_fr_id` (forwarding-rule name; nullable), `backend_psc_fr_id` (nullable), `hub_frontend_psc_fr_id` (nullable), `frontend_psc_ip_spoke`, `backend_psc_ip_spoke`, `frontend_psc_ip_hub` (nullable). + +### `modules/gcp/account` + +**Inputs:** `prefix`, `suffix`, `workspace_name`, `databricks_account_id`, `google_project`, `google_region`, `vpc_source`, spoke VPC name, spoke subnet name, spoke project, hub project (nullable), `frontend_psc_fr_id` (nullable), `backend_psc_fr_id` (nullable), `hub_frontend_psc_fr_id` (nullable), `enable_frontend`, `enable_backend`, `private_access_only`, `nat_dependency` (passes through `module.network[0].nat_id`). 
+ +**Behavior:** +- `databricks_mws_vpc_endpoint` resources (frontend, backend, hub-transit) emitted with `count = 1` gated by the corresponding `enable_*` and forwarding-rule-id inputs. Each references the GCP forwarding rule by name and project. +- `databricks_mws_networks` emitted when `vpc_source != "databricks_managed"`. The `vpc_endpoints` block is populated only when both frontend and backend endpoints exist. +- `databricks_mws_workspaces` always emitted. Single resource with conditional attributes: + - `network_id` = `databricks_mws_networks.this.network_id` when `vpc_source != "databricks_managed"`, else null + - `private_access_settings_id` = `databricks_mws_private_access_settings.this.id` when `private_access_only`, else null + - `depends_on = [nat_dependency]` to make sure NAT is ready before workspace creation +- `databricks_mws_private_access_settings` emitted with `count = 1` when `private_access_only`; sets `public_access_enabled = false` and `private_access_level = "ACCOUNT"`. + +**Outputs:** `workspace_id`, `workspace_url`, `network_id` (nullable), `frontend_endpoint_id` (nullable), `backend_endpoint_id` (nullable), `transit_endpoint_id` (nullable). + +### `modules/gcp/dns` + +**Inputs:** `prefix`, `google_region`, hub VPC refs + project, spoke VPC refs + project, `workspace_url` (from `module.account`), `frontend_psc_ip_spoke`, `frontend_psc_ip_hub` (nullable), `backend_psc_ip_spoke`. + +**Behavior:** +- Hub-side: `gcp.databricks.com` zone with `workspace`, `psc-auth`, `dp` records; `gcr.io` zone (wildcard CNAME + A); `googleapis.com` zone (wildcard CNAME to `restricted.googleapis.com` + A); `pkg.dev` zone (wildcard CNAME + A). +- Spoke-side: `gcp.databricks.com` zone with `workspace`, `dp`, `tunnel` records. +- `workspace_dns_id` is the regex-extracted ID from `workspace_url` (matches today's behavior in `gcp-with-psc-exfiltration-protection`). + +**Outputs:** none. 
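The `workspace_dns_id` extraction mentioned in the `dns` contract could look roughly like the sketch below. The regex pattern and the assumed URL shape are illustrative guesses, not copied from `gcp-with-psc-exfiltration-protection`; check them against the existing module before relying on them.

```hcl
variable "workspace_url" {
  type        = string
  description = "Workspace URL passed in from module.account"
}

locals {
  # Assumption: the URL looks like https://<numeric-id>.<digit>.gcp.databricks.com.
  # With no capture groups, regex() returns the first matching substring,
  # here the "<numeric-id>.<digit>" part, and raises an error when nothing matches.
  workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url)
}
```

Because `workspace_url` is only known after `account` creates the workspace, any record built from `local.workspace_dns_id` inherits that ordering, which is the reason the spec gives for splitting `dns` out of `private-connectivity`.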
+ +### `modules/gcp/service-account` and `modules/gcp/unity-catalog` + +Relocated from `modules/gcp-sa-provisioning` and `modules/gcp-unity-catalog` via `git mv`. Variables, outputs, and resource addresses unchanged. Old paths get a deprecation README pointing to the new location. + +## Example shapes + +Each example dir contains: `init.tf` (providers), `main.tf` (single `module "workspace"` call, optionally plus `module "unity_catalog"`), `variables.tf` (only the variables relevant to that scenario), `terraform.tfvars` (skeleton with empty values + comments), `outputs.tf` (re-exports `workspace_id`/`workspace_url`), `README.md`, `Makefile`. + +### `examples/gcp-basic` — Databricks-managed VPC + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "databricks_managed" +} +``` + +### `examples/gcp-byovpc` — Terraform creates the VPC + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr +} +``` + +### `examples/gcp-existing-vpc` — NEW, fulfills the northstar + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "existing" + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name +} +``` + +### `examples/gcp-with-psc-exfiltration-protection` — PSC + restricted egress + +Name kept for backward-compatibility with external links. 
+ +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = var.spoke_vpc_google_project + hub_vpc_google_project = var.hub_vpc_google_project + is_spoke_vpc_shared = var.is_spoke_vpc_shared + hub_vpc_cidr = var.hub_vpc_cidr + psc_subnet_cidr = var.psc_subnet_cidr +} +``` + +Plus an optional `module "unity_catalog"` block using `module.workspace.workspace_id` and `module.workspace.workspace_url`. + +### `examples/gcp-sa-provisioning` + +Points at the relocated `modules/gcp/service-account`. Variables and outputs identical to today. + +## Migration plan + +Build the new modules alongside the old ones; migrate examples one PR at a time. + +| PR | Scope | Risk | +|----|-------|------| +| 1 | Add all new modules under `modules/gcp/`. Relocate `service-account` and `unity-catalog` via `git mv` with deprecation stubs at old paths. No example touched. | Low — no example references new code yet | +| 2 | Migrate `examples/gcp-basic` to the new composer. Old `modules/gcp-workspace-basic` stays. | Low — basic case, no PSC/DNS to coordinate | +| 3 | Migrate `examples/gcp-byovpc`. | Low | +| 4 | Migrate `examples/gcp-with-psc-exfiltration-protection`. Sandbox apply + reachability check required. | Medium — PSC + DNS + firewall coordination | +| 5 | Add new `examples/gcp-existing-vpc`. | Low — net-new | +| 6 | Delete `modules/gcp-workspace-basic`, `modules/gcp-workspace-byovpc`, `modules/gcp-with-psc-exfiltration-protection`. Delete deprecation stubs. Delete `examples/gcp-sa-provisionning` (typo dir) and `examples/gcp-test-modules` (state-only). 
Clean stray `terraform.tfstate*` files from `examples/gcp-*` (verify `.gitignore` first). Update top-level README. | Low | + +Each PR is drafted, sandbox-applied by the author, then sent for review. No state migration support — applies of old examples don't transition to the new examples; users re-apply on clean state. Example READMEs document this in PRs 2–4. + +## Testing approach + +Scoped to what the repo already supports. + +**Static (pre-commit, every PR):** +- `terraform fmt -recursive` +- `terraform validate` per module and per example +- `terraform-docs` regeneration check + +**Module-level plan smoke (PR 1):** +For each new submodule, a `tests/` subdir with minimal-fixture `terraform plan` invocations using mock vars (e.g. `databricks_account_id = "00000000-0000-0000-0000-000000000000"`). Run with `terraform init -backend=false && terraform validate && terraform plan -refresh=false`. Wrapped in a Makefile target. Catches missing required inputs and broken preconditions before any sandbox apply. + +Negative cases that must fail at plan time (one fixture each): +- `restricted_egress = true` + `vpc_source = "databricks_managed"` +- `restricted_egress = true` + `hub_vpc_cidr = null` +- `vpc_source = "existing"` + `existing_vpc_name = null` +- `private_link_frontend = true` + `vpc_source = "databricks_managed"` + +**Example-level apply (manual, before each migration PR merges):** +- Apply against sandbox GCP project + Databricks account +- Verify workspace reachable; UC accessible where applicable +- Fresh `terraform plan` against applied state — expect zero drift +- `terraform destroy` and confirm clean teardown (PSC + DNS ordering) +- Capture plan/apply output in the PR description + +**What we don't test:** terratest, GitHub Actions matrix, automated cost guards, upgrade-from-old-state. Out of scope. 
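One of the negative fixtures listed above might be laid out as in this sketch. The fixture directory, the relative `source` path, and the placeholder project name are assumptions; the variable names and the mock account ID come from the spec.

```hcl
# Hypothetical location:
# modules/gcp/databricks-workspace/tests/negative-managed-vpc-restricted-egress/main.tf
module "workspace" {
  source = "../.." # assumed path from the fixture dir back to the composer

  prefix                = "plan-smoke"
  databricks_account_id = "00000000-0000-0000-0000-000000000000" # mock value
  google_project        = "mock-project"                         # placeholder, never applied
  google_region         = "us-central1"

  # Deliberately invalid: restricted egress requires a VPC the module owns,
  # so the composer precondition is expected to reject this combination.
  vpc_source        = "databricks_managed"
  restricted_egress = true
}
```

Running `terraform init -backend=false && terraform plan -refresh=false` in that directory should then exit non-zero with the precondition's error message; the Makefile target can treat a successful plan as a test failure.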
+ +## Risks & mitigations + +| Risk | Mitigation | +|------|-----------| +| Regional PSC service-attachment map drift between old and new modules during transition | Both reference the same Databricks-published list; copy verbatim to new module, delete old in PR 6 | +| Cross-variable `precondition` failures only surface at plan time, not at `validate` | Module-level plan-smoke fixtures in `tests/` exercise every precondition | +| `databricks_mws_workspaces` resource address changes (module path differs) | Acknowledged: examples are throwaway, customer state is unaffected. Documented in migration PRs | +| Empty `modules/gcp/network/` dir already exists | Becomes the home for the new `network` submodule — no conflict | +| User's separate UC work conflicts with the relocation `git mv` | Relocate but do not modify UC contents in PR 1; user's UC work can land before or after relocation as desired | +| PSC + DNS teardown ordering issues during `terraform destroy` | Add explicit `depends_on` between DNS records and the PSC forwarding rules they reference; verify during PR 4 sandbox test | + +## Open questions + +None at this time. All design choices ratified during the brainstorming session on 2026-05-14. diff --git a/docs/superpowers/specs/2026-05-26-gcp-best-practices-refactor-design.md b/docs/superpowers/specs/2026-05-26-gcp-best-practices-refactor-design.md new file mode 100644 index 00000000..83135525 --- /dev/null +++ b/docs/superpowers/specs/2026-05-26-gcp-best-practices-refactor-design.md @@ -0,0 +1,267 @@ +# GCP Modules Best-Practices Refactor — Design Spec + +**Date:** 2026-05-26 +**Author:** Michele Daddetta +**Status:** Approved (pending implementation plan) +**Branch:** `feature/gcp-modules-refactor` (continues on the existing draft PR #233) + +## Problem + +The GCP composer + submodules landed in PR #233 work correctly but their internal organization and metadata are uneven. 
Concrete gaps surfaced during audit: + +- **Variable description coverage is dramatically uneven across modules:** `network` has 17/17 documented, but `private-connectivity` has 2/17, `account` has 1/18, `dns` has 0/12, `databricks-workspace` (composer) has 1/23. Roughly 70 variables ship without descriptions, which means the auto-generated terraform-docs READMEs are unusable as reference material and IDE hovers show nothing. +- **File organization is inconsistent.** `network/main.tf` is 131 lines mixing five concerns (spoke VPC, hub VPC, peering, shared-VPC, data sources). `account/main.tf` mixes locals + workspace + networks resources. `databricks-workspace/main.tf` (the composer) mixes locals + suffix + preconditions + module wirings (149 lines). +- **Provider/versions placement is inconsistent.** Most modules use `versions.tf`, but `service-account` uses `init.tf` (which also illegally contains `provider "google" {}` — modules must not configure providers), and `unity-catalog` uses `terraform.tf`. Examples are equally inconsistent: `gcp-basic` and three other examples use a combined `init.tf`; the PSC example splits across `terraform.tf` + `providers.tf`. +- **Output completeness gaps on the composer.** Missing: `subnet_id`/`self_link`, hub VPC `self_link`, NAT id, `private_access_settings_id`, the three `mws_vpc_endpoint` IDs, and the three PSC IPs. Has redundancy: `vpc_id` is identical to `spoke_vpc_id`. Uses `try(module.network[0].*, null)` which works but hides intent compared to an explicit `local.databricks_managed ? null : module.network[0].*` ternary. +- **Output naming misnomers.** Outputs named `*_psc_fr_id` return GCP forwarding-rule **names**, not IDs (the underlying attribute is `google_compute_forwarding_rule.x.name`). Same misnomer in the `account` module's input variables. +- **Module READMEs are placeholder-plus-terraform-docs only.** No `## Usage` HCL block. 
terraform-google-modules and HashiCorp registry convention is to include a minimal calling example above the auto-generated section. +- **No regional validation.** The `private-connectivity` module hardcodes regional PSC service-attachment maps for 14 regions but accepts any string for `google_region` — invalid regions fail with a confusing map-lookup error instead of a clear validation message. + +## Goals + +1. Every variable in every module has a `description`. +2. Each module's `.tf` files have one clear concern each; no file mixes three or more responsibilities. +3. Every module declares only `terraform { required_providers + required_version }` in a file named `versions.tf`. No provider configuration blocks inside modules. +4. Every example has the same file shape: `versions.tf` (required_providers) + `providers.tf` (provider blocks) + `main.tf` + `variables.tf` + `outputs.tf` + `terraform.tfvars` + `README.md` + `Makefile`. +5. Composer outputs cover everything a downstream consumer might want, with explicit conditional ternaries instead of `try()`. +6. Output and variable names are accurate (no `*_fr_id` for things that are names; no aliases like `vpc_id` shadowing `spoke_vpc_id`). +7. `google_region` is validated against the supported-region list at the boundary where the regional maps actually live. +8. Each module README has a `## Usage` block before the terraform-docs section. + +## Non-Goals + +- No behavioral changes to the composer's runtime logic (preconditions stay; module wiring stays; resource attributes stay). +- No new submodules; no rename of any module directory. +- No CIDR or IP-format validation (GCP API rejects bad CIDRs with clear errors; relying on that is fine). +- No `tags` propagation across resources. The unused `tags` variable on the composer is documented as reserved-for-future and otherwise left alone (the user explicitly opted out of removing or propagating it). +- No new tests beyond keeping the existing fixtures passing. 
+- No changes to AWS or Azure modules/examples. + +## Scope of changes + +### Modules under `modules/gcp/` + +**`network/`** (4 → 10 .tf files): + +| File | Contents | +|------|----------| +| `vpc.tf` | `google_compute_network.spoke_vpc`, `google_compute_network.hub_vpc` | +| `subnets.tf` | `google_compute_subnetwork.spoke_subnet`, `google_compute_subnetwork.hub_subnet` | +| `nat.tf` | `google_compute_router.router`, `google_compute_router_nat.nat` | +| `peering.tf` | both `google_compute_network_peering` resources | +| `shared-vpc.tf` | `google_compute_shared_vpc_host_project`, `google_compute_shared_vpc_service_project` | +| `data.tf` | data sources for existing-vpc lookups | +| `locals.tf` | `create_vpc`, `use_existing_vpc`, `subnet_name` (moved from `main.tf`) | +| `variables.tf` | unchanged (all 17 already documented) | +| `outputs.tf` | unchanged | +| `versions.tf` | unchanged | + +`main.tf` is removed (its contents redistribute). + +**`private-connectivity/`** (file layout unchanged; the `psc.tf` + `firewall.tf` + `locals.tf` split is already good): + +- `variables.tf`: add `description` to the 15 undocumented variables +- `outputs.tf`: rename `frontend_psc_fr_id` → `frontend_forwarding_rule_name`, `backend_psc_fr_id` → `backend_forwarding_rule_name`, `hub_frontend_psc_fr_id` → `hub_frontend_forwarding_rule_name` +- `variables.tf`: add `validation { contains(<14 regions>, var.google_region) }` on `google_region` +- `README.md`: add `## Usage` block + +**`account/`** (6 → 8 .tf files): + +| File | Contents | +|------|----------| +| `workspace.tf` | `databricks_mws_workspaces.this` (moved from `main.tf`) | +| `networks.tf` | `databricks_mws_networks.this` (moved from `main.tf`) | +| `vpc-endpoints.tf` | unchanged | +| `pas.tf` | unchanged (kept short — 9 lines, single resource, doesn't justify merging) | +| `locals.tf` | the 4 locals from `main.tf` | +| `variables.tf` | add descriptions to 17 variables; rename inputs 
`frontend_psc_fr_id`/`backend_psc_fr_id`/`hub_frontend_psc_fr_id` → `frontend_forwarding_rule_name` etc. | +| `outputs.tf` | add `private_access_settings_id` output | +| `versions.tf` | unchanged | + +`main.tf` is removed. + +**`dns/`** (file layout unchanged; `hub.tf` + `spoke.tf` is already good): + +- `locals.tf` (NEW): extract `workspace_dns_id = regex(...)` from `hub.tf` +- `variables.tf`: add descriptions to all 12 variables +- `README.md`: add `## Usage` block + +**`databricks-workspace/` (composer)** (4 → 7 .tf files): + +| File | Contents | +|------|----------| +| `main.tf` | only the 4 module blocks (network, private-connectivity, account, dns) | +| `locals.tf` | `databricks_managed`, `create_vpc`, `any_private_link`, `spoke_project` | +| `preconditions.tf` | `null_resource.preconditions` with 6 existing rules + 1 new region-validation rule | +| `random.tf` | `random_string.suffix` | +| `variables.tf` | add descriptions to 22 undocumented variables; `tags` documented as reserved | +| `outputs.tf` | full rewrite (see below) | +| `versions.tf` | unchanged | + +**`service-account/`**: + +- `init.tf` is split: `terraform {}` block moves to `versions.tf`; `provider "google" {}` is removed (modules must not configure providers; the consumer already configures it) +- `variables.tf`: add descriptions if any are missing +- `README.md`: add `## Usage` block + +**`unity-catalog/`**: + +- `terraform.tf` is renamed `versions.tf` +- `variables.tf`: add descriptions if any are missing (most already have them) +- `README.md`: add `## Usage` block (replaces the existing placeholder) + +### Examples under `examples/` + +Every example normalizes to: + +``` +versions.tf # terraform { required_providers + required_version } +providers.tf # provider "google" {} + provider "databricks" {} (+ workspace alias where used) +main.tf # module "workspace" {} +variables.tf +outputs.tf +terraform.tfvars +README.md +Makefile +``` + +Per-example changes: +- `gcp-basic`: split `init.tf` → 
`versions.tf` + `providers.tf` +- `gcp-byovpc`: split `init.tf` → `versions.tf` + `providers.tf` +- `gcp-existing-vpc`: split `init.tf` → `versions.tf` + `providers.tf` +- `gcp-with-psc-exfiltration-protection`: rename `terraform.tf` → `versions.tf` (`providers.tf` already exists) +- `gcp-sa-provisioning`: split `init.tf` → `versions.tf` + `providers.tf`. The `provider "google" {}` block that the module currently carries is moved here (it's the natural consumer location). + +## Output changes (composer) + +### Removed + +- `vpc_id` (alias of `spoke_vpc_id`; pure redundancy) + +### Renamed (in submodules; composer doesn't expose these forwarding-rule outputs) + +- `private-connectivity`: `frontend_psc_fr_id` → `frontend_forwarding_rule_name`, `backend_psc_fr_id` → `backend_forwarding_rule_name`, `hub_frontend_psc_fr_id` → `hub_frontend_forwarding_rule_name` + +### Added (composer outputs) + +- `private_access_settings_id` +- `frontend_endpoint_id`, `backend_endpoint_id`, `transit_endpoint_id` (the three `mws_vpc_endpoint` IDs) +- `spoke_vpc_self_link`, `spoke_subnet_id`, `spoke_subnet_self_link` +- `hub_vpc_self_link` +- `nat_id` +- `frontend_psc_ip_spoke`, `backend_psc_ip_spoke`, `frontend_psc_ip_hub` +- `google_region` (echo, useful for downstream wiring) + +### Restructured (explicit ternaries instead of `try()`) + +Old: +```hcl +output "spoke_vpc_id" { value = try(module.network[0].spoke_vpc_id, null) } +``` + +New: +```hcl +output "spoke_vpc_id" { + value = local.databricks_managed ? null : module.network[0].spoke_vpc_id + description = "Spoke VPC ID (null when vpc_source=databricks_managed)" +} +``` + +Same behavior; intent is now visible. + +## Validation rules added + +Two new rules, one in `private-connectivity` (variable-level) and one in the composer (cross-variable, via `preconditions.tf`): + +1. 
**`private-connectivity/variables.tf`**: +```hcl +variable "google_region" { + type = string + description = "GCP region for PSC and firewall resources" + validation { + condition = contains([ + "asia-northeast1", "asia-south1", "asia-southeast1", "australia-southeast1", + "europe-west1", "europe-west2", "europe-west3", "northamerica-northeast1", + "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-west1", "us-west4" + ], var.google_region) + error_message = "google_region must be one of the regions in the regional PSC service-attachment maps. See locals.tf." + } +} +``` + +2. **`databricks-workspace/preconditions.tf`** (new rule alongside the 6 existing): +```hcl +precondition { + condition = ( + !local.any_private_link && !var.restricted_egress + ) || contains([ + "asia-northeast1", "asia-south1", "asia-southeast1", "australia-southeast1", + "europe-west1", "europe-west2", "europe-west3", "northamerica-northeast1", + "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-west1", "us-west4" + ], var.google_region) + error_message = "google_region must be a region supported by Databricks PSC when any private_link_* flag or restricted_egress is true." +} +``` + +The 14-region list is duplicated in two places. That's acceptable: they validate different layers (one always, one only when PSC is requested), and the list changes only when Databricks adds a new PSC region (rare). + +## README usage examples + +Each module README receives a `## Usage` block before the `` marker. Template: + +```markdown +# modules/gcp/ + + + +## Usage + +```hcl +module "" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/" + + # minimal required inputs +} +``` + + + + +... (auto-generated by terraform-docs) + +``` + +For submodules (`network`, `private-connectivity`, `account`, `dns`), the prose explicitly notes "Typically called by `modules/gcp/databricks-workspace`; consume directly only if you have a reason to." + +For the composer (`databricks-workspace`), the prose lists the four supported scenarios and points to the four examples. + +For top-level modules (`service-account`, `unity-catalog`), the prose stands alone. 
+
+## Migration impact
+
+This refactor is **not** a state-breaking change for the examples in this repo — none of the resource addresses change. The risks are:
+
+| Risk | Mitigation |
+|------|-----------|
+| `vpc_id` output removed (breaking) | Audit consumers: only the migrated examples consume `module.workspace.vpc_id`. Update them to use `spoke_vpc_id` in the same PR. |
+| Submodule output renames break the composer's wiring | Update the composer's `account` module call in the same task that renames the `private-connectivity` outputs. |
+| `account` input renames (`*_psc_fr_id` → `*_forwarding_rule_name`) break the composer | Same task. |
+| `service-account` module previously self-configured `provider "google"`; consumers that didn't explicitly configure it would break | Only `examples/gcp-sa-provisioning` consumes it, and it already configures `provider "google"`. README is updated to make the requirement explicit. |
+| `terraform-docs` regeneration may touch unrelated module READMEs if run from the repo root | All `make docs` invocations during implementation run from the specific module dir (`make -C modules/gcp/ docs`), never from `modules/`. |
+
+State migration is not in scope — examples are reference material and customers re-apply on clean state per PR 1's migration documentation.
+
+## Implementation phasing
+
+The work groups into logical commits matching the prior PR-1 squash style:
+
+1. `refactor(gcp): split module files by concern` — file-splitting only, no behavioral changes
+2. `refactor(gcp): rename forwarding-rule outputs and account inputs` — coordinated rename across `private-connectivity` and `account`
+3. `feat(gcp/databricks-workspace): expand and rename composer outputs` — drop `vpc_id`, add 14 outputs, switch `try()` to explicit ternaries
+4. `docs(gcp): add descriptions to all module variables` — ~70 variable descriptions across 5 modules
+5. `feat(gcp): add region validations on private-connectivity and composer` — the two new validation rules
+6. `refactor(gcp): standardize provider/versions placement` — `versions.tf` everywhere, remove `provider {}` from service-account module, split examples' `init.tf` into `versions.tf` + `providers.tf`
+7. `docs(gcp): add `## Usage` sections to module READMEs` — 7 module READMEs updated
+8. `docs(gcp): regenerate terraform-docs READMEs` — final regen sweep
+
+Implementation plan will sequence these so each commit leaves the tree validating.
+
+## Open questions
+
+None at this time.
diff --git a/examples/gcp-basic/README.md b/examples/gcp-basic/README.md
index 894b51e4..b6df4554 100644
--- a/examples/gcp-basic/README.md
+++ b/examples/gcp-basic/README.md
@@ -1,25 +1,27 @@
-# Provisioning Databricks workspace on GCP with managed VPC
-=========================
+# examples/gcp-basic — Databricks-managed VPC
-In this template, we show how to deploy a workspace with managed VPC.
+Calls `modules/gcp/databricks-workspace` with `vpc_source = "databricks_managed"`.
+The Databricks platform provisions the workspace VPC; you provide only the GCP
+project, region, and prefix.
+## Prerequisites
-## Requirements
-
-- You need to have run gcp-sa-provisionning and have a service account to fill in the variables.
-- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project.
-- The Service Account needs to be added as Databricks Admin in the account console
-
-## Run as an SA
+- A GCP project with the Databricks platform onboarded
+- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`)
+- Databricks account ID
-You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it.
+## Apply +```bash +terraform init +terraform apply +``` -## Run the tempalte +## Migrating from the old example -- You need to fill in the `variables.tf` -- run `terraform init` -- run `teraform apply` +This example previously called `modules/gcp-workspace-basic`. State from the +old apply does **not** migrate cleanly to the new composer because the +`databricks_mws_workspaces` resource address differs. Re-apply on clean state. ## Requirements @@ -34,7 +36,7 @@ No providers. | Name | Source | Version | |------|--------|---------| -| [gcp-basic](#module\_gcp-basic) | github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-basic | n/a | +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | ## Resources @@ -45,18 +47,17 @@ No resources. | Name | Description | Type | Default | Required | |------|-------------|------|---------|:--------:| | [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Email of the service account used for deployment | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [google\_zone](#input\_google\_zone) | Zone in GCP region | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [workspace\_name](#input\_workspace\_name) | Name of the workspace to create | `string` | n/a | yes | +| 
[databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Service account email used for Databricks provider authentication | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project where the workspace will be created | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region for workspace deployment | `string` | n/a | yes | +| [google\_zone](#input\_google\_zone) | GCP zone (used by the google provider) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [workspace\_name](#input\_workspace\_name) | Workspace name | `string` | n/a | yes | ## Outputs | Name | Description | |------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | diff --git a/examples/gcp-basic/main.tf b/examples/gcp-basic/main.tf index 372bbf9f..32655699 100644 --- a/examples/gcp-basic/main.tf +++ b/examples/gcp-basic/main.tf @@ -1,9 +1,11 @@ -module "gcp-basic" { - source = "github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-basic" +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix databricks_account_id = var.databricks_account_id google_project = var.google_project google_region = var.google_region - prefix = var.prefix workspace_name = var.workspace_name - delegate_from = var.delegate_from + + vpc_source = "databricks_managed" } diff --git a/examples/gcp-basic/outputs.tf b/examples/gcp-basic/outputs.tf index d6b170a9..81a92ab3 100644 --- a/examples/gcp-basic/outputs.tf +++ b/examples/gcp-basic/outputs.tf @@ -1,9 +1,9 @@ - -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url +output "workspace_id" { + value = 
module.workspace.workspace_id + description = "Databricks workspace ID" } -output "databricks_token" { - value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" } diff --git a/examples/gcp-basic/init.tf b/examples/gcp-basic/providers.tf similarity index 66% rename from examples/gcp-basic/init.tf rename to examples/gcp-basic/providers.tf index a8ea9f9f..edead550 100644 --- a/examples/gcp-basic/init.tf +++ b/examples/gcp-basic/providers.tf @@ -1,14 +1,3 @@ -terraform { - required_providers { - databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - } -} - provider "google" { project = var.google_project region = var.google_region @@ -19,4 +8,4 @@ provider "databricks" { host = "https://accounts.gcp.databricks.com" google_service_account = var.databricks_google_service_account account_id = var.databricks_account_id -} \ No newline at end of file +} diff --git a/examples/gcp-basic/terraform.tfvars b/examples/gcp-basic/terraform.tfvars new file mode 100644 index 00000000..8405cca5 --- /dev/null +++ b/examples/gcp-basic/terraform.tfvars @@ -0,0 +1,7 @@ +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" diff --git a/examples/gcp-basic/variables.tf b/examples/gcp-basic/variables.tf index 9805c04b..4b02c043 100644 --- a/examples/gcp-basic/variables.tf +++ b/examples/gcp-basic/variables.tf @@ -4,38 +4,32 @@ variable "databricks_account_id" { } variable "databricks_google_service_account" { - description = "Email of the service account used for deployment" type = string + description = "Service account email used for Databricks provider authentication" } variable "google_project" { type = string - description = "Google project for VCP/workspace deployment" + description = 
"GCP project where the workspace will be created" } variable "google_region" { type = string - description = "Google region for VCP/workspace deployment" + description = "GCP region for workspace deployment" } variable "google_zone" { - description = "Zone in GCP region" type = string + description = "GCP zone (used by the google provider)" } variable "prefix" { type = string - description = "Prefix to use in generated VPC name" + description = "Prefix used to name generated resources" } variable "workspace_name" { - description = "Name of the workspace to create" type = string + description = "Workspace name" } -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) -} - - diff --git a/modules/gcp-with-psc-exfiltration-protection/terraform.tf b/examples/gcp-basic/versions.tf similarity index 73% rename from modules/gcp-with-psc-exfiltration-protection/terraform.tf rename to examples/gcp-basic/versions.tf index 688f0fbd..5389c5dd 100644 --- a/modules/gcp-with-psc-exfiltration-protection/terraform.tf +++ b/examples/gcp-basic/versions.tf @@ -6,8 +6,5 @@ terraform { google = { source = "hashicorp/google" } - random = { - source = "hashicorp/random" - } } -} \ No newline at end of file +} diff --git a/examples/gcp-byovpc/README.md b/examples/gcp-byovpc/README.md index 8dc13eba..34587829 100644 --- a/examples/gcp-byovpc/README.md +++ b/examples/gcp-byovpc/README.md @@ -1,25 +1,39 @@ -# Provisioning Databricks workspace on GCP with a custom VPC -========================= +# examples/gcp-byovpc — Customer-managed VPC -In this template, we show how to deploy a workspace with a custom vpc. +Calls `modules/gcp/databricks-workspace` with `vpc_source = "create"`. 
Terraform +creates the spoke VPC + subnet + Cloud Router + NAT, then registers the network +with the Databricks account and provisions a workspace inside it. +## Prerequisites -## Requirements +- A GCP project with the Databricks platform onboarded +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID +- CIDR ranges for the spoke VPC and subnet that don't overlap with existing networks -- You need to have run gcp-sa-provisionning and have a service account to fill in the variables. -- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The sizing of the custom vpc subnets needs to be appropriate for the usage of the workspace. [This documentation covers it](https://docs.gcp.databricks.com/administration-guide/cloud-configurations/gcp/network-sizing.html) +## Apply -## Run as an SA +```bash +terraform init +terraform apply +``` -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. +## Migrating from the old example +This example previously called `modules/gcp-workspace-byovpc`. Several variable +names changed to match the new composer API: -## Run the tempalte +| Old name | New name | +|----------|----------| +| `subnet_ip_cidr_range` | `subnet_cidr` | +| `pod_ip_cidr_range` | `pod_cidr` | +| `svc_ip_cidr_range` | `svc_cidr` | +| `subnet_name`, `router_name`, `nat_name` | (removed — composer derives from `prefix` + random suffix) | +| `delegate_from` | (removed — handled by `examples/gcp-sa-provisioning`) | +| _(new)_ | `spoke_vpc_cidr` (VPC primary CIDR, distinct from subnet CIDR) | -- You need to fill in the variables.tf -- run `terraform init` -- run `teraform apply` +State from the old apply does **not** migrate cleanly to the new composer +because resource addresses differ. Re-apply on clean state. 
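To make the rename table concrete, a migrated `terraform.tfvars` fragment could look roughly like the sketch below. The variable names come from the table above; the CIDR values are only the illustrative ranges quoted in the variable descriptions, not recommendations, and the remaining inputs (account ID, project, region, and so on) are left out.

```hcl
# Sketch of the CIDR inputs after migration; values are illustrative placeholders.
spoke_vpc_cidr = "10.0.0.0/16" # new input: VPC primary CIDR
subnet_cidr    = "10.0.0.0/22" # formerly subnet_ip_cidr_range
pod_cidr       = null          # formerly pod_ip_cidr_range (now optional)
svc_cidr       = null          # formerly svc_ip_cidr_range (now optional)

# subnet_name, router_name, nat_name and delegate_from are no longer set here.
```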
## Requirements @@ -30,13 +44,13 @@ No requirements. | Name | Version | |------|---------| -| [google](#provider\_google) | 4.63.1 | +| [google](#provider\_google) | 6.46.0 | ## Modules | Name | Source | Version | |------|--------|---------| -| [gcp-byovpc](#module\_gcp-byovpc) | github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-byovpc | n/a | +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | ## Resources @@ -50,23 +64,23 @@ No requirements. | Name | Description | Type | Default | Required | |------|-------------|------|---------|:--------:| | [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Email of the service account used for deployment | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [google\_zone](#input\_google\_zone) | Zone in GCP region | `string` | n/a | yes | -| [nat\_name](#input\_nat\_name) | Name of the NAT service in compute router | `string` | n/a | yes | -| [pod\_ip\_cidr\_range](#input\_pod\_ip\_cidr\_range) | IP Range for Pods subnet (secondary) | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [router\_name](#input\_router\_name) | Name of the compute router to create | `string` | n/a | yes | -| [subnet\_ip\_cidr\_range](#input\_subnet\_ip\_cidr\_range) | IP Range for Nodes subnet (primary) | `string` | n/a | yes | 
-| [subnet\_name](#input\_subnet\_name) | Name of the subnet to create | `string` | n/a | yes | -| [svc\_ip\_cidr\_range](#input\_svc\_ip\_cidr\_range) | IP Range for Services subnet (secondary) | `string` | n/a | yes | +| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Service account email used for Databricks provider authentication | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project where the workspace VPC and resources will be created | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region for workspace deployment | `string` | n/a | yes | +| [google\_zone](#input\_google\_zone) | GCP zone (used by the google provider) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for the spoke VPC (e.g. 10.0.0.0/16) | `string` | n/a | yes | +| [subnet\_cidr](#input\_subnet\_cidr) | CIDR for the GKE nodes subnet primary range (e.g. 
10.0.0.0/22) | `string` | n/a | yes | +| [workspace\_name](#input\_workspace\_name) | Workspace name | `string` | n/a | yes | +| [pod\_cidr](#input\_pod\_cidr) | Optional secondary range for GKE pods | `string` | `null` | no | +| [svc\_cidr](#input\_svc\_cidr) | Optional secondary range for GKE services | `string` | `null` | no | ## Outputs | Name | Description | |------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | +| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID | +| [vpc\_id](#output\_vpc\_id) | ID of the spoke VPC created by the module | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | diff --git a/examples/gcp-byovpc/data.tf b/examples/gcp-byovpc/data.tf new file mode 100644 index 00000000..a16591b2 --- /dev/null +++ b/examples/gcp-byovpc/data.tf @@ -0,0 +1,3 @@ +data "google_client_openid_userinfo" "me" {} + +data "google_client_config" "current" {} diff --git a/examples/gcp-byovpc/main.tf b/examples/gcp-byovpc/main.tf index c1e82a06..5c9d7ec0 100644 --- a/examples/gcp-byovpc/main.tf +++ b/examples/gcp-byovpc/main.tf @@ -1,15 +1,15 @@ -module "gcp-byovpc" { - source = "github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-byovpc" +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix databricks_account_id = var.databricks_account_id google_project = var.google_project google_region = var.google_region - prefix = var.prefix - subnet_ip_cidr_range = var.subnet_ip_cidr_range - pod_ip_cidr_range = var.pod_ip_cidr_range - svc_ip_cidr_range = var.svc_ip_cidr_range - subnet_name = var.subnet_name - router_name = var.router_name - nat_name = var.nat_name workspace_name = var.workspace_name - delegate_from = var.delegate_from + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = 
var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr } diff --git a/examples/gcp-byovpc/outputs.tf b/examples/gcp-byovpc/outputs.tf index f544b3ba..202343f0 100644 --- a/examples/gcp-byovpc/outputs.tf +++ b/examples/gcp-byovpc/outputs.tf @@ -1,8 +1,19 @@ -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" } -output "databricks_token" { - value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" +} + +output "vpc_id" { + value = module.workspace.spoke_vpc_id + description = "ID of the spoke VPC created by the module" +} + +output "network_id" { + value = module.workspace.network_id + description = "databricks_mws_networks ID" } \ No newline at end of file diff --git a/examples/gcp-byovpc/init.tf b/examples/gcp-byovpc/providers.tf similarity index 55% rename from examples/gcp-byovpc/init.tf rename to examples/gcp-byovpc/providers.tf index 55e8f98c..edead550 100644 --- a/examples/gcp-byovpc/init.tf +++ b/examples/gcp-byovpc/providers.tf @@ -1,31 +1,11 @@ -terraform { - required_providers { - databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - } -} - provider "google" { project = var.google_project region = var.google_region zone = var.google_zone - } provider "databricks" { host = "https://accounts.gcp.databricks.com" google_service_account = var.databricks_google_service_account account_id = var.databricks_account_id - } - -data "google_client_openid_userinfo" "me" { -} - - -data "google_client_config" "current" { -} \ No newline at end of file diff --git a/examples/gcp-byovpc/terraform.tfvars b/examples/gcp-byovpc/terraform.tfvars new file mode 100644 index 00000000..6029b385 --- /dev/null +++ 
b/examples/gcp-byovpc/terraform.tfvars @@ -0,0 +1,11 @@ +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" +spoke_vpc_cidr = "" +subnet_cidr = "" +pod_cidr = null +svc_cidr = null diff --git a/examples/gcp-byovpc/variables.tf b/examples/gcp-byovpc/variables.tf index e1c91f2d..f9d9aefd 100644 --- a/examples/gcp-byovpc/variables.tf +++ b/examples/gcp-byovpc/variables.tf @@ -4,61 +4,53 @@ variable "databricks_account_id" { } variable "databricks_google_service_account" { - description = "Email of the service account used for deployment" type = string + description = "Service account email used for Databricks provider authentication" } variable "google_project" { type = string - description = "Google project for VCP/workspace deployment" + description = "GCP project where the workspace VPC and resources will be created" } variable "google_region" { type = string - description = "Google region for VCP/workspace deployment" + description = "GCP region for workspace deployment" } variable "google_zone" { - description = "Zone in GCP region" type = string + description = "GCP zone (used by the google provider)" } variable "prefix" { type = string - description = "Prefix to use in generated VPC name" + description = "Prefix used to name generated resources" } -variable "subnet_ip_cidr_range" { +variable "workspace_name" { type = string - description = "IP Range for Nodes subnet (primary)" + description = "Workspace name" } -variable "pod_ip_cidr_range" { +variable "spoke_vpc_cidr" { type = string - description = "IP Range for Pods subnet (secondary)" + description = "CIDR for the spoke VPC (e.g. 10.0.0.0/16)" } -variable "svc_ip_cidr_range" { +variable "subnet_cidr" { type = string - description = "IP Range for Services subnet (secondary)" + description = "CIDR for the GKE nodes subnet primary range (e.g. 
10.0.0.0/22)" } -variable "subnet_name" { +variable "pod_cidr" { type = string - description = "Name of the subnet to create" + default = null + description = "Optional secondary range for GKE pods" } -variable "router_name" { +variable "svc_cidr" { type = string - description = "Name of the compute router to create" -} - -variable "nat_name" { - type = string - description = "Name of the NAT service in compute router" -} - -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) + default = null + description = "Optional secondary range for GKE services" } diff --git a/examples/gcp-byovpc/versions.tf b/examples/gcp-byovpc/versions.tf new file mode 100644 index 00000000..5389c5dd --- /dev/null +++ b/examples/gcp-byovpc/versions.tf @@ -0,0 +1,10 @@ +terraform { + required_providers { + databricks = { + source = "databricks/databricks" + } + google = { + source = "hashicorp/google" + } + } +} diff --git a/examples/gcp-sa-provisionning/Makefile b/examples/gcp-existing-vpc/Makefile similarity index 100% rename from examples/gcp-sa-provisionning/Makefile rename to examples/gcp-existing-vpc/Makefile diff --git a/examples/gcp-existing-vpc/README.md b/examples/gcp-existing-vpc/README.md new file mode 100644 index 00000000..48affa71 --- /dev/null +++ b/examples/gcp-existing-vpc/README.md @@ -0,0 +1,74 @@ +# examples/gcp-existing-vpc — Use a pre-existing VPC + +Calls `modules/gcp/databricks-workspace` with `vpc_source = "existing"`. Instead +of creating a VPC, the composer looks up the named VPC + subnet via Terraform +data sources and registers them with the Databricks account. + +This is the scenario for organizations that manage GCP networking out-of-band +(e.g. via a platform team) and just want Databricks to consume an existing +network. 
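For readers unfamiliar with the lookup approach, the sketch below shows how an existing VPC and subnet can be resolved with the Google provider's data sources. It is an assumption about the general technique, not the composer's actual code; the data source labels (`existing`) are hypothetical, and the `var.*` references are the variables declared in this example's `variables.tf`.

```hcl
# Illustrative lookup only; the real module internals may differ.
data "google_compute_network" "existing" {
  name    = var.existing_vpc_name
  project = var.google_project
}

data "google_compute_subnetwork" "existing" {
  name    = var.existing_subnet_name
  project = var.google_project
  region  = var.google_region # the subnet must live in the workspace region
}
```

Because these are data sources, Terraform reads the network and subnet but never manages them, so destroying the workspace leaves the network untouched.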
+ +## Prerequisites + +- A GCP project with the Databricks platform onboarded +- A pre-existing VPC and subnet in that project. The subnet must be in `google_region`. +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID + +## Apply + +```bash +terraform init +terraform apply +``` + +## What the composer does NOT do in this mode + +- Does not create the VPC, subnet, router, or NAT — those must already exist +- Does not enforce that the subnet has Private Google Access enabled — verify in the console +- Does not configure egress firewalls or PrivateLink (those require `vpc_source = "create"`) + +To layer PrivateLink onto an existing network, the current composer requires +`vpc_source = "create"`. Future work may relax this. + + +## Requirements + +No requirements. + +## Providers + +No providers. + +## Modules + +| Name | Source | Version | +|------|--------|---------| +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | + +## Resources + +No resources. 
+ +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | +| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Service account email used for Databricks provider authentication | `string` | n/a | yes | +| [existing\_subnet\_name](#input\_existing\_subnet\_name) | Name of the pre-existing subnet inside the VPC (must be in google\_region) | `string` | n/a | yes | +| [existing\_vpc\_name](#input\_existing\_vpc\_name) | Name of the pre-existing GCP VPC to deploy the workspace into | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project hosting the existing VPC and subnet (also the workspace project) | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region for workspace deployment (must match the existing subnet's region) | `string` | n/a | yes | +| [google\_zone](#input\_google\_zone) | GCP zone (used by the google provider) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name Databricks-side resources (mws\_networks, mws\_workspaces) | `string` | n/a | yes | +| [workspace\_name](#input\_workspace\_name) | Workspace name | `string` | n/a | yes | + +## Outputs + +| Name | Description | +|------|-------------| +| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | + diff --git a/examples/gcp-existing-vpc/main.tf b/examples/gcp-existing-vpc/main.tf new file mode 100644 index 00000000..6b4173e3 --- /dev/null +++ b/examples/gcp-existing-vpc/main.tf @@ -0,0 +1,13 @@ +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = 
var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "existing" + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name +} diff --git a/examples/gcp-existing-vpc/outputs.tf b/examples/gcp-existing-vpc/outputs.tf new file mode 100644 index 00000000..469a66e6 --- /dev/null +++ b/examples/gcp-existing-vpc/outputs.tf @@ -0,0 +1,14 @@ +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = module.workspace.network_id + description = "databricks_mws_networks ID" +} diff --git a/examples/gcp-existing-vpc/providers.tf b/examples/gcp-existing-vpc/providers.tf new file mode 100644 index 00000000..edead550 --- /dev/null +++ b/examples/gcp-existing-vpc/providers.tf @@ -0,0 +1,11 @@ +provider "google" { + project = var.google_project + region = var.google_region + zone = var.google_zone +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + google_service_account = var.databricks_google_service_account + account_id = var.databricks_account_id +} diff --git a/examples/gcp-existing-vpc/terraform.tfvars b/examples/gcp-existing-vpc/terraform.tfvars new file mode 100644 index 00000000..a541640e --- /dev/null +++ b/examples/gcp-existing-vpc/terraform.tfvars @@ -0,0 +1,9 @@ +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" +existing_vpc_name = "" +existing_subnet_name = "" diff --git a/examples/gcp-existing-vpc/variables.tf b/examples/gcp-existing-vpc/variables.tf new file mode 100644 index 00000000..f0518c88 --- /dev/null +++ b/examples/gcp-existing-vpc/variables.tf @@ -0,0 +1,44 @@ +variable "databricks_account_id" { + type = string + description = 
"Databricks Account ID" +} + +variable "databricks_google_service_account" { + type = string + description = "Service account email used for Databricks provider authentication" +} + +variable "google_project" { + type = string + description = "GCP project hosting the existing VPC and subnet (also the workspace project)" +} + +variable "google_region" { + type = string + description = "GCP region for workspace deployment (must match the existing subnet's region)" +} + +variable "google_zone" { + type = string + description = "GCP zone (used by the google provider)" +} + +variable "prefix" { + type = string + description = "Prefix used to name Databricks-side resources (mws_networks, mws_workspaces)" +} + +variable "workspace_name" { + type = string + description = "Workspace name" +} + +variable "existing_vpc_name" { + type = string + description = "Name of the pre-existing GCP VPC to deploy the workspace into" +} + +variable "existing_subnet_name" { + type = string + description = "Name of the pre-existing subnet inside the VPC (must be in google_region)" +} diff --git a/examples/gcp-existing-vpc/versions.tf b/examples/gcp-existing-vpc/versions.tf new file mode 100644 index 00000000..5389c5dd --- /dev/null +++ b/examples/gcp-existing-vpc/versions.tf @@ -0,0 +1,10 @@ +terraform { + required_providers { + databricks = { + source = "databricks/databricks" + } + google = { + source = "hashicorp/google" + } + } +} diff --git a/examples/gcp-sa-provisioning/data.tf b/examples/gcp-sa-provisioning/data.tf new file mode 100644 index 00000000..c2c3f9bf --- /dev/null +++ b/examples/gcp-sa-provisioning/data.tf @@ -0,0 +1 @@ +data "google_client_openid_userinfo" "me" {} diff --git a/examples/gcp-sa-provisioning/init.tf b/examples/gcp-sa-provisioning/init.tf deleted file mode 100644 index 5332dab6..00000000 --- a/examples/gcp-sa-provisioning/init.tf +++ /dev/null @@ -1,17 +0,0 @@ -terraform { - required_providers { - - google = { - source = "hashicorp/google" - } - } -} - 
-provider "google" { - project = var.google_project - region = var.google_region - zone = var.google_zone - -} -data "google_client_openid_userinfo" "me" { -} diff --git a/examples/gcp-sa-provisioning/main.tf b/examples/gcp-sa-provisioning/main.tf index 7b596530..109fd493 100644 --- a/examples/gcp-sa-provisioning/main.tf +++ b/examples/gcp-sa-provisioning/main.tf @@ -1,5 +1,5 @@ module "gcp-sa-provisioning" { - source = "github.com/databricks/terraform-databricks-examples/modules/gcp-sa-provisioning" + source = "../../modules/gcp/service-account" google_project = var.google_project prefix = var.prefix delegate_from = var.delegate_from diff --git a/examples/gcp-sa-provisioning/providers.tf b/examples/gcp-sa-provisioning/providers.tf new file mode 100644 index 00000000..6040c599 --- /dev/null +++ b/examples/gcp-sa-provisioning/providers.tf @@ -0,0 +1,5 @@ +provider "google" { + project = var.google_project + region = var.google_region + zone = var.google_zone +} diff --git a/modules/gcp-sa-provisioning/init.tf b/examples/gcp-sa-provisioning/versions.tf similarity index 66% rename from modules/gcp-sa-provisioning/init.tf rename to examples/gcp-sa-provisioning/versions.tf index 3cd60564..b3340e11 100644 --- a/modules/gcp-sa-provisioning/init.tf +++ b/examples/gcp-sa-provisioning/versions.tf @@ -1,11 +1,7 @@ terraform { required_providers { - google = { source = "hashicorp/google" } } } - -data "google_client_openid_userinfo" "me" { -} diff --git a/examples/gcp-with-psc-exfiltration-protection/README.md b/examples/gcp-with-psc-exfiltration-protection/README.md index 64d676a9..265c560f 100644 --- a/examples/gcp-with-psc-exfiltration-protection/README.md +++ b/examples/gcp-with-psc-exfiltration-protection/README.md @@ -1,37 +1,42 @@ -# Provisioning Databricks on GCP workspace with a Hub & Spoke network architecture for data exfiltration protection +# examples/gcp-with-psc-exfiltration-protection — Workspace with PSC + private DNS + restricted egress -This example is using 
the [gcp-with-psc-exfiltration-protection](../../modules/gcp-with-psc-exfiltration-protection) module. +Calls `modules/gcp/databricks-workspace` with all PrivateLink and egress-control flags enabled: -This template provides an example deployment of: Hub-Spoke networking with egress firewall to control all outbound traffic from Databricks subnets. +- `vpc_source = "create"` — composer creates the spoke VPC + hub VPC + peering +- `private_link_frontend = true` — frontend PSC endpoint (workspace UI/API) +- `private_link_backend = true` — backend (SCC) PSC endpoint (data plane) +- `private_access_only = true` — `mws_private_access_settings.public_access_enabled = false` +- `restricted_egress = true` — hub VPC + deny-egress firewall + private DNS zones -With this setup, you can setup firewall rules to block / allow egress traffic from your Databricks clusters. You can also use firewall to block all access to storage accounts, and use private endpoint connection to bypass this firewall, such that you allow access only to specific storage accounts. +Optionally pairs with the `modules/gcp/unity-catalog` module to create a metastore, GCS bucket, storage credential, external location, and default catalog. 
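A minimal sketch of the corresponding module call, assuming the flag names listed above and the common inputs used by the other examples in this change. It is deliberately incomplete: the CIDR, project, and metastore inputs this scenario also needs are omitted, so treat it as an outline rather than the example's actual `main.tf`.

```hcl
# Outline only; CIDR, hub/spoke project, and metastore inputs are omitted.
module "workspace" {
  source = "../../modules/gcp/databricks-workspace"

  prefix                = var.prefix
  databricks_account_id = var.databricks_account_id
  google_project        = var.google_project
  google_region         = var.google_region
  workspace_name        = var.workspace_name

  vpc_source            = "create" # composer creates spoke VPC, hub VPC, and peering
  private_link_frontend = true     # frontend PSC endpoint (workspace UI/API)
  private_link_backend  = true     # backend (SCC) PSC endpoint
  private_access_only   = true     # disable public access to the workspace
  restricted_egress     = true     # hub VPC, deny-egress firewall, private DNS zones
}
```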
+## Prerequisites -To find IP and FQDN for your deployment, go to: https://docs.gcp.databricks.com/en/resources/ip-domain-region.html +- Two (or three) GCP projects: workspace project, spoke VPC project, hub VPC project (can be the same) +- Service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID +- CIDR ranges that don't overlap: `spoke_vpc_cidr`, `subnet_cidr` (subset of spoke), `hub_vpc_cidr`, `psc_subnet_cidr` +- Regional default Hive Metastore IP from [Databricks docs](https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore) -## Overall Architecture +## Apply -![alt text](../../modules/gcp-with-psc-exfiltration-protection/images/architecture.png) +```bash +terraform init +terraform apply +``` -Resources to be created: -* Hub VPC and its subnet -* Spoke VPC and its subnets -* Peering between Hub and Spoke VPC -* Private Service Connect (PSC) endpoints -* DNS private and peering zones -* Firewall rules for Hub and Spoke VPCs -* Databricks workspace with private link to control plane, user to webapp and private link to DBFS +## Migrating from the old example -## How to use +This example previously called `modules/gcp-with-psc-exfiltration-protection` and `modules/gcp-unity-catalog`. Key changes: -1. Reference this module using one of the different [module source types](https://developer.hashicorp.com/terraform/language/modules/sources) -2. Add `terraform.tfvars` with the information about service principals to be provisioned at account level. 
+| Old | New | +|-----|-----| +| `module.gcp_with_data_exfiltration_protection` | `module.workspace` | +| `modules/gcp-with-psc-exfiltration-protection` | `modules/gcp/databricks-workspace` with `vpc_source=create` + 4 PSC/egress flags | +| `modules/gcp-unity-catalog` | `modules/gcp/unity-catalog` (relocated, same interface) | +| `spoke_vpc_cidr` (legacy: was used as subnet CIDR AND firewall source ranges) | Split into `subnet_cidr` (subnet CIDR) and `spoke_vpc_cidr` (broader VPC CIDR for firewall source) | -## How to fill in variable values - -Variables have no default values in order to avoid misconfiguration - -Most values are related to resources managed by Databricks. The required values can be found at: https://docs.gcp.databricks.com/en/resources/ip-domain-region.html +State from the old apply does **not** migrate cleanly to the new composer because resource addresses differ. Re-apply on clean state. ## Requirements @@ -49,8 +54,8 @@ No providers. | Name | Source | Version | |------|--------|---------| -| [gcp\_with\_data\_exfiltration\_protection](#module\_gcp\_with\_data\_exfiltration\_protection) | ../../modules/gcp-with-psc-exfiltration-protection | n/a | -| [unity\_catalog](#module\_unity\_catalog) | ../../modules/gcp-unity-catalog | n/a | +| [unity\_catalog](#module\_unity\_catalog) | ../../modules/gcp/unity-catalog | n/a | +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | ## Resources @@ -60,25 +65,29 @@ No resources. 
| Name | Description | Type | Default | Required | |------|-------------|------|---------|:--------:| -| [catalog\_name](#input\_catalog\_name) | Name to assign to default catalog | `string` | n/a | yes | +| [catalog\_name](#input\_catalog\_name) | Name to assign to default Unity Catalog catalog | `string` | n/a | yes | | [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | | [google\_region](#input\_google\_region) | Google Cloud region where the resources will be created | `string` | n/a | yes | -| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Value of regional default Hive Metastore IP | `string` | n/a | yes | -| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for Hub VPC | `string` | n/a | yes | -| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | Google Cloud project ID related to Hub VPC | `string` | n/a | yes | -| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | Whether the Spoke VPC is a Shared or a dedicated VPC | `bool` | n/a | yes | -| [metastore\_name](#input\_metastore\_name) | Name to assign to regional metastore | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated resources name | `string` | n/a | yes | -| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes | -| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes | -| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | Google Cloud project ID related to Spoke VPC | `string` | n/a | yes | -| [workspace\_google\_project](#input\_workspace\_google\_project) | Google Cloud project ID related to Databricks workspace | `string` | n/a | yes | -| [tags](#input\_tags) | Map of tags to add to all resources | `map(string)` | `{}` | no | +| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Regional default Hive Metastore IP (used by the spoke egress firewall to allow MySQL/3306) | `string` | n/a | yes | 
+| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for the hub subnet | `string` | n/a | yes | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | Google Cloud project ID hosting the hub VPC | `string` | n/a | yes | +| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | Whether the spoke VPC project hosts a Shared VPC and the workspace project is bound as a service project | `bool` | n/a | yes | +| [metastore\_name](#input\_metastore\_name) | Name to assign to regional Unity Catalog metastore | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for the dedicated PSC subnet in the spoke VPC | `string` | n/a | yes | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR of the spoke VPC address space (used as source\_ranges for the hub ingress firewall) | `string` | n/a | yes | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | Google Cloud project ID hosting the spoke VPC (often the same as workspace project) | `string` | n/a | yes | +| [subnet\_cidr](#input\_subnet\_cidr) | CIDR for the spoke subnet (must be within spoke\_vpc\_cidr) | `string` | n/a | yes | +| [workspace\_google\_project](#input\_workspace\_google\_project) | Google Cloud project ID where the Databricks workspace lives | `string` | n/a | yes | +| [tags](#input\_tags) | Map of tags applied to the composer (the composer accepts this but does not currently propagate to all submodules) | `map(string)` | `{}` | no | ## Outputs | Name | Description | |------|-------------| +| [hub\_vpc\_id](#output\_hub\_vpc\_id) | ID of the hub VPC | +| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID | +| [vpc\_id](#output\_vpc\_id) | ID of the spoke VPC | | [workspace\_id](#output\_workspace\_id) | The Databricks workspace ID | | [workspace\_url](#output\_workspace\_url) | The workspace URL which is of the format 
'{workspaceId}.{random}.gcp.databricks.com' | diff --git a/examples/gcp-with-psc-exfiltration-protection/main.tf b/examples/gcp-with-psc-exfiltration-protection/main.tf index c0b7fa91..54eff6a6 100644 --- a/examples/gcp-with-psc-exfiltration-protection/main.tf +++ b/examples/gcp-with-psc-exfiltration-protection/main.tf @@ -1,16 +1,26 @@ -module "gcp_with_data_exfiltration_protection" { - source = "../../modules/gcp-with-psc-exfiltration-protection" +module "workspace" { + source = "../../modules/gcp/databricks-workspace" - databricks_account_id = var.databricks_account_id + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.workspace_google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = var.spoke_vpc_google_project hub_vpc_google_project = var.hub_vpc_google_project is_spoke_vpc_shared = var.is_spoke_vpc_shared - prefix = var.prefix - spoke_vpc_google_project = var.spoke_vpc_google_project - workspace_google_project = var.workspace_google_project - google_region = var.google_region - hive_metastore_ip = var.hive_metastore_ip hub_vpc_cidr = var.hub_vpc_cidr psc_subnet_cidr = var.psc_subnet_cidr - spoke_vpc_cidr = var.spoke_vpc_cidr - tags = var.tags + hive_metastore_ip = var.hive_metastore_ip + + tags = var.tags } \ No newline at end of file diff --git a/examples/gcp-with-psc-exfiltration-protection/outputs.tf b/examples/gcp-with-psc-exfiltration-protection/outputs.tf index 681fe5d0..76e7c517 100644 --- a/examples/gcp-with-psc-exfiltration-protection/outputs.tf +++ b/examples/gcp-with-psc-exfiltration-protection/outputs.tf @@ -1,10 +1,24 @@ - output "workspace_url" { - value = module.gcp_with_data_exfiltration_protection.workspace_url + value = module.workspace.workspace_url 
description = "The workspace URL which is of the format '{workspaceId}.{random}.gcp.databricks.com'" } output "workspace_id" { + value = module.workspace.workspace_id description = "The Databricks workspace ID" - value = module.gcp_with_data_exfiltration_protection.workspace_id +} + +output "vpc_id" { + value = module.workspace.spoke_vpc_id + description = "ID of the spoke VPC" +} + +output "hub_vpc_id" { + value = module.workspace.hub_vpc_id + description = "ID of the hub VPC" +} + +output "network_id" { + value = module.workspace.network_id + description = "databricks_mws_networks ID" } \ No newline at end of file diff --git a/examples/gcp-with-psc-exfiltration-protection/providers.tf b/examples/gcp-with-psc-exfiltration-protection/providers.tf index 489bf1e9..f2881ffd 100644 --- a/examples/gcp-with-psc-exfiltration-protection/providers.tf +++ b/examples/gcp-with-psc-exfiltration-protection/providers.tf @@ -6,7 +6,7 @@ provider "databricks" { provider "databricks" { alias = "workspace" - host = module.gcp_with_data_exfiltration_protection.workspace_url + host = module.workspace.workspace_url } provider "google" { diff --git a/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars b/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars index 8f095727..c9a603e9 100644 --- a/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars +++ b/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars @@ -13,8 +13,10 @@ prefix = "" hive_metastore_ip = "" hub_vpc_cidr = "" spoke_vpc_cidr = "" +subnet_cidr = "" psc_subnet_cidr = "" metastore_name = "" catalog_name = "" +tags = {} diff --git a/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf b/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf index c6c0628c..792862d8 100644 --- a/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf +++ b/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf @@ -1,15 +1,16 @@ module "unity_catalog" { - source = 
"../../modules/gcp-unity-catalog" + source = "../../modules/gcp/unity-catalog" providers = { - databricks = databricks, + databricks = databricks databricks.workspace = databricks.workspace } - databricks_workspace_id = module.gcp_with_data_exfiltration_protection.workspace_id - databricks_workspace_url = module.gcp_with_data_exfiltration_protection.workspace_url + + databricks_workspace_id = module.workspace.workspace_id + databricks_workspace_url = module.workspace.workspace_url google_project = var.workspace_google_project google_region = var.google_region + prefix = var.prefix metastore_name = var.metastore_name catalog_name = var.catalog_name - prefix = var.prefix } \ No newline at end of file diff --git a/examples/gcp-with-psc-exfiltration-protection/variables.tf b/examples/gcp-with-psc-exfiltration-protection/variables.tf index 15365ccf..48578319 100644 --- a/examples/gcp-with-psc-exfiltration-protection/variables.tf +++ b/examples/gcp-with-psc-exfiltration-protection/variables.tf @@ -10,64 +10,69 @@ variable "google_region" { variable "workspace_google_project" { type = string - description = "Google Cloud project ID related to Databricks workspace" + description = "Google Cloud project ID where the Databricks workspace lives" } variable "spoke_vpc_google_project" { type = string - description = "Google Cloud project ID related to Spoke VPC" + description = "Google Cloud project ID hosting the spoke VPC (often the same as workspace project)" } variable "hub_vpc_google_project" { type = string - description = "Google Cloud project ID related to Hub VPC" + description = "Google Cloud project ID hosting the hub VPC" } variable "is_spoke_vpc_shared" { type = bool - description = "Whether the Spoke VPC is a Shared or a dedicated VPC" + description = "Whether the spoke VPC project hosts a Shared VPC and the workspace project is bound as a service project" } variable "prefix" { type = string - description = "Prefix to use in generated resources name" + description 
= "Prefix used to name generated resources" } # For the value of the regional Hive Metastore IP, refer to the Databricks documentation -# Here - https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore +# https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore variable "hive_metastore_ip" { type = string - description = "Value of regional default Hive Metastore IP" + description = "Regional default Hive Metastore IP (used by the spoke egress firewall to allow MySQL/3306)" } variable "hub_vpc_cidr" { type = string - description = "CIDR for Hub VPC" + description = "CIDR for the hub subnet" } variable "spoke_vpc_cidr" { type = string - description = "CIDR for Spoke VPC" + description = "CIDR of the spoke VPC address space (used as source_ranges for the hub ingress firewall)" +} + +variable "subnet_cidr" { + type = string + description = "CIDR for the spoke subnet (must be within spoke_vpc_cidr)" } variable "psc_subnet_cidr" { type = string - description = "CIDR for Spoke VPC" + description = "CIDR for the dedicated PSC subnet in the spoke VPC" } variable "tags" { type = map(string) - description = "Map of tags to add to all resources" + description = "Map of tags applied to the composer (the composer accepts this but does not currently propagate to all submodules)" default = {} } variable "metastore_name" { type = string - description = "Name to assign to regional metastore" + description = "Name to assign to regional Unity Catalog metastore" } variable "catalog_name" { type = string - description = "Name to assign to default catalog" + description = "Name to assign to default Unity Catalog catalog" } \ No newline at end of file diff --git a/examples/gcp-with-psc-exfiltration-protection/terraform.tf b/examples/gcp-with-psc-exfiltration-protection/versions.tf similarity index 100% rename from examples/gcp-with-psc-exfiltration-protection/terraform.tf rename to 
examples/gcp-with-psc-exfiltration-protection/versions.tf diff --git a/modules/Makefile b/modules/Makefile index 98c80a85..a23fed05 100644 --- a/modules/Makefile +++ b/modules/Makefile @@ -1,8 +1,11 @@ PROJECTS := $(dir $(wildcard */README.md)) -docs: $(PROJECTS) +docs: $(PROJECTS) gcp-recursive $(PROJECTS): $(MAKE) -C $@ docs -.PHONY: $(PROJECTS) +gcp-recursive: + $(MAKE) -C gcp docs + +.PHONY: $(PROJECTS) docs gcp-recursive diff --git a/modules/gcp-sa-provisioning/Makefile b/modules/gcp-sa-provisioning/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-sa-provisioning/Makefile +++ /dev/null @@ -1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-unity-catalog/Makefile b/modules/gcp-unity-catalog/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-unity-catalog/Makefile +++ /dev/null @@ -1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-with-psc-exfiltration-protection/Makefile b/modules/gcp-with-psc-exfiltration-protection/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/Makefile +++ /dev/null @@ -1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-with-psc-exfiltration-protection/README.md b/modules/gcp-with-psc-exfiltration-protection/README.md deleted file mode 100644 index 6f9650de..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/README.md +++ /dev/null @@ -1,139 +0,0 @@ -# Databricks on Google Cloud with Private Service Connect and Hub-Spoke network structure (data exfiltration protection). 
- -## ⚠️ Prerequisites -To **enable Private Service Connect for your Databricks workspace** on Google Cloud, you must contact your Databricks account team and provide: -- Databricks account ID -- VPC Host Project ID of the **compute plane VPC** for enabling back-end Private Service Connect -- VPC Host Project ID of the **transit VPC** for enabling front-end Private Service Connect -- Workspace region - -This configuration **cannot be completed independently** and requires coordination with your Databricks account team. - -## Overview - -The module includes: -1. Hub-Spoke networking with egress firewall to control all outbound traffic, e.g. to pypi.org. -2. Private Service Connect connection for backend traffic from data plane to control plane. -3. Private Service Connect connection from user client to webapp service. -4. Private Google Access from data plane to DBFS storage. -5. Private Service Connect connection for web-auth traffic. - -## Overall Architecture - -![alt text](images/architecture.png) - -With this deployment, traffic from user client to webapp (notebook UI), backend traffic from data plane to control plane will be through PSC endpoints. This terraform sample will create: -* Hub VPC and its subnet -* Spoke VPC and its subnets -* Peering between Hub and Spoke VPC -* Private Service Connect (PSC) endpoints -* DNS private and peering zones -* Firewall rules for Hub and Spoke VPCs -* Databricks workspace with private link to control plane, user to webapp and private link to DBFS - - -**Note that** the module does not contain the VPC SC implementation. This can be added to increase the security level in the Databricks deployment, providing detailed access level for ingress and egress traffic. -## How to use - -> **Note** -> You can customize this module by adding, deleting or updating the Google Cloud resources to adapt the module to your requirements. 
-> A deployment example using this module can be found in [examples/gcp-with-psc-exfiltration-protection](../../examples/gcp-with-psc-exfiltration-protection) - -1. Reference this module using one of the different [module source types](https://developer.hashicorp.com/terraform/language/modules/sources) -2. Add `terraform.tfvars` with the information about service principals to be provisioned at account level. - - -## Requirements - -No requirements. - -## Providers - -| Name | Version | -|------|---------| -| [databricks](#provider\_databricks) | n/a | -| [google](#provider\_google) | n/a | -| [random](#provider\_random) | n/a | - -## Modules - -No modules. - -## Resources - -| Name | Type | -|------|------| -| [databricks_mws_networks.databricks_network](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource | -| [databricks_mws_private_access_settings.pas](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_private_access_settings) | resource | -| [databricks_mws_vpc_endpoint.backend_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | -| [databricks_mws_vpc_endpoint.frontend_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | -| [databricks_mws_vpc_endpoint.transit_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | -| [databricks_mws_workspaces.databricks_workspace](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | -| [google_compute_address.backend_pe_ip_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | -| 
[google_compute_address.hub_frontend_pe_ip_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | -| [google_compute_address.spoke_frontend_pe_ip_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | -| [google_compute_firewall.databricks_workspace_traffic](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_firewall.default_deny_egress](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_firewall.hub_net_traffic](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_firewall.to_databricks_compute_plane](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_firewall.to_databricks_control_plane](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_firewall.to_google_apis](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_firewall.to_managed_hive](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | -| [google_compute_forwarding_rule.backend_psc_ep](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | -| [google_compute_forwarding_rule.hub_frontend_psc_ep](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | -| [google_compute_forwarding_rule.spoke_frontend_psc_ep](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | -| 
[google_compute_network.hub_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | -| [google_compute_network.spoke_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | -| [google_compute_network_peering.hub_spoke_peering](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource | -| [google_compute_network_peering.spoke_hub_peering](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource | -| [google_compute_shared_vpc_host_project.host](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_host_project) | resource | -| [google_compute_shared_vpc_service_project.service](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_service_project) | resource | -| [google_compute_subnetwork.hub_subnetwork](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | -| [google_compute_subnetwork.psc_subnetwork](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | -| [google_compute_subnetwork.spoke_subnetwork](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | -| [google_dns_managed_zone.gcr_peering_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_managed_zone.gcr_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_managed_zone.google_apis_peering_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| 
[google_dns_managed_zone.google_apis_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_managed_zone.hub_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_managed_zone.pkg_dev_peering_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_managed_zone.pkg_dev_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_managed_zone.spoke_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | -| [google_dns_record_set.gcr_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.gcr_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.hub_workspace_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.hub_workspace_psc_auth](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.hub_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.pkg_dev_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.pkg_dev_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.restricted_apis_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| 
[google_dns_record_set.restricted_apis_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.spoke_relay](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.spoke_workspace_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [google_dns_record_set.spoke_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | -| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | - -## Inputs - -| Name | Description | Type | Default | Required | -|------|-------------|------|---------|:--------:| -| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google Cloud region where the resources will be created | `string` | n/a | yes | -| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Value of regional default Hive Metastore IP | `string` | n/a | yes | -| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for Hub VPC | `string` | n/a | yes | -| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | Google Cloud project ID related to Hub VPC | `string` | n/a | yes | -| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | Whether the Spoke VPC is a Shared or a dedicated VPC | `bool` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated resources name | `string` | n/a | yes | -| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes | -| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes | -| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | Google Cloud project ID related to Spoke VPC | `string` | 
n/a | yes | -| [tags](#input\_tags) | Map of tags to add to all resources | `map(string)` | n/a | yes | -| [workspace\_google\_project](#input\_workspace\_google\_project) | Google Cloud project ID related to Databricks workspace | `string` | n/a | yes | - -## Outputs - -| Name | Description | -|------|-------------| -| [workspace\_id](#output\_workspace\_id) | The Databricks workspace ID | -| [workspace\_url](#output\_workspace\_url) | The workspace URL which is of the format '{workspaceId}.{random}.gcp.databricks.com' | - \ No newline at end of file diff --git a/modules/gcp-with-psc-exfiltration-protection/databricks-cloud-resources.tf b/modules/gcp-with-psc-exfiltration-protection/databricks-cloud-resources.tf deleted file mode 100644 index f0a355e0..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/databricks-cloud-resources.tf +++ /dev/null @@ -1,86 +0,0 @@ -################################################### -# Databricks VPC Endpoints & Network Configuration -################################################### - -# ================================================ -# Private Service Connect Endpoint Configurations -# ================================================ - -# Registers a transit VPC endpoint for hub network connectivity -resource "databricks_mws_vpc_endpoint" "transit_endpoint" { - depends_on = [google_compute_forwarding_rule.backend_psc_ep] - - vpc_endpoint_name = "${var.prefix}-hub-ep-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP-specific PSC configuration for hub network - gcp_vpc_endpoint_info { - project_id = var.hub_vpc_google_project - psc_endpoint_name = google_compute_forwarding_rule.hub_frontend_psc_ep.name - endpoint_region = var.google_region - } -} - -# Registers frontend workspace VPC endpoint for user-facing access -resource "databricks_mws_vpc_endpoint" "frontend_endpoint" { - depends_on = [google_compute_forwarding_rule.backend_psc_ep] - - vpc_endpoint_name = 
"${var.prefix}-ws-ep-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP-specific PSC configuration for spoke workspace - gcp_vpc_endpoint_info { - project_id = var.spoke_vpc_google_project - psc_endpoint_name = google_compute_forwarding_rule.spoke_frontend_psc_ep.name - endpoint_region = var.google_region - } -} - -# Registers backend SCC (Secure Cluster Connectivity) endpoint -resource "databricks_mws_vpc_endpoint" "backend_endpoint" { - depends_on = [google_compute_forwarding_rule.spoke_frontend_psc_ep] - - vpc_endpoint_name = "${var.prefix}-scc-ep-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP-specific PSC configuration for backend connectivity - gcp_vpc_endpoint_info { - project_id = var.spoke_vpc_google_project - psc_endpoint_name = google_compute_forwarding_rule.backend_psc_ep.name - endpoint_region = var.google_region - } -} - -# ================================================ -# Network Configuration for Databricks Workspace -# ================================================ - -resource "databricks_mws_networks" "databricks_network" { - network_name = "${var.prefix}-ntw-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP network infrastructure details - gcp_network_info { - network_project_id = var.spoke_vpc_google_project - vpc_id = google_compute_network.spoke_vpc.name - subnet_id = google_compute_subnetwork.spoke_subnetwork.name - subnet_region = var.google_region - } - - # PrivateLink endpoint associations - vpc_endpoints { - dataplane_relay = [databricks_mws_vpc_endpoint.backend_endpoint.vpc_endpoint_id] # SCC connectivity - rest_api = [databricks_mws_vpc_endpoint.frontend_endpoint.vpc_endpoint_id] # Workspace API access - } -} - -# ================================================ -# Private Access Configuration -# ================================================ - -resource "databricks_mws_private_access_settings" "pas" { - private_access_settings_name 
= "${var.prefix}-pas-${random_string.suffix.result}" - region = var.google_region - public_access_enabled = false # Block public internet access - private_access_level = "ACCOUNT" # Apply to entire Databricks account -} diff --git a/modules/gcp-with-psc-exfiltration-protection/dns-hub.tf b/modules/gcp-with-psc-exfiltration-protection/dns-hub.tf deleted file mode 100644 index b0eca334..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/dns-hub.tf +++ /dev/null @@ -1,214 +0,0 @@ -######################################### -# Databricks Private DNS Configuration # -######################################### - -# Create a private DNS zone for Databricks PSC management -resource "google_dns_managed_zone" "hub_private_zone" { - name = "${var.prefix}-hub-gcp-databricks-com" - project = var.hub_vpc_google_project - dns_name = "gcp.databricks.com." - description = "Private DNS zone for Databricks PSC management" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# DNS A record for the Databricks workspace URL -resource "google_dns_record_set" "hub_workspace_url" { - name = "${local.workspace_dns_id}.${google_dns_managed_zone.hub_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.hub_private_zone.name - type = "A" - ttl = 300 - - # Points to the Databricks frontend Private Endpoint IP address - rrdatas = [ - google_compute_address.hub_frontend_pe_ip_address.address - ] -} - -# DNS A record for the Databricks PSC authentication endpoint -resource "google_dns_record_set" "hub_workspace_psc_auth" { - name = "${var.google_region}.psc-auth.${google_dns_managed_zone.hub_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.hub_private_zone.name - type = "A" - ttl = 300 - - # Points to the same frontend Private Endpoint IP - rrdatas = [ - 
google_compute_address.hub_frontend_pe_ip_address.address - ] -} - -# DNS A record for the Databricks dataplane endpoint -resource "google_dns_record_set" "hub_workspace_dp" { - name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.hub_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.hub_private_zone.name - type = "A" - ttl = 300 - - # Points to the same frontend Private Endpoint IP - rrdatas = [ - google_compute_address.hub_frontend_pe_ip_address.address - ] -} - -############################################# -# Google Container Registry Private DNS Zone # -############################################# - -# Create a private DNS zone for GCR (gcr.io) -resource "google_dns_managed_zone" "gcr_private_zone" { - name = "${var.prefix}-gcr-io" - project = var.hub_vpc_google_project - dns_name = "gcr.io." - description = "Private DNS zone for GCR private resolution" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Wildcard CNAME record for all subdomains of gcr.io -resource "google_dns_record_set" "gcr_cname" { - name = "*.${google_dns_managed_zone.gcr_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.gcr_private_zone.name - type = "CNAME" - ttl = 300 - - # All subdomains point to gcr.io - rrdatas = [ - "gcr.io." 
- ] -} - -# A record for gcr.io pointing to Google IPs for private access -resource "google_dns_record_set" "gcr_a" { - name = google_dns_managed_zone.gcr_private_zone.dns_name - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.gcr_private_zone.name - type = "A" - ttl = 300 - - # Official Google IPs for gcr.io - rrdatas = [ - "199.36.153.8", - "199.36.153.9", - "199.36.153.10", - "199.36.153.11" - ] -} - -################################## -# Google APIs Private DNS Zone # -################################## - -# Create a private DNS zone for Google APIs (googleapis.com) -resource "google_dns_managed_zone" "google_apis_private_zone" { - name = "${var.prefix}-google-apis" - project = var.hub_vpc_google_project - dns_name = "googleapis.com." - description = "Private DNS zone for Google APIs resolution" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Wildcard CNAME record for all subdomains of googleapis.com -resource "google_dns_record_set" "restricted_apis_cname" { - name = "*.${google_dns_managed_zone.google_apis_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.google_apis_private_zone.name - type = "CNAME" - ttl = 300 - - # All subdomains point to restricted.googleapis.com - rrdatas = [ - "restricted.googleapis.com." 
- ] -} - -# A record for restricted.googleapis.com pointing to Google IPs for private access -resource "google_dns_record_set" "restricted_apis_a" { - name = "restricted.${google_dns_managed_zone.google_apis_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.google_apis_private_zone.name - type = "A" - ttl = 300 - - # Official Google IPs for restricted.googleapis.com - rrdatas = [ - "199.36.153.4", - "199.36.153.5", - "199.36.153.6", - "199.36.153.7" - ] -} - -################################## -# Go Packages Private DNS Zone # -################################## - -# Create a private DNS zone for Go Packages (pkg.dev) -resource "google_dns_managed_zone" "pkg_dev_private_zone" { - name = "${var.prefix}-pkg-dev" - project = var.hub_vpc_google_project - dns_name = "pkg.dev." - description = "Private DNS zone for Go Packages resolution" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Wildcard CNAME record for all subdomains of pkg.dev -resource "google_dns_record_set" "pkg_dev_cname" { - name = "*.${google_dns_managed_zone.pkg_dev_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.pkg_dev_private_zone.name - type = "CNAME" - ttl = 300 - - # All subdomains point to pkg.dev - rrdatas = [ - "pkg.dev." 
- ] -} - -# A record for pkg.dev pointing to Google IPs for private access -resource "google_dns_record_set" "pkg_dev_a" { - name = google_dns_managed_zone.pkg_dev_private_zone.dns_name - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.pkg_dev_private_zone.name - type = "A" - ttl = 300 - - # Official Google IPs for pkg.dev - rrdatas = [ - "199.36.153.8", - "199.36.153.9", - "199.36.153.10", - "199.36.153.11" - ] -} diff --git a/modules/gcp-with-psc-exfiltration-protection/dns-spoke.tf b/modules/gcp-with-psc-exfiltration-protection/dns-spoke.tf deleted file mode 100644 index 799cd81a..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/dns-spoke.tf +++ /dev/null @@ -1,135 +0,0 @@ -############################################# -# Databricks Private DNS Zone (Spoke VPC) # -############################################# - -# Creates a private DNS managed zone for Databricks PSC endpoints -# This zone is only visible within the spoke VPC network -resource "google_dns_managed_zone" "spoke_private_zone" { - name = "${var.prefix}-spoke-gcp-databricks-com" - project = var.spoke_vpc_google_project - dns_name = "gcp.databricks.com." 
- description = "Private DNS zone for Databricks PSC management" - visibility = "private" - - # Restricts DNS zone visibility to the spoke VPC - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } -} - -# Creates an A record for the Databricks workspace endpoint in the spoke VPC -resource "google_dns_record_set" "spoke_workspace_url" { - name = "${local.workspace_dns_id}.${google_dns_managed_zone.spoke_private_zone.dns_name}" - project = var.spoke_vpc_google_project - managed_zone = google_dns_managed_zone.spoke_private_zone.name - type = "A" - ttl = 300 - - # Points to the Databricks frontend Private Endpoint IP in the spoke VPC - rrdatas = [ - google_compute_address.spoke_frontend_pe_ip_address.address - ] -} - -# Creates an A record for the Databricks dataplane endpoint in the spoke VPC -resource "google_dns_record_set" "spoke_workspace_dp" { - name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.spoke_private_zone.dns_name}" - project = var.spoke_vpc_google_project - managed_zone = google_dns_managed_zone.spoke_private_zone.name - type = "A" - ttl = 300 - - # Points to the Databricks frontend Private Endpoint IP in the spoke VPC - rrdatas = [ - google_compute_address.spoke_frontend_pe_ip_address.address - ] -} - -# Creates an A record for the Databricks relay/tunnel endpoint in the spoke VPC -resource "google_dns_record_set" "spoke_relay" { - name = "tunnel.${var.google_region}.${google_dns_managed_zone.spoke_private_zone.dns_name}" - project = var.spoke_vpc_google_project - managed_zone = google_dns_managed_zone.spoke_private_zone.name - type = "A" - ttl = 300 - - # Points to the backend Private Endpoint IP (used for relay/tunnel) - rrdatas = [ - google_compute_address.backend_pe_ip_address.address - ] -} - -########################################################## -# Peering DNS Zones for Hub-Spoke Shared Service Access # -########################################################## - -# The following 
managed zones provide private DNS for Google services (GCR, Google APIs, Go Packages) -# and are peered to the hub VPC for shared DNS resolution across VPCs. - -# Google Container Registry (GCR) private peering zone -resource "google_dns_managed_zone" "gcr_peering_zone" { - name = "${var.prefix}-peering-gcr" - project = var.spoke_vpc_google_project - dns_name = "gcr.io." - description = "Peering DNS zone for GCR private resolution" - visibility = "private" - - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } - - # Peers this DNS zone with the hub VPC to allow DNS resolution from the hub - peering_config { - target_network { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Google APIs private peering zone -resource "google_dns_managed_zone" "google_apis_peering_zone" { - name = "${var.prefix}-peering-google-apis" - project = var.spoke_vpc_google_project - dns_name = "googleapis.com." - description = "Private DNS zone for Google APIs resolution" - visibility = "private" - - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } - - # Peers this DNS zone with the hub VPC to allow DNS resolution from the hub - peering_config { - target_network { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Go Packages (pkg.dev) private peering zone -resource "google_dns_managed_zone" "pkg_dev_peering_zone" { - name = "${var.prefix}-peering-pkg-dev" - project = var.spoke_vpc_google_project - dns_name = "pkg.dev." 
- description = "Private DNS zone for Go Packages resolution" - visibility = "private" - - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } - - # Peers this DNS zone with the hub VPC to allow DNS resolution from the hub - peering_config { - target_network { - network_url = google_compute_network.hub_vpc.id - } - } -} diff --git a/modules/gcp-with-psc-exfiltration-protection/firewall-hub.tf b/modules/gcp-with-psc-exfiltration-protection/firewall-hub.tf deleted file mode 100644 index a1563a2e..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/firewall-hub.tf +++ /dev/null @@ -1,21 +0,0 @@ -# ========================================================== -# Google Cloud VPC Firewall Rule: Hub Network Ingress Traffic -# ========================================================== - -resource "google_compute_firewall" "hub_net_traffic" { - name = "${google_compute_network.hub_vpc.name}-ingress" - - project = var.hub_vpc_google_project - network = google_compute_network.hub_vpc.self_link - - direction = "INGRESS" - priority = 1000 - destination_ranges = [] - # The source IP range(s) allowed by this rule (CIDR format) - # Only traffic originating from the spoke VPC's CIDR block will be allowed - source_ranges = [var.spoke_vpc_cidr] - - allow { - protocol = "all" - } -} diff --git a/modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf b/modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf deleted file mode 100644 index a44c69a6..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf +++ /dev/null @@ -1,112 +0,0 @@ -############################################################# -# Google Cloud Firewall Rules for Databricks Spoke Network # -############################################################# - -# ========================================================== -# Default Egress Deny Rule (Catch-All Block) -# ========================================================== - 
-resource "google_compute_firewall" "default_deny_egress" { - name = "${google_compute_network.spoke_vpc.name}-default-deny-egress" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1100 # Higher priority than allow rules - destination_ranges = ["0.0.0.0/0"] # Block all external destinations - source_ranges = [] - - deny { protocol = "all" } # Explicit deny all outbound traffic -} - -# ========================================================== -# Essential Service Allow Rules -# ========================================================== - -# Allows outbound traffic to Google APIs and services -resource "google_compute_firewall" "to_google_apis" { - name = "${google_compute_network.spoke_vpc.name}-to-google-apis" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 # Lower priority than deny rule - destination_ranges = [ - "199.36.153.4/30", # Restricted Google APIs - "199.36.153.8/30", # GCR/GCS endpoints - "34.126.0.0/18" # Additional Google service IPs - ] - - allow { protocol = "all" } # Full protocol access to these IPs -} - -# Allows control plane communication for Databricks -resource "google_compute_firewall" "to_databricks_control_plane" { - name = "${google_compute_network.spoke_vpc.name}-to-databricks-control-plane" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 - destination_ranges = [ - "${google_compute_forwarding_rule.backend_psc_ep.ip_address}/32", # SCC endpoint - "${google_compute_forwarding_rule.spoke_frontend_psc_ep.ip_address}/32" # Frontend endpoint - ] - - allow { - protocol = "tcp" - ports = ["443"] # HTTPS only - } -} - -# ========================================================== -# Managed Hive Metastore Access (Conditional) -# 
========================================================== - -resource "google_compute_firewall" "to_managed_hive" { - name = "${google_compute_network.spoke_vpc.name}-to-${var.google_region}-managed-hive" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 - destination_ranges = ["${var.hive_metastore_ip}/32"] # Metastore-specific IP - - allow { - protocol = "tcp" - ports = ["3306"] # MySQL port - } -} - -# ========================================================== -# Internal Workspace Communication -# ========================================================== - -resource "google_compute_firewall" "databricks_workspace_traffic" { - name = "${google_compute_network.spoke_vpc.name}-${databricks_mws_workspaces.databricks_workspace.workspace_id}-ingress" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "INGRESS" - priority = 1000 - source_ranges = [var.spoke_vpc_cidr] # Internal VPC traffic - target_tags = ["databricks-${databricks_mws_workspaces.databricks_workspace.workspace_id}"] # Workspace-specific instances - - allow { protocol = "all" } # Full internal access -} - -resource "google_compute_firewall" "to_databricks_compute_plane" { - name = "${google_compute_network.spoke_vpc.name}-to-databricks-compute-plane" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 - destination_ranges = [ - var.spoke_vpc_cidr - ] - - allow { - protocol = "all" - } -} \ No newline at end of file diff --git a/modules/gcp-with-psc-exfiltration-protection/images/architecture.png b/modules/gcp-with-psc-exfiltration-protection/images/architecture.png deleted file mode 100644 index 9b245904..00000000 Binary files a/modules/gcp-with-psc-exfiltration-protection/images/architecture.png and /dev/null differ diff --git 
a/modules/gcp-with-psc-exfiltration-protection/outputs.tf b/modules/gcp-with-psc-exfiltration-protection/outputs.tf deleted file mode 100644 index 27983846..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/outputs.tf +++ /dev/null @@ -1,10 +0,0 @@ - -output "workspace_url" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url - description = "The workspace URL which is of the format '{workspaceId}.{random}.gcp.databricks.com'" -} - -output "workspace_id" { - description = "The Databricks workspace ID" - value = databricks_mws_workspaces.databricks_workspace.workspace_id -} \ No newline at end of file diff --git a/modules/gcp-with-psc-exfiltration-protection/psc.tf b/modules/gcp-with-psc-exfiltration-protection/psc.tf deleted file mode 100644 index e64b5436..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/psc.tf +++ /dev/null @@ -1,75 +0,0 @@ -######################################################### -# Private Service Connect (PSC) Internal Endpoints Setup -######################################################### - -# ---------------------------------------------------------------- -# Secure Cluster Connectivity (SCC) PSC Endpoint (Spoke VPC) -# ---------------------------------------------------------------- - -# Reserves an internal IP address for the backend (SCC) PSC endpoint in the spoke VPC -resource "google_compute_address" "backend_pe_ip_address" { - name = "${var.prefix}-psc-scc-ip-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - subnetwork = google_compute_subnetwork.psc_subnetwork.name - address_type = "INTERNAL" -} - -# Creates a forwarding rule to map the reserved IP to the SCC PSC service attachment -resource "google_compute_forwarding_rule" "backend_psc_ep" { - name = "${var.prefix}-psc-scc-ep-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - network = google_compute_network.spoke_vpc.id - ip_address 
= google_compute_address.backend_pe_ip_address.id - target = local.google_backend_psc_targets[var.google_region] - load_balancing_scheme = "" # Must be set to "" for service attachment targets -} - -# ---------------------------------------------------------------- -# Workspace Frontend PSC Endpoint (Spoke VPC) -# ---------------------------------------------------------------- - -# Reserves an internal IP address for the workspace frontend PSC endpoint in the spoke VPC -resource "google_compute_address" "spoke_frontend_pe_ip_address" { - name = "${var.prefix}-psc-ws-ip-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - subnetwork = google_compute_subnetwork.psc_subnetwork.name - address_type = "INTERNAL" -} - -# Creates a forwarding rule to map the reserved IP to the workspace frontend PSC service attachment -resource "google_compute_forwarding_rule" "spoke_frontend_psc_ep" { - name = "${var.prefix}-psc-ws-ep-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - network = google_compute_network.spoke_vpc.id - ip_address = google_compute_address.spoke_frontend_pe_ip_address.id - target = local.google_frontend_psc_targets[var.google_region] - load_balancing_scheme = "" # Must be set to "" for service attachment targets -} - -# ---------------------------------------------------------------- -# Workspace Frontend PSC Endpoint (Hub VPC) -# ---------------------------------------------------------------- - -# Reserves an internal IP address for the workspace frontend PSC endpoint in the hub VPC -resource "google_compute_address" "hub_frontend_pe_ip_address" { - name = "${var.prefix}-hub-psc-ws-ip-${random_string.suffix.result}" - project = var.hub_vpc_google_project - region = var.google_region - subnetwork = google_compute_subnetwork.hub_subnetwork.name - address_type = "INTERNAL" -} - -# Creates a forwarding rule to map the reserved IP to the workspace frontend PSC 
service attachment in the hub VPC -resource "google_compute_forwarding_rule" "hub_frontend_psc_ep" { - name = "${var.prefix}-hub-psc-ws-ep-${random_string.suffix.result}" - project = var.hub_vpc_google_project - region = var.google_region - network = google_compute_network.hub_vpc.id - ip_address = google_compute_address.hub_frontend_pe_ip_address.id - target = local.google_frontend_psc_targets[var.google_region] - load_balancing_scheme = "" # Must be set to "" for service attachment targets -} diff --git a/modules/gcp-with-psc-exfiltration-protection/variables.tf b/modules/gcp-with-psc-exfiltration-protection/variables.tf deleted file mode 100644 index cd96e520..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/variables.tf +++ /dev/null @@ -1,62 +0,0 @@ -variable "databricks_account_id" { - type = string - description = "Databricks Account ID" -} - -variable "google_region" { - type = string - description = "Google Cloud region where the resources will be created" -} - -variable "workspace_google_project" { - type = string - description = "Google Cloud project ID related to Databricks workspace" -} - -variable "spoke_vpc_google_project" { - type = string - description = "Google Cloud project ID related to Spoke VPC" -} - -variable "hub_vpc_google_project" { - type = string - description = "Google Cloud project ID related to Hub VPC" -} - -variable "is_spoke_vpc_shared" { - type = bool - description = "Whether the Spoke VPC is a Shared or a dedicated VPC" -} - -variable "prefix" { - type = string - description = "Prefix to use in generated resources name" -} - -# For the value of the regional Hive Metastore IP, refer to the Databricks documentation -# Here - https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore -variable "hive_metastore_ip" { - type = string - description = "Value of regional default Hive Metastore IP" -} - -variable "hub_vpc_cidr" { - type = string - description = "CIDR for Hub VPC" -} - 
-variable "spoke_vpc_cidr" { - type = string - description = "CIDR for Spoke VPC" -} - -variable "psc_subnet_cidr" { - type = string - description = "CIDR for Spoke VPC" -} - -variable "tags" { - description = "Map of tags to add to all resources" - type = map(string) -} - diff --git a/modules/gcp-with-psc-exfiltration-protection/vpc.tf b/modules/gcp-with-psc-exfiltration-protection/vpc.tf deleted file mode 100644 index 113c873d..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/vpc.tf +++ /dev/null @@ -1,90 +0,0 @@ -######################################################### -# Hub & Spoke Network Infrastructure Configuration -######################################################### - -# ======================================================= -# VPC Networks -# ======================================================= - -# Spoke VPC for Databricks workspace and workloads -resource "google_compute_network" "spoke_vpc" { - name = "${var.prefix}-spoke-vpc-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - auto_create_subnetworks = false # Manual subnet configuration - routing_mode = "GLOBAL" # Global routing for hybrid connectivity - bgp_best_path_selection_mode = "STANDARD" -} - -# Hub VPC for centralized networking services -resource "google_compute_network" "hub_vpc" { - name = "${var.prefix}-hub-vpc-${random_string.suffix.result}" - project = var.hub_vpc_google_project - auto_create_subnetworks = false - routing_mode = "GLOBAL" -} - -# ======================================================= -# Subnetwork Configuration -# ======================================================= - -# Primary spoke subnet for general workloads -resource "google_compute_subnetwork" "spoke_subnetwork" { - name = "${var.prefix}-spoke-subnet-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.id - region = var.google_region - ip_cidr_range = var.spoke_vpc_cidr - private_ip_google_access = 
true # Enables Private Google Access -} - -# Dedicated PSC subnet for Private Service Connect endpoints -resource "google_compute_subnetwork" "psc_subnetwork" { - name = "${var.prefix}-spoke-psc-subnet-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.id - region = var.google_region - ip_cidr_range = var.psc_subnet_cidr - private_ip_google_access = true -} - -# Hub subnet for shared services -resource "google_compute_subnetwork" "hub_subnetwork" { - name = "${var.prefix}-hub-subnet-${random_string.suffix.result}" - project = var.hub_vpc_google_project - network = google_compute_network.hub_vpc.id - region = var.google_region - ip_cidr_range = var.hub_vpc_cidr - private_ip_google_access = true -} - -# ======================================================= -# Network Peering Configuration -# ======================================================= - -# Bidirectional peering between hub and spoke VPCs -resource "google_compute_network_peering" "hub_spoke_peering" { - name = "${var.prefix}-hub-spoke-peering-${random_string.suffix.result}" - network = google_compute_network.hub_vpc.self_link - peer_network = google_compute_network.spoke_vpc.self_link -} - -resource "google_compute_network_peering" "spoke_hub_peering" { - name = "${var.prefix}-spoke-hub-peering-${random_string.suffix.result}" - network = google_compute_network.spoke_vpc.self_link - peer_network = google_compute_network.hub_vpc.self_link -} - -# ======================================================= -# Shared VPC Configuration (Conditional) -# ======================================================= - -resource "google_compute_shared_vpc_host_project" "host" { - count = var.workspace_google_project != var.spoke_vpc_google_project && var.is_spoke_vpc_shared ? 
1 : 0 - project = var.spoke_vpc_google_project -} - -resource "google_compute_shared_vpc_service_project" "service" { - count = var.workspace_google_project != var.spoke_vpc_google_project && var.is_spoke_vpc_shared ? 1 : 0 - host_project = google_compute_shared_vpc_host_project.host[0].project - service_project = var.workspace_google_project -} diff --git a/modules/gcp-with-psc-exfiltration-protection/workspace.tf b/modules/gcp-with-psc-exfiltration-protection/workspace.tf deleted file mode 100644 index 44dc510e..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/workspace.tf +++ /dev/null @@ -1,22 +0,0 @@ -######################################################### -# Databricks Workspace Configuration -######################################################### - -resource "databricks_mws_workspaces" "databricks_workspace" { - workspace_name = "${var.prefix}-ws-${random_string.suffix.result}" - - # Databricks account and cloud provider details - account_id = var.databricks_account_id - location = var.google_region # GCP region for workspace deployment - - # GCP project hosting workspace resources - cloud_resource_container { - gcp { - project_id = var.workspace_google_project - } - } - - # Network and security configurations - private_access_settings_id = databricks_mws_private_access_settings.pas.private_access_settings_id # Private access enforcement - network_id = databricks_mws_networks.databricks_network.network_id # Associated VPC network -} diff --git a/modules/gcp-workspace-basic/Makefile b/modules/gcp-workspace-basic/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-workspace-basic/Makefile +++ /dev/null @@ -1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . 
diff --git a/modules/gcp-workspace-basic/README.md b/modules/gcp-workspace-basic/README.md deleted file mode 100644 index cc6d6b12..00000000 --- a/modules/gcp-workspace-basic/README.md +++ /dev/null @@ -1,67 +0,0 @@ -gcp basic -========================= - -In this template, we show how to deploy a workspace with managed vpc. - - -## Requirements - -- You need to have run gcp-sa-provisionning and have a service account to fill in the variables. -- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The Service Account needs to be added as Databricks Admin in the account console - -## Run as an SA - -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. - - -## Run the tempalte - -- You need to fill in the variables.tf -- run `terraform init` -- run `teraform apply` - - -## Requirements - -No requirements. - -## Providers - -| Name | Version | -|------|---------| -| [databricks](#provider\_databricks) | n/a | -| [google](#provider\_google) | n/a | -| [random](#provider\_random) | n/a | - -## Modules - -No modules. 
- -## Resources - -| Name | Type | -|------|------| -| [databricks_mws_workspaces.databricks_workspace](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | -| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | -| [google_client_config.current](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source | -| [google_client_openid_userinfo.me](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_openid_userinfo) | data source | - -## Inputs - -| Name | Description | Type | Default | Required | -|------|-------------|------|---------|:--------:| -| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [workspace\_name](#input\_workspace\_name) | Name of the workspace to create | `string` | n/a | yes | - -## Outputs - -| Name | Description | -|------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | - diff --git a/modules/gcp-workspace-basic/init.tf b/modules/gcp-workspace-basic/init.tf deleted file mode 100644 index b07d3474..00000000 --- a/modules/gcp-workspace-basic/init.tf +++ /dev/null @@ -1,24 +0,0 @@ -terraform { - required_providers { - 
databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - } -} - -data "google_client_openid_userinfo" "me" { -} - - -data "google_client_config" "current" { -} - - -resource "random_string" "suffix" { - special = false - upper = false - length = 6 -} diff --git a/modules/gcp-workspace-basic/outputs.tf b/modules/gcp-workspace-basic/outputs.tf deleted file mode 100644 index d6b170a9..00000000 --- a/modules/gcp-workspace-basic/outputs.tf +++ /dev/null @@ -1,9 +0,0 @@ - -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url -} - -output "databricks_token" { - value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true -} diff --git a/modules/gcp-workspace-basic/variables.tf b/modules/gcp-workspace-basic/variables.tf deleted file mode 100644 index 5e94a563..00000000 --- a/modules/gcp-workspace-basic/variables.tf +++ /dev/null @@ -1,29 +0,0 @@ -variable "databricks_account_id" { - type = string - description = "Databricks Account ID" -} - -variable "google_project" { - type = string - description = "Google project for VCP/workspace deployment" -} - -variable "google_region" { - type = string - description = "Google region for VCP/workspace deployment" -} - -variable "prefix" { - type = string - description = "Prefix to use in generated VPC name" -} - -variable "workspace_name" { - type = string - description = "Name of the workspace to create" -} - -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) -} diff --git a/modules/gcp-workspace-basic/workspace.tf b/modules/gcp-workspace-basic/workspace.tf deleted file mode 100644 index 262d8a06..00000000 --- a/modules/gcp-workspace-basic/workspace.tf +++ /dev/null @@ -1,14 +0,0 @@ -resource 
"databricks_mws_workspaces" "databricks_workspace" { - account_id = var.databricks_account_id - workspace_name = var.workspace_name - - location = var.google_region - cloud_resource_container { - gcp { - project_id = var.google_project - } - } - token { - comment = "Terraform token" - } -} diff --git a/modules/gcp-workspace-byovpc/Makefile b/modules/gcp-workspace-byovpc/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-workspace-byovpc/Makefile +++ /dev/null @@ -1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-workspace-byovpc/README.md b/modules/gcp-workspace-byovpc/README.md deleted file mode 100644 index 0fbaf403..00000000 --- a/modules/gcp-workspace-byovpc/README.md +++ /dev/null @@ -1,75 +0,0 @@ -gcp byovpc -========================= - -In this template, we show how to deploy a workspace with a custom VPC. - - -## Requirements - -- You need to have run `gcp-sa-provisionning` module and have a service account to fill in the variables. -- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The sizing of the custom vpc subnets needs to be appropriate for the usage of the workspace. [This documentation covers it](https://docs.gcp.databricks.com/administration-guide/cloud-configurations/gcp/network-sizing.html) - -## Run as an SA - -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. - - -## Run the tempalte - -- You need to fill in the `variables.tf` -- run `terraform init` -- run `teraform apply` - - -## Requirements - -No requirements. 
- -## Providers - -| Name | Version | -|------|---------| -| [databricks](#provider\_databricks) | n/a | -| [google](#provider\_google) | n/a | -| [random](#provider\_random) | n/a | - -## Modules - -No modules. - -## Resources - -| Name | Type | -|------|------| -| [databricks_mws_networks.databricks_network](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource | -| [databricks_mws_workspaces.databricks_workspace](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | -| [google_compute_network.dbx_private_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | -| [google_compute_router.router](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router) | resource | -| [google_compute_router_nat.nat](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router_nat) | resource | -| [google_compute_subnetwork.network-with-private-secondary-ip-ranges](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | -| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | -| [google_client_config.current](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source | -| [google_client_openid_userinfo.me](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_openid_userinfo) | data source | - -## Inputs - -| Name | Description | Type | Default | Required | -|------|-------------|------|---------|:--------:| -| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of 
user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [nat\_name](#input\_nat\_name) | Name of the NAT service in compute router | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [router\_name](#input\_router\_name) | Name of the compute router to create | `string` | n/a | yes | -| [subnet\_ip\_cidr\_range](#input\_subnet\_ip\_cidr\_range) | IP Range for Nodes subnet (primary) | `string` | n/a | yes | -| [subnet\_name](#input\_subnet\_name) | Name of the subnet to create | `string` | n/a | yes | - -## Outputs - -| Name | Description | -|------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | - \ No newline at end of file diff --git a/modules/gcp-workspace-byovpc/init.tf b/modules/gcp-workspace-byovpc/init.tf deleted file mode 100644 index 103a33ee..00000000 --- a/modules/gcp-workspace-byovpc/init.tf +++ /dev/null @@ -1,22 +0,0 @@ -terraform { - required_providers { - databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - } -} - -data "google_client_openid_userinfo" "me" { -} - -data "google_client_config" "current" { -} - -resource "random_string" "suffix" { - special = false - upper = false - length = 6 -} diff --git a/modules/gcp-workspace-byovpc/outputs.tf b/modules/gcp-workspace-byovpc/outputs.tf deleted file mode 100644 index f544b3ba..00000000 --- a/modules/gcp-workspace-byovpc/outputs.tf +++ /dev/null @@ -1,8 +0,0 @@ -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url -} - -output "databricks_token" { 
- value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true -} \ No newline at end of file diff --git a/modules/gcp-workspace-byovpc/variables.tf b/modules/gcp-workspace-byovpc/variables.tf deleted file mode 100644 index 45650087..00000000 --- a/modules/gcp-workspace-byovpc/variables.tf +++ /dev/null @@ -1,45 +0,0 @@ -variable "databricks_account_id" { - type = string - description = "Databricks Account ID" -} - -variable "google_project" { - type = string - description = "Google project for VCP/workspace deployment" -} - -variable "google_region" { - type = string - description = "Google region for VCP/workspace deployment" -} - -variable "prefix" { - type = string - description = "Prefix to use in generated VPC name" -} - -# These three ranges need to be computed based on the workspace size (cf documentation) -variable "subnet_ip_cidr_range" { - type = string - description = "IP Range for Nodes subnet (primary)" -} - -variable "subnet_name" { - type = string - description = "Name of the subnet to create" -} - -variable "router_name" { - type = string - description = "Name of the compute router to create" -} - -variable "nat_name" { - type = string - description = "Name of the NAT service in compute router" -} - -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) -} diff --git a/modules/gcp-workspace-byovpc/vpc.tf b/modules/gcp-workspace-byovpc/vpc.tf deleted file mode 100644 index 31e8e808..00000000 --- a/modules/gcp-workspace-byovpc/vpc.tf +++ /dev/null @@ -1,40 +0,0 @@ -resource "google_compute_network" "dbx_private_vpc" { - project = var.google_project - name = "${var.prefix}-${random_string.suffix.result}" - auto_create_subnetworks = false -} - -resource "google_compute_subnetwork" "network-with-private-secondary-ip-ranges" { - 
name = var.subnet_name - ip_cidr_range = var.subnet_ip_cidr_range - region = var.google_region - network = google_compute_network.dbx_private_vpc.id - private_ip_google_access = true -} - -resource "google_compute_router" "router" { - name = var.router_name - region = google_compute_subnetwork.network-with-private-secondary-ip-ranges.region - network = google_compute_network.dbx_private_vpc.id -} - -resource "google_compute_router_nat" "nat" { - name = var.nat_name - router = google_compute_router.router.name - region = google_compute_router.router.region - nat_ip_allocate_option = "AUTO_ONLY" - source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" -} - -resource "databricks_mws_networks" "databricks_network" { - account_id = var.databricks_account_id - - network_name = "${var.prefix}-${random_string.suffix.result}" - - gcp_network_info { - network_project_id = var.google_project - vpc_id = google_compute_network.dbx_private_vpc.name - subnet_id = google_compute_subnetwork.network-with-private-secondary-ip-ranges.name - subnet_region = google_compute_subnetwork.network-with-private-secondary-ip-ranges.region - } -} diff --git a/modules/gcp-workspace-byovpc/workspace.tf b/modules/gcp-workspace-byovpc/workspace.tf deleted file mode 100644 index 0f7c7a0a..00000000 --- a/modules/gcp-workspace-byovpc/workspace.tf +++ /dev/null @@ -1,20 +0,0 @@ -resource "databricks_mws_workspaces" "databricks_workspace" { - account_id = var.databricks_account_id - workspace_name = "dbx-example-tf-deploy-${random_string.suffix.result}" - - location = var.google_region - cloud_resource_container { - gcp { - project_id = var.google_project - } - } - - network_id = databricks_mws_networks.databricks_network.network_id - - token { - comment = "Terraform token" - } - - # this makes sure that the NAT is created for outbound traffic before creating the workspace - depends_on = [google_compute_router_nat.nat] -} diff --git a/modules/gcp/Makefile b/modules/gcp/Makefile new file 
mode 100644 index 00000000..30b525d1 --- /dev/null +++ b/modules/gcp/Makefile @@ -0,0 +1,8 @@ +PROJECTS := $(dir $(wildcard */README.md)) + +docs: $(PROJECTS) + +$(PROJECTS): + $(MAKE) -C $@ docs + +.PHONY: $(PROJECTS) docs diff --git a/modules/gcp/account/Makefile b/modules/gcp/account/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/account/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/account/README.md b/modules/gcp/account/README.md new file mode 100644 index 00000000..c947a82a --- /dev/null +++ b/modules/gcp/account/README.md @@ -0,0 +1,85 @@ +# modules/gcp/account + +All `databricks_mws_*` resources for the GCP composer: `mws_networks`, `mws_workspaces`, `mws_vpc_endpoint`, `mws_private_access_settings`. + +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer). Direct consumption is supported but unusual. + +```hcl +module "account" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/account" + + prefix = "acme" + suffix = "abc123" + databricks_account_id = var.databricks_account_id + google_project = "my-workspace-project" + google_region = "us-central1" + vpc_source = "databricks_managed" +} +``` + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [databricks](#requirement\_databricks) | >= 1.0 | + +## Providers + +| Name | Version | +|------|---------| +| [databricks](#provider\_databricks) | 1.114.2 | + +## Modules + +No modules. 
+ +## Resources + +| Name | Type | +|------|------| +| [databricks_mws_networks.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource | +| [databricks_mws_private_access_settings.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_private_access_settings) | resource | +| [databricks_mws_vpc_endpoint.backend](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | +| [databricks_mws_vpc_endpoint.frontend](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | +| [databricks_mws_vpc_endpoint.transit](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | +| [databricks_mws_workspaces.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks account ID (GUID) where this workspace will be registered | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project ID hosting the workspace data plane | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region where the workspace will be deployed | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [suffix](#input\_suffix) | Random suffix appended to resource names for uniqueness (passed by the composer) | `string` | n/a | yes | +| [vpc\_source](#input\_vpc\_source) | One of: databricks\_managed (no mws\_networks), create (we built the VPC), existing (data-source lookup) | `string` | n/a | yes | +| [backend\_forwarding\_rule\_name](#input\_backend\_forwarding\_rule\_name) | Name of the 
backend (SCC) PSC forwarding rule from private-connectivity; gates backend mws\_vpc\_endpoint creation | `string` | `null` | no | +| [enable\_backend](#input\_enable\_backend) | Create the backend (SCC) mws\_vpc\_endpoint | `bool` | `false` | no | +| [enable\_frontend](#input\_enable\_frontend) | Create the frontend mws\_vpc\_endpoint (and, if hub\_frontend\_forwarding\_rule\_name is set, the transit endpoint) | `bool` | `false` | no | +| [frontend\_forwarding\_rule\_name](#input\_frontend\_forwarding\_rule\_name) | Name of the frontend PSC forwarding rule from private-connectivity; gates frontend mws\_vpc\_endpoint creation | `string` | `null` | no | +| [hub\_frontend\_forwarding\_rule\_name](#input\_hub\_frontend\_forwarding\_rule\_name) | Name of the hub-side frontend PSC forwarding rule from private-connectivity; gates transit mws\_vpc\_endpoint creation | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | GCP project hosting the hub VPC (used for the transit databricks\_mws\_vpc\_endpoint when restricted\_egress is enabled) | `string` | `null` | no | +| [nat\_dependency](#input\_nat\_dependency) | Opaque value (typically the Cloud NAT ID) used as depends\_on for the workspace to ensure NAT readiness before workspace creation | `any` | `null` | no | +| [private\_access\_only](#input\_private\_access\_only) | Create databricks\_mws\_private\_access\_settings with public\_access\_enabled=false and attach it to the workspace | `bool` | `false` | no | +| [spoke\_subnet\_name](#input\_spoke\_subnet\_name) | Name of the spoke subnet used in databricks\_mws\_networks.gcp\_network\_info.subnet\_id (null when vpc\_source=databricks\_managed) | `string` | `null` | no | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | GCP project hosting the spoke VPC (used in databricks\_mws\_networks.gcp\_network\_info.network\_project\_id) | `string` | `null` | no | +| [spoke\_vpc\_name](#input\_spoke\_vpc\_name) | Name 
of the spoke VPC used in databricks\_mws\_networks.gcp\_network\_info.vpc\_id (null when vpc\_source=databricks\_managed) | `string` | `null` | no | +| [workspace\_name](#input\_workspace\_name) | Optional workspace name override. Defaults to "prefix-ws-suffix" when null | `string` | `null` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [backend\_endpoint\_id](#output\_backend\_endpoint\_id) | Backend mws\_vpc\_endpoint ID (null when no PSC) | +| [frontend\_endpoint\_id](#output\_frontend\_endpoint\_id) | Frontend mws\_vpc\_endpoint ID (null when no PSC) | +| [network\_id](#output\_network\_id) | mws\_networks ID (null when databricks\_managed) | +| [private\_access\_settings\_id](#output\_private\_access\_settings\_id) | databricks\_mws\_private\_access\_settings ID (null when private\_access\_only=false) | +| [transit\_endpoint\_id](#output\_transit\_endpoint\_id) | Hub-side mws\_vpc\_endpoint ID (null when no hub) | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | + diff --git a/modules/gcp/account/locals.tf b/modules/gcp/account/locals.tf new file mode 100644 index 00000000..9a63069d --- /dev/null +++ b/modules/gcp/account/locals.tf @@ -0,0 +1,6 @@ +locals { + workspace_name = coalesce(var.workspace_name, "${var.prefix}-ws-${var.suffix}") + emit_mws_networks = var.vpc_source != "databricks_managed" + emit_vpc_endpoints = var.enable_frontend && var.enable_backend && var.frontend_forwarding_rule_name != null && var.backend_forwarding_rule_name != null + emit_pas = var.private_access_only +} diff --git a/modules/gcp/account/networks.tf b/modules/gcp/account/networks.tf new file mode 100644 index 00000000..e044bad7 --- /dev/null +++ b/modules/gcp/account/networks.tf @@ -0,0 +1,21 @@ +resource "databricks_mws_networks" "this" { + count = local.emit_mws_networks ? 
1 : 0 + + account_id = var.databricks_account_id + network_name = "${var.prefix}-ntw-${var.suffix}" + + gcp_network_info { + network_project_id = var.spoke_vpc_google_project + vpc_id = var.spoke_vpc_name + subnet_id = var.spoke_subnet_name + subnet_region = var.google_region + } + + dynamic "vpc_endpoints" { + for_each = local.emit_vpc_endpoints ? [1] : [] + content { + dataplane_relay = [databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id] + rest_api = [databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id] + } + } +} diff --git a/modules/gcp/account/outputs.tf b/modules/gcp/account/outputs.tf new file mode 100644 index 00000000..da6bf525 --- /dev/null +++ b/modules/gcp/account/outputs.tf @@ -0,0 +1,34 @@ +output "workspace_id" { + value = databricks_mws_workspaces.this.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = databricks_mws_workspaces.this.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + description = "mws_networks ID (null when databricks_managed)" +} + +output "frontend_endpoint_id" { + value = var.enable_frontend && var.frontend_forwarding_rule_name != null ? databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id : null + description = "Frontend mws_vpc_endpoint ID (null when no PSC)" +} + +output "backend_endpoint_id" { + value = var.enable_backend && var.backend_forwarding_rule_name != null ? databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id : null + description = "Backend mws_vpc_endpoint ID (null when no PSC)" +} + +output "transit_endpoint_id" { + value = var.enable_frontend && var.hub_frontend_forwarding_rule_name != null ? databricks_mws_vpc_endpoint.transit[0].vpc_endpoint_id : null + description = "Hub-side mws_vpc_endpoint ID (null when no hub)" +} + +output "private_access_settings_id" { + value = local.emit_pas ? 
databricks_mws_private_access_settings.this[0].private_access_settings_id : null + description = "databricks_mws_private_access_settings ID (null when private_access_only=false)" +} diff --git a/modules/gcp/account/pas.tf b/modules/gcp/account/pas.tf new file mode 100644 index 00000000..7ec78445 --- /dev/null +++ b/modules/gcp/account/pas.tf @@ -0,0 +1,9 @@ +resource "databricks_mws_private_access_settings" "this" { + count = local.emit_pas ? 1 : 0 + + account_id = var.databricks_account_id + private_access_settings_name = "${var.prefix}-pas-${var.suffix}" + region = var.google_region + public_access_enabled = false + private_access_level = "ACCOUNT" +} diff --git a/modules/gcp/account/tests/byovpc/main.tf b/modules/gcp/account/tests/byovpc/main.tf new file mode 100644 index 00000000..a85e293e --- /dev/null +++ b/modules/gcp/account/tests/byovpc/main.tf @@ -0,0 +1,27 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" +} diff --git a/modules/gcp/account/tests/databricks-managed/main.tf b/modules/gcp/account/tests/databricks-managed/main.tf new file mode 100644 index 00000000..79c6f0d9 --- /dev/null +++ b/modules/gcp/account/tests/databricks-managed/main.tf @@ -0,0 +1,24 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "databricks_managed" +} diff --git a/modules/gcp/account/tests/psc-with-pas/main.tf b/modules/gcp/account/tests/psc-with-pas/main.tf new file mode 100644 index 00000000..899efe32 --- /dev/null +++ b/modules/gcp/account/tests/psc-with-pas/main.tf @@ -0,0 +1,36 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + + frontend_forwarding_rule_name = "fixture-psc-ws-ep-abc123" + backend_forwarding_rule_name = "fixture-psc-scc-ep-abc123" + hub_frontend_forwarding_rule_name = "fixture-hub-psc-ws-ep-abc123" + + enable_frontend = true + enable_backend = true + private_access_only = true +} diff --git a/modules/gcp/account/variables.tf b/modules/gcp/account/variables.tf new file mode 100644 index 00000000..19f3693f --- /dev/null +++ b/modules/gcp/account/variables.tf @@ -0,0 +1,106 @@ +variable "prefix" { + type = string + description = "Prefix used to name generated resources" +} + +variable "suffix" { + type = string + description = "Random suffix appended to resource names for uniqueness (passed by the composer)" +} + +variable "workspace_name" { + type = string + default = null + description = "Optional workspace name override. Defaults to \"prefix-ws-suffix\" when null" +} + +variable "databricks_account_id" { + type = string + description = "Databricks account ID (GUID) where this workspace will be registered" +} + +variable "google_project" { + type = string + description = "GCP project ID hosting the workspace data plane" +} + +variable "google_region" { + type = string + description = "GCP region where the workspace will be deployed" +} + +variable "vpc_source" { + type = string + description = "One of: databricks_managed (no mws_networks), create (we built the VPC), existing (data-source lookup)" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." 
+ } +} + +variable "spoke_vpc_name" { + type = string + default = null + description = "Name of the spoke VPC used in databricks_mws_networks.gcp_network_info.vpc_id (null when vpc_source=databricks_managed)" +} + +variable "spoke_subnet_name" { + type = string + default = null + description = "Name of the spoke subnet used in databricks_mws_networks.gcp_network_info.subnet_id (null when vpc_source=databricks_managed)" +} + +variable "spoke_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the spoke VPC (used in databricks_mws_networks.gcp_network_info.network_project_id)" +} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC (used for the transit databricks_mws_vpc_endpoint when restricted_egress is enabled)" +} + +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_forwarding_rule_name" { + type = string + default = null + description = "Name of the frontend PSC forwarding rule from private-connectivity; gates frontend mws_vpc_endpoint creation" +} + +variable "backend_forwarding_rule_name" { + type = string + default = null + description = "Name of the backend (SCC) PSC forwarding rule from private-connectivity; gates backend mws_vpc_endpoint creation" +} + +variable "hub_frontend_forwarding_rule_name" { + type = string + default = null + description = "Name of the hub-side frontend PSC forwarding rule from private-connectivity; gates transit mws_vpc_endpoint creation" +} + +variable "enable_frontend" { + type = bool + default = false + description = "Create the frontend mws_vpc_endpoint (and, if hub_frontend_forwarding_rule_name is set, the transit endpoint)" +} + +variable "enable_backend" { + type = bool + default = false + description = "Create the backend (SCC) mws_vpc_endpoint" +} + +variable "private_access_only" { + type = bool + default = false + description = "Create 
databricks_mws_private_access_settings with public_access_enabled=false and attach it to the workspace" +} + +variable "nat_dependency" { + type = any + default = null + description = "Opaque value (typically the Cloud NAT ID) used as depends_on for the workspace to ensure NAT readiness before workspace creation" +} diff --git a/modules/gcp/account/versions.tf b/modules/gcp/account/versions.tf new file mode 100644 index 00000000..2d66ceef --- /dev/null +++ b/modules/gcp/account/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + } +} diff --git a/modules/gcp/account/vpc-endpoints.tf b/modules/gcp/account/vpc-endpoints.tf new file mode 100644 index 00000000..e4c073e1 --- /dev/null +++ b/modules/gcp/account/vpc-endpoints.tf @@ -0,0 +1,38 @@ +resource "databricks_mws_vpc_endpoint" "frontend" { + count = var.enable_frontend && var.frontend_forwarding_rule_name != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-ws-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.frontend_forwarding_rule_name + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "backend" { + count = var.enable_backend && var.backend_forwarding_rule_name != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-scc-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.backend_forwarding_rule_name + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "transit" { + count = var.enable_frontend && var.hub_frontend_forwarding_rule_name != null ? 
1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-hub-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.hub_vpc_google_project + psc_endpoint_name = var.hub_frontend_forwarding_rule_name + endpoint_region = var.google_region + } +} diff --git a/modules/gcp/account/workspace.tf b/modules/gcp/account/workspace.tf new file mode 100644 index 00000000..a367e3fb --- /dev/null +++ b/modules/gcp/account/workspace.tf @@ -0,0 +1,20 @@ +resource "databricks_mws_workspaces" "this" { + account_id = var.databricks_account_id + workspace_name = local.workspace_name + location = var.google_region + + cloud_resource_container { + gcp { + project_id = var.google_project + } + } + + network_id = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + private_access_settings_id = local.emit_pas ? databricks_mws_private_access_settings.this[0].private_access_settings_id : null + + token { + comment = "Terraform" + } + + depends_on = [var.nat_dependency] +} diff --git a/modules/gcp/databricks-workspace/Makefile b/modules/gcp/databricks-workspace/Makefile new file mode 100644 index 00000000..38b83c2e --- /dev/null +++ b/modules/gcp/databricks-workspace/Makefile @@ -0,0 +1,3 @@ +.PHONY: docs +docs: + terraform-docs -c ../../../.terraform-docs.yml . diff --git a/modules/gcp/databricks-workspace/README.md b/modules/gcp/databricks-workspace/README.md new file mode 100644 index 00000000..20f04ac3 --- /dev/null +++ b/modules/gcp/databricks-workspace/README.md @@ -0,0 +1,114 @@ +# GCP Databricks Workspace Composer + +This module creates a complete Databricks workspace on Google Cloud Platform with full networking, connectivity, and authentication management. 
+ +## Usage + +```hcl +module "workspace" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/databricks-workspace" + + prefix = "acme" + databricks_account_id = var.databricks_account_id + google_project = "my-workspace-project" + google_region = "us-central1" + + vpc_source = "databricks_managed" # or "create" / "existing" +} +``` + +See `examples/gcp-basic`, `examples/gcp-byovpc`, `examples/gcp-existing-vpc`, and `examples/gcp-with-psc-exfiltration-protection` for the four supported scenarios. + +## Components + +- **network**: VPC creation or integration (databricks_managed, create, or existing) +- **private_connectivity**: Private Service Connect (PSC) with optional frontend/backend +- **account**: Databricks MWS resources and workspace +- **dns**: Private DNS zones for restricted egress scenarios + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [databricks](#requirement\_databricks) | >= 1.0 | +| [google](#requirement\_google) | >= 4.0 | +| [null](#requirement\_null) | >= 3.0 | +| [random](#requirement\_random) | >= 3.0 | + +## Providers + +| Name | Version | +|------|---------| +| [null](#provider\_null) | 3.2.4 | +| [random](#provider\_random) | 3.8.1 | + +## Modules + +| Name | Source | Version | +|------|--------|---------| +| [account](#module\_account) | ../account | n/a | +| [dns](#module\_dns) | ../dns | n/a | +| [network](#module\_network) | ../network | n/a | +| [private\_connectivity](#module\_private\_connectivity) | ../private-connectivity | n/a | + +## Resources + +| Name | Type | +|------|------| +| [null_resource.preconditions](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource | +| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | 
+|------|-------------|------|---------|:--------:| +| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks account ID (GUID) where this workspace will be registered | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project ID hosting the workspace data plane | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region where the workspace will be deployed. When any private\_link\_* flag or restricted\_egress is true, the region must be supported by Databricks PSC (see preconditions.tf) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources (e.g. "acme" produces "acme-spoke-vpc-") | `string` | n/a | yes | +| [existing\_subnet\_name](#input\_existing\_subnet\_name) | Name of the pre-existing subnet to use (must be in google\_region). Required when vpc\_source=existing | `string` | `null` | no | +| [existing\_vpc\_name](#input\_existing\_vpc\_name) | Name of the pre-existing VPC to use. Required when vpc\_source=existing | `string` | `null` | no | +| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Regional Hive metastore IP used by the managed-hive allow rule. When null, the regional default is looked up internally; if no default exists for the region, the rule is skipped | `string` | `null` | no | +| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR of the hub subnet (e.g. 10.1.0.0/24). Required when restricted\_egress=true | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | GCP project hosting the hub VPC. Required when restricted\_egress=true | `string` | `null` | no | +| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | If true, bind the spoke VPC project as a Shared-VPC host and the workspace project as a service project. Only takes effect when restricted\_egress=true and the two projects differ | `bool` | `false` | no | +| [pod\_cidr](#input\_pod\_cidr) | Optional CIDR for the GKE pods secondary range. 
Adds a secondary\_ip\_range to the spoke subnet when set | `string` | `null` | no | +| [private\_access\_only](#input\_private\_access\_only) | Create databricks\_mws\_private\_access\_settings with public\_access\_enabled=false. Workspace becomes reachable only through PSC endpoints | `bool` | `false` | no | +| [private\_link\_backend](#input\_private\_link\_backend) | Create the backend (SCC, data plane) PSC endpoint and a backend databricks\_mws\_vpc\_endpoint | `bool` | `false` | no | +| [private\_link\_frontend](#input\_private\_link\_frontend) | Create the frontend (workspace UI/API) PSC endpoint and a frontend databricks\_mws\_vpc\_endpoint | `bool` | `false` | no | +| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR of the dedicated PSC subnet in the spoke VPC (e.g. 10.0.255.0/28). Required when restricted\_egress=true or any private\_link\_* flag is true | `string` | `null` | no | +| [restricted\_egress](#input\_restricted\_egress) | Create hub VPC + bidirectional peering + deny-egress firewall + private DNS zones. Requires vpc\_source=create and at least one private\_link\_* flag | `bool` | `false` | no | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR of the spoke VPC address space (e.g. 10.0.0.0/16). Required when vpc\_source=create; ignored otherwise | `string` | `null` | no | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | GCP project hosting the spoke VPC. Defaults to google\_project when null | `string` | `null` | no | +| [subnet\_cidr](#input\_subnet\_cidr) | CIDR of the spoke subnet primary range (e.g. 10.0.0.0/22). Required when vpc\_source=create | `string` | `null` | no | +| [svc\_cidr](#input\_svc\_cidr) | Optional CIDR for the GKE services secondary range. Adds a secondary\_ip\_range to the spoke subnet when set | `string` | `null` | no | +| [tags](#input\_tags) | Map of tags. 
Currently not propagated to child resources; reserved for future use | `map(string)` | `{}` | no | +| [vpc\_source](#input\_vpc\_source) | Where the workspace VPC comes from. One of: databricks\_managed (no networking module called), create (Terraform creates VPC + subnet + NAT), existing (data-source lookup) | `string` | `"databricks_managed"` | no | +| [workspace\_name](#input\_workspace\_name) | Optional workspace name override. Defaults to "prefix-ws-suffix" when null | `string` | `null` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [backend\_endpoint\_id](#output\_backend\_endpoint\_id) | Backend (SCC) mws\_vpc\_endpoint ID (null when private\_link\_backend=false) | +| [backend\_psc\_ip\_spoke](#output\_backend\_psc\_ip\_spoke) | IP address of the spoke-side backend PSC endpoint (null when no PSC) | +| [frontend\_endpoint\_id](#output\_frontend\_endpoint\_id) | Frontend mws\_vpc\_endpoint ID (null when private\_link\_frontend=false) | +| [frontend\_psc\_ip\_hub](#output\_frontend\_psc\_ip\_hub) | IP address of the hub-side frontend PSC endpoint (null when restricted\_egress=false) | +| [frontend\_psc\_ip\_spoke](#output\_frontend\_psc\_ip\_spoke) | IP address of the spoke-side frontend PSC endpoint (null when no PSC) | +| [google\_region](#output\_google\_region) | Region the workspace was deployed to (echo of input; convenient for downstream modules) | +| [hub\_vpc\_id](#output\_hub\_vpc\_id) | Hub VPC ID (null when restricted\_egress=false) | +| [hub\_vpc\_self\_link](#output\_hub\_vpc\_self\_link) | Hub VPC self-link (null when restricted\_egress=false) | +| [nat\_id](#output\_nat\_id) | Cloud NAT ID (null when vpc\_source != create) | +| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID (null when vpc\_source=databricks\_managed) | +| [private\_access\_settings\_id](#output\_private\_access\_settings\_id) | databricks\_mws\_private\_access\_settings ID (null when private\_access\_only=false) | +| 
[spoke\_subnet\_id](#output\_spoke\_subnet\_id) | Spoke subnet ID (null when vpc\_source=databricks\_managed) | +| [spoke\_subnet\_self\_link](#output\_spoke\_subnet\_self\_link) | Spoke subnet self-link (null when vpc\_source=databricks\_managed) | +| [spoke\_vpc\_id](#output\_spoke\_vpc\_id) | Spoke VPC ID (null when vpc\_source=databricks\_managed) | +| [spoke\_vpc\_self\_link](#output\_spoke\_vpc\_self\_link) | Spoke VPC self-link (null when vpc\_source=databricks\_managed) | +| [suffix](#output\_suffix) | Random suffix used in resource names (useful when wiring downstream modules) | +| [transit\_endpoint\_id](#output\_transit\_endpoint\_id) | Hub-side mws\_vpc\_endpoint ID (null when no hub or no frontend PSC) | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL (https://..gcp.databricks.com) | + \ No newline at end of file diff --git a/modules/gcp/databricks-workspace/locals.tf b/modules/gcp/databricks-workspace/locals.tf new file mode 100644 index 00000000..2bddf79c --- /dev/null +++ b/modules/gcp/databricks-workspace/locals.tf @@ -0,0 +1,8 @@ +locals { + databricks_managed = var.vpc_source == "databricks_managed" + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + any_private_link = var.private_link_frontend || var.private_link_backend + spoke_project = coalesce(var.spoke_vpc_google_project, var.google_project) +} diff --git a/modules/gcp/databricks-workspace/main.tf b/modules/gcp/databricks-workspace/main.tf new file mode 100644 index 00000000..dd70e016 --- /dev/null +++ b/modules/gcp/databricks-workspace/main.tf @@ -0,0 +1,100 @@ +module "network" { + source = "../network" + count = local.databricks_managed ? 
0 : 1 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + vpc_source = var.vpc_source + spoke_vpc_google_project = local.spoke_project + + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr + + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name + + create_hub = var.restricted_egress + hub_vpc_google_project = var.hub_vpc_google_project + hub_vpc_cidr = var.hub_vpc_cidr + is_spoke_vpc_shared = var.is_spoke_vpc_shared + workspace_google_project = var.google_project +} + +module "private_connectivity" { + source = "../private-connectivity" + count = local.any_private_link ? 1 : 0 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + spoke_vpc_cidr = var.spoke_vpc_cidr + + hub_vpc_id = var.restricted_egress ? module.network[0].hub_vpc_id : null + hub_vpc_self_link = var.restricted_egress ? module.network[0].hub_vpc_self_link : null + hub_vpc_google_project = var.hub_vpc_google_project + hub_subnet_name = var.restricted_egress ? module.network[0].hub_subnet_name : null + hub_vpc_cidr = var.hub_vpc_cidr + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + restrict_egress = var.restricted_egress + psc_subnet_cidr = var.psc_subnet_cidr + + hive_metastore_ip = var.hive_metastore_ip +} + +module "account" { + source = "../account" + + prefix = var.prefix + suffix = random_string.suffix.result + workspace_name = var.workspace_name + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + vpc_source = var.vpc_source + + spoke_vpc_name = local.databricks_managed ? 
null : module.network[0].spoke_vpc_name + spoke_subnet_name = local.databricks_managed ? null : module.network[0].spoke_subnet_name + spoke_vpc_google_project = local.spoke_project + hub_vpc_google_project = var.hub_vpc_google_project + + frontend_forwarding_rule_name = local.any_private_link ? module.private_connectivity[0].frontend_forwarding_rule_name : null + backend_forwarding_rule_name = local.any_private_link ? module.private_connectivity[0].backend_forwarding_rule_name : null + hub_frontend_forwarding_rule_name = local.any_private_link ? module.private_connectivity[0].hub_frontend_forwarding_rule_name : null + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + private_access_only = var.private_access_only + + nat_dependency = local.databricks_managed ? null : module.network[0].nat_id +} + +module "dns" { + source = "../dns" + count = var.restricted_egress ? 1 : 0 + + prefix = var.prefix + google_region = var.google_region + + hub_vpc_id = module.network[0].hub_vpc_id + hub_vpc_self_link = module.network[0].hub_vpc_self_link + hub_vpc_google_project = var.hub_vpc_google_project + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity[0].frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity[0].frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity[0].backend_psc_ip_spoke +} diff --git a/modules/gcp/databricks-workspace/outputs.tf b/modules/gcp/databricks-workspace/outputs.tf new file mode 100644 index 00000000..f3955042 --- /dev/null +++ b/modules/gcp/databricks-workspace/outputs.tf @@ -0,0 +1,99 @@ +# === Workspace =========================================================== +output "workspace_id" { + value = module.account.workspace_id + description = "Databricks workspace 
ID" +} + +output "workspace_url" { + value = module.account.workspace_url + description = "Databricks workspace URL (https://..gcp.databricks.com)" +} + +output "network_id" { + value = module.account.network_id + description = "databricks_mws_networks ID (null when vpc_source=databricks_managed)" +} + +output "private_access_settings_id" { + value = module.account.private_access_settings_id + description = "databricks_mws_private_access_settings ID (null when private_access_only=false)" +} + +# === mws_vpc_endpoint IDs (Databricks-side PSC registration) ============ +output "frontend_endpoint_id" { + value = module.account.frontend_endpoint_id + description = "Frontend mws_vpc_endpoint ID (null when private_link_frontend=false)" +} + +output "backend_endpoint_id" { + value = module.account.backend_endpoint_id + description = "Backend (SCC) mws_vpc_endpoint ID (null when private_link_backend=false)" +} + +output "transit_endpoint_id" { + value = module.account.transit_endpoint_id + description = "Hub-side mws_vpc_endpoint ID (null when no hub or no frontend PSC)" +} + +# === Network ============================================================= +output "spoke_vpc_id" { + value = local.databricks_managed ? null : module.network[0].spoke_vpc_id + description = "Spoke VPC ID (null when vpc_source=databricks_managed)" +} + +output "spoke_vpc_self_link" { + value = local.databricks_managed ? null : module.network[0].spoke_vpc_self_link + description = "Spoke VPC self-link (null when vpc_source=databricks_managed)" +} + +output "spoke_subnet_id" { + value = local.databricks_managed ? null : module.network[0].spoke_subnet_id + description = "Spoke subnet ID (null when vpc_source=databricks_managed)" +} + +output "spoke_subnet_self_link" { + value = local.databricks_managed ? null : module.network[0].spoke_subnet_self_link + description = "Spoke subnet self-link (null when vpc_source=databricks_managed)" +} + +output "hub_vpc_id" { + value = var.restricted_egress ? 
module.network[0].hub_vpc_id : null + description = "Hub VPC ID (null when restricted_egress=false)" +} + +output "hub_vpc_self_link" { + value = var.restricted_egress ? module.network[0].hub_vpc_self_link : null + description = "Hub VPC self-link (null when restricted_egress=false)" +} + +output "nat_id" { + value = local.create_vpc ? module.network[0].nat_id : null + description = "Cloud NAT ID (null when vpc_source != create)" +} + +# === Private connectivity =============================================== +output "frontend_psc_ip_spoke" { + value = local.any_private_link ? module.private_connectivity[0].frontend_psc_ip_spoke : null + description = "IP address of the spoke-side frontend PSC endpoint (null when no PSC)" +} + +output "backend_psc_ip_spoke" { + value = local.any_private_link ? module.private_connectivity[0].backend_psc_ip_spoke : null + description = "IP address of the spoke-side backend PSC endpoint (null when no PSC)" +} + +output "frontend_psc_ip_hub" { + value = var.restricted_egress ? module.private_connectivity[0].frontend_psc_ip_hub : null + description = "IP address of the hub-side frontend PSC endpoint (null when restricted_egress=false)" +} + +# === Identifiers ======================================================== +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names (useful when wiring downstream modules)" +} + +output "google_region" { + value = var.google_region + description = "Region the workspace was deployed to (echo of input; convenient for downstream modules)" +} diff --git a/modules/gcp/databricks-workspace/preconditions.tf b/modules/gcp/databricks-workspace/preconditions.tf new file mode 100644 index 00000000..770cfe8e --- /dev/null +++ b/modules/gcp/databricks-workspace/preconditions.tf @@ -0,0 +1,39 @@ +# Cross-variable preconditions. 
+resource "null_resource" "preconditions" { + lifecycle { + precondition { + condition = !var.restricted_egress || local.create_vpc + error_message = "restricted_egress=true requires vpc_source=\"create\" (hub-spoke topology needs us to own both VPCs)." + } + precondition { + condition = !var.restricted_egress || local.any_private_link + error_message = "restricted_egress=true requires at least one of private_link_frontend or private_link_backend." + } + precondition { + condition = !var.restricted_egress || (var.hub_vpc_google_project != null && var.hub_vpc_cidr != null && var.psc_subnet_cidr != null) + error_message = "restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, and psc_subnet_cidr." + } + precondition { + condition = !local.create_vpc || (var.spoke_vpc_cidr != null && var.subnet_cidr != null) + error_message = "vpc_source=\"create\" requires spoke_vpc_cidr and subnet_cidr." + } + precondition { + condition = !local.use_existing_vpc || (var.existing_vpc_name != null && var.existing_subnet_name != null) + error_message = "vpc_source=\"existing\" requires existing_vpc_name and existing_subnet_name." + } + precondition { + condition = !local.databricks_managed || (!var.private_link_frontend && !var.private_link_backend && !var.restricted_egress) + error_message = "vpc_source=\"databricks_managed\" forbids private_link_frontend, private_link_backend, and restricted_egress." + } + precondition { + condition = ( + !local.any_private_link && !var.restricted_egress + ) || contains([ + "asia-northeast1", "asia-south1", "asia-southeast1", "australia-southeast1", + "europe-west1", "europe-west2", "europe-west3", "northamerica-northeast1", + "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-west1", "us-west4" + ], var.google_region) + error_message = "google_region must be a region supported by Databricks PSC when any private_link_* flag or restricted_egress is true." 
+ } + } +} diff --git a/modules/gcp/databricks-workspace/random.tf b/modules/gcp/databricks-workspace/random.tf new file mode 100644 index 00000000..7c4efc8d --- /dev/null +++ b/modules/gcp/databricks-workspace/random.tf @@ -0,0 +1,9 @@ +resource "random_string" "suffix" { + length = 6 + special = false + upper = false + + lifecycle { + ignore_changes = [special, upper] + } +} diff --git a/modules/gcp/databricks-workspace/tests/basic/main.tf b/modules/gcp/databricks-workspace/tests/basic/main.tf new file mode 100644 index 00000000..2a96df8b --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/basic/main.tf @@ -0,0 +1,28 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" +} diff --git a/modules/gcp/databricks-workspace/tests/byovpc/main.tf b/modules/gcp/databricks-workspace/tests/byovpc/main.tf new file mode 100644 index 00000000..18ac0dc2 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/byovpc/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} diff --git a/modules/gcp/databricks-workspace/tests/existing-vpc/main.tf b/modules/gcp/databricks-workspace/tests/existing-vpc/main.tf new file mode 100644 index 00000000..a6b992c5 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/existing-vpc/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "existing" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} diff --git a/modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf b/modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf new file mode 100644 index 00000000..31ee19e4 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf @@ -0,0 +1,29 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: vpc_source="existing" requires 
existing_vpc_name + existing_subnet_name +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "existing" +} diff --git a/modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf b/modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf new file mode 100644 index 00000000..696e7641 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: vpc_source="databricks_managed" forbids private_link_frontend +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" + private_link_frontend = true +} diff --git a/modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf new file mode 100644 index 00000000..df6ca744 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: restricted_egress=true requires vpc_source="create" +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" + restricted_egress = true +} diff --git a/modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf new file mode 100644 index 00000000..f93d3cb4 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf @@ -0,0 +1,34 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, psc_subnet_cidr +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + private_link_frontend = true + private_link_backend = true + restricted_egress = true +} diff --git a/modules/gcp/databricks-workspace/tests/psc-isolated/main.tf b/modules/gcp/databricks-workspace/tests/psc-isolated/main.tf new file mode 100644 index 00000000..cc60d2ae --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/psc-isolated/main.tf @@ -0,0 +1,41 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + is_spoke_vpc_shared = true + hub_vpc_cidr = "10.1.0.0/24" + psc_subnet_cidr = "10.0.255.0/28" +} diff --git a/modules/gcp/databricks-workspace/variables.tf b/modules/gcp/databricks-workspace/variables.tf new file mode 100644 index 00000000..ea6e33f7 --- /dev/null +++ b/modules/gcp/databricks-workspace/variables.tf @@ -0,0 +1,143 @@ +# === Identity ============================================================ +variable "prefix" { + type = string + description = "Prefix used to name generated resources (e.g. \"acme\" produces \"acme-spoke-vpc-\")" +} + +variable "databricks_account_id" { + type = string + description = "Databricks account ID (GUID) where this workspace will be registered" +} + +variable "google_project" { + type = string + description = "GCP project ID hosting the workspace data plane" +} + +variable "google_region" { + type = string + description = "GCP region where the workspace will be deployed. When any private_link_* flag or restricted_egress is true, the region must be supported by Databricks PSC (see preconditions.tf)" +} + +variable "workspace_name" { + type = string + default = null + description = "Optional workspace name override. Defaults to \"prefix-ws-suffix\" when null" +} + +variable "tags" { + type = map(string) + default = {} + description = "Map of tags. 
Currently not propagated to child resources; reserved for future use" +} + +# === VPC source ========================================================== +variable "vpc_source" { + type = string + default = "databricks_managed" + description = "Where the workspace VPC comes from. One of: databricks_managed (no networking module called), create (Terraform creates VPC + subnet + NAT), existing (data-source lookup)" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." + } +} + +# When vpc_source = "create" +variable "spoke_vpc_cidr" { + type = string + default = null + description = "CIDR of the spoke VPC address space (e.g. 10.0.0.0/16). Required when vpc_source=create; ignored otherwise" +} + +variable "subnet_cidr" { + type = string + default = null + description = "CIDR of the spoke subnet primary range (e.g. 10.0.0.0/22). Required when vpc_source=create" +} + +variable "pod_cidr" { + type = string + default = null + description = "Optional CIDR for the GKE pods secondary range. Adds a secondary_ip_range to the spoke subnet when set" +} + +variable "svc_cidr" { + type = string + default = null + description = "Optional CIDR for the GKE services secondary range. Adds a secondary_ip_range to the spoke subnet when set" +} + +# When vpc_source = "existing" +variable "existing_vpc_name" { + type = string + default = null + description = "Name of the pre-existing VPC to use. Required when vpc_source=existing" +} + +variable "existing_subnet_name" { + type = string + default = null + description = "Name of the pre-existing subnet to use (must be in google_region). 
Required when vpc_source=existing" +} + +# === Connectivity feature flags ========================================== +variable "private_link_frontend" { + type = bool + default = false + description = "Create the frontend (workspace UI/API) PSC endpoint and a frontend databricks_mws_vpc_endpoint" +} + +variable "private_link_backend" { + type = bool + default = false + description = "Create the backend (SCC, data plane) PSC endpoint and a backend databricks_mws_vpc_endpoint" +} + +variable "private_access_only" { + type = bool + default = false + description = "Create databricks_mws_private_access_settings with public_access_enabled=false. Workspace becomes reachable only through PSC endpoints" +} + +variable "restricted_egress" { + type = bool + default = false + description = "Create hub VPC + bidirectional peering + deny-egress firewall + private DNS zones. Requires vpc_source=create and at least one private_link_* flag" +} + +# === Required when restricted_egress = true ============================== +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC. Required when restricted_egress=true" +} + +variable "spoke_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the spoke VPC. Defaults to google_project when null" +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false + description = "If true, bind the spoke VPC project as a Shared-VPC host and the workspace project as a service project. Only takes effect when restricted_egress=true and the two projects differ" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR of the hub subnet (e.g. 10.1.0.0/24). Required when restricted_egress=true" +} + +variable "psc_subnet_cidr" { + type = string + default = null + description = "CIDR of the dedicated PSC subnet in the spoke VPC (e.g. 10.0.255.0/28). 
Required when restricted_egress=true or any private_link_* flag is true" +} + +variable "hive_metastore_ip" { + type = string + default = null + description = "Regional Hive metastore IP used by the managed-hive allow rule. When null, the regional default is looked up internally; if no default exists for the region, the rule is skipped" +} diff --git a/modules/gcp/databricks-workspace/versions.tf b/modules/gcp/databricks-workspace/versions.tf new file mode 100644 index 00000000..ead1a86d --- /dev/null +++ b/modules/gcp/databricks-workspace/versions.tf @@ -0,0 +1,21 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + random = { + source = "hashicorp/random" + version = ">= 3.0" + } + null = { + source = "hashicorp/null" + version = ">= 3.0" + } + } +} diff --git a/modules/gcp/dns/Makefile b/modules/gcp/dns/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/dns/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/dns/README.md b/modules/gcp/dns/README.md new file mode 100644 index 00000000..150bda7a --- /dev/null +++ b/modules/gcp/dns/README.md @@ -0,0 +1,92 @@ +# modules/gcp/dns + +Private DNS zones (hub + spoke) used with restricted-egress workspaces. + +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer) when `restricted_egress=true`. Direct consumption is unusual; this module is terminal (no outputs). 
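
For the typical path described above, the DNS zones are created indirectly by setting `restricted_egress = true` on the composer. The following is a minimal sketch, not a tested deployment: the source path is assumed to follow the same repository pattern as the direct `dns` example, and the project IDs, account ID, and CIDRs are placeholders modeled on the composer's `psc-isolated` test fixture.

```hcl
# Sketch only: every value below is a placeholder, not a real deployment.
module "workspace" {
  # Assumed source path, following the same pattern as the direct dns example.
  source = "github.com/databricks/terraform-databricks-examples//modules/gcp/databricks-workspace"

  prefix                = "acme"
  databricks_account_id = "00000000-0000-0000-0000-000000000000" # placeholder GUID
  google_project        = "my-workspace-project"                 # hypothetical project ID
  google_region         = "us-central1"

  # restricted_egress requires vpc_source = "create" and at least one private_link_* flag.
  vpc_source     = "create"
  spoke_vpc_cidr = "10.0.0.0/16"
  subnet_cidr    = "10.0.0.0/22"

  private_link_frontend = true
  private_link_backend  = true
  restricted_egress     = true # the composer instantiates the dns module when this is true

  # Required by the composer's preconditions when restricted_egress = true.
  hub_vpc_google_project = "my-hub-project" # hypothetical project ID
  hub_vpc_cidr           = "10.1.0.0/24"
  psc_subnet_cidr        = "10.0.255.0/28"
}
```

With these flags set, the composer passes the hub and spoke VPC IDs, the workspace URL, and the PSC endpoint IPs to this module itself, so no separate `module "dns"` block is needed in the calling configuration.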
+ +```hcl +module "dns" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/dns" + + prefix = "acme" + google_region = "us-central1" + + hub_vpc_id = module.network.hub_vpc_id + hub_vpc_self_link = module.network.hub_vpc_self_link + hub_vpc_google_project = "my-hub-project" + + spoke_vpc_id = module.network.spoke_vpc_id + spoke_vpc_self_link = module.network.spoke_vpc_self_link + spoke_vpc_google_project = "my-spoke-project" + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity.frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity.frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity.backend_psc_ip_spoke +} +``` + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | + +## Providers + +| Name | Version | +|------|---------| +| [google](#provider\_google) | 7.31.0 | + +## Modules + +No modules. 
+ +## Resources + +| Name | Type | +|------|------| +| [google_dns_managed_zone.gcr](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.google_apis](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.hub_dbx](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.pkg_dev](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.spoke_dbx](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_record_set.gcr_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.gcr_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.google_apis_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.google_apis_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.hub_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.hub_psc_auth](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.hub_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.pkg_dev_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| 
[google_dns_record_set.pkg_dev_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
+| [google_dns_record_set.spoke_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
+| [google_dns_record_set.spoke_tunnel](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
+| [google_dns_record_set.spoke_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+|------|-------------|------|---------|:--------:|
+| [backend\_psc\_ip\_spoke](#input\_backend\_psc\_ip\_spoke) | Spoke-side backend (SCC) PSC endpoint IP (used in the spoke tunnel.{region}.gcp.databricks.com A record) | `string` | n/a | yes |
+| [frontend\_psc\_ip\_spoke](#input\_frontend\_psc\_ip\_spoke) | Spoke-side frontend PSC endpoint IP (used in the spoke gcp.databricks.com A records) | `string` | n/a | yes |
+| [google\_region](#input\_google\_region) | GCP region (used in the spoke tunnel DNS record name) | `string` | n/a | yes |
+| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | GCP project hosting the hub VPC (used for the hub DNS zones) | `string` | n/a | yes |
+| [hub\_vpc\_id](#input\_hub\_vpc\_id) | ID of the hub VPC (DNS zones with this VPC's visibility) | `string` | n/a | yes |
+| [hub\_vpc\_self\_link](#input\_hub\_vpc\_self\_link) | Self-link of the hub VPC | `string` | n/a | yes |
+| [prefix](#input\_prefix) | Prefix used to name generated DNS managed zones | `string` | n/a | yes |
+| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | GCP project hosting the spoke VPC (used for the spoke DNS zone) | `string` | n/a | yes |
+| [spoke\_vpc\_id](#input\_spoke\_vpc\_id) | ID of the spoke VPC (DNS zone with this VPC's visibility) | `string` | n/a | yes |
+| 
[spoke\_vpc\_self\_link](#input\_spoke\_vpc\_self\_link) | Self-link of the spoke VPC | `string` | n/a | yes | +| [workspace\_url](#input\_workspace\_url) | Workspace URL from databricks\_mws\_workspaces; used to extract the workspace DNS ID via regex | `string` | n/a | yes | +| [frontend\_psc\_ip\_hub](#input\_frontend\_psc\_ip\_hub) | Hub-side frontend PSC endpoint IP (used in the hub gcp.databricks.com A records) | `string` | `null` | no | + +## Outputs + +No outputs. + diff --git a/modules/gcp/dns/hub.tf b/modules/gcp/dns/hub.tf new file mode 100644 index 00000000..77f6b503 --- /dev/null +++ b/modules/gcp/dns/hub.tf @@ -0,0 +1,140 @@ +# === gcp.databricks.com (hub) ============================================ +resource "google_dns_managed_zone" "hub_dbx" { + name = "${var.prefix}-hub-gcp-databricks-com" + project = var.hub_vpc_google_project + dns_name = "gcp.databricks.com." + description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "hub_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_psc_auth" { + name = "${var.google_region}.psc-auth.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +# === gcr.io 
============================================================== +resource "google_dns_managed_zone" "gcr" { + name = "${var.prefix}-gcr-io" + project = var.hub_vpc_google_project + dns_name = "gcr.io." + description = "Private DNS zone for GCR private resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "gcr_cname" { + name = "*.${google_dns_managed_zone.gcr.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "CNAME" + ttl = 300 + rrdatas = ["gcr.io."] +} + +resource "google_dns_record_set" "gcr_a" { + name = google_dns_managed_zone.gcr.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} + +# === googleapis.com ====================================================== +resource "google_dns_managed_zone" "google_apis" { + name = "${var.prefix}-google-apis" + project = var.hub_vpc_google_project + dns_name = "googleapis.com." 
+ description = "Private DNS zone for Google APIs resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "google_apis_cname" { + name = "*.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "CNAME" + ttl = 300 + rrdatas = ["restricted.googleapis.com."] +} + +resource "google_dns_record_set" "google_apis_a" { + name = "restricted.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7"] +} + +# === pkg.dev ============================================================= +resource "google_dns_managed_zone" "pkg_dev" { + name = "${var.prefix}-pkg-dev" + project = var.hub_vpc_google_project + dns_name = "pkg.dev." 
+  description = "Private DNS zone for Artifact Registry (pkg.dev) resolution"
+  visibility  = "private"
+
+  private_visibility_config {
+    networks {
+      network_url = var.hub_vpc_id
+    }
+  }
+}
+
+resource "google_dns_record_set" "pkg_dev_cname" {
+  name         = "*.${google_dns_managed_zone.pkg_dev.dns_name}"
+  project      = var.hub_vpc_google_project
+  managed_zone = google_dns_managed_zone.pkg_dev.name
+  type         = "CNAME"
+  ttl          = 300
+  rrdatas      = ["pkg.dev."]
+}
+
+resource "google_dns_record_set" "pkg_dev_a" {
+  name         = google_dns_managed_zone.pkg_dev.dns_name
+  project      = var.hub_vpc_google_project
+  managed_zone = google_dns_managed_zone.pkg_dev.name
+  type         = "A"
+  ttl          = 300
+  rrdatas      = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"]
+}
diff --git a/modules/gcp/dns/locals.tf b/modules/gcp/dns/locals.tf
new file mode 100644
index 00000000..9e2c4fd2
--- /dev/null
+++ b/modules/gcp/dns/locals.tf
@@ -0,0 +1,4 @@
+locals {
+  # Regex extracts the workspace DNS id (numeric.numeric) from the URL.
+  workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url)
+}
diff --git a/modules/gcp/dns/outputs.tf b/modules/gcp/dns/outputs.tf
new file mode 100644
index 00000000..19cbc3d5
--- /dev/null
+++ b/modules/gcp/dns/outputs.tf
@@ -0,0 +1 @@
+# This module has no outputs; DNS records are terminal.
diff --git a/modules/gcp/dns/spoke.tf b/modules/gcp/dns/spoke.tf
new file mode 100644
index 00000000..25c3bc2e
--- /dev/null
+++ b/modules/gcp/dns/spoke.tf
@@ -0,0 +1,41 @@
+# === gcp.databricks.com (spoke) ==========================================
+resource "google_dns_managed_zone" "spoke_dbx" {
+  name        = "${var.prefix}-spoke-gcp-databricks-com"
+  project     = var.spoke_vpc_google_project
+  dns_name    = "gcp.databricks.com."
+ description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.spoke_vpc_id + } + } +} + +resource "google_dns_record_set" "spoke_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_tunnel" { + name = "tunnel.${var.google_region}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.backend_psc_ip_spoke] +} diff --git a/modules/gcp/dns/tests/hub-and-spoke/main.tf b/modules/gcp/dns/tests/hub-and-spoke/main.tf new file mode 100644 index 00000000..baebc0dd --- /dev/null +++ b/modules/gcp/dns/tests/hub-and-spoke/main.tf @@ -0,0 +1,29 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "dns" { + source = "../.." 
+ + prefix = "fixture" + google_region = "us-central1" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + + workspace_url = "https://1234567890123456.7.gcp.databricks.com" + + frontend_psc_ip_spoke = "10.0.255.4" + frontend_psc_ip_hub = "10.1.0.10" + backend_psc_ip_spoke = "10.0.255.5" +} diff --git a/modules/gcp/dns/variables.tf b/modules/gcp/dns/variables.tf new file mode 100644 index 00000000..dc877b17 --- /dev/null +++ b/modules/gcp/dns/variables.tf @@ -0,0 +1,64 @@ +variable "prefix" { + type = string + description = "Prefix used to name generated DNS managed zones" +} + +variable "google_region" { + type = string + description = "GCP region (used in the spoke tunnel DNS record name)" +} + +# Hub +variable "hub_vpc_id" { + type = string + description = "ID of the hub VPC (DNS zones with this VPC's visibility)" +} + +variable "hub_vpc_self_link" { + type = string + description = "Self-link of the hub VPC" +} + +variable "hub_vpc_google_project" { + type = string + description = "GCP project hosting the hub VPC (used for the hub DNS zones)" +} + +# Spoke +variable "spoke_vpc_id" { + type = string + description = "ID of the spoke VPC (DNS zone with this VPC's visibility)" +} + +variable "spoke_vpc_self_link" { + type = string + description = "Self-link of the spoke VPC" +} + +variable "spoke_vpc_google_project" { + type = string + description = "GCP project hosting the spoke VPC (used for the spoke DNS zone)" +} + +# Workspace +variable "workspace_url" { + type = string + description = "Workspace URL from databricks_mws_workspaces; used to extract the workspace DNS ID via regex" 
+}
+
+# PSC IPs
+variable "frontend_psc_ip_spoke" {
+  type        = string
+  description = "Spoke-side frontend PSC endpoint IP (used in the spoke gcp.databricks.com A records)"
+}
+
+variable "frontend_psc_ip_hub" {
+  type        = string
+  default     = null
+  description = "Hub-side frontend PSC endpoint IP (used in the hub gcp.databricks.com A records)"
+}
+
+variable "backend_psc_ip_spoke" {
+  type        = string
+  description = "Spoke-side backend (SCC) PSC endpoint IP (used in the spoke tunnel.{region}.gcp.databricks.com A record)"
+}
diff --git a/modules/gcp/dns/versions.tf b/modules/gcp/dns/versions.tf
new file mode 100644
index 00000000..de067e7d
--- /dev/null
+++ b/modules/gcp/dns/versions.tf
@@ -0,0 +1,9 @@
+terraform {
+  required_version = ">= 1.5"
+  required_providers {
+    google = {
+      source  = "hashicorp/google"
+      version = ">= 4.0"
+    }
+  }
+}
diff --git a/modules/gcp/network/Makefile b/modules/gcp/network/Makefile
new file mode 100644
index 00000000..17b32ec8
--- /dev/null
+++ b/modules/gcp/network/Makefile
@@ -0,0 +1,7 @@
+.PHONY: docs test_docs
+
+docs:
+	terraform-docs -c ../../../.terraform-docs.yml .
+
+test_docs:
+	terraform-docs -c ../../../.terraform-docs.yml --output-check .
diff --git a/modules/gcp/network/README.md b/modules/gcp/network/README.md
new file mode 100644
index 00000000..2293f435
--- /dev/null
+++ b/modules/gcp/network/README.md
@@ -0,0 +1,95 @@
+# modules/gcp/network
+
+VPC, subnet, router, NAT, peering, and Shared-VPC binding for the Databricks GCP composer.
+
+## Usage
+
+Typically called by `modules/gcp/databricks-workspace` (the composer). Direct consumption is supported but unusual; you'll need to wire the outputs yourself.
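The module also supports `vpc_source = "existing"`, in which case the spoke VPC and subnet are looked up through data sources rather than created. A minimal sketch of that path, mirroring the `tests/existing` fixture; the project, VPC, and subnet names are placeholders, not values the module requires:

```hcl
module "network" {
  source = "github.com/databricks/terraform-databricks-examples//modules/gcp/network"

  prefix        = "acme"
  suffix        = "abc123"
  google_region = "us-central1"

  # Look up the VPC and subnet instead of creating them.
  vpc_source               = "existing"
  spoke_vpc_google_project = "my-project"         # placeholder project ID
  existing_vpc_name        = "preexisting-vpc"    # placeholder VPC name
  existing_subnet_name     = "preexisting-subnet" # placeholder subnet name
}
```

On this path the router and Cloud NAT are not created, so the `nat_id` output is null, while the `spoke_*` outputs resolve to the looked-up VPC and subnet.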
+ +```hcl +module "network" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/network" + + prefix = "acme" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "my-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} +``` + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | + +## Providers + +| Name | Version | +|------|---------| +| [google](#provider\_google) | 6.46.0 | + +## Modules + +No modules. + +## Resources + +| Name | Type | +|------|------| +| [google_compute_network.hub_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | +| [google_compute_network.spoke_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | +| [google_compute_network_peering.hub_to_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource | +| [google_compute_network_peering.spoke_to_hub](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource | +| [google_compute_router.router](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router) | resource | +| [google_compute_router_nat.nat](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router_nat) | resource | +| [google_compute_shared_vpc_host_project.host](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_host_project) | resource | +| [google_compute_shared_vpc_service_project.service](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_service_project) | resource | +| 
[google_compute_subnetwork.hub_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | +| [google_compute_subnetwork.spoke_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | +| [google_compute_network.existing_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_network) | data source | +| [google_compute_subnetwork.existing_spoke_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_subnetwork) | data source | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [google\_region](#input\_google\_region) | GCP region for all network resources | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix for generated resource names | `string` | n/a | yes | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | GCP project hosting the spoke VPC | `string` | n/a | yes | +| [suffix](#input\_suffix) | Random suffix passed by the composer for uniqueness | `string` | n/a | yes | +| [vpc\_source](#input\_vpc\_source) | Either 'create' (Terraform creates a VPC) or 'existing' (data-source lookup) | `string` | n/a | yes | +| [create\_hub](#input\_create\_hub) | Create a hub VPC + subnet + peering with the spoke. Composer passes restricted\_egress here. 
| `bool` | `false` | no | +| [existing\_subnet\_name](#input\_existing\_subnet\_name) | Name of pre-existing subnet (required when vpc\_source=existing) | `string` | `null` | no | +| [existing\_vpc\_name](#input\_existing\_vpc\_name) | Name of pre-existing VPC (required when vpc\_source=existing) | `string` | `null` | no | +| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for the hub subnet (required when create\_hub=true) | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | GCP project hosting the hub VPC (required when create\_hub=true) | `string` | `null` | no | +| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | If true, bind the spoke VPC's project as a Shared-VPC host and the workspace project as a service project | `bool` | `false` | no | +| [pod\_cidr](#input\_pod\_cidr) | GKE secondary range for pods (optional) | `string` | `null` | no | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for the spoke subnet primary range (required when vpc\_source=create) | `string` | `null` | no | +| [subnet\_cidr](#input\_subnet\_cidr) | CIDR for the spoke subnet (required when vpc\_source=create) | `string` | `null` | no | +| [subnet\_name](#input\_subnet\_name) | Override for spoke subnet name (default: "{prefix}-subnet-{suffix}") | `string` | `null` | no | +| [svc\_cidr](#input\_svc\_cidr) | GKE secondary range for services (optional) | `string` | `null` | no | +| [workspace\_google\_project](#input\_workspace\_google\_project) | Workspace project (used for Shared-VPC service binding) | `string` | `null` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [hub\_subnet\_name](#output\_hub\_subnet\_name) | Name of the hub subnet (null when create\_hub=false) | +| [hub\_vpc\_id](#output\_hub\_vpc\_id) | ID of the hub VPC (null when create\_hub=false) | +| [hub\_vpc\_name](#output\_hub\_vpc\_name) | Name of the hub VPC (null when create\_hub=false) | +| 
[hub\_vpc\_self\_link](#output\_hub\_vpc\_self\_link) | Self-link of the hub VPC (null when create\_hub=false) | +| [nat\_id](#output\_nat\_id) | ID of the Cloud NAT (null when vpc\_source=existing) | +| [spoke\_subnet\_id](#output\_spoke\_subnet\_id) | ID of the spoke subnet | +| [spoke\_subnet\_name](#output\_spoke\_subnet\_name) | Name of the spoke subnet | +| [spoke\_subnet\_self\_link](#output\_spoke\_subnet\_self\_link) | Self-link of the spoke subnet | +| [spoke\_vpc\_id](#output\_spoke\_vpc\_id) | ID of the spoke VPC | +| [spoke\_vpc\_name](#output\_spoke\_vpc\_name) | Name of the spoke VPC | +| [spoke\_vpc\_self\_link](#output\_spoke\_vpc\_self\_link) | Self-link of the spoke VPC | + diff --git a/modules/gcp/network/data.tf b/modules/gcp/network/data.tf new file mode 100644 index 00000000..93a4dad8 --- /dev/null +++ b/modules/gcp/network/data.tf @@ -0,0 +1,14 @@ +data "google_compute_network" "existing_spoke" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_vpc_name + project = var.spoke_vpc_google_project +} + +data "google_compute_subnetwork" "existing_spoke_subnet" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_subnet_name + project = var.spoke_vpc_google_project + region = var.google_region +} diff --git a/modules/gcp/network/locals.tf b/modules/gcp/network/locals.tf new file mode 100644 index 00000000..fd060acb --- /dev/null +++ b/modules/gcp/network/locals.tf @@ -0,0 +1,6 @@ +locals { + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + subnet_name = coalesce(var.subnet_name, "${var.prefix}-subnet-${var.suffix}") +} diff --git a/modules/gcp/network/nat.tf b/modules/gcp/network/nat.tf new file mode 100644 index 00000000..1f448b17 --- /dev/null +++ b/modules/gcp/network/nat.tf @@ -0,0 +1,19 @@ +resource "google_compute_router" "router" { + count = local.create_vpc ? 
1 : 0 + + name = "${var.prefix}-router-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = google_compute_network.spoke_vpc[0].id +} + +resource "google_compute_router_nat" "nat" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-nat-${var.suffix}" + project = var.spoke_vpc_google_project + router = google_compute_router.router[0].name + region = var.google_region + nat_ip_allocate_option = "AUTO_ONLY" + source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" +} diff --git a/modules/gcp/network/outputs.tf b/modules/gcp/network/outputs.tf new file mode 100644 index 00000000..958fe271 --- /dev/null +++ b/modules/gcp/network/outputs.tf @@ -0,0 +1,66 @@ +output "spoke_vpc_id" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].id : ( + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].id : null + ) + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].name : ( + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].name : null + ) + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : ( + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].self_link : null + ) + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].id : ( + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].id : null + ) + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].name : ( + local.use_existing_vpc ? 
data.google_compute_subnetwork.existing_spoke_subnet[0].name : null + ) + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].self_link : ( + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].self_link : null + ) + description = "Self-link of the spoke subnet" +} + +output "nat_id" { + value = local.create_vpc ? google_compute_router_nat.nat[0].id : null + description = "ID of the Cloud NAT (null when vpc_source=existing)" +} + +output "hub_vpc_id" { + value = var.create_hub ? google_compute_network.hub_vpc[0].id : null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = var.create_hub ? google_compute_network.hub_vpc[0].name : null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = var.create_hub ? google_compute_network.hub_vpc[0].self_link : null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = var.create_hub ? google_compute_subnetwork.hub_subnet[0].name : null + description = "Name of the hub subnet (null when create_hub=false)" +} diff --git a/modules/gcp/network/peering.tf b/modules/gcp/network/peering.tf new file mode 100644 index 00000000..60337507 --- /dev/null +++ b/modules/gcp/network/peering.tf @@ -0,0 +1,15 @@ +resource "google_compute_network_peering" "hub_to_spoke" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-spoke-${var.suffix}" + network = google_compute_network.hub_vpc[0].self_link + peer_network = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link +} + +resource "google_compute_network_peering" "spoke_to_hub" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-spoke-hub-${var.suffix}" + network = local.create_vpc ? 
google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link + peer_network = google_compute_network.hub_vpc[0].self_link +} diff --git a/modules/gcp/network/shared-vpc.tf b/modules/gcp/network/shared-vpc.tf new file mode 100644 index 00000000..af064902 --- /dev/null +++ b/modules/gcp/network/shared-vpc.tf @@ -0,0 +1,12 @@ +resource "google_compute_shared_vpc_host_project" "host" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + project = var.spoke_vpc_google_project +} + +resource "google_compute_shared_vpc_service_project" "service" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + host_project = google_compute_shared_vpc_host_project.host[0].project + service_project = var.workspace_google_project +} diff --git a/modules/gcp/network/subnets.tf b/modules/gcp/network/subnets.tf new file mode 100644 index 00000000..d08aa551 --- /dev/null +++ b/modules/gcp/network/subnets.tf @@ -0,0 +1,39 @@ +# === Spoke subnet ======================================================= +resource "google_compute_subnetwork" "spoke_subnet" { + count = local.create_vpc ? 1 : 0 + + name = local.subnet_name + project = var.spoke_vpc_google_project + network = google_compute_network.spoke_vpc[0].id + region = var.google_region + ip_cidr_range = var.subnet_cidr + private_ip_google_access = true + + dynamic "secondary_ip_range" { + for_each = var.pod_cidr != null ? [1] : [] + content { + range_name = "pods" + ip_cidr_range = var.pod_cidr + } + } + + dynamic "secondary_ip_range" { + for_each = var.svc_cidr != null ? [1] : [] + content { + range_name = "services" + ip_cidr_range = var.svc_cidr + } + } +} + +# === Hub subnet ========================================================= +resource "google_compute_subnetwork" "hub_subnet" { + count = var.create_hub ? 
1 : 0 + + name = "${var.prefix}-hub-subnet-${var.suffix}" + project = var.hub_vpc_google_project + network = google_compute_network.hub_vpc[0].id + region = var.google_region + ip_cidr_range = var.hub_vpc_cidr + private_ip_google_access = true +} diff --git a/modules/gcp/network/tests/create-with-hub/main.tf b/modules/gcp/network/tests/create-with-hub/main.tf new file mode 100644 index 00000000..2d27e0bd --- /dev/null +++ b/modules/gcp/network/tests/create-with-hub/main.tf @@ -0,0 +1,26 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-spoke-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + create_hub = true + hub_vpc_google_project = "fixture-hub-project" + hub_vpc_cidr = "10.1.0.0/24" + is_spoke_vpc_shared = true + workspace_google_project = "fixture-workspace-project" +} diff --git a/modules/gcp/network/tests/create/main.tf b/modules/gcp/network/tests/create/main.tf new file mode 100644 index 00000000..29d729bb --- /dev/null +++ b/modules/gcp/network/tests/create/main.tf @@ -0,0 +1,20 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} diff --git a/modules/gcp/network/tests/existing/main.tf b/modules/gcp/network/tests/existing/main.tf new file mode 100644 index 00000000..8935be2c --- /dev/null +++ b/modules/gcp/network/tests/existing/main.tf @@ -0,0 +1,20 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "existing" + spoke_vpc_google_project = "fixture-project" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} diff --git a/modules/gcp/network/variables.tf b/modules/gcp/network/variables.tf new file mode 100644 index 00000000..17e96d61 --- /dev/null +++ b/modules/gcp/network/variables.tf @@ -0,0 +1,104 @@ +variable "prefix" { + type = string + description = "Prefix for generated resource names" +} + +variable "suffix" { + type = string + description = "Random suffix passed by the composer for uniqueness" +} + +variable "google_region" { + type = string + description = "GCP region for all network resources" +} + +variable "vpc_source" { + type = string + description = "Either 'create' (Terraform creates a VPC) or 'existing' (data-source lookup)" + validation { + condition = contains(["create", "existing"], var.vpc_source) + error_message = "vpc_source must be 'create' or 'existing'." 
+ } +} + +# Spoke project always required +variable "spoke_vpc_google_project" { + type = string + description = "GCP project hosting the spoke VPC" +} + +# === Used when vpc_source = "create" ==================================== +variable "spoke_vpc_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet primary range (required when vpc_source=create)" +} + +variable "subnet_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet (required when vpc_source=create)" +} + +variable "subnet_name" { + type = string + default = null + description = "Override for spoke subnet name (default: \"{prefix}-subnet-{suffix}\")" +} + +variable "pod_cidr" { + type = string + default = null + description = "GKE secondary range for pods (optional)" +} + +variable "svc_cidr" { + type = string + default = null + description = "GKE secondary range for services (optional)" +} + +# === Used when vpc_source = "existing" ================================== +variable "existing_vpc_name" { + type = string + default = null + description = "Name of pre-existing VPC (required when vpc_source=existing)" +} + +variable "existing_subnet_name" { + type = string + default = null + description = "Name of pre-existing subnet (required when vpc_source=existing)" +} + +# === Hub configuration (only when create_hub = true) ==================== +variable "create_hub" { + type = bool + default = false + description = "Create a hub VPC + subnet + peering with the spoke. Composer passes restricted_egress here." 
+} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC (required when create_hub=true)" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR for the hub subnet (required when create_hub=true)" +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false + description = "If true, bind the spoke VPC's project as a Shared-VPC host and the workspace project as a service project" +} + +variable "workspace_google_project" { + type = string + default = null + description = "Workspace project (used for Shared-VPC service binding)" +} diff --git a/modules/gcp/network/versions.tf b/modules/gcp/network/versions.tf new file mode 100644 index 00000000..de067e7d --- /dev/null +++ b/modules/gcp/network/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} diff --git a/modules/gcp/network/vpc.tf b/modules/gcp/network/vpc.tf new file mode 100644 index 00000000..52286a1e --- /dev/null +++ b/modules/gcp/network/vpc.tf @@ -0,0 +1,19 @@ +# === Spoke VPC (created) ================================================ +resource "google_compute_network" "spoke_vpc" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-spoke-vpc-${var.suffix}" + project = var.spoke_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +# === Hub VPC ============================================================ +resource "google_compute_network" "hub_vpc" { + count = var.create_hub ? 
1 : 0 + + name = "${var.prefix}-hub-vpc-${var.suffix}" + project = var.hub_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} diff --git a/modules/gcp/private-connectivity/Makefile b/modules/gcp/private-connectivity/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/private-connectivity/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/private-connectivity/README.md b/modules/gcp/private-connectivity/README.md new file mode 100644 index 00000000..1ebc4472 --- /dev/null +++ b/modules/gcp/private-connectivity/README.md @@ -0,0 +1,96 @@ +# modules/gcp/private-connectivity + +GCP-side PSC endpoints + restricted-egress firewall for the Databricks GCP composer. + +## Usage + +Typically called by `modules/gcp/databricks-workspace` (the composer). Direct consumption is supported but unusual. + +```hcl +module "private_connectivity" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/private-connectivity" + + prefix = "acme" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = module.network.spoke_vpc_id + spoke_vpc_self_link = module.network.spoke_vpc_self_link + spoke_vpc_google_project = "my-spoke-project" + spoke_vpc_cidr = "10.0.0.0/16" + + enable_frontend = true + enable_backend = true + psc_subnet_cidr = "10.0.255.0/28" +} +``` + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | + +## Providers + +| Name | Version | +|------|---------| +| [google](#provider\_google) | 7.31.0 | + +## Modules + +No modules. 
+ +## Resources + +| Name | Type | +|------|------| +| [google_compute_address.backend_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | +| [google_compute_address.frontend_address_hub](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | +| [google_compute_address.frontend_address_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | +| [google_compute_firewall.hub_ingress](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_allow_ctl_plane](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_allow_google_apis](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_allow_hive](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_default_deny_egress](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_forwarding_rule.backend_fr](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | +| [google_compute_forwarding_rule.frontend_fr_hub](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | +| [google_compute_forwarding_rule.frontend_fr_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | +| [google_compute_subnetwork.psc_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | + +## Inputs + +| Name 
| Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [google\_region](#input\_google\_region) | GCP region for PSC and firewall resources (must be one of the regions in the regional PSC service-attachment maps) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for the dedicated PSC subnet in the spoke VPC | `string` | n/a | yes | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR of the spoke VPC address space (used as source\_ranges for the hub ingress firewall) | `string` | n/a | yes | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | GCP project that hosts the spoke VPC | `string` | n/a | yes | +| [spoke\_vpc\_id](#input\_spoke\_vpc\_id) | ID of the spoke VPC (output from the network module) | `string` | n/a | yes | +| [spoke\_vpc\_self\_link](#input\_spoke\_vpc\_self\_link) | Self-link of the spoke VPC (used as the network reference for firewall rules) | `string` | n/a | yes | +| [suffix](#input\_suffix) | Random suffix appended to resource names for uniqueness (passed by the composer) | `string` | n/a | yes | +| [enable\_backend](#input\_enable\_backend) | Create the backend (SCC, data plane) PSC endpoint on the spoke | `bool` | `false` | no | +| [enable\_frontend](#input\_enable\_frontend) | Create the frontend (workspace UI/API) PSC endpoint on the spoke and, if hub exists, the hub side | `bool` | `false` | no | +| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Regional Hive metastore IP used by the managed-hive allow rule. 
Looked up via internal map when null; firewall rule is skipped if the lookup also yields empty | `string` | `null` | no | +| [hub\_subnet\_name](#input\_hub\_subnet\_name) | Name of the hub subnet (used as the subnetwork reference for the hub-side PSC address) | `string` | `null` | no | +| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR of the hub VPC address space (reserved for future use) | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | GCP project that hosts the hub VPC (null when no hub is created) | `string` | `null` | no | +| [hub\_vpc\_id](#input\_hub\_vpc\_id) | ID of the hub VPC (null when no hub is created) | `string` | `null` | no | +| [hub\_vpc\_self\_link](#input\_hub\_vpc\_self\_link) | Self-link of the hub VPC (null when no hub is created) | `string` | `null` | no | +| [restrict\_egress](#input\_restrict\_egress) | Create the egress firewall stack: deny-egress, allow Google APIs, allow control plane, allow managed Hive (conditional), hub ingress | `bool` | `false` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [backend\_forwarding\_rule\_name](#output\_backend\_forwarding\_rule\_name) | Name of the backend (SCC) PSC forwarding rule (null when enable\_backend=false) | +| [backend\_psc\_ip\_spoke](#output\_backend\_psc\_ip\_spoke) | IP address of the spoke-side backend PSC endpoint | +| [frontend\_forwarding\_rule\_name](#output\_frontend\_forwarding\_rule\_name) | Name of the spoke-side frontend PSC forwarding rule (null when enable\_frontend=false) | +| [frontend\_psc\_ip\_hub](#output\_frontend\_psc\_ip\_hub) | IP address of the hub-side frontend PSC endpoint (null when no hub) | +| [frontend\_psc\_ip\_spoke](#output\_frontend\_psc\_ip\_spoke) | IP address of the spoke-side frontend PSC endpoint | +| [hub\_frontend\_forwarding\_rule\_name](#output\_hub\_frontend\_forwarding\_rule\_name) | Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend) | +| 
[psc\_subnet\_self\_link](#output\_psc\_subnet\_self\_link) | Self-link of the PSC subnet | + diff --git a/modules/gcp/private-connectivity/firewall.tf b/modules/gcp/private-connectivity/firewall.tf new file mode 100644 index 00000000..97342c67 --- /dev/null +++ b/modules/gcp/private-connectivity/firewall.tf @@ -0,0 +1,97 @@ +# Egress firewall stack — only emitted when restrict_egress = true. + +# === Spoke deny-egress ================================================== +resource "google_compute_firewall" "spoke_default_deny_egress" { + count = var.restrict_egress ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-default-deny-egress" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1100 + destination_ranges = ["0.0.0.0/0"] + source_ranges = [] + + deny { + protocol = "all" + } +} + +# === Spoke allow Google APIs ============================================ +resource "google_compute_firewall" "spoke_allow_google_apis" { + count = var.restrict_egress ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-google-apis" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "199.36.153.4/30", + "199.36.153.8/30", + "34.126.0.0/18" + ] + + allow { + protocol = "all" + } +} + +# === Spoke allow Databricks control plane (to PSC IPs) ================== +resource "google_compute_firewall" "spoke_allow_ctl_plane" { + count = var.restrict_egress && var.enable_frontend && var.enable_backend ? 
1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-databricks-control-plane" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "${google_compute_forwarding_rule.backend_fr[0].ip_address}/32", + "${google_compute_forwarding_rule.frontend_fr_spoke[0].ip_address}/32" + ] + + allow { + protocol = "tcp" + ports = ["443"] + } +} + +# === Spoke allow managed Hive (conditional on hive_metastore_ip) ======== +resource "google_compute_firewall" "spoke_allow_hive" { + count = var.restrict_egress && local.hive_metastore_ip != "" ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-${var.google_region}-managed-hive" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = ["${local.hive_metastore_ip}/32"] + + allow { + protocol = "tcp" + ports = ["3306"] + } +} + +# === Hub ingress from spoke ============================================= +resource "google_compute_firewall" "hub_ingress" { + count = var.restrict_egress && local.hub_present ? 
1 : 0 + + name = "${var.prefix}-hub-${var.suffix}-ingress" + project = var.hub_vpc_google_project + network = var.hub_vpc_self_link + + direction = "INGRESS" + priority = 1000 + destination_ranges = [] + source_ranges = [var.spoke_vpc_cidr] + + allow { + protocol = "all" + } +} diff --git a/modules/gcp-with-psc-exfiltration-protection/main.tf b/modules/gcp/private-connectivity/locals.tf similarity index 79% rename from modules/gcp-with-psc-exfiltration-protection/main.tf rename to modules/gcp/private-connectivity/locals.tf index c587c4b3..e2c8cead 100644 --- a/modules/gcp-with-psc-exfiltration-protection/main.tf +++ b/modules/gcp/private-connectivity/locals.tf @@ -1,16 +1,4 @@ -##################################################### -# Local Values and Random String Resource -##################################################### - -# --------------------------------------------------- -# Local Value: Extract Workspace DNS ID -# --------------------------------------------------- locals { - # Extracts a numeric identifier from the Databricks workspace URL. - # The regex pattern "[0-9]+\.[0-9]+" matches the first occurrence of two groups of digits separated by a dot (e.g., "1234567890123456.1234"). - # This value is typically used to generate unique DNS names for the workspace. 
- workspace_dns_id = regex("[0-9]+\\.[0-9]+", databricks_mws_workspaces.databricks_workspace.workspace_url) - google_frontend_psc_targets = { "asia-northeast1" = "projects/general-prod-asianortheast1-01/regions/asia-northeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" "asia-south1" = "projects/gen-prod-asias1-01/regions/asia-south1/serviceAttachments/plproxy-psc-endpoint-all-ports" @@ -44,20 +32,14 @@ locals { "us-west1" = "projects/prod-gcp-us-west1/regions/us-west1/serviceAttachments/ngrok-psc-endpoint" "us-west4" = "projects/prod-gcp-us-west4/regions/us-west4/serviceAttachments/ngrok-psc-endpoint" } -} -# --------------------------------------------------- -# Random String Resource: Suffix Generator -# --------------------------------------------------- -resource "random_string" "suffix" { - lifecycle { - ignore_changes = [ - special, - upper - ] + # Regional default Hive Metastore IPs per Databricks docs: + # https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore + # NOTE: kept empty initially; override via var.hive_metastore_ip. + default_hive_metastore_ips = { } - special = false - upper = false - length = 6 + hive_metastore_ip = var.hive_metastore_ip != null ? var.hive_metastore_ip : try(local.default_hive_metastore_ips[var.google_region], "") + + hub_present = var.hub_vpc_id != null } diff --git a/modules/gcp/private-connectivity/outputs.tf b/modules/gcp/private-connectivity/outputs.tf new file mode 100644 index 00000000..2df4b312 --- /dev/null +++ b/modules/gcp/private-connectivity/outputs.tf @@ -0,0 +1,34 @@ +output "psc_subnet_self_link" { + value = google_compute_subnetwork.psc_subnet.self_link + description = "Self-link of the PSC subnet" +} + +output "frontend_forwarding_rule_name" { + value = var.enable_frontend ? 
google_compute_forwarding_rule.frontend_fr_spoke[0].name : null + description = "Name of the spoke-side frontend PSC forwarding rule (null when enable_frontend=false)" +} + +output "backend_forwarding_rule_name" { + value = var.enable_backend ? google_compute_forwarding_rule.backend_fr[0].name : null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} + +output "hub_frontend_forwarding_rule_name" { + value = local.hub_present && var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_hub[0].name : null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} + +output "frontend_psc_ip_spoke" { + value = var.enable_frontend ? google_compute_address.frontend_address_spoke[0].address : null + description = "IP address of the spoke-side frontend PSC endpoint" +} + +output "backend_psc_ip_spoke" { + value = var.enable_backend ? google_compute_address.backend_address[0].address : null + description = "IP address of the spoke-side backend PSC endpoint" +} + +output "frontend_psc_ip_hub" { + value = local.hub_present && var.enable_frontend ? 
google_compute_address.frontend_address_hub[0].address : null + description = "IP address of the hub-side frontend PSC endpoint (null when no hub)" +} diff --git a/modules/gcp/private-connectivity/psc.tf b/modules/gcp/private-connectivity/psc.tf new file mode 100644 index 00000000..485abb2e --- /dev/null +++ b/modules/gcp/private-connectivity/psc.tf @@ -0,0 +1,78 @@ +# === PSC Subnet (spoke) ================================================= +resource "google_compute_subnetwork" "psc_subnet" { + name = "${var.prefix}-psc-subnet-${var.suffix}" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_id + region = var.google_region + ip_cidr_range = var.psc_subnet_cidr + private_ip_google_access = true +} + +# === Backend (SCC) PSC endpoint — spoke ================================= +resource "google_compute_address" "backend_address" { + count = var.enable_backend ? 1 : 0 + + name = "${var.prefix}-psc-scc-ip-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + subnetwork = google_compute_subnetwork.psc_subnet.name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "backend_fr" { + count = var.enable_backend ? 1 : 0 + + name = "${var.prefix}-psc-scc-ep-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = var.spoke_vpc_id + ip_address = google_compute_address.backend_address[0].id + target = local.google_backend_psc_targets[var.google_region] + load_balancing_scheme = "" +} + +# === Frontend PSC endpoint — spoke ====================================== +resource "google_compute_address" "frontend_address_spoke" { + count = var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-psc-ws-ip-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + subnetwork = google_compute_subnetwork.psc_subnet.name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "frontend_fr_spoke" { + count = var.enable_frontend ? 
1 : 0 + + name = "${var.prefix}-psc-ws-ep-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = var.spoke_vpc_id + ip_address = google_compute_address.frontend_address_spoke[0].id + target = local.google_frontend_psc_targets[var.google_region] + load_balancing_scheme = "" +} + +# === Frontend PSC endpoint — hub (transit) ============================== +resource "google_compute_address" "frontend_address_hub" { + count = local.hub_present && var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-hub-psc-ws-ip-${var.suffix}" + project = var.hub_vpc_google_project + region = var.google_region + subnetwork = var.hub_subnet_name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "frontend_fr_hub" { + count = local.hub_present && var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-hub-psc-ws-ep-${var.suffix}" + project = var.hub_vpc_google_project + region = var.google_region + network = var.hub_vpc_id + ip_address = google_compute_address.frontend_address_hub[0].id + target = local.google_frontend_psc_targets[var.google_region] + load_balancing_scheme = "" +} diff --git a/modules/gcp/private-connectivity/tests/full-isolated/main.tf b/modules/gcp/private-connectivity/tests/full-isolated/main.tf new file mode 100644 index 00000000..f7a11a3c --- /dev/null +++ b/modules/gcp/private-connectivity/tests/full-isolated/main.tf @@ -0,0 +1,32 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "pc" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + spoke_vpc_cidr = "10.0.0.0/16" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + hub_subnet_name = "fixture-hub-subnet-abc123" + hub_vpc_cidr = "10.1.0.0/24" + + enable_frontend = true + enable_backend = true + restrict_egress = true + psc_subnet_cidr = "10.0.255.0/28" +} diff --git a/modules/gcp/private-connectivity/tests/no-egress/main.tf b/modules/gcp/private-connectivity/tests/no-egress/main.tf new file mode 100644 index 00000000..6c318d4c --- /dev/null +++ b/modules/gcp/private-connectivity/tests/no-egress/main.tf @@ -0,0 +1,26 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "pc" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + spoke_vpc_cidr = "10.0.0.0/16" + + enable_frontend = true + enable_backend = false + restrict_egress = false + psc_subnet_cidr = "10.0.255.0/28" +} diff --git a/modules/gcp/private-connectivity/variables.tf b/modules/gcp/private-connectivity/variables.tf new file mode 100644 index 00000000..a19f8f7c --- /dev/null +++ b/modules/gcp/private-connectivity/variables.tf @@ -0,0 +1,105 @@ +variable "prefix" { + type = string + description = "Prefix used to name generated resources" +} + +variable "suffix" { + type = string + description = "Random suffix appended to resource names for uniqueness (passed by the composer)" +} + +variable "google_region" { + type = string + description = "GCP region for PSC and firewall resources (must be one of the regions in the regional PSC service-attachment maps)" + validation { + condition = contains([ + "asia-northeast1", "asia-south1", "asia-southeast1", "australia-southeast1", + "europe-west1", "europe-west2", "europe-west3", "northamerica-northeast1", + "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-west1", "us-west4" + ], var.google_region) + error_message = "google_region must be one of the regions in the regional PSC service-attachment maps. See locals.tf in modules/gcp/private-connectivity." 
+ } +} + +# Spoke network refs +variable "spoke_vpc_id" { + type = string + description = "ID of the spoke VPC (output from the network module)" +} + +variable "spoke_vpc_self_link" { + type = string + description = "Self-link of the spoke VPC (used as the network reference for firewall rules)" +} + +variable "spoke_vpc_google_project" { + type = string + description = "GCP project that hosts the spoke VPC" +} + +variable "spoke_vpc_cidr" { + type = string + description = "CIDR of the spoke VPC address space (used as source_ranges for the hub ingress firewall)" +} + +# Hub network refs (nullable when no hub) +variable "hub_vpc_id" { + type = string + default = null + description = "ID of the hub VPC (null when no hub is created)" +} + +variable "hub_vpc_self_link" { + type = string + default = null + description = "Self-link of the hub VPC (null when no hub is created)" +} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project that hosts the hub VPC (null when no hub is created)" +} + +variable "hub_subnet_name" { + type = string + default = null + description = "Name of the hub subnet (used as the subnetwork reference for the hub-side PSC address)" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR of the hub VPC address space (reserved for future use)" +} + +# Feature flags +variable "enable_frontend" { + type = bool + default = false + description = "Create the frontend (workspace UI/API) PSC endpoint on the spoke and, if hub exists, the hub side" +} + +variable "enable_backend" { + type = bool + default = false + description = "Create the backend (SCC, data plane) PSC endpoint on the spoke" +} + +variable "restrict_egress" { + type = bool + default = false + description = "Create the egress firewall stack: deny-egress, allow Google APIs, allow control plane, allow managed Hive (conditional), hub ingress" +} + +# PSC subnet CIDR +variable "psc_subnet_cidr" { + type = string + 
description = "CIDR for the dedicated PSC subnet in the spoke VPC" +} + +variable "hive_metastore_ip" { + type = string + default = null + description = "Regional Hive metastore IP used by the managed-hive allow rule. Looked up via internal map when null; firewall rule is skipped if the lookup also yields empty" +} diff --git a/modules/gcp/private-connectivity/versions.tf b/modules/gcp/private-connectivity/versions.tf new file mode 100644 index 00000000..de067e7d --- /dev/null +++ b/modules/gcp/private-connectivity/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} diff --git a/modules/gcp/service-account/Makefile b/modules/gcp/service-account/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/service-account/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-sa-provisioning/README.md b/modules/gcp/service-account/README.md similarity index 81% rename from modules/gcp-sa-provisioning/README.md rename to modules/gcp/service-account/README.md index 75b57fcf..24b1e08f 100644 --- a/modules/gcp-sa-provisioning/README.md +++ b/modules/gcp/service-account/README.md @@ -25,16 +25,35 @@ You can do the same thing by provisioning a service account that will have the s - run `terraform init` - run `terraform apply` +## Usage + +Run once per GCP project to provision the service account Databricks uses to deploy workspaces. 
+ +```hcl +module "service_account" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/service-account" + + google_project = "my-project" + prefix = "acme" + delegate_from = ["user:alice@example.com"] +} +``` + +The consumer must also configure `provider "google" {}` (project + region/zone) — this module does not carry its own provider configuration. + ## Requirements -No requirements. +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | ## Providers | Name | Version | |------|---------| -| [google](#provider\_google) | n/a | +| [google](#provider\_google) | 7.32.0 | ## Modules diff --git a/modules/gcp/service-account/data.tf b/modules/gcp/service-account/data.tf new file mode 100644 index 00000000..c2c3f9bf --- /dev/null +++ b/modules/gcp/service-account/data.tf @@ -0,0 +1 @@ +data "google_client_openid_userinfo" "me" {} diff --git a/modules/gcp-sa-provisioning/main.tf b/modules/gcp/service-account/main.tf similarity index 100% rename from modules/gcp-sa-provisioning/main.tf rename to modules/gcp/service-account/main.tf diff --git a/modules/gcp-sa-provisioning/outputs.tf b/modules/gcp/service-account/outputs.tf similarity index 100% rename from modules/gcp-sa-provisioning/outputs.tf rename to modules/gcp/service-account/outputs.tf diff --git a/modules/gcp-sa-provisioning/variables.tf b/modules/gcp/service-account/variables.tf similarity index 100% rename from modules/gcp-sa-provisioning/variables.tf rename to modules/gcp/service-account/variables.tf diff --git a/modules/gcp/service-account/versions.tf b/modules/gcp/service-account/versions.tf new file mode 100644 index 00000000..de067e7d --- /dev/null +++ b/modules/gcp/service-account/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} diff --git a/modules/gcp/unity-catalog/Makefile 
b/modules/gcp/unity-catalog/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/unity-catalog/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/unity-catalog/README.md b/modules/gcp/unity-catalog/README.md new file mode 100644 index 00000000..6646d6b7 --- /dev/null +++ b/modules/gcp/unity-catalog/README.md @@ -0,0 +1,75 @@ +# modules/gcp/unity-catalog + +Unity Catalog metastore, GCS bucket, storage credential, external location, and catalog for GCP Databricks workspaces. Called by examples after the workspace exists (uses workspace-scoped Databricks provider alias). + +## Usage + +Called after `modules/gcp/databricks-workspace` to create a metastore, GCS bucket, storage credential, external location, and default catalog. + +```hcl +module "unity_catalog" { + source = "github.com/databricks/terraform-databricks-examples//modules/gcp/unity-catalog" + + providers = { + databricks = databricks + databricks.workspace = databricks.workspace + } + + databricks_workspace_id = module.workspace.workspace_id + databricks_workspace_url = module.workspace.workspace_url + google_project = "my-workspace-project" + google_region = "us-central1" + prefix = "acme" + metastore_name = "main-metastore" + catalog_name = "main" +} +``` + +The consumer must declare a `databricks.workspace` provider alias pointing at the workspace URL. + + +## Requirements + +No requirements. + +## Providers + +| Name | Version | +|------|---------| +| [databricks](#provider\_databricks) | 1.115.0 | +| [databricks.workspace](#provider\_databricks.workspace) | 1.115.0 | +| [google](#provider\_google) | 7.32.0 | + +## Modules + +No modules. 
+ +## Resources + +| Name | Type | +|------|------| +| [databricks_catalog.main](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/catalog) | resource | +| [databricks_external_location.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/external_location) | resource | +| [databricks_metastore.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/metastore) | resource | +| [databricks_metastore_assignment.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/metastore_assignment) | resource | +| [databricks_storage_credential.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/storage_credential) | resource | +| [google_storage_bucket.ext_bucket](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket) | resource | +| [google_storage_bucket_iam_member.unity_cred_admin](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket_iam_member) | resource | +| [google_storage_bucket_iam_member.unity_cred_reader](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket_iam_member) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [catalog\_name](#input\_catalog\_name) | Name to assign to default catalog | `string` | n/a | yes | +| [databricks\_workspace\_id](#input\_databricks\_workspace\_id) | The unique identifier of the Databricks workspace in which resources will be managed. | `any` | n/a | yes | +| [databricks\_workspace\_url](#input\_databricks\_workspace\_url) | The URL of the Databricks workspace to which resources will be deployed (e.g., https://.gcp.databricks.com). 
| `any` | n/a | yes | +| [google\_project](#input\_google\_project) | The Google Cloud project ID where the Databricks workspace and associated resources will be created. | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | Google Cloud region where the resources will be created | `string` | n/a | yes | +| [metastore\_name](#input\_metastore\_name) | Name to assign to regional metastore | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix to use in generated resources name | `string` | n/a | yes | + +## Outputs + +No outputs. + diff --git a/modules/gcp-unity-catalog/databricks-cloud-resources.tf b/modules/gcp/unity-catalog/databricks-cloud-resources.tf similarity index 100% rename from modules/gcp-unity-catalog/databricks-cloud-resources.tf rename to modules/gcp/unity-catalog/databricks-cloud-resources.tf diff --git a/modules/gcp-unity-catalog/gcs.tf b/modules/gcp/unity-catalog/gcs.tf similarity index 100% rename from modules/gcp-unity-catalog/gcs.tf rename to modules/gcp/unity-catalog/gcs.tf diff --git a/modules/gcp-unity-catalog/variables.tf b/modules/gcp/unity-catalog/variables.tf similarity index 100% rename from modules/gcp-unity-catalog/variables.tf rename to modules/gcp/unity-catalog/variables.tf diff --git a/modules/gcp-unity-catalog/terraform.tf b/modules/gcp/unity-catalog/versions.tf similarity index 100% rename from modules/gcp-unity-catalog/terraform.tf rename to modules/gcp/unity-catalog/versions.tf