diff --git a/skills/lfx-v2-secrets/SKILL.md b/skills/lfx-v2-secrets/SKILL.md new file mode 100644 index 0000000..fd59b72 --- /dev/null +++ b/skills/lfx-v2-secrets/SKILL.md @@ -0,0 +1,635 @@ +--- +name: lfx-v2-secrets +description: > + Guide an agent through wiring up secrets for LFX V2 microservices using External Secrets + Operator (ESO) + IRSA. Handles both new services (full infrastructure setup) and existing + services (add a secret to an already-configured service) by checking whether the ESO + objects exist before deciding which steps to run. Use this skill whenever someone says + "set up secrets", "wire up ESO", "add a secret to this service", "IRSA configuration", + "External Secrets for V2", or any mention of AWS Secrets Manager integration with + Kubernetes for LFX V2 services. +allowed-tools: Bash, Read, Glob, Grep, AskUserQuestion, WebFetch +--- + + + + + +# LFX V2 Secrets Setup Guide + +Secrets for LFX V2 microservices are managed through **External Secrets Operator (ESO)** +combined with **IAM Roles for Service Accounts (IRSA)** on AWS. This provides a secure, +GitOps-driven way to sync secrets from AWS Secrets Manager into Kubernetes. + +--- + +## Understanding the Architecture + +### How It Works + +1. A **Kubernetes ServiceAccount** is annotated with an **IRSA role ARN** +2. ESO's **SecretStore** uses that ServiceAccount's JWT token to authenticate to AWS +3. ESO watches **ExternalSecret** manifests and syncs matching secrets from AWS SM into K8s Secrets +4. Application deployments reference the K8s Secret via environment variable or volume mount +5. 
Local development skips ESO entirely and injects secret values directly via `environment` in values + +### Key Constants + +These values are fixed and apply across all V2 services: + +| Item | Value | +|------|-------| +| AWS Region | `us-west-2` | +| K8s Secret name | `-secrets` (e.g., `lfx-v2-committee-service-secrets`) | +| SecretStore name | `` — matches `metadata.name` in `SecretStore.yaml` and `spec.secretStoreRef.name` in `ExternalSecret.yaml` | +| IAM account — dev | `788942260905` | +| IAM account — staging | `844790888233` | +| IAM account — prod | `372256339901` | +| IRSA role ARN pattern | `arn:aws:iam:::role/` | +| AWS SM path pattern | `<3rd-party-service>//[]` | +| ServiceAccount annotation key | `eks.amazonaws.com/role-arn` | +| ESO JWT auth field | `spec.provider.aws.auth.jwt.serviceAccountRef` | + +--- + +## Branching + +Before making any changes, create a branch in each repo being modified. Use the format +`/`. Never commit directly to `main`. The username is the git +username (typically the part before `@` in the email). Always sign off commits. + +--- + +## Step 1: Gather Information from the User + +Identify `` — the fully qualified service name including the `lfx-v2-` prefix (e.g., +`lfx-v2-committee-service`). If the user did not include it in their request, ask for it now +before proceeding. This is used directly in all resource names: K8s Secret is `-secrets`, +role ARN ends in ``, etc. + +If not already provided in the initial request, ask the user for: + +1. 
**List of secrets** — LFX V2 secrets come from either 1Password or Auth0: + + **1Password sources** (e.g., API keys, SMTP credentials, etc): + - **Secret name** (e.g., "LinkedIn Credentials", "SMTP Credentials") + - **Third-party service** that provides the secret (e.g., `litellm`, `github`) + - **1Password item name** — exact name as it appears in the vault + - **Field names in 1Password** — the exact field names as they appear in the source vault + + **Auth0 sources** (e.g., M2M client credentials, BFF client secrets): + - **Auth0 client name** — exact display name in Auth0 (e.g., `LFX V2 Invite Service`) + - **Credential type** — `auth0` (produces `client_id` + `client_secret`) or `auth0_jwt` (produces `client_id` + `client_public_key` + `client_private_key`; standard for LFX V2 microservices) + - **Auto-rotate** — yes/no; default `true` for `auth0_jwt` V2 services + - **AWS SM path** — follows `auth0/` convention (e.g., `auth0/LFX_V2_Invite_Service`) + + > Field renames are applied automatically — always prefix with `auth0_` (e.g., `client_id` → `auth0_client_id`, `client_secret` → `auth0_client_secret`, `client_private_key` → `auth0_client_private_key`). Do not ask the user for these. + +2. 
**Which environments need this secret** — `development`, `staging`, `production` + +**1Password example:** + +```text +Service: lfx-v2-invite-service +Namespace: invite-service +Secrets: + - Atlassian API Key (3rd party service: atlassian, 1Password field: atlassian_api_key) - all envs + - Supabase API Key (3rd party service: supabase, 1Password fields: url, api_key) - all envs +``` + +**Auth0 examples:** + +`auth0` (client_secret, simpler M2M or BFF clients): + +```text +Service: lfx-v2-committee-service +Namespace: committee-service +Auth0 secrets: + - Committee Service BFF + type: auth0 + client: "LFX V2 Committee BFF" + path: auth0/LFX_V2_Committee_BFF +``` + +`auth0_jwt` (JWT private key, standard for LFX V2 microservices): + +```text +Service: lfx-v2-committee-service +Namespace: committee-service +Auth0 secrets: + - Committee Service M2M + type: auth0_jwt + client: "LFX V2 Committee Service" + auto_rotate: true + path: auth0/LFX_V2_Committee_Service +``` + +--- + +## Step 2: Check Whether ESO Is Already Configured + +Before making any changes, look up the service's infrastructure details and determine whether +`SecretStore` and `ExternalSecret` objects already exist. + +### 2a. Fetch IAM service account definitions + +Fetch `iam-service-account-definitions.yaml` directly from GitHub: +``` +https://raw.githubusercontent.com/linuxfoundation/lfx-v2-opentofu/main/iam-service-account-definitions.yaml +``` +If the fetch fails (e.g., auth error), ask the user to provide the service's namespace and eso_service_tag. + +Look up the entry for ``: +- **`namespace`** — note the value; defaults to `` if not set. Confirm with the user only if the entry is missing entirely. +- **`eso_service_tag`** — note the value; defaults to `` if not set. If the file is inaccessible or the entry is missing, ask the user to confirm the tag (suggest `` as the default). + +### 2b. 
Check for existing ESO objects + +Determine whether `SecretStore` and `ExternalSecret` objects already exist for this service +by fetching from GitHub. The service repo name matches the service name (e.g., `lfx-v2-committee-service` +lives at `github.com/linuxfoundation/lfx-v2-committee-service`). + +Fetch these three URLs (a 404 means the file doesn't exist yet): + +``` +# ESO custom resources in lfx-v2-argocd +https://raw.githubusercontent.com/linuxfoundation/lfx-v2-argocd/main/custom-resources//SecretStore.yaml +https://raw.githubusercontent.com/linuxfoundation/lfx-v2-argocd/main/custom-resources//ExternalSecret.yaml + +# ServiceAccount in the service Helm chart +https://raw.githubusercontent.com/linuxfoundation//main/charts//templates/serviceaccount.yaml +``` + +If any fetch returns a non-404 auth error, fall back to checking the local filesystem if the +repo is checked out, or ask the user whether ESO is already configured for this service. + +**If all three files are present** → skip to [Step 4](#step-4-add-entry-to-lfx-secrets-management). + +**If any are missing** → continue with Step 3 to set up the infrastructure first. + +--- + +## Step 3: Set Up ESO Infrastructure (New Services Only) + +Run this step only when Step 2 found that ESO is not yet configured. This touches three repos +and must be done before writing any secrets. + +### Step 3a: Add IAM Service Account Entry in `lfx-v2-opentofu` + +In the [lfx-v2-opentofu](https://github.com/linuxfoundation/lfx-v2-opentofu) repo, +edit `iam-service-account-definitions.yaml` and add: + +```yaml +service_account_roles: + : + namespace: "" +``` + +Example for committee service: + +```yaml +service_account_roles: + lfx-v2-committee-service: + namespace: "committee-service" +``` + +> **Defaults**: All fields default to the role key (i.e., ``): `namespace`, +> `service_account`, and `eso_service_tag`. Only specify a field when its value differs +> from the role key. 
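As a hedged illustration of the defaults rule, an entry that overrides a single field might look like the sketch below. It uses `lfx-self-serve`, whose `eso_service_tag` is `pcc` (as noted in Step 5). The exact shape of the real entry in `iam-service-account-definitions.yaml` is an assumption. Only the overridden field is shown, so `namespace` and `service_account` fall back to the role key.

```yaml
service_account_roles:
  lfx-self-serve:
    # Assumption: only eso_service_tag differs from the role key here.
    # Secrets for this service are then tagged `service-pcc: enabled` in AWS SM.
    eso_service_tag: "pcc"
```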
+ +### Step 3b: Create ServiceAccount in the Service Helm Chart + +In the service repo's Helm chart (e.g., `lfx-v2-invite-service`), create +`charts//templates/serviceaccount.yaml`: + +```yaml +# Copyright The Linux Foundation and each contributor to LFX. +# SPDX-License-Identifier: MIT +{{- if .Values.serviceAccount.create }} +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ .Values.serviceAccount.name | default .Chart.Name }} + namespace: {{ .Release.Namespace }} + labels: + app: {{ .Chart.Name }} + {{- with .Values.serviceAccount.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +# Caveat: Sprig's `default` treats an explicit `false` as empty, so the line below always renders `true`. +automountServiceAccountToken: {{ .Values.serviceAccount.automountServiceAccountToken | default true }} +{{- end }} +``` + +Add to `charts//values.yaml`: + +```yaml +serviceAccount: + create: true + name: "" + annotations: {} + automountServiceAccountToken: true +``` + +### Step 3c: Create Custom Resources in `lfx-v2-argocd` + +The `SecretStore` and `ExternalSecret` are **static YAML files** (not Helm templates) placed in +`lfx-v2-argocd/custom-resources//`. + +Create `custom-resources//SecretStore.yaml`: + +```yaml +# Copyright The Linux Foundation and each contributor to LFX. +# SPDX-License-Identifier: MIT +--- +apiVersion: external-secrets.io/v1 +kind: SecretStore +metadata: + name: + namespace: +spec: + provider: + aws: + auth: + jwt: + serviceAccountRef: + name: + region: us-west-2 + service: SecretsManager +``` + +Create `custom-resources//ExternalSecret.yaml`: + +```yaml +# Copyright The Linux Foundation and each contributor to LFX. 
+# SPDX-License-Identifier: MIT +--- +apiVersion: external-secrets.io/v1 +kind: ExternalSecret +metadata: + name: + namespace: +spec: + secretStoreRef: + kind: SecretStore + name: + target: + creationPolicy: Owner + name: -secrets + refreshInterval: 10m + dataFrom: + - find: + conversionStrategy: Default + decodingStrategy: None + tags: + service-: enabled + rewrite: + - merge: + conflictPolicy: Error + into: '' + strategy: Extract +``` + +> **Tag-based discovery**: ESO finds and merges all AWS SM secrets tagged +> `service-: enabled` into a single Kubernetes Secret named +> `-secrets`. No manual `data` list is needed — new secrets are picked up +> automatically after the next sync. + +### Step 3d: Add IRSA Annotation in `lfx-v2-argocd` Per-Environment Values + +In `values/dev/.yaml` (repeat for staging and prod with the matching account ID): + +```yaml +# Copyright The Linux Foundation and each contributor to LFX. +# SPDX-License-Identifier: MIT +--- +serviceAccount: + annotations: + eks.amazonaws.com/role-arn: arn:aws:iam::788942260905:role/ + automountServiceAccountToken: true +``` + +| Environment | Account ID | File | +|-------------|-----------|------| +| Development | `788942260905` | `values/dev/.yaml` | +| Staging | `844790888233` | `values/staging/.yaml` | +| Production | `372256339901` | `values/prod/.yaml` | + +--- + +## Step 4: Add Entry to `lfx-secrets-management` + +In the [lfx-secrets-management](https://github.com/linuxfoundation/lfx-secrets-management) repo, +add an entry for each secret to `secrets/lfx/.yml` — one file per LFX V2 service. +If the file doesn't exist yet, create it. If it already exists, append the new entry. + +> **Important**: All secrets must be stored as JSON in AWS SM, even single-field ones. 
+> Two equivalent ways to express this: +> +> ```yaml +> # implicit JSON (list form) +> fields: +> - +> +> # explicit JSON (scalar + flag) +> fields: +> store_as_json: true +> ``` + +**1Password template:** + +```yaml +: + tags: [lfx_v2, , <3rd_party_service_tag>] + envs: [development, staging, production] + source: + onepassword: + vaults: + development: LFX V2 - Development + staging: LFX V2 - Staging + production: LFX V2 - Production + item: <1Password Item Name> + fields: + - + destinations: + - aws_secretsmanager: + tags: + service-: enabled + path: <3rd-party-service>/ +``` + +Example for Supabase API key: + +```yaml +Supabase API Key: + tags: [lfx_v2, lfx-self-serve, supabase] + envs: [development, staging, production] + source: + onepassword: + vaults: + development: LFX V2 - Development + production: LFX V2 - Production + staging: LFX V2 - Staging + item: LFX v2 supabase + fields: + - url + - api_key + destinations: + - aws_secretsmanager: + tags: + service-pcc: enabled + path: supabase/lfx-self-serve +``` + +> **Tips**: +> +> - Each secret becomes a separate AWS SM path entry +> - The `path` must include the service name: `<3rd-party-service>/` (e.g., `atlassian/lfx-v2-committee-service`) +> - The `tags` list must include the service tag (`eso_service_tag` from Step 2a) so the secret is identifiable by service +> - Use the `envs` list to sync to all three environments in parallel +> - The `source.onepassword.item` should match exactly the name in 1Password vaults +> - Field names should be descriptive enough to avoid duplicates (`litellm_api_key`, not just `api_key`) + +**Auth0 template** (`auth0` — client_secret): + +> For LFX V2 services, always rename fields to a descriptive name prefixed with `auth0_` +> (e.g. `auth0_client_id`, `auth0_client_secret`) so keys are unambiguous in the merged K8s Secret. 
+ +```yaml +: + tags: [auth0, ] + envs: [development, staging, production] + source: + auth0: + client_name: + rename_fields: + client_id: auth0_client_id + client_secret: auth0_client_secret + destinations: + - onepassword: + item: auth0 + field_types: + auth0_client_id: text + auth0_client_secret: password + - aws_secretsmanager: + path: auth0/ + tags: + service-: enabled +``` + +**Auth0 template** (`auth0_jwt` — JWT private key): + +```yaml +: + tags: [auth0_jwt, ] + envs: [development, staging, production] + auto_rotate: true + source: + auth0_jwt: + client_name: + rename_fields: + client_id: + client_public_key: + client_private_key: + destinations: + - onepassword: + item: auth0 + field_types: + : text + : text + : password + - aws_secretsmanager: + path: auth0/ + tags: + service-: enabled +``` + +> **Naming review**: Before finalising the entry, review the secret name, tags, and +> destination path against this test: could a reasonable person look at each value and +> have a reasonable idea of what the secret is and where it came from — without needing +> to read the source or ask anyone? If not, rename before proceeding. For example: +> - Secret name: `Atlassian API Key` ✓ — `Key` ✗ +> - Tags: `[lfx_v2, atlassian, lfx-v2-committee-service]` ✓ — `[lfx_v2, key]` ✗ +> - Path: `atlassian/lfx-v2-committee-service` ✓ — `api_key` ✗ + +> **Important**: After the `lfx-secrets-management` PR is merged, the secret must be deployed +> to AWS SM before the `lfx-v2-argocd` PR can merge — ArgoCD will fail to sync if the secret +> doesn't exist yet. +> +> **For 1Password secrets**: trigger the +> [Deploy workflow](https://github.com/linuxfoundation/lfx-secrets-management/actions/workflows/deploy.yml) +> manually: +> 1. Go to the Deploy workflow page linked above +> 2. Click **Run workflow** +> 3. In the tag field, enter the most specific tag from the `tags:` field in your YAML entry +> (e.g. 
`litellm`, `atlassian`) — not the AWS resource tag — to avoid re-deploying or +> rotating unrelated secrets +> 4. Confirm the workflow completes successfully before merging the `lfx-v2-argocd` PR +> +> If you're not comfortable triggering the workflow yourself, ask the Platform Engineering team +> to run it for you. +> +> **For Auth0 JWT secrets** (`auth0_jwt` source type): do not trigger the workflow yourself — +> these secrets are rotated on every deploy. Ask the Platform Engineering team to deploy it +> and coordinate the timing so only the intended service's credentials are rotated. + +--- + +## Step 5: Wire Secrets into Service Environment in `lfx-v2-argocd` + +Add an `environment` block (or extend the existing one) that maps each secret field to an +environment variable. Reference the Kubernetes Secret created by the ExternalSecret +(`-secrets`) and use the field name as the key. + +- If the secret is deployed to **all environments**, add it to `values/global/.yaml` +- If the secret is deployed to **specific environments only**, add it to each relevant + per-environment file (`values/dev/.yaml`, `values/staging/.yaml`, + `values/prod/.yaml`) instead + +> Before writing `secretKeyRef.name`, verify the K8s Secret name from `ExternalSecret.yaml` +> in `lfx-v2-argocd/custom-resources//` — check `spec.target.name`. It is typically +> `-secrets` but must match exactly. + +> Tag-based discovery means new secrets are picked up automatically — no changes to +> `ExternalSecret.yaml` are needed as long as the AWS SM tag matches the service. + +> **lfx-self-serve**: If the service is `lfx-self-serve` and `values/dev/lfx-self-serve.yaml` +> or `values/global/lfx-self-serve.yaml` are modified, apply the same change to +> `values/dev/lfx-self-serve-branch.yaml` as well. The K8s Secret for `lfx-self-serve` +> is named `pcc-secrets` — use that as `secretKeyRef.name` instead of `lfx-self-serve-secrets`. 
+> Note: the `eso_service_tag` for `lfx-self-serve` is `pcc` (so the AWS SM resource tag is +> `service-pcc: enabled` and the K8s Secret is `pcc-secrets`), but the service tag in the +> `tags:` list should be `lfx-self-serve` — the human-readable service name. + +```yaml +# Copyright The Linux Foundation and each contributor to LFX. +# SPDX-License-Identifier: MIT +--- +environment: + SECRET_NAME: + valueFrom: + secretKeyRef: + name: -secrets + key: + ANOTHER_SECRET_NAME: + valueFrom: + secretKeyRef: + name: -secrets + key: +``` + +> **Note**: The `environment:` block is always named `environment`, but its nesting varies — +> some services have it at the top level, others under `app:`. Always check the existing +> values file before adding entries and match the structure already in use. + +--- + +## Verification Checklist + +After completing all applicable steps, verify the setup: + +**File checklist — all repos involved:** + +- [ ] `lfx-v2-opentofu`: `iam-service-account-definitions.yaml` has service entry *(new services only)* +- [ ] `lfx-secrets-management`: appropriate file under `secrets/lfx/` has sync entry for each secret; Deploy workflow run after merge +- [ ] Service Helm chart *(new services only)*: + - [ ] `templates/serviceaccount.yaml` created + - [ ] `values.yaml` has `serviceAccount` block +- [ ] `lfx-v2-argocd`: + - [ ] `custom-resources//SecretStore.yaml` created *(new services only)* + - [ ] `custom-resources//ExternalSecret.yaml` created *(new services only)* + - [ ] `values/dev/.yaml` has IRSA role ARN + `automountServiceAccountToken: true` *(new services only)* + - [ ] `values/staging/.yaml` has IRSA role ARN + `automountServiceAccountToken: true` *(new services only)* + - [ ] `values/prod/.yaml` has IRSA role ARN + `automountServiceAccountToken: true` *(new services only)* + - [ ] `values/global/.yaml` has `environment` block with `secretKeyRef` entries for all secrets + - [ ] **lfx-self-serve only**: `lfx-self-serve-branch` updated alongside 
any `lfx-self-serve` values file change + +**Configuration checks:** + +- [ ] IRSA role ARN format is correct: `arn:aws:iam:::role/` +- [ ] All account IDs are correct (dev=`788942260905`, staging=`844790888233`, prod=`372256339901`) +- [ ] AWS SM secret config tags (`tags:` field) are specific enough to scope the deploy workflow run +- [ ] AWS SM resource tag on each secret matches `eso_service_tag`: `service-: enabled` +- [ ] All secrets are stored as JSON (list form fields or `store_as_json: true`) +- [ ] `ExternalSecret` target name is `-secrets` and `secretKeyRef` names match + +**1Password setup** *(1Password sources only)*: + +- [ ] Items exist in all required vaults (LFX V2 - Development/Staging/Production) +- [ ] Item names match exactly what's in the `lfx-secrets-management` `item:` field +- [ ] Field names match exactly what's in the `fields:` list + +--- + +## Reference Implementations + +Real examples in the codebase: + +| Service | What It Added | References | +|---------|---------------|-----------| +| Email Service | SMTP credentials | `lfx-v2-email-service` chart, lfx-v2-argocd values | +| Invite Service | JWT secret | `lfx-v2-invite-service` chart (LFXV2-1783), lfx-v2-argocd values | +| Committee Service | Auth0 JWT client secret (`auth0_jwt`, auto-rotate, renamed fields) | `secrets/lfx/auth0_clients.yml` in lfx-secrets-management | + +Check these repos for the exact file structure and conventions used in production. + +--- + +## Common Workflows + +### Adding a JWT secret to a new service + +1. User asks: "Set up JWT secret for invite-service" +2. Collect: service name, 1Password item name, field names, environments (Step 1) +3. Check for existing ESO objects — none found, proceed with Step 3 +4. Follow Steps 3–5 in order +5. Verify using the checklist above +6. Open PRs for `lfx-v2-opentofu`, `lfx-secrets-management`, service Helm chart, and `lfx-v2-argocd` +7. 
Merge `lfx-secrets-management` PR, trigger Deploy workflow, then merge remaining PRs + +### Adding SMTP credentials to an existing service + +1. User asks: "Add SMTP secret to email-service" +2. Collect: service name, 1Password item name, field names, environments (Step 1) +3. Check for existing ESO objects — found, skip Step 3 +4. Follow Steps 4–5 +5. Verify using the checklist above +6. Open PRs for `lfx-secrets-management` and `lfx-v2-argocd` +7. Merge `lfx-secrets-management` PR, trigger Deploy workflow, then merge `lfx-v2-argocd` PR + +### Adding an Auth0 client key pair to an existing service + +1. User asks: "Add the LFX V2 Persona Service auth0 client to lfx-v2-persona-service" +2. Check for existing ESO objects — found, skip Step 3 +3. Add sync entry in `lfx-secrets-management` (Step 4) +4. Coordinate with the Platform Engineering team to deploy the auto-rotated secret +5. Wire the secret into the deployment through the `lfx-v2-argocd` values files (Step 5) +6. Verify using the checklist above +7. Open the `lfx-v2-argocd` PR + +### Debugging: "Pods can't read the secret" + +Check in order: + +1. **Pod events** — `kubectl describe pod ` to see whether the pod failed to mount or reference the K8s Secret +2. **ESO logs** — `kubectl logs -n external-secrets-system deployment/external-secrets` +3. **AWS SM permissions** — verify the IRSA role allows `secretsmanager:GetSecretValue` on the secret's path +4. **Secret exists in AWS SM** — confirm the `lfx-secrets-management` automation has synced the secret +5. **ExternalSecret status** — `kubectl describe externalsecret ` shows sync status +6. **Topology/firewalling** — confirm the ESO pod can reach the AWS API endpoint (check SecurityGroup, NACL, DNS) + +--- + +## Communication Style + +This skill serves both platform engineers and application developers: + +- **For experienced infrastructure engineers**: Use technical terms freely (IRSA, JWT auth, ESO operator). 
+- **For application developers touching secrets for the first time**: Explain what ESO is + (*"it automatically copies secrets from AWS into Kubernetes"*) and IRSA (*"it proves + your pod is who it claims to be when talking to AWS"*). +- **For non-technical users**: Avoid "Kubernetes", "IRSA", "operator", "manifest". Instead say + "cloud setup", "permissions", "secure secret storage", "automated sync". + +Always finish with the verification checklist so the user can confirm everything is wired correctly.