Feature/agent platform #529 (Draft)

allamand wants to merge 8 commits into main from feature/agent-platform

Conversation

@allamand
Contributor

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

- Add repository overview with architecture and tech stack
- Add GitOps patterns and ArgoCD best practices
- Add Backstage development guidelines with Kro plugin details
- Add Kro (Kubernetes Resource Orchestrator) development guide
- Add Terraform infrastructure guidelines
- Add application development standards for all languages
- Add coding standards and best practices
- Add ML/AI workloads guide (Ray, Kubeflow, MLflow, etc.)
- Add progressive delivery guide (Argo Rollouts, canary deployments)
- Add comprehensive troubleshooting guide

These steering files provide AI agents with deep context about:
- Repository structure and conventions
- Development workflows and patterns
- Testing strategies
- Security best practices
- Common troubleshooting scenarios

- Add DESIGN.md with complete architecture and implementation plan
- Add README.md user guide for deployment and usage
- Add COMPONENTS.md with detailed component specifications
- Add TROUBLESHOOTING.md with diagnostic procedures

Documentation covers:
- GitOps bridge pattern for agent platform integration
- Feature flag mechanism for backward compatibility
- Component details (Kagent, LiteLLM, Langfuse, Jaeger, Tofu Controller, Agent Core)
- Security, monitoring, backup/DR considerations
- Migration guide and troubleshooting procedures

Changes:
- Replace ApplicationSet pattern with individual ArgoCD Applications
- Each component (Kagent, LiteLLM, Agent Gateway, Langfuse, Jaeger, Tofu Controller, Agent Core) is now a separate Application
- Each Application directly references its Helm chart in sample-agent-platform-on-eks repository
- Update architecture diagrams to reflect new pattern
- Update verification commands and troubleshooting steps
- Simplify deployment flow documentation

This approach provides:
- More explicit control over each component
- Easier debugging and management
- Direct chart references without ApplicationSet generator complexity
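As a sketch of the new pattern, one per-component ArgoCD Application might look like the following (the repo URL, chart path, and namespace are placeholders for illustration, not the actual values in the PR):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: litellm
  namespace: argocd
spec:
  project: default
  source:
    # Direct chart reference — no ApplicationSet generator in between
    repoURL: https://github.com/example/sample-agent-platform-on-eks
    targetRevision: main
    path: charts/litellm
  destination:
    server: https://kubernetes.default.svc
    namespace: litellm
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

Each component (Kagent, LiteLLM, Agent Gateway, etc.) would get its own copy of this manifest, which is what makes per-component debugging and sync control more explicit.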
…ular architecture

- Expanded Epic 8 from 4 generic tasks to 10 detailed Asana-ready tasks
  covering bridge chart, IAM roles, secrets, hub-config, and e2e validation
- Updated DESIGN.md to remove workshop_type references from appmod-blueprints
  (workshop concerns move to platform-engineering-on-eks repo)
- Made Agent Gateway auth provider-agnostic (Keycloak/Cognito/external)
- Parameterized resource prefix throughout (no hardcoded peeks)
- Updated deployment scenarios, FAQ, migration guide for modular architecture
- Added cross-references between UPGRADE-APPROACH.md and DESIGN.md
- Updated task total to 60, timeline to 14-18 weeks
- Added 3 new success criteria for agent platform on modular architecture
- UPGRADE-APPROACH.md: Fixed duplicate Epic 8 issue, appended clean
  Epic 8 detailed breakdown (Tasks 8.1-8.10) with Kro+ACK compositions
- DESIGN.md: Replaced all terraform apply/init/variables.tf references
  with hub-config addons approach and GitOps bootstrap patterns.
  Updated deployment flows, testing, migration guide, FAQ, and DR sections.
  Feature flag now uses hub-config ConfigMap (Level 3) and GitOps commit
  (Level 4) instead of Terraform variables and deploy scripts.
- README.md: Updated Quick Start to use kubectl bootstrap instead of
  terraform apply. Updated Disable section. Parameterized resource prefix.
… patterns/workshop, git credentials, schema versioning, observability config

## CI/CD Integration

### GitLab CI Pipeline
Contributor Author

Do you want to use GitLab for CI/CD?

Contributor Author

For Backstage, we were wondering if we should scope this out of the project and just use the one in CNOE, using existing OSS plugins for Kro, GitLab integration…


## ApplicationSet Patterns

### List Generator Pattern
Contributor Author

We should include the cluster generator here, as it is the one we use most in the project.
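A cluster generator ApplicationSet, as suggested above, might look like this sketch (the repo URL, label selector, and addon path are assumptions for illustration):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: fleet-addons
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            fleet_member: spoke   # only clusters registered with this label
  template:
    metadata:
      name: 'addons-{{name}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/example/appmod-blueprints
        targetRevision: main
        path: gitops/addons
      destination:
        server: '{{server}}'      # filled from the cluster secret
        namespace: argocd
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

The generator enumerates ArgoCD cluster secrets, so newly registered spoke clusters are picked up automatically — which is why it fits the fleet auto-discovery flow discussed later in this PR.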

- Integration tests with AWS
- Deployment validation

## Backstage Integration
Contributor Author

I disagree on this. Backstage really should not create things in Kubernetes directly, but only through GitOps: PR, push.

Contributor

This can be fixed.


### Cluster Design Principles
1. **Multi-AZ Deployment**: Spread across 3 availability zones
2. **Managed Node Groups**: Use EKS managed node groups
Contributor Author

We don't want to use managed node groups but EKS Auto Mode.

Contributor

This can be fixed.


### IRSA Configuration
Contributor Author

Remove this


#### ModelConfig CRD

Contributor Author

Can we add another example pointing to a Ray endpoint in the cluster instead of Bedrock?

Contributor

This can be fixed.
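Picking up the suggestion above, a ModelConfig pointing at an in-cluster Ray Serve endpoint might look like the following sketch. The service name, namespace, model id, and even the exact CRD field names are assumptions and should be checked against the kagent CRD reference; the premise is only that Ray Serve's LLM serving exposes an OpenAI-compatible route:

```yaml
# Hypothetical sketch — field names and endpoint are assumptions
apiVersion: kagent.dev/v1alpha1
kind: ModelConfig
metadata:
  name: ray-hosted-model
  namespace: kagent
spec:
  provider: OpenAI        # talk to Ray Serve's OpenAI-compatible API
  model: my-llm           # assumed model id registered with Ray Serve
  openAI:
    baseUrl: http://ray-serve-svc.ray-system.svc.cluster.local:8000/v1
```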

```yaml
model: anthropic.claude-3-5-sonnet-20241022-v2:0
region: us-east-1
```

# Service account with IRSA
Contributor Author

Can we use Pod Identity instead of IRSA?

Contributor

This can be fixed.
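For reference, switching from IRSA to EKS Pod Identity roughly means replacing the role-ARN annotation on the service account with a pod identity association. The cluster name, namespace, service account, and role ARN below are placeholders:

```shell
# Requires the eks-pod-identity-agent addon on the cluster.
aws eks create-pod-identity-association \
  --cluster-name hub \
  --namespace kagent \
  --service-account kagent-sa \
  --role-arn arn:aws:iam::111122223333:role/kagent-bedrock-role
# The service account itself no longer needs the
# eks.amazonaws.com/role-arn annotation used by IRSA.
```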


### Overview

Agent Core Components provision AWS Bedrock Agent Core capabilities (Memory, Browser, Code Interpreter) using Tofu Controller.
Contributor Author

Can we deploy this using Kro/ACK instead of OpenTofu/Terraform, to have consistency with the platform?

Contributor

There is no ACK support for Agent Core components, so this is the only approach we did a POC for.


> **Note**: There is no Terraform in the `appmod-blueprints` solution repo. Initial EKS cluster creation (via Terraform, CDK, eksctl, etc.) lives in the customer's own infra repo or the workshop repo (`platform-engineering-on-eks`). Once the hub cluster exists, it self-manages via Kro+ACK/CrossPlane compositions and ArgoCD.

### Changes in `sample-agent-platform-on-eks` Repository
Contributor Author

What is the need to create another git repo for that? We should only use appmod-blueprints, also for agents.

Contributor

It is absolutely required for extending the platform to the agent platform work we are doing separately. That's the core reason for this refactor. This is our core tenet.

Comment thread docs/UPGRADE-APPROACH.md Outdated
3. **Config-external**: `hub-config.yaml` lives outside the repo; customers pass their own config. The config drives Kro/CrossPlane compositions and ArgoCD bootstrap.
4. **Provider-agnostic**: Git provider (GitHub vs GitLab vs CodeCommit), OIDC provider, and CI/CD provider are swappable via configuration
5. **GitOpsy spokes**: Spoke clusters provisioned and managed via CrossPlane/Kro through the hub cluster — same mechanism as hub self-management
6. **Workshop as a pattern, not a fork**: Workshop-specific code lives in `patterns/workshop/` within the main repo (alongside other consumption patterns like `patterns/hub-only/`, `patterns/full-platform/`). The workshop pattern includes CloudFront, GitLab integration, Identity Center setup, and workshop-specific configurations. Heavy workshop orchestration (Terraform for cluster creation, deploy scripts) lives in the internal `platform-engineering-on-eks` GitLab repo.
Contributor Author

Platform-specific code will be in the pattern repo. There is no Terraform code to put in the internal platform-engineering-on-eks; we just reuse the generic code to create the hub. Specific scripts will also live in the workshop pattern scripts dir, as they only apply to how we deploy the platform, and can be referenced by users wanting to use other patterns as well.

Comment thread docs/UPGRADE-APPROACH.md
│ └── README.md # Workshop deployment guide (references platform-engineering-on-eks)
├── applications/ # UNCHANGED: Sample apps
├── backstage/ # UNCHANGED: Backstage IDP
├── gitops/ # REFACTORED: GitOps configurations
Contributor Author

What is refactored here? Looks UNCHANGED to me.

Comment thread docs/UPGRADE-APPROACH.md Outdated

> **Key changes**:
> - The `platform/infra/terraform/` directory is removed from `appmod-blueprints`. All Terraform code for cluster creation, GitLab PATs, and workshop-specific infra moves to the `platform-engineering-on-eks` internal GitLab repo. The solution repo is purely GitOps-native for ongoing management.
Contributor Author

I think we need to keep this Terraform to create the hub cluster in this repo; it is optional. The workshop and full patterns will use it, while other patterns may use existing clusters. There is no point or advantage in moving this to GitLab.

Contributor

Terraform module to create hub cluster stays in this repo.

Comment thread docs/UPGRADE-APPROACH.md Outdated
> **Key changes**:
> - The `platform/infra/terraform/` directory is removed from `appmod-blueprints`. All Terraform code for cluster creation, GitLab PATs, and workshop-specific infra moves to the `platform-engineering-on-eks` internal GitLab repo. The solution repo is purely GitOps-native for ongoing management.
> - `modules/hub-provisioning/` provides a turnkey Terraform module that customers can `source` from GitHub to provision the hub cluster and bootstrap the platform. After bootstrap, the platform is self-managing.
> - `examples/` is renamed to `patterns/` to better reflect that these are consumption patterns, not just examples. The `workshop/` pattern is a first-class citizen alongside other patterns. Workshop-specific configuration (CloudFront, GitLab, Identity Center) lives in `patterns/workshop/`; heavy workshop orchestration (Terraform, deploy scripts) lives in `platform-engineering-on-eks`.
Contributor Author

There is no examples/ folder in the current setup; what are you referring to?

Collaborator

> There is no examples/ folder in the current setup; what are you referring to?

@allamand the proposal is to have a folder called patterns or blueprints — see the discussion on the Slack channel. Under patterns we can have a workshop folder that will contain workshop-specific content. This addresses the fact that we cannot move all workshop-specific content to the GitLab repo for workshop content.

Comment thread docs/UPGRADE-APPROACH.md Outdated
> **Key changes**:
> - The `platform/infra/terraform/` directory is removed from `appmod-blueprints`. All Terraform code for cluster creation, GitLab PATs, and workshop-specific infra moves to the `platform-engineering-on-eks` internal GitLab repo. The solution repo is purely GitOps-native for ongoing management.
> - `modules/hub-provisioning/` provides a turnkey Terraform module that customers can `source` from GitHub to provision the hub cluster and bootstrap the platform. After bootstrap, the platform is self-managing.
> - `examples/` is renamed to `patterns/` to better reflect that these are consumption patterns, not just examples. The `workshop/` pattern is a first-class citizen alongside other patterns. Workshop-specific configuration (CloudFront, GitLab, Identity Center) lives in `patterns/workshop/`; heavy workshop orchestration (Terraform, deploy scripts) lives in `platform-engineering-on-eks`.
Contributor Author

All workshop-specific patterns should be there, so users can see the full picture of how we did it; don't hide things in the internal repo.

Comment thread docs/UPGRADE-APPROACH.md
eksctl create cluster --name hub --region us-west-2

# 2. Install ArgoCD
helm install argocd argo/argo-cd -n argocd --create-namespace
Contributor Author

Activate EKS capabilities.
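For example, enabling EKS Auto Mode capabilities on an existing cluster might look like the following; the cluster name, region, and the exact flag payloads are assumptions and should be verified against the current AWS CLI reference:

```shell
# Enable Auto Mode compute, block storage, and load balancing
# (payload shapes are assumed for illustration)
aws eks update-cluster-config --name hub --region us-west-2 \
  --compute-config '{"enabled":true,"nodePools":["general-purpose","system"]}' \
  --storage-config '{"blockStorage":{"enabled":true}}' \
  --kubernetes-network-config '{"elasticLoadBalancing":{"enabled":true}}'
```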

Comment thread docs/UPGRADE-APPROACH.md Outdated

### 3.4 Phase 4: Workshop Isolation

**Goal**: Move all workshop-specific code (including ALL Terraform) to the internal `platform-engineering-on-eks` GitLab repo. The `appmod-blueprints` repo becomes a clean, customer-facing, GitOps-native solution with zero Terraform and zero workshop concerns.
Contributor Author

Again, just in the patterns/workshop/ folder, not the internal GitLab.

Comment thread docs/UPGRADE-APPROACH.md Outdated

#### 3.5.2 Target State

- Hub cluster creation is done once by any tool (eksctl, CDK, TF, CLI) — this is outside `appmod-blueprints`
Contributor Author

But we provide a few options there for people who don't have existing clusters.

Comment thread docs/UPGRADE-APPROACH.md
- CloudFront via ACK CloudFront controller (optional)
- Observability via ACK Grafana/Prometheus (optional)
- Pod Identity via native K8s resources
- Spoke clusters are provisioned exclusively via Kro RGDs or CrossPlane compositions from the hub
Contributor Author

I would say this does not matter; users will have the choice to provision them using any tool. We provide a way to do it with Kro/ACK, but that could also be Crossplane, Pulumi, Terraform, eksctl, the console... It does not matter. We just need to show them how to create/register the cluster secret and which IAM role to add in the EKS access entries; then the platform will register it automatically with Argo and bootstrap it as a fleet member.

Comment thread docs/UPGRADE-APPROACH.md Outdated
- Pod Identity via native K8s resources
- Spoke clusters are provisioned exclusively via Kro RGDs or CrossPlane compositions from the hub
- ArgoCD ApplicationSets auto-discover and bootstrap new spokes
- Backstage templates allow self-service spoke creation
Contributor Author

I would also like to add here an agentic way to add clusters, using our solution that uses agents.

So users can either:

  • use Backstage
  • use an agent
  • use native GitOps integration

to create new spoke clusters, or any apps.

@elamaran11
Contributor

elamaran11 commented Mar 13, 2026

@allamand Thanks for the feedback. We will be implementing some of these, but the rest is not part of the tenets we decided upon for this approach. I'm happy to have you as part of this refactor effort. The following feedback is incorporated:

EKS Auto Mode — replaced "node groups" with "EKS Auto Mode (no managed node groups)" throughout; added to design principles, executive summary, hub-config example, cluster stack description, hub-provisioning module, bootstrap guide, and test steps.

Pod Identity, not IRSA — added as design principle #10, added pod_identity: true to hub-config example with explicit "not IRSA" comments. IRSA was not previously referenced in this doc, so the additions make the Pod Identity preference explicit.

Backstage GitOps-only — updated all Backstage template references to emphasize PR/push through GitOps, not direct kubectl apply. Updated Task 3.6, Task 5.3, the Asana tables, and the spoke creation flow.

Cluster generator for ApplicationSets — added "cluster generator pattern" to fleet descriptions in both repo structures, the ApplicationSets change table, and the spoke auto-discovery section.

No examples/ folder — fixed the incorrect "renamed from examples/" references to clarify patterns/ is a new directory.

Activate EKS capabilities — updated the customer bootstrap flow to include aws eks update-cluster-config for enabling Auto Mode capabilities (compute, networking, storage, load balancing) after cluster creation.

elamaran11 marked this pull request as draft on March 13, 2026 at 15:00
elamaran11 and others added 2 commits March 13, 2026 11:10
…tOps-only, cluster generator, activate EKS capabilities, fix examples/ reference
…workshop/

Relocate TF/scripts destination from platform-engineering-on-eks internal
GitLab repo to patterns/workshop/terraform/ and patterns/workshop/scripts/
within appmod-blueprints. The internal repo contains only workshop content
and instructions, not infrastructure code.

Key changes:
- Task 1.5: 'Move TF to external repo' → 'Relocate to patterns/workshop/'
- Task 2.3: simplified dependencies (no longer depends on Epic 4)
- Epic 4: 'Move to external repo' → 'Reorganize within appmod-blueprints'
- Tasks 4.1-4.7: rewritten for internal reorganization
- Target repo structure: added patterns/workshop/terraform/ and scripts/
- All impact tables, risk mitigations, execution order updated
- DESIGN.md: 4 references updated to patterns/workshop/
- References to platform-engineering-on-eks: 87→10 (all content/instructions role)

3 participants