A sophisticated multi-agent system powered by Amazon Nova and LangGraph that automatically designs production-ready AWS cloud architectures from natural language descriptions.
The AWS Architecture Builder decomposes complex architecture design into discrete, specialized tasks using the Single Responsibility Principle (SRP). Each agent is responsible for one concrete task, improving accuracy and reducing hallucination. The system uses a LangGraph-based orchestration framework to maintain state across all agents and enables iterative research loops until sufficient technical depth is achieved.
- Voice Input: Capture project requirements via natural voice using Amazon Nova's real-time audio capabilities
- Autonomous Research: Conduct iterative web research to identify relevant AWS services
- Architecture Design: Generate comprehensive, production-ready cloud architectures with high-availability and scaling strategies
- Visual Representation: Create both ASCII and Mermaid flowchart diagrams of the architecture
- Cost Estimation: Estimate monthly AWS costs based on realistic production configurations
- Human-in-the-Loop: Support for human review and approval at critical decision points
- Python >= 3.13
- uv package manager
Create a .env file in the project root with the following variables:
# Amazon Nova API Configuration
NOVA_BASE_URL=https://api.nova.amazon.com/v1
NOVA_API_KEY=your_api_key_here
# Optional: For additional services
# Add other AWS credentials or service endpoints as needed

Required Environment Variables:
- NOVA_BASE_URL: The endpoint URL for Amazon Nova models (Bedrock or https://api.nova.amazon.com/v1)
- NOVA_API_KEY: Your API key for accessing Amazon Nova through the OpenAI-compatible interface
uv run uvicorn server:app --reload --port 8080

Output:
INFO: Uvicorn running on http://127.0.0.1:8080
INFO: Application startup complete
The API will be available at http://localhost:8080
- Frontend: http://localhost:8080/
- API docs: http://localhost:8080/docs
For interactive debugging and visualization of the agent graph:
uv run langgraph dev

This starts LangGraph Studio, allowing you to:
- Visualize the agent workflow in real-time
- Inspect state at each node
- Test different inputs
- Debug conditional edges and routing logic
The system follows a multi-stage pipeline with intelligent routing:
START
├── Router (determines entry point)
│   ├── New Project: cloud_architect
│   └── Update Request: Smart routing based on change scope
│
├── RESEARCH LOOP (can iterate up to 3 times)
│   ├── cloud_architect (generates research questions)
│   ├── browser (web search using DuckDuckGo)
│   ├── research_analyst (extracts AWS services & findings)
│   └── guardian (decides: research more or proceed to design?)
│
└── ARCHITECTURE PIPELINE
    ├── solution_architect (designs architecture)
    ├── blueprint_generator (ASCII diagram)
    ├── designer (Mermaid visualization)
    ├── summarizer (markdown documentation)
    └── cost_estimator (monthly cost projection)
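The routing above can be sketched as a plain dispatch loop. This is a hypothetical, simplified stand-in (the real system wires these nodes into a LangGraph StateGraph with conditional edges); node names match the diagram, and the loop/threshold logic is illustrative:

```python
from typing import Callable, Dict

Node = Callable[[dict], dict]

def run_pipeline(state: dict, nodes: Dict[str, Node]) -> dict:
    # RESEARCH LOOP: architect -> browser -> analyst, up to 3 iterations.
    while True:
        for name in ("cloud_architect", "browser", "research_analyst"):
            state.update(nodes[name](state))   # each node returns a partial update
        state["iteration"] = state.get("iteration", 0) + 1
        if state["iteration"] >= 3 or len(state.get("services", [])) >= 3:
            break                              # guardian decides to proceed
    # ARCHITECTURE PIPELINE: straight-line stages.
    for name in ("solution_architect", "blueprint_generator",
                 "designer", "summarizer", "cost_estimator"):
        state.update(nodes[name](state))
    return state
```

Each node only touches its own slice of state, which is what keeps the real graph composable.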
Responsible For: Initial research question generation
- Input: Project description or change request
- Task:
- For new projects: Imagine a plausible general solution architecture and generate 4 browser-searchable technical questions
- For updates: Generate focused questions specific to the change scope
- Output: Array of search queries
- LLM Used: nova-2-lite-v1 (fast, lightweight reasoning)
- Pattern: Single Responsibility - question generation only
Example Output:
How does AWS EKS handle auto-scaling with mixed instance types?
What are the networking requirements for VPC endpoints with S3 buckets?
How to implement cross-region failover for RDS databases?
Responsible For: Web search execution
- Input: Array of search queries
- Task: Execute DuckDuckGo searches for each unprocessed query
- Output: Array of search results (title + body pairs)
- Tool Used: DDGS (DuckDuckGo Search library)
- Pattern: Tool integration - pure search execution
- Note: Skips already-processed queries to avoid redundant searches
Implementation Details:
# Searches up to 2 results per query
# Formats as "Title: Body" for downstream processing
# Appends to existing results for iterative loops

Responsible For: Knowledge extraction from search results
- Input: Raw search results
- Task:
- Extract all mentioned Amazon Web Services (EC2, EKS, S3, RDS, Lambda, etc.)
- Extract critical technical findings (patterns, configurations, constraints, limits, pricing)
- Ensure completeness and precision - architect depends solely on this output
- Output: Structured JSON with:
  - services: List of AWS services
  - technical_findings: Atomic, actionable insights
- LLM Used: nova-pro-v1 (deep work model for comprehensive analysis)
- Pattern: Critic Pattern - includes validation/review to ensure output structure correctness
- Output Parsing: Pydantic AnalysisResult schema with built-in validation
Key Prompting Strategy:
- Emphasizes exhaustiveness: "Do not omit any service or constraint mentioned"
- Demands atomic findings: "Extract specific, actionable technical insights"
- Enforces deduplicated output: "Ensure no duplicate services"
Responsible For: Research loop termination decision
- Input: Current state (analysis iteration, services count, findings count)
- Task: Decide whether to loop back for more research or proceed to architecture design
- Decision Criteria:
  - Minimum iterations check: analysis_iteration < analysis_max_iterations (default: 3)
  - Minimum searches analyzed: >= 5
  - Minimum services identified: >= 3
  - Minimum technical findings: >= 5
- Output: "cloud_architect" (loop back) or "solution_architect" (proceed)
- Pattern: Deterministic validation - threshold-based gating
Rationale: Ensures sufficient research depth before committing to architecture design, reducing redesign loops.
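One plausible reading of the criteria above, sketched as a deterministic gate (state keys follow the SolutionState definition later in this README; the real guardian implementation may combine the thresholds differently):

```python
def guardian(state: dict) -> str:
    """Decide whether to loop back for more research or proceed to design."""
    exhausted = state["analysis_iteration"] >= state["analysis_max_iterations"]
    sufficient = (len(state["search_results"]) >= 5
                  and len(state["services"]) >= 3
                  and len(state["technical_findings"]) >= 5)
    if exhausted or sufficient:
        return "solution_architect"  # proceed to architecture design
    return "cloud_architect"         # loop back for more research
```

Because the gate is threshold-based rather than LLM-based, its behavior is cheap, reproducible, and easy to unit-test.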
Responsible For: Production-ready architecture design
- Input:
- Project description
- Extracted AWS services and technical findings
- (If update: existing architecture + change request)
- Task:
- Design coherent, buildable AWS architecture
- Map services to roles and selection justifications
- Define component interactions
- Design scaling, HA, and deployment strategies
- Document assumptions and tradeoffs
- Output: Structured JSON with:
  - architecture_overview: Description of the overall architecture
  - service_mapping: Dict of service → {role, reason_for_selection}
  - component_interactions: Array of interaction descriptions
  - scaling_strategy, high_availability_strategy, deployment_strategy
  - assumptions, tradeoffs
- LLM Used: nova-pro-v1 (deep work for complex design)
- Pattern: Critic Pattern - Pydantic SolutionArchitecture schema validates structure
- Update Logic: For change requests, modifies only affected components; preserves unaffected services
Architecture Principles Embedded:
- Production-grade configuration (not hobby-tier)
- HA duplication if required by strategy
- Total cost consideration (e.g., multiple instances counted)
- Service integration coherence
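The output contract above can be sketched with stdlib dataclasses (the project itself uses a Pydantic SolutionArchitecture schema; field names below mirror the output list, while the class bodies are a simplified stand-in):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ServiceDetails:
    role: str
    reason_for_selection: str

@dataclass
class SolutionArchitecture:
    architecture_overview: str
    service_mapping: Dict[str, ServiceDetails]
    component_interactions: List[str]
    scaling_strategy: str
    high_availability_strategy: str
    deployment_strategy: str
    assumptions: List[str] = field(default_factory=list)
    tradeoffs: List[str] = field(default_factory=list)
```

Pinning the LLM to a typed structure like this is what lets downstream agents (blueprint, designer, summarizer, cost estimator) consume the design without re-parsing free text.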
Responsible For: ASCII diagram generation
- Input:
- List of AWS services
- Component interactions
- Task: Generate clear, text-based ASCII architecture diagram
- Represent services as rectangular boxes
- Show data flows with arrows (→, ↓)
- Keep simple and readable
- Output: ASCII art string
- LLM Used: nova-pro-v1 (visual reasoning for diagram layout)
- Pattern: Creative output generation
Example ASCII Output:
+------------------+ +------------------------+
| EC2 Web |----->| Load Balancer |
| Server | | (Traffic Dist) |
+------------------+ +------------------------+
| |
v v
+------------------+ +------------------+
| S3 Storage | | RDS Database |
+------------------+ +------------------+
Responsible For: Mermaid flowchart generation
- Input: ASCII blueprint from blueprint_generator
- Task:
- Convert ASCII diagram into valid Mermaid v10+ flowchart syntax
- Validate Mermaid syntax safety rules:
- Node labels with special chars wrapped in quotes
- No HTML tags or emojis
- Unique alphanumeric node IDs
- Only --> for directional flow
- Group related components in subgraphs (e.g., VPC, private networks)
- Output: Raw Mermaid code (no markdown fences)
- LLM Used: nova-pro-v1
- Pattern: Critic Pattern - clean_mermaid() function removes markdown artifacts
- Safety: Strict output validation ensures rendering correctness
Mermaid Syntax Example:
graph TD
A["API Gateway"]
B["EC2 (Frontend & Backend)"]
C["RDS Database"]
A --> B
B --> C
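A plausible sketch of the clean_mermaid() reviewer step (the actual implementation may differ): strip any markdown fences the LLM wraps around its output so the raw Mermaid code renders directly:

```python
def clean_mermaid(raw: str) -> str:
    """Strip markdown code fences the LLM may wrap around Mermaid output."""
    lines = [ln for ln in raw.strip().splitlines()
             if not ln.strip().startswith("```")]
    return "\n".join(lines).strip()
```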
Responsible For: Human-readable documentation
- Input: Complete architecture state
- Task: Synthesize architecture data into structured Markdown
- Output: Markdown document with sections:
- Architecture Overview
- Cloud Services (with roles and selection justifications)
- Component Interactions
- Scaling Strategy
- High Availability Strategy
- Deployment Strategy
- Assumptions
- Tradeoffs
- LLM Used: None (pure data formatting)
- Pattern: Templating and data assembly
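Since the summarizer is pure templating, it can be sketched without an LLM. Section names follow the list above; the state keys are SolutionState fields, and the exact formatting is illustrative:

```python
def summarize(state: dict) -> str:
    """Assemble architecture state into a Markdown document."""
    parts = ["# Architecture Overview", state["architecture_overview"],
             "## Cloud Services"]
    for service, details in state["service_mapping"].items():
        parts.append(f"- **{service}** ({details['role']}): "
                     f"{details['reason_for_selection']}")
    parts.append("## Component Interactions")
    parts.extend(f"- {line}" for line in state["component_interactions"])
    for title, key in [("Scaling Strategy", "scaling_strategy"),
                       ("High Availability Strategy", "high_availability_strategy"),
                       ("Deployment Strategy", "deployment_strategy")]:
        parts += [f"## {title}", state[key]]
    return "\n\n".join(parts)
```

Keeping this node LLM-free makes the documentation step deterministic and free.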
Responsible For: Monthly AWS cost projection
- Input: Complete architecture with services and strategies
- Task:
- Identify appropriate AWS plan/tier for each service
- Estimate realistic production configurations (CPU, RAM, storage, bandwidth, replicas)
- Calculate monthly costs in USD
- Justify sizing decisions based on architecture assumptions
- Handle usage-based pricing (bandwidth, storage) with reasonable estimates
- Include HA duplication if required
- Output: Structured JSON with:
  - services: Dict of service → array of pricing plans
  - total_monthly_cost: Sum in USD
  - summary: Executive explanation of cost drivers
- LLM Used: nova-pro-v1 (complex financial reasoning)
- Pattern: Human-in-the-Loop - interrupt() pauses execution for user approval before cost calculation
- Output Parsing: Pydantic PricingModel validates structure
Design Notes:
- TODO: Integration with live AWS pricing API
- Currently uses LLM knowledge for cost estimates
- Includes HA cost multipliers
- Assumes production-grade (not hobby-tier)
The single source of truth passed between all agents. Organized by processing stage:
class SolutionState(TypedDict):
    # Input
    description: str                 # Project requirements
    update_request: str              # Optional change request

    # Research Phase
    search_queries: List[str]        # Questions to research
    processed_queries: List[str]     # Already searched
    search_results: List[str]        # Raw search results
    analyzed_searches: List[str]     # Fingerprints of analyzed results
    analysis: str                    # Consolidated analysis
    technical_findings: List[str]    # Atomic insights
    services: List[str]              # Extracted AWS services
    analysis_iteration: int          # Current loop count
    analysis_max_iterations: int     # Max iterations (default: 3)

    # Architecture Design Phase
    architecture_overview: str
    service_mapping: Dict[str, ServiceDetails]
    component_interactions: List[str]
    scaling_strategy: str
    high_availability_strategy: str
    deployment_strategy: str
    assumptions: List[str]
    tradeoffs: List[str]

    # Visualization Phase
    architecture_diagram: str        # ASCII diagram
    architecture_design_code: str    # Mermaid code

    # Output Phase
    summary: str                     # Markdown documentation
    cost_estimate_result: int        # Monthly cost
    cost_estimate_breakdown: Dict    # Service-level costs
    cost_estimate_summary: str       # Cost explanation

1. Router Edge (START → conditional)
- Decision Logic: Routes based on new project vs. update request
- Nodes:
  - "cloud_architect" → New project flow
  - "solution_architect" → No research needed
  - "blueprint_generator" → Only regenerate diagram
  - "summarizer" → Only regenerate summary
- Logic: Uses LLM to classify change scope and route accordingly
2. Guardian Edge (research_analyst β conditional)
- Decision Logic: Threshold-based research completion check
- Nodes:
  - "cloud_architect" → Loop for more research
  - "solution_architect" → Proceed to design
Research Loop:
cloud_architect → browser → research_analyst → (Guardian decides)
Architecture Pipeline:
solution_architect → blueprint_generator → designer → summarizer → cost_estimator → END
Each agent owns one concrete responsibility:
- Cloud Architect = Question generation
- Browser = Search execution
- Research Analyst = Knowledge extraction
- Solution Architect = Architecture design
- Blueprint Generator = ASCII diagram
- Designer = Mermaid conversion
- Summarizer = Markdown compilation
- Cost Estimator = Cost calculation
Benefit: Reduced hallucination, improved accuracy, easier debugging and updating.
Used in high-difficulty tasks to fix structural issues at the node level:
Applied In:
- Research Analyst: Uses Pydantic AnalysisResult output parser to validate extracted services and findings
- Solution Architect: Uses Pydantic SolutionArchitecture parser to ensure all required fields are present and correctly typed
- Designer: Uses clean_mermaid() reviewer method to strip markdown artifacts and ensure valid Mermaid syntax
- Cost Estimator: Uses Pydantic PricingModel parser with strict validation
Pattern Implementation:
from langchain_core.output_parsers import PydanticOutputParser
_parser = PydanticOutputParser(pydantic_object=AnalysisResult)
format_instructions = _parser.get_format_instructions() # Auto-generate format guide
parsed = _parser.parse(response)  # Auto-validate and retry if needed

Benefits:
- Structured Output: Guarantees valid JSON/data structures
- Automatic Validation: Pydantic catches schema mismatches
- Prompt Engineering: Format instructions embedded in prompts guide LLM output
- Error Recovery: Failed parses can trigger retries within the node
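The retry behavior can be sketched as a small wrapper. This is illustrative: the project uses PydanticOutputParser, which is simulated here by any parse function that raises ValueError on malformed output:

```python
def parse_with_retries(generate, parse, max_attempts=3):
    """Call the LLM (generate) and validate its output (parse); retry on failure.

    generate: attempt number -> raw LLM text
    parse: raw text -> validated object, raising ValueError when malformed
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return parse(generate(attempt))
        except ValueError as exc:
            last_error = exc  # e.g. feed format errors back into the next prompt
    raise RuntimeError(f"parsing failed after {max_attempts} attempts") from last_error
```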
Problem: One web search round may not find all relevant AWS services and constraints.
Solution: Guardian-gated loop allowing up to 3 iterations:
- Iteration 1: Initial broad research
- Iteration 2: Deeper investigation based on findings
- Iteration 3: Final refinement (optional)
Termination Conditions (any trigger exit):
- analysis_max_iterations reached
- Sufficient services found (>= 3)
- Sufficient findings documented (>= 5)
- Sufficient searches analyzed (>= 5)
State Management: Loop preserves previous searches and findings, appends new results.
Three-tier model selection based on task requirements:
| Model | Use Case | Capability | Cost |
|---|---|---|---|
| nova-micro-v1 | Classification, routing, labeling | Pure text generation, lightweight reasoning | Lowest |
| nova-2-lite-v1 | Fast tasks, question generation | Question generation, lightweight analysis | Low |
| nova-pro-v1 | Deep work, design, analysis | Architecture design, complex reasoning, cost estimation | Higher |
Pattern: Use minimal model required for task, reducing latency and cost.
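The tiering idea can be sketched as a simple mapping, assuming the three model IDs from the table (the real mapping lives in config.py; tier names here are illustrative):

```python
# Map task difficulty to the cheapest sufficient Nova model.
MODEL_TIERS = {
    "routing": "nova-micro-v1",     # classification, labeling
    "questions": "nova-2-lite-v1",  # fast generation, light analysis
    "design": "nova-pro-v1",        # architecture, cost reasoning
}

def pick_model(task: str) -> str:
    try:
        return MODEL_TIERS[task]
    except KeyError:
        raise ValueError(f"unknown task tier: {task!r}")
```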
The cost_estimator node uses interrupt() to pause execution before cost calculation:
from langgraph.types import interrupt
def cost_estimator(state: SolutionState):
    logger.info("💰 Cost Estimator Node")
    interrupt({"question": "Get a cost estimate."})
    # ... cost calculation ...

Benefits:
- User can review architecture before committing to cost analysis
- Allows for approval/rejection of design
- Creates checkpoint for resuming later
- Graph must be compiled with MemorySaver checkpointer for this to work
Invocation:
_graph = build_graph(use_checkpointer=True) # Enable interrupts
# ... graph runs, pauses at interrupt ...
resume_cost_estimation(_graph, config)  # User resumes

For architecture changes, the system intelligently routes to the appropriate node:
_ROUTABLE_NODES = {
    "cloud_architect": "Research needed - unfamiliar services, new integrations",
    "solution_architect": "No research - change uses known services",
    "blueprint_generator": "Only diagram needs regeneration",
    "summarizer": "Only summary needs regeneration",
}

Decision Process:
- User provides update_request
- Router LLM classifies change scope
- Routes to appropriate node
- Only affected components are recomputed
- Unaffected state (existing architecture, services) is preserved
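In the project, this classification is done by the router LLM; a deterministic keyword fallback can sketch the same routing contract. The keyword rules below are invented for illustration only:

```python
def route_update(update_request: str) -> str:
    """Illustrative keyword fallback for the LLM router: pick the cheapest
    node that can satisfy the change request."""
    text = update_request.lower()
    if "diagram" in text:
        return "blueprint_generator"  # only regenerate the diagram
    if "summary" in text or "documentation" in text:
        return "summarizer"           # only regenerate the summary
    if "add" in text or "new" in text:
        return "cloud_architect"      # unfamiliar services: research first
    return "solution_architect"       # known services: redesign directly
```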
- LangGraph (>= 1.1.1): Multi-agent orchestration and state management
- FastAPI (>= 0.135.1): REST API backend
- Uvicorn (>= 0.41.0): ASGI server
- Amazon Nova via OpenAI-compatible API:
  - nova-micro-v1: Classification and routing
  - nova-2-lite-v1: Lightweight analysis
  - nova-pro-v1: Deep work and design
- LangChain (langchain-openai >= 1.1.11): LLM abstractions and output parsing
- Pydantic: Structured output validation
- DDGS (>= 9.11.3): DuckDuckGo web search
- PyAudio (>= 0.2.14): Audio input for voice intake
- WebSockets (>= 16.0): Real-time communication
- Streamlit (>= 1.55.0): Frontend framework (optional UI)
- streamlit-mermaid (>= 0.3.0): Mermaid diagram rendering
- Pydantic: Type-safe data models and validation
- python-dotenv (>= 1.2.2): Environment variable management
- Python (>= 3.13): Latest stable version
AmazonNovaHack/
├── src/
│   ├── agents/                    # Multi-agent implementations
│   │   ├── __init__.py            # Export all agents
│   │   ├── cloud_architect.py     # Question generation
│   │   ├── browser.py             # Web search execution
│   │   ├── research_analyst.py    # Knowledge extraction
│   │   ├── solution_architect.py  # Architecture design
│   │   ├── blueprint_generator.py # ASCII diagrams
│   │   ├── designer.py            # Mermaid conversion
│   │   ├── summarizer.py          # Markdown compilation
│   │   └── cost_estimator.py      # Cost projection
│   │
│   ├── models/                    # Data models & schemas
│   │   ├── __init__.py
│   │   ├── state.py               # SolutionState (TypedDict)
│   │   └── schemas.py             # Pydantic schemas for output validation
│   │
│   ├── tools/                     # Utility tools
│   │   ├── __init__.py
│   │   └── search.py              # DuckDuckGo web search wrapper
│   │
│   ├── nova/                      # Voice & Amazon Nova integration
│   │   ├── __init__.py
│   │   └── voice_intake.py        # Real-time voice session management
│   │
│   ├── frontend/                  # Frontend UI
│   │   └── index.html             # Web interface
│   │
│   ├── config.py                  # LLM configuration (micro, lite, pro)
│   ├── graph.py                   # LangGraph orchestration logic
│   ├── server.py                  # FastAPI backend
│   └── langgraph.json             # LangGraph metadata
│
├── pyproject.toml                 # Python project config
├── .env.example                   # Environment template
├── README.md                      # Project documentation (this file)
└── uv.lock                        # Dependency lock file
- config.py: Initializes three LLM instances with different capabilities:
  - micro_llm: Pure text generation for routing/classification
  - fast_llm: Lightweight analysis and question generation
  - llm: Deep work for architecture design and analysis
  - Handles OpenAI-compatible API endpoints and custom HTTP client (disables compression)
- graph.py:
  - Defines SolutionState type
  - Implements router() conditional edge for entry point selection
  - Implements guardian() conditional edge for research loop termination
  - Constructs full LangGraph workflow
  - Provides invocation helpers: build_solution(), request_change(), resume_cost_estimation()
  - Exports compiled studio_graph
- server.py:
  - FastAPI backend with CORS middleware
  - Endpoints:
    - POST /api/build - Initial architecture build
    - POST /api/change - Request architecture change
    - POST /api/cost - Resume cost estimation
    - POST /api/intake/start - Start voice session
    - GET /api/intake/status - Poll voice completion
    - POST /api/intake/cancel - Cancel voice session
    - GET / - Serve frontend
  - Session management with debug replay capabilities
  - Lifespan management for graph initialization
- models/state.py: TypedDict definition of SolutionState with all fields documented
- models/schemas.py: Pydantic schemas for output validation:
  - AnalysisResult: Services and findings from research
  - ServiceDetails: Role and justification for each AWS service
  - SolutionArchitecture: Complete architecture structure
  - PricePlan: Individual pricing tier
  - PricingModel: Complete cost breakdown
- Each agent file exports a single function with signature:
def agent_name(state: SolutionState) -> dict
- Returns partial state update (only modified fields)
- LangGraph merges updates back into global state
- tools/search.py: DuckDuckGo search wrapper
  - web_search(query, max_results=2) → List[str]
  - Returns "Title: Body" formatted results
  - Error handling with logging
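A sketch of that wrapper, assuming the DDGS text-search API returns dicts with title and body keys (error handling and logging elided; the deferred import keeps the formatter usable without the dependency installed):

```python
def format_results(results: list[dict]) -> list[str]:
    """Render raw search hits as the "Title: Body" strings downstream agents expect."""
    return [f"{r['title']}: {r['body']}" for r in results]

def web_search(query: str, max_results: int = 2) -> list[str]:
    # Assumes the DDGS text-search API (ddgs package).
    from ddgs import DDGS
    return format_results(DDGS().text(query, max_results=max_results))
```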
- nova/voice_intake.py: Real-time voice session management
  - VoiceIntake class with methods:
    - start() - Begin listening with Nova Sonic audio processing
    - get_description() - Block until session ends and return text
    - cancel() - Discard current session
  - Audio constants: 24kHz sample rate, mono, 1024-byte chunks
  - System instruction: Prompt Nova to collect project description concisely
  - Threading-based playback and recording
- frontend/index.html: Web UI (minimal template provided)
- .env.example: Template for environment variables
- langgraph.json: LangGraph metadata for studio/deployment
- pyproject.toml: Project metadata and dependencies
# Start the server
uvicorn server:app --reload --port 8000

# In client code or via API:
curl -X POST http://localhost:8000/api/build \
  -H "Content-Type: application/json" \
  -d '{"description": "E-commerce platform with millions of daily users"}'

curl -X POST http://localhost:8000/api/change \
  -H "Content-Type: application/json" \
  -d '{"update_request": "Add real-time notifications using WebSockets"}'

# Start voice session
curl -X POST http://localhost:8000/api/intake/start

# Poll for completion
curl http://localhost:8000/api/intake/status

# Once complete, system extracts description and calls build

- State Management: Maintains complex state across 8+ agents without spaghetti code
- Declarative Graph: Clear visualization of data flow and agent relationships
- Conditional Routing: Intelligent edge selection based on state (Guardian pattern)
- Checkpointing: Enables human-in-the-loop interrupts and resumable execution
- Production Ready: Built for scalable multi-agent systems
- Cost Optimization: Use micro for routing (cheapest), pro for deep work (necessary)
- Latency Optimization: Lightweight tasks don't need pro's compute
- Quality vs. Speed: Match model capability to task difficulty
- Completeness: First search rarely finds all relevant AWS services and constraints
- Guided Iteration: Guardian ensures we stop when sufficient information is gathered
- Cost Effective: Early termination if findings are rich
- Reliability: Pydantic validation catches LLM hallucinations and formatting errors
- Debuggability: Clear error messages when LLM output doesn't match schema
- Recovery: Allows retry logic within nodes
- Accuracy: Each agent specializes in one task, reducing cross-talk hallucination
- Composability: Agents are reusable and testable in isolation
- Maintenance: Changes to research don't affect design logic
Built with ❤️ using Amazon Nova and LangGraph