This document describes the implementation of a scalable multi-agent system where each Recursor hackathon participant is represented by an Agent Stack containing 4 specialized sub-agents that work together autonomously.
Each participant = 1 agent stack with 4 sub-agents:
- Planner Agent
  - Strategic planning and roadmap creation
  - Todo management and prioritization
  - Decision-making about project direction
  - Receives and acts on Reviewer's advice
- Builder Agent
  - Executes todos from the Planner
  - Builds working prototypes
  - Generates single-file HTML/JS applications
  - Manages artifact versioning
- Communicator Agent
  - Handles inter-agent messaging
  - Processes broadcasts and direct messages
  - Collects feedback from other agents and visitors
  - Summarizes insights for the Reviewer
- Reviewer Agent
  - Analyzes project progress and quality
  - Reviews feedback collected by Communicator
  - Generates recommendations for the Planner
  - Identifies risks and opportunities
- Language: TypeScript
- Package Manager: pnpm (Turborepo workspace)
- Location: `packages/agent-engine/`
- Convex (Primary Backend - Sponsor)
  - Real-time reactive database
  - Automatic client synchronization (<1s latency)
  - TypeScript-first with auto-generated types
  - Built-in serverless functions
  - Scheduled functions for agent tick loops
  - No infrastructure management
- Groq: Primary provider (fast inference, cost-effective)
- OpenAI: Fallback for complex reasoning
- Gemini: Alternative provider for diversity
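The provider ordering above can be sketched as a simple fallback chain. This is a minimal illustration rather than the engine's actual code: `LlmProvider`, `completeWithFallback`, and `FallbackResult` are hypothetical names, and the real Groq, OpenAI, or Gemini client calls would sit behind the `complete` method.

```typescript
// Hypothetical provider abstraction; a real SDK call would live inside `complete`.
interface LlmProvider {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface FallbackResult {
  provider: string; // name of the provider that answered
  text: string;
  errors: string[]; // failures recorded before a provider succeeded
}

// Try providers in the given order (for example Groq, then OpenAI, then Gemini)
// and return the first successful completion.
async function completeWithFallback(
  providers: LlmProvider[],
  prompt: string,
): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text, errors };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Keeping the order in a plain array means the primary provider can be swapped per agent type without touching the retry logic.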
- Custom memory with Convex
  - Short-term: `current_context` field (active task, recent messages, focus)
  - Long-term: `memory` field (learned facts, accumulated learnings)
  - Real-time sync across all agents
  - No external memory service needed (mem0 optional for future)
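To make the short-term versus long-term split concrete, here is a small in-memory sketch. The field shapes, the helper names `rememberMessage` and `learnFact`, and the cap of 10 recent messages are all assumptions for illustration; the source only names the `current_context` and `memory` fields.

```typescript
// Assumed shape of the short-term `current_context` field.
interface CurrentContext {
  active_task?: string;
  focus?: string;
  recent_messages: string[];
}

// Assumed shape of the long-term `memory` field.
interface LongTermMemory {
  facts: string[];
  learnings: string[];
}

const MAX_RECENT_MESSAGES = 10; // assumed cap, not taken from the source

// Short-term: keep only the most recent messages so prompts stay small.
function rememberMessage(ctx: CurrentContext, message: string): CurrentContext {
  const recent = [...ctx.recent_messages, message].slice(-MAX_RECENT_MESSAGES);
  return { ...ctx, recent_messages: recent };
}

// Long-term: append a fact once, ignoring exact duplicates.
function learnFact(memory: LongTermMemory, fact: string): LongTermMemory {
  if (memory.facts.indexOf(fact) !== -1) return memory;
  return { ...memory, facts: [...memory.facts, fact] };
}
```

Both helpers return new objects, which fits a model where the updated value is written back to the agent's state row in a single mutation.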
agent_stacks
- One per participant
- Fields: participant_name, phase, created_at
agent_states
- 4 per stack (one for each agent type)
- Fields: stack_id, agent_type, memory (long-term), current_context (short-term), updated_at
- Index: by_stack
project_ideas
- Mapped 1:1 to each participant
- Fields: stack_id, title, description, status, created_by, created_at
- Index: by_stack
todos
- Scoped to each participant via stack_id
- Fields: stack_id, content, status, assigned_by, priority, created_at, completed_at
- Indexes: by_stack, by_status
messages
- Global broadcasts (to_stack_id: null) OR direct participant-to-participant
- Fields: from_stack_id, to_stack_id, from_agent_type, content, message_type, read_by[], created_at
- Indexes: by_recipient, by_sender, broadcasts
artifacts
- Build outputs linked to participant
- Fields: stack_id, type, version, content, url, metadata, created_at
- Index: by_stack
- Versioning: auto-incremented per stack
agent_traces
- Observability logging
- Fields: stack_id, agent_type, thought, action, result, timestamp
- Indexes: by_stack, by_time
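The tables above can be mirrored as plain TypeScript types, which is a convenient way to check the "4 per stack" invariant. This is a sketch under assumptions: the field types, the lowercase agent type strings, and the `initialAgentStates` helper are illustrative, since the source lists only field names.

```typescript
// Assumed string values for the four agent types.
type AgentType = "planner" | "builder" | "communicator" | "reviewer";

const AGENT_TYPES: AgentType[] = ["planner", "builder", "communicator", "reviewer"];

// agent_stacks: one row per participant.
interface AgentStack {
  participant_name: string;
  phase: string;
  created_at: number; // assumed epoch milliseconds
}

// agent_states: four rows per stack, one per agent type.
interface AgentState {
  stack_id: string;
  agent_type: AgentType;
  memory: Record<string, unknown>;          // long-term
  current_context: Record<string, unknown>; // short-term
  updated_at: number;
}

// todos: scoped to a participant via stack_id.
interface Todo {
  stack_id: string;
  content: string;
  status: string;
  assigned_by: string;
  priority: number;
  created_at: number;
  completed_at?: number;
}

// Hypothetical helper: build the four empty state rows for a new stack.
function initialAgentStates(stackId: string, now: number): AgentState[] {
  return AGENT_TYPES.map((agentType) => ({
    stack_id: stackId,
    agent_type: agentType,
    memory: {},
    current_context: {},
    updated_at: now,
  }));
}
```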
- Global Message Buffer (Broadcasts)
  - Any agent can post to global channel
  - Stored with `to_stack_id: null` and `message_type: 'broadcast'`
  - All agents can query broadcasts using Convex index
  - Use case: General announcements, hackathon-wide updates
- Participant-to-Participant Direct Messages
  - Messages with specific `to_stack_id`
  - Indexed by recipient for fast querying
  - `read_by` array tracks which agents have seen it
  - Use case: Collaboration requests, specific feedback
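The two message kinds and the `read_by` tracking can be modeled in memory as follows. This is only a sketch: the `id` field, the choice to hide a stack's own broadcasts from itself, and the assumption that `read_by` holds stack ids are not stated in the source, and the function names are hypothetical stand-ins for the Convex queries.

```typescript
interface Message {
  id: string; // assumed identifier for illustration
  from_stack_id: string;
  to_stack_id: string | null; // null marks a global broadcast
  from_agent_type: string;
  content: string;
  message_type: string;
  read_by: string[]; // assumed to hold stack ids
  created_at: number;
}

function isUnreadFor(message: Message, stackId: string): boolean {
  return message.read_by.indexOf(stackId) === -1;
}

// Broadcasts this stack has not seen yet (its own broadcasts are skipped by assumption).
function unreadBroadcasts(messages: Message[], stackId: string): Message[] {
  return messages.filter(
    (m) => m.to_stack_id === null && m.from_stack_id !== stackId && isUnreadFor(m, stackId),
  );
}

// Direct messages addressed to this stack that it has not seen yet.
function unreadDirects(messages: Message[], stackId: string): Message[] {
  return messages.filter((m) => m.to_stack_id === stackId && isUnreadFor(m, stackId));
}

// Returns a copy with the reader recorded once in read_by.
function markAsRead(message: Message, stackId: string): Message {
  if (!isUnreadFor(message, stackId)) return message;
  return { ...message, read_by: [...message.read_by, stackId] };
}
```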
// Communicator reads unread messages
const broadcasts = await messaging.getBroadcasts(stackId);
const directs = await messaging.getDirectMessages(stackId);
// Process and respond
await messaging.sendBroadcast(stackId, agentType, response);
// OR
await messaging.sendDirect(fromStackId, toStackId, agentType, message);
// Mark as read
await messaging.markAsRead(messageId, stackId);
- Approach: Builder generates complete single-file applications
- Format: HTML with inline CSS and JavaScript
- Storage: Directly in Convex `artifacts.content` field
- Hosting: Can be served from Convex or via data URLs for iframe embedding
- Benefits:
  - Simple to generate with LLMs
  - Easy to version and diff
  - No external dependencies
  - Fast iteration
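As an illustration of the single-file approach, the sketch below assembles inline CSS and JavaScript into one HTML string and wraps it in a data URL suitable for an iframe `src`. The `SingleFileApp` shape and the function names are hypothetical; in the real Builder, the parts would come from LLM output.

```typescript
// Hypothetical container for the pieces of a generated app.
interface SingleFileApp {
  title: string;
  css: string;
  body: string; // HTML markup for the page body
  js: string;
}

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Produce one self-contained HTML document with inline CSS and JavaScript.
function buildSingleFileHtml(app: SingleFileApp): string {
  return [
    "<!DOCTYPE html>",
    '<html lang="en">',
    "<head>",
    '<meta charset="utf-8">',
    `<title>${escapeHtml(app.title)}</title>`,
    `<style>${app.css}</style>`,
    "</head>",
    `<body>${app.body}<script>${app.js}</script></body>`,
    "</html>",
  ].join("\n");
}

// Encode the document as a data URL for iframe embedding.
function toDataUrl(html: string): string {
  return "data:text/html;charset=utf-8," + encodeURIComponent(html);
}
```

Because the whole app is one string, it can be stored as-is in `artifacts.content` and compared between versions with an ordinary text diff.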
- External platforms: v0.dev, Lovable, Replit
- Multi-file projects stored as zip or separate entries
- Rich metadata tracking (tech stack, build time, dependencies)
- Each build creates new artifact entry with incremented `version`
- Query latest: `artifacts.withIndex("by_stack").filter(stack_id).order("desc")`
- Enables rollback and comparison
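The per-stack version counter can be sketched with an in-memory list standing in for the `artifacts` table. The helper names, the `"html"` type value, and starting versions at 1 are assumptions made for this example.

```typescript
interface Artifact {
  stack_id: string;
  type: string;
  version: number;
  content: string;
  created_at: number;
}

// Highest-version artifact for a stack, or undefined if it has none.
function latestArtifact(artifacts: Artifact[], stackId: string): Artifact | undefined {
  let latest: Artifact | undefined;
  for (const artifact of artifacts) {
    if (artifact.stack_id !== stackId) continue;
    if (latest === undefined || artifact.version > latest.version) latest = artifact;
  }
  return latest;
}

// Append a new build whose version is one above the stack's latest (1 for the first build).
function recordBuild(
  artifacts: Artifact[],
  stackId: string,
  content: string,
  now: number,
): Artifact[] {
  const latest = latestArtifact(artifacts, stackId);
  const version = latest ? latest.version + 1 : 1;
  return [...artifacts, { stack_id: stackId, type: "html", version, content, created_at: now }];
}
```

Rollback then amounts to reading an older entry by version, since earlier builds are never overwritten.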
Each tick executes agents in sequence:
async tick() {
  // 1. Planner evaluates state, creates/updates plan
  const plannerThought = await planner.think();
  // 2. Builder executes next todo item
  const builderResult = await builder.think();
  // 3. Communicator checks for messages, responds
  const communications = await communicator.think();
  // 4. Reviewer analyzes communications, advises planner
  const review = await reviewer.think();
  const recommendations = await reviewer.getRecommendationsForPlanner();
  await planner.receiveAdvice(recommendations.join('\n'));
  // 5. Log everything to observability
  // (each agent logs its own traces)
}
- Scheduled via Convex scheduled functions OR
- Manual loop with configurable interval (default: 5000ms)
- Graceful shutdown handling
- Error recovery and logging
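The manual loop option, with its interval, graceful shutdown, and error recovery, might look roughly like the sketch below. `runTickLoop`, `LoopOptions`, and the injected `sleep` function are hypothetical; the orchestrator's real loop may be structured differently.

```typescript
type Tick = () => Promise<void>;

interface LoopOptions {
  maxTicks: number;
  intervalMs: number; // the source gives 5000 as the default
  sleep: (ms: number) => Promise<void>; // injected so the loop is easy to test
  shouldStop?: () => boolean; // e.g. flipped by a shutdown signal handler
  onError?: (error: unknown, tickIndex: number) => void;
}

interface LoopSummary {
  completed: number;
  failed: number;
}

// Run ticks sequentially; a failed tick is reported and the loop continues.
async function runTickLoop(tick: Tick, options: LoopOptions): Promise<LoopSummary> {
  const summary: LoopSummary = { completed: 0, failed: 0 };
  for (let i = 0; i < options.maxTicks; i++) {
    // Graceful shutdown: stop between ticks, never in the middle of one.
    if (options.shouldStop && options.shouldStop()) break;
    try {
      await tick();
      summary.completed++;
    } catch (error) {
      summary.failed++;
      if (options.onError) options.onError(error, i);
    }
    if (i < options.maxTicks - 1) await options.sleep(options.intervalMs);
  }
  return summary;
}
```

In Node, `sleep` could be supplied as `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`.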
Project Ideas ←──→ Planner ──→ Todos
                      ↓          ↑
                  Advice  (Builder reads)
                      ↑          ↓
                  Reviewer    Builder ──→ Artifacts
                      ↑
                  Feedback
                      ↑
                Communicator ←──→ Messages
- Each agent can update its own memory (facts, learnings)
- Each agent can update its own context (active task, focus, recent messages)
- Reviewer's recommendations stored in Planner's context
- Communicator's feedback summaries stored for Reviewer
- No separate API needed - Convex functions ARE the API
- Real-time by default - Convex subscriptions
- Key functions:
  - `traces.log()`: Log agent thoughts and actions
  - `traces.list()`: Query traces for a stack
  - `traces.getRecent()`: Recent traces across all stacks
  - `agents.getStack()`: Current state snapshot
  - `messages.getTimeline()`: Message history
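The behavior of the three trace functions can be illustrated with an in-memory stand-in. `InMemoryTraceLog` is a hypothetical class whose method names mirror the functions listed above; the actual Convex functions, their arguments, and their ordering guarantees are not specified in this document.

```typescript
interface Trace {
  stack_id: string;
  agent_type: string;
  thought: string;
  action: string;
  result?: string;
  timestamp: number;
}

// In-memory stand-in for the agent_traces table.
class InMemoryTraceLog {
  private traces: Trace[] = [];

  // Record one thought/action pair.
  log(trace: Trace): void {
    this.traces.push(trace);
  }

  // Traces for one stack, oldest first.
  list(stackId: string): Trace[] {
    return this.traces
      .filter((t) => t.stack_id === stackId)
      .sort((a, b) => a.timestamp - b.timestamp);
  }

  // Newest traces across all stacks, limited to `limit` entries.
  getRecent(limit: number): Trace[] {
    return this.traces
      .slice()
      .sort((a, b) => b.timestamp - a.timestamp)
      .slice(0, limit);
  }
}
```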
- Next.js + React + Convex React hooks
- Views:
  - Live Feed: Auto-updating stream of all agent activity
  - Agent Detail: Per-participant drill-down with 4 sub-agents
  - Message Flow: Visual graph of message exchanges
  - State Inspector: Current ideas, todos, memory, context
- Real-time updates via Convex subscriptions (no manual WebSockets)
convex/
  schema.ts                  # Complete database schema
  agents.ts                  # Agent stack CRUD operations
  messages.ts                # Messaging functions
  artifacts.ts               # Artifact management
  todos.ts                   # Todo operations
  project_ideas.ts           # Project idea management
  traces.ts                  # Observability trace logging
  tsconfig.json              # TypeScript config
packages/agent-engine/
  src/
    agents/
      base-agent.ts          # Base class with shared functionality
      planner.ts             # Planner agent implementation
      builder.ts             # Builder agent implementation
      communicator.ts        # Communicator agent implementation
      reviewer.ts            # Reviewer agent implementation
    memory/
      convex-memory.ts       # Memory provider using Convex
    messaging/
      convex-messages.ts     # Messaging provider using Convex
    artifacts/
      html-builder.ts        # HTML/JS generator using LLMs
    orchestrator.ts          # Coordinates all 4 agents
    config.ts                # LLM provider configuration
    cli.ts                   # CLI for testing
    index.ts                 # Package exports
  package.json               # Dependencies & scripts
  tsconfig.json              # TypeScript config
  README.md                  # Documentation
cd packages/agent-engine
pnpm cli create "ParticipantName"
pnpm cli list
pnpm cli run <stack_id> [max_ticks] [interval_ms]
# Example: pnpm cli run abc123 20 3000
pnpm cli status <stack_id>
Required environment variables:
CONVEX_URL=https://your-deployment.convex.cloud
NEXT_PUBLIC_CONVEX_URL=https://your-deployment.convex.cloud
GROQ_API_KEY=your-groq-key
OPENAI_API_KEY=your-openai-key
GEMINI_API_KEY=your-gemini-key
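A startup check for these variables could be sketched as below. The `missingEnvVars` helper is hypothetical, and treating blank values as missing is an assumption; the variable names come from the list above.

```typescript
const REQUIRED_ENV_VARS: string[] = [
  "CONVEX_URL",
  "NEXT_PUBLIC_CONVEX_URL",
  "GROQ_API_KEY",
  "OPENAI_API_KEY",
  "GEMINI_API_KEY",
];

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node entry point this would typically be called as `missingEnvVars(process.env)`, failing fast with the returned names if the list is non-empty.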
- 1 participant
- Manual tick execution via CLI
- Validate full ideation → build → demo cycle
- Test messaging between agents
- Verify artifact generation
- Deploy 10 agent stacks
- Parallel execution
- Load test Convex and LLM providers
- Monitor costs (Groq, Convex)
- Tune tick rates based on performance
- Convex scheduled functions for automation
- Graceful degradation if rate limits hit
- Cost controls and budget monitoring
- Observability dashboard for real-time monitoring
- Real-time is the primary requirement: Convex is built for this
- <1s latency: Optimistic updates built-in
- TypeScript-first: Auto-generated types, no manual API
- Simpler for MVP: No WebSocket setup, no manual sync
- Developer velocity: 1-2 days vs 4-5 days for core features
- Fastest inference (critical for tight agent loops)
- Cost-effective ($0.10/1M tokens)
- Good for high-frequency operations (Planner, Communicator)
- Automatic fallback to OpenAI for complex tasks (Builder)
- Start with custom Convex-based memory
- Simpler, TypeScript-native
- Leverages Convex's real-time capabilities
- Sufficient for MVP
- Add mem0 later if needed
- Only for sophisticated extraction/consolidation
- Currently Python-focused (integration complexity)
- Smithery can host custom MCP servers
- Primary messaging via Convex (simpler, faster)
- Can add MCP later for advanced tool orchestration
- Groq: ~$5-10 (10K tokens/agent/hour @ $0.10/1M tokens)
- Convex: Free tier covers MVP; paid tier ~$5-15
- Smithery (if used): Likely free for hackathon
- Total: <$30 for 10-agent test run
- Groq: ~$50-100
- Convex: ~$50-150 depending on usage
- Total: ~$100-250 for 8-hour hackathon simulation
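The token arithmetic behind the Groq line items can be checked with a small calculator. This is a sketch: `estimateLlmCostUsd` is a hypothetical helper, and reading "agent" as one of the 40 sub-agents (10 stacks × 4) over an 8-hour run is an assumption. Under that reading, the stated rate of 10K tokens/agent/hour at $0.10/1M tokens comes to about $0.32, so the ~$5-10 figure appears to assume considerably heavier usage or to leave wide headroom.

```typescript
interface UsageAssumptions {
  agents: number;               // number of LLM-calling agents
  tokensPerAgentPerHour: number;
  hours: number;
  usdPerMillionTokens: number;
}

// Total tokens consumed multiplied by the per-token price.
function estimateLlmCostUsd(usage: UsageAssumptions): number {
  const totalTokens = usage.agents * usage.tokensPerAgentPerHour * usage.hours;
  return (totalTokens / 1e6) * usage.usdPerMillionTokens;
}

// Assumed scenario: 10 stacks x 4 sub-agents, 8 hours, at the stated rate.
const tenStackEstimate = estimateLlmCostUsd({
  agents: 40,
  tokensPerAgentPerHour: 10000,
  hours: 8,
  usdPerMillionTokens: 0.1,
});
```

Plugging in measured token counts from a single-stack run would give a more reliable budget than either figure.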
- ✓ Single agent stack completes ideation → build → demo cycle autonomously
- ✓ Communicator handles both broadcasts AND direct messages correctly
- ✓ Builder generates valid single HTML/JS file artifacts
- ✓ Reviewer influences Planner's decisions based on feedback
- ✓ Observability traces show real-time agent thoughts/actions
- ✓ Project ideas and todos correctly mapped to each participant
- ✓ 10 agent stacks run concurrently without crashes
- ✓ <1s latency for state updates
- ✓ Total cost < $30 for 10-agent, 8-hour simulation
- Complete Convex Setup
  - Run `npx convex dev` to create deployment
  - Push schema to Convex
  - Set up environment variables
- Install Dependencies
  - `pnpm install`
- Test Single Agent
  - `cd packages/agent-engine`
  - `pnpm cli create "TestAgent"`
  - `pnpm cli run <stack_id> 5 5000`
- Build Observability Dashboard
  - Create `apps/observability-dashboard` Next.js app
  - Integrate Convex React hooks
  - Build live feed and agent detail views
- Scale to 10 Agents
  - Create 10 agent stacks
  - Run in parallel
  - Monitor performance and costs
- Iterate and Optimize
  - Tune agent prompts based on output quality
  - Adjust tick rates for optimal performance
  - Add more sophisticated planning logic
  - Enhance HTML builder with better templates
- Verify `CONVEX_URL` is set correctly
- Check Convex deployment status: `npx convex dev`
- Regenerate types: `npx convex codegen`
- Check API keys are valid
- Monitor rate limits (Groq has limits)
- Fallback to OpenAI if Groq fails
- Add exponential backoff for retries
- Check traces in Convex dashboard
- Verify todos are being created
- Check agent memory/context for issues
- Adjust agent prompts if reasoning is off
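For the exponential backoff suggested among the LLM troubleshooting steps, the delay schedule can be sketched as a pure function. The base delay of 500 ms, the 30-second cap, and the function names are assumptions chosen for illustration; jitter is omitted to keep the example deterministic.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  const exponential = baseMs * Math.pow(2, attempt);
  return Math.min(exponential, maxMs);
}

// Full list of delays for a given number of retries.
function backoffSchedule(retries: number, baseMs: number = 500, maxMs: number = 30000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(backoffDelayMs(attempt, baseMs, maxMs));
  }
  return delays;
}
```

A retry wrapper would wait `backoffDelayMs(attempt)` after each rate-limit error and switch to the fallback provider once the retries are exhausted.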
- PRD: /docs/plans/prd.md
- Backend Recommendation: /docs/guides/backend-recommendation.md
- Convex Documentation
- Groq Documentation
- Mastra Framework (optional, not used in initial implementation)