A hexagonal architecture-based Retrieval-Augmented Generation (RAG) conversational AI system designed for maximum portability, testability, and scalability. Built with Go, the system can run entirely locally with minimal dependencies or deploy to full serverless cloud environments without changing the core application logic.
- 🏗️ Hexagonal Architecture: Clean separation between business logic and infrastructure
- 🔄 Multi-Environment Deployment: Same codebase runs locally (Docker + SQLite + NATS) or in cloud (AWS Lambda + SQS + OpenSearch)
- 🤖 OpenAI-Compatible: Works with Ollama, LM Studio, OpenAI, and other compatible APIs
- 🔧 Tool Support: MCP (Model Context Protocol) compatible tool system
- 💬 Real-time Communication: WebSocket support for live conversation updates
- 📊 Context Management: Intelligent conversation context construction with token management
- 🗃️ Flexible Storage: SQLite for local development, designed for easy swap to production databases
- ⚡ Event-Driven: NATS-based messaging for scalable service communication
- 🎯 Type-Safe: Full Go type safety with comprehensive interfaces
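The context-management feature above comes down to fitting recent conversation history into a model's token budget. A minimal sketch of that idea — the `Message` type and `trimToBudget` helper are illustrative, not HexaRAG's actual entities:

```go
package main

import "fmt"

// Message is a simplified stand-in for a conversation message; in a real
// system token counts would come from the model's tokenizer.
type Message struct {
	Role   string
	Text   string
	Tokens int
}

// trimToBudget keeps the most recent messages whose combined token count
// fits within budget, preserving chronological order.
func trimToBudget(msgs []Message, budget int) []Message {
	total := 0
	start := len(msgs)
	for i := len(msgs) - 1; i >= 0; i-- {
		if total+msgs[i].Tokens > budget {
			break
		}
		total += msgs[i].Tokens
		start = i
	}
	return msgs[start:]
}

func main() {
	history := []Message{
		{"user", "first question", 50},
		{"assistant", "first answer", 120},
		{"user", "follow-up", 40},
	}
	// Only the two most recent messages (120 + 40 tokens) fit in a 200-token budget.
	kept := trimToBudget(history, 200)
	fmt.Println(len(kept))
}
```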
- Go 1.21+
- Make (optional, for convenience commands)
- Clone and set up:

  ```bash
  git clone <repository-url>
  cd hexarag
  make deps
  ```

- Run database migrations:

  ```bash
  make migrate
  ```

- Start the server:

  ```bash
  make run
  ```

The server will start on http://localhost:8080 with:

- REST API at `/api/v1/*`
- WebSocket endpoint at `/ws`
- Web interface at `/` (when implemented)

Alternatively, start everything with Docker:

```bash
make docker-run
```

This starts all services, including the NATS server and Ollama (when Docker Compose is implemented).
HexaRAG follows strict hexagonal architecture principles:
- Entities: `Message`, `Conversation`, `SystemPrompt`, `ToolCall`
- Services: `ContextConstructor`, `InferenceEngine`
- Ports: Interfaces defining contracts with external systems
- Storage: SQLite adapter (swappable with PostgreSQL, etc.)
- Messaging: NATS adapter (swappable with SQS, Redis, etc.)
- LLM: OpenAI-compatible adapter (works with Ollama, LM Studio, OpenAI)
- Tools: MCP time server (extensible to any MCP-compatible tools)
- API: HTTP/WebSocket adapters
```
hexarag/
├── cmd/
│   ├── server/          # Main application entry point
│   └── migrate/         # Database migration utility
├── internal/
│   ├── domain/          # Core business logic (the hexagon)
│   │   ├── entities/    # Domain entities
│   │   ├── services/    # Business services
│   │   └── ports/       # Interface contracts
│   └── adapters/        # Infrastructure implementations
│       ├── storage/     # Database adapters
│       ├── messaging/   # Event bus adapters
│       ├── llm/         # Language model adapters
│       ├── tools/       # Tool execution adapters
│       └── api/         # HTTP/WebSocket adapters
├── pkg/                 # Shared utilities
├── deployments/         # Deployment configurations
├── web/                 # Frontend assets
└── docs/                # Documentation
```
Configuration is managed via YAML files and environment variables:
```yaml
# deployments/config/config.yaml
server:
  port: 8080
  host: "0.0.0.0"

llm:
  provider: "openai-compatible"
  base_url: "http://localhost:11434/v1"  # Ollama
  model: "llama2"
  max_tokens: 4096
  temperature: 0.7

nats:
  url: "nats://localhost:4222"
  jetstream:
    enabled: true
    retention_days: 7

database:
  path: "./data/hexarag.db"

tools:
  mcp_time_server:
    enabled: true
    timezones: ["UTC", "America/New_York", "Europe/London"]
```

Override with environment variables:
```bash
export HEXARAG_LLM_BASE_URL="http://localhost:1234/v1"  # LM Studio
export HEXARAG_LLM_MODEL="llama-3.2-3b"
```

Conversations:

- `GET /api/v1/conversations` - List conversations
- `POST /api/v1/conversations` - Create conversation
- `GET /api/v1/conversations/{id}` - Get conversation
- `PUT /api/v1/conversations/{id}` - Update conversation
- `DELETE /api/v1/conversations/{id}` - Delete conversation
Messages:
- `GET /api/v1/conversations/{id}/messages` - Get messages
- `POST /api/v1/conversations/{id}/messages` - Send message
System Prompts:
- `GET /api/v1/system-prompts` - List system prompts
- `POST /api/v1/system-prompts` - Create system prompt
- `GET /api/v1/system-prompts/{id}` - Get system prompt
- `PUT /api/v1/system-prompts/{id}` - Update system prompt
- `DELETE /api/v1/system-prompts/{id}` - Delete system prompt
Connect to `/ws?conversation_id={id}` for real-time updates:
```javascript
const ws = new WebSocket('ws://localhost:8080/ws?conversation_id=conv123');

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  console.log('Received:', message.type, message.data);
};
```

Common development commands:

```bash
make build          # Build binaries
make run            # Run development server
make test           # Run tests
make test-coverage  # Run tests with coverage
make migrate        # Run database migrations
make fmt            # Format code
make lint           # Lint code (requires golangci-lint)
make docker-build   # Build Docker image
make docker-run     # Run with Docker Compose
make dev-setup      # Setup development environment
make help           # Show all commands
```

To add a new adapter:

- Define the port interface in `internal/domain/ports/`
- Implement the adapter in `internal/adapters/`
- Wire it up in `cmd/server/main.go`
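The wiring step is plain constructor injection driven by configuration. A self-contained sketch of the idea — the stub types and the `newStorage` helper are hypothetical, not HexaRAG's actual constructors:

```go
package main

import "fmt"

// Storage is a stand-in port; SQLiteStore and PostgresStore are stub
// adapters used only to illustrate config-driven wiring.
type Storage interface{ Name() string }

type SQLiteStore struct{}

func (SQLiteStore) Name() string { return "sqlite" }

type PostgresStore struct{}

func (PostgresStore) Name() string { return "postgres" }

// newStorage picks an adapter from configuration; the rest of the
// application only ever sees the Storage port.
func newStorage(driver string) (Storage, error) {
	switch driver {
	case "sqlite":
		return SQLiteStore{}, nil
	case "postgres":
		return PostgresStore{}, nil
	default:
		return nil, fmt.Errorf("unknown storage driver %q", driver)
	}
}

func main() {
	store, err := newStorage("postgres")
	if err != nil {
		panic(err)
	}
	fmt.Println(store.Name())
}
```

Because the selection happens in one place, swapping SQLite for PostgreSQL is a configuration change, not a domain-code change.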
Example: Adding PostgreSQL support:
```go
// internal/adapters/storage/postgres/postgres_adapter.go
type Adapter struct { /* ... */ }

func (a *Adapter) SaveMessage(ctx context.Context, msg *entities.Message) error {
    // PostgreSQL implementation
}
```

The hexagonal architecture makes testing straightforward:
```go
// Test business logic with mocks
mockStorage := &MockStoragePort{}
contextConstructor := services.NewContextConstructor(mockStorage, ...)

// Test adapters with real dependencies
sqliteAdapter := sqlite.NewAdapter(":memory:", "")
```

Local Development:

- Storage: SQLite file database
- Messaging: Local NATS server
- LLM: Ollama or LM Studio
- Tools: Built-in MCP time server
Self-Hosted (Kubernetes):
- Storage: PostgreSQL
- Messaging: NATS cluster
- LLM: Self-hosted models or API services
- Deployment: Helm charts (planned)
Cloud-Native (AWS Serverless):
- Storage: Amazon RDS or DynamoDB
- Messaging: Amazon SQS + EventBridge
- LLM: Amazon Bedrock or OpenAI
- Deployment: AWS SAM (planned)
- Hexagonal architecture core
- Local development stack
- Basic conversation management
- OpenAI-compatible LLM integration
- MCP tool system
- Real-time WebSocket communication
- Vector storage integration (ChromaDB/OpenSearch)
- Semantic search capabilities
- Document ingestion pipeline
- Advanced context construction
- AWS Lambda adapters
- Kubernetes deployment
- Production monitoring
- Multi-tenancy support
- Authentication and authorization
- Rate limiting and quotas
- Advanced tool ecosystem
- Analytics and insights
```bash
# Run all tests
make test

# Run with coverage
make test-coverage

# Test specific package
go test ./internal/domain/services -v
```

- Architecture Guide - Detailed hexagonal architecture explanation
- Deployment Guide - Production deployment strategies
- API Documentation - Complete API reference
- Contributing Guide - How to contribute
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Inspired by hexagonal architecture principles
- Built for the Model Context Protocol (MCP) ecosystem
- Designed for the modern AI application landscape