
GenAIOps-OSS

A production-ready, open-source LLMOps stack template that combines:

  • LiteLLM β€” unified LLM API gateway, virtual keys, cost allocation, model access management
  • Langfuse β€” LLM observability, evaluation, prompt management, and dataset creation

Deploy once, connect any LLM provider, control costs and access, and get full observability β€” all from a single docker compose up.
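
Because LiteLLM speaks the OpenAI wire format, any OpenAI-compatible client works against it. A minimal stdlib-only sketch (the model name, localhost URL, and function names here are illustrative assumptions, not repo code):

```python
# Sketch: call the LiteLLM proxy with the standard OpenAI chat-completions
# wire format, using only the Python standard library.
import json
import urllib.request

LITELLM_URL = "http://localhost:4000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble an OpenAI-compatible chat request aimed at the LiteLLM proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        LITELLM_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def chat(model: str, prompt: str, api_key: str) -> str:
    """Send the request and return the assistant's reply text."""
    req = build_chat_request(model, prompt, api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires the stack to be running):
#   import os
#   print(chat("gpt-4o-mini", "Hello!", os.environ["LITELLM_MASTER_KEY"]))
```

The same call works with the official openai package by setting base_url="http://localhost:4000" and passing the master key (or a virtual key) as the API key.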


Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         Your Application                            │
│           (uses standard OpenAI SDK pointed at LiteLLM)             │
└─────────────────────────────┬───────────────────────────────────────┘
                              │ OpenAI-compatible API (port 4000)
                              ▼
┌─────────────────────────────────────────────────────────────────────┐
│                          LiteLLM Proxy                              │
│  • Unified API for OpenAI / Azure / Anthropic / Ollama / …          │
│  • Virtual keys & team budgets                                      │
│  • Model access control & rate limiting                             │
│  • Cost tracking & spend logs                                       │
│  • Redis caching                                                    │
└─────┬──────────────────────────┬────────────────────────────┬───────┘
      │ forwards requests        │ spend / metrics            │ traces
      ▼                          ▼                            ▼
  LLM Providers            PostgreSQL (litellm db)      Langfuse Server
  (OpenAI, Azure,                                           (port 3000)
   Anthropic, Ollama)                                 ┌────────────────┐
                                                      │  Langfuse UI   │
                                                      │  • Traces      │
                                                      │  • Evaluations │
                                                      │  • Prompts     │
                                                      │  • Datasets    │
                                                      └───────┬────────┘
                                                              │
                                   ┌──────────────────────────┼──────────────┐
                                   ▼                          ▼              ▼
                            PostgreSQL                   ClickHouse        MinIO
                           (langfuse db)           (analytics store)  (blob store)

Services at a glance

Service          Image                                 Default Port     Purpose
litellm          ghcr.io/berriai/litellm:main-latest   4000             LLM API gateway
langfuse-server  langfuse/langfuse:3                   3000             Observability UI & API
langfuse-worker  langfuse/langfuse-worker:3            —                Background job processor
postgres         postgres:16-alpine                    5432 (internal)  Relational store
clickhouse       clickhouse/clickhouse-server:24.12    8123 (internal)  Analytics / event store
redis            redis:7-alpine                        6379 (internal)  Cache & queue
minio            minio/minio:latest                    9000 / 9001      S3-compatible blob storage

Quick Start

Prerequisites

  • Docker ≥ 24 and Docker Compose v2
  • An API key for at least one LLM provider (OpenAI, Azure, Anthropic, or a local Ollama install)

1 — Clone and configure

git clone https://github.com/your-org/GenAIOps-OSS.git
cd GenAIOps-OSS
cp .env.example .env

Edit .env and fill in your real values:

# Required — change ALL placeholder values
LITELLM_MASTER_KEY=sk-litellm-your-secret-key
LITELLM_SALT_KEY=a-random-32-character-string-here
LANGFUSE_NEXTAUTH_SECRET=another-32-char-random-string
LANGFUSE_SALT=yet-another-random-salt

# At least one LLM provider key
OPENAI_API_KEY=sk-...

Security: never commit .env to version control. The .gitignore already excludes it.

2 — Start the stack

docker compose up -d

First run downloads all images (~3 GB) and runs database migrations — allow ~2 minutes.

Check that everything is healthy:

docker compose ps

3 — Access the services

Service        URL                        Default credentials
LiteLLM API    http://localhost:4000      Bearer LITELLM_MASTER_KEY
LiteLLM UI     http://localhost:4000/ui   admin / LITELLM_MASTER_KEY
Langfuse UI    http://localhost:3000      Create account on first visit
MinIO Console  http://localhost:9001      MINIO_ROOT_USER / password

4 — Run the demo app

cd app
pip install -r requirements.txt
python main.py

Creating Virtual Keys for Cost Allocation

Virtual keys let you assign budgets and track spend per team, project, or user.

# Create a virtual key for a team with a monthly $50 budget
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "team_id": "team-engineering",
    "max_budget": 50,
    "budget_duration": "monthly",
    "models": ["gpt-4o-mini", "gpt-3.5-turbo"],
    "metadata": {"project": "customer-chat"}
  }'
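
The response to /key/generate includes the new key itself. A hedged sketch of inspecting that key's spend afterwards, assuming LiteLLM's /key/info endpoint (the helper name and exact response fields are illustrative):

```python
# Sketch: build a GET /key/info request to check a virtual key's spend
# and budget via the LiteLLM proxy admin API.
import urllib.parse
import urllib.request

def key_info_request(virtual_key: str, master_key: str,
                     base_url: str = "http://localhost:4000") -> urllib.request.Request:
    """Request metadata (spend, budget, allowed models) for a virtual key."""
    query = urllib.parse.urlencode({"key": virtual_key})
    return urllib.request.Request(
        f"{base_url}/key/info?{query}",
        headers={"Authorization": f"Bearer {master_key}"},
    )

# Usage (requires the stack to be running):
#   import json, os
#   req = key_info_request("sk-team-key", os.environ["LITELLM_MASTER_KEY"])
#   with urllib.request.urlopen(req) as r:
#       print(json.load(r))
```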

See docs/cost-allocation.md for full details.


Model Access Management

Control which models each virtual key or team can access:

# Create a read-only research key restricted to cheap models
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["gpt-4o-mini", "claude-3-haiku"],
    "tpm_limit": 100000,
    "rpm_limit": 100
  }'

See docs/model-access-management.md for full details.


Observability

Every request through LiteLLM is automatically traced in Langfuse. Open http://localhost:3000 to see:

  • Traces β€” full request/response detail per call
  • Metrics β€” latency, token usage, cost over time
  • Evaluations β€” attach human or LLM-generated quality scores
  • Sessions β€” group traces into user sessions
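
This automatic tracing is typically wired up through LiteLLM's built-in Langfuse callback. A sketch of the relevant config.yaml fragment, assuming that mechanism (the actual litellm/config.yaml in this repo may differ):

```yaml
# Sketch: trace every proxied request (successes and failures) to Langfuse.
# Requires LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST
# in the proxy's environment (see .env.example).
litellm_settings:
  success_callback: ["langfuse"]
  failure_callback: ["langfuse"]
```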

See docs/observability.md for full details.


Project Structure

GenAIOps-OSS/
├── docker-compose.yml          # Orchestrates all services
├── .env.example                # Environment variables template
├── litellm/
│   ├── Dockerfile              # Extends LiteLLM image with config
│   └── config.yaml             # LiteLLM proxy configuration
├── app/
│   ├── main.py                 # Demo script (run with python main.py)
│   ├── requirements.txt
│   ├── utils/
│   │   ├── llm_client.py       # OpenAI client factory for LiteLLM proxy
│   │   ├── tracing.py          # Langfuse tracing helpers
│   │   └── cost_tracker.py     # LiteLLM spend API client
│   └── examples/
│       ├── chat_completion.py  # Multi-model chat example
│       ├── evaluation.py       # LLM-as-a-judge evaluation
│       └── prompt_management.py # Langfuse prompt CRUD
├── tests/
│   ├── conftest.py             # pytest fixtures
│   ├── test_utils.py           # Unit tests (no external services)
│   └── requirements.txt
├── scripts/
│   └── init-postgres.sh        # Creates litellm + langfuse databases
└── docs/
    ├── architecture.md
    ├── configuration.md
    ├── cost-allocation.md
    ├── model-access-management.md
    ├── observability.md
    ├── evaluation.md
    └── prompt-management.md

Configuration Reference

See docs/configuration.md for a full reference of litellm/config.yaml.
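
For orientation, the heart of litellm/config.yaml is a model_list mapping the model names clients use to concrete providers and credentials. A sketch in LiteLLM's config syntax (the repo's actual entries may differ):

```yaml
# Sketch: expose two model names through the proxy, each backed by a
# provider key read from the environment.
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-3-haiku
    litellm_params:
      model: anthropic/claude-3-haiku-20240307
      api_key: os.environ/ANTHROPIC_API_KEY
```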


Running Tests

pip install -r tests/requirements.txt -r app/requirements.txt
pytest tests/ -v

Tests are fully offline — all external services are mocked.


Contributing

  1. Fork the repository and create a feature branch.
  2. Make your changes and add tests where appropriate.
  3. Run pytest tests/ -v to verify all tests pass.
  4. Open a pull request with a clear description of the change.

Please keep commits focused: one logical change per commit.


License

MIT
