An open-source AI application that makes German parliamentary documents accessible, searchable, and understandable for everyone.
Note: This project is for educational and research purposes. Please respect the Bundestag DIP API terms of service. The code was developed with GenAI support and is not intended for production use without review.
⏳ Cold start: The demo runs on Azure Container Apps with scale-to-zero enabled to minimize costs. The first request after inactivity may take 30 to 60 seconds while the container starts up.
Citizens, journalists, and researchers can ask questions in plain language about German parliamentary activity and receive:
- AI-powered summaries of 50+ page legal documents, explained the way a journalist would
- Citizen impact analysis: what does this law mean for everyday people?
- External media coverage: AI-curated news links from major German outlets (optional, via OpenAI web search)
- Structured search results across Vorgänge (procedures), Drucksachen (documents), and Plenarprotokolle (transcripts)
- Bilingual support: German and English interface with LLM-based title translation
- Full transparency: see which tools the AI calls, what data it fetches, and how long each step takes
| Audience | Value |
|---|---|
| Citizens | Understand complex legislation without legal expertise. Direct access to government decisions. |
| Journalists | Quickly find and summarize parliamentary activity across legislative periods. |
| Researchers | Search structured parliamentary data with filters for party, document type, and date ranges. |
| Government | Demonstrate transparent, digital-first citizen engagement. |

    ┌─────────────────────────────────────────────────────────┐
    │                      Browser (SPA)                      │
    │   Vanilla JS · SSE Streaming · Reasoning Panel · i18n   │
    └──────────────────────┬──────────────────────────────────┘
                           │ POST /api/chat/stream
    ┌──────────────────────▼──────────────────────────────────┐
    │                FastAPI Backend (Python)                 │
    │                                                         │
    │  ┌─────────────┐  ┌──────────────┐  ┌────────────────┐  │
    │  │ Phase 1:    │  │ Tool         │  │ Phase 2:       │  │
    │  │ LLM decides │─▶│ Dispatcher   │─▶│ LLM generates  │  │
    │  │ which tools │  │ (10 tools)   │  │ answer         │  │
    │  └─────────────┘  └──────┬───────┘  └───────┬────────┘  │
    │                          │                  │           │
    │  ┌───────────────────────▼──────────────────▼────────┐  │
    │  │ Security: Rate Limit · Input Validation · CSP     │  │
    │  │ Security Headers · CORS · HSTS (opt)              │  │
    │  └───────────────────────────────────────────────────┘  │
    └──────────────────────┬──────────────────────────────────┘
             ┌─────────────┼─────────────────────┐
             ▼             ▼                     ▼
    ┌──────────────────┐ ┌───────────┐ ┌──────────────────────┐
    │ OpenAI GPT-5     │ │ OpenAI    │ │ Bundestag DIP API    │
    │ mini (400K ctx)  │ │ Web Search│ │ search.dip.bundestag │
    │ Function calling │ │ (optional)│ │ .de/api/v1           │
    └──────────────────┘ └───────────┘ └──────────────────────┘

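The request path in the diagram (Phase 1, tool dispatcher, Phase 2) can be sketched as one small orchestration function. This is a minimal illustration, not the project's actual code: the LLM phases are passed in as plain callables so the control flow can be read and tested without network access, and the set of tools that take the search-only fast path is an assumption.

```python
from typing import Any, Callable

ToolCall = tuple[str, dict[str, Any]]

# Assumption: every search_* tool qualifies for the table-only fast path.
SEARCH_TOOLS = {
    "search_vorgaenge",
    "search_drucksachen",
    "search_plenarprotokolle",
    "search_personen",
    "search_aktivitaeten",
}


def handle_message(
    message: str,
    pick_tools: Callable[[str], list[ToolCall]],      # Phase 1 (LLM, stubbed by the caller)
    run_tool: Callable[[str, dict[str, Any]], Any],   # tool dispatcher
    format_table: Callable[[list[Any]], str],         # server-side formatting
    write_answer: Callable[[str, list[Any]], str],    # Phase 2 (LLM, stubbed by the caller)
) -> str:
    """Run Phase 1, execute the chosen tools, then answer via table or Phase 2."""
    calls = pick_tools(message)
    results = [run_tool(name, args) for name, args in calls]
    if calls and all(name in SEARCH_TOOLS for name, _ in calls):
        # Pure searches skip the second LLM call entirely.
        return format_table(results)
    return write_answer(message, results)
```

Passing the phases in as callables keeps the routing decision separate from any particular LLM client.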
| Decision | Rationale |
|---|---|
| Single-file HTML UI | Zero build tools, instant reload, no npm/webpack complexity |
| SSE streaming | Real-time token-by-token display with reasoning transparency |
| Server-side table formatting | Search results bypass the LLM entirely, which is faster and cheaper |
| 2-phase LLM approach | Phase 1 (compact) picks tools; Phase 2 (task-specific) generates answer |
| MCP + REST dual interface | Same tools exposed as both OpenAI functions and MCP protocol |
| OpenAI Web Search | Optional news coverage via the Responses API; runs in parallel with Phase 2, so it adds no latency to the answer |
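To make the SSE streaming decision concrete, here is a minimal sketch of how events might be framed on the wire. The event names are the ones this README lists for the reasoning panel; the `sse_event` helper and the payload fields are hypothetical and not taken from the project's code.

```python
import json
from typing import Any, Iterator


def sse_event(event: str, data: dict[str, Any]) -> str:
    """Serialize one Server-Sent Event frame (event name plus JSON payload)."""
    # json.dumps never emits raw newlines, so the payload fits on one data line.
    payload = json.dumps(data, ensure_ascii=False)
    # A frame is a set of "field: value" lines terminated by a blank line.
    return f"event: {event}\ndata: {payload}\n\n"


def demo_stream() -> Iterator[str]:
    """Yield frames in the order the reasoning panel would display them."""
    yield sse_event("model_thinking", {"phase": 1})
    yield sse_event("tool_call", {"tool": "search_vorgaenge", "args": {"query": "Klimaschutz"}})
    yield sse_event("tool_result", {"tool": "search_vorgaenge", "count": 3})
    yield sse_event("content", {"text": "| Titel | Datum |"})
    yield sse_event("done", {})
```

In FastAPI, a generator like this can be returned with `StreamingResponse(demo_stream(), media_type="text/event-stream")`.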

    User types: "Welche Klimaschutzgesetze wurden 2026 verabschiedet?"
                (English: "Which climate protection laws were passed in 2026?")
      │
      ▼
    1. POST /api/chat/stream { message, history, language }
      │
      ▼
    2. Phase 1 LLM (GPT-5 mini, compact prompt)
      │  Decides: call search_vorgaenge(query="Klimaschutz", vorgangstyp="Gesetzgebung", date_from="2026-01-01")
      │
      ▼
    3. Tool Dispatcher executes async DIP API calls
      │  GET /vorgang?f.suche=Klimaschutz&f.vorgangstyp=Gesetzgebung&f.datum.start=2026-01-01
      │  Results cached (SHA-256 key, 1h TTL, 256 entries)
      │
      ▼
    4. Results formatted as Markdown table (search-only fast path)
      │  No Phase 2 LLM call needed for pure searches
      │  Each row has: [DIP] [AI Summary] [Citizen Impact] links
      │
      ▼
    5. SSE events streamed to browser:
      │  "model_thinking" → "tool_call" → "tool_result" → "content" → "done"
      │  Reasoning panel shows each step with live timers

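Step 3 mentions a result cache with a SHA-256 key, a one-hour TTL, and 256 entries. A minimal sketch of such a cache follows. How the key is derived (endpoint plus sorted parameters) and which entry is evicted when the cache is full (the oldest insertion) are assumptions; the README does not specify either.

```python
import hashlib
import json
import time
from collections import OrderedDict
from typing import Any, Optional


class TTLCache:
    """Small in-memory cache with a time-to-live and a maximum size."""

    def __init__(self, ttl_seconds: float = 3600, max_entries: int = 256) -> None:
        self.ttl_seconds = ttl_seconds
        self.max_entries = max_entries
        self._store: "OrderedDict[str, tuple[float, Any]]" = OrderedDict()

    @staticmethod
    def make_key(endpoint: str, params: dict[str, Any]) -> str:
        # Sorting the keys makes the hash independent of parameter order.
        raw = json.dumps({"endpoint": endpoint, "params": params}, sort_keys=True)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, key: str, now: Optional[float] = None) -> Optional[Any]:
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if now - stored_at >= self.ttl_seconds:
            del self._store[key]  # expired entries are dropped on read
            return None
        return value

    def set(self, key: str, value: Any, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._store.pop(key, None)
        self._store[key] = (now, value)
        while len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict the oldest insertion
```

The optional `now` argument exists only so that expiry can be exercised deterministically in tests.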
When the user clicks "AI Summary":

    6. New chat message: "Fasse den Vorgang 332067 zusammen (ID:332067)"
       (English: "Summarize procedure 332067")
      │
      ▼
    7. Phase 1 → calls get_vorgang_details(332067) + fetches Drucksache/Plenarprotokoll text
      │
      ▼
    8. Phase 2 (Summary prompt) → Journalistic explanation of the law's substance:
      │  problem it solves, real-world significance, key actors, financial impact
      │
      │  (parallel) Web Search → OpenAI Responses API with domain-filtered
      │  news search → appends "External Media Coverage" section
      ▼
    9. Complete response with AI summary + optional news links

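The parallel web search in steps 8 and 9 can be illustrated with `asyncio`. In this sketch both coroutines are stand-ins that simulate work with a short sleep; the function names, the example URL, and the error handling are hypothetical rather than the project's implementation.

```python
import asyncio


async def generate_summary(vorgang_id: int) -> str:
    """Stand-in for the Phase 2 LLM call (simulated with a sleep)."""
    await asyncio.sleep(0.2)
    return f"Summary of Vorgang {vorgang_id}"


async def search_news(topic: str, enabled: bool) -> list[str]:
    """Stand-in for the optional web search (simulated with a sleep)."""
    if not enabled:
        return []
    await asyncio.sleep(0.1)
    return [f"https://example.org/news/{topic}"]


async def answer(vorgang_id: int, topic: str, web_search: bool) -> str:
    # Start the news search first so it runs while the summary is generated.
    news_task = asyncio.create_task(search_news(topic, web_search))
    summary = await generate_summary(vorgang_id)
    try:
        links = await news_task
    except Exception:
        links = []  # news coverage is optional, so a failure must not break the answer
    if links:
        summary += "\n\nExternal Media Coverage\n" + "\n".join(f"- {url}" for url in links)
    return summary
```

Because the search task is created before the summary is awaited, the total wait is roughly that of the slower of the two calls rather than their sum.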
| Layer | Implementation |
|---|---|
| Rate Limiting | 30 requests/minute per IP (in-memory, sliding window) |
| Input Validation | Message: max 10,000 chars · History: max 50 messages · Language: `de` or `en` only |
| Security Headers | X-Frame-Options: DENY · X-Content-Type-Options: nosniff · Referrer-Policy · Permissions-Policy |
| CSP | default-src 'self' · frame-ancestors 'none' · img-src 'self' data: https: |
| HSTS | Opt-in via ENABLE_HSTS=true (recommended for production) |
| XSS Prevention | No inline onclick handlers; delegated event listeners with data-* attributes · URL scheme validation blocks javascript: and data: URLs |
| CORS | Configurable origins (no wildcard) · Defaults to localhost |
| Error Sanitization | Internal errors are never exposed to the client; generic messages only |
| XSRF Protection | Enabled in Streamlit config |
| Docker | Non-root appuser · Minimal base image (python:3.11-slim) |
| Azure Secrets | Managed identity for ACR · @secure() Bicep parameters · No admin credentials |
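The rate-limiting row describes an in-memory sliding window of 30 requests per minute per IP. One way such a limiter could look is sketched below; the class and method names are illustrative and not taken from the project.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client."""

    def __init__(self, limit: int = 30, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True
```

Because the state lives in process memory, each replica of the app would enforce its own limit independently.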
- Python 3.11+
- OpenAI API key (GPT-5 mini access)
- Bundestag DIP API key (free registration)

    # 1. Clone and setup
    git clone https://github.com/ROBROICH/bundestag-rag-public.git
    cd bundestag-rag-public

    # 2. Create virtual environment
    python -m venv .venv
    # Windows PowerShell (on macOS/Linux use: source .venv/bin/activate)
    .\.venv\Scripts\Activate.ps1
    pip install -r requirements-mcp.txt

    # 3. Configure environment
    cp env.template .env
    # Edit .env with your API keys

    # 4. Start the Chat App
    python -m uvicorn src.chat.app:app --host 127.0.0.1 --port 8000 --reload

Access at: http://localhost:8000

    # Build and run with Docker
    docker build -f deployment/docker/Dockerfile.mcp -t bundestag-chat .
    docker run -p 8000:8000 --env-file .env bundestag-chat

    # Deploy to Azure Container Apps (PowerShell)
    .\deployment\azure\deploy-container-apps.ps1 `
      -ResourceGroup "rg-bundestag" `
      -Location "westeurope"

See the Azure Deployment Docs for the full guide.

| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | Yes | OpenAI API key with GPT-5 mini access |
| `BUNDESTAG_API_KEY` | Yes | DIP API key (free) |
| `ALLOWED_ORIGINS` | No | CORS origins (default: localhost:8000) |
| `ENABLE_HSTS` | No | Enable Strict-Transport-Security (default: false) |
| `ENABLE_WEB_SEARCH` | No | Enable news coverage via OpenAI web search in summaries (default: false) |
| `MCP_HOST` | No | Bind address (default: 127.0.0.1) |
| `LOG_LEVEL` | No | Logging level (default: INFO) |
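A minimal sketch of how these variables might be read at startup. It assumes the two API keys are the only required values (as the prerequisites list suggests), that `ALLOWED_ORIGINS` is a comma-separated list, and that its default carries an `http://` scheme. The `Settings` class and `load_settings` function are hypothetical names, not the project's own.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    bundestag_api_key: str
    allowed_origins: list[str]
    enable_hsts: bool
    enable_web_search: bool
    mcp_host: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, failing fast on missing API keys."""
    missing = [name for name in ("OPENAI_API_KEY", "BUNDESTAG_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    # Assumption: comma-separated origins, defaulting to the local dev URL.
    raw_origins = env.get("ALLOWED_ORIGINS", "http://localhost:8000")
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        bundestag_api_key=env["BUNDESTAG_API_KEY"],
        allowed_origins=[o.strip() for o in raw_origins.split(",") if o.strip()],
        enable_hsts=_as_bool(env.get("ENABLE_HSTS", "false")),
        enable_web_search=_as_bool(env.get("ENABLE_WEB_SEARCH", "false")),
        mcp_host=env.get("MCP_HOST", "127.0.0.1"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Failing at startup when a key is missing surfaces configuration problems before the first request arrives.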

    bundestag-rag-api/
    ├── src/
    │   ├── chat/
    │   │   ├── app.py              # FastAPI backend: endpoints, tools, prompts, security
    │   │   └── static/
    │   │       └── index.html      # Single-file chat UI (vanilla JS, SSE streaming)
    │   ├── mcp/
    │   │   └── server.py           # MCP server (10 tools via FastMCP SSE transport)
    │   └── web/
    │       └── openai_config.py    # Model configuration (GPT-5 mini, token limits)
    ├── config/
    │   └── settings.py             # API URLs, timeouts, cache settings
    ├── deployment/
    │   ├── docker/                 # Dockerfile, Dockerfile.mcp, Dockerfile.optimized
    │   ├── azure/                  # Bicep template + deploy script for Container Apps
    │   └── local/                  # docker-compose.yml for local dev
    ├── main.py                     # Entry point (chat/mcp/streamlit modes)
    ├── requirements.txt            # Full dependencies (pinned versions)
    ├── requirements-mcp.txt        # Minimal production dependencies
    └── env.template                # Environment variable template

The LLM has access to 10 tools that query the official Bundestag DIP API:
| Tool | Description |
|---|---|
| `search_vorgaenge` | Search parliamentary procedures (filter by party, type, date, legislative period) |
| `search_drucksachen` | Search official printed documents |
| `search_plenarprotokolle` | Search plenary session transcripts |
| `get_vorgang` | Get procedure metadata by ID |
| `get_vorgang_details` | Get full procedure with linked Drucksache text |
| `get_drucksache` | Get document metadata |
| `get_drucksache_text` | Get full document text (up to 250K chars) |
| `get_plenarprotokoll_text` | Get full plenary transcript text |
| `search_personen` | Search Bundestag members by name |
| `search_aktivitaeten` | Search parliamentary activities (speeches, votes, motions) |
The same tools are exposed over MCP at `/mcp` for integration with other LLM clients.
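As an illustration of how a tool call chosen by the LLM might be routed to a DIP request, the sketch below maps tool arguments to query parameters and dispatches by tool name. The parameter names (`f.suche`, `f.vorgangstyp`, `f.datum.start`) and the base URL are the ones shown elsewhere in this README; the registry, the function names, and the choice to return a URL instead of performing the HTTP request are assumptions made to keep the example self-contained.

```python
from typing import Any, Callable
from urllib.parse import urlencode

DIP_BASE_URL = "https://search.dip.bundestag.de/api/v1"

# Tool argument name -> DIP query parameter (only those shown in this README).
VORGANG_PARAMS = {
    "query": "f.suche",
    "vorgangstyp": "f.vorgangstyp",
    "date_from": "f.datum.start",
}


def build_vorgang_url(**args: Any) -> str:
    """Translate search_vorgaenge arguments into a /vorgang request URL."""
    unknown = set(args) - set(VORGANG_PARAMS)
    if unknown:
        raise ValueError(f"Unsupported arguments: {sorted(unknown)}")
    params = {VORGANG_PARAMS[k]: v for k, v in args.items() if v is not None}
    return f"{DIP_BASE_URL}/vorgang?{urlencode(params)}"


# Hypothetical registry: a real dispatcher would hold all ten tools.
TOOLS: dict[str, Callable[..., str]] = {"search_vorgaenge": build_vorgang_url}


def dispatch(tool_name: str, arguments: dict[str, Any]) -> str:
    """Look up a tool by the name the LLM chose and call it with its arguments."""
    try:
        tool = TOOLS[tool_name]
    except KeyError:
        raise ValueError(f"Unknown tool: {tool_name}") from None
    return tool(**arguments)
```

Rejecting unknown tool names and arguments keeps malformed model output from turning into arbitrary requests.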
| Feature | Description |
|---|---|
| Language Toggle | German / English with LLM-based table translation |
| Reasoning Panel | Real-time display of tool calls, timing, and intermediate results |
| Search Tables | Server-formatted Markdown tables with DIP links, PDF links, AI Summary & Citizen Impact actions |
| Guided Search | Topic cards → Party filter → Document type → Auto-generated query |
| Wahlperiode Slider | Filter by legislative period (WP 1-21, 1949-2026+) |
| Streaming Responses | Token-by-token display with live progress indicators |
| Action Tiles | Follow-up buttons: DIP / AI Summary / Citizen Impact |
| Media Coverage | Optional news links from curated German outlets appended to AI summaries |
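The server-formatted search tables can be sketched as a plain function that turns result rows into Markdown without involving the LLM. The row fields used here (`titel`, `datum`, `url`) and the column layout are assumptions for illustration; the project's actual table includes further links and actions.

```python
from typing import Any


def _cell(value: Any) -> str:
    """Make a value safe for a Markdown table cell."""
    return str(value).replace("|", "\\|").replace("\n", " ").strip()


def format_results(rows: list[dict[str, Any]]) -> str:
    """Render search hits as a Markdown table with one DIP link per row."""
    lines = ["| Title | Date | Links |", "|---|---|---|"]
    for row in rows:
        link = f"[DIP]({row['url']})"
        lines.append(f"| {_cell(row['titel'])} | {_cell(row['datum'])} | {link} |")
    return "\n".join(lines)
```

Escaping pipe characters matters because document titles are untrusted text and would otherwise break the table layout.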
This project is licensed under the MIT License.
For deployment issues, see the Azure Deployment Documentation troubleshooting section.