A production-ready NLP-to-SQL agent that converts natural language queries to SQL, validates and executes them, and generates insights. Three orchestration modes with full tracing, reasoning logs, and safety enforcement.
- `app.py` — Streamlit UI with DB selector, semantic builder, orchestration mode radio, results display
- `agentic_pipeline.py` — Main orchestrator: 8-step pipeline with full tracing (recommended production mode)
- `agent_chain.py` — LangChain integration for multi-tool orchestration
- `db.py` — Database introspection, schema extraction, query execution
- `semantic.py` — Build semantic models enriched with table samples
- `nlp_sql.py` — OpenAI SQL generation with automatic retries (up to 5x)
- `agent.py` — SQL policy verifier: reject CRUD/DDL, enforce read-only (SELECT/DESCRIBE/EXPLAIN/SHOW)
- `insights.py` — Summarize results (describe + stats) and generate narrative via OpenAI
- `test_agent.py` — Unit tests (pytest) for SQL verification, validation, response cleaning
- `setup_sample_db.py` — Create sample SQLite database
- `dry_run.py` — Dry-run example without Streamlit
- `integration_examples.py` — Usage patterns for custom integration
- `README.md` — Quick start, environment setup, usage modes
- `ARCHITECTURE.md` — System diagrams, data flow, component responsibilities
- `PIPELINE_GUIDE.md` — Detailed orchestration guide, tracing, debugging, security checklist
- `requirements.txt` — All dependencies (streamlit, sqlalchemy, openai, sqlglot, langchain, pytest)
- `start.sh` — Interactive setup & test-runner script
# 1. Setup
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# 2. Create sample DB
python setup_sample_db.py
# 3. Set API key
export OPENAI_API_KEY="sk-..."
# 4. Choose one:
python dry_run.py # Dry-run example
pytest test_agent.py -v # Run tests
python integration_examples.py # Integration patterns
streamlit run app.py # Run UI (→ http://localhost:8501)

Schema → Semantic → Generate SQL → Validate → Execute → Summarize → Insights

LangChain ReAct agent orchestrates: generate_sql, validate_sql, execute_sql, narrative_insights
Complete 8-step pipeline with:
- Full step tracing (input, output, duration, error)
- Reasoning logs per step
- Collapsible UI for each step
- Performance metrics
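A minimal sketch of how such per-step tracing might be implemented — the step functions and trace fields here are illustrative, not the actual `agentic_pipeline.py` internals:

```python
import time

def run_steps(steps, state):
    """Run (name, fn) pairs in order, recording one trace entry per step.

    Each trace captures the step name, its duration in milliseconds, and
    any error; on error the pipeline stops, mirroring fail-fast behavior.
    """
    traces = []
    for name, fn in steps:
        start = time.perf_counter()
        error = None
        try:
            state = fn(state)
        except Exception as exc:  # record the failure and stop the pipeline
            error = str(exc)
        traces.append({
            "step": name,
            "duration_ms": round((time.perf_counter() - start) * 1000, 2),
            "error": error,
        })
        if error:
            break
    return state, traces
```

The returned `traces` list is what a UI can render as collapsible per-step panels, and summing `duration_ms` gives the total-time metric.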
✅ SQL Syntax Validation — sqlglot parsing (detects malformed SQL)
✅ SQL Policy Enforcement — Reject CRUD/DDL, allow SELECT/DESCRIBE/EXPLAIN/SHOW only
✅ Comment Injection Prevention — Strip line/block comments
✅ Multiple-Statement Rejection — Block chained statements
✅ Row Limits — Enforce 1000-row limit per query
✅ Automatic Retry — Up to 5 retries if SQL generation fails
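The policy checks above (read-only keyword allowlist, comment stripping, single-statement enforcement) could look roughly like this — a hedged sketch, not the actual `agent.py` code:

```python
import re

ALLOWED_PREFIXES = ("select", "describe", "explain", "show")

def strip_comments(sql: str) -> str:
    """Remove line and block comments to prevent comment-based injection."""
    sql = re.sub(r"--[^\n]*", "", sql)
    sql = re.sub(r"/\*.*?\*/", "", sql, flags=re.S)
    return sql.strip()

def verify_sql_query(sql: str) -> tuple[bool, str]:
    cleaned = strip_comments(sql)
    if not cleaned:
        return False, "Empty query"
    # A semicolon anywhere except at the very end means chained statements.
    if ";" in cleaned.rstrip(";"):
        return False, "Multiple statements are not allowed"
    first = cleaned.split(None, 1)[0].lower()
    if first not in ALLOWED_PREFIXES:
        return False, f"Disallowed keyword: {first}"
    return True, ""
```

A keyword-prefix check like this is deliberately conservative: anything that is not clearly a read operation is rejected, and the syntax layer (sqlglot) still validates what passes.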
User Query (NL)
↓
1. Introspect DB schema
2. Build semantic model (with samples)
3. Generate SQL via OpenAI (retry up to 5x)
4. Validate SQL syntax
5. Verify SQL policy (read-only)
6. Execute query (with row limit)
7. Summarize results (describe + stats)
8. Generate narrative insights (OpenAI)
↓
Result: SQL + Data + Summary + Narrative + Full Traces
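Step 3's retry behavior (up to 5 attempts) can be sketched generically; here `generate` and `validate` are stand-ins for the OpenAI call and the validation steps, not the project's real function names:

```python
def generate_sql_with_retries(generate, validate, max_retries: int = 5):
    """Call `generate` until `validate` accepts the SQL, up to max_retries.

    Returns (sql, attempt_count) on success; raises RuntimeError with the
    last validation error once the retry budget is exhausted.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        sql = generate()
        ok, last_error = validate(sql)
        if ok:
            return sql, attempt
    raise RuntimeError(
        f"SQL generation failed after {max_retries} attempts: {last_error}"
    )
```

Returning the attempt count is what lets a trace report lines like "Attempts: 2 (first failed, second succeeded)".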
✓ introspect_schema (45ms)
Reasoning: Fetch DB schema and table info
✓ build_semantic_model (120ms)
Reasoning: Enrich schema with sample data
✓ generate_sql (850ms)
Reasoning: Call OpenAI, retry up to 5x
Attempts: 2 (first failed, second succeeded)
✓ validate_sql_syntax (12ms)
Reasoning: Parse with sqlglot
✓ verify_sql_policy (8ms)
Reasoning: Ensure read-only, no CRUD/DDL
✓ execute_sql (34ms)
Reasoning: Run with 1000-row limit
Rows: 42
✓ summarize_results (28ms)
Reasoning: Compute describe() and percentiles
✓ narrative_insights (680ms)
Reasoning: Use OpenAI to produce findings
Total: 1.8 seconds
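The 1000-row cap shown in the `execute_sql` trace above can be enforced by wrapping the generated query in an outer `LIMIT` — a simple, dialect-agnostic sketch (the actual implementation may rewrite the query with sqlglot instead):

```python
def enforce_row_limit(sql: str, limit: int = 1000) -> str:
    """Wrap a query so an outer LIMIT caps the result size.

    The subquery alias is required by most dialects (e.g. MySQL, Postgres);
    any LIMIT inside the original query still applies, but can never
    exceed the outer cap.
    """
    inner = sql.strip().rstrip(";")
    return f"SELECT * FROM ({inner}) AS limited_q LIMIT {limit}"
```

Wrapping rather than string-appending avoids corrupting queries that already end in `ORDER BY ... LIMIT` clauses.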
from sqlalchemy import create_engine
from agentic_pipeline import run_agentic_pipeline
engine = create_engine("sqlite:////path/to/db.db")
result = run_agentic_pipeline("Show top customers", engine, dialect="sqlite")
print(result["final_result"]["sql"]) # Generated SQL
print(result["final_result"]["narrative"]) # Insights

from agent import verify_sql_query
ok, msg = verify_sql_query("SELECT * FROM users")
# ok=True, msg=""
ok, msg = verify_sql_query("DELETE FROM users")
# ok=False, msg="Disallowed keyword: delete"

from agentic_pipeline import AgenticPipeline

pipeline = AgenticPipeline(engine, dialect="sqlite", max_sql_retries=5)
for query in ["Query 1", "Query 2", "Query 3"]:
    result = pipeline.run(query)
    print(f"Duration: {sum(t['duration_ms'] for t in result['traces'])}ms")

export OPENAI_API_KEY="sk-..." # Required
export OPENAI_MODEL="gpt-4" # Default: gpt-4
export DB_URLS='{"default":"sqlite:////tmp/data.db"}' # JSON map

- Database selector
- Schema viewer (expandable)
- Semantic model builder button
- SQL dialect selector
- Max retries slider
- Orchestration mode radio
For production:
- Use read-only DB user
- Add query timeout (30s max)
- Implement IP whitelist
- Add audit logging
- Monitor OpenAI costs
- Use API key rotation
- Encrypt sensitive data (TLS)
- Test with malicious inputs
- Rate-limit API calls
- Set up error alerts
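For the query-timeout item on the checklist, one approach (an assumption, not part of this codebase) is to bound the caller's wait with a worker thread:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def run_with_timeout(fn, seconds: float = 30.0):
    """Run fn() but give up waiting after `seconds`.

    Caveat: the worker thread is not killed on timeout; it keeps running
    in the background, so this bounds the caller's wait, not the DB load.
    A server-side statement timeout is the stronger complement.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn).result(timeout=seconds)
    except FutureTimeout:
        raise TimeoutError(f"query exceeded {seconds}s timeout") from None
    finally:
        pool.shutdown(wait=False)
```

For databases that support it, prefer a native timeout (e.g. PostgreSQL's `statement_timeout`) so the server actually cancels the work.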
Typical query execution time (gpt-4):
- Introspect: ~45ms
- Semantic build: ~120ms
- SQL generation: ~800ms (includes API latency)
- Validation: ~20ms
- Execution: ~50ms
- Summarization: ~30ms
- Insights: ~700ms (includes API latency)
Total: ~1.8 seconds
- ARCHITECTURE.md — System design, data flow, component responsibilities
- PIPELINE_GUIDE.md — Detailed orchestration, tracing, debugging
- README.md — Quick start, environment setup
- test_agent.py — Unit tests demonstrating behavior
- integration_examples.py — Real-world usage patterns
- dry_run.py — End-to-end example
- streamlit — UI framework
- sqlalchemy — ORM & database abstraction
- pandas — Data manipulation
- openai — ChatGPT API client
- sqlglot — SQL parsing & validation
- langchain — Agent orchestration framework
- pytest — Testing framework
| Issue | Solution |
|---|---|
| `OPENAI_API_KEY` not set | `export OPENAI_API_KEY="sk-..."` |
| SQL generation fails after 5 retries | Simplify query, check schema samples |
| "Disallowed keyword" error | Query includes INSERT/UPDATE/DELETE — use SELECT only |
| Slow execution | Check OpenAI latency, add indexes to DB |
| `ModuleNotFoundError` | Run `pip install -r requirements.txt` |
| Streamlit not found | Activate venv: `source .venv/bin/activate` |
- Add conversation memory for multi-turn refinement
- Integrate more LLM providers (Claude, Llama)
- Add query caching & result memoization
- Implement row-level security (RLS)
- Add chart generation (Plotly, Matplotlib)
- Deploy with authentication (OAuth2)
- Set up monitoring & cost tracking
Prototype / Educational Use
Questions? Check the documentation files (README.md, ARCHITECTURE.md, PIPELINE_GUIDE.md) or review the code examples.