A Production-Grade Natural Language to SQL System with Enterprise Trust Layers
The Enterprise GenAI Data Copilot is a hackathon-ready, production-grade demonstration of how AI can democratize data access in enterprises. Non-technical users ask business questions in natural language and receive instant, trustworthy answers backed by validated SQL queries.
- β NOT a chatbot - This is an enterprise decision system
- π Trust-first architecture with multi-layered validation
- π’ Enterprise-grade security with read-only enforcement
- π Business intelligence with actionable insights
- π¨ Professional UI designed for corporate environments
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β FRONTEND β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Next.js Dashboard (Enterprise UI) β β
β β β’ Query Input β’ SQL Display β’ Trust Badges β β
β β β’ Result Panel β’ Insights β’ Query History β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββ
β REST API
βββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ
β BACKEND (FastAPI) β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Query Orchestrator β β
β ββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββ β
β β 1. Schema Grounding Layer (SchemaMapper) β β
β β β Maps business terms to database schema β β
β ββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββ β
β β 2. SQL Generator (LLM-powered) β β
β β β Generates SQL with confidence scores β β
β ββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββ β
β β 3. SQL Validator (Trust Layer) β οΈ CRITICAL β β
β β β Read-only enforcement β β
β β β Schema validation β β
β β β SQL injection detection β β
β ββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββ β
β β 4. Query Executor (Secure) β β
β β β Timeout enforcement β β
β β β Row limits β β
β ββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββ β
β β 5. Consistency Checker β β
β β β Statistical validation β β
β β β Business rule checks β β
β ββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββ β
β β 6. Insight Generator (LLM-powered) β β
β β β Business explanations β β
β β β Recommendations β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββ
β
βββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ
β DATA LAYER β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β SQLite Banking Database β β
β β β’ Customers (50 records) β β
β β β’ Accounts (65+ records) β β
β β β’ Transactions (80+ records) β β
β β β’ Loans (30 records) β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
- Python 3.9+
- Node.js 18+
- npm or yarn
cd enterprise-genai-copilotcd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Initialize database
cd database
python init_db.py
cd ..
# Start backend server
python -m uvicorn app.main:app --reloadBackend will be available at http://localhost:8000
API docs at http://localhost:8000/api/docs
# In a new terminal
cd frontend
# Install dependencies
npm install
# Start development server
npm run devFrontend will be available at http://localhost:3000
Create .env file in the backend directory:
OPENAI_API_KEY=your-api-key-here
LLM_MODEL=gpt-4Note: The system works in mock mode without an API key for demonstration purposes.
Try these questions in the dashboard:
Customer Analytics:
- "How many active customers do we have?"
- "How many high-risk customers are in our database?"
- "What is the distribution of customers by state?"
Account Analysis:
- "What is the total balance across all active accounts?"
- "How many accounts by account type?"
- "What is the average account balance?"
Transaction Intelligence:
- "What is the average transaction amount in the last 30 days?"
- "Show me transaction volume by type"
- "What is the total transaction amount this month?"
Loan Portfolio:
- "What is the total outstanding loan amount?"
- "How many active loans do we have?"
- "What is the average loan interest rate?"
Advanced Analytics:
- "Who are the top 10 customers by total account balance?"
- "Which customers have both loans and high account balances?"
-
Schema Grounding
- Prevents hallucination of non-existent tables/columns
- Maps business terminology to technical schema
- Provides context-aware SQL generation
-
SQL Validation (Multi-layered)
- β Read-only enforcement (SELECT only)
- β Schema validation (tables/columns exist)
- β SQL injection detection
- β Dangerous pattern filtering
- β Column-level access control
-
Query Execution Safety
- Timeout enforcement (30s default)
- Row count limits (10,000 default)
- Execution metrics tracking
- Connection pooling (production)
-
Result Consistency Checking
- Statistical anomaly detection
- Business rule validation
- Null value analysis
- Range validation
-
Trust Signals UI
- Visual trust indicators
- Confidence scores
- Validation status
- Execution metrics
- Automated Insights: LLM generates executive-friendly explanations
- Key Findings: Highlights critical observations
- Caveats: Mentions limitations and considerations
- Recommendations: Suggests next steps and actions
enterprise-genai-copilot/
βββ backend/
β βββ app/
β β βββ main.py # FastAPI application
β β βββ config.py # Configuration management
β β βββ models/
β β β βββ schemas.py # Pydantic models
β β β βββ database.py # Database manager
β β βββ services/
β β β βββ orchestrator.py # Main coordinator
β β β βββ schema_mapper.py # Schema grounding
β β β βββ sql_generator.py # LLM SQL generation
β β β βββ sql_validator.py # Security validation
β β β βββ query_executor.py # Safe execution
β β β βββ consistency_checker.py # Result validation
β β β βββ insight_generator.py # Business insights
β β βββ prompts/
β β β βββ sql_generation.py # SQL gen prompts
β β β βββ insight_generation.py # Insight prompts
β β βββ utils/
β β βββ logger.py # Structured logging
β β βββ auth.py # Mock authentication
β βββ database/
β β βββ schema.sql # Database schema
β β βββ seed_data.sql # Realistic data
β β βββ init_db.py # DB initialization
β βββ tests/
β β βββ test_business_questions.py
β β βββ test_sql_validator.py
β β βββ test_hallucination_scenarios.py
β βββ requirements.txt
β
βββ frontend/
β βββ src/
β β βββ app/
β β β βββ page.tsx # Main dashboard
β β β βββ layout.tsx # Root layout
β β β βββ globals.css # Global styles
β β βββ components/
β β β βββ Dashboard.tsx # Main layout
β β β βββ QueryInput.tsx # Question input
β β β βββ SQLPanel.tsx # SQL display
β β β βββ ResultPanel.tsx # Results table
β β β βββ ExplanationPanel.tsx # Insights
β β β βββ TrustBadges.tsx # Trust indicators
β β β βββ QueryHistory.tsx # History sidebar
β β βββ services/
β β β βββ api.ts # API client
β β βββ types/
β β β βββ index.ts # TypeScript types
β β βββ lib/
β β βββ utils.ts # Utility functions
β βββ package.json
β βββ tsconfig.json
β
βββ docs/
β βββ ARCHITECTURE.md
β βββ API_DOCUMENTATION.md
β βββ DEMO_GUIDE.md
β
βββ README.md
cd backend
pytest tests/ -vTest coverage includes:
- Business question accuracy
- SQL validation security
- Hallucination prevention
- Edge case handling
customer_id,full_name,email,phoneaddress,city,state,zip_codecustomer_since,status,risk_rating
account_id,customer_id,account_numberaccount_type(checking, savings, investment)balance,status,opened_date,interest_rate
transaction_id,account_idtransaction_type(deposit, withdrawal, transfer, payment)amount,transaction_date,description,category
loan_id,customer_id,loan_numberloan_type(mortgage, personal, auto, business)loan_amount,outstanding_balance,interest_ratemonthly_payment,start_date,maturity_date
Enterprise-style layout with query input, SQL panel, results, insights, and trust indicators.
Visual indicators showing:
- β Schema Validated
- π Read-Only Verified
- β Result Consistent
- ποΈ Columns Whitelisted
- β‘ Execution Safe
- Authentication: Replace mock auth with OAuth/JWT
- API Keys: Secure storage in environment/secrets manager
- Database: Use PostgreSQL/MySQL with proper credentials
- Rate Limiting: Implement API rate limits
- Logging: Enhanced audit logging for compliance
- HTTPS: Enable TLS/SSL for all communications
- Caching: Redis for query results and schema metadata
- Connection Pooling: SQLAlchemy for database connections
- Async Processing: Background tasks for long queries
- CDN: Frontend asset distribution
- Load Balancing: Multiple backend instances
This is a hackathon demo project. For production use:
- Add comprehensive error handling
- Implement user authentication
- Add query result caching
- Enhanced LLM prompt optimization
- Multi-database support
- Advanced analytics dashboard
MIT License - See LICENSE file for details
- β Production-grade architecture
- β Multi-layered trust system
- β Comprehensive testing
- β Clean, documented code
- β Enterprise UI/UX
- β Schema grounding prevents hallucinations
- β Consistency checking validates results
- β Trust-first design for enterprises
- β Business insight generation
- β Democratizes data access
- β Reduces analyst workload
- β Accelerates decision-making
- β Maintains data governance
- β Full-stack implementation
- β Realistic demo data
- β Test coverage
- β Comprehensive documentation
For questions or issues, please refer to the documentation in /docs or check the API documentation at /api/docs when the backend is running.
Built with β€οΈ for Enterprise AI Innovation