A full-stack intelligent data analytics platform that combines Snowflake Cortex AI with a modern React dashboard to provide automated data warehouse analysis, visualization, and conversational insights with dynamic credential management and selective table analysis.
Blend360 Enterprise Data Intelligence Platform is an enterprise-grade solution that uses AI agents to automatically analyze your Snowflake data warehouse. Key features:
- Dynamic Credential Management - Configure Snowflake connections via UI with secure .pem file upload
- Selective Table Analysis - Choose specific tables to analyze with "Select All" option
- Automated KPIs - AI-generated business metrics
- Dynamic Charts - Intelligent visualizations with fallback mechanisms
- Data Quality Assessment - AI SQL-based validation, fixes, and scoring
- Relationship Mapping - Automatic table relationship inference
- Conversational AI - Natural language querying of insights
- Executive Summaries - Narrative insights for stakeholders
┌──────────────────────────────────────────────────────────────┐
│                       React Frontend                         │
│  • TypeScript + Vite + shadcn/ui                             │
│  • Dashboard, Charts, Chat Interface                         │
│  • Real-time data visualization                              │
└───────────────────┬──────────────────────────────────────────┘
                    │ HTTP/REST API
                    ▼
┌──────────────────────────────────────────────────────────────┐
│                       Flask Backend                          │
│  • Multi-Agent AI System                                     │
│  • Snowflake Cortex Integration                              │
│  • Dynamic SQL Generation & Repair                           │
└───────────────────┬──────────────────────────────────────────┘
                    │ Snowpark + JDBC
                    ▼
┌──────────────────────────────────────────────────────────────┐
│                   Snowflake Data Cloud                       │
│  • Cortex AI (mistral-large2)                                │
│  • Data Warehouse Tables                                     │
│  • INFORMATION_SCHEMA                                        │
│  • CLEAN_INSIGHTS_STORE                                      │
└──────────────────────────────────────────────────────────────┘
- Language: Python 3.8+
- Framework: Flask 2.x with Flask-CORS
- Database: Snowflake Connector Python + Snowpark
- AI: Snowflake Cortex (mistral-large2)
- Auth: Private Key (RSA PKCS#8)
- Utilities: Cryptography, JSON parsing, Regex
- Language: TypeScript 5.8
- Framework: React 18.3
- Build Tool: Vite 5.4
- UI Library: shadcn/ui (Radix UI primitives)
- Styling: Tailwind CSS 3.4
- Charts: Recharts 2.15
- State: React Context + TanStack Query
- Routing: React Router DOM 6.30
data-insights-hub-main/
├── backend/
│   ├── app.py                        # Main Flask application
│   ├── .env                          # Environment configuration
│   ├── CLARITY_SERVICE_ACCOUNT.pem   # Private key for auth
│   └── README.md                     # Backend documentation
├── frontend/
│   ├── src/
│   │   ├── components/               # UI components
│   │   │   ├── ui/                   # shadcn/ui components (40+)
│   │   │   ├── ChatbotSlider.tsx
│   │   │   ├── DomainInsights.tsx
│   │   │   └── ...
│   │   ├── contexts/                 # React contexts
│   │   ├── pages/                    # Route components
│   │   ├── types/                    # TypeScript definitions
│   │   └── App.tsx                   # Root component
│   ├── package.json
│   ├── vite.config.ts
│   ├── tailwind.config.ts
│   └── README.md                     # Frontend documentation
└── README.md                         # This file
NEW: Configure Snowflake connections directly from the UI without editing .env files:
- Credential Input Form: Enter account, user, role, warehouse, database, schema
- Secure File Upload: Upload private key (.pem) files with validation
- Table Selection: Browse and select specific tables for analysis
- Select All Option: Quickly analyze all available tables
Workflow:
1. Click "Load Analysis" button
2. Enter Snowflake credentials in the slider form
3. Upload the .pem private key file
4. Click "Connect & Fetch Tables"
5. Select desired tables with checkboxes
6. Click "Run Analysis" to start the pipeline
The backend implements specialized agents for different analysis tasks:
- MetadataAgent: Extracts schema information (tables, columns, types) with table filtering
- DataProfilerAgent: Profiles data (row counts, distributions)
- RelationshipAgent: Infers FK relationships using AI
- KPIGeneratorAgent: Creates business metrics using Cortex AI
- KPIExecutionAgent: Executes and validates KPI SQL
- ChartGeneratorAgent: Designs visualizations with AI
- ChartDataAgent: Renders charts with dynamic repair
- DataQualityScopeAgent: AI determines quality check targets
- DataQualityProfiler: Executes SQL validation checks
- DataQualityAgent: Analyzes quality signals with AI
- NarrativeInsightAgent: Generates executive summaries
- ChatAgent: Conversational interface for querying insights
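The shared agent pattern can be sketched as follows. This is a hypothetical minimal version for illustration (the real classes live in backend/app.py); the Cortex call is abstracted behind a `complete_fn` callable so agents can be exercised without a Snowflake session:

```python
import json


class BaseAgent:
    """Minimal agent base (sketch). `complete_fn(model, prompt)` stands in
    for a call to SNOWFLAKE.CORTEX.COMPLETE issued through Snowpark."""

    MODEL = "mistral-large2"

    def __init__(self, complete_fn):
        self.complete_fn = complete_fn

    def cortex(self, prompt):
        # The real backend would run something like:
        #   SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-large2', <prompt>)
        return self.complete_fn(self.MODEL, prompt)


class RelationshipAgent(BaseAgent):
    """Asks the model to infer foreign-key relationships from schema metadata."""

    def run(self, metadata):
        prompt = (
            "Infer likely foreign-key relationships between these tables.\n"
            'Respond only as JSON: {"relationships": '
            '[{"table1": "", "table2": "", "relationship": ""}]}\n'
            f"Metadata: {json.dumps(metadata)}"
        )
        return json.loads(self.cortex(prompt))
```

The same structure applies to the other agents; each one owns its prompt and parses the model's JSON reply.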
Intelligent fallback mechanism when AI-generated SQL fails:
def repair_chart_sql(chart, metadata):
    # Selects safe dimension columns (DATE, DEVICE_TYPE, etc.)
    # Picks numeric columns for aggregation
    # Constructs valid GROUP BY with LIMIT
    return repaired_sql

NEW: Combined data quality and transformation view:
- Three-tier quality assessment with scores
- SQL validation for nulls, duplicates, invalid dates
- AI-generated fix suggestions with actionable steps
- Merged display showing "Issue:" and "Action:" for each problem
- Single unified section to avoid duplication
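The repair logic described above can be fleshed out roughly as follows. This is a sketch, not the backend's actual implementation; the metadata shape (`{table: {column: type}}`) and the column-type heuristics are assumptions for illustration:

```python
def repair_chart_sql(chart, metadata):
    """Fallback sketch: pick a safe dimension and a numeric measure,
    then build a simple GROUP BY query with a LIMIT."""
    table = chart["table"]
    columns = metadata.get(table, {})  # assumed shape: {column_name: data_type}
    numeric_types = ("NUMBER", "DECIMAL", "INT", "INTEGER", "FLOAT", "DOUBLE")
    dimension_types = ("DATE", "TIMESTAMP", "VARCHAR", "TEXT", "STRING")
    measures = [c for c, t in columns.items() if t.upper().startswith(numeric_types)]
    dimensions = [c for c, t in columns.items() if t.upper().startswith(dimension_types)]
    if not measures or not dimensions:
        return None  # nothing safe to chart
    dim, measure = dimensions[0], measures[0]
    return (
        f"SELECT {dim}, SUM({measure}) AS VALUE "
        f"FROM {table} GROUP BY {dim} ORDER BY {dim} LIMIT 20"
    )
```

Returning `None` lets the caller drop the chart gracefully instead of surfacing a SQL error to the dashboard.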
Modern React UI featuring:
- Real-time data visualization with Recharts
- Dark/light theme toggle
- Responsive grid layouts
- Loading states and error handling
- AI chat sidebar with context-aware responses
- Connection configuration slider with step-by-step workflow
Natural language interface:
- Context-aware responses based on latest analysis
- Message history
- Fallback responses when data not available
Backend:
- Python 3.8+
- Snowflake account with Cortex enabled
- Private key for authentication (.pem format); can now be provided via the UI
Frontend:
- Node.js 18+ or Bun runtime
- npm/yarn/bun package manager
- Start Backend (minimal .env setup for legacy endpoints only)
cd backend
pip install flask flask-cors snowflake-connector-python snowflake-snowpark-python cryptography
# Minimal .env for /clean-report endpoint (optional)
cat > .env << EOF
SNOWFLAKE_USER=service_account
SNOWFLAKE_ACCOUNT=account.region
SNOWFLAKE_WAREHOUSE=warehouse
SNOWFLAKE_DATABASE=database
SNOWFLAKE_SCHEMA=schema
SNOWFLAKE_ROLE=role
PRIVATE_KEY_PATH=./key.pem
PRIVATE_KEY_PASSPHRASE=
EOF
python app.py

- Start Frontend
cd frontend
npm install # or: bun install
npm run dev      # or: bun dev

- Configure via UI
- Navigate to `http://localhost:8080`
- Click "Load Analysis" button
- Enter Snowflake credentials
- Upload your .pem private key file
- Select tables to analyze
- Run analysis
# Navigate to the backend directory
cd backend
# Install dependencies
pip install flask flask-cors snowflake-connector-python snowflake-snowpark-python cryptography
# Configure environment
cat > .env << EOF
SNOWFLAKE_USER=your_username
SNOWFLAKE_ACCOUNT=your_account.region
SNOWFLAKE_WAREHOUSE=your_warehouse
SNOWFLAKE_DATABASE=your_database
SNOWFLAKE_SCHEMA=your_schema
SNOWFLAKE_ROLE=your_role
PRIVATE_KEY_PATH=./CLARITY_SERVICE_ACCOUNT.pem
PRIVATE_KEY_PASSPHRASE=your_passphrase
EOF
# Create required Snowflake table
# Run in Snowflake:
# CREATE TABLE CLEAN_INSIGHTS_STORE (
# LOAD_ID VARCHAR(255),
# LOAD_DATETIME TIMESTAMP_NTZ,
# CLEAN_JSON VARIANT
# );
# Start server
python app.py

Backend runs on `http://127.0.0.1:8082`
cd frontend
npm install
npm run dev
Frontend runs on `http://localhost:8080`
#### 3. Access the Application
1. Open browser to `http://localhost:8080`
2. Login with any credentials (demo mode)
3. Dashboard automatically fetches latest analysis
4. Click "Run New Analysis" to trigger backend pipeline
## Data Flow
### Complete Analysis Pipeline
User Request → Frontend Dashboard → GET /run-analysis → Backend Flask →
- MetadataAgent → Extract schema
- DataProfilerAgent → Count rows
- RelationshipAgent → Infer FK relationships (Cortex AI)
- KPIGeneratorAgent → Design KPIs (Cortex AI)
- KPIExecutionAgent → Execute KPI SQL
- ChartGeneratorAgent → Design charts (Cortex AI)
- ChartDataAgent → Execute chart SQL (with repair)
- DataQualityScopeAgent → Determine check scope (Cortex AI)
- DataQualityProfiler → Run SQL validations
- DataQualityAgent → Analyze quality (Cortex AI)
- NarrativeInsightAgent → Generate summary (Cortex AI)
- Store in CLEAN_INSIGHTS_STORE → JSON Response → Frontend → DomainInsights Component → Render UI
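The pipeline above amounts to running the agents in sequence over a shared context; each stage reads the results of earlier stages. A hypothetical orchestrator (names are illustrative, not the backend's actual code) might look like:

```python
def run_pipeline(stages, context=None):
    """Run (name, agent_fn) stages in order. Each agent receives the shared
    context dict and its result is stored under its name for later stages."""
    context = dict(context or {})
    for name, agent_fn in stages:
        context[name] = agent_fn(context)
    return context
```

For example, the metadata stage's output feeds the profiler, whose output feeds KPI generation, and the final context is what gets serialized into CLEAN_INSIGHTS_STORE.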
### Chat Flow
User Message → ChatbotSlider → POST /chat → Backend ChatAgent →
- Fetch latest report from CLEAN_INSIGHTS_STORE
- Send message + context to Cortex AI
- Generate contextual answer → JSON Response → Frontend → Display message + Text-to-Speech option
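The chat step can be sketched as a prompt builder that injects the latest stored report as context, with a fallback reply when no analysis exists yet. The function names and message wording below are illustrative, not the backend's actual code:

```python
import json

FALLBACK = "No analysis is available yet. Run an analysis first, then ask again."


def build_chat_prompt(message, report):
    """Combine the user's question with the latest stored report so the
    model answers from actual analysis results."""
    if report is None:
        return None  # caller should fall back instead of calling Cortex
    return (
        "You are a data analytics assistant. Answer using only this report.\n"
        f"Report: {json.dumps(report)}\n"
        f"Question: {message}"
    )


def answer(message, report, complete_fn):
    prompt = build_chat_prompt(message, report)
    if prompt is None:
        return FALLBACK
    return complete_fn("mistral-large2", prompt)
```

This mirrors the fallback behavior noted earlier: when CLEAN_INSIGHTS_STORE is empty, the chatbot replies without invoking Cortex.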
## API Reference
### Backend Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Health check |
| GET | `/run-analysis` | Trigger full analysis pipeline |
| GET | `/clean-report` | Get latest report |
| GET | `/clean-report/<load_id>` | Get specific report |
| GET | `/clean-report/runs` | List all report runs |
| POST | `/chat` | Conversational AI query |
### Response Schema
```json
{
"status": "success",
"data": {
"meta": {
"load_id": "uuid",
"generated_at": "2024-01-01T00:00:00",
"schema_analyzed": "PROD_SCHEMA"
},
"summary": {
"tables_count": 25,
"kpis_count": 4,
"charts_count": 4,
"quality_score": 85
},
"understanding": {
"total_tables": 25,
"tables": [{ "table": "USERS", "columns": 12, "rows": 50000 }],
"relationships": [{ "table1": "ORDERS", "table2": "USERS", "relationship": "USER_ID" }]
},
"kpis": [
{ "name": "Total Revenue", "value": 1500000, "sql": "SELECT SUM(amount)..." }
],
"charts": [
{
"name": "Daily Sales",
"chart_type": "line",
"x_axis": "DATE",
"y_axis": "VALUE",
"sample_data": [{ "DATE": "2024-01-01", "VALUE": 1000 }]
}
],
"data_quality": {
"overall_score": 85,
"issues": [{ "table": "ORDERS", "column": "EMAIL", "issue": "Missing values", "suggested_fix": "..." }]
},
"transformations": [],
"insights": {
"summary": "Your data warehouse contains 25 tables...",
"key_points": ["Main domains: Sales, Marketing", "Total tables: 25"]
}
}
}
```
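A consumer of this schema can pull the headline numbers out defensively; here is a small sketch (the field names are exactly those shown above, the helper itself is hypothetical):

```python
import json


def summarize_report(payload):
    """Extract headline figures from a /clean-report response string."""
    report = json.loads(payload)
    if report.get("status") != "success":
        raise ValueError(f"analysis failed: {report.get('status')}")
    s = report["data"]["summary"]
    return (
        f"{s['tables_count']} tables, {s['kpis_count']} KPIs, "
        f"{s['charts_count']} charts, quality {s['quality_score']}/100"
    )
```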
# Snowflake Connection
SNOWFLAKE_USER=your_username
SNOWFLAKE_ACCOUNT=abc12345.us-west-2
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_DATABASE=PROD_DB
SNOWFLAKE_SCHEMA=PUBLIC
SNOWFLAKE_ROLE=ANALYST
# Authentication
PRIVATE_KEY_PATH=./service_account.pem
PRIVATE_KEY_PASSPHRASE=your_passphrase

Update the backend URL in fetch calls (for production):
// In Dashboard.tsx, ChatbotSlider.tsx
const BACKEND_URL = import.meta.env.VITE_BACKEND_URL || 'http://127.0.0.1:8082';
const response = await fetch(`${BACKEND_URL}/clean-report`);

# Test health endpoint
curl http://localhost:8082/
# Test analysis
curl http://localhost:8082/run-analysis
# Test latest report
curl http://localhost:8082/clean-report
# Test chat
curl -X POST http://localhost:8082/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the data quality score?"}'

# Build and check for errors
npm run build
# Run linter
npm run lint

- Analysis pipeline: 30-60 seconds (depends on schema size)
- Cortex AI calls: ~2-5 seconds each
- Table profiling: Limited to 15 tables
- Chart data: Limited to 20 rows per chart
- Initial load: < 2 seconds
- Chart rendering: < 500ms
- Theme switching: Instant
- Build size: ~500KB (gzipped)
- Private key authentication (RSA 2048-bit)
- No password storage
- CORS enabled for localhost (configure for production)
- Environment-based secrets
- Mock authentication (replace in production)
- localStorage for demo purposes
- No sensitive data in client
- HTTPS required for production
Backend not connecting to Snowflake:
- Verify private key format (PKCS#8 DER)
- Check public key is added to Snowflake user
- Confirm account identifier format
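A quick way to verify the key file is to load it with the `cryptography` package and convert it to the DER-encoded PKCS#8 bytes the Snowflake connector's `private_key` parameter expects; a sketch (the helper name is ours, the file path is an example):

```python
from cryptography.hazmat.primitives import serialization


def load_private_key_der(path, passphrase=None):
    """Load a PEM private key and return DER-encoded PKCS#8 bytes,
    suitable for snowflake.connector.connect(private_key=...)."""
    with open(path, "rb") as f:
        key = serialization.load_pem_private_key(
            f.read(),
            password=passphrase.encode() if passphrase else None,
        )
    return key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
```

If this function raises, the .pem file itself is the problem; if it succeeds but the connection still fails, check the public key registered on the Snowflake user instead.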
Frontend can't reach backend:
- Ensure the backend is running on port 8082 (the frontend dev server uses 8080)
- Check CORS configuration
- Verify fetch URLs
Charts not rendering:
- Check sample_data format
- Ensure x_axis/y_axis keys exist
- Look for null values
Cortex AI not responding:
- Verify Cortex is enabled in your region
- Check role privileges
- Confirm model name is correct
class NewAgent(BaseAgent):
def run(self, context):
return self.cortex(f"""
Your prompt here...
Context: {json.dumps(context)}
Format: {{ "result": "" }}
        """)

// src/components/NewComponent.tsx
export function NewComponent() {
return <div>Your component</div>;
}
// Add to Dashboard.tsx or other pages
import { NewComponent } from '@/components/NewComponent';

Add a shadcn/ui component:

npx shadcn-ui@latest add [component-name]

- Use a production-grade WSGI server:
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:8080 app:app

- Configure CORS for your domain:
CORS(app, origins=['https://yourdomain.com'])

- Use environment variables for secrets
- Build for production:
npm run build

- Deploy the `dist/` folder to:
  - Netlify
  - Vercel
  - AWS S3 + CloudFront
  - Azure Static Web Apps
- Update backend URLs in code
- Backend README - Detailed backend documentation
- Frontend README - Detailed frontend documentation
- Follow existing code structure
- Use TypeScript for frontend
- Add docstrings for Python functions
- Test all changes locally
- Update READMEs for new features
Proprietary - Data Insights Hub
Data Insights Hub Development Team
For issues, questions, or feature requests, contact the development team.
Built with Snowflake Cortex AI | React | Python