Skip to content

BLEND360/Data-Intelligence-Platform

Β 
Β 

Repository files navigation

Blend360 Enterprise Data Intelligence Platform

A full-stack intelligent data analytics platform that combines Snowflake Cortex AI with a modern React dashboard to provide automated data warehouse analysis, visualization, and conversational insights with dynamic credential management and selective table analysis.

πŸš€ Overview8

Blend360 Enterprise Data Intelligence Platform is an enterprise-grade solution that leverages AI agents to automatically analyze your Snowflake data warehouse, generating:

  • πŸ” Dynamic Credential Management - Configure Snowflake connections via UI with secure .pem file upload
  • πŸ“‹ Selective Table Analysis - Choose specific tables to analyze with "Select All" option
  • πŸ“Š Automated KPIs - AI-generated business metrics
  • πŸ“ˆ Dynamic Charts - Intelligent visualizations with fallback mechanisms
  • πŸ” Data Quality Assessment - AI SQL-based validation, fixes and scoring
  • πŸ”— Relationship Mapping - Automatic table relationship inference
  • πŸ’¬ Conversational AI - Natural language querying of insights
  • πŸ“ Executive Summaries - Narrative insights for stakeholders

πŸ—οΈ Architecture

System Components

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                     React Frontend                          β”‚
β”‚  β€’ TypeScript + Vite + shadcn/ui                           β”‚
β”‚  β€’ Dashboard, Charts, Chat Interface                        β”‚
β”‚  β€’ Real-time data visualization                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                   β”‚ HTTP/REST API
                   β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   Flask Backend                             β”‚
β”‚  β€’ Multi-Agent AI System                                    β”‚
β”‚  β€’ Snowflake Cortex Integration                             β”‚
β”‚  β€’ Dynamic SQL Generation & Repair                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                   β”‚ Snowpark + JDBC
                   β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Snowflake Data Cloud                           β”‚
β”‚  β€’ Cortex AI (Mistral-Large2)                              β”‚
β”‚  β€’ Data Warehouse Tables                                    β”‚
β”‚  β€’ INFORMATION_SCHEMA                                       β”‚
β”‚  β€’ CLEAN_INSIGHTS_STORE                                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Technology Stack

Backend

  • Language: Python 3.8+
  • Framework: Flask 2.x with Flask-CORS
  • Database: Snowflake Connector Python + Snowpark
  • AI: Snowflake Cortex (mistral-large2)
  • Auth: Private Key (RSA PKCS#8)
  • Utilities: Cryptography, JSON parsing, Regex

Frontend

  • Language: TypeScript 5.8
  • Framework: React 18.3
  • Build Tool: Vite 5.4
  • UI Library: shadcn/ui (Radix UI primitives)
  • Styling: Tailwind CSS 3.4
  • Charts: Recharts 2.15
  • State: React Context + TanStack Query
  • Routing: React Router DOM 6.30

πŸ“‚ Project Structure

data-insights-hub-main/
β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ app.py                    # Main Flask application
β”‚   β”œβ”€β”€ .env                      # Environment configuration
β”‚   β”œβ”€β”€ CLARITY_SERVICE_ACCOUNT.pem  # Private key for auth
β”‚   └── README.md                 # Backend documentation
β”œβ”€β”€ frontend/
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ components/           # UI components
β”‚   β”‚   β”‚   β”œβ”€β”€ ui/              # shadcn/ui components (40+)
β”‚   β”‚   β”‚   β”œβ”€β”€ ChatbotSlider.tsx
β”‚   β”‚   β”‚   β”œβ”€β”€ DomainInsights.tsx
β”‚   β”‚   β”‚   └── ...
β”‚   β”‚   β”œβ”€β”€ contexts/            # React contexts
β”‚   β”‚   β”œβ”€β”€ pages/               # Route components
β”‚   β”‚   β”œβ”€β”€ types/               # TypeScript definitions
β”‚   β”‚   └── App.tsx              # Root component
β”‚   β”œβ”€β”€ package.json
β”‚   β”œβ”€β”€ vite.config.ts
β”‚   β”œβ”€β”€ tailwind.config.ts
β”‚   └── README.md                # Frontend documentation
└── README.md                    # This file

🎯 Key Features

1. Dynamic Connection Configuration

NEW: Configure Snowflake connections directly from the UI without editing .env files:

  • Credential Input Form: Enter account, user, role, warehouse, database, schema
  • Secure File Upload: Upload private key (.pem) files with validation
  • Table Selection: Browse and select specific tables for analysis
  • Select All Option: Quickly analyze all available tables

Workflow:

  1. Click "Load Analysis" button
  2. Enter Snowflake credentials in slider form
  3. Upload .pem private key file
  4. Click "Connect & Fetch Tables"
  5. Select desired tables with checkboxes
  6. Click "Run Analysis" to start pipeline

2. AI-Powered Multi-Agent System

The backend implements specialized agents for different analysis tasks:

  • MetadataAgent: Extracts schema information (tables, columns, types) with table filtering
  • DataProfilerAgent: Profiles data (row counts, distributions)
  • RelationshipAgent: Infers FK relationships using AI
  • KPIGeneratorAgent: Creates business metrics using Cortex AI
  • KPIExecutionAgent: Executes and validates KPI SQL
  • ChartGeneratorAgent: Designs visualizations with AI
  • ChartDataAgent: Renders charts with dynamic repair
  • DataQualityScopeAgent: AI determines quality check targets
  • DataQualityProfiler: Executes SQL validation checks
  • DataQualityAgent: Analyzes quality signals with AI
  • NarrativeInsightAgent: Generates executive summaries
  • ChatAgent: Conversational interface for querying insights

3. Dynamic SQL Repair

Intelligent fallback mechanism when AI-generated SQL fails:

def repair_chart_sql(chart, metadata):
    # Selects safe dimension columns (DATE, DEVICE_TYPE, etc.)
    # Picks numeric columns for aggregation
    # Constructs valid GROUP BY with LIMIT
    return repaired_sql

4. Comprehensive Data Quality & Transformation

NEW: Combined data quality and transformation view:

  • Three-tier quality assessment with scores
  • SQL validation for nulls, duplicates, invalid dates
  • AI-generated fix suggestions with actionable steps
  • Merged display showing "Issue:" and "Action:" for each problem
  • Single unified section to avoid duplication

5. Interactive Dashboard

Modern React UI featuring:

  • Real-time data visualization with Recharts
  • Dark/light theme toggle
  • Responsive grid layouts
  • Loading states and error handling
  • AI chat sidebar with context-aware responses
  • Connection configuration slider with step-by-step workflow

6. Conversational AI

Natural language interface:

  • Context-aware responses based on latest analysis
  • Message history
  • Fallback responses when data not available

🚦 Getting Started

Prerequisites

Backend:

  • Python 3.8+
  • Snowflake account with Cortex enabled
  • Private key for authentication (.pem format) - Can now be provided via UI

Frontend:

  • Node.js 18+ or Bun runtime
  • npm/yarn/bun package manager

Quick Start

Option 1: UI-Based Configuration (Recommended)

  1. Start Backend (minimal .env setup for legacy endpoints only)
cd backend
pip install flask flask-cors snowflake-connector-python snowflake-snowpark-python cryptography

# Minimal .env for /clean-report endpoint (optional)
cat > .env << EOF
SNOWFLAKE_USER=service_account
SNOWFLAKE_ACCOUNT=account.region
SNOWFLAKE_WAREHOUSE=warehouse
SNOWFLAKE_DATABASE=database
SNOWFLAKE_SCHEMA=schema
SNOWFLAKE_ROLE=role
PRIVATE_KEY_PATH=./key.pem
PRIVATE_KEY_PASSPHRASE=
EOF

python app.py
  1. Start Frontend
cd frontend
npm install  # or: bun install
npm run dev  # or: bun dev
  1. Configure via UI
    • Navigate to http://localhost:8080
    • Click "Load Analysis" button
    • Enter Snowflake credentials
    • Upload your .pem private key file
    • Select tables to analyze
    • Run analysis

Option 2: Traditional .env Configuration

cd backend

# Install dependencies
pip install flask flask-cors snowflake-connector-python snowflake-snowpark-python cryptography

# Configure environment
cat > .env << EOF
SNOWFLAKE_USER=your_username
SNOWFLAKE_ACCOUNT=your_account.region
SNOWFLAKE_WAREHOUSE=your_warehouse
SNOWFLAKE_DATABASE=your_database
SNOWFLAKE_SCHEMA=your_schema
SNOWFLAKE_ROLE=your_role
PRIVATE_KEY_PATH=./CLARITY_SERVICE_ACCOUNT.pem
PRIVATE_KEY_PASSPHRASE=your_passphrase
EOF

# Create required Snowflake table
# Run in Snowflake:
# CREATE TABLE CLEAN_INSIGHTS_STORE (
#   LOAD_ID VARCHAR(255),
#   LOAD_DATETIME TIMESTAMP_NTZ,
#   CLEAN_JSON VARIANT
# );

# Start server
python app.py

Backend runs on http://127.0.0.1:8082 cd frontend

Install dependencies

npm install

or: bun install

Start development server

npm run dev

or: bun run dev


Frontend runs on `http://localhost:8080`

#### 3. Access the Application

1. Open browser to `http://localhost:8080`
2. Login with any credentials (demo mode)
3. Dashboard automatically fetches latest analysis
4. Click "Run New Analysis" to trigger backend pipeline

## πŸ”„ Data Flow

### Complete Analysis Pipeline

User Request β†’ Frontend Dashboard ↓ GET /run-analysis β†’ Backend Flask ↓

  1. MetadataAgent β†’ Extract schema
  2. DataProfilerAgent β†’ Count rows
  3. RelationshipAgent β†’ Infer FK relationships (Cortex AI)
  4. KPIGeneratorAgent β†’ Design KPIs (Cortex AI)
  5. KPIExecutionAgent β†’ Execute KPI SQL
  6. ChartGeneratorAgent β†’ Design charts (Cortex AI)
  7. ChartDataAgent β†’ Execute chart SQL (with repair)
  8. DataQualityScopeAgent β†’ Determine check scope (Cortex AI)
  9. DataQualityProfiler β†’ Run SQL validations
  10. DataQualityAgent β†’ Analyze quality (Cortex AI)
  11. NarrativeInsightAgent β†’ Generate summary (Cortex AI)
  12. Store in CLEAN_INSIGHTS_STORE ↓ JSON Response β†’ Frontend ↓ DomainInsights Component β†’ Render UI

### Chat Flow

User Message β†’ ChatbotSlider ↓ POST /chat β†’ Backend ChatAgent ↓

  1. Fetch latest report from CLEAN_INSIGHTS_STORE
  2. Send message + context to Cortex AI
  3. Generate contextual answer ↓ JSON Response β†’ Frontend ↓ Display message + Text-to-Speech option

## πŸ“Š API Reference

### Backend Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Health check |
| GET | `/run-analysis` | Trigger full analysis pipeline |
| GET | `/clean-report` | Get latest report |
| GET | `/clean-report/<load_id>` | Get specific report |
| GET | `/clean-report/runs` | List all report runs |
| POST | `/chat` | Conversational AI query |

### Response Schema

```json
{
  "status": "success",
  "data": {
    "meta": {
      "load_id": "uuid",
      "generated_at": "2024-01-01T00:00:00",
      "schema_analyzed": "PROD_SCHEMA"
    },
    "summary": {
      "tables_count": 25,
      "kpis_count": 4,
      "charts_count": 4,
      "quality_score": 85
    },
    "understanding": {
      "total_tables": 25,
      "tables": [{ "table": "USERS", "columns": 12, "rows": 50000 }],
      "relationships": [{ "table1": "ORDERS", "table2": "USERS", "relationship": "USER_ID" }]
    },
    "kpis": [
      { "name": "Total Revenue", "value": 1500000, "sql": "SELECT SUM(amount)..." }
    ],
    "charts": [
      {
        "name": "Daily Sales",
        "chart_type": "line",
        "x_axis": "DATE",
        "y_axis": "VALUE",
        "sample_data": [{ "DATE": "2024-01-01", "VALUE": 1000 }]
      }
    ],
    "data_quality": {
      "overall_score": 85,
      "issues": [{ "table": "ORDERS", "column": "EMAIL", "issue": "Missing values", "suggested_fix": "..." }]
    },
    "transformations": [],
    "insights": {
      "summary": "Your data warehouse contains 25 tables...",
      "key_points": ["Main domains: Sales, Marketing", "Total tables: 25"]
    }
  }
}

πŸ”§ Configuration

Backend Environment Variables

# Snowflake Connection
SNOWFLAKE_USER=your_username
SNOWFLAKE_ACCOUNT=abc12345.us-west-2
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_DATABASE=PROD_DB
SNOWFLAKE_SCHEMA=PUBLIC
SNOWFLAKE_ROLE=ANALYST

# Authentication
PRIVATE_KEY_PATH=./service_account.pem
PRIVATE_KEY_PASSPHRASE=your_passphrase

Frontend Configuration

Update backend URL in fetch calls (for production):

// In Dashboard.tsx, ChatbotSlider.tsx
const BACKEND_URL = process.env.VITE_BACKEND_URL || 'http://127.0.0.1:8080';
const response = await fetch(`${BACKEND_URL}/clean-report`);

πŸ§ͺ Testing

Backend Testing

# Test health endpoint
curl http://localhost:8080/

# Test analysis
curl http://localhost:8080/run-analysis

# Test latest report
curl http://localhost:8080/clean-report

# Test chat
curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the data quality score?"}'

Frontend Testing

# Build and check for errors
npm run build

# Run linter
npm run lint

πŸ“ˆ Performance

Backend

  • Analysis pipeline: 30-60 seconds (depends on schema size)
  • Cortex AI calls: ~2-5 seconds each
  • Table profiling: Limited to 15 tables
  • Chart data: Limited to 20 rows per chart

Frontend

  • Initial load: < 2 seconds
  • Chart rendering: < 500ms
  • Theme switching: Instant
  • Build size: ~500KB (gzipped)

πŸ›‘οΈ Security

Backend

  • Private key authentication (RSA 2048-bit)
  • No password storage
  • CORS enabled for localhost (configure for production)
  • Environment-based secrets

Frontend

  • Mock authentication (replace in production)
  • localStorage for demo purposes
  • No sensitive data in client
  • HTTPS required for production

πŸ› Troubleshooting

Common Issues

Backend not connecting to Snowflake:

  • Verify private key format (PKCS#8 DER)
  • Check public key is added to Snowflake user
  • Confirm account identifier format

Frontend can't reach backend:

  • Ensure backend is running on port 8080
  • Check CORS configuration
  • Verify fetch URLs

Charts not rendering:

  • Check sample_data format
  • Ensure x_axis/y_axis keys exist
  • Look for null values

Cortex AI not responding:

  • Verify Cortex is enabled in your region
  • Check role privileges
  • Confirm model name is correct

πŸ“ Development

Adding New Agents

class NewAgent(BaseAgent):
    def run(self, context):
        return self.cortex(f"""
        Your prompt here...
        Context: {json.dumps(context)}
        Format: {{ "result": "" }}
        """)

Adding New Frontend Components

// src/components/NewComponent.tsx
export function NewComponent() {
  return <div>Your component</div>;
}

// Add to Dashboard.tsx or other pages
import { NewComponent } from '@/components/NewComponent';

Adding shadcn/ui Components

npx shadcn-ui@latest add [component-name]

πŸš€ Deployment

Backend (Production)

  1. Use production-grade WSGI server:
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:8080 app:app
  1. Configure CORS for your domain:
CORS(app, origins=['https://yourdomain.com'])
  1. Use environment variables for secrets

Frontend (Production)

  1. Build for production:
npm run build
  1. Deploy dist/ folder to:

    • Netlify
    • Vercel
    • AWS S3 + CloudFront
    • Azure Static Web Apps
  2. Update backend URLs in code

πŸ“š Documentation

🀝 Contributing

  1. Follow existing code structure
  2. Use TypeScript for frontend
  3. Add docstrings for Python functions
  4. Test all changes locally
  5. Update READMEs for new features

πŸ“„ License

Proprietary - Data Insights Hub

πŸ‘₯ Team

Data Insights Hub Development Team

πŸ†˜ Support

For issues, questions, or feature requests, contact the development team.


Built with ❄️ Snowflake Cortex AI | βš›οΈ React | 🐍 Python

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • TypeScript 83.3%
  • Python 10.2%
  • Shell 3.2%
  • CSS 1.7%
  • Dockerfile 1.0%
  • HTML 0.3%
  • JavaScript 0.3%