📊 Glean Observability Dashboard

Production-ready observability platform for Glean deployments with USE methodology monitoring and comprehensive cost analytics.

🎯 Quick Links

🚀 Quick Deployment Guide (DEPLOYMENT.md) - Deploy in under 10 minutes
📖 Complete Setup Guide (SETUP_GUIDE.md) - Detailed setup for local and production
🧪 Testing Guide (TESTING_GUIDE.md) - Local testing instructions
🔧 Migration Status (MIGRATION_STATUS.md) - Track feature completeness

🚀 Tech Stack

Frontend: Next.js 15 (React) + TypeScript + Tailwind CSS + Recharts
Backend: FastAPI (Python 3.13) + Google Cloud APIs
Monitoring: Prometheus (GKE Managed) + Cloud Monitoring API
Deployment: Vercel (Frontend) + Cloud Run (Backend)
Hosting cost: ~$5-20/month for typical usage

✨ Features

USE Methodology Monitoring

Utilization: CPU %, Memory %, Pod Availability
Saturation: CPU Throttling, Queue Backlogs
Errors: Pod Restarts, Crash Loops

Cost Analytics

Real-time GKE cluster cost estimates
Cloud SQL instance costs
Total infrastructure spend
Daily/Monthly projections
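
The daily and monthly projections come down to simple arithmetic on an hourly rate. The following is a minimal sketch, assuming made-up hourly rates and a flat 30-day month. The function name `project_costs` and the figures are illustrative and are not taken from the backend's cost estimator:

```python
from typing import Dict

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: flat 30-day month


def project_costs(hourly_rates: Dict[str, float]) -> Dict[str, float]:
    """Project daily and monthly spend (USD) from per-service hourly rates."""
    hourly_total = sum(hourly_rates.values())
    daily = hourly_total * HOURS_PER_DAY
    return {
        "hourly": round(hourly_total, 2),
        "daily": round(daily, 2),
        "monthly": round(daily * DAYS_PER_MONTH, 2),
    }


# Hypothetical rates for illustration only, not real GCP pricing.
projection = project_costs({"gke": 0.50, "cloud_sql": 0.25})
print(projection)
```

Keeping the per-service rates in one mapping makes it straightforward to report both a per-service breakdown and the total infrastructure spend from the same input.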

Components Monitored

🔍 Crawler (Content Ingestion)
📡 Datasource Events Handler
🧠 Query Parser (NLP/ML)
⚡ Query Engine
🧬 Semantic Index (Qdrant Vector DB)
🔎 Keyword Index (Cloud SQL)
👥 User Data Layer
🌐 Load Balancer

🏗️ Architecture

┌─────────────────────────────────────┐
│        Next.js Frontend             │
│     (TypeScript + Tailwind)         │
│                                     │
│  - Dashboard UI                     │
│  - USE Metric Visualization         │
│  - Cost Charts                      │
│  - Component Health Cards           │
└──────────────┬──────────────────────┘
               │ REST API
               ↓
┌─────────────────────────────────────┐
│       FastAPI Backend               │
│          (Python)                   │
│                                     │
│  - Prometheus Client                │
│  - GCP Metrics Client               │
│  - Cost Estimator                   │
│  - Config Management                │
└──────────────┬──────────────────────┘
               │
               ↓
┌─────────────────────────────────────┐
│         GCP Services                │
│                                     │
│  - GKE Managed Prometheus           │
│  - Cloud Monitoring API             │
│  - Cloud SQL                        │
│  - Cloud Billing API                │
└─────────────────────────────────────┘

📁 Project Structure

glean-observability-dashboard/
├── frontend/              # Next.js application
│   ├── src/
│   │   ├── app/          # App router pages
│   │   ├── components/   # React components
│   │   ├── lib/          # Utilities
│   │   └── types/        # TypeScript types
│   └── package.json
│
├── backend/               # FastAPI application
│   ├── app/
│   │   ├── api/          # API endpoints
│   │   ├── clients/      # GCP/Prometheus clients
│   │   ├── models/       # Data models
│   │   └── main.py       # FastAPI app
│   └── requirements.txt
│
└── README.md

🚀 Quick Start

For teammates: See SETUP_GUIDE.md for complete setup instructions with GCP authentication.

For deployment: See DEPLOYMENT.md for production deployment in under 10 minutes.

Fastest Local Setup (3 commands)

# 1. Authenticate with GCP
gcloud auth application-default login

# 2. Start backend (Terminal 1)
cd backend && python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt && uvicorn app.main:app --reload --port 8000

# 3. Start frontend (Terminal 2)
cd frontend && npm install && npm run dev

Open http://localhost:3000 and you're ready! 🎉

🌐 Production Deployment

Option 1: Automated (Recommended)

Click the deploy buttons at the top of this README for one-click deployment to Vercel and Cloud Run.

Option 2: Manual (Full Control)

See DEPLOYMENT.md for step-by-step instructions.

Summary:

1. Deploy backend to Cloud Run: gcloud run deploy --source .
2. Deploy frontend to Vercel: vercel --prod
3. Connect them with environment variables
4. Share with your team! 🎊

Deployed URLs:

Frontend: https://glean-observability-dashboard.vercel.app
Backend: https://glean-observability-api-xxx.run.app

🎨 UI Features

✅ Real-time Metrics: Auto-refresh with configurable intervals (30-300s)
✅ USE Methodology: Utilization, Saturation, Errors for all components
✅ Cost Analytics: GKE + Cloud SQL cost estimates and breakdowns
✅ Responsive Design: Mobile, tablet, and desktop optimized
✅ Time Range Selector: 1h, 6h, 24h, 7d, 30d historical data
✅ Component Cards: Expandable details with color-coded health status
✅ Interactive UI: Collapsible sections, hover states, loading indicators
✅ Professional Theme: Modern design with Tailwind CSS and Lucide icons
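
One plausible way to connect the time range selector to the API is to convert each label into the `lookback_hours` query parameter. This is a sketch under that assumption. The helper `range_to_hours` is hypothetical and may not match how the frontend actually does it:

```python
def range_to_hours(label: str) -> int:
    """Convert a label such as '6h' or '7d' into a number of hours."""
    units = {"h": 1, "d": 24}
    if len(label) < 2:
        raise ValueError(f"unsupported time range: {label!r}")
    value, unit = label[:-1], label[-1]
    if unit not in units or not value.isdigit():
        raise ValueError(f"unsupported time range: {label!r}")
    return int(value) * units[unit]


# The five ranges offered by the selector.
hours = [range_to_hours(r) for r in ("1h", "6h", "24h", "7d", "30d")]
print(hours)
```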

📊 API Endpoints

Core Endpoints

GET /api/health - Health check
GET /api/cluster/overview - Cluster metrics (nodes, CPU, memory, pods)
GET /api/cluster/cost - Cost estimates (GKE + Cloud SQL)
GET /api/components - List all monitored components
GET /api/components/{name} - Detailed USE metrics for a component

Query Parameters

project_id (required) - GCP project ID
deployment_name (optional) - Deployment identifier
lookback_hours (optional) - Historical data window (default: 24)

Example Request

curl "http://localhost:8000/api/cluster/overview?project_id=glean-support-sandbox&deployment_name=support-sandbox"

Full API docs: http://localhost:8000/docs (FastAPI auto-generated)
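
The same request can be built from Python using only the standard library. A minimal sketch: the helper `build_url` is hypothetical, and the parameter names are the ones listed under Query Parameters.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # local backend from the Quick Start


def build_url(path: str, project_id: str,
              deployment_name: Optional[str] = None,
              lookback_hours: Optional[int] = None) -> str:
    """Build a request URL, leaving out optional parameters that are unset."""
    params = {"project_id": project_id}
    if deployment_name is not None:
        params["deployment_name"] = deployment_name
    if lookback_hours is not None:
        params["lookback_hours"] = str(lookback_hours)
    return f"{BASE_URL}{path}?{urlencode(params)}"


url = build_url("/api/cluster/overview", "glean-support-sandbox",
                deployment_name="support-sandbox")
print(url)

# To send it (requires a running backend):
#   import json, urllib.request
#   with urllib.request.urlopen(url, timeout=10) as resp:
#       data = json.load(resp)
```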

🔒 Security & Authentication

For Local Development

Use gcloud auth application-default login (your Google account)
No additional setup needed

For Production

Backend: Service account with IAM roles:
- roles/monitoring.viewer
- roles/container.viewer
- roles/cloudsql.viewer
Frontend: Optional OAuth 2.0 or Vercel password protection
CORS: Configured for Vercel domains
Concurrency: Cloud Run limits each instance to 80 concurrent requests by default (a concurrency cap, not request rate limiting)

See SETUP_GUIDE.md for detailed security configuration.

📈 Monitoring Best Practices

This dashboard implements Brendan Gregg's USE Method:

For every resource, check:
- Utilization: How busy is it?
- Saturation: Is there queuing?
- Errors: Are there failures?

Color-coded thresholds:
- 🟢 Green: Healthy
- 🟡 Yellow: Warning
- 🔴 Red: Critical
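
The color coding can be pictured as a threshold check per metric, with a component taking the color of its worst metric. This is a sketch only: the 70% and 90% cutoffs and both function names are assumptions for illustration, not the dashboard's actual thresholds.

```python
from typing import Iterable

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def classify(value: float, warning: float = 70.0, critical: float = 90.0) -> str:
    """Map a percentage metric to a health color (cutoffs are hypothetical)."""
    if value >= critical:
        return "red"
    if value >= warning:
        return "yellow"
    return "green"


def overall(statuses: Iterable[str]) -> str:
    """Treat a component as being as unhealthy as its worst metric."""
    return max(statuses, key=lambda s: SEVERITY[s])


# Example: CPU at 45%, memory at 75% -> the component shows yellow.
component_status = overall([classify(45.0), classify(75.0)])
print(component_status)
```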

🤝 Contributing

This is an internal Glean tool. For improvements or bug fixes:

1. Fork the repository
2. Create a feature branch: git checkout -b feature/amazing-feature
3. Make your changes
4. Test thoroughly (see TESTING_GUIDE.md)
5. Commit: git commit -m 'Add amazing feature'
6. Push: git push origin feature/amazing-feature
7. Open a Pull Request

Development Workflow

# Make changes to backend
cd backend && source venv/bin/activate
# Edit code - uvicorn hot-reloads when started with: uvicorn app.main:app --reload --port 8000

# Make changes to frontend
cd frontend
# Edit code - Next.js hot-reloads automatically

🐛 Troubleshooting

Common Issues

"Failed to fetch data"

Check backend is running: curl http://localhost:8000/api/health
Verify GCP auth: gcloud auth application-default print-access-token
Check browser console for CORS errors
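
The first check above can also be scripted. A minimal sketch, assuming the backend answers `GET /api/health` with HTTP 200 when healthy. The function name `backend_is_up` and its `opener` parameter (there so the check can be exercised without a network) are hypothetical:

```python
import urllib.error
import urllib.request


def backend_is_up(base_url: str = "http://localhost:8000",
                  opener=urllib.request.urlopen) -> bool:
    """Return True if GET /api/health answers with HTTP 200."""
    try:
        with opener(f"{base_url}/api/health", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


# Usage (needs the backend from the Quick Start to be running):
#   print("backend reachable" if backend_is_up() else "backend not reachable")
```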

"Permission denied"

Run: gcloud auth application-default login
Ensure you have GCP project access

"No metrics available"

Verify GKE cluster has Managed Prometheus enabled
Check project ID is correct

See SETUP_GUIDE.md for detailed troubleshooting.

📝 License

Internal Glean tool - Not for external distribution.

🙏 Acknowledgments

USE Method: Based on Brendan Gregg's USE methodology
Built for: Glean infrastructure monitoring
Powered by: Google Cloud Platform, Prometheus, FastAPI, Next.js
Migration: Successfully migrated from Streamlit to Next.js for better UX

📞 Support

Internal Slack: #glean-observability
Issues: Open a GitHub issue
Documentation: See linked guides above
On-call: Check PagerDuty rotation

🎉 Success Metrics

After deploying, you should see:

✅ Cluster overview with health status
✅ Cost estimates for GKE and Cloud SQL
✅ 8 component cards with USE metrics
✅ Auto-refresh working
✅ Time range selection functional
✅ All metrics color-coded (green/yellow/red)

Happy monitoring! 📊🚀
