Senior data analyst and SQL engineer specializing in enterprise data warehouses. Translates business questions into fully documented, optimized SQL queries through a systematic 7-stage workflow.
- ✅ Multi-database support: Oracle, MySQL, PostgreSQL, SQL Server
- 🔍 Data Discovery: Search for tables/columns via metadata and comments
- 📊 Query Optimization: Analyze EXPLAIN plans and optimize performance
- ✅ Safety First: All queries are executed with limits and timeouts
- 📝 Full Documentation: Every query includes comments and documentation
- 🤝 Human-in-the-Loop: Checkpoints for user confirmation before proceeding
# Core dependencies
pip install python-dotenv
# Database drivers (choose according to the database you use)
pip install oracledb # For Oracle
pip install mysql-connector-python # For MySQL
pip install psycopg2-binary # For PostgreSQL
pip install pyodbc # For SQL Server
# Optional (for Excel document search)
pip install openpyxl pandasSQL Server requires an ODBC driver. Install according to your operating system:
Windows:
- Download from Microsoft
macOS:
# Install unixODBC
brew install unixodbc
# Install Microsoft ODBC Driver
brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
brew install msodbcsql17Linux (Ubuntu/Debian):
# Install unixODBC
sudo apt-get install unixodbc-dev
# Install Microsoft ODBC Driver 17
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17Copy .env.example to .env and fill in your connection details:
cp .env.example .envDWH_TYPE=oracle
DWH_USERNAME=your_username
DWH_PASSWORD=your_password
DWH_DSN=hostname:port/service_nameMYSQL_DEV_TYPE=mysql
MYSQL_DEV_USERNAME=your_username
MYSQL_DEV_PASSWORD=your_password
MYSQL_DEV_HOST=localhost
MYSQL_DEV_PORT=3306
MYSQL_DEV_DATABASE=your_databasePG_DEV_TYPE=postgresql
PG_DEV_USERNAME=your_username
PG_DEV_PASSWORD=your_password
PG_DEV_HOST=localhost
PG_DEV_PORT=5432
PG_DEV_DATABASE=your_databaseMSSQL_DEV_TYPE=sqlserver
MSSQL_DEV_USERNAME=your_username
MSSQL_DEV_PASSWORD=your_password
MSSQL_DEV_HOST=localhost
MSSQL_DEV_PORT=1433
MSSQL_DEV_DATABASE=your_database
# Optional: MSSQL_DEV_DRIVER={ODBC Driver 18 for SQL Server}Search for tables and columns by name or comment:
# Search in comments and names
python scripts/search_schema.py --keyword "customer" --db DWH
# Search only in comments
python scripts/search_schema.py --keyword "customer" --search-in comments --db DWH
# Search with regex
python scripts/search_schema.py --keyword "CUST_|CUSTOMER_" --regex --db DWH
# Filter by schema
python scripts/search_schema.py --keyword "revenue" --schema SALES --db DWHCheck table structure, indexes, partitions, and statistics:
# Oracle
python scripts/check_table.py OWNER TABLE_NAME --db DWH
# SQL Server
python scripts/check_table.py dbo Customers --db MSSQL_DEV
# Output as JSON
python scripts/check_table.py SCHEMA TABLE --db DWH --format json
# Output as Markdown
python scripts/check_table.py SCHEMA TABLE --db DWH --format markdownRun SELECT queries with row limits and timeouts:
# Run query from string
python scripts/run_query_safe.py --sql "SELECT * FROM SCHEMA.TABLE" --db DWH
# Run query from file
python scripts/run_query_safe.py --file query.sql --db DWH --limit 50
# Count rows only
python scripts/run_query_safe.py --file query.sql --db DWH --count-only
# Output as JSON
python scripts/run_query_safe.py --file query.sql --db DWH --format jsonAnalyze execution plans for performance optimization:
# Run EXPLAIN on a query
python scripts/explain_query.py --file query.sql --db DWH
# Oracle
python scripts/explain_query.py --sql "SELECT * FROM TABLE" --db DWH
# SQL Server (using SHOWPLAN)
python scripts/explain_query.py --file query.sql --db MSSQL_DEV
# Output as JSON
python scripts/explain_query.py --file query.sql --db DWH --format jsonFind foreign keys and join paths:
# Find relationships for one table
python scripts/find_relationships.py --schema SCHEMA --table TABLE_NAME --db DWH
# Find join paths between multiple tables
python scripts/find_relationships.py --schema SCHEMA --tables TABLE1,TABLE2,TABLE3 --db DWHGet sample data and profiling:
# Get 10 sample rows
python scripts/sample_data.py --schema SCHEMA --table TABLE_NAME --db DWH
# Get 50 sample rows
python scripts/sample_data.py --schema SCHEMA --table TABLE_NAME --db DWH --rows 50
# Data profiling (analyze data distribution)
python scripts/sample_data.py --schema SCHEMA --table TABLE_NAME --db DWH --profileSearch in Excel documentation (if available):
# Search in folder documents/
python scripts/search_documents.py --keyword "customer" --folder documents/
# Search with regex
python scripts/search_documents.py --keyword "CUST|CUSTOMER" --folder documents/ --regexWhen using this skill with Claude, the workflow proceeds through 7 stages:
- Requirement Analysis: Analyze business requirements
- Data Discovery: Find relevant tables/columns
- Data Mapping: Map data and define join conditions
- Query Design: Design query with CTEs and comments
- Query Testing: Test with EXPLAIN and safe execution
- Optimization: Optimize based on EXPLAIN plan
- Documentation: Save query and documentation
The workflow has 4 checkpoints for user confirmation:
- CP1: After Requirement Analysis
- CP2: After Data Discovery (confirm tables/columns)
- CP3: After Data Mapping (confirm joins/filters)
- CP4: Before Query Testing (confirm query logic)
You can skip checkpoints by saying "skip checkpoints" or "auto mode".
- Use
ROWNUMfor pagination - Supports
CONNECT BYfor hierarchical queries - Partition pruning with
WHERE partition_key >= ...
- Case-insensitive string comparison (default)
- Use
LIMITfor pagination GROUP_CONCATfor string aggregation
- Case-sensitive string comparison (default)
- Use
LIMITfor pagination - Native JSON/JSONB support
- Use
TOPorOFFSET...FETCH NEXTfor pagination STRING_AGGfor string aggregation (SQL Server 2017+)- Extended properties for table/column comments
In the references/ folder, you can find reference documents:
dialect-differences.md: Differences between Oracle, MySQL, PostgreSQL, SQL Serverquery-patterns.md: Common SQL query patternswindow-functions.md: Guide to window functionsoptimization.md: Query optimization techniquesdatabase-design.md: Database designdwh-patterns.md: Data warehouse patterns
Error: "Can't open lib 'ODBC Driver 17 for SQL Server'"
- Install the ODBC driver (see System Requirements)
- Or specify a different driver:
MSSQL_DEV_DRIVER={ODBC Driver 18 for SQL Server}
Error: "Login failed for user"
- Check username/password
- Check SQL Server Authentication mode (Windows Auth vs SQL Auth)
- Ensure user has privileges to access the database
Error: "SSL Security error"
- Add
TrustServerCertificate=yesto the connection string - Or properly configure SSL certificate
Error: "TNS:could not resolve the connect identifier"
- Check DSN format:
hostname:port/service_name - Check tnsnames.ora if using alias
Error: "Access denied for user"
- Check username/password
- Check host access permissions:
GRANT ALL ON db.* TO 'user'@'host'
MIT
- 2.2.0: Added SQL Server support
- 2.1.0: Improved checkpoints and workflow
- 2.0.0: Added PostgreSQL support
- 1.0.0: Initial release with Oracle and MySQL