# Install dependencies
npm install
# Build
npm run build
# Analyze your database
node dist/cli.js analyze postgresql://user:pass@localhost:5432/mydbBefore running InstantIndex, your PostgreSQL database needs:
Add to postgresql.conf:
shared_preload_libraries = 'pg_stat_statements'
Restart PostgreSQL, then:
CREATE EXTENSION pg_stat_statements;CREATE EXTENSION hypopg;hypopg is required for InstantIndex to create hypothetical indexes without disk I/O.
$ node dist/cli.js analyze postgresql://localhost/ecommerce
InstantIndex - Automatic Database Index Optimization
Connecting to database...
✓ Connected
Checking required extensions...
✓ hypopg available (fast mode)
Analyzing database schema...
✓ Found 12 tables, 87 columns, 15 existing indexes
Analyzing query workload...
✓ Found 247 unique queries (1,245,789 total executions)
Generating index candidates...
✓ Generated 342 candidate indexes
Finding optimal index configuration...
Baseline cost: 487,392 units
Testing 342 index candidates...
Added: users(email,status) - improvement: 24.3%
Added: orders(user_id,created_at DESC) - improvement: 18.7%
Added: events(city,date) - improvement: 5.2%
Pruned redundant: users(email)
Progress |████████████████████████████████| 100% || 342/342 candidates tested
================================================================================
OPTIMIZATION RESULTS
================================================================================
CURRENT STATE:
Indexes: 15
Total cost: 487,392 units/day
RECOMMENDED CONFIGURATION:
Indexes: 17 (3 new)
Total cost: 124,891 units/day
Improvement: 74.4% faster
Storage overhead: 167.45 MB
Write overhead: +6.0%
NEW INDEXES TO CREATE:
1. CREATE INDEX idx_users_email_status ON users (email, status);
Affects: 3 query patterns (55,234 queries/day)
- SELECT * FROM users WHERE email = $1 (45000/day: 1200ms → 28ms, 42.9x faster)
- SELECT * FROM users WHERE email = $1 AND status = $2 (8500/day: 890ms → 12ms, 74.2x faster)
- SELECT * FROM users WHERE email = $1 ORDER BY status (1734/day: 2100ms → 45ms, 46.7x faster)
2. CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC);
Affects: 2 query patterns (89,234 queries/day)
- SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC (85000/day: 450ms → 8ms, 56.3x faster)
- SELECT * FROM orders WHERE user_id = $1 AND created_at > $2 (4234/day: 1800ms → 15ms, 120.0x faster)
3. CREATE INDEX idx_events_city_date ON events (city, date) WHERE date > NOW() - INTERVAL '30 days';
Affects: 1 query pattern (12,456 queries/day)
- Dashboard recent events query (12456/day: 2300ms → 18ms, 127.8x faster)
REDUNDANT INDEXES (can be removed):
1. idx_users_email on users(email)
Made redundant by new composite indexes
TOP IMPACTED QUERIES:
1. SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC LIMIT 10
Frequency: 85,000/day
Current: 450ms → Optimized: 8ms
Speedup: 56.3x (saves 10.5h/day)
2. SELECT * FROM users WHERE email = $1
Frequency: 45,000/day
Current: 1200ms → Optimized: 28ms
Speedup: 42.9x (saves 14.7h/day)
[... more queries ...]
================================================================================
Apply these changes? (y/n): y
Creating indexes...
CREATE INDEX CONCURRENTLY idx_users_email_status ON users (email, status);
✓ Created idx_users_email_status
CREATE INDEX CONCURRENTLY idx_orders_user_created ON orders (user_id, created_at DESC);
✓ Created idx_orders_user_created
CREATE INDEX CONCURRENTLY idx_events_city_date ON events (city, date) WHERE date > NOW() - INTERVAL '30 days';
✓ Created idx_events_city_date
✓ Database optimized!node dist/cli.js analyze postgresql://localhost/mydb --max-storage 200MBnode dist/cli.js analyze postgresql://localhost/mydb --max-write-overhead 5node dist/cli.js analyze postgresql://localhost/mydb --applynode dist/cli.js analyze postgresql://localhost/mydb \
--max-storage 500MB \
--max-write-overhead 10 \
--max-indexes 5- Reads schema: Introspects tables, columns, existing indexes
- Reads workload: Gets query statistics from
pg_stat_statements - Generates candidates: Analyzes WHERE/JOIN/ORDER BY clauses to suggest indexes
- Calculates costs: Uses PostgreSQL's
EXPLAINto get exact query costs - Optimizes globally: Finds the set of indexes that minimizes total cost across ALL queries
- Shows impact: Displays which queries benefit from each index
The key insight: PostgreSQL's query planner has deterministic cost functions. We don't need to run queries or use ML - we can mathematically calculate the optimal index configuration.
To test InstantIndex, you need a database with:
- Some tables with data
pg_stat_statementsenabled- Some query history
Here's a quick test setup:
-- Enable extensions
CREATE EXTENSION pg_stat_statements;
CREATE EXTENSION hypopg;
-- Create sample tables
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255),
status VARCHAR(50),
created_at TIMESTAMP
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER,
total DECIMAL,
created_at TIMESTAMP
);
-- Insert sample data
INSERT INTO users (email, status, created_at)
SELECT
'user' || i || '@example.com',
CASE WHEN random() < 0.3 THEN 'active' ELSE 'inactive' END,
NOW() - (random() * INTERVAL '1 year')
FROM generate_series(1, 100000) i;
INSERT INTO orders (user_id, total, created_at)
SELECT
(random() * 100000)::int + 1,
(random() * 1000)::numeric(10,2),
NOW() - (random() * INTERVAL '1 year')
FROM generate_series(1, 500000) i;
-- Run some queries to populate pg_stat_statements
SELECT * FROM users WHERE email = 'user12345@example.com';
SELECT * FROM users WHERE email LIKE '%@example.com' AND status = 'active';
SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at DESC LIMIT 10;
SELECT u.*, COUNT(o.id)
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id
HAVING COUNT(o.id) > 5;
-- Now run InstantIndex
-- node dist/cli.js analyze postgresql://localhost/testdbAfter running InstantIndex:
- Review the recommendations - Make sure they make sense for your workload
- Test in staging first - Don't apply directly to production
- Monitor performance - Verify the indexes help as predicted
- Re-run periodically - As your workload changes, optimal indexes may change
- Add to
postgresql.conf:shared_preload_libraries = 'pg_stat_statements' - Restart PostgreSQL
- Run
CREATE EXTENSION pg_stat_statements;
- Make sure your application has been running and executing queries
- Check
SELECT * FROM pg_stat_statements;to verify data exists
- Ensure your database user has permission to:
- Create indexes (
CREATE) - Query statistics tables
- Use extensions
- Create indexes (
- Check disk space
- Verify table exists and is accessible
- Try without
CONCURRENTLYif in development