Short SQL case studies (SQLite/Postgres style): cleaning, aggregation, window functions, and analytics.
| Goal | Where to go |
|---|---|
| Shareable project overview | Project site |
| Find a challenge by topic | Challenge finder |
| Read a challenge in the browser | Challenge viewer |
| Try one challenge quickly | 001 Passenger survival |
| Follow a structured path | Learning paths |
| Suggest a new challenge | Challenge roadmap |
| Add or improve a challenge | Contributing guide |
| Report unclear output or a broken command | Open an issue |
- Each challenge has a small dataset schema + a question
- Solutions are written in plain SQL
- Focus is on readable queries you could use in real work
Tip: Each challenge folder includes its own README with the question, expected output, and run command.
| Skill area | Good starting points |
|---|---|
| Aggregation and ranking | 001, 002, 009, 010 |
| Retention and cohorts | 003, 004, 019, 020, 027, 029 |
| Revenue analytics | 005, 007, 011, 013, 014, 030 |
| Date spines and rolling windows | 008, 016, 028 |
| Product and customer behavior | 012, 015, 017, 018, 021, 022, 023, 024, 025, 026 |
| Data engineering SQL patterns | 035, 036, 037, 038, 039, 040 |
For ordered study plans, see Learning paths.
- Passenger survival by class (Titanic-style aggregation) — challenges/001_passenger_survival
- Top customers by total spend (window function) — challenges/002_top_customers
- Customer retention (repeat customers by month) — challenges/003_customer_retention
- Cohort retention (active users by months after signup) — challenges/004_cohort_retention
- Daily revenue (7-day rolling) — challenges/005_daily_revenue
- Funnel conversion rates (view → cart → purchase) — challenges/006_funnel_conversion
- Category revenue share (% of total) — challenges/007_category_revenue_share
- Daily revenue with missing dates (date spine + rolling 7 day) — challenges/008_daily_revenue_date_spine
- Top products per category (top 3 per category) — challenges/009_top_products_per_category
- Customer LTV (total spend + order count + rank) — challenges/010_customer_ltv
- Pareto customers (top ~80% revenue) — challenges/011_pareto_customers
- Product return rate (returns / orders) — challenges/012_product_return_rate
- Monthly revenue growth (MoM %) — challenges/013_monthly_revenue_growth
- Category monthly growth (MoM by category) — challenges/014_category_monthly_growth
- Sessionization (30-min inactivity rule) — challenges/015_sessionization
- Monthly median order value (window median in SQLite) — challenges/016_monthly_median_order_value
- Longest purchase streak (consecutive months) — challenges/017_longest_purchase_streak
- Time to repeat purchase (1st → 2nd order + days) — challenges/018_time_to_repeat
- Repeat purchase within 30 days (cohort month + rate) — challenges/019_repeat_within_30d_cohort
- Retention matrix (cohort month x month number) — challenges/020_retention_matrix
- Churned customers (no orders in last 30 days) — challenges/021_churned_customers
- Customer reactivation (returned after inactivity) — challenges/022_customer_reactivation
- Weekly active users (WAU) — challenges/023_weekly_active_users
- RFM segmentation (recency + frequency + monetary quartiles) — challenges/024_rfm_segmentation
- Signup → first purchase lag (avg + median) — challenges/025_signup_to_first_purchase
- Repeat purchase within 30 days (cohort repeat rate) — challenges/026_repeat_purchase_30d
- Weekly retention (active customers + WoW retention rate) — challenges/027_weekly_retention
- Rolling 7-day active users (date spine + region + rolling window) — challenges/028_rolling_7d_active_users
- Cohort weekly retention (rolling 4-week avg) — challenges/029_cohort_weekly_retention_rolling
- Top customers per month (ties + MoM change + share of month) — challenges/030_top_customers_monthly_ties
- Longest consecutive activity streak (gap & islands) — challenges/031_longest_activity_streak
- Sessionization (30-minute inactivity gap) — challenges/032_sessionization_30min
- Session funnel conversion (view → cart → purchase) — challenges/033_session_funnel_conversion
- Reactivation cohorts (gap > 14 days) + next-month retention — challenges/034_reactivation_cohorts
- Subscription renewals + missed renewals + winback (30 days) — challenges/035_subscription_renewal_winback
- SCD Type 2 customer dimension (history table) — challenges/036_scd2_customer_dimension
- Incremental fact upsert (staging → fact, insert + update) — challenges/037_incremental_fact_upsert
- Inventory stockout risk (velocity + lead time scoring) — challenges/038_inventory_stockout_risk
- Trial to paid conversion (cohorts + conversion windows) — challenges/039_trial_to_paid_conversion
- Revenue leakage audit (invoice + payment reconciliation) — challenges/040_revenue_leakage_audit
You can copy/paste the SQL into SQLite, Postgres, or any SQL runner with minor tweaks.
Each challenge includes:
schema.sql(creates tables + sample data)solution.sql(the query)expected.json(the expected columns and rows)
Quick run with SQLite:
cat challenges/001_passenger_survival/schema.sql challenges/001_passenger_survival/solution.sql | sqlite3 -header -column :memory:Run another challenge by changing the folder name:
cat challenges/010_customer_ltv/schema.sql challenges/010_customer_ltv/solution.sql | sqlite3 -header -column :memory:type challenges\003_customer_retention\schema.sql challenges\003_customer_retention\solution.sql | sqlite3 -header -column :memory:The GitHub Actions workflow runs every schema.sql + solution.sql pair against SQLite and compares the actual query output with each challenge's expected.json snapshot. This catches both broken SQL and accidental answer changes.
Run the full validation locally:
python scripts/validate_challenges.pyRegenerate snapshots after intentionally changing a challenge answer:
python scripts/validate_challenges.py --write-expectedNew challenge ideas and fixes are welcome. Start with the challenge roadmap for candidate ideas, then see CONTRIBUTING.md for the challenge format, SQL style, and validation checklist.
Last updated: 2026-06-23
