Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 43 additions & 8 deletions .hyf/test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
# confirms required documentation artefacts are filled in.
#
# Total points: 100. Passing score: 60.
# L1(10) + L2(20) + L3(30) + L4(15) + L5(15) + L6(0) + L7(5) + L8(5) = 100
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
Expand Down Expand Up @@ -140,6 +141,17 @@ if file_is_filled "$ss"; then
else
fail "schema_setup.sql: vw_fact_trips must cast pickup_datetime as TIMESTAMP (pickup_datetime::TIMESTAMP) — required for time-pattern queries in Task 4 (Task 2)"
fi

# 3e: schema placement — views must NOT be in public or nyc_taxi
# The assignment requires views in the student's own assigned schema (e.g. dev_lasse).
# A view named public.vw_fact_trips or nyc_taxi.vw_fact_trips overrides shared tables
# and would affect every other student on the shared database.
if grep -qiE "(public|nyc_taxi)\.(vw_dim_zones|vw_fact_trips)" "$ss"; then
fail "schema_setup.sql: views must live in your own schema, not in 'public' or 'nyc_taxi' — remove the schema prefix and ensure your search_path is set to your personal schema (Task 2)"
l3=0 # zero the whole level: wrong schema is a structural failure, not a style issue
else
pass "schema_setup.sql: no forbidden schema prefix (public/nyc_taxi) on view names"
fi
else
fail "schema_setup.sql: file is empty or still contains unfilled TODO stubs"
fi
Expand Down Expand Up @@ -211,20 +223,29 @@ fi
((score += l5))
pass "Level 5: Task 4 verification queries ($l5/15 pts)"

# ── Level 6 (5 pts): borough screenshot present ─────────────────────────────
# ── Level 6 (0 pts): borough screenshot present (required, not separately scored) ──
# Points are captured in L8 (README findings). The screenshot must exist for a
# complete submission but is not awarded separate points to avoid gaming.
l6=0
shot_png="$REPO_ROOT/assets/borough_count.png"
check_screenshot_is_png "$shot_png" && l6=5 || {
# check_screenshot_is_png already emitted pass/fail/warn — just capture partial credit
if check_screenshot_is_png "$shot_png"; then
pass "Level 6: borough screenshot present (required)"
else
found_fallback=false
for ext in jpg jpeg; do
if [[ -s "$REPO_ROOT/assets/borough_count.$ext" ]]; then
l6=3
found_fallback=true
break
fi
done
}
if [[ "$found_fallback" = true ]]; then
pass "Level 6: borough screenshot present as jpg (PNG preferred)"
else
fail "Level 6: assets/borough_count.png missing — take a screenshot of your borough query result and save it there (Task 4)"
fi
fi
((score += l6))
pass "Level 6: borough screenshot ($l6/5 pts)"
pass "Level 6: borough screenshot ($l6/0 pts)"

# ── Level 7 (5 pts): AI_ASSIST.md filled in ─────────────────────────────────
l7=0
Expand All @@ -239,14 +260,14 @@ if file_is_filled "$ai"; then

chars=$(wc -c < "$ai" | tr -d ' ')

if [[ "$sections" -eq 4 && "$chars" -ge 1200 ]]; then
if [[ "$sections" -eq 4 && "$chars" -ge 1800 ]]; then
l7=5
pass "AI_ASSIST.md: all 4 sections present and filled in (${chars} chars)"
else
if [[ "$sections" -lt 4 ]]; then
fail "AI_ASSIST.md: only ${sections}/4 required sections present (need: '## The problem', '## The prompt', '## The response', '## Reflection') (Task 5)"
else
fail "AI_ASSIST.md: sections present but too brief (${chars} chars, target 1200+) — fill in the content (Task 5)"
fail "AI_ASSIST.md: sections present but too brief (${chars} chars, target 1800+) — the Reflection section in particular should explain what you kept, changed, or rejected and why (Task 5)"
fi
fi
else
Expand All @@ -255,6 +276,20 @@ fi
((score += l7))
pass "Level 7: Task 5 AI log ($l7/5 pts)"

# ── Level 8 (5 pts): README findings ────────────────────────────────────────
# The README must contain a ## Findings section summarising the Task 1 audit
# results (so the PR tells the story, not just files full of SQL).
l8=0
readme="$REPO_ROOT/README.md"
if grep -qiE "^##[[:space:]]+Findings" "$readme"; then
l8=5
pass "README.md: '## Findings' section present"
else
fail "README.md: missing '## Findings' section — add a short summary of your Task 1 audit results (how many duplicates, nulls, negative fares, orphaned keys you found)"
fi
((score += l8))
pass "Level 8: README findings ($l8/5 pts)"

# ── Final result ─────────────────────────────────────────────────────────────
print_results "Week 9 Autograder"
write_score "$score" "$PASSING" "$SCRIPT_DIR/score.json"
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,11 @@ Fill in these files (starters are provided). Keep them at the repo root and do n
| `verification_results.sql` | Task 4 | Verification queries (volume, revenue, geospatial, time patterns) |
| `assets/borough_count.png` | Task 4 | Screenshot of the per-borough row-count result |
| `AI_ASSIST.md` | Task 5 | One documented LLM session |
| `README.md` (this file) | Task 1 | Add a `## Findings` section summarising your audit results |

## Tasks (summary)

1. **Data Quality Audit** (`validation_queries.sql`): find duplicate trips, count NULL pickup/dropoff location IDs, check the `fare_amount` range for negatives, and find `pickup_location_id` values not present in `nyc_taxi.raw_zones`.
1. **Data Quality Audit** (`validation_queries.sql`): find duplicate trips, count NULL pickup/dropoff location IDs, check the `fare_amount` range for negatives, and find `pickup_location_id` values not present in `nyc_taxi.raw_zones`. Then add a `## Findings` section to this README with your actual results (counts, not just "I checked").
2. **Star Schema Views** (`schema_setup.sql`): `vw_dim_zones` (one row per `location_id`, the primary key) and `vw_fact_trips` (one row per trip; exclude `fare_amount < 0`; cast `pickup_datetime` to `TIMESTAMP`; keep the location IDs so it joins to `vw_dim_zones`).
3. **Data Dictionary** (`data_dictionary.md`): state each view's grain in one sentence, identify keys, list measures.
4. **Verification Queries** (`verification_results.sql`): query the views for volume, revenue, geospatial, and time-pattern questions, joining through `vw_dim_zones` for any borough/zone name. Save a screenshot of the per-borough counts to `assets/borough_count.png`.
Expand Down