
Add CarMax mirror (port 40015) #24

Open
Violet24K wants to merge 6 commits into aiming-lab:main from Violet24K:main


@Violet24K

Adds a Flask mirror of carmax.com as the 16th
WebHarbor site, with full inventory search, vehicle research, comparison,
sell-my-car appraisal, financing pre-qualification, reserve, test drive,
and checkout flows.

Companion HuggingFace PR: https://huggingface.co/datasets/ChilleD/WebHarbor/discussions/15


What's in this PR

Site code (sites/carmax/)

| File | Lines | Purpose |
|---|---|---|
| app.py | 1,997 | Flask app: 13 SQLAlchemy models, 10 WTForms, 59 routes |
| seed_data.py | 904 | Idempotent seed (12 stores, 141 vehicles, 5 users, 20 reviews, 10 articles) |
| templates/*.html | 1,519 (44 files) | base + macros + 42 page templates |
| static/css/main.css | 221 | CarMax navy (#1660a8) + yellow (#FFD900) brand styling |
| scrape_carmax.py | 129 | Reproducible httpx fetch of evox stock photos |
| scrape_articles.py | 107 | Reproducible fetch of article hero images |
| tasks.jsonl | 20 | WebVoyager benchmark tasks |

Registration (3 files modified)

  • websyn_start.sh — added carmax to SITES, switched the three
    hardcoded 15s to ${#SITES[@]} so future additions don't need
    triple edits.
  • control_server.py — added 'carmax' to SITES list.
  • Dockerfile: EXPOSE 8101 40000-40015 (was 40000-40014).
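
For reviewers, a minimal sketch of how the registration pieces line up. It assumes each site's port is the base port plus its index in SITES, which is inferred from the EXPOSE range and the PR title rather than read from control_server.py. The 15 placeholder site names and the helpers port_for and expose_range are hypothetical.

```python
# Hypothetical sketch: every name except 'carmax' is a placeholder, and the
# base-port-plus-index rule is an assumption inferred from the EXPOSE range.
BASE_PORT = 40000

# 15 placeholder entries standing in for the existing sites, then the new one.
SITES = [f"site_{i:02d}" for i in range(15)] + ["carmax"]


def port_for(site: str) -> int:
    """Return the port for a site, assuming ports follow list order."""
    return BASE_PORT + SITES.index(site)


def expose_range() -> str:
    """Build the port range that a Dockerfile EXPOSE line would need."""
    return f"{BASE_PORT}-{BASE_PORT + len(SITES) - 1}"


print(port_for("carmax"))  # 40015
print(expose_range())      # 40000-40015
```

Deriving the range from len(SITES) mirrors the ${#SITES[@]} change in websyn_start.sh: adding a site then touches one list instead of several hardcoded counts.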

Quality-of-life additions

  • .gitattributes — forces LF line endings on *.sh and Dockerfile
    so a Windows checkout doesn't break the container entrypoint (hit
    this exact issue during initial Docker testing — exec /opt/websyn_start.sh: no such file or directory).
  • scripts/verify_carmax.sh — single-command end-to-end verifier (build
    → run → reset → md5sum) for the new site.
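
The entrypoint failure comes from CRLF endings: a shebang line ending in a carriage return makes the kernel look for an interpreter whose name includes that character, which does not exist. As a rough illustration, a checker like the following could flag affected files before a build. The function names and the default patterns are assumptions chosen to match the .gitattributes rules, not code from this PR.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """True if the byte string contains a Windows-style line ending."""
    return b"\r\n" in data


def find_crlf_files(root: str, patterns=("*.sh", "Dockerfile")) -> list[str]:
    """List files under root matching the patterns that contain CRLF."""
    hits = []
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            if path.is_file() and has_crlf(path.read_bytes()):
                hits.append(str(path))
    return sorted(hits)


if __name__ == "__main__":
    # Scan the current checkout and report any offending files.
    for name in find_crlf_files("."):
        print(f"CRLF line endings: {name}")
```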

Mirror functional coverage

59 routes across these areas:

  • Inventory: /cars, /cars/<make>, /cars/<make>/<model>, /cars/<make>/<model>/<year>, /cars/<make>/<model>/<trim>, /cars/<make>/<model>/<trim>/<year>, with filter params for body style, drive type, fuel type, mileage cap, price range, color, store, etc.
  • Vehicle detail — full specs, features, customer reviews, similar vehicles, financing estimate
  • Research — model overview + year-by-year pages with RepairPal ratings, trims, FAQs
  • Comparison — anonymous/authed compare tool (up to 4 vehicles)
  • Saved cars — heart / unheart per-user
  • Sell my car — appraisal form → instant offer page with 7-day validity
  • Pre-qualification — soft-credit form → personalized monthly payment range
  • Financing — landing page + CarMax Auto Finance / external lender / cash options at checkout
  • Stores — 12 real CarMax locations across CA/TX/FL/GA/NY/IL/MD/MA/WA/AZ/CO/NC
  • Reserve / Test drive — auth-gated booking flows
  • Checkout — full order flow with MaxCare warranty and trade-in appraisal application
  • Account — orders, reservations, test drives, appraisals, saved cars, edit profile, change password
  • Articles + FAQ — 10 articles, 4 FAQ categories
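
The pre-qualification and financing-estimate flows above report a monthly payment range. The PR does not state the formula, so the following is only a sketch using the standard fixed-rate amortization formula; the function names, the APR bounds, and the rounding are all assumptions.

```python
def monthly_payment(principal: float, apr_percent: float, months: int) -> float:
    """Fixed-rate loan payment: P * r / (1 - (1 + r) ** -n), r = monthly rate."""
    if months <= 0:
        raise ValueError("months must be positive")
    rate = apr_percent / 100 / 12
    if rate == 0:
        return principal / months  # interest-free case
    return principal * rate / (1 - (1 + rate) ** -months)


def payment_range(principal: float, apr_low: float, apr_high: float, months: int):
    """Return (low, high) monthly payments for an assumed APR band."""
    low = round(monthly_payment(principal, apr_low, months), 2)
    high = round(monthly_payment(principal, apr_high, months), 2)
    return low, high


# Illustrative inputs only; none of these figures come from the seed data.
print(payment_range(24000, 5.0, 9.0, 60))
```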

Search uses scored token-overlap with field-weighted scoring
(make/model = 5, trim/body/color = 3, features/specs = 1), explicitly
NOT strict-AND, so queries like "honda civic sport" return results even
when one token misses on a given vehicle.
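
A minimal sketch of this kind of scoring, using the field weights listed above. How app.py tokenizes, combines fields, and breaks ties is not shown in this description, so taking the best-matching field per token and the sample inventory are assumptions.

```python
# Field weights as described in the PR; everything else here is illustrative.
FIELD_WEIGHTS = {
    "make": 5, "model": 5,
    "trim": 3, "body": 3, "color": 3,
    "features": 1, "specs": 1,
}


def score(vehicle: dict, query: str) -> int:
    """Sum, over query tokens, the best weight of any field containing the token."""
    total = 0
    for token in query.lower().split():
        best = 0
        for field, weight in FIELD_WEIGHTS.items():
            if token in str(vehicle.get(field, "")).lower().split():
                best = max(best, weight)
        total += best
    return total


def search(vehicles: list[dict], query: str) -> list[dict]:
    """Return vehicles with a positive score, best first (not strict-AND)."""
    scored = [(score(v, query), v) for v in vehicles]
    matches = [(s, v) for s, v in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [v for _, v in matches]


inventory = [
    {"make": "Honda", "model": "Civic", "trim": "Sport", "color": "Blue"},
    {"make": "Honda", "model": "Civic", "trim": "LX", "color": "Red"},
    {"make": "Toyota", "model": "Camry", "trim": "SE", "color": "Blue"},
]
results = search(inventory, "honda civic sport")
print([v["trim"] for v in results])  # ['Sport', 'LX']
```

The LX still appears even though "sport" misses on it, which is the behavior a strict-AND filter would not give.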


Benchmark tasks

sites/carmax/tasks.jsonl ships 20 tasks following the WebVoyager
schema (web_name, id, ques, web, upstream_url):

  • 6 Easy (2-3 steps): inventory search by year/make/model, trim-specific search, sorted filters, vehicle detail spec reading, store locator, FAQ
  • 9 Medium (4-6 steps): research-page navigation, sell-my-car form, register + pre-qual, reserve, test drive, cheapest-vehicle + store cross-check, article read, value-page lookup, MaxCare tier comparison
  • 5 Hard (7+ steps, multi-step reasoning): 3-way vehicle comparison, register + pre-qualify + report APR, saved-cars disambiguation, trade-in appraisal applied at checkout with custom finance terms, Dan's order history audit

Hand-traced each task against the seed DB; the answer is verifiable on
every task and not visible at the search-result level for any task that
asks for spec-level info.
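
As a sketch of how the schema could be checked mechanically, the snippet below validates JSONL lines against the five keys named above. The validator itself and the sample records are hypothetical; the real values in tasks.jsonl are not reproduced here.

```python
import json

REQUIRED_KEYS = {"web_name", "id", "ques", "web", "upstream_url"}


def validate_tasks(lines):
    """Return (tasks, errors) for an iterable of JSONL lines."""
    tasks, errors, seen_ids = [], [], set()
    for lineno, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # tolerate blank lines
        try:
            task = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(task, dict):
            errors.append(f"line {lineno}: expected a JSON object")
            continue
        missing = REQUIRED_KEYS - task.keys()
        if missing:
            errors.append(f"line {lineno}: missing keys {sorted(missing)}")
        task_id = task.get("id")
        if task_id is not None:
            if task_id in seen_ids:
                errors.append(f"line {lineno}: duplicate id {task_id!r}")
            seen_ids.add(task_id)
        tasks.append(task)
    return tasks, errors


# Made-up sample records; the second one is deliberately incomplete.
sample = [
    json.dumps({"web_name": "CarMax", "id": "CarMax--0", "ques": "example question",
                "web": "http://localhost:40015/", "upstream_url": "https://www.carmax.com/"}),
    json.dumps({"web_name": "CarMax", "id": "CarMax--1", "ques": "another question"}),
]
tasks, errors = validate_tasks(sample)
print(len(tasks), errors)
```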


Verification

md5sum sites/carmax/instance/carmax.db sites/carmax/instance_seed/carmax.db
c6e3b281258bd8a460f7030a54b74c21 instance/carmax.db
c6e3b281258bd8a460f7030a54b74c21 instance_seed/carmax.db
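
The same comparison can be scripted without md5sum. This is a small sketch, not the contents of scripts/verify_carmax.sh; the function names are hypothetical, and the paths passed in would be the two database files shown above.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large databases are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reset_is_byte_identical(live_db: str, seed_db: str) -> bool:
    """True when the live DB after a reset matches the frozen seed DB."""
    return md5_of(live_db) == md5_of(seed_db)
```

For example, reset_is_byte_identical("sites/carmax/instance/carmax.db", "sites/carmax/instance_seed/carmax.db") should return True after calling /reset/carmax.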

Idempotency

Both seed_database() (line 675) and seed_benchmark_users() (line 722)
gate the whole function on populated-DB checks, not per row. Every
seeded created_at / saved_at / added_at uses a frozen
SEED_NOW = datetime(2026, 1, 15, 12, 0, 0) (18 references). Zero
calls to datetime.utcnow() anywhere in seed_data.py.
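
The pattern can be illustrated as follows. This sketch uses the standard-library sqlite3 module instead of the SQLAlchemy models in seed_data.py, and the table and rows are invented; only SEED_NOW and the function-level gate are taken from the description above.

```python
import sqlite3
from datetime import datetime

SEED_NOW = datetime(2026, 1, 15, 12, 0, 0)  # frozen timestamp from the PR


def seed_database(conn: sqlite3.Connection) -> bool:
    """Seed once; return False without writing rows if already populated."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS store "
        "(id INTEGER PRIMARY KEY, name TEXT, created_at TEXT)"
    )
    # Gate the whole function on a populated-DB check, not per row.
    already = conn.execute("SELECT COUNT(*) FROM store").fetchone()[0]
    if already:
        return False
    stamp = SEED_NOW.isoformat()  # never datetime.utcnow(), so reruns are stable
    rows = [(1, "Store A", stamp), (2, "Store B", stamp)]  # invented rows
    conn.executemany("INSERT INTO store VALUES (?, ?, ?)", rows)
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
print(seed_database(conn))  # True: first call seeds
print(seed_database(conn))  # False: second call is a no-op
```

Because every timestamp comes from SEED_NOW, two fresh seeds write the same row values, which is a precondition for the byte-identical reset check.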


Asset side (HuggingFace dataset)

carmax.tar.gz (~280 MB) was uploaded to ChilleD/WebHarbor in
https://huggingface.co/datasets/ChilleD/WebHarbor/discussions/15, and this PR
bumps .assets-revision to that PR's merge SHA.

Contents of the tarball (extracts in place into sites/carmax/):

  • instance_seed/carmax.db — the frozen seed DB
  • static/images/vehicles/ — 738 real CarMax stock photos covering
    115/138 unique (year, make, model) tuples (~86% coverage)
  • static/images/articles/ — 10 article hero images

The 18 missing (year, make, model) tuples (Ford F-150 all years, BMW 3
Series all years, Mercedes-Benz C-Class all years, 2023 Toyota Corolla
/ Kia Sorento / Subaru Outback, 2021-22 Hyundai Elantra) have no evox
stock photos on the carmax CDN — those vehicles fall back to a
CarMax-branded SVG placeholder. This matches the live site's behavior
for those exact combinations.
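
A sketch of the fallback logic, for illustration only. The file-naming scheme, the placeholder path, and the helper name vehicle_image are assumptions; the actual lookup in app.py or the templates may differ.

```python
from pathlib import Path

PLACEHOLDER = "images/placeholder.svg"  # hypothetical path to the branded SVG


def vehicle_image(static_root: str, year: int, make: str, model: str) -> str:
    """Return the photo path relative to static/, or the placeholder if absent."""
    # Assumed naming scheme: "<year>-<make>-<model>.jpg", lowercased and hyphenated.
    slug = f"{year}-{make}-{model}".lower().replace(" ", "-")
    candidate = Path("images/vehicles") / f"{slug}.jpg"
    if (Path(static_root) / candidate).is_file():
        return candidate.as_posix()
    return PLACEHOLDER
```

Resolving the fallback at lookup time means a missing photo never produces a broken image link, whichever (year, make, model) tuples the tarball happens to cover.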


Test users (benchmark)

Five users with password CarMax!2026, each pre-populated for
auth-gated tasks:

Email First name Pre-qual? Saved Reservation Test drive Appraisal Order
alice.j@test.com Alice 2 (Civic + CR-V) 1 1 (at-home) 1 active
bob.k@test.com Bob 2 1 (in-store) 1 active
carol.l@test.com Carol 1 1 active
dan.m@test.com Dan 1 1 (CMX-2026-000001, ready_for_pickup, with MaxCare gold)
emma.n@test.com Emma

(The skill suggests bob.c/carol.d/david.k with TestPass123!, but
tasks.jsonl references these specific emails throughout, so I kept
this slightly different set. It is functionally equivalent.)


Pre-PR checks

  • python3 -m py_compile sites/carmax/app.py — clean
  • python3 -m py_compile sites/carmax/seed_data.py — clean
  • bash scripts/build.sh webharbor:dev — succeeds (image ~6.2 GB)
  • Container boots, all 16 sites alive
  • All 16 sites return HTTP 200
  • /reset/carmax byte-identical (md5 above)
  • Each task in tasks.jsonl has a verifiable answer in the seed
  • Phase-3 walkthrough (info-leak / superficial-completion / distractor checks): 3 issues found, 3 fixed (Task 13 disambiguation, Dan's order total, Turbo feature cross-field consistency)
  • Phase-4 hardening (13 leak archetypes + 4 dimensions): no real leaks; one minor task rephrasing applied

Anything that might want reviewer attention

  1. Benchmark user emails deviate from the skill's recommended
    bob.c@test.com / carol.d@test.com set — kept for tasks.jsonl
    internal consistency.
  2. 18 (year, make, model) combinations show a placeholder image (not
    100% image coverage) because the carmax CDN has no evox photos for
    them. This could be remediated by sourcing from a different
    CDN if the maintainer requires 100% coverage.
  3. SEED_NOW = datetime(2026, 1, 15, 12, 0, 0) — matches the
    project's existing 2026 date pinning convention; please flag if a
    different reference date is preferred.

Happy to address any review feedback.

Violet24K added 6 commits May 14, 2026 22:28
…com.
- 13 SQLAlchemy models (User / Store / Vehicle / SavedVehicle / Comparison + ComparisonItem / Reservation / TestDrive / Appraisal / FinancePreQual / Order / Review / Article)
- 59 routes covering search / browse / detail / research / compare / saved / sell-my-car / pre-qual / reserve / test-drive / checkout / account / articles / FAQ / MaxCare / stores / auth
- Token-overlap scored search with multi-field weighting
- 141 deterministically-seeded vehicles across 31 templates
- 12 real CarMax store locations
- 5 benchmark users with pre-populated saved/reservation/test-drive/appraisal/order data
- 20 WebVoyager tasks in tasks.jsonl (6 Easy / 9 Medium / 5 Hard, including 2 disambiguation tasks)
- Idempotent seed at function level; byte-identical reset verified