Add Compass real-estate mirror (port 40015) #25
Open
sarendis56 wants to merge 1 commit into
Adds a Flask mirror of compass.com as the 16th WebHarbor site, with browse / search / filter, listing detail, agent directory, account flows (save, tour, inquiry, saved search, collection), and 18 WebVoyager-format benchmark tasks.

sites/compass/:
- app.py (1011 lines): 10 SQLAlchemy models, 35+ routes, token-overlap scored search with city/state/neighborhood boosts. User.check_password accepts both pbkdf2 and bcrypt prefixes so seed-time PBKDF2 hashes (deterministic) coexist with runtime Flask-Bcrypt writes.
- seed_data.py (659 lines): idempotent function-level gates; PBKDF2 with fixed per-email salt to preserve byte-identical reset; Co-op pool backfilled to keep filter-based tasks at >=5 candidates.
- 33 Jinja templates + 327-line hand-rolled CSS (white/black/serif to match the real Compass palette).
- tasks.jsonl: 18 WebVoyager tasks (3 hard multi-step).
- listings_clean.json: 524 normalized listings consumed by seed_data at build time (committed alongside the mirror, per the convention used by booking/, arxiv/, etc.).

Registration (3 files, must stay in sync per AGENTS.md):
- websyn_start.sh: compass appended to SITES, two ready-count 15s -> 16.
- control_server.py: 'compass' appended to SITES.
- Dockerfile: EXPOSE 8101 40000-40015.

Heavy assets (instance_seed/compass.db, static/images/, ~129 MB packed) ship via the companion HuggingFace PR ChilleD/WebHarbor#3. .assets-revision already pins main, so once that merges this Just Works.

Byte-identical reset verified:
  md5sum instance/compass.db instance_seed/compass.db
  -> 2a7458e3b6c3e3d0b39c32cca5d0f519 (both files).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TL;DR
Adds a Flask mirror of compass.com as the 16th
WebHarbor site, with browse / search / filter, listing detail, agent
directory, account flows (save, tour, inquiry, saved search, collection),
and 18 WebVoyager-format benchmark tasks.
Companion HuggingFace PR: https://huggingface.co/datasets/ChilleD/WebHarbor/discussions/3
What's in this PR
Site code (`sites/compass/`)
- `app.py`
- `seed_data.py`
- `templates/*.html`
- `static/css/compass.css`
- `listings_clean.json`: consumed by `seed_data.py` at build time
- `tasks.jsonl`
- `_health.py`
- `requirements.txt`

Registration (3 files modified, must stay in sync per `AGENTS.md`)
- `websyn_start.sh`: `compass` appended to `SITES=( … )`; the two `15`s in ready-count log lines bumped to `16`.
- `control_server.py`: `'compass'` appended to `SITES`.
- `Dockerfile`: `EXPOSE 8101 40000-40015`.

Verification
All checks in `AGENTS.md` § Pre-PR checks pass.
- `python3 -m py_compile sites/compass/{app.py,seed_data.py}`: clean.
- `./scripts/build.sh webharbor:dev`: image builds.
- `docker run` on alt ports `8201/41000-41015`: `/health` reports all 16 sites alive with PIDs.
- … `200`.
- … `tasks.jsonl` walk end-to-end against the running mirror.

Design notes
- Seed passwords use PBKDF2 with a fixed per-email salt (`sha1("salt-" + email)[:8]`), not bcrypt, because bcrypt's random salt breaks byte-identical reset. `User.check_password` accepts both prefixes so future writes from the running app (which uses Flask-Bcrypt) still authenticate.
- Search uses token-overlap scoring with city/state/neighborhood boosts rather than strict `LIKE %q% AND %q%`, which matches the booking-site pattern in `sites/booking/app.py`.
- The hero grid is ordered by `Listing.id` rather than `price.desc()` so the answers to Tasks 11 / 17 don't surface for free. The Co-op pool was backfilled to keep filter-based tasks at >=5 candidates.
- Listing photos are `compass.com/m/0//600x400.webp` images, resolved via Playwright and downloaded with httpx. No placeholders, no AI stock photos.
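
The password-hashing note above can be illustrated with a minimal sketch. It is not the code in `app.py` or `seed_data.py`. The function names, the iteration count, the use of the SHA-1 hex digest for the salt, and the Werkzeug-style `pbkdf2:sha256:<iterations>$<salt>$<hash>` layout are all assumptions made for illustration. Only the `sha1("salt-" + email)[:8]` salt rule and the dual-prefix check come from this PR.

```python
import hashlib
import hmac

# Assumption: the PR does not state an iteration count.
ITERATIONS = 100_000


def seed_salt(email: str) -> str:
    """Fixed per-email salt: first 8 hex chars of sha1("salt-" + email)."""
    return hashlib.sha1(("salt-" + email).encode("utf-8")).hexdigest()[:8]


def seed_hash(email: str, password: str) -> str:
    """Deterministic PBKDF2 hash, so reseeding writes identical bytes."""
    salt = seed_salt(email)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), ITERATIONS
    ).hex()
    # Hypothetical storage layout (Werkzeug-style), chosen for illustration.
    return f"pbkdf2:sha256:{ITERATIONS}${salt}${digest}"


def check_password(stored: str, password: str) -> bool:
    """Dispatch on the hash prefix: seed-time PBKDF2 or runtime bcrypt."""
    if stored.startswith("pbkdf2:"):
        try:
            method, salt, expected = stored.split("$", 2)
            _, algo, iters = method.split(":")
            actual = hashlib.pbkdf2_hmac(
                algo, password.encode("utf-8"), salt.encode("utf-8"), int(iters)
            ).hex()
        except ValueError:
            return False  # malformed hash string
        return hmac.compare_digest(actual, expected)
    if stored.startswith("$2"):
        # bcrypt hashes start with "$2a$", "$2b$" or "$2y$". The import is
        # local so the PBKDF2 path works without the third-party package.
        import bcrypt
        return bcrypt.checkpw(password.encode("utf-8"), stored.encode("utf-8"))
    return False
```

Because the salt depends only on the email, `seed_hash("a@example.com", "pw")` returns the same string on every run, which is what lets a reseeded database match the shipped seed byte for byte. A bcrypt hash written at runtime has a random salt, so it is verified through the second branch instead.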
Assets
Heavy assets (`instance_seed/compass.db`, `static/images/`, ~129 MB
packed) ship via the companion HuggingFace PR linked above.
`.assets-revision` already pins `main`, so once the HF PR merges this
code PR Just Works.
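
Once the assets are in place, the byte-identical reset reported in the commit message (`md5sum instance/compass.db instance_seed/compass.db`) can be re-checked from Python. This is a minimal sketch: the helper names `md5_of` and `is_byte_identical` are hypothetical, and the relative paths assume the script runs from the directory that contains `instance/` and `instance_seed/`.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_byte_identical(live: Path, seed: Path) -> bool:
    """True when both files hash to the same MD5 digest."""
    return md5_of(live) == md5_of(seed)


if __name__ == "__main__":
    # Paths as quoted in the commit message; adjust to the local layout.
    live = Path("instance/compass.db")
    seed = Path("instance_seed/compass.db")
    if live.exists() and seed.exists():
        print(md5_of(live), md5_of(seed), is_byte_identical(live, seed))
    else:
        print("database files not found; run after the assets are fetched")
```

MD5 is used here only because the PR reports an MD5 digest. Any stable hash would serve equally well for an equality check.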