Problem
Some user-visible URL surfaces in WebHarbor mirror sites expose local mirror addresses or placeholder domains instead of realistic upstream URLs.
Observed examples:
- Google Map place detail share box can show http://localhost:40008/place/<slug> instead of a real Google Maps URL.
- Google Map seeded place websites can use https://example.com/<slug> placeholders.
- Booking and Allrecipes save/return forms can serialize absolute local request.url values into hidden next inputs.
- BBC News article share copies window.location.href, which is the local mirror URL.
- GitHub external-host recovery redirects to a fixed http://localhost:40006..., which is brittle outside the default local port layout.
Expected
Benchmark/runtime entry URLs should remain local in sites/*/tasks.jsonl, README, and Docker docs, but user-visible share/copy/return/external-link surfaces should not leak local mirror hosts or placeholder domains.
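One way to make this expectation checkable is a small scanner run over rendered user-visible strings (share boxes, copy targets, external links). The sketch below is illustrative only: the name find_leaked_urls and the exact host list are assumptions, not part of the WebHarbor code base, and the pattern would need adjusting to the hosts the mirrors actually use.

```python
import re

# Hypothetical forbidden-host pattern: local mirror hosts and example.com
# placeholders, with an optional port. The lookahead ensures the host ends
# here, so "example.community" or "localhostfoo" do not match.
FORBIDDEN_HOST = re.compile(
    r"https?://(?:localhost|127\.0\.0\.1|(?:[\w-]+\.)?example\.com)"
    r"(?::\d+)?(?=[/?#\s\"'<]|$)",
    re.IGNORECASE,
)


def find_leaked_urls(text: str) -> list:
    """Return the scheme+host prefix of every forbidden URL found in text."""
    return [m.group(0) for m in FORBIDDEN_HOST.finditer(text)]


if __name__ == "__main__":
    share_box = 'value="http://localhost:40008/place/some-slug"'
    print(find_leaked_urls(share_box))
```

The scanner would be applied only to user-visible surfaces; sites/*/tasks.jsonl, README, and Docker docs are expected to keep their localhost entry URLs and should be excluded from the check.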
Impact
This reduces realism for web-agent tasks that inspect or copy sharing links, and it makes some post-action redirect paths depend on host-derived absolute URLs rather than stable relative paths.
Proposed Fix
- Use realistic upstream URLs for share/copy surfaces.
- Use relative paths for hidden next inputs and validate next redirects.
- Keep benchmark localhost URLs only in runtime/task configuration.
- Add a regression check documenting allowed vs forbidden URL patterns.
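As a rough illustration of the second item, a next value can be accepted only when it is a same-site relative path, with everything else falling back to a default. This is a framework-agnostic sketch under stated assumptions: safe_next and its default of "/" are hypothetical names chosen here, not the helper used in the PR.

```python
from typing import Optional


def safe_next(target: Optional[str], default: str = "/") -> str:
    """Return target only if it is a same-site relative path, else default."""
    if not target:
        return default
    # Backslashes and control characters can be normalised by browsers into
    # something that leaves the site, so reject them outright.
    if "\\" in target or any(ord(ch) < 0x20 for ch in target):
        return default
    # Must be rooted at "/" but not protocol-relative ("//host/...").
    # This also rejects absolute URLs such as "http://localhost:40008/...".
    if not target.startswith("/") or target.startswith("//"):
        return default
    return target


if __name__ == "__main__":
    print(safe_next("/saved?page=2"))
    print(safe_next("http://localhost:40008/saved"))
```

With a check like this on the redirect side, the hidden next input can carry just the path and query string of the current page rather than an absolute request URL, so the round trip no longer depends on the host or port the mirror happens to be served from.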
A PR with the fix is available in #7.