Rewrite explorer in vanilla JS + Bootstrap 5.3.8#24
Rewrite explorer in vanilla JS + Bootstrap 5.3.8#24victoriobentivogli merged 8 commits intodevelopfrom
Conversation
Replace the entire frontend stack with vanilla JavaScript and Bootstrap, preserving 100% of user-facing features. Bundle CodeMirror via a custom esbuild script (scripts/build-codemirror.cjs) and move SPARQL execution to a Web Worker. ## Architecture - Single ExplorerController as the source of truth; UI panels subscribe via CustomEvents (facet-changed, results-changed, loading-changed, breadcrumb-changed, facets-list-changed). - Panels: SearchPanel, SparqlPanel, NoticeView, DataView, BacklinksView. - Services: sparqlService (worker-backed CONSTRUCT/DESCRIBE + direct SELECT with AbortController timeouts), tedAPI (JSON client for the TED Search API with hostname-based production/acceptance routing), labelService (batched label resolution with snapshot-and-clear reentrancy), randomNotice. - Renderers: TreeRenderer (lazy nested cards, per-branch cycle guard), TermRenderer (RDF term + badge rendering with single label request per badge). - CodeMirror bundled once via esbuild into src/vendor/codemirror-bundle.js, committed to the repo so a fresh clone is runnable without npm install. ## Behavioural principles baked in - Request-token pattern guards three stale-response races (ExplorerController._executeCurrentQuery, NoticeView._fetchData, BacklinksView._loadBatch). All three are regression-tested. - SessionStorage-persisted notice-search history only; queries and named-node facets are deliberately not persisted. - Explicit user-gesture gate on tab-switching: only the search button, timeline click, history dropdown pick, and SPARQL Execute auto-switch to the Explore tab. URL-param loads, breadcrumb navigation, and label resolution leave the active tab alone. - AbortController-based timeouts on every fetch and worker call (30s TED API, 60s SPARQL, 10s labels). - Defensive boundary normalisation at the tedAPI layer: buyer-country arrays joined to strings, noticeType/formType coerced to string-or-null, no fabricated "Unknown" values leaking downstream. - SPARQL injection defence in validateFacet (rejects URIs carrying angle brackets, quotes, backslashes, whitespace, or control characters). - Identity preservation across history enrichment: breadcrumb and facetsList share references so post-search metadata updates stay visible via currentFacet, and the History dropdown's active highlight keeps working. - Turtle editor re-renders from current results on every view-mode switch (no stale RDF after navigation). - SPARQL worker crash safely rejects every pending promise and respawns on next call. - No "spooky" behaviour on reload: cold load lands on Search with a blank input, no auto-replay of history, no random-notice fallback. - Shareable ?facet= URLs validate on load; invalid URLs surface a dismissible banner instead of a silently blank tab. ## Test suite Vanilla node:test suite restoring the regression coverage from the deleted Zod-based Vitest suite, plus new tests for the request-token race (3 sites), URL round-trip, identity preservation, label-service batching reentrancy, sparqlService orphan-pending-request protection, and shape normalisation. - 92 tests across 7 files - Zero runtime dependencies (node:test is built into Node 20+) - Runs in ~1.6 seconds - Wired into CI via .github/workflows/test.yml ## CI / CD - .github/workflows/pages.yml: static-site deploy on push to main. Stages index.html + src/ into _site/ and uploads as a GitHub Pages artifact. No Node setup, no bundler, no build step — the repo is the deployable site. - .github/workflows/test.yml: runs npm test on every push to main or develop and on every PR. - Replaces the obsolete vite.yml, which referenced npm run build and ./dist (neither of which exist in the new architecture). ## Removes - src/app/ (Vue components, composables, controllers) - src/facets/, src/services/, src/composables/, src/workers/ - editor.html (standalone Vue editor, no longer used) - vite.config.js, vitest.config.js, yarn.lock - test/ (legacy Vitest suite — replaced with node:test) - readme.md (lowercase, Vue-era, every line stale) ## Adds - README.md (matches sibling ted-open-data repo's style) - LICENSE (EUPL v1.2, verbatim from sibling repo) - EUPL copyright headers on every first-party source file (31 files)
Code-level fixes raised by the fresh-reviewer pass on PR #24. None of these are bugs that manifest in normal use; they address gaps between what the PR body claims and what the code actually does, plus one CI workflow improvement. ## A1. Belt-and-braces SPARQL injection guard in _describeTermQuery Facets built at click-time (TermRenderer's named-node click handler, BacklinksView's subject badge click handler) bypass validateFacet because they come from server-trusted SPARQL responses. The URI is still interpolated into `DESCRIBE <${term.value}>`, though, so we now run the same FORBIDDEN_URI_CHARS check at the point of interpolation and throw on rejection. This closes the narrative gap in the PR body's "SPARQL injection defence" framing — validateFacet guards the URL/sessionStorage boundary, _describeTermQuery guards the click-time path, together they cover every way a URI can reach the DESCRIBE string. Hoisted FORBIDDEN_URI_CHARS + _isSafeUri to the top of facets.js so both sites can use them, and added a regression test that asserts getQuery throws for 5 injection-character vectors while still accepting a normal ePO URI. ## A2. Move cors + express from dependencies to devDependencies The PR body framed both as dev-only (they're used exclusively by scripts/cors-proxy.cjs, which is the local dev server). Having them under "dependencies" in package.json made npm audit / dependabot / GitHub's dependency graph classify them as production deps and potentially surface spurious CVEs. Moved to devDependencies and regenerated package-lock.json. ## A3. Add concurrency block to test.yml pages.yml already had concurrency protection; test.yml didn't. Rapid pushes to a PR branch stacked redundant test runs. Added: concurrency: group: test-${{ github.ref }} cancel-in-progress: true so only the latest commit's test run executes per ref. ## A7. Document labelService reentrancy test timing assumption The batching reentrancy tests use setTimeout(150) to outwait BATCH_DELAY_MS=100, leaving ~50ms slack. On a heavily-loaded CI runner that could theoretically flake. Added a block comment at the top of the reentrancy test section explaining the coupling, the slack, and the two options if flakes do materialise (bump the wait, or add a test-only timer hook). No behavioural change — just future-proofing the diagnostic. Test suite now 93 passing (was 92). New test: getQuery throws on unsafe URIs.
Three comment-only clarifications from the fresh-reviewer pass. No behavioural changes. ## B1. Document `src/` as the deploy boundary in pages.yml The Stage step runs `cp -r src _site/`, which means anything committed under `src/` is published to GitHub Pages on the next main-branch push. Today the directory only contains shipping code, but there's no structural guard against future drops of secrets, scratch queries, fixtures, or local mocks. Added a block comment next to the Stage step making the boundary explicit and pointing contributors at the right places to put non-shippable files. ## B2. Explain _scrollToCurrent's per-container behaviour `_scrollToCurrent` is called once per container per render (Search-tab card + Explore-tab mini-card), and the mini-card is collapsed by default — so on the first render its `clientWidth` is 0 and the scrollLeft assignment is a no-op. The scroll gets recomputed naturally when `_updateHighlight` fires after the card is expanded. The existing behaviour is correct but non-obvious; added a block comment explaining it so the next reader doesn't stop and question whether the scroll is targeting the right container. ## B3. Document _helpers.js shim minimalism The StubElement / StubClassList shim in the test helpers is intentionally small (no querySelector traversal, no event bubbling, no layout metrics, synchronous rAF). Previously this was implicit in the code; now it's explicit in the header comment, with guidance for future test authors: extend the shim here rather than pulling in JSDOM. Also lists the specific gaps a future test might trip on.
Adds unit tests for the two source files the test-analyzer agent flagged as having non-DOM logic and zero direct test coverage: TreeRenderer and randomNotice. Neither is a bug fix — both modules work correctly today — but both are exactly the kind of code that silently regresses during a refactor if nothing is looking at them. ## TreeRenderer unit tests (12 tests) Exercises the three pure helpers on TreeRenderer without touching the DOM: `_buildIndex`, `_findRootSubjects`, `_partitionObjects`. - `_buildIndex`: grouping by subject and predicate, multiple objects under the same predicate, empty input - `_findRootSubjects`: root identification, pure-cycle fallback to all subjects, literals not counting as references - `_partitionObjects`: all nestability branches — named node in index, blank node in index, ancestor-path exclusion, dangling reference, literal-vs-node mix, per-bucket order preservation The TreeRenderer constructor is called with a dummy container object; the pure helpers never read from it, so no DOM shim is needed beyond what test/_helpers.js already provides for the import chain. ## randomNotice unit tests (8 tests) + DI hook Adds `__setDoSPARQLForTesting` and `__resetForTesting` hooks on randomNotice (same pattern as labelService) so tests can drive the retry loop without hitting the real endpoint. Tests cover: - happy path (first attempt succeeds) - quad-predicate filtering (only `hasNoticePublicationNumber` triples are picked, unrelated quads are ignored) - retry path (first N attempts empty, later attempt succeeds) - exhaustion path (all 10 attempts empty → throws, no hardcoded fallback) - endpoint error recovery (doSPARQL throws → treated as empty-window, retries continue until success or exhaustion) - regression guard that the pre-rewrite "hardcoded fallback" behaviour has not silently crept back in Test suite now 113 passing (was 93). No behavioural changes — the DI hooks are strictly additive and documented as test-only.
Mirror the cancel-query feature from the sibling ted-open-data repo: a red ✗ button in the footer, next to the progress bar, visible only while a query is in flight. Clicking it terminates the SPARQL worker, clears the current results without raising an error banner, and lets the next search spawn a fresh worker. ## Why Most explorer queries are fast (~50-200ms against the acceptance endpoint), so users rarely need to cancel today. The migration angle changes the calculus: when the explorer gets folded into ted-open-data, ted-open-data users rely on the stop button for long-running exploratory SPARQL queries, and losing it during the merge would be a visible regression. Adding it now means the merge inherits a consistent cancel API from one place. ## Design — Option B (terminate worker) The explorer architecture runs SPARQL in a dedicated Web Worker. Three cancel strategies were considered: A. postMessage-based cancel (worker tracks id→AbortController map) B. Terminate + respawn the worker on stop (nuclear, simple) C. Per-request AbortController exposed via the service API Option B wins for this use case: the user clicking Stop means "stop whatever SPARQL work is happening," not "stop this one thing and keep the label fetches going." A nuclear terminate is the honest model and leverages the existing `worker.terminate()` code in the worker-crash handler — same shutdown path, same respawn path, no new state. ## Changes - `sparqlService.js`: new public `cancelAllSparqlRequests()` which rejects every pending promise with a `CancelledError` (named error), terminates the worker, and nulls it out. Next call to `getWorker()` spawns a fresh one. - `ExplorerController.js`: new public `cancelCurrentQuery()`. No-op when `!isLoading`. Otherwise delegates to the service's cancel hook (injected via constructor, same DI pattern as `doSPARQL`). `_executeCurrentQuery` now recognises `CancelledError` in its catch branch and clears `results`/`error` to null instead of writing the cancellation into `this.error` and logging. Loading state clears naturally via the finally block. - `app.js`: new `wireStopButton(controller)` helper, called alongside `wireProgressBar`. Toggles visibility on `loading-changed`, routes clicks to `controller.cancelCurrentQuery()`. - `index.html`: new `#stop-query-btn` in the footer, hidden by default, Bootstrap Icons `bi-x-circle-fill` glyph, red (`#dc3545`), mirrored from the sibling repo's placement. ## Tests Three new controller tests verifying the cancel path: - `cancelCurrentQuery invokes the cancel hook and clears loading state` - `cancelCurrentQuery is a no-op when no query is in flight` - `cancelCurrentQuery during a navigation does not affect the fresh search that replaces it` All stubbed with a mock `cancelAllSparqlRequests` that rejects the in-flight deferred with a `CancelledError`, exercising the catch branch in `_executeCurrentQuery`. Suite now 116 passing (was 113). ## Verified live via Playwright - Cold load: stop button hidden (display: none) - Search kicks off → stop button visible (display: flex), loading indicator visible - Click stop mid-flight → loading indicator clears, stop button hides, no error banner appears, data card retains whatever was there previously (cancelled search never populated it) - Fresh search after cancel → new worker spawns, query runs, results render normally - Zero console errors or warnings across the entire stop-button session
Before this change, searching a notice-number that doesn't exist
(typo, nonsense value) produced a confusingly successful-looking UI:
- Title: "Notice 99999999-2026 — 0 triples"
- Body: "No triples to display" (muted italic)
- No error banner, no explanation
Users couldn't distinguish "this notice doesn't exist" from
"this notice exists but has no data", and the phantom publication
number got saved to session history where it polluted the dropdown
indefinitely.
## Detection
A notice-number search with `results.size === 0` is a reliable
"not found" signal: any real notice produces at least rdf:type
triples, so an empty CONSTRUCT on a well-formed publication number
cannot be anything other than "no such notice". DataView checks
this in `_onResultsChanged` after a successful (non-error) query.
## The new state
A dedicated panel replaces the Tree/Turtle/Backlinks trio when
not-found fires:
- Large muted search-glass icon
- "Notice not found" heading
- Body: "No notice with publication number <code>...</code> was
found in the dataset."
- Secondary line: "Double-check the publication number, or click
here to pick a random notice."
Title is just `Notice 99999999-2026` — the "— N triples" suffix is
deliberately dropped because leaking "0 triples" into the framing
is the confusing thing we're replacing. The view-mode radio group
(Tree/Turtle/Backlinks) is hidden entirely because there's nothing
to toggle between.
## Phantom history cleanup (Layer 2a)
New `controller.removeFacetByValue(publicationNumber)` method
removes a notice-number entry from the persistent history list by
its pub-number value. DataView calls it when the not-found state
fires, so a typo'd search no longer lingers in the dropdown. The
method is careful to only match notice-number facets (not other
types that might carry the same string in a `value` field), and
persists the removal to sessionStorage.
## Consistent messaging across tabs
NoticeView's "No procedures found for this notice" message (shown
in the Search-tab results card when the TED Search API returns
zero procedures) was a second, less-clear version of the same
signal on a different tab. Now NoticeView hides the Search-tab
results card entirely in that branch, so the canonical "not found"
state on the Explore tab is the only place the user sees the
message.
## Lucky link wiring
DataView's "pick a random notice" link inside the not-found panel
routes to a new public `SearchPanel.pickRandom()` method (thin
wrapper around `_lucky()`), delegated via a `pickRandom` callback
passed to the DataView constructor from app.js. Same lucky flow as
the Search-tab link — nothing forked.
## Verified live (Playwright)
- Search 99999999-2026 → not-found panel shown, title clean, view
mode toggle hidden, history not polluted (3 real entries, no
phantom)
- Search a valid notice → not-found hidden, view mode toggle back,
tree renders, title has triples count
- Flip back and forth between valid and not-found several times →
clean state on every transition
- Click "pick a random notice" from the not-found panel → routes
through searchPanel.pickRandom(), lands on a real notice
- Zero console errors or warnings throughout
## Tests
Four new controller tests for `removeFacetByValue`:
- removes notice-number entries by value
- no-op when no match
- only matches notice-number type (ignores other facet types
carrying the same string in a `value` field)
- persists to sessionStorage
Suite now 120 passing (was 116).
…utton to data card - getShareableUrl now serialises only identity-defining fields (~103 chars vs 234) - SearchPanel.init switches to Explore tab on fresh navigation from shared URLs; reloads still preserve the active tab - share button moved from Search tab to next to the data card title, where the user is actually looking at what they want to share; supports deep named-node states the old button could not represent - extract clipboard copy fallback into shared clipboardCopy.js util
|
Comments after testing:
|
…ne clicks - hide Explore tab entirely when there's no facet to explore (plus a safety-net placeholder inside the tab in case it's ever reached) - hydrate history dropdown from sessionStorage on SearchPanel.init so reloads no longer look like 'history cleared until next search' - timeline sibling clicks no longer add to History; the Procedure Timeline is already the contextual UI for siblings, History is for notices the user explicitly started from
|
Thanks for the careful testing @victoriobentivogli — all three observations were on the money. Pushed ce7acaa which addresses them: 1. Cold load / empty Explore tab — Root cause: the placeholder ("Execute a query to see RDF data") was nested inside Fix: hide the Explore tab entirely when there's nothing to explore. The tab button only appears once a facet is loaded. A safety-net placeholder with clearer copy now lives outside 2. History dropdown empty after reload — Root cause: Fix: one call to 3. Timeline clicks adding to History — design question — You flagged this as a question and I think it's a good one. My take: they shouldn't. The mental model should be:
Adding timeline siblings to History creates clutter (one procedure with 6 stages = 6 near-duplicate entries) and duplicates affordance — the Timeline widget itself is the contextual navigator for siblings. The existing Fix: Tests: 2 new cases cover the Let me know if any of the reasoning on #3 doesn't line up with your expectations — happy to revisit. |
|
Thanks Ioannis. I agree with the reasoning behind #3. Will retest now. |
|
Testing updates:
|
I moved the share button to where it belongs (in the explore tab next to the title). |
|
Clarified the issue. Now approving and merging. |
victoriobentivogli
left a comment
There was a problem hiding this comment.
OK. Approving.
Summary
Complete rewrite of the explorer frontend from Vue 3 + Naive UI to vanilla JavaScript + Bootstrap 5.3.8. All user-facing features preserved; the new stack keeps only Express + cors for the dev server and the CodeMirror packages for the editor bundle.
Scope: 88 files changed, +8,176 / −19,025 lines.
Why this rewrite
ted-open-datarepo is already vanilla JS + Bootstrap. Rewriting this repo the same way makes future consolidation possible.node:testsuite has zero runtime dependencies and runs in ~1.6 seconds.What to focus on during review
src/js/app.jswires a singleExplorerControllerto five UI panels via CustomEvents. The controller is the only source of truth for navigation, history, and results. Header comments in each file describe its role.ExplorerController._executeCurrentQuery,NoticeView._fetchData,BacklinksView._loadBatch. Each site is regression-tested.tedAPI.js—buyer-countryarrays,notice-type/form-typeobject-or-string shapes are normalized tostring | nullat the boundary so the rest of the codebase sees stable types.facets.js—validateFacetrejects URIs carrying angle brackets, quotes, backslashes, whitespace, or control characters.index.html) — updated to match current UI. Good fit for a visual review.test/. Particularly worth reading:test/controller.test.js(token race) andtest/labelService.test.js(batching reentrancy).What does NOT need deep scrutiny
src/vendor/codemirror-bundle.js— 465KB minified bundle of CodeMirror v6, generated byscripts/build-codemirror.cjs. Only the entry pointsrc/vendor/codemirror-entry.jsis worth reading.LICENSE— byte-for-byte copy of the EUPL 1.2 file from the sibling repo..github/workflows/pages.ymlandtest.yml— minimal replacements for the obsoletevite.ymlwhich referencednpm run buildand./dist(neither exists in the new architecture).Review process before this PR
A thorough pre-PR review pass was run against the full diff, covering code review, silent-failure hunting, comment accuracy, type/shape design, and test coverage, plus a behavioral walkthrough of the running app. Every finding (7 High, 9 Medium, ~14 Low/Nit) was addressed and regression-guarded before this PR was opened.
Test plan
git clonea fresh copy and confirm no build step is needednpm installsucceeds (only the CORS proxy + CodeMirror dev-dependencies)npm startbrings up the dev server athttp://localhost:8080and the SPARQL proxy at/sparqlnpm testpasses 93/93 in under 2 seconds?facet=...): facet loads into Explore but tab stays on Search?facet={"value":"foo"}): dismissible banner shown, no data loaded, app stays on SearchCONSTRUCT { ?s ?p ?o } WHERE { ?s a epo:Notice } LIMIT 10, tab switches to Explore, results renderconsole.warnmessages from schema-drift detection intedAPI.js(non-arraynotices, missingprocedure-identifier, unexpectedpublication-numbershape) and from corruptsessionStorageJSON inExplorerController.jsare acceptable — everything else should be silent.Follow-up
Once this PR is merged to
develop, a separate PR fromdeveloptomainwill release version2.0.0. Tag1.0.0already points atmain's current (Vue-era) commit as a pre-rewrite baseline.