Rewrite explorer in vanilla JS + Bootstrap 5.3.8 by rousso · Pull Request #24 · OP-TED/ted-open-data-explorer

rousso · 2026-04-07T09:16:11Z

Summary

Complete rewrite of the explorer frontend from Vue 3 + Naive UI to vanilla JavaScript + Bootstrap 5.3.8. All user-facing features preserved; the new stack keeps only Express + cors for the dev server and the CodeMirror packages for the editor bundle.

Scope: 88 files changed, +8,176 / −19,025 lines.

Why this rewrite

The Vue stack had ~55 security vulnerabilities on the default branch. Most of that disappears with this merge.
The sibling ted-open-data repo is already vanilla JS + Bootstrap. Rewriting this repo the same way makes future consolidation possible.
The old Vitest suite carried dozens of transitive dependencies. The new node:test suite has zero runtime dependencies and runs in ~1.6 seconds.

What to focus on during review

Architecture — src/js/app.js wires a single ExplorerController to five UI panels via CustomEvents. The controller is the only source of truth for navigation, history, and results. Header comments in each file describe its role.
Request-token pattern — stale-response race guards in three places: ExplorerController._executeCurrentQuery, NoticeView._fetchData, BacklinksView._loadBatch. Each site is regression-tested.
Boundary normalization in tedAPI.js — buyer-country arrays, notice-type/form-type object-or-string shapes are normalized to string | null at the boundary so the rest of the codebase sees stable types.
SPARQL injection defence in facets.js — validateFacet rejects URIs carrying angle brackets, quotes, backslashes, whitespace, or control characters.
Help tab (index.html) — updated to match current UI. Good fit for a visual review.
Tests — 93 tests across 8 files in test/. Particularly worth reading: test/controller.test.js (token race) and test/labelService.test.js (batching reentrancy).

What does NOT need deep scrutiny

src/vendor/codemirror-bundle.js — 465KB minified bundle of CodeMirror v6, generated by scripts/build-codemirror.cjs. Only the entry point src/vendor/codemirror-entry.js is worth reading.
LICENSE — byte-for-byte copy of the EUPL 1.2 file from the sibling repo.
.github/workflows/pages.yml and test.yml — minimal replacements for the obsolete vite.yml which referenced npm run build and ./dist (neither exists in the new architecture).

Review process before this PR

A thorough pre-PR review pass was run against the full diff, covering code review, silent-failure hunting, comment accuracy, type/shape design, and test coverage, plus a behavioral walkthrough of the running app. Every finding (7 High, 9 Medium, ~14 Low/Nit) was addressed and regression-guarded before this PR was opened.

Test plan

Note: items after npm test exercise the live publications.europa.eu SPARQL endpoint via the bundled CORS proxy, so a reviewer on a network that blocks outbound access to that endpoint will see false negatives.

Follow-up

Once this PR is merged to develop, a separate PR from develop to main will release version 2.0.0. Tag 1.0.0 already points at main's current (Vue-era) commit as a pre-rewrite baseline.

Replace the entire frontend stack with vanilla JavaScript and Bootstrap, preserving 100% of user-facing features. Bundle CodeMirror via a custom esbuild script (scripts/build-codemirror.cjs) and move SPARQL execution to a Web Worker. ## Architecture - Single ExplorerController as the source of truth; UI panels subscribe via CustomEvents (facet-changed, results-changed, loading-changed, breadcrumb-changed, facets-list-changed). - Panels: SearchPanel, SparqlPanel, NoticeView, DataView, BacklinksView. - Services: sparqlService (worker-backed CONSTRUCT/DESCRIBE + direct SELECT with AbortController timeouts), tedAPI (JSON client for the TED Search API with hostname-based production/acceptance routing), labelService (batched label resolution with snapshot-and-clear reentrancy), randomNotice. - Renderers: TreeRenderer (lazy nested cards, per-branch cycle guard), TermRenderer (RDF term + badge rendering with single label request per badge). - CodeMirror bundled once via esbuild into src/vendor/codemirror-bundle.js, committed to the repo so a fresh clone is runnable without npm install. ## Behavioural principles baked in - Request-token pattern guards three stale-response races (ExplorerController._executeCurrentQuery, NoticeView._fetchData, BacklinksView._loadBatch). All three are regression-tested. - SessionStorage-persisted notice-search history only; queries and named-node facets are deliberately not persisted. - Explicit user-gesture gate on tab-switching: only the search button, timeline click, history dropdown pick, and SPARQL Execute auto-switch to the Explore tab. URL-param loads, breadcrumb navigation, and label resolution leave the active tab alone. - AbortController-based timeouts on every fetch and worker call (30s TED API, 60s SPARQL, 10s labels). - Defensive boundary normalisation at the tedAPI layer: buyer-country arrays joined to strings, noticeType/formType coerced to string-or-null, no fabricated "Unknown" values leaking downstream. - SPARQL injection defence in validateFacet (rejects URIs carrying angle brackets, quotes, backslashes, whitespace, or control characters). - Identity preservation across history enrichment: breadcrumb and facetsList share references so post-search metadata updates stay visible via currentFacet, and the History dropdown's active highlight keeps working. - Turtle editor re-renders from current results on every view-mode switch (no stale RDF after navigation). - SPARQL worker crash safely rejects every pending promise and respawns on next call. - No "spooky" behaviour on reload: cold load lands on Search with a blank input, no auto-replay of history, no random-notice fallback. - Shareable ?facet= URLs validate on load; invalid URLs surface a dismissible banner instead of a silently blank tab. ## Test suite Vanilla node:test suite restoring the regression coverage from the deleted Zod-based Vitest suite, plus new tests for the request-token race (3 sites), URL round-trip, identity preservation, label-service batching reentrancy, sparqlService orphan-pending-request protection, and shape normalisation. - 92 tests across 7 files - Zero runtime dependencies (node:test is built into Node 20+) - Runs in ~1.6 seconds - Wired into CI via .github/workflows/test.yml ## CI / CD - .github/workflows/pages.yml: static-site deploy on push to main. Stages index.html + src/ into _site/ and uploads as a GitHub Pages artifact. No Node setup, no bundler, no build step — the repo is the deployable site. - .github/workflows/test.yml: runs npm test on every push to main or develop and on every PR. - Replaces the obsolete vite.yml, which referenced npm run build and ./dist (neither of which exist in the new architecture). ## Removes - src/app/ (Vue components, composables, controllers) - src/facets/, src/services/, src/composables/, src/workers/ - editor.html (standalone Vue editor, no longer used) - vite.config.js, vitest.config.js, yarn.lock - test/ (legacy Vitest suite — replaced with node:test) - readme.md (lowercase, Vue-era, every line stale) ## Adds - README.md (matches sibling ted-open-data repo's style) - LICENSE (EUPL v1.2, verbatim from sibling repo) - EUPL copyright headers on every first-party source file (31 files)

Code-level fixes raised by the fresh-reviewer pass on PR #24. None of these are bugs that manifest in normal use; they address gaps between what the PR body claims and what the code actually does, plus one CI workflow improvement. ## A1. Belt-and-braces SPARQL injection guard in _describeTermQuery Facets built at click-time (TermRenderer's named-node click handler, BacklinksView's subject badge click handler) bypass validateFacet because they come from server-trusted SPARQL responses. The URI is still interpolated into `DESCRIBE <${term.value}>`, though, so we now run the same FORBIDDEN_URI_CHARS check at the point of interpolation and throw on rejection. This closes the narrative gap in the PR body's "SPARQL injection defence" framing — validateFacet guards the URL/sessionStorage boundary, _describeTermQuery guards the click-time path, together they cover every way a URI can reach the DESCRIBE string. Hoisted FORBIDDEN_URI_CHARS + _isSafeUri to the top of facets.js so both sites can use them, and added a regression test that asserts getQuery throws for 5 injection-character vectors while still accepting a normal ePO URI. ## A2. Move cors + express from dependencies to devDependencies The PR body framed both as dev-only (they're used exclusively by scripts/cors-proxy.cjs, which is the local dev server). Having them under "dependencies" in package.json made npm audit / dependabot / GitHub's dependency graph classify them as production deps and potentially surface spurious CVEs. Moved to devDependencies and regenerated package-lock.json. ## A3. Add concurrency block to test.yml pages.yml already had concurrency protection; test.yml didn't. Rapid pushes to a PR branch stacked redundant test runs. Added: concurrency: group: test-${{ github.ref }} cancel-in-progress: true so only the latest commit's test run executes per ref. ## A7. Document labelService reentrancy test timing assumption The batching reentrancy tests use setTimeout(150) to outwait BATCH_DELAY_MS=100, leaving ~50ms slack. On a heavily-loaded CI runner that could theoretically flake. Added a block comment at the top of the reentrancy test section explaining the coupling, the slack, and the two options if flakes do materialise (bump the wait, or add a test-only timer hook). No behavioural change — just future-proofing the diagnostic. Test suite now 93 passing (was 92). New test: getQuery throws on unsafe URIs.

Three comment-only clarifications from the fresh-reviewer pass. No behavioural changes. ## B1. Document `src/` as the deploy boundary in pages.yml The Stage step runs `cp -r src _site/`, which means anything committed under `src/` is published to GitHub Pages on the next main-branch push. Today the directory only contains shipping code, but there's no structural guard against future drops of secrets, scratch queries, fixtures, or local mocks. Added a block comment next to the Stage step making the boundary explicit and pointing contributors at the right places to put non-shippable files. ## B2. Explain _scrollToCurrent's per-container behaviour `_scrollToCurrent` is called once per container per render (Search-tab card + Explore-tab mini-card), and the mini-card is collapsed by default — so on the first render its `clientWidth` is 0 and the scrollLeft assignment is a no-op. The scroll gets recomputed naturally when `_updateHighlight` fires after the card is expanded. The existing behaviour is correct but non-obvious; added a block comment explaining it so the next reader doesn't stop and question whether the scroll is targeting the right container. ## B3. Document _helpers.js shim minimalism The StubElement / StubClassList shim in the test helpers is intentionally small (no querySelector traversal, no event bubbling, no layout metrics, synchronous rAF). Previously this was implicit in the code; now it's explicit in the header comment, with guidance for future test authors: extend the shim here rather than pulling in JSDOM. Also lists the specific gaps a future test might trip on.

Adds unit tests for the two source files the test-analyzer agent flagged as having non-DOM logic and zero direct test coverage: TreeRenderer and randomNotice. Neither is a bug fix — both modules work correctly today — but both are exactly the kind of code that silently regresses during a refactor if nothing is looking at them. ## TreeRenderer unit tests (12 tests) Exercises the three pure helpers on TreeRenderer without touching the DOM: `_buildIndex`, `_findRootSubjects`, `_partitionObjects`. - `_buildIndex`: grouping by subject and predicate, multiple objects under the same predicate, empty input - `_findRootSubjects`: root identification, pure-cycle fallback to all subjects, literals not counting as references - `_partitionObjects`: all nestability branches — named node in index, blank node in index, ancestor-path exclusion, dangling reference, literal-vs-node mix, per-bucket order preservation The TreeRenderer constructor is called with a dummy container object; the pure helpers never read from it, so no DOM shim is needed beyond what test/_helpers.js already provides for the import chain. ## randomNotice unit tests (8 tests) + DI hook Adds `__setDoSPARQLForTesting` and `__resetForTesting` hooks on randomNotice (same pattern as labelService) so tests can drive the retry loop without hitting the real endpoint. Tests cover: - happy path (first attempt succeeds) - quad-predicate filtering (only `hasNoticePublicationNumber` triples are picked, unrelated quads are ignored) - retry path (first N attempts empty, later attempt succeeds) - exhaustion path (all 10 attempts empty → throws, no hardcoded fallback) - endpoint error recovery (doSPARQL throws → treated as empty-window, retries continue until success or exhaustion) - regression guard that the pre-rewrite "hardcoded fallback" behaviour has not silently crept back in Test suite now 113 passing (was 93). No behavioural changes — the DI hooks are strictly additive and documented as test-only.

Mirror the cancel-query feature from the sibling ted-open-data repo: a red ✗ button in the footer, next to the progress bar, visible only while a query is in flight. Clicking it terminates the SPARQL worker, clears the current results without raising an error banner, and lets the next search spawn a fresh worker. ## Why Most explorer queries are fast (~50-200ms against the acceptance endpoint), so users rarely need to cancel today. The migration angle changes the calculus: when the explorer gets folded into ted-open-data, ted-open-data users rely on the stop button for long-running exploratory SPARQL queries, and losing it during the merge would be a visible regression. Adding it now means the merge inherits a consistent cancel API from one place. ## Design — Option B (terminate worker) The explorer architecture runs SPARQL in a dedicated Web Worker. Three cancel strategies were considered: A. postMessage-based cancel (worker tracks id→AbortController map) B. Terminate + respawn the worker on stop (nuclear, simple) C. Per-request AbortController exposed via the service API Option B wins for this use case: the user clicking Stop means "stop whatever SPARQL work is happening," not "stop this one thing and keep the label fetches going." A nuclear terminate is the honest model and leverages the existing `worker.terminate()` code in the worker-crash handler — same shutdown path, same respawn path, no new state. ## Changes - `sparqlService.js`: new public `cancelAllSparqlRequests()` which rejects every pending promise with a `CancelledError` (named error), terminates the worker, and nulls it out. Next call to `getWorker()` spawns a fresh one. - `ExplorerController.js`: new public `cancelCurrentQuery()`. No-op when `!isLoading`. Otherwise delegates to the service's cancel hook (injected via constructor, same DI pattern as `doSPARQL`). `_executeCurrentQuery` now recognises `CancelledError` in its catch branch and clears `results`/`error` to null instead of writing the cancellation into `this.error` and logging. Loading state clears naturally via the finally block. - `app.js`: new `wireStopButton(controller)` helper, called alongside `wireProgressBar`. Toggles visibility on `loading-changed`, routes clicks to `controller.cancelCurrentQuery()`. - `index.html`: new `#stop-query-btn` in the footer, hidden by default, Bootstrap Icons `bi-x-circle-fill` glyph, red (`#dc3545`), mirrored from the sibling repo's placement. ## Tests Three new controller tests verifying the cancel path: - `cancelCurrentQuery invokes the cancel hook and clears loading state` - `cancelCurrentQuery is a no-op when no query is in flight` - `cancelCurrentQuery during a navigation does not affect the fresh search that replaces it` All stubbed with a mock `cancelAllSparqlRequests` that rejects the in-flight deferred with a `CancelledError`, exercising the catch branch in `_executeCurrentQuery`. Suite now 116 passing (was 113). ## Verified live via Playwright - Cold load: stop button hidden (display: none) - Search kicks off → stop button visible (display: flex), loading indicator visible - Click stop mid-flight → loading indicator clears, stop button hides, no error banner appears, data card retains whatever was there previously (cancelled search never populated it) - Fresh search after cancel → new worker spawns, query runs, results render normally - Zero console errors or warnings across the entire stop-button session

Before this change, searching a notice-number that doesn't exist (typo, nonsense value) produced a confusingly successful-looking UI: - Title: "Notice 99999999-2026 — 0 triples" - Body: "No triples to display" (muted italic) - No error banner, no explanation Users couldn't distinguish "this notice doesn't exist" from "this notice exists but has no data", and the phantom publication number got saved to session history where it polluted the dropdown indefinitely. ## Detection A notice-number search with `results.size === 0` is a reliable "not found" signal: any real notice produces at least rdf:type triples, so an empty CONSTRUCT on a well-formed publication number cannot be anything other than "no such notice". DataView checks this in `_onResultsChanged` after a successful (non-error) query. ## The new state A dedicated panel replaces the Tree/Turtle/Backlinks trio when not-found fires: - Large muted search-glass icon - "Notice not found" heading - Body: "No notice with publication number <code>...</code> was found in the dataset." - Secondary line: "Double-check the publication number, or click here to pick a random notice." Title is just `Notice 99999999-2026` — the "— N triples" suffix is deliberately dropped because leaking "0 triples" into the framing is the confusing thing we're replacing. The view-mode radio group (Tree/Turtle/Backlinks) is hidden entirely because there's nothing to toggle between. ## Phantom history cleanup (Layer 2a) New `controller.removeFacetByValue(publicationNumber)` method removes a notice-number entry from the persistent history list by its pub-number value. DataView calls it when the not-found state fires, so a typo'd search no longer lingers in the dropdown. The method is careful to only match notice-number facets (not other types that might carry the same string in a `value` field), and persists the removal to sessionStorage. ## Consistent messaging across tabs NoticeView's "No procedures found for this notice" message (shown in the Search-tab results card when the TED Search API returns zero procedures) was a second, less-clear version of the same signal on a different tab. Now NoticeView hides the Search-tab results card entirely in that branch, so the canonical "not found" state on the Explore tab is the only place the user sees the message. ## Lucky link wiring DataView's "pick a random notice" link inside the not-found panel routes to a new public `SearchPanel.pickRandom()` method (thin wrapper around `_lucky()`), delegated via a `pickRandom` callback passed to the DataView constructor from app.js. Same lucky flow as the Search-tab link — nothing forked. ## Verified live (Playwright) - Search 99999999-2026 → not-found panel shown, title clean, view mode toggle hidden, history not polluted (3 real entries, no phantom) - Search a valid notice → not-found hidden, view mode toggle back, tree renders, title has triples count - Flip back and forth between valid and not-found several times → clean state on every transition - Click "pick a random notice" from the not-found panel → routes through searchPanel.pickRandom(), lands on a real notice - Zero console errors or warnings throughout ## Tests Four new controller tests for `removeFacetByValue`: - removes notice-number entries by value - no-op when no match - only matches notice-number type (ignores other facet types carrying the same string in a `value` field) - persists to sessionStorage Suite now 120 passing (was 116).

…utton to data card - getShareableUrl now serialises only identity-defining fields (~103 chars vs 234) - SearchPanel.init switches to Explore tab on fresh navigation from shared URLs; reloads still preserve the active tab - share button moved from Search tab to next to the data card title, where the user is actually looking at what they want to share; supports deep named-node states the old button could not represent - extract clipboard copy fallback into shared clipboardCopy.js util

victoriobentivogli · 2026-04-07T11:55:28Z

Comments after testing:

Cold load (clean URL, empty session): The "Explore" tab is present after a "cold" load but no content shown.
Clear history: The dropdown is cleared as expected. Reloading the page also clears the history, but after a new search, it is repopulated with previous searches.
Session storage: I see that the "History" dropdown remembers not only notices entered in the search box but also ones "clicked" in the timeline. Is this the expected behaviour? Also, when I refresh the page, the "History" dropdown is empty but the "explorer-facets-v3" key holds the proper history and performing another search brings all values back to the history dropdown.

…ne clicks - hide Explore tab entirely when there's no facet to explore (plus a safety-net placeholder inside the tab in case it's ever reached) - hydrate history dropdown from sessionStorage on SearchPanel.init so reloads no longer look like 'history cleared until next search' - timeline sibling clicks no longer add to History; the Procedure Timeline is already the contextual UI for siblings, History is for notices the user explicitly started from

rousso · 2026-04-07T12:44:27Z

Thanks for the careful testing @victoriobentivogli — all three observations were on the money. Pushed ce7acaa which addresses them:

1. Cold load / empty Explore tab — Root cause: the placeholder ("Execute a query to see RDF data") was nested inside #data-card, which is hidden when there's no current facet. So on a cold load the placeholder was hidden along with the card and the Explore tab was visually blank.

Fix: hide the Explore tab entirely when there's nothing to explore. The tab button only appears once a facet is loaded. A safety-net placeholder with clearer copy now lives outside #data-card in case the tab is ever reached in an empty state.

2. History dropdown empty after reload — Root cause: ExplorerController._loadFromSession() correctly hydrated facetsList from sessionStorage in the constructor, but no facets-list-changed event fired (it can't — there are no listeners yet at construction time) and SearchPanel.init() didn't call _updateUI() either. So the dropdown only populated after the first facets-list-changed event, which happens on the next search — exactly the "reload clears it, new search brings it back" symptom you saw.

Fix: one call to _updateUI() at the start of SearchPanel.init() so the dropdown reflects whatever the controller already loaded.

3. Timeline clicks adding to History — design question — You flagged this as a question and I think it's a good one. My take: they shouldn't. The mental model should be:

History dropdown = notices the user explicitly started from (typed, lucky, shared URL)
Procedure Timeline = lateral moves within an already-visible procedure

Adding timeline siblings to History creates clutter (one procedure with 6 stages = 6 near-duplicate entries) and duplicates affordance — the Timeline widget itself is the contextual navigator for siblings. The existing explore() and exploreFromBacklink() methods already follow this pattern (tree drill-downs and backlinks don't persist to History either); timeline clicks were the odd one out.

Fix: controller.search() now accepts { addToHistory: false }, and NoticeView's timeline click handler passes it. The breadcrumb still resets (we're switching notices), the search input still mirrors the clicked pub number, and a new _resolveExisting() helper makes sure the breadcrumb points at the enriched canonical reference if we've seen the notice before — just without inserting a new History entry.

Tests: 2 new cases cover the addToHistory: false contract (no insert + canonical resolution of already-seen notices). 123/123 pass.

Let me know if any of the reasoning on #3 doesn't line up with your expectations — happy to revisit.

victoriobentivogli · 2026-04-07T13:05:06Z

Thanks Ioannis.

I agree with the reasoning behind #3. Will retest now.

victoriobentivogli · 2026-04-07T13:26:14Z

Testing updates:

History works as expected now.
Shareable URL: The icon disappeared from the main search page. When I click on the Share icon on the "Explore" tab and paste it on a new browser (incognito) window, "facet loads into Explore but tab stays on Search" behaviour is not followed as the "Explore" tab is activated. If it is a design change, I do not dislike the new behaviour.

rousso · 2026-04-07T13:29:54Z

Testing updates:

History works as expected now.

Shareable URL: The icon disappeared from the main search page. When I click on the Share icon on the "Explore" tab and paste it on a new browser (incognito) window, "facet loads into Explore but tab stays on Search" behaviour is not followed as the "Explore" tab is activated. If it is a design change, I do not dislike the new behaviour.

I moved the share button to where it belongs (in the explore tab next to the title).
What exactly is your comment? I do not see a specific request for fix or improvement.

victoriobentivogli · 2026-04-07T13:59:48Z

Clarified the issue. Now approving and merging.

victoriobentivogli

OK. Approving.

rousso requested a review from victoriobentivogli April 7, 2026 09:16

rousso added 6 commits April 7, 2026 11:51

victoriobentivogli approved these changes Apr 7, 2026

View reviewed changes

victoriobentivogli merged commit 1a5320d into develop Apr 7, 2026
1 check passed

victoriobentivogli deleted the feature/bootstrap-rewrite branch April 7, 2026 14:01

rousso mentioned this pull request Apr 7, 2026

Release/2.0.0 #25

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite explorer in vanilla JS + Bootstrap 5.3.8#24

Rewrite explorer in vanilla JS + Bootstrap 5.3.8#24
victoriobentivogli merged 8 commits intodevelopfrom
feature/bootstrap-rewrite

rousso commented Apr 7, 2026 •

edited by victoriobentivogli

Loading

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

rousso commented Apr 7, 2026

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

rousso commented Apr 7, 2026

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

victoriobentivogli left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rousso commented Apr 7, 2026 • edited by victoriobentivogli Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why this rewrite

What to focus on during review

What does NOT need deep scrutiny

Review process before this PR

Test plan

Follow-up

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

rousso commented Apr 7, 2026

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

rousso commented Apr 7, 2026

Uh oh!

victoriobentivogli commented Apr 7, 2026

Uh oh!

victoriobentivogli left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rousso commented Apr 7, 2026 •

edited by victoriobentivogli

Loading