Skip to content

feat(unrest): surface protest source links#30

Open
lspassos1 wants to merge 2 commits into
mainfrom
feat/protest-source-links
Open

feat(unrest): surface protest source links#30
lspassos1 wants to merge 2 commits into
mainfrom
feat/protest-source-links

Conversation

@lspassos1

Copy link
Copy Markdown
Owner

Summary
Adds source URL fidelity to unrest/protest events so map popups can link back to upstream articles when GDELT or ACLED provides a usable URL. Closes #25. Refs koala73#131.

Root cause
The unrest contract only preserved source identifiers such as GDELT or ACLED. Seeded protest events could not carry article-level URLs through the API/client boundary, so the map popup had no source link to render.

Changes

  • Added source_urls to UnrestEvent and regenerated TS/OpenAPI artifacts.
  • Preserved normalized http/https source URLs in the ACLED/GDELT seed path and during dedupe.
  • Adapted unrest client mapping to SocialUnrestEvent.sourceUrls.
  • Rendered compact source links in protest popups and cluster items when URLs exist.
  • Extended the GDELT seed test to assert source URLs survive aggregation.

Validation

  • make generate
  • npx tsx --test tests/seed-unrest-gdelt-fetch.test.mjs
  • npm run typecheck
  • npm run typecheck:api
  • npx biome lint scripts/seed-unrest-events.mjs server/worldmonitor/unrest/v1/_shared.ts src/components/MapPopup.ts src/services/unrest/index.ts tests/seed-unrest-gdelt-fetch.test.mjs completed with existing MapPopup warnings outside this change
  • git diff --check

Risk
Low. The new field is additive; existing cached seed payloads without sourceUrls continue to render without links.

@vercel

vercel Bot commented Apr 27, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
worldmonitor Ready Ready Preview, Comment Apr 28, 2026 4:56pm

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown
📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Unrest events now display associated source article URLs in map popups and cluster listings, allowing users to access upstream sources.
  • Tests

    • Added test coverage for source URL handling, deduplication, and safe rendering across event aggregation and display.

Walkthrough

This pull request adds end-to-end support for associating upstream source article URLs with unrest events. The changes include API schema definitions (OpenAPI and protobuf), backend logic for extracting and deduplicating source URLs from ACLED and GDELT data sources, frontend components for rendering source links in event popups, and corresponding test coverage.

Changes

Cohort / File(s) Summary
API Specifications
docs/api/UnrestService.openapi.json, docs/api/UnrestService.openapi.yaml, docs/api/worldmonitor.openapi.yaml
Added optional sourceUrls field (array of strings) to the UnrestEvent schema across JSON and YAML OpenAPI definitions.
Protocol Buffers & Type Definitions
proto/worldmonitor/unrest/v1/unrest_event.proto, src/types/index.ts
Extended UnrestEvent message with repeated source_urls field (tag 17) and added optional sourceUrls?: string[] property to TypeScript SocialUnrestEvent type.
Backend Core Logic
server/worldmonitor/unrest/v1/_shared.ts, scripts/seed-unrest-events.mjs, src/services/unrest/index.ts
Added utility functions for merging and deduplicating source URLs with a 5-URL cap (MAX_SOURCE_URLS). Updated seed script to extract source URLs from ACLED (url, iso, event_id_cnty) and GDELT (url, source_url) fields, aggregate and deduplicate them per location mention. Updated deduplication logic to consistently merge sourceUrls across all merge branches.
Frontend Components & Styling
src/components/map-popup-source-links.ts, src/components/MapPopup.ts, src/styles/main.css
Created new renderPopupSourceLinks helper that sanitizes URLs, derives display labels (hostname or numbered fallback), and generates escaped HTML anchors. Updated protest popup to render formatted source links and cluster list items to include external source links. Added CSS styling for .popup-source-links flex container and .popup-link elements.
Tests
tests/unrest-source-links.test.mts, tests/seed-unrest-gdelt-fetch.test.mjs
New test module validates deduplication preserves up to 5 URLs in order and confirms unsafe URLs are filtered from popup output. Enhanced GDELT aggregation test to mock source fields and assert sourceUrls are included in aggregated events.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 URLs hop into our events with glee,
From ACLED and GDELT, a source symphony!
Five links per unrest, de-duped with care,
Popups now sparkle with sources to share! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The description covers the key aspects (summary, root cause, changes, validation, risk) but does not follow the provided template structure with checkboxes for type of change and affected areas. Use the repository's PR template: mark the type of change (likely 'New feature'), affected areas (likely 'Map / Globe' and 'API endpoints'), and complete the checklist items to ensure consistency and clarity.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(unrest): surface protest source links' clearly and specifically describes the main change: adding the ability to display protest source links in the UI.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/protest-source-links

Comment @coderabbitai help to get the list of available commands and usage tips.

@lspassos1

Copy link
Copy Markdown
Owner Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Delightful!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/api/UnrestService.openapi.yaml (1)

284-288: Constrain sourceUrls in the API contract.

Line 284 currently defines an unbounded string array. Please encode the link cap and URL shape in schema so clients can rely on it.

Proposed OpenAPI schema refinement
                 sourceUrls:
                     type: array
+                    maxItems: 5
                     items:
                         type: string
+                        format: uri
+                        pattern: '^https?://'
                         description: Source article URLs, when provided by the upstream feed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/api/UnrestService.openapi.yaml` around lines 284 - 288, The OpenAPI
schema for the sourceUrls array is currently unbounded; update the sourceUrls
property in UnrestService.openapi.yaml to constrain it by adding a maxItems
(e.g., 10) and tighten each item to a URL shape by changing the item schema to
type: string with format: uri (or a suitable pattern) so clients can rely on
both a link cap and validated URL format; modify the sourceUrls schema (the
array named "sourceUrls") and its items definition accordingly.
src/components/MapPopup.ts (1)

1524-1528: Avoid sanitizing the same URL twice in cluster rows.

Line 1524 sanitizes during find, then Line 1526 sanitizes again for href. Sanitizing once keeps this path simpler.

♻️ Suggested cleanup
-      const sourceUrl = event.sourceUrls?.find(url => sanitizeUrl(url));
-      const sourceLink = sourceUrl
-        ? ` <a class="popup-link cluster-source-link" href="${sanitizeUrl(sourceUrl)}" target="_blank" rel="noopener noreferrer nofollow">${t('popups.source')} →</a>`
+      const safeSourceUrl = event.sourceUrls
+        ?.map((url) => sanitizeUrl(url))
+        .find((url): url is string => Boolean(url));
+      const sourceLink = safeSourceUrl
+        ? ` <a class="popup-link cluster-source-link" href="${safeSourceUrl}" target="_blank" rel="noopener noreferrer nofollow">${t('popups.source')} →</a>`
         : '';
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/components/MapPopup.ts` around lines 1524 - 1528, The code sanitizes the
same URL twice when building cluster rows; change the logic that currently uses
event.sourceUrls?.find(url => sanitizeUrl(url)) and then calls
sanitizeUrl(sourceUrl) again so you sanitize once and reuse it: iterate or find
the first url, run sanitizeUrl(url) once storing the result (e.g.
sanitizedSourceUrl), check that sanitizedSourceUrl is truthy for presence, and
use sanitizedSourceUrl in the href and link text generation (affecting variables
sourceUrl/sourceLink and the construct that returns the cluster-item string).
scripts/seed-unrest-events.mjs (1)

104-110: Normalize sourceUrls on first dedupe insert as well.

On Line 106, first-write events are stored without passing through mergeSourceUrls(...). Normalizing there too keeps dedupe invariants local and future-proof.

♻️ Suggested hardening
     const existing = unique.get(key);
     if (!existing) {
+      event.sourceUrls = mergeSourceUrls(event.sourceUrls);
       unique.set(key, event);
     } else if (event.sourceType === 'UNREST_SOURCE_TYPE_ACLED' && existing.sourceType !== 'UNREST_SOURCE_TYPE_ACLED') {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/seed-unrest-events.mjs` around lines 104 - 110, When inserting the
first occurrence into the dedupe map (the unique.set(key, event) in the block
where existing is falsy), normalize the event's sourceUrls (and sources) the
same way you do in the ACLED merge path: set event.sourceUrls =
mergeSourceUrls(event.sourceUrls, []) and normalize event.sources (e.g.,
event.sources = [...new Set(event.sources)]) before calling unique.set(key,
event) so the initial entry follows the same dedupe invariants as later merges.
src/components/map-popup-source-links.ts (1)

27-31: Use the sanitized URL for domain extraction too.

href is built from safeUrl, but domain text is derived from raw url. Using safeUrl for both keeps output consistent.

♻️ Suggested tweak
-    const domain = extractSourceDomain(url) || `${label} ${links.length + 1}`;
+    const domain = extractSourceDomain(safeUrl) || `${label} ${links.length + 1}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/components/map-popup-source-links.ts` around lines 27 - 31, The domain
text is being extracted from the raw url while the anchor href uses
sanitizeUrl(url); change extraction to use the sanitized value so displayed
domain matches the href: after const safeUrl = sanitizeUrl(url); compute domain
with extractSourceDomain(safeUrl) || `${label} ${links.length + 1}` and keep
using safeUrl for href; adjust any surrounding code that references
extractSourceDomain(url) to use safeUrl instead (symbols: sanitizeUrl,
extractSourceDomain, linkClass, links).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/api/worldmonitor.openapi.yaml`:
- Around line 21247-21251: The OpenAPI schema for the sourceUrls array is
missing the implementation limit; update the sourceUrls schema to include
maxItems: 5 so it matches the MAX_SOURCE_URLS constant used in
server/worldmonitor/unrest/v1/_shared.ts; ensure the description remains and the
array still has items.type: string and items.description unchanged.

---

Nitpick comments:
In `@docs/api/UnrestService.openapi.yaml`:
- Around line 284-288: The OpenAPI schema for the sourceUrls array is currently
unbounded; update the sourceUrls property in UnrestService.openapi.yaml to
constrain it by adding a maxItems (e.g., 10) and tighten each item to a URL
shape by changing the item schema to type: string with format: uri (or a
suitable pattern) so clients can rely on both a link cap and validated URL
format; modify the sourceUrls schema (the array named "sourceUrls") and its
items definition accordingly.

In `@scripts/seed-unrest-events.mjs`:
- Around line 104-110: When inserting the first occurrence into the dedupe map
(the unique.set(key, event) in the block where existing is falsy), normalize the
event's sourceUrls (and sources) the same way you do in the ACLED merge path:
set event.sourceUrls = mergeSourceUrls(event.sourceUrls, []) and normalize
event.sources (e.g., event.sources = [...new Set(event.sources)]) before calling
unique.set(key, event) so the initial entry follows the same dedupe invariants
as later merges.

In `@src/components/map-popup-source-links.ts`:
- Around line 27-31: The domain text is being extracted from the raw url while
the anchor href uses sanitizeUrl(url); change extraction to use the sanitized
value so displayed domain matches the href: after const safeUrl =
sanitizeUrl(url); compute domain with extractSourceDomain(safeUrl) || `${label}
${links.length + 1}` and keep using safeUrl for href; adjust any surrounding
code that references extractSourceDomain(url) to use safeUrl instead (symbols:
sanitizeUrl, extractSourceDomain, linkClass, links).

In `@src/components/MapPopup.ts`:
- Around line 1524-1528: The code sanitizes the same URL twice when building
cluster rows; change the logic that currently uses event.sourceUrls?.find(url =>
sanitizeUrl(url)) and then calls sanitizeUrl(sourceUrl) again so you sanitize
once and reuse it: iterate or find the first url, run sanitizeUrl(url) once
storing the result (e.g. sanitizedSourceUrl), check that sanitizedSourceUrl is
truthy for presence, and use sanitizedSourceUrl in the href and link text
generation (affecting variables sourceUrl/sourceLink and the construct that
returns the cluster-item string).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: b9ff51db-4ce7-4e9e-9765-5c9512eba6f3

📥 Commits

Reviewing files that changed from the base of the PR and between 3741676 and d056773.

⛔ Files ignored due to path filters (2)
  • src/generated/client/worldmonitor/unrest/v1/service_client.ts is excluded by !**/generated/**
  • src/generated/server/worldmonitor/unrest/v1/service_server.ts is excluded by !**/generated/**
📒 Files selected for processing (13)
  • docs/api/UnrestService.openapi.json
  • docs/api/UnrestService.openapi.yaml
  • docs/api/worldmonitor.openapi.yaml
  • proto/worldmonitor/unrest/v1/unrest_event.proto
  • scripts/seed-unrest-events.mjs
  • server/worldmonitor/unrest/v1/_shared.ts
  • src/components/MapPopup.ts
  • src/components/map-popup-source-links.ts
  • src/services/unrest/index.ts
  • src/styles/main.css
  • src/types/index.ts
  • tests/seed-unrest-gdelt-fetch.test.mjs
  • tests/unrest-source-links.test.mts

Comment on lines +21247 to +21251
sourceUrls:
type: array
items:
type: string
description: Source article URLs, when provided by the upstream feed.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check for source URL capping logic in the codebase

# Search for maxItems or limit constants related to source URLs
rg -i 'maxItems|max.*source.*url|source.*url.*limit|cap.*source' --type yaml --type ts --type js -C 3

Repository: lspassos1/WD

Length of output: 50370


🏁 Script executed:

rg "format: uri|format: url" docs/api/worldmonitor.openapi.yaml | head -20

Repository: lspassos1/WD

Length of output: 38


🏁 Script executed:

rg -A 5 "sourceUrls:" docs/api/worldmonitor.openapi.yaml

Repository: lspassos1/WD

Length of output: 296


Add maxItems constraint to document source URL cap.

The sourceUrls array is capped at 5 URLs in the implementation (MAX_SOURCE_URLS in server/worldmonitor/unrest/v1/_shared.ts), but the schema does not enforce or document this limit. Add maxItems: 5 to match the behavior.

Proposed fix
 sourceUrls:
     type: array
     items:
         type: string
         description: Source article URLs, when provided by the upstream feed.
+    maxItems: 5
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sourceUrls:
type: array
items:
type: string
description: Source article URLs, when provided by the upstream feed.
sourceUrls:
type: array
items:
type: string
description: Source article URLs, when provided by the upstream feed.
maxItems: 5
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/api/worldmonitor.openapi.yaml` around lines 21247 - 21251, The OpenAPI
schema for the sourceUrls array is missing the implementation limit; update the
sourceUrls schema to include maxItems: 5 so it matches the MAX_SOURCE_URLS
constant used in server/worldmonitor/unrest/v1/_shared.ts; ensure the
description remains and the array still has items.type: string and
items.description unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[koala73#131] Add protest source-link fidelity on the map

1 participant