feat(unrest): surface protest source links#30
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis pull request adds end-to-end support for associating upstream source article URLs with unrest events. The changes include API schema definitions (OpenAPI and protobuf), backend logic for extracting and deduplicating source URLs from ACLED and GDELT data sources, frontend components for rendering source links in event popups, and corresponding test coverage. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
@codex review |
|
Codex Review: Didn't find any major issues. Delightful! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
docs/api/UnrestService.openapi.yaml (1)
284-288: ConstrainsourceUrlsin the API contract.Line 284 currently defines an unbounded string array. Please encode the link cap and URL shape in schema so clients can rely on it.
Proposed OpenAPI schema refinement
sourceUrls: type: array + maxItems: 5 items: type: string + format: uri + pattern: '^https?://' description: Source article URLs, when provided by the upstream feed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/api/UnrestService.openapi.yaml` around lines 284 - 288, The OpenAPI schema for the sourceUrls array is currently unbounded; update the sourceUrls property in UnrestService.openapi.yaml to constrain it by adding a maxItems (e.g., 10) and tighten each item to a URL shape by changing the item schema to type: string with format: uri (or a suitable pattern) so clients can rely on both a link cap and validated URL format; modify the sourceUrls schema (the array named "sourceUrls") and its items definition accordingly.src/components/MapPopup.ts (1)
1524-1528: Avoid sanitizing the same URL twice in cluster rows.Line 1524 sanitizes during
find, then Line 1526 sanitizes again forhref. Sanitizing once keeps this path simpler.♻️ Suggested cleanup
- const sourceUrl = event.sourceUrls?.find(url => sanitizeUrl(url)); - const sourceLink = sourceUrl - ? ` <a class="popup-link cluster-source-link" href="${sanitizeUrl(sourceUrl)}" target="_blank" rel="noopener noreferrer nofollow">${t('popups.source')} →</a>` + const safeSourceUrl = event.sourceUrls + ?.map((url) => sanitizeUrl(url)) + .find((url): url is string => Boolean(url)); + const sourceLink = safeSourceUrl + ? ` <a class="popup-link cluster-source-link" href="${safeSourceUrl}" target="_blank" rel="noopener noreferrer nofollow">${t('popups.source')} →</a>` : '';🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/components/MapPopup.ts` around lines 1524 - 1528, The code sanitizes the same URL twice when building cluster rows; change the logic that currently uses event.sourceUrls?.find(url => sanitizeUrl(url)) and then calls sanitizeUrl(sourceUrl) again so you sanitize once and reuse it: iterate or find the first url, run sanitizeUrl(url) once storing the result (e.g. sanitizedSourceUrl), check that sanitizedSourceUrl is truthy for presence, and use sanitizedSourceUrl in the href and link text generation (affecting variables sourceUrl/sourceLink and the construct that returns the cluster-item string).scripts/seed-unrest-events.mjs (1)
104-110: NormalizesourceUrlson first dedupe insert as well.On Line 106, first-write events are stored without passing through
mergeSourceUrls(...). Normalizing there too keeps dedupe invariants local and future-proof.♻️ Suggested hardening
const existing = unique.get(key); if (!existing) { + event.sourceUrls = mergeSourceUrls(event.sourceUrls); unique.set(key, event); } else if (event.sourceType === 'UNREST_SOURCE_TYPE_ACLED' && existing.sourceType !== 'UNREST_SOURCE_TYPE_ACLED') {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/seed-unrest-events.mjs` around lines 104 - 110, When inserting the first occurrence into the dedupe map (the unique.set(key, event) in the block where existing is falsy), normalize the event's sourceUrls (and sources) the same way you do in the ACLED merge path: set event.sourceUrls = mergeSourceUrls(event.sourceUrls, []) and normalize event.sources (e.g., event.sources = [...new Set(event.sources)]) before calling unique.set(key, event) so the initial entry follows the same dedupe invariants as later merges.src/components/map-popup-source-links.ts (1)
27-31: Use the sanitized URL for domain extraction too.
hrefis built fromsafeUrl, but domain text is derived from rawurl. UsingsafeUrlfor both keeps output consistent.♻️ Suggested tweak
- const domain = extractSourceDomain(url) || `${label} ${links.length + 1}`; + const domain = extractSourceDomain(safeUrl) || `${label} ${links.length + 1}`;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/components/map-popup-source-links.ts` around lines 27 - 31, The domain text is being extracted from the raw url while the anchor href uses sanitizeUrl(url); change extraction to use the sanitized value so displayed domain matches the href: after const safeUrl = sanitizeUrl(url); compute domain with extractSourceDomain(safeUrl) || `${label} ${links.length + 1}` and keep using safeUrl for href; adjust any surrounding code that references extractSourceDomain(url) to use safeUrl instead (symbols: sanitizeUrl, extractSourceDomain, linkClass, links).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/api/worldmonitor.openapi.yaml`:
- Around line 21247-21251: The OpenAPI schema for the sourceUrls array is
missing the implementation limit; update the sourceUrls schema to include
maxItems: 5 so it matches the MAX_SOURCE_URLS constant used in
server/worldmonitor/unrest/v1/_shared.ts; ensure the description remains and the
array still has items.type: string and items.description unchanged.
---
Nitpick comments:
In `@docs/api/UnrestService.openapi.yaml`:
- Around line 284-288: The OpenAPI schema for the sourceUrls array is currently
unbounded; update the sourceUrls property in UnrestService.openapi.yaml to
constrain it by adding a maxItems (e.g., 10) and tighten each item to a URL
shape by changing the item schema to type: string with format: uri (or a
suitable pattern) so clients can rely on both a link cap and validated URL
format; modify the sourceUrls schema (the array named "sourceUrls") and its
items definition accordingly.
In `@scripts/seed-unrest-events.mjs`:
- Around line 104-110: When inserting the first occurrence into the dedupe map
(the unique.set(key, event) in the block where existing is falsy), normalize the
event's sourceUrls (and sources) the same way you do in the ACLED merge path:
set event.sourceUrls = mergeSourceUrls(event.sourceUrls, []) and normalize
event.sources (e.g., event.sources = [...new Set(event.sources)]) before calling
unique.set(key, event) so the initial entry follows the same dedupe invariants
as later merges.
In `@src/components/map-popup-source-links.ts`:
- Around line 27-31: The domain text is being extracted from the raw url while
the anchor href uses sanitizeUrl(url); change extraction to use the sanitized
value so displayed domain matches the href: after const safeUrl =
sanitizeUrl(url); compute domain with extractSourceDomain(safeUrl) || `${label}
${links.length + 1}` and keep using safeUrl for href; adjust any surrounding
code that references extractSourceDomain(url) to use safeUrl instead (symbols:
sanitizeUrl, extractSourceDomain, linkClass, links).
In `@src/components/MapPopup.ts`:
- Around line 1524-1528: The code sanitizes the same URL twice when building
cluster rows; change the logic that currently uses event.sourceUrls?.find(url =>
sanitizeUrl(url)) and then calls sanitizeUrl(sourceUrl) again so you sanitize
once and reuse it: iterate or find the first url, run sanitizeUrl(url) once
storing the result (e.g. sanitizedSourceUrl), check that sanitizedSourceUrl is
truthy for presence, and use sanitizedSourceUrl in the href and link text
generation (affecting variables sourceUrl/sourceLink and the construct that
returns the cluster-item string).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: b9ff51db-4ce7-4e9e-9765-5c9512eba6f3
⛔ Files ignored due to path filters (2)
src/generated/client/worldmonitor/unrest/v1/service_client.tsis excluded by!**/generated/**src/generated/server/worldmonitor/unrest/v1/service_server.tsis excluded by!**/generated/**
📒 Files selected for processing (13)
docs/api/UnrestService.openapi.jsondocs/api/UnrestService.openapi.yamldocs/api/worldmonitor.openapi.yamlproto/worldmonitor/unrest/v1/unrest_event.protoscripts/seed-unrest-events.mjsserver/worldmonitor/unrest/v1/_shared.tssrc/components/MapPopup.tssrc/components/map-popup-source-links.tssrc/services/unrest/index.tssrc/styles/main.csssrc/types/index.tstests/seed-unrest-gdelt-fetch.test.mjstests/unrest-source-links.test.mts
| sourceUrls: | ||
| type: array | ||
| items: | ||
| type: string | ||
| description: Source article URLs, when provided by the upstream feed. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check for source URL capping logic in the codebase
# Search for maxItems or limit constants related to source URLs
rg -i 'maxItems|max.*source.*url|source.*url.*limit|cap.*source' --type yaml --type ts --type js -C 3Repository: lspassos1/WD
Length of output: 50370
🏁 Script executed:
rg "format: uri|format: url" docs/api/worldmonitor.openapi.yaml | head -20Repository: lspassos1/WD
Length of output: 38
🏁 Script executed:
rg -A 5 "sourceUrls:" docs/api/worldmonitor.openapi.yamlRepository: lspassos1/WD
Length of output: 296
Add maxItems constraint to document source URL cap.
The sourceUrls array is capped at 5 URLs in the implementation (MAX_SOURCE_URLS in server/worldmonitor/unrest/v1/_shared.ts), but the schema does not enforce or document this limit. Add maxItems: 5 to match the behavior.
Proposed fix
sourceUrls:
type: array
items:
type: string
description: Source article URLs, when provided by the upstream feed.
+ maxItems: 5📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| sourceUrls: | |
| type: array | |
| items: | |
| type: string | |
| description: Source article URLs, when provided by the upstream feed. | |
| sourceUrls: | |
| type: array | |
| items: | |
| type: string | |
| description: Source article URLs, when provided by the upstream feed. | |
| maxItems: 5 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/api/worldmonitor.openapi.yaml` around lines 21247 - 21251, The OpenAPI
schema for the sourceUrls array is missing the implementation limit; update the
sourceUrls schema to include maxItems: 5 so it matches the MAX_SOURCE_URLS
constant used in server/worldmonitor/unrest/v1/_shared.ts; ensure the
description remains and the array still has items.type: string and
items.description unchanged.
Summary
Adds source URL fidelity to unrest/protest events so map popups can link back to upstream articles when GDELT or ACLED provides a usable URL. Closes #25. Refs koala73#131.
Root cause
The unrest contract only preserved source identifiers such as GDELT or ACLED. Seeded protest events could not carry article-level URLs through the API/client boundary, so the map popup had no source link to render.
Changes
source_urlstoUnrestEventand regenerated TS/OpenAPI artifacts.SocialUnrestEvent.sourceUrls.Validation
make generatenpx tsx --test tests/seed-unrest-gdelt-fetch.test.mjsnpm run typechecknpm run typecheck:apinpx biome lint scripts/seed-unrest-events.mjs server/worldmonitor/unrest/v1/_shared.ts src/components/MapPopup.ts src/services/unrest/index.ts tests/seed-unrest-gdelt-fetch.test.mjscompleted with existing MapPopup warnings outside this changegit diff --checkRisk
Low. The new field is additive; existing cached seed payloads without
sourceUrlscontinue to render without links.