Draft
145 changes: 112 additions & 33 deletions scripts/data-transformers/content-generators.ts
@@ -11,7 +11,7 @@

import { escapeHtml } from '../html-utils.js';
import type { Language } from '../types/language.js';
import type { ArticleContentData, WeekAheadData, RawDocument } from './types.js';
import type { ArticleContentData, WeekAheadData, RawDocument, RawCalendarEvent } from './types.js';
import { getPillarTransition } from '../editorial-pillars.js';
import {
L,
@@ -50,6 +50,40 @@ const TITLE_SUFFIX_TEMPLATES: Readonly<Record<string, (t: string) => string>> =
zh: t => `,包括"${t}"`,
};

/** Extract meaningful keywords from text for cross-reference matching (min 2 chars, captures EU, KU, etc.) */
function extractKeywords(text: string): string[] {
return text.toLowerCase().split(/\s+/).filter(w => w.length >= 2);
Copilot AI Feb 27, 2026

The extractKeywords function keeps words of at least 2 characters, which correctly captures acronyms like 'EU', 'KU', and 'AI'. However, it splits on whitespace only, so hyphenated Swedish compounds and punctuation-separated terms stay glued together as single keywords.

For example, if an event title is "Budget-diskussion EU-riktlinjer" (Budget discussion EU guidelines), splitting on \s+ produces "budget-diskussion" and "eu-riktlinjer" as single keywords (after lowercasing), potentially missing matches against documents that use "Budget" or "EU" alone.

Consider splitting on [\s,-]+ to also break on hyphens and commas, or using a more sophisticated tokenization approach for better cross-referencing accuracy.

Suggested change
return text.toLowerCase().split(/\s+/).filter(w => w.length >= 2);
return text.toLowerCase().split(/[\s,-]+/u).filter(w => w.length >= 2);
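
A variation on the suggestion above, offered only as a sketch: instead of discarding the compound, keep both the whole token and its hyphen-separated parts, so a document containing either "eu-riktlinjer" or plain "EU" can still match. The name `extractKeywordsSketch` is hypothetical and not part of this PR.

```typescript
/**
 * Illustrative tokenizer (hypothetical name, not part of the PR).
 * Keeps each whitespace/comma-separated token and also its hyphen-separated
 * parts, so "EU-riktlinjer" yields "eu-riktlinjer", "eu" and "riktlinjer".
 */
function extractKeywordsSketch(text: string): string[] {
  const keywords = new Set<string>();
  for (const token of text.toLowerCase().split(/[\s,]+/u)) {
    keywords.add(token);
    for (const part of token.split('-')) {
      keywords.add(part);
    }
  }
  // Same minimum length as the original: short acronyms such as "eu" survive,
  // while empty strings and single characters are dropped.
  return Array.from(keywords).filter(w => w.length >= 2);
}

console.log(extractKeywordsSketch('Budget-diskussion EU-riktlinjer'));
```

Keeping the compound alongside its parts widens recall; whether the extra matches are wanted depends on how noisy the substring check in findRelatedDocuments turns out to be on real data.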

}

/** Find documents related to a calendar event by organ match or keyword overlap (max 3) */
function findRelatedDocuments(event: RawCalendarEvent, documents: RawDocument[]): RawDocument[] {
const rec = event as Record<string, string>;
const eventOrgan = rec['organ'] ?? '';
const keywords = extractKeywords(rec['rubrik'] ?? rec['titel'] ?? rec['title'] ?? '');
Comment on lines +59 to +62
Copilot AI Feb 27, 2026

The function accesses the organ property via type casting to Record<string, string>, but RawCalendarEvent doesn't define an organ field in scripts/data-transformers/types.ts (lines 12-24). This creates a type safety gap.

Consider either:

  1. Adding organ?: string; to the RawCalendarEvent interface if the MCP server actually returns this field
  2. Or documenting that this field may not exist in production data, which would cause the organ-matching logic to silently fail

The test mocks provide this field, but if production data doesn't include it, the cross-referencing feature will only work via keyword matching, not organ matching.
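
Option 1 could look roughly like the following. This is a sketch under assumptions: `CalendarEventLike` and `DocumentLike` are hypothetical stand-ins for RawCalendarEvent and RawDocument, whose real definitions in types.ts may differ, and `organsMatch` is an illustrative helper rather than code from this PR.

```typescript
// Hypothetical shapes for illustration only; the real interfaces live in
// scripts/data-transformers/types.ts and may differ.
interface CalendarEventLike {
  title?: string;
  organ?: string; // assumed optional: production data may omit it
}

interface DocumentLike {
  titel?: string;
  title?: string;
  organ?: string;
  committee?: string;
}

/** True only when both sides carry a non-empty organ and they match case-insensitively. */
function organsMatch(event: CalendarEventLike, doc: DocumentLike): boolean {
  const eventOrgan = (event.organ ?? '').toLowerCase();
  const docOrgan = (doc.organ ?? doc.committee ?? '').toLowerCase();
  return eventOrgan !== '' && eventOrgan === docOrgan;
}

console.log(organsMatch({ title: 'EU budget vote', organ: 'Kammaren' }, { organ: 'kammaren' })); // true
console.log(organsMatch({ title: 'EU budget vote' }, { organ: 'Kammaren' })); // false
```

With the field declared on the interface, the `as Record<string, string>` cast is no longer needed for organ access, and a missing organ is visible in the type rather than hidden behind the cast.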

return documents.filter(doc => {
const docOrgan = doc.organ ?? doc.committee ?? '';
if (eventOrgan && docOrgan && eventOrgan.toLowerCase() === docOrgan.toLowerCase()) return true;
const docText = (doc.titel ?? doc.title ?? '').toLowerCase();
return keywords.some(kw => docText.includes(kw));
}).slice(0, 3);
}

/** Find written questions related to a calendar event by keyword overlap (max 3) */
function findRelatedQuestions(event: RawCalendarEvent, questions: RawDocument[]): RawDocument[] {
const rec = event as Record<string, string>;
const keywords = extractKeywords(rec['rubrik'] ?? rec['titel'] ?? rec['title'] ?? '');
return questions.filter(q => {
const qText = (q.titel ?? q.title ?? '').toLowerCase();
return keywords.some(kw => qText.includes(kw));
}).slice(0, 3);
}

/** Extract targeted minister name from interpellation summary "till MINISTER" header line */
function extractMinister(summary: string): string {
const m = summary.match(/\btill\s+([^\n]+)/i);
return m ? m[1].trim() : '';
Copilot AI Feb 27, 2026

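The regex pattern /\btill\s+([^\n]+)/i captures everything after "till " up to the end of that line. Because [^\n]+ stops at the first newline, later lines are excluded, but anything else on the same line is captured: if the header carries a topic clause (e.g. "till MINISTER om X"), the result would be "MINISTER om X". Note also that \s+ matches newlines, so an empty "till" line lets the capture start on the following line.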

For robustness, consider trimming or more specifically matching common minister title patterns. Swedish minister names typically follow patterns like "Statsminister NAME" or "Minister för AREA NAME". A more specific pattern could avoid capturing unwanted trailing text.

Suggested change
return m ? m[1].trim() : '';
if (!m) return '';
const raw = m[1].trim();
// Remove common trailing topic clauses (e.g. "om X", "angående Y") and punctuation
const lowerRaw = raw.toLowerCase();
const stopPhrases = [' om ', ' angående ', ' rörande ', ' beträffande '];
let end = raw.length;
for (const phrase of stopPhrases) {
const idx = lowerRaw.indexOf(phrase);
if (idx !== -1 && idx < end) {
end = idx;
}
}
// Also cut at common terminating punctuation if it comes earlier
const punctIdx = raw.search(/[?:;.,]/);
if (punctIdx !== -1 && punctIdx < end) {
end = punctIdx;
}
return raw.slice(0, end).trim();
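
A complementary tightening of the regex itself, sketched here as an assumption rather than a drop-in replacement: anchor "till" to the start of a line and forbid the separator from crossing a newline. `extractMinisterSketch` is a hypothetical name. Compared with the pattern in this PR, a "till" that appears mid-sentence in body text no longer matches, and an empty "till" header line returns an empty string instead of capturing the next line.

```typescript
/**
 * Illustrative variant (hypothetical name). Differences from the PR's regex:
 *  - `^` with the `m` flag only matches "till" at the start of a line
 *  - `[^\S\n]+` allows spaces/tabs after "till" but never a newline
 */
function extractMinisterSketch(summary: string): string {
  const m = summary.match(/^till[^\S\n]+([^\n]+)$/im);
  return m?.[1]?.trim() ?? '';
}

const header = 'Interpellation 2025/26:2\nav Jane Doe\ntill Statsminister Ulf Kristersson\nFråga om bostadspolitiken.';
console.log(extractMinisterSketch(header)); // "Statsminister Ulf Kristersson"

// Empty "till" line: the separator cannot cross the newline, so nothing is captured.
console.log(extractMinisterSketch('av Chris Example\ntill \nDetta är en fråga.') === ''); // true
```

When several header lines start with "till", `match` without the `g` flag returns the first one; whether additional addressees should be collected is a design decision this sketch leaves open.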

}

export function generateWeekAheadContent(data: WeekAheadData, lang: Language | string): string {
const { events, highlights, context } = data;
// Cast to ArticleContentData to access documents field (passed via switch cast)
@@ -89,6 +123,50 @@ export function generateWeekAheadContent(data: WeekAheadData, lang: Language | s
<h3>${dayName ? dayName + ' - ' : ''}${titleHtml}</h3>
<p>${event.description || `${eventTime}: ${event.details || 'Parliamentary session scheduled.'}`}</p>
`;

// Policy Context: cross-reference related documents and questions per event
const relatedPolicyDocs = findRelatedDocuments(event, documents);
const relatedPolicyQs = findRelatedQuestions(event, questions);
if (relatedPolicyDocs.length > 0 || relatedPolicyQs.length > 0) {
const policyContextLabel = lang === 'sv' ? 'Policysammanhang'
: lang === 'de' ? 'Politischer Kontext'
: lang === 'fr' ? 'Contexte politique'
: lang === 'es' ? 'Contexto político'
: lang === 'da' ? 'Politisk kontekst'
: lang === 'no' ? 'Politisk kontekst'
: lang === 'fi' ? 'Poliittinen konteksti'
: lang === 'nl' ? 'Beleidscontext'
: lang === 'ar' ? 'السياق السياسي'
: lang === 'he' ? 'הקשר מדיניות'
: lang === 'ja' ? '政策コンテキスト'
: lang === 'ko' ? '정책 맥락'
: lang === 'zh' ? '政策背景'
: 'Policy Context';
content += ` <div class="policy-context-box">\n`;
content += ` <h4>${policyContextLabel}</h4>\n`;
relatedPolicyDocs.forEach(doc => {
const drec = doc as Record<string, string>;
const docTitle = drec['titel'] ?? drec['title'] ?? 'Document';
const dokId = drec['dok_id'] ?? '';
const docUrl = dokId ? sanitizeUrl(`https://riksdagen.se/sv/dokument-och-lagar/dokument/${encodeURIComponent(dokId)}/`) : '';
content += ` <div class="document-entry">\n`;
content += ` <h5>${docUrl ? `<a href="${docUrl}" target="_blank" rel="noopener noreferrer">` : ''}${svSpan(escapeHtml(docTitle), lang)}${docUrl ? '</a>' : ''}</h5>\n`;
const sig = generatePolicySignificance(doc, lang);
if (sig) content += ` <p class="policy-significance">${escapeHtml(sig)}</p>\n`;
content += ` </div>\n`;
});
relatedPolicyQs.forEach(q => {
const qrec = q as Record<string, string>;
const qTitle = qrec['titel'] ?? qrec['title'] ?? 'Question';
const qParty = qrec['parti'] ? ` (${escapeHtml(qrec['parti'])})` : '';
const qDokId = qrec['dok_id'] ?? '';
const qUrl = qDokId ? sanitizeUrl(`https://riksdagen.se/sv/dokument-och-lagar/dokument/${encodeURIComponent(qDokId)}/`) : '';
content += ` <div class="document-entry">\n`;
content += ` <h5>${qUrl ? `<a href="${qUrl}" target="_blank" rel="noopener noreferrer">` : ''}${svSpan(escapeHtml(qTitle), lang)}${qUrl ? '</a>' : ''}${qParty}</h5>\n`;
content += ` </div>\n`;
});
content += ` </div>\n`;
}
});
}

@@ -144,22 +222,22 @@ export function generateWeekAheadContent(data: WeekAheadData, lang: Language | s
});
}

// Parliamentary Questions: upcoming written questions to ministers
// Questions to Watch: upcoming written questions cross-referenced with debate topics
if (questions.length > 0) {
const questionsLabel = lang === 'sv' ? 'Skriftliga frågor till statsråd'
: lang === 'de' ? 'Schriftliche parlamentarische Anfragen'
: lang === 'fr' ? 'Questions écrites au gouvernement'
: lang === 'es' ? 'Preguntas escritas al gobierno'
: lang === 'da' ? 'Skriftlige spørgsmål til ministrene'
: lang === 'no' ? 'Skriftlige spørsmål til statsrådene'
: lang === 'fi' ? 'Kirjalliset kysymykset ministerille'
: lang === 'nl' ? 'Schriftelijke vragen aan ministers'
: lang === 'ar' ? 'أسئلة مكتوبة للحكومة'
: lang === 'he' ? 'שאלות כתובות לממשלה'
: lang === 'ja' ? '大臣への書面質問'
: lang === 'ko' ? '장관에 대한 서면 질문'
: lang === 'zh' ? '书面质询政府'
: 'Parliamentary Questions to Ministers';
const questionsLabel = lang === 'sv' ? 'Frågor att bevaka'
: lang === 'de' ? 'Zu beobachtende Anfragen'
: lang === 'fr' ? 'Questions à surveiller'
: lang === 'es' ? 'Preguntas a seguir'
: lang === 'da' ? 'Spørgsmål at holde øje med'
: lang === 'no' ? 'Spørsmål å følge med på'
: lang === 'fi' ? 'Seurattavat kysymykset'
: lang === 'nl' ? 'Te volgen vragen'
: lang === 'ar' ? 'أسئلة تستحق المتابعة'
: lang === 'he' ? 'שאלות לעקוב'
: lang === 'ja' ? '注目の質問'
: lang === 'ko' ? '주목할 질문'
: lang === 'zh' ? '值得关注的问题'
: 'Questions to Watch';
content += `\n <h2>${questionsLabel}</h2>\n`;
questions.slice(0, 8).forEach(q => {
const rec = q as Record<string, string>;
@@ -174,32 +252,32 @@ export function generateWeekAheadContent(data: WeekAheadData, lang: Language | s
});
}

// Interpellations: formal parliamentary interpellations awaiting ministerial response
// Interpellation Spotlight: formal interpellations enriched with minister response context
if (interpellations.length > 0) {
const interLabel = lang === 'sv' ? 'Interpellationer under behandling'
: lang === 'de' ? 'Interpellationen in Bearbeitung'
: lang === 'fr' ? 'Interpellations en cours'
: lang === 'es' ? 'Interpelaciones en curso'
: lang === 'da' ? 'Forespørgsler til behandling'
: lang === 'no' ? 'Interpellasjoner til behandling'
: lang === 'fi' ? 'Käsittelyssä olevat välikysymykset'
: lang === 'nl' ? 'Interpellaties in behandeling'
: lang === 'ar' ? 'الاستجوابات البرلمانية قيد المعالجة'
: lang === 'he' ? 'בקשות הבהרה בטיפול'
: lang === 'ja' ? '処理中の質問主意書'
: lang === 'ko' ? '처리 중인 대정부 질문'
: lang === 'zh' ? '待处理的质询'
: 'Interpellations Pending';
const interLabel = lang === 'sv' ? 'Interpellationer i fokus'
: lang === 'de' ? 'Interpellationen im Fokus'
: lang === 'fr' ? 'Interpellations en vedette'
: lang === 'es' ? 'Interpelaciones destacadas'
: lang === 'da' ? 'Forespørgsler i fokus'
: lang === 'no' ? 'Interpellasjoner i fokus'
: lang === 'fi' ? 'Välikysymykset valokeilassa'
: lang === 'nl' ? 'Interpellaties in de spotlight'
: lang === 'ar' ? 'أبرز الاستجوابات البرلمانية'
: lang === 'he' ? 'בקשות הבהרה בזרקור'
: lang === 'ja' ? '注目の質問主意書'
: lang === 'ko' ? '주목할 대정부 질문'
: lang === 'zh' ? '质询聚焦'
: 'Interpellation Spotlight';
content += `\n <h2>${interLabel}</h2>\n`;
interpellations.slice(0, 8).forEach(interp => {
const rec = interp as Record<string, string>;
const titleText = rec['titel'] || rec['title'] || 'Interpellation';
const party = rec['parti'] ? ` (${escapeHtml(rec['parti'])})` : '';
const dok_id = rec['dok_id'] ?? '';
const iUrl = dok_id ? sanitizeUrl(`https://riksdagen.se/sv/dokument-och-lagar/dokument/${encodeURIComponent(dok_id)}/`) : '';
// Extract clean summary: content starts after "till MINISTER\n" line
// Extract minister and clean summary from the header lines
const rawSummary = rec['summary'] ?? '';
// Find start of actual content after the header lines (Interpellation NNN / av AUTHOR / till MINISTER)
const ministerName = extractMinister(rawSummary);
const tillMatch = rawSummary.match(/\btill\s+[^\n]+\n\s*/i);
const contentStart = tillMatch
? rawSummary.indexOf(tillMatch[0]) + tillMatch[0].length
@@ -215,6 +293,7 @@ export function generateWeekAheadContent(data: WeekAheadData, lang: Language | s
content += ` <div class="document-entry">\n`;
content += ` <h4>${iUrl ? `<a href="${iUrl}" target="_blank" rel="noopener noreferrer">` : ''}${svSpan(escapeHtml(titleText), lang)}${iUrl ? '</a>' : ''}</h4>\n`;
if (party) content += ` <p class="policy-significance">${escapeHtml(party)}</p>\n`;
if (ministerName) content += ` <p class="minister-target">→ ${svSpan(escapeHtml(ministerName), lang)}</p>\n`;
if (cleanedSummary) content += ` <p>${svSpan(escapeHtml(cleanedSummary) + '…', lang)}</p>\n`;
content += ` </div>\n`;
});
16 changes: 6 additions & 10 deletions scripts/news-types/week-ahead.ts
@@ -41,16 +41,12 @@
* 6. **Party Positioning**: Known party stances on upcoming votes
* 7. **International Context**: EU/Nordic cooperation dimensions
*
* **MCP DATA SOURCE:**
* Primary tool: get_calendar_events
* - Retrieves riksdag calendar for specified date range
* - Includes session times, committee assignments, topics
* - Enables systematic prospective coverage
*
* TODO: Implement additional tools for comprehensive analysis:
* - search_dokument: Find related policy documents for calendar items
* - get_fragor: Written questions related to upcoming debates
* - get_interpellationer: Interpellations (parliamentary questions) for upcoming sessions
* **MCP DATA SOURCES (all five tools actively used):**
* - get_calendar_events: upcoming committee/chamber sessions (primary driver)
* - search_dokument: policy documents cross-referenced per calendar event (Policy Context boxes)
* - search_anforanden: recent speeches providing debate context
* - get_fragor: written questions linked to upcoming debates (Questions to Watch section)
* - get_interpellationer: interpellations enriched with minister response context (Interpellation Spotlight)
*
* **OPERATIONAL WORKFLOW:**
* 1. Calculate Date Range: Get calendar for next 7 calendar days
91 changes: 91 additions & 0 deletions tests/news-types/week-ahead.test.ts
@@ -436,6 +436,97 @@ describe('Week-Ahead Article Generation', () => {
});
});

describe('Enhanced Cross-Referencing', () => {
it('should show Policy Context box when event organ matches document organ', async () => {
// Provide a high-priority event (contains 'EU' to pass isHighPriority) with organ matching the doc
mockClientInstance.fetchCalendarEvents.mockResolvedValue([{
id: '1', title: 'EU budget vote', date: '2026-02-16', type: 'chamber', organ: 'Kammaren',
}]);
mockClientInstance.searchDocuments.mockResolvedValue([{
titel: 'Budget Proposition 2026',
dok_id: 'H901prop1',
doktyp: 'prop',
organ: 'Kammaren',
}]);

const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
Comment on lines +452 to +454
Copilot AI Feb 27, 2026

The tests use result.articles[0]! (non-null assertion) to access the first article. The preceding assertion only checks that result.success is true, so if article generation silently produces zero articles, article is undefined and the next line throws a TypeError instead of reporting a clear failure message.

Consider using optional chaining (result.articles[0]?.html) or an explicit length check with a descriptive error message, so that failures are easier to debug when articles aren't generated as expected.
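
One way to express the explicit check is a small test helper, sketched below. `GeneratedArticleLike`, `GenerationResultLike` and `firstArticleOrThrow` are hypothetical names; the real result type returned by generateWeekAhead may carry more fields.

```typescript
// Hypothetical minimal shapes; the real result type may have more fields.
interface GeneratedArticleLike {
  html: string;
}

interface GenerationResultLike {
  success: boolean;
  articles: GeneratedArticleLike[];
}

/** Returns the first article, or fails with a message that names the actual problem. */
function firstArticleOrThrow(result: GenerationResultLike): GeneratedArticleLike {
  const [article] = result.articles;
  if (!article) {
    throw new Error(
      `Expected at least one generated article, got ${result.articles.length} (success=${result.success})`
    );
  }
  return article;
}

const ok = firstArticleOrThrow({ success: true, articles: [{ html: '<h2>Questions to Watch</h2>' }] });
console.log(ok.html);
```

In the tests, `const article = firstArticleOrThrow(result);` would replace `result.articles[0]!`, so an empty article list reports the count instead of a property-access error on undefined.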

// 'EU budget vote' event has organ 'Kammaren', matching the document's organ
expect(article.html).toContain('Policy Context');
expect(article.html).toContain('policy-context-box');
});
Comment on lines +440 to +458
Copilot AI Feb 27, 2026

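This test is named for organ matching, but it cannot isolate that path: the event title ("EU budget vote") and the document title ("Budget Proposition 2026") both contain "budget", so keyword overlap alone would also produce the Policy Context box. The suite also doesn't test:

  1. The organ-matching code path in isolation (which has a type safety issue)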
  2. Cases where no documents/questions match any events
  3. Cases where more than 3 matches exist (to verify the slice(0,3) limit works)
  4. Events that are high-priority but have no related content

Consider adding test cases to exercise all code paths in the cross-referencing logic.
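
To make cases 2 and 3 concrete, here is a self-contained sketch that re-implements the keyword-overlap filter from this PR for illustration only. `DocLike` and `findByKeywordOverlap` are hypothetical names; real tests would drive generateWeekAhead through the existing mocks instead.

```typescript
interface DocLike {
  titel?: string;
  title?: string;
}

// Re-implementation of the PR's keyword-overlap filter, for illustration only.
function findByKeywordOverlap(eventTitle: string, docs: DocLike[], max = 3): DocLike[] {
  const keywords = eventTitle.toLowerCase().split(/\s+/).filter(w => w.length >= 2);
  return docs
    .filter(doc => {
      const docText = (doc.titel ?? doc.title ?? '').toLowerCase();
      return keywords.some(kw => docText.includes(kw));
    })
    .slice(0, max);
}

// Five matching documents: the cap should keep only the first three.
const fiveBudgetDocs: DocLike[] = [1, 2, 3, 4, 5].map(n => ({ titel: `Budget proposition ${n}` }));
console.log(findByKeywordOverlap('EU budget vote', fiveBudgetDocs).length); // 3

// No keyword appears in the title: nothing is returned.
console.log(findByKeywordOverlap('EU budget vote', [{ titel: 'Skolfrågor' }]).length); // 0
```

Because the check is a substring match, two-character keywords such as "eu" can hit inside longer words, so fixture titles for the no-match case need to be chosen with that in mind.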


it('should show Questions to Watch section label', async () => {
mockClientInstance.fetchWrittenQuestions.mockResolvedValue([{
titel: 'Question about budget funding',
dok_id: 'H901fr1',
parti: 'V',
}]);

const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
expect(article.html).toContain('Questions to Watch');
});

it('should show Swedish Questions to Watch label in sv version', async () => {
mockClientInstance.fetchWrittenQuestions.mockResolvedValue([{
titel: 'Fråga om budgetanslaget',
dok_id: 'H901fr2',
parti: 'S',
}]);

const result = await weekAheadModule.generateWeekAhead({ languages: ['sv'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
expect(article.html).toContain('Frågor att bevaka');
});

it('should show Interpellation Spotlight section label', async () => {
mockClientInstance.fetchInterpellations.mockResolvedValue([{
titel: 'Question about housing policy',
dok_id: 'H901ip1',
parti: 'S',
summary: 'Interpellation 2025/26:1\nav John Doe\ntill Statsminister Ulf Kristersson\nDetta är en fråga om bostadspolitiken.',
}]);

const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
expect(article.html).toContain('Interpellation Spotlight');
});

it('should extract and display minister name from interpellation summary', async () => {
mockClientInstance.fetchInterpellations.mockResolvedValue([{
titel: 'Question about housing policy',
dok_id: 'H901ip2',
parti: 'MP',
summary: 'Interpellation 2025/26:2\nav Jane Doe\ntill Statsminister Ulf Kristersson\nFråga om bostadspolitiken.',
}]);

const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
expect(article.html).toContain('minister-target');
expect(article.html).toContain('Statsminister Ulf Kristersson');
});

Copilot AI Feb 27, 2026

The test verifies that minister names are extracted and displayed, but doesn't test edge cases where:

  1. The summary has no "till" line (extractMinister returns empty string)
  2. The summary has malformed minister lines
  3. Multiple "till" patterns appear

Consider adding test cases for these edge cases to ensure the feature degrades gracefully when interpellation summaries don't follow the expected format.

Suggested change
it('should handle interpellation summary without minister line gracefully', async () => {
mockClientInstance.fetchInterpellations.mockResolvedValue([{
titel: 'Question about transport policy',
dok_id: 'H901ip4',
parti: 'M',
summary: 'Interpellation 2025/26:4\nav Alex Example\nDetta är en fråga om transportpolitiken utan specifik ministerrad.',
}]);
const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
// When no minister can be extracted, component should degrade gracefully and not render a minister-target
expect(article.html).not.toContain('minister-target');
});
it('should handle malformed minister line without breaking rendering', async () => {
mockClientInstance.fetchInterpellations.mockResolvedValue([{
titel: 'Question about education policy',
dok_id: 'H901ip5',
parti: 'L',
// "till" appears but no valid minister name follows
summary: 'Interpellation 2025/26:5\nav Chris Example\ntill \nDetta är en fråga om utbildningspolitiken.',
}]);
const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
// Malformed minister line should not cause a crash or render an empty minister-target
expect(article.html).not.toContain('minister-target');
});
it('should handle multiple "till" patterns in summary without failing', async () => {
mockClientInstance.fetchInterpellations.mockResolvedValue([{
titel: 'Question about climate and finance policy',
dok_id: 'H901ip6',
parti: 'C',
summary: 'Interpellation 2025/26:6\nav Pat Example\ntill Klimatminister Romina Pourmokhtari\noch till Finansminister Elisabeth Svantesson\nDetta är en fråga om klimat- och finanspolitiken.',
}]);
const result = await weekAheadModule.generateWeekAhead({ languages: ['en'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
// The spotlight section should still render; parsing ambiguities must not break generation
expect(article.html).toContain('Interpellation Spotlight');
});

it('should show Swedish Interpellation Spotlight label in sv version', async () => {
mockClientInstance.fetchInterpellations.mockResolvedValue([{
titel: 'Interpellation om miljöpolitik',
dok_id: 'H901ip3',
parti: 'MP',
summary: 'Interpellation 2025/26:3\nav Eva Svensson\ntill Klimatminister Romina Pourmokhtari\nFråga om klimatpolitiken.',
}]);

const result = await weekAheadModule.generateWeekAhead({ languages: ['sv'] });
expect(result.success).toBe(true);
const article = result.articles[0]!;
expect(article.html).toContain('Interpellationer i fokus');
});
});

describe('Integration with Writer', () => {
it('should call writeArticle function if provided', async () => {
const mockWriter = vi.fn();