Feature/multi source scraper #62

Open
jannicaSD wants to merge 2 commits into codeforpakistan:main from jannicaSD:feature/multi-source-scraper

Conversation

jannicaSD commented Mar 20, 2026

Summary

  • Improved representatives scraper workflow
  • Added multi-source scraping structure
  • Fixed image download stream handling
  • Updated env loading behavior for local CLI scripts
  • Improved DB migration reliability with pgvector extension setup
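To illustrate the env loading item, here is a minimal sketch of how a CLI script might read `.env.local` style text without overriding variables that are already set. The names `parseEnvText` and `applyEnv` are hypothetical, and the actual `lib/env.mjs` in this PR may well use a library instead of hand-rolled parsing.

```typescript
// Hypothetical helpers for illustration; not taken from lib/env.mjs.
type EnvMap = Record<string, string>;

// Parse KEY=VALUE lines, skipping blank lines and # comments.
function parseEnvText(text: string): EnvMap {
  const out: EnvMap = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" at all, or an empty key
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    out[key] = value;
  }
  return out;
}

// Copy parsed values into a target env, keeping values that already exist
// so that variables exported in the shell take precedence over the file.
function applyEnv(
  target: Record<string, string | undefined>,
  parsed: EnvMap,
): void {
  for (const key of Object.keys(parsed)) {
    if (target[key] === undefined) target[key] = parsed[key];
  }
}
```

In a real script the text would come from reading `.env.local` with `fs.readFileSync` and the target would be `process.env`; both are left out here so the sketch stays free of file system access.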

What I changed

  • scripts/scrape-representatives.ts
    • Refactored source handling
    • Fixed image write stream issue
    • Improved URL normalization and output handling
  • lib/env.mjs
    • Load .env.local for CLI commands
  • lib/db/migrate.ts
    • Ensure vector extension exists before migrations
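The migration change can be sketched as an ordering guarantee: the `vector` extension is created before any migration that might declare a vector column. `SqlRunner`, `runMigrationsWithVector`, and the injected `runMigrations` callback are hypothetical stand-ins, and the real `lib/db/migrate.ts` may be wired differently. The SQL statement itself is the usual way to enable pgvector.

```typescript
// Hypothetical minimal interface; a real Postgres client would be
// adapted to this shape.
interface SqlRunner {
  query(sql: string): Promise<unknown>;
}

const ENSURE_VECTOR_SQL = "CREATE EXTENSION IF NOT EXISTS vector";

async function runMigrationsWithVector(
  db: SqlRunner,
  runMigrations: () => Promise<void>,
): Promise<void> {
  // Idempotent: succeeds whether or not the extension is already installed.
  await db.query(ENSURE_VECTOR_SQL);
  // Only now run the migrations, which may reference the vector type.
  await runMigrations();
}
```

Passing the migration runner in as a callback keeps the sketch independent of any particular migration tool. Note that the database role needs permission to create extensions for the first statement to succeed.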

Validation

  • npm run db:generate
  • npm run db:migrate (successful)
  • npm run scrape:representatives (runs successfully)

Notes

  • .env.local is ignored and not committed.

Known limitations

  • Provincial assembly URLs currently go through a generic parser, so some sources may return zero records until source-specific selectors are added.
  • The National Assembly source is the one currently verified.
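The limitation above could be made visible at run time by recording which sources fell back to the generic parser and returned nothing. This is a sketch under assumptions: the `Source` shape, `genericParser`, its `li.member` pattern, and the placeholder URLs are all hypothetical and not taken from `scripts/scrape-representatives.ts`.

```typescript
interface Representative {
  name: string;
}

type Parser = (html: string) => Representative[];

interface Source {
  id: string;
  url: string; // placeholder URLs only in this sketch
  parser?: Parser; // source-specific parser, if one has been written
}

// Hypothetical generic parser: only recognizes <li class="member">Name</li>.
const genericParser: Parser = (html) => {
  const out: Representative[] = [];
  const re = /<li class="member">([^<]+)<\/li>/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    out.push({ name: m[1].trim() });
  }
  return out;
};

interface SourceResult {
  id: string;
  records: Representative[];
  usedGeneric: boolean;
}

// Use the source-specific parser when present, else fall back to the generic one.
function parseSource(source: Source, html: string): SourceResult {
  const usedGeneric = source.parser === undefined;
  const parser = source.parser ?? genericParser;
  return { id: source.id, records: parser(html), usedGeneric };
}

// Sources that fell back to the generic parser and produced no records
// are the ones most likely to need their own selectors.
function sourcesNeedingSelectors(results: SourceResult[]): string[] {
  return results
    .filter((r) => r.usedGeneric && r.records.length === 0)
    .map((r) => r.id);
}
```

Logging the output of `sourcesNeedingSelectors` at the end of a run would distinguish "no data published" from "parser does not match this page", which a bare zero count cannot do.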

vercel Bot commented Mar 20, 2026

@jannicaSD is attempting to deploy a commit to the passion projects Team on Vercel.

A member of the Team first needs to authorize it.


2 participants