batchnode/chanchinthar


title: Chanchinthar
permalink: /about/
layout: page

Chanchinthar (News Portal)

Chanchinthar is an automated Mizo news portal that aims to deliver high-quality, unbiased, ad-free news to the Mizo-speaking community. It uses Google Gemini and a Python automation pipeline to translate and summarize global news into fluent Mizo.


🚀 How It Works (The Pipeline)

The system runs unattended on a local cron schedule. Each run executes six steps in a fixed order:

  1. Reconcile (reconcile_stories.py): Checks for discrepancies between the database and physical files. Ensures no story is lost or misplaced.
  2. Harvest (rss-grabber.py): Pulls RSS feeds from established global news sources (BBC, Reuters, Ars Technica) and collects the raw content.
  3. Process (worker.py):
    • Cleans raw HTML into usable text.
    • Consults previous coverage to avoid duplicates.
    • Uses Gemini 1.5 Flash to translate and summarize news into professional Mizo.
    • Generates Markdown files with structured YAML front matter.
  4. Auto-Cleanup (auto_cleanup.py): Checks for orphaned archive files and posts.
  5. Nuclear Audit (log_cleanup.py): A strict integrity check that ensures every entry in the master JSON log has a 1:1 match with an HTML file and a Markdown file on disk.
  6. Global YAML Repair (repair_all_posts.py): A final safety step that normalizes YAML formatting (quotes, escaping, and line breaks) in every post so that malformed front matter does not break the site build.
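
The fixed ordering above can be sketched as a small Python runner. This is only an illustration: the script names come from the step list, but the `_automation` location of each script, the `run_pipeline` function, and the stop-on-first-failure policy are assumptions, and the real `run_pipeline.sh` may behave differently.

```python
import subprocess
import sys
from pathlib import Path

# Script names are taken from the step list above. Where they live and how
# failures are handled are assumptions made for this sketch.
PIPELINE_STEPS = [
    "reconcile_stories.py",
    "rss-grabber.py",
    "worker.py",
    "auto_cleanup.py",
    "log_cleanup.py",
    "repair_all_posts.py",
]


def run_pipeline(steps=PIPELINE_STEPS, script_dir="_automation", runner=subprocess.run):
    """Run each step in order and return the names of the steps that completed.

    Stops at the first step that exits with a non-zero status, so later
    steps never run on top of a failed earlier stage.
    """
    completed = []
    for name in steps:
        script = Path(script_dir) / name
        result = runner([sys.executable, str(script)])
        if result.returncode != 0:
            print(f"step failed: {name} (exit {result.returncode})", file=sys.stderr)
            break
        completed.append(name)
    return completed
```

Calling `run_pipeline()` from the repository root would launch the six scripts with the current interpreter. The `runner` parameter exists only so the ordering logic can be exercised without running the real scripts.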

🛠️ Technical Stack


📁 Project Structure

  • _posts/: Final generated news articles in Markdown.
  • _automation/: Core Python scripts and prompt templates.
  • news_queue/: Temporary storage for raw HTML files waiting to be processed.
  • news_archive/: Permanent storage for processed raw news content.
  • processed_stories.json: The master database of all processed news.
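
To make the relationship between `processed_stories.json` and the folders concrete, here is a minimal sketch of a log-versus-disk audit. The schema is hypothetical: it assumes the log is a JSON object keyed by story ID and that each story is stored as `<id>.md` and `<id>.html`. The actual format used by `log_cleanup.py` is not documented here, and `audit_log` is an illustrative name.

```python
import json
from pathlib import Path


def audit_log(log_path, posts_dir, archive_dir):
    """Compare log entries against files on disk and report mismatches.

    Assumption (hypothetical schema): the log is a JSON object mapping a
    story ID to a record, with files named <id>.md and <id>.html.
    """
    log = json.loads(Path(log_path).read_text(encoding="utf-8"))
    logged = set(log)  # iterating a dict yields its keys (the story IDs)
    posts = {p.stem for p in Path(posts_dir).glob("*.md")}
    archived = {p.stem for p in Path(archive_dir).glob("*.html")}
    return {
        "missing_post": sorted(logged - posts),        # logged, no Markdown file
        "missing_archive": sorted(logged - archived),  # logged, no HTML file
        "orphan_post": sorted(posts - logged),         # Markdown file, not logged
        "orphan_archive": sorted(archived - logged),   # HTML file, not logged
    }
```

An empty list under every key would mean the log and the files agree one-to-one under these assumptions.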

💻 Local Setup

  1. Install the Jekyll dependencies (requires Ruby and Bundler):
    bundle install
  2. Set Up Python Environment:
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt # feedparser, requests, pyyaml
  3. Configure API Key: Export GEMINI_API_KEY in your shell environment before running the pipeline.
  4. Run Pipeline:
    bash _automation/run_pipeline.sh
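
Since step 3 depends on an environment variable, a script can fail fast when it is missing. The helper below is a hedged sketch: `require_api_key` is a hypothetical name, not a function from this repository, and only the variable name `GEMINI_API_KEY` comes from the setup steps.

```python
import os


def require_api_key(env=os.environ, name="GEMINI_API_KEY"):
    """Return the key with surrounding whitespace removed.

    Raises RuntimeError if the variable is unset or blank, so a cron run
    stops with a clear message instead of failing later at the API call.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the pipeline")
    return key
```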

🤝 Support Us

Chanchinthar is a non-profit, community-driven project. We do not run ads or monetize your attention. Our mission is pure reporting for the Mizo community. If you value this work, please consider supporting us through donations.


Thu dik leh rintlak chiah puanchhuah hi kan thupui ber a ni. (Publishing only what is true and trustworthy is our guiding principle.)
