A news reader that puts quality and clarity first.
Most news interfaces are designed for engagement, not comprehension. They are cluttered, dense, and built to keep you scrolling rather than to leave you informed. Raw feeds mix sources of varying quality, introduce translation artefacts, and reproduce the noise of the original publication without filtering any of it.
This project places an LLM between raw news and the reader, merging sources on the same event into a single well-written article and presenting it in a clean, distraction-free interface.
- Fetches news from RSS feeds and open publishers on a schedule
- Merges multiple sources on the same event and rewrites the result via LLM: correct spelling and grammar, no mixed languages, configurable tone
- Groups news by topic sections (politics, society, culture, etc.) — click a section to filter the feed
- Presents content in a clean, accessible interface with large fonts, high contrast, and large touch targets
- Supports multiple users with independent profiles and preferences
- Provides text-to-speech when the browser supports it
- Shows a daily digest — content is ready when you open the app, no waiting
Anyone who wants to read the news without the noise. Users configure their own feed; a family member or caregiver can optionally set up an account on behalf of someone else.
Neither the end user nor anyone acting on their behalf ever touches the codebase. They access the web app only.
- Docker and docker-compose
The Ollama service is optional in Compose (profile `local-llm`). Clustering, embeddings, and rewrites need a reachable Ollama. `.env.example` sets `COMPOSE_PROFILES=local-llm`, so a copied `.env` starts Ollama in Docker and runs `ollama-init` once per `up` (pulls `qwen2.5:7b`, `qwen2.5:3b`, and `nomic-embed-text` into the `ollama_data` volume). The worker waits for that init to finish before running. Model pulls happen at container start, not during `docker build`.
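As a minimal sketch, the relevant lines of a copied `.env` look like this (values mirror what `.env.example` describes; check your checkout for the authoritative file):

```ini
# Start Ollama inside Docker; ollama-init pulls the models on first `up`.
COMPOSE_PROFILES=local-llm

# To use a host Ollama instead, remove the line above and point the
# worker at the host, e.g.:
# OLLAMA_HOST=http://host.docker.internal:11434
```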
With Ollama in Docker (typical):

```bash
git clone https://github.com/etorhub/dossier.git
cd dossier
cp .env.example .env
# GPU (default ollama image expects NVIDIA)
docker compose up --build -d
# No NVIDIA GPU: add the CPU override for the ollama service
# docker compose -f docker-compose.yml -f docker-compose.cpu.yml up --build -d
```

Without a `.env`, pass the profile explicitly: `docker compose --profile local-llm up --build -d`.
Ollama on the host instead (e.g. the Windows app or `ollama serve` in WSL): remove or comment out `COMPOSE_PROFILES=local-llm` in `.env`, pull the same model tags on the host (`ollama pull …`), and set `OLLAMA_HOST` for the worker (for example `http://host.docker.internal:11434` on Docker Desktop / WSL). See `.env.example`.
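For example, using the model tags listed above, the host-side pulls amount to (illustrative commands, not a script shipped with the repo):

```bash
# Pull the same models the ollama-init container would fetch
ollama pull qwen2.5:7b
ollama pull qwen2.5:3b
ollama pull nomic-embed-text
```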
Wait for services to be healthy (web at http://localhost:5000, worker running, ollama healthy if you use the profile). Then populate with news:
```bash
./scripts/fetch-news.sh
```

The script fetches feeds, extracts full text, clusters articles, and rewrites them. When it finishes, the app has real content.
Operators can monitor the pipeline at the ops dashboard: http://localhost:5001. It shows job runs, feed health, source availability, articles, stories, and user activity. No authentication by default (restrict access at the network level).
```bash
docker compose up -d ops
```

A default admin is ready to use: `admin@admin.com` / `admin`. Log in to access the app.
To grant admin privileges to another user (for future use):
```bash
docker compose exec web flask make-admin your@email.com
```

See `docs/ADMIN_DASHBOARD.md` for ops dashboard documentation.
The scheduler runs jobs on a schedule. To run them manually:
| Command | Where | Description |
|---|---|---|
| `flask seed-sources` | Web | Load sources from `config/sources.yaml` (auto-run on startup) |
| `python -m app.worker_cli fetch-feeds` | Worker | Fetch all due RSS feeds |
| `python -m app.worker_cli enrich-articles` | Worker | Extract full article content for pending articles |
| `python -m app.worker_cli cluster-articles` | Worker | Embed and cluster today's articles |
| `python -m app.worker_cli rewrite-articles` | Worker | Rewrite articles for all user profiles |
| `python -m app.worker_cli run-pipeline` | Worker | Full pipeline once (seed → fetch → enrich → cluster → rewrite) |
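The `run-pipeline` stage order can be sketched as a tiny orchestrator. This is a conceptual illustration with stub stages and hypothetical function names — the real implementation lives in `app/worker_cli` and does actual I/O against PostgreSQL and Ollama:

```python
# Hypothetical sketch of the pipeline order: seed -> fetch -> enrich -> cluster -> rewrite.
# Each stage is a no-op stub here; the real worker stages talk to the database and the LLM.

DEFAULT_STAGES = [
    ("seed", lambda: None),     # load sources from config/sources.yaml
    ("fetch", lambda: None),    # fetch all due RSS feeds
    ("enrich", lambda: None),   # extract full article text
    ("cluster", lambda: None),  # embed and cluster today's articles
    ("rewrite", lambda: None),  # LLM rewrite per user profile
]

def run_pipeline(stages=None):
    """Run each stage in order; return the names of the stages executed."""
    executed = []
    for name, stage in (stages or DEFAULT_STAGES):
        stage()
        executed.append(name)
    return executed

print(run_pipeline())  # ['seed', 'fetch', 'enrich', 'cluster', 'rewrite']
```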
With Docker:
```bash
docker compose exec worker python -m app.worker_cli run-pipeline
```

Or use `./scripts/fetch-news.sh` for the same result.
```bash
# Requires Python 3.12+ and a running PostgreSQL instance
pip install -r requirements.txt
flask run
```

| Layer | Technology |
|---|---|
| Backend | Python 3.12+ / Flask |
| Database | PostgreSQL 18 |
| LLM | Ollama (local, no API key) |
| Frontend | HTML + CSS + HTMX (no JavaScript frameworks) |
| Scheduling | APScheduler (worker container) |
| Packaging | Docker + docker-compose |
See docs/TECH_STACK.md for full details.
| Document | Description |
|---|---|
| CONTRIBUTING.md | How to contribute — setup, code standards, commits, PRs |
| CODE_OF_CONDUCT.md | Community standards and enforcement |
| SECURITY.md | Security policy and vulnerability reporting |
| CLAUDE.md | AI assistant context (Claude Code) — coding rules, architecture constraints, design principles |
| .cursor/rules/ | Cursor IDE rules — same context via project-context.mdc (always apply) plus architecture, accessibility, LLM, news-source-discovery |
| docs/TECH_STACK.md | Tech stack, project structure, dependencies, Docker setup |
| docs/ARCHITECTURE.md | System architecture, database schema, component map, request lifecycle |
| docs/ADMIN_DASHBOARD.md | Ops dashboard: pipeline monitoring, job history, source availability, user activity, incidents |
| docs/I18N.md | Internationalization: locale selection, translation catalogs, updating strings |
| docs/MVP_PLAN.md | Phased MVP plan with tasks and success criteria |
| docs/news_source_discovery_agent.md | News source discovery pipeline specification |
Accessibility is a constraint, not a feature. Good defaults benefit all users:
- Minimum 48x48px touch targets on all interactive elements
- Base font size 22px, line height 1.6
- WCAG AA contrast minimum (4.5:1), AAA target (7:1) in high-contrast mode
- One article at a time — no infinite scroll
- Text-to-speech via Web Speech API (hidden when not supported)
- Semantic HTML throughout
- No hover-only interactions, no timed content
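The 4.5:1 (AA) and 7:1 (AAA) figures come from the WCAG contrast-ratio definition, which compares the relative luminance of foreground and background. As a standalone sketch (a helper for checking color pairs, not part of this codebase):

```python
# WCAG 2.x contrast ratio between two sRGB colors — an illustration of the
# 4.5:1 AA / 7:1 AAA thresholds above, not project code.

def _linearize(c: int) -> float:
    """Linearize one sRGB channel (0-255) per the WCAG relative-luminance formula."""
    c = c / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB color."""
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05); ranges from 1:1 to 21:1."""
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# Black on white is the maximum possible contrast, 21:1 — well past AAA.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```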
AGPL-3.0. See LICENSE for details.
The project is a reading aid, not a republisher. Every article links to and credits the original source. Copyright remains with the publisher.