BlogWatcher

A Go CLI tool to track blog articles, detect new posts, and manage read/unread status. Supports both RSS/Atom feeds and HTML scraping as fallback.

Features

Dual Source Support - Tries RSS feeds first, falls back to HTML scraping
Automatic Feed Discovery - Detects RSS/Atom URLs from blog homepages
Read/Unread Management - Track which articles you've read
Blog Filtering - View articles from specific blogs
Duplicate Prevention - Never tracks the same article twice
Colored CLI Output - User-friendly terminal interface

Installation

# Homebrew (macOS/Linux)
brew install Hyaxia/tap/blogwatcher

# Install the CLI
go install github.com/Hyaxia/blogwatcher/cmd/blogwatcher@latest

# Or build locally
go build ./cmd/blogwatcher

Windows and Linux binaries are also available on the GitHub Releases page.

Usage

Adding Blogs

# Add a blog (auto-discovers RSS feed)
blogwatcher add "My Favorite Blog" https://example.com/blog

# Add with explicit feed URL
blogwatcher add "Tech Blog" https://techblog.com --feed-url https://techblog.com/rss.xml

# Add with HTML scraping selector (for blogs without feeds)
blogwatcher add "No-RSS Blog" https://norss.com --scrape-selector "article h2 a"

# Add with per-blog User-Agent override
blogwatcher add "Blocked Blog" https://blocked.example --feed-url https://blocked.example/feed --user-agent "Mozilla/5.0 ..."

Managing Blogs

# List all tracked blogs
blogwatcher blogs

# Remove a blog (and all its articles)
blogwatcher remove "My Favorite Blog"

# Remove without confirmation
blogwatcher remove "My Favorite Blog" -y

Scanning for New Articles

# Scan all blogs for new articles
blogwatcher scan

# Scan a specific blog
blogwatcher scan "Tech Blog"

# Per-blog User-Agent is configured at add time via --user-agent
# Example above: blogwatcher add ... --user-agent "Mozilla/5.0 ..."

Viewing Articles

# List unread articles
blogwatcher articles

# List all articles (including read)
blogwatcher articles --all

# List articles from a specific blog
blogwatcher articles --blog "Tech Blog"

Managing Read Status

# Mark an article as read (use article ID from articles list)
blogwatcher read 42

# Mark an article as unread
blogwatcher unread 42

# Mark all unread articles as read
blogwatcher read-all

# Mark all unread articles as read for a blog (skip prompt)
blogwatcher read-all --blog "Tech Blog" --yes

How It Works

Scanning Process

For each tracked blog, BlogWatcher first attempts to parse the RSS/Atom feed
If no feed URL is configured, it tries to auto-discover one from the blog homepage
If RSS parsing fails and a scrape_selector is configured, it falls back to HTML scraping
New articles are saved to the database as unread
Already-tracked articles are skipped
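
The steps above can be sketched in Go as follows. This is a minimal illustration, not BlogWatcher's actual code: the Blog and Fetchers types, the scanBlog and newArticles functions, and the errUnavailable sentinel are all hypothetical names. It also assumes that scraping is tried when no feed could be found at all, not only when feed parsing fails. The network operations are passed in as functions so the control flow can be run without any network access.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a hypothetical sentinel used when no feed could be used.
var errUnavailable = errors.New("no usable feed")

// Blog holds the per-blog settings the scan steps refer to (field names assumed).
type Blog struct {
	Name           string
	URL            string
	FeedURL        string
	ScrapeSelector string
}

// Fetchers groups the three network operations as injectable functions.
type Fetchers struct {
	Discover  func(homepage string) (string, error)
	ParseFeed func(feedURL string) ([]string, error)
	Scrape    func(pageURL, selector string) ([]string, error)
}

// scanBlog returns the article URLs found and the source used ("rss" or "scrape").
func scanBlog(b Blog, f Fetchers) ([]string, string, error) {
	feedURL := b.FeedURL
	if feedURL == "" {
		// No feed configured: try to auto-discover one from the homepage.
		if found, err := f.Discover(b.URL); err == nil {
			feedURL = found
		}
	}
	feedErr := errUnavailable
	if feedURL != "" {
		links, err := f.ParseFeed(feedURL)
		if err == nil {
			return links, "rss", nil
		}
		feedErr = err
	}
	// Feed path failed: fall back to scraping only if a selector is configured.
	if b.ScrapeSelector != "" {
		links, err := f.Scrape(b.URL, b.ScrapeSelector)
		if err != nil {
			return nil, "", fmt.Errorf("feed: %v; scrape: %w", feedErr, err)
		}
		return links, "scrape", nil
	}
	return nil, "", feedErr
}

// newArticles keeps only URLs not yet tracked and records them in seen.
func newArticles(links []string, seen map[string]bool) []string {
	var fresh []string
	for _, l := range links {
		if !seen[l] {
			seen[l] = true
			fresh = append(fresh, l)
		}
	}
	return fresh
}

func main() {
	blog := Blog{Name: "No-RSS Blog", URL: "https://norss.com", ScrapeSelector: "article h2 a"}
	stubs := Fetchers{
		Discover:  func(string) (string, error) { return "", errUnavailable },
		ParseFeed: func(string) ([]string, error) { return nil, errUnavailable },
		Scrape: func(pageURL, selector string) ([]string, error) {
			return []string{pageURL + "/post-1", pageURL + "/post-2"}, nil
		},
	}
	links, source, err := scanBlog(blog, stubs)
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	seen := map[string]bool{"https://norss.com/post-1": true}
	fmt.Println(source, newArticles(links, seen))
}
```

Keeping the already-tracked check in a separate function mirrors the last two steps: every scan may return the full article list, and only unseen URLs are stored as unread.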

Feed Auto-Discovery

BlogWatcher searches for feeds in two ways:

Looking for <link rel="alternate"> tags with RSS/Atom types
Checking common feed paths: /feed, /rss, /feed.xml, /atom.xml, etc.
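
Both discovery strategies can be illustrated with a short Go sketch that uses only the standard library. The function names feedLinksFromHTML and candidateFeedURLs are hypothetical, and the regular-expression tag matching is a deliberate simplification; a real implementation would more likely use a proper HTML parser. The path list contains only the four paths named above, since the full list is not given here.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

var (
	// Rough matchers for <link ...> tags and quoted attributes (a simplification).
	linkTagRe = regexp.MustCompile(`(?is)<link\b[^>]*>`)
	attrRe    = regexp.MustCompile(`(?i)([a-z-]+)\s*=\s*["']([^"']*)["']`)
)

// commonFeedPaths lists only the paths named in the text; the real list may be longer.
var commonFeedPaths = []string{"/feed", "/rss", "/feed.xml", "/atom.xml"}

// feedLinksFromHTML returns absolute feed URLs declared via
// <link rel="alternate"> tags whose type mentions RSS or Atom.
func feedLinksFromHTML(homepage, page string) []string {
	base, err := url.Parse(homepage)
	if err != nil {
		return nil
	}
	var feeds []string
	for _, tag := range linkTagRe.FindAllString(page, -1) {
		attrs := map[string]string{}
		for _, m := range attrRe.FindAllStringSubmatch(tag, -1) {
			attrs[strings.ToLower(m[1])] = m[2]
		}
		typ := strings.ToLower(attrs["type"])
		isFeed := strings.Contains(typ, "rss") || strings.Contains(typ, "atom")
		if !strings.EqualFold(attrs["rel"], "alternate") || !isFeed || attrs["href"] == "" {
			continue
		}
		ref, err := url.Parse(attrs["href"])
		if err != nil {
			continue
		}
		// Resolve relative hrefs against the homepage URL.
		feeds = append(feeds, base.ResolveReference(ref).String())
	}
	return feeds
}

// candidateFeedURLs builds the URLs to probe from the common feed paths.
func candidateFeedURLs(homepage string) ([]string, error) {
	base, err := url.Parse(homepage)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, p := range commonFeedPaths {
		out = append(out, base.ResolveReference(&url.URL{Path: p}).String())
	}
	return out, nil
}

func main() {
	page := `<head><link rel="alternate" type="application/rss+xml" href="/feed.xml"></head>`
	fmt.Println(feedLinksFromHTML("https://example.com/blog", page))
	candidates, err := candidateFeedURLs("https://example.com/blog")
	if err != nil {
		fmt.Println("bad homepage URL:", err)
		return
	}
	fmt.Println(candidates)
}
```

In this sketch the candidate URLs are only constructed; each would still have to be fetched and checked for a valid feed before being accepted.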

HTML Scraping

When RSS isn't available, provide a CSS selector that matches article links:

# Example selectors
--scrape-selector "article h2 a"      # Links inside article h2 tags
--scrape-selector ".post-title a"     # Links with post-title class
--scrape-selector "#blog-posts a"     # Links inside blog-posts ID

Database

By default, BlogWatcher stores data in SQLite at ~/.blogwatcher/blogwatcher.db.

To isolate independent blog lists (for example, to avoid accidental read-all across unrelated sets), set BLOGWATCHER_DB:

BLOGWATCHER_DB="$HOME/.blogwatcher/work.db" blogwatcher scan

With BLOGWATCHER_DB unset or empty, BlogWatcher falls back to the default path.
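
The fallback rule can be expressed in a few lines of Go. This is a sketch of the described behavior rather than the project's own code; the function name resolveDBPath is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveDBPath returns envValue when it is non-empty, otherwise the
// default location under the given home directory.
func resolveDBPath(envValue, homeDir string) string {
	if envValue != "" {
		return envValue
	}
	return filepath.Join(homeDir, ".blogwatcher", "blogwatcher.db")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	// os.Getenv returns "" for both unset and empty variables,
	// which matches the "unset or empty" rule.
	fmt.Println(resolveDBPath(os.Getenv("BLOGWATCHER_DB"), home))
}
```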

Database tables:

blogs - Tracked blogs (name, URL, feed URL, scrape selector)
articles - Discovered articles (title, URL, dates, read status)
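
As a rough picture of the two tables, the rows could be modeled with Go structs like the ones below. The field names and types, the ID and BlogID keys, and the split of "dates" into a publication date and a discovery date are all assumptions for illustration; the actual schema may differ.

```go
package main

import (
	"fmt"
	"time"
)

// Blog mirrors a row of the blogs table (column names assumed).
type Blog struct {
	ID             int64
	Name           string
	URL            string
	FeedURL        string // empty when no feed is configured
	ScrapeSelector string // empty when scraping is not configured
}

// Article mirrors a row of the articles table (column names assumed).
type Article struct {
	ID           int64
	BlogID       int64
	Title        string
	URL          string
	PublishedAt  time.Time // assumed: one of the stored dates
	DiscoveredAt time.Time // assumed: one of the stored dates
	Read         bool
}

// unread returns the articles that have not been marked as read.
func unread(articles []Article) []Article {
	var out []Article
	for _, a := range articles {
		if !a.Read {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	articles := []Article{
		{ID: 1, BlogID: 1, Title: "Old post", Read: true},
		{ID: 2, BlogID: 1, Title: "New post", DiscoveredAt: time.Now()},
	}
	for _, a := range unread(articles) {
		fmt.Println(a.ID, a.Title)
	}
}
```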

Development

Requirements

Go 1.24+

Running Tests

# Run all tests
go test ./...

Publishing

In addition to pushing to main, publish a new tag so that Homebrew picks up the updated version:

  git tag vX.Y.Z
  git push origin vX.Y.Z

License

MIT
