A Go CLI tool to track blog articles, detect new posts, and manage read/unread status. Supports RSS/Atom feeds, with HTML scraping as a fallback.
- Dual Source Support - Tries RSS feeds first, falls back to HTML scraping
- Automatic Feed Discovery - Detects RSS/Atom URLs from blog homepages
- Read/Unread Management - Track which articles you've read
- Blog Filtering - View articles from specific blogs
- Duplicate Prevention - Never tracks the same article twice
- Colored CLI Output - User-friendly terminal interface
# Homebrew (Linux)
brew install Hyaxia/tap/blogwatcher
# Install the CLI
go install github.com/Hyaxia/blogwatcher/cmd/blogwatcher@latest
# Or build locally
go build ./cmd/blogwatcher
Windows and Linux binaries are also available on the GitHub Releases page.
# Add a blog (auto-discovers RSS feed)
blogwatcher add "My Favorite Blog" https://example.com/blog
# Add with explicit feed URL
blogwatcher add "Tech Blog" https://techblog.com --feed-url https://techblog.com/rss.xml
# Add with HTML scraping selector (for blogs without feeds)
blogwatcher add "No-RSS Blog" https://norss.com --scrape-selector "article h2 a"
# Add with per-blog User-Agent override
blogwatcher add "Blocked Blog" https://blocked.example --feed-url https://blocked.example/feed --user-agent "Mozilla/5.0 ..."
# List all tracked blogs
blogwatcher blogs
# Remove a blog (and all its articles)
blogwatcher remove "My Favorite Blog"
# Remove without confirmation
blogwatcher remove "My Favorite Blog" -y
# Scan all blogs for new articles
blogwatcher scan
# Scan a specific blog
blogwatcher scan "Tech Blog"
# Per-blog User-Agent is configured at add time via --user-agent
# Example above: blogwatcher add ... --user-agent "Mozilla/5.0 ..."
# List unread articles
blogwatcher articles
# List all articles (including read)
blogwatcher articles --all
# List articles from a specific blog
blogwatcher articles --blog "Tech Blog"
# Mark an article as read (use article ID from articles list)
blogwatcher read 42
# Mark an article as unread
blogwatcher unread 42
# Mark all unread articles as read
blogwatcher read-all
# Mark all unread articles as read for a blog (skip prompt)
blogwatcher read-all --blog "Tech Blog" --yes
- For each tracked blog, BlogWatcher first attempts to parse the RSS/Atom feed
- If no feed URL is configured, it tries to auto-discover one from the blog homepage
- If RSS parsing fails and a scrape_selector is configured, it falls back to HTML scraping
- New articles are saved to the database as unread
- Already-tracked articles are skipped
BlogWatcher searches for feeds in two ways:
- Looking for <link rel="alternate"> tags with RSS/Atom types
- Checking common feed paths: /feed, /rss, /feed.xml, /atom.xml, etc.
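As a rough illustration of the two discovery strategies, the Go sketch below collects candidate feed URLs from a page. It is not BlogWatcher's implementation: feedCandidates is a hypothetical name, the regular expressions are a simplification that only handles double-quoted attributes (a real HTML parser would be more robust), and resolving the common paths against the site root is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

var (
	// Matches whole <link ...> tags; a simplification of real HTML parsing.
	linkTag = regexp.MustCompile(`(?i)<link\b[^>]*>`)
	// Matches name="value" attributes (double quotes only).
	attrRe = regexp.MustCompile(`(?i)([a-z-]+)\s*=\s*"([^"]*)"`)
)

// The four paths named in the text; the real list may be longer.
var commonPaths = []string{"/feed", "/rss", "/feed.xml", "/atom.xml"}

// feedCandidates returns feed URLs advertised via <link rel="alternate">
// tags, followed by common feed paths resolved against the site root.
func feedCandidates(pageURL, html string) ([]string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, tag := range linkTag.FindAllString(html, -1) {
		attrs := map[string]string{}
		for _, m := range attrRe.FindAllStringSubmatch(tag, -1) {
			attrs[strings.ToLower(m[1])] = m[2]
		}
		typ := strings.ToLower(attrs["type"])
		isFeed := typ == "application/rss+xml" || typ == "application/atom+xml"
		if !strings.EqualFold(attrs["rel"], "alternate") || !isFeed || attrs["href"] == "" {
			continue
		}
		ref, err := url.Parse(attrs["href"])
		if err != nil {
			continue // skip malformed href values
		}
		out = append(out, base.ResolveReference(ref).String())
	}
	for _, p := range commonPaths {
		out = append(out, base.ResolveReference(&url.URL{Path: p}).String())
	}
	return out, nil
}

func main() {
	page := `<head><link rel="alternate" type="application/rss+xml" href="/blog/rss.xml"></head>`
	urls, err := feedCandidates("https://example.com/blog", page)
	if err != nil {
		fmt.Println("bad page URL:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

A caller would then request each candidate in order and keep the first one that parses as a valid feed.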
When RSS isn't available, provide a CSS selector that matches article links:
# Example selectors
--scrape-selector "article h2 a" # Links inside article h2 tags
--scrape-selector ".post-title a" # Links with post-title class
--scrape-selector "#blog-posts a" # Links inside blog-posts ID
By default, BlogWatcher stores data in SQLite at ~/.blogwatcher/blogwatcher.db.
To isolate independent blog lists (for example, to avoid accidental read-all across unrelated sets), set BLOGWATCHER_DB:
BLOGWATCHER_DB="$HOME/.blogwatcher/work.db" blogwatcher scan
With BLOGWATCHER_DB unset or empty, BlogWatcher falls back to the default path.
Database tables:
- blogs - Tracked blogs (name, URL, feed URL, scrape selector)
- articles - Discovered articles (title, URL, dates, read status)
- Go 1.24+
# Run all tests
go test ./...
In addition to publishing to main, push a new tag so that Homebrew picks up the updated version:
git tag vX.Y.Z
git push origin vX.Y.Z
MIT