A Gmail system that sorts your email, learns from corrections, and gets cheaper over time.
Status: Deployment blocked on Google Advanced Protection Program (see app-blocker.md)
Platform: Google Apps Script + Gemini 2.0 Flash
License: MIT
Every system for managing email just moves the work. Gmail's category tabs sort mail into 5 buckets you can't customize — corrections don't compound, and everything still sits in your inbox. Gmail filters give you control, but they rot: new senders appear, old ones go dormant, and maintaining the rules is the same tedious manual effort filters were supposed to eliminate.
The real problem isn't filtering — it's that the system doesn't maintain itself. inbox-shepherd does. It classifies senders it's never seen before, logs every decision so mistakes are visible and structurally fixable, and promotes patterns into rules automatically. You open Gmail and the only things in your inbox are the messages that actually need you.
    ┌─────────────────────────────────────────────────────────┐
    │                        is:inbox                         │
    └──────────────────────────┬──────────────────────────────┘
                               │
                   ┌───────────▼────────────┐
                   │    Header Screener     │  No bulk headers?
                   │        (Tier 1)        │──── Personal mail ──→ stays in inbox
                   └───────────┬────────────┘
                               │ bulk headers
                   ┌───────────▼────────────┐
                   │      Static Rules      │  INBOX rule match?
                   │        (Tier 2)        │──── Urgent (2FA) ───→ stays in inbox
                   │                        │  Label rule match?
                   │                        │──── Known sender ───→ label + archive
                   └───────────┬────────────┘
                               │ no rule match
                   ┌───────────▼────────────┐
                   │     LLM Classifier     │
                   │        (Tier 3)        │──── Classified ─────→ label + archive
                   │                        │──── Uncertain ──────→ _review (inbox)
                   └────────────────────────┘
Two subsystems share the codebase:
The Operator runs every 5 minutes, processing your inbox through the pipeline above.
- Header Screener (Tier 1): Checks for bulk-mail indicators. No bulk headers → personal email → stays in inbox, untouched. Zero cost, zero config.
- Static Rules (Tier 2): Deterministic sender-to-label mappings — no LLM needed. Also catches urgent automated email (2FA codes, security alerts) and keeps it in inbox despite bulk headers.
- LLM Classifier (Tier 3): Handles everything else, including senders the system has never seen. Gemini Flash classifies into a configurable taxonomy (10 categories by default). This is what filters can't do — you don't need to anticipate every sender.
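The three tiers can be read as a single routing function. The sketch below is illustrative only and is not taken from the project's source: the function names, the message shape (`from`, `subject`, `headers`), the rule shape (`pattern`, `label`), and the specific headers treated as bulk indicators are all assumptions. The `classify` argument stands in for the Gemini call so the sketch runs without network access.

```javascript
// Hypothetical sketch of the three-tier pipeline; names and shapes are assumptions.

// Tier 1: treat List-Unsubscribe, List-Id, or "Precedence: bulk|list" as bulk
// indicators (an assumed set; the real screener may check different headers).
function hasBulkHeaders(headers) {
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = String(value);
  }
  if ('list-unsubscribe' in h || 'list-id' in h) return true;
  return /^(bulk|list)$/i.test((h['precedence'] || '').trim());
}

// Tier 2: collect matching rules; an INBOX rule wins over a label rule.
// Patterns are assumed to be RegExp objects without the `g` flag.
function matchRule(message, rules) {
  const haystack = `${message.from} ${message.subject}`;
  const hits = rules.filter((r) => r.pattern.test(haystack));
  return hits.find((r) => r.label === 'INBOX') || hits[0] || null;
}

// Routes one message. `classify(message)` is a stand-in for the LLM call and
// is assumed to return { label, confident }.
function route(message, rules, classify) {
  if (!hasBulkHeaders(message.headers)) {
    return { tier: 1, action: 'inbox', label: null }; // personal mail
  }
  const rule = matchRule(message, rules);
  if (rule) {
    return rule.label === 'INBOX'
      ? { tier: 2, action: 'inbox', label: null } // urgent automated mail
      : { tier: 2, action: 'archive', label: rule.label };
  }
  const result = classify(message);
  return result.confident
    ? { tier: 3, action: 'archive', label: result.label }
    : { tier: 3, action: 'inbox', label: '_review' };
}
```

Because the LLM is only reached when both cheaper tiers decline, every sender promoted to a rule removes one paid call per message from that sender.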
Every decision is logged to an observation store. This data is what makes the system improvable.
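As a rough illustration of what one logged decision might look like, the sketch below flattens a message and its routing decision into a spreadsheet row. The column set, the `toObservationRow` name, and the `decision` shape (`tier`, `label`, `action`) are assumptions, not the project's actual schema.

```javascript
// Hypothetical column order for the observation sheet (not the real schema).
const OBSERVATION_COLUMNS = [
  'timestamp', 'messageId', 'sender', 'subject', 'tier', 'label', 'action',
];

// Builds one row in column order. `decision` is assumed to look like
// { tier: 3, label: 'Newsletters', action: 'archive' }.
function toObservationRow(message, decision, now = new Date()) {
  const record = {
    timestamp: now.toISOString(),
    messageId: message.id,
    sender: message.from,
    subject: message.subject,
    tier: decision.tier,
    label: decision.label || '',
    action: decision.action,
  };
  return OBSERVATION_COLUMNS.map((column) => record[column]);
}

// In Apps Script the row could then be written with something like:
//   SpreadsheetApp.openById(sheetId).getSheets()[0].appendRow(row);
```

Recording the tier alongside the label is what would let a later analysis tell apart decisions that already come from rules and decisions that still cost an LLM call.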
The Strategist evolves the rules over time. In v1, the Strategist is you: review the observation log, spot patterns, update Config.js. A sender the Classifier handles 30 times in a row? Promote it to a Rule — now it's faster and free. In v2, this becomes automated software. The system gets cheaper and more accurate the longer it runs.
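The promotion step described above could be approximated by scanning the observation log for senders whose most recent decisions form an unbroken streak of identical Tier 3 classifications. This is a minimal sketch under assumptions: the observation fields (`sender`, `tier`, `label`), the oldest-first ordering, and the `promotionCandidates` name are hypothetical, and the default threshold of 30 simply mirrors the example in the text.

```javascript
// Hypothetical sketch: find senders whose latest Tier 3 decisions all agree.
// `observations` is assumed to be ordered oldest-first.
function promotionCandidates(observations, threshold = 30) {
  const streaks = new Map(); // sender -> { label, count }
  for (const obs of observations) {
    const prev = streaks.get(obs.sender);
    if (obs.tier !== 3 || obs.label === '_review') {
      // Not a confident LLM classification: the streak is broken.
      streaks.delete(obs.sender);
    } else if (prev && prev.label === obs.label) {
      prev.count += 1;
    } else {
      // First sighting or a different label: start a new streak.
      streaks.set(obs.sender, { label: obs.label, count: 1 });
    }
  }
  return [...streaks]
    .filter(([, streak]) => streak.count >= threshold)
    .map(([sender, streak]) => ({ sender, label: streak.label, count: streak.count }));
}
```

Each candidate is only a suggestion. In v1 a person would still review it and add the corresponding rule to Config.js by hand.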
- Zero Trust / Default-Deny — Only email without bulk headers stays in the Inbox. Everything else must be classified.
- Separate the Data Plane from the Control Plane — Running rules (cheap, fast, every 5 min) and maintaining rules (thoughtful, expensive, daily/weekly) are distinct software.
- Observable — Every routing decision is logged with enough context for the control plane to analyze and propose changes.
- Non-Destructive by Default — Labels and archives, never deletes (v1).
- Resilient to Neglect — If it breaks for a month, it catches up automatically when restored.
- Retroactive Consistency — Rule changes can apply backward. The archive stays consistent with the current taxonomy.
- Conservative by Default — Taxonomy changes require sustained signal and human approval.
- Portable — Fork it, point it at your Gmail, customize Config.js, and go.
- Zero Operational Burden — The system must not trade manual effort in email for manual effort in system management. Corrections compound into better rules.
- Does not detect spam or malware (Gmail handles that).
- Does not auto-reply or draft responses.
- Does not delete any email — labels and archives only.
- Single Gmail account only.
- Requires a paid Gemini API key (~$1–5/month at typical volume, decreasing as Rules replace Classifier calls).
Currently in dry-run validation. Setup: clone the repo, run `npm install -g @google/clasp && clasp login`, edit `Config.js` with your taxonomy and owner email, set `GEMINI_API_KEY` and `OBSERVATION_SHEET_ID` in Apps Script Project Settings > Script Properties, deploy via `clasp push`, and run `installTrigger()` from the editor to enable the 5-minute trigger. Start with `dryRun: true` to validate before going live.
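For orientation, the setup pieces might fit together roughly as follows. This is a sketch under assumptions, not the repository's actual code: the `CONFIG` field names other than `dryRun`, the category names, the `loadSecrets` helper, and the `processInbox` handler name are hypothetical. The Apps Script calls used (`getProperty` on a Properties object, and `ScriptApp.newTrigger(...).timeBased().everyMinutes(5).create()`) are standard platform APIs.

```javascript
// Config.js (sketch): field names other than dryRun are assumptions.
const CONFIG = {
  ownerEmail: 'you@example.com',
  dryRun: true, // log decisions without labeling or archiving anything
  taxonomy: ['Receipts', 'Newsletters', 'Social'], // placeholder categories
};

// Reads the two required Script Properties and fails loudly if one is unset.
// In Apps Script, pass PropertiesService.getScriptProperties() as `props`.
function loadSecrets(props) {
  const secrets = {};
  for (const key of ['GEMINI_API_KEY', 'OBSERVATION_SHEET_ID']) {
    const value = props.getProperty(key);
    if (!value) throw new Error(`Missing script property: ${key}`);
    secrets[key] = value;
  }
  return secrets;
}

// One plausible shape for installTrigger(); 'processInbox' is a hypothetical
// handler name. Only runs inside Apps Script, where ScriptApp is a global.
function installTrigger() {
  ScriptApp.newTrigger('processInbox').timeBased().everyMinutes(5).create();
}
```

Failing fast on a missing property would surface a setup mistake at the first run instead of at the first Gemini call.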
- Roadmap — Prioritized initiatives
- Operator Requirements — Data plane: email routing engine (v5.1)
- Operator Brainstorm — Detailed design decisions and edge cases
- Strategist Design — Control plane: rule management and taxonomy evolution (stub)
- CLAUDE.md — AI agent conventions and project structure
- Requirements definition ✅
- Design analysis ✅
- Implementation ✅ (Phases 1–5: foundation, infrastructure, tiers 1–3, orchestration)
- Deployment blocked: Advanced Protection Program (APP) restricts the OAuth scopes inbox-shepherd needs. See app-blocker.md for the full investigation and paths forward.
| Component | Technology |
|---|---|
| Runtime | Google Apps Script (V8) |
| Deployment | @google/clasp |
| LLM | Gemini 2.0 Flash (paid API) |
| Observation store | Google Sheets |
| State persistence | Apps Script PropertiesService |
| Version control | Git + GitHub |
All email processing stays within the Google ecosystem. The only external API call is to Gemini (googleapis.com) under paid-tier terms — inputs are not used for training. The LLM sees only: sender name + address with platform annotation, addressing annotation (the owner's email address is never included), subject line, and a ~100-character sanitized body snippet. HTML is stripped, URLs are redacted. API keys live in Apps Script PropertiesService (encrypted at rest), never in source code. The observation store (Google Sheets) contains sender addresses and subject lines — keep it private.
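The body-snippet handling described above (HTML stripped, URLs redacted, roughly 100 characters kept) could look something like the sketch below. The `sanitizeSnippet` name, the `[URL]` placeholder, and the regex-based approach are assumptions for illustration. Regex stripping is a simplification and not a full HTML parser.

```javascript
// Hypothetical sketch of snippet sanitization; not the project's actual code.
function sanitizeSnippet(html, maxLength = 100) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ') // drop script/style bodies
    .replace(/<[^>]+>/g, ' ')                 // strip remaining tags (and their href values)
    .replace(/https?:\/\/\S+/gi, '[URL]')     // redact URLs left in visible text
    .replace(/\s+/g, ' ')                     // collapse whitespace
    .trim()
    .slice(0, maxLength);                     // keep at most maxLength characters
}
```

Stripping tags before redacting URLs means a link's `href` never reaches the model, while a URL typed into the visible text is replaced by a placeholder.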
MIT