Chrome extension that removes tracking from URLs, expands shortened links,
and catches phishing domains — offline, local, no data leaves your browser.
Every link you share carries invisible baggage: utm_source, fbclid, gclid, affiliate tags buried inside Amazon paths, session tokens encoded in Base64. URL shorteners hide where the link actually goes. Phishing sites register domains with Cyrillic characters that look identical to google.com or paypal.com in the address bar.
I built this to understand how deep that problem goes — and to build a tool that actually fixes it without sending your data anywhere.
Left: expanding and cleaning a tinyurl → AliExpress link in aggressive mode. Right: tracking stats in the settings page.
The extension runs every URL through a 4-layer pipeline:
URL in
|
v
[Platform Rules] ── Custom logic per site (Amazon, YouTube, LinkedIn, Twitter, Instagram, AliExpress)
|
v
[ClearURLs DB] ── 730+ rules across 205 tracking providers, auto-updated weekly
|
v
[Heuristics] ── Shannon entropy analysis + UUID/Base64/hex pattern detection
|
v
[Preservation] ── Keeps essential params (product IDs, search queries, pagination)
|
v
Clean URL
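The flow above can be sketched as a chain of layer functions, each taking and returning a URL object. Function names and the parameter lists here are illustrative, not the repo's actual API:

```javascript
// Minimal sketch of the layered pipeline: each layer is a URL -> URL function.
function stripTrackers(url) {
  // Stand-in for the Platform Rules + ClearURLs layers.
  for (const key of ['utm_source', 'utm_medium', 'fbclid', 'gclid']) {
    url.searchParams.delete(key);
  }
  return url;
}

function preserveEssential(url) {
  // Placeholder: the real layer protects product IDs, queries, pagination.
  return url;
}

function runPipeline(raw, layers = [stripTrackers, preserveEssential]) {
  return layers.reduce((url, layer) => layer(url), new URL(raw)).toString();
}

runPipeline('https://shop.example/item?id=7&utm_source=ad&fbclid=x');
// keeps id=7, drops utm_source and fbclid
```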
Three modes control how aggressive the cleaning is:
| Mode | What it does | Trade-off |
|---|---|---|
| Minimal | Strips known trackers (utm_*, fbclid, gclid) | Never breaks links |
| Smart | Adds entropy heuristics + platform rules | Recommended default |
| Aggressive | Whitelist-only — keeps id, page, q, v, t and drops everything else | Maximum privacy, may break edge cases |
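The minimal and aggressive modes reduce to two simple predicates over parameter names. This is a sketch under that reading; the actual blocklist and whitelist in the extension are larger than shown:

```javascript
// Strategy selection: each mode supplies a "should this param be dropped?" test.
const MINIMAL_BLOCKLIST = new Set(['utm_source', 'utm_medium', 'fbclid', 'gclid']);
const AGGRESSIVE_WHITELIST = new Set(['id', 'page', 'q', 'v', 't']);

const strategies = {
  minimal: key => MINIMAL_BLOCKLIST.has(key),        // drop only known trackers
  aggressive: key => !AGGRESSIVE_WHITELIST.has(key), // drop everything not whitelisted
};

function cleanWithMode(raw, mode) {
  const url = new URL(raw);
  const shouldDrop = strategies[mode];
  for (const key of [...url.searchParams.keys()]) {
    if (shouldDrop(key)) url.searchParams.delete(key);
  }
  return url.toString();
}
```

Smart mode sits between the two: the minimal blocklist plus the entropy heuristics described below.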
Beyond cleaning, shortened URLs are expanded through a 4-strategy cascade (embedded URL extraction → native redirect following → iframe capture → external service fallback) covering 1,400+ shortener domains. The process loops up to 5 times to resolve chains like bit.ly → t.co → redirect.com → real-destination.com.
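The chain-resolution loop can be sketched like this. The `follow` callback stands in for whichever of the four strategies fires (it returns the next hop, or null when the URL is final); the name and signature are illustrative:

```javascript
// Resolve a redirect chain, capped at 5 hops so loops can't hang the extension.
async function expandUrl(shortUrl, follow, maxHops = 5) {
  let current = shortUrl;
  for (let hop = 0; hop < maxHops; hop++) {
    const next = await follow(current); // next hop, or null if final
    if (!next) break;
    current = new URL(next, current).href; // resolve relative Location headers
  }
  return current;
}
```

The `new URL(next, current)` call matters: many shorteners answer with a relative `Location` header, which must be resolved against the current hop before the next round.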
Static blocklists can't keep up — new tracking params appear daily. So instead of only matching known names, the extension measures the Shannon entropy of each parameter value:
ref=homepage → 2.4 bits/char → functional, keep it
_ga=2.18943.10873 → 3.4 bits/char → tracking token, remove
sid=a1b2c3d4e5f6 → 3.7 bits/char → unique ID, remove
High entropy means high randomness, which means the value was generated to identify you, not to serve a page. Threshold is ~3.0 bits/char. On top of that, the heuristic layer catches UUIDs, Base64 padding patterns, and hex hashes by format.
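The entropy measure is the standard Shannon formula applied per character; a minimal version looks like this (the ~3.0 cutoff is from the text above, the exact tuning is the extension's):

```javascript
// Shannon entropy of a string, in bits per character.
function entropyBitsPerChar(value) {
  const counts = {};
  for (const ch of value) counts[ch] = (counts[ch] || 0) + 1;
  let h = 0;
  for (const n of Object.values(counts)) {
    const p = n / value.length;
    h -= p * Math.log2(p); // sum of -p*log2(p) over the character distribution
  }
  return h;
}

function looksLikeTracker(value) {
  return entropyBitsPerChar(value) >= 3.0; // high randomness => likely an identifier
}
```

Short human-readable values ("homepage", "share") reuse few characters and score low; generated tokens spread evenly over letters, digits, and symbols and score high.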
The extension maps Cyrillic, Greek, and fullwidth Unicode characters back to ASCII and checks against 15+ high-value phishing targets:
аpple.com → Cyrillic 'а' → impersonates apple.com
gооgle.com → Cyrillic 'о' → impersonates google.com
This runs on every URL before it reaches the user.
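The core of the check is a homoglyph-to-ASCII map plus a target list. This sketch uses a small subset of the mapping (the extension covers Cyrillic, Greek, and fullwidth forms, and 15+ targets):

```javascript
// Illustrative subset of the homoglyph table (Unicode escapes are Cyrillic).
const HOMOGLYPHS = {
  '\u0430': 'a', // Cyrillic а
  '\u043e': 'o', // Cyrillic о
  '\u0435': 'e', // Cyrillic е
  '\u0440': 'p', // Cyrillic р
  '\u0441': 'c', // Cyrillic с
};
const TARGETS = new Set(['apple.com', 'google.com', 'paypal.com']);

// Map each character back to its ASCII look-alike.
function skeleton(domain) {
  return [...domain].map(ch => HOMOGLYPHS[ch] ?? ch).join('');
}

// Phishing signal: the domain isn't pure ASCII, but its skeleton is a known target.
function isHomoglyphPhish(domain) {
  const ascii = skeleton(domain);
  return ascii !== domain && TARGETS.has(ascii);
}
```

Note the `ascii !== domain` guard: the real apple.com maps to itself and must never be flagged.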
An 11,000+ domain blacklist is loaded into a Set at startup — lookups take sub-millisecond time, no network needed. For deeper analysis, optional VirusTotal integration scans against 70+ engines, but only when the user explicitly clicks the button. API keys stay in chrome.storage.local (device-only, never synced).
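The blacklist lookup itself is a one-liner once the domains are in a Set; this is the shape of it (domains here are placeholders):

```javascript
// Loaded once at startup; Set membership is O(1), so per-URL cost is negligible.
const BLACKLIST = new Set(['evil.example', 'phish.example']);

function isBlacklisted(url) {
  return BLACKLIST.has(new URL(url).hostname);
}
```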
| Layer | What | Pattern |
|---|---|---|
| Background | Service Worker — context menu, scheduled rule updates, message routing | Event-driven |
| Popup | StateManager (single source of truth) + UIController (state-driven DOM) | Observer |
| Rule Engine | 7 providers chained: each checks domain match, applies transforms, passes to next | Chain of Responsibility |
| Cleaning | 4-layer pipeline with 3 swappable strategies (minimal/smart/aggressive) | Strategy |
| Networking | Fetch with exponential backoff + jitter, per-request timeouts via AbortSignal | Retry with backoff |
| Storage | Quota-aware persistence, auto-trims at 90% of 10MB limit | SafeStorage |
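The Networking row in sketch form. `backoffDelay` and `fetchWithRetry` are illustrative names, not the repo's API; `AbortSignal.timeout` is the standard per-attempt timeout mechanism:

```javascript
// Exponential backoff plus random jitter, so retries don't synchronize.
function backoffDelay(attempt, baseMs = 250) {
  return baseMs * 2 ** attempt + Math.random() * baseMs;
}

async function fetchWithRetry(url, { retries = 3, timeoutMs = 5000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      // Each attempt gets its own timeout; a hung request aborts cleanly.
      return await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      await new Promise(res => setTimeout(res, backoffDelay(attempt)));
    }
  }
}
```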
Zero frameworks. Vanilla JS + ES6 modules. No bundler, no transpiler — the code in the repo is the code in the browser. Adding a new platform means writing one provider class and registering it in the factory.
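A provider in that factory might look roughly like this; the class shape and method names are a guess at the pattern, not the repo's actual base class:

```javascript
// Hypothetical platform provider: matches its domain, cleans, passes on.
class YouTubeProvider {
  matches(url) {
    return url.hostname.endsWith('youtube.com');
  }
  clean(url) {
    // Keep only the video ID and timestamp.
    for (const key of [...url.searchParams.keys()]) {
      if (key !== 'v' && key !== 't') url.searchParams.delete(key);
    }
    return url;
  }
}

const providers = [new YouTubeProvider()]; // factory registration, simplified

// Chain of Responsibility: each matching provider transforms the URL in turn.
function applyProviders(raw) {
  let url = new URL(raw);
  for (const p of providers) {
    if (p.matches(url)) url = p.clean(url);
  }
  return url.toString();
}
```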
- No dependencies in production — Chrome APIs + Fetch API + ES6
- Dual storage: chrome.storage.sync for settings, .local for sensitive data
- ClearURLs rule database (205 providers, 730+ rules, bundled + auto-updated)
- 1,400+ shortener domains from PeterDaveHello/url-shorteners (CC BY-SA 4.0)
git clone https://github.com/patatapython/link-cleaner-pro.git
cd link-cleaner-pro
npm install # dev dependencies only (Jest)
npm test # run test suite

Then load as unpacked extension in chrome://extensions/ (Developer mode ON).
Right-click any link → "Clean this link". Done.
No telemetry. No analytics. No content scripts. No background monitoring. Everything runs locally. VirusTotal scans are opt-in and manual. Full policy · How it works
- ClearURLs — open-source tracking parameter database with 205 providers and 730+ rules
- PeterDaveHello/url-shorteners — community-curated list of 1,400+ URL shortening services (CC BY-SA 4.0)
- VirusTotal — threat intelligence API scanning against 70+ antivirus engines
MIT License

