Link Cleaner Pro

Chrome extension that removes tracking from URLs, expands shortened links,
and catches phishing domains — offline, local, no data leaves your browser.

Read in Spanish: README_ES.md

Why

Every link you share carries invisible baggage: utm_source, fbclid, gclid, affiliate tags buried inside Amazon paths, session tokens encoded in Base64. URL shorteners hide where the link actually goes. Phishing sites register domains with Cyrillic characters that look identical to google.com or paypal.com in the address bar.

I built this to understand how deep that problem goes — and to build a tool that actually fixes it without sending your data anywhere.

Screenshots

Left: expanding and cleaning a tinyurl → AliExpress link in aggressive mode. Right: tracking stats in the settings page.

How It Works

The extension runs every URL through a 4-layer pipeline:

  URL in
    |
    v
 [Platform Rules]  ──  Custom logic per site (Amazon, YouTube, LinkedIn, Twitter, Instagram, AliExpress)
    |
    v
 [ClearURLs DB]    ──  730+ rules across 205 tracking providers, auto-updated weekly
    |
    v
 [Heuristics]      ──  Shannon entropy analysis + UUID/Base64/hex pattern detection
    |
    v
 [Preservation]    ──  Keeps essential params (product IDs, search queries, pagination)
    |
    v
  Clean URL

Three modes control how aggressive the cleaning is:

| Mode | What it does | Trade-off |
| --- | --- | --- |
| Minimal | Strips known trackers (`utm_*`, `fbclid`, `gclid`) | Never breaks links |
| Smart | Adds entropy heuristics + platform rules | Recommended default |
| Aggressive | Whitelist-only: keeps `id`, `page`, `q`, `v`, `t` and drops everything else | Maximum privacy, may break edge cases |
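
To make the difference between the modes concrete, here is a minimal sketch of the minimal and aggressive strategies. It is not the extension's actual code: `KNOWN_TRACKERS`, `AGGRESSIVE_WHITELIST`, `strategies`, and `cleanUrl` are hypothetical names, and the lists only repeat the examples from the table. Smart mode is left out because it depends on the heuristics layer.

```javascript
// Hypothetical sketch: names and lists are illustrative, not the extension's real tables.
const KNOWN_TRACKERS = [/^utm_/i, /^fbclid$/i, /^gclid$/i];
const AGGRESSIVE_WHITELIST = new Set(["id", "page", "q", "v", "t"]);

// Each strategy answers one question: should this parameter name be kept?
const strategies = {
  // Minimal: drop only names that match a known tracker pattern.
  minimal: (name) => !KNOWN_TRACKERS.some((re) => re.test(name)),
  // Aggressive: keep only whitelisted names, drop everything else.
  aggressive: (name) => AGGRESSIVE_WHITELIST.has(name),
};

function cleanUrl(rawUrl, mode = "minimal") {
  const keep = strategies[mode];
  if (!keep) throw new Error(`Unknown mode: ${mode}`);
  const url = new URL(rawUrl);
  // Copy the keys first: deleting while iterating the live object can skip entries.
  for (const name of [...url.searchParams.keys()]) {
    if (!keep(name)) url.searchParams.delete(name);
  }
  return url.toString();
}
```

With this sketch, `cleanUrl("https://example.com/watch?v=abc123&utm_source=news&ref=home", "minimal")` keeps `v` and `ref`, while `"aggressive"` keeps only `v`.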

Beyond cleaning, shortened URLs are expanded through a 4-strategy cascade (embedded URL extraction → native redirect following → iframe capture → external service fallback) covering 1,400+ shortener domains. The process loops up to 5 times to resolve chains like bit.ly → t.co → redirect.com → real-destination.com.
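
The expansion loop can be sketched as follows. This is an illustration under stated assumptions, not the extension's implementation: `expandUrl`, `extractEmbeddedUrl`, `isShortener`, and `resolveOnce` are hypothetical names, and `resolveOnce` stands in for the network-based strategies (redirect following, iframe capture, external service) so the loop logic can be read on its own.

```javascript
// Hypothetical sketch of the expansion loop; all names are illustrative.
const MAX_HOPS = 5;

// Embedded URL extraction: some redirectors carry the destination in a query parameter.
function extractEmbeddedUrl(rawUrl) {
  const url = new URL(rawUrl);
  for (const value of url.searchParams.values()) {
    if (/^https?:\/\//i.test(value)) return value; // values are already percent-decoded
  }
  return null;
}

// isShortener(hostname) -> boolean; resolveOnce(url) -> Promise<string | null>
async function expandUrl(rawUrl, isShortener, resolveOnce) {
  let current = rawUrl;
  for (let hop = 0; hop < MAX_HOPS; hop++) {
    if (!isShortener(new URL(current).hostname)) break; // reached a real destination
    const next = extractEmbeddedUrl(current) ?? (await resolveOnce(current));
    if (!next || next === current) break; // unresolved or self-redirect: stop here
    current = next;
  }
  return current;
}
```

The hop limit also bounds redirect loops: two shorteners that point at each other stop after five resolutions instead of spinning forever.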

The Interesting Parts

Entropy-based tracker detection

Static blocklists can't keep up — new tracking params appear daily. So instead of only matching known names, the extension measures the Shannon entropy of each parameter value:

ref=homepage        → 2.75 bits/char  →  functional, keep it
_ga=2.18943.10873   → 3.09 bits/char  →  tracking token, remove
sid=a1b2c3d4e5f6    → 3.58 bits/char  →  unique ID, remove

High entropy indicates a random-looking value, which is more likely a generated identifier than something the page needs to render. The threshold is ~3.0 bits/char. On top of that, the heuristic layer catches UUIDs, Base64 padding patterns, and hex hashes by format.
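
A minimal sketch of this layer, assuming entropy is computed from the character frequencies of the parameter value alone. `shannonEntropy`, `looksLikeTracker`, and the regular expressions in `FORMAT_PATTERNS` are hypothetical, written for illustration; the extension's real patterns and normalization may differ.

```javascript
// Hypothetical sketch: per-character Shannon entropy plus format checks.
function shannonEntropy(value) {
  const chars = [...value]; // iterate by code point
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / chars.length;
    bits -= p * Math.log2(p);
  }
  return bits; // bits per character
}

const ENTROPY_THRESHOLD = 3.0;

// Illustrative patterns only; assumed, not taken from the extension.
const FORMAT_PATTERNS = {
  uuid: /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i,
  hexHash: /^[0-9a-f]{32,}$/i,
  base64Padded: /^[A-Za-z0-9+\/_-]{8,}={1,2}$/,
};

function looksLikeTracker(value) {
  const format = Object.keys(FORMAT_PATTERNS).find((k) => FORMAT_PATTERNS[k].test(value));
  if (format) return { suspicious: true, reason: format };
  const entropy = shannonEntropy(value);
  return { suspicious: entropy >= ENTROPY_THRESHOLD, reason: "entropy", entropy };
}
```

One property worth knowing: a value of length n can never exceed log2(n) bits/char, so very short values always fall under the threshold and have to be caught by name or format instead. Exact figures also depend on how a value is normalized before counting.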

Homograph attack detection

The extension maps Cyrillic, Greek, and fullwidth Unicode characters back to ASCII and checks against 15+ high-value phishing targets:

аpple.com   →  Cyrillic 'а' →  impersonates apple.com
gооgle.com  →  Cyrillic 'о' →  impersonates google.com

This runs on every URL before it reaches the user.
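
The mapping step can be sketched like this. The table below is a tiny hypothetical subset of a confusables map, and `PROTECTED_DOMAINS`, `toAsciiSkeleton`, and `detectHomograph` are illustrative names rather than the extension's API. The sketch expects the Unicode form of the hostname; `new URL(...).hostname` returns the Punycode form, so a real implementation would need to decode that first.

```javascript
// Hypothetical sketch: fold look-alike characters to ASCII, then compare.
const CONFUSABLES = new Map([
  ["\u0430", "a"], // Cyrillic а
  ["\u0435", "e"], // Cyrillic е
  ["\u043e", "o"], // Cyrillic о
  ["\u0440", "p"], // Cyrillic р
  ["\u0441", "c"], // Cyrillic с
  ["\u03bf", "o"], // Greek omicron
]);

// Illustrative subset of the protected targets.
const PROTECTED_DOMAINS = new Set(["apple.com", "google.com", "paypal.com"]);

function toAsciiSkeleton(hostname) {
  let out = "";
  for (const ch of hostname.toLowerCase()) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xff01 && cp <= 0xff5e) {
      // Fullwidth forms sit at a fixed offset of 0xFEE0 from their ASCII counterparts.
      out += String.fromCodePoint(cp - 0xfee0).toLowerCase();
    } else {
      out += CONFUSABLES.get(ch) ?? ch;
    }
  }
  return out;
}

function detectHomograph(hostname) {
  const original = hostname.toLowerCase();
  const skeleton = toAsciiSkeleton(original);
  // Spoofed only if folding changed something AND the result is a protected domain.
  const spoofed = skeleton !== original && PROTECTED_DOMAINS.has(skeleton);
  return spoofed ? { spoofed: true, impersonates: skeleton } : { spoofed: false };
}
```

Requiring that the folded name differs from the original keeps the genuine apple.com from flagging itself.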

Offline-first security

An 11,000+ domain blacklist is loaded into a Set at startup — lookups take sub-millisecond time, no network needed. For deeper analysis, optional VirusTotal integration scans against 70+ engines, but only when the user explicitly clicks the button. API keys stay in chrome.storage.local (device-only, never synced).
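
A sketch of the offline lookup, with made-up entries standing in for the real list. `BLACKLIST` and `isBlacklisted` are hypothetical names, and walking up to parent domains is an assumption of this example; the source only states that the list lives in a Set.

```javascript
// Hypothetical sketch: sample entries only, not the real 11,000+ domain list.
const BLACKLIST = new Set(["malicious.example", "phish.test"]);

function isBlacklisted(hostname) {
  // Normalize case and drop a trailing dot (fully qualified form).
  const labels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  // Assumption: a listed domain also covers its subdomains, so check the host
  // and each parent (login.phish.test, then phish.test), stopping before the bare TLD.
  for (let i = 0; i < labels.length - 1; i++) {
    if (BLACKLIST.has(labels.slice(i).join("."))) return true;
  }
  return false;
}
```

Each `Set.has` call is a hash lookup, so the cost depends on the number of labels in the hostname, not on the size of the list.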

Architecture

| Layer | What | Pattern |
| --- | --- | --- |
| Background | Service worker: context menu, scheduled rule updates, message routing | Event-driven |
| Popup | StateManager (single source of truth) + UIController (state-driven DOM) | Observer |
| Rule Engine | 7 providers chained: each checks domain match, applies transforms, passes to next | Chain of Responsibility |
| Cleaning | 4-layer pipeline with 3 swappable strategies (minimal/smart/aggressive) | Strategy |
| Networking | Fetch with exponential backoff + jitter, per-request timeouts via AbortSignal | Retry with backoff |
| Storage | Quota-aware persistence, auto-trims at 90% of 10MB limit | SafeStorage |
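
The networking row can be illustrated with a short retry helper. This is a sketch, not the extension's code: `fetchWithRetry`, `backoffDelay`, and the default values are assumptions, and "full jitter" (a random delay between zero and the exponential cap) is one common variant chosen for the example. `AbortSignal.timeout` is a standard API in current Chrome and in Node 18+.

```javascript
// Hypothetical sketch of retry with exponential backoff, jitter, and per-request timeout.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Full jitter: pick a random delay in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelay(attempt, baseMs = 500, capMs = 8000, random = Math.random) {
  return Math.floor(random() * Math.min(capMs, baseMs * 2 ** attempt));
}

async function fetchWithRetry(
  url,
  { retries = 3, timeoutMs = 5000, baseMs = 500, fetchImpl = fetch } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      // A fresh timeout signal per attempt, so one slow try cannot eat the whole budget.
      const response = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
      if (response.status < 500) return response; // success or client error: do not retry
      lastError = new Error(`HTTP ${response.status}`);
    } catch (err) {
      lastError = err; // network failure or timeout
    }
    if (attempt < retries) await sleep(backoffDelay(attempt, baseMs));
  }
  throw lastError;
}
```

The jitter matters when many clients retry at once: randomizing the delay spreads their requests out instead of having them all hit the server on the same schedule.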

Zero frameworks. Vanilla JS + ES6 modules. No bundler, no transpiler — the code in the repo is the code in the browser. Adding a new platform means writing one provider class and registering it in the factory.
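
As a rough picture of what "one provider class plus a factory registration" could look like, here is a hedged sketch. `Provider`, `YouTubeProvider`, `ProviderFactory`, and the parameter names being removed are all hypothetical; the real class names and rules live in the repository.

```javascript
// Hypothetical sketch of a provider chain; names and rules are illustrative.
class Provider {
  matches(url) { return false; } // does this provider handle the URL's domain?
  apply(url) { return url; }     // transform and hand the URL back
}

class YouTubeProvider extends Provider {
  matches(url) {
    return /(^|\.)youtube\.com$/.test(url.hostname);
  }
  apply(url) {
    // Assumed rule for the example: drop share tags, keep the video id and timestamp.
    for (const name of ["si", "feature"]) url.searchParams.delete(name);
    return url;
  }
}

class ProviderFactory {
  #providers = [];
  register(provider) {
    this.#providers.push(provider);
    return this; // allow chained registration
  }
  run(rawUrl) {
    // Each provider checks its domain match, applies its transform, and passes the URL on.
    return this.#providers
      .reduce((url, p) => (p.matches(url) ? p.apply(url) : url), new URL(rawUrl))
      .toString();
  }
}
```

Usage would then be `new ProviderFactory().register(new YouTubeProvider()).run(link)`, and URLs from other domains pass through unchanged.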

Tech Stack

- No dependencies in production: Chrome APIs + Fetch API + ES6
- Dual storage: chrome.storage.sync for settings, chrome.storage.local for sensitive data
- ClearURLs rule database (205 providers, 730+ rules, bundled + auto-updated)
- 1,400+ shortener domains from PeterDaveHello/url-shorteners (CC BY-SA 4.0)

Quick Start

git clone https://github.com/patatapython/link-cleaner-pro.git
cd link-cleaner-pro
npm install    # dev dependencies only (Jest)
npm test       # run test suite

Then load as unpacked extension in chrome://extensions/ (Developer mode ON).

Right-click any link → "Clean this link". Done.

Privacy

No telemetry. No analytics. No content scripts. No background monitoring. Everything runs locally. VirusTotal scans are opt-in and manual. Full policy: PRIVACY.md · How it works: HOW_IT_WORKS.md

Credits

- ClearURLs: open-source tracking parameter database with 205 providers and 730+ rules
- PeterDaveHello/url-shorteners: community-curated list of 1,400+ URL shortening services (CC BY-SA 4.0)
- VirusTotal: threat intelligence API scanning against 70+ antivirus engines

MIT License

