
openSquat Core





🎯 What is openSquat?

openSquat is an Open Source Intelligence (OSINT) security tool that identifies cyber squatting threats targeting your brand or domains:

| Threat Type | Description |
|---|---|
| 🎣 Phishing | Fraudulent domains mimicking your brand |
| 🔀 Typosquatting | Domains with common typos (e.g., gooogle.com) |
| 🌐 IDN Homograph | Look-alike characters from other alphabets |
| 👥 Doppelgänger | Domains containing your brand name |
| 🔀 Bitsquatting | Single-bit errors in domain names |
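
To make the bitsquatting row concrete, here is a minimal sketch of how single-bit errors turn one domain label into another. It is an illustration only, not openSquat's detection code; the function name `bitsquat_variants` and the allowed-character set are assumptions made for this example.

```python
import string

# Characters allowed in a DNS label (assumption for this sketch: ASCII letters, digits, hyphen)
ALLOWED = set(string.ascii_lowercase + string.digits + "-")


def bitsquat_variants(label: str) -> list[str]:
    """Return labels that differ from `label` by exactly one flipped bit."""
    variants = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            # Flip one bit of the character's byte value
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            # DNS is case-insensitive, so a flip that only changes case is not a new name
            if flipped in ALLOWED and flipped != ch:
                variants.add(label[:i] + flipped + label[i + 1:])
    return sorted(variants)


if __name__ == "__main__":
    for candidate in bitsquat_variants("google"):
        print(candidate + ".com")
```

For example, `g` (0x67) with its lowest bit flipped becomes `f` (0x66), so `foogle` is one of the variants of `google`.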

🌟 Featured In

"A powerful swiss army knife for brand protection" - WhoisXML API Blog, August 2022

"A tool with insane power to fight typosquatting and all related types of cyber mischief." - WhoisXML API Blog, August 2022

"A handy tool for collecting information on newly registered domains." - ranked Top 5 phishing detection tool - SOCRadar Blog, July 2022

"openSquat provides essential protection against domain squatting and phishing attacks through automated monitoring and detection." - Prince Yadav, TutorialsPoint, March 2026

Academic Citation

"OpenSquat identified 103 squatting domains, 960 active phishing websites, and 53 domains with suspicious certificates." - Sharma et al., Journal of Information Security and Cybercrimes Research (JISCR), Vol. 7, Issue 1, June 2024


🔓 Open-Core Model

openSquat follows an open-core model:

  • Core detection engine: Open source and community-driven
  • Advanced capabilities: Delivered through commercial intelligence services

This model enables transparency and community collaboration while supporting the scale, reliability, and operational requirements of enterprise use.


✨ Key Features

  • 📅 Daily NRD feeds: Automatic newly registered domain updates
  • 🔍 Similarity detection: Levenshtein distance algorithm
  • 🔓 Three operating modes: Community (free feed), Premium Feed (paid feed, same local pipeline), or Premium API (hosted lookalike service). The two Premium modes share a single openSquat API key; see Premium and API Modes.
  • 🛡️ VirusTotal integration: Check domain reputation
  • 🌐 Quad9 DNS validation: Identify malicious domains
  • 📜 Certificate Transparency: Monitor SSL/TLS certificates
  • 📊 Multiple output formats: TXT, JSON, CSV
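
The similarity detection listed above is based on Levenshtein distance: the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into another. The following is a minimal textbook sketch of that metric, not openSquat's own implementation; how the tool normalises domains or picks thresholds per confidence level is not shown here.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping only two rows in memory."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution (or match)
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("google", "gooogle"))
```

A small distance between a keyword and a newly registered domain label suggests a possible look-alike, which is why stricter settings return fewer results and looser ones return more false positives.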

🚀 Quick Start

Install via pip (recommended)

pip install opensquat
opensquat -k keywords.txt

Or clone the repository

git clone https://github.com/atenreiro/opensquat
cd opensquat
pip install -r requirements.txt
python3 opensquat.py -k keywords.txt

Repo users: in all the examples below, replace opensquat with python3 opensquat.py to run from a cloned checkout.


📦 Requirements

  • Python 3.10+
  • Dependencies: confusable_homoglyphs, homoglyphs, colorama, requests, dnspython, beautifulsoup4

📖 Usage

Basic Commands

# Default run
opensquat

# Show all options
opensquat -h

# Use custom keywords file
opensquat -k my_keywords.txt

Validation Options

# DNS validation via Quad9
opensquat --dns

# Check Certificate Transparency logs
opensquat --ct

# Scan for open ports (80/443)
opensquat --portcheck

# Cross-reference phishing databases
opensquat --phishing results.txt

Output Formats

# Save as JSON
opensquat -o results.json -t json

# Save as CSV
opensquat -o results.csv -t csv

Confidence Levels

| Level | Flag | Description |
|---|---|---|
| 0 | -c 0 | Very high (fewer results, high accuracy) |
| 1 | -c 1 | High (default) |
| 2 | -c 2 | Medium |
| 3 | -c 3 | Low |
| 4 | -c 4 | Very low (more results, more false positives) |

Note: On the API side (--api), the five confidence levels map to four fuzziness values (exact, low, auto, high); -c 3 and -c 4 both map to high. See Premium and API Modes for the full mapping and how to override with --api-fuzziness.


💎 Premium and API Modes

openSquat supports three modes. The default (Community) is unchanged, so existing users need no flags. The two Premium modes share a single openSquat API key; pick Premium Feed if you want the same local detection pipeline with a larger feed, or Premium API if you want server-side detection with no local feed download.

| Mode | Flag | What it does |
|---|---|---|
| Community (default) | (none) | Downloads the free NRD feed (~100k domains/day) and runs local Levenshtein detection. |
| Premium Feed | --premium | Downloads the paid NRD feed (nrd-lite, much larger) using your openSquat API key, then runs the same local Levenshtein detection. |
| Premium API | --api | Skips local feed download. Queries the openSquat lookalike REST API per keyword and returns server-side matches. |

Get an API key

Sign up at opensquat.com to get a key. The same key works for both Premium Feed (--premium) and Premium API (--api).

Provide the API key (priority order)

  1. --api-key YOUR_KEY on the command line
  2. OPENSQUAT_API_KEY environment variable
  3. api_key.txt in the current directory (one key per file, # comments allowed)

The CLI flag is visible in ps output. Prefer the env var or key file in shared environments.
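
The priority order above can be sketched as a small lookup function. This is an illustrative sketch, not the code openSquat ships: the function name `resolve_api_key` and its parameters are hypothetical, and only the order (flag, then environment variable, then file) and the `#` comment rule come from this README.

```python
import os
from pathlib import Path


def resolve_api_key(cli_key=None, env=None, key_file="api_key.txt"):
    """Return the first key found: CLI flag, then env var, then key file; else None."""
    env = os.environ if env is None else env

    # 1. Value passed via --api-key
    if cli_key:
        return cli_key.strip()

    # 2. OPENSQUAT_API_KEY environment variable
    env_key = env.get("OPENSQUAT_API_KEY", "").strip()
    if env_key:
        return env_key

    # 3. api_key.txt in the current directory: first non-comment, non-empty line
    path = Path(key_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                return line
    return None


if __name__ == "__main__":
    print("key found" if resolve_api_key() else "no key configured")
```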

Examples

# Premium Feed mode: same local pipeline, larger feed
export OPENSQUAT_API_KEY=os_xxxxxxxxxxxx
opensquat -k keywords.txt --premium

# Premium API mode: server-side detection per keyword
opensquat -k keywords.txt --api

# Premium API + DNS reputation check on each returned domain
opensquat -k keywords.txt --api --dns

# Premium API with JSON output grouped by keyword
opensquat -k keywords.txt --api -t json -o results.json

# Tune the Premium API search
opensquat -k keywords.txt --api --api-fuzziness high --api-history-days 7 --api-max-results 200

When --premium or --api successfully loads a key, the CLI prints a masked confirmation line so you can verify which key was picked up without leaking it:

[*] API key loaded: os_gL...L5Mb

In Premium API mode, the run summary reports the active mode, the number of API calls made, and your remaining balance with usage delta (for example, 4972 (used 4 of 4976 this run)). Per-keyword progress lines appear in the same order as your keywords file even though the calls run in parallel. Quota exhaustion (HTTP 429) returns partial results gracefully; auth errors (401) and plan errors (403) abort with a clear message.

If the backend rate-limits your request (HTTP 429 with a Retry-After header), the tool distinguishes it from quota exhaustion: you'll see a yellow [!] Rate limit hit (retry in Ns) warning instead of the red quota exhausted message, partial results are still returned, and the summary preserves your real API balance so you can see exactly how many credits you actually used. To avoid triggering rate limits on large scans, pass --api-rate-limit N to cap outbound requests per second across all workers. A value of 8 is a safe starting point for most backends.

# Throttle to 8 requests/second across all workers
opensquat -k keywords.txt --api --api-rate-limit 8
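
To illustrate what a cap on "requests per second across all workers" can mean, here is a minimal sketch of a shared, thread-safe limiter that spaces calls evenly. It is an assumption about one reasonable design, not a description of how --api-rate-limit is implemented; the class name `RateLimiter` and its injectable `clock` and `sleep` parameters are hypothetical.

```python
import threading
import time


class RateLimiter:
    """Allow at most `rate` acquisitions per second, shared by all threads."""

    def __init__(self, rate: float, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def acquire(self) -> None:
        # Reserve the next free time slot under the lock, then wait outside it
        with self._lock:
            now = self._clock()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)


if __name__ == "__main__":
    limiter = RateLimiter(8)  # mirrors --api-rate-limit 8
    start = time.monotonic()
    for _ in range(4):
        limiter.acquire()  # each worker would call this before sending a request
    print(f"4 calls took {time.monotonic() - start:.3f}s")
```

Because every worker calls `acquire()` on the same object, the cap applies to the total outbound rate rather than per thread.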

Output format recommendation

JSON is the recommended output format for Premium API mode because the API returns per-domain metadata that the other formats cannot carry as cleanly: the registered TLD, the NRD first-seen date, an IDN homograph flag, and the unicode rendering of the homograph when the domain is one.

opensquat -k keywords.txt --api -t json -o results.json

Example of the richer output in Premium API mode (trimmed):

[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "securite-microsoft.fr", "tld": "fr", "date": "09-04-2026", "idn": false},
      {"domain": "xn--mirosoft-hw7c.com", "tld": "com", "date": "09-04-2026", "idn": true, "unicode": "miᴄrosoft.com"}
    ]
  }
]

The idn flag plus the unicode rendering let you see at a glance that xn--mirosoft-hw7c.com actually contains ᴄ (Latin Letter Small Capital C) impersonating the c in "microsoft", information that a plain punycode string completely hides.
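
As a sketch of how the idn flag might be used downstream, the snippet below filters the example JSON above for homograph hits. The field names (keyword, domains, domain, idn, unicode) are taken from that example; the helper name `idn_hits` is hypothetical, and in practice you would load results.json instead of the inline sample.

```python
import json

# Inline copy of the trimmed example above; replace with open("results.json") in real use
SAMPLE = """
[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "securite-microsoft.fr", "tld": "fr", "date": "09-04-2026", "idn": false},
      {"domain": "xn--mirosoft-hw7c.com", "tld": "com", "date": "09-04-2026", "idn": true, "unicode": "miᴄrosoft.com"}
    ]
  }
]
"""


def idn_hits(results):
    """Return (keyword, punycode domain, unicode rendering) for every IDN homograph entry."""
    hits = []
    for group in results:
        for entry in group.get("domains", []):
            if entry.get("idn"):
                hits.append((group["keyword"], entry["domain"], entry.get("unicode", "")))
    return hits


if __name__ == "__main__":
    for keyword, domain, rendered in idn_hits(json.loads(SAMPLE)):
        print(f"{keyword}: {domain} renders as {rendered}")
```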

CSV output is also supported and produces one row per domain with the same metadata columns, which suits analysts working in Excel or pandas:

opensquat -k keywords.txt --api -t csv -o results.csv

The CSV is written with a UTF-8 BOM so Excel on Windows correctly renders the unicode homograph column.
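
If you read that CSV from Python rather than Excel, the BOM needs to be stripped or it ends up glued to the first column name. A minimal sketch, assuming only that the file has a header row (the exact column names openSquat writes are not listed here); the helper name `read_results_csv` is hypothetical.

```python
import csv


def read_results_csv(path):
    """Read a BOM-prefixed UTF-8 CSV into a list of dicts keyed by the header row."""
    # "utf-8-sig" consumes the leading BOM so the first header name stays clean
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for row in read_results_csv(sys.argv[1]):
            print(row)
```

With pandas, passing `encoding="utf-8-sig"` to `pandas.read_csv` has the same effect.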

Community and Premium Feed modes emit the same JSON top-level shape for cross-mode consistency, but with only the domain field populated per entry, because the NRD feed does not carry the per-domain metadata that only the hosted API has:

[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "mirosoft.com"},
      {"domain": "mcrosoft.net"}
    ]
  }
]

If you pass --api-key without also selecting --premium or --api, the CLI prints a one-line hint that the key will be ignored in Community mode (no silent mode-switching).

In Premium API mode, -c/--confidence is auto-mapped to API fuzziness (0→exact, 1→low, 2→auto, 3→high, 4→high). Use --api-fuzziness to override.

Premium API (--api) is incompatible with --doppelganger and -d/--domains.


⚙️ Configuration

Keywords File (keywords.txt)

# Lines starting with # are comments
mycompany
mybrand
myproduct
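
A minimal sketch of how a file in this format can be parsed, useful if you generate or validate keywords.txt from another script. It assumes one keyword per line and that blank lines are skipped (the blank-line rule is an assumption); the function name `parse_keywords` is hypothetical.

```python
def parse_keywords(text: str) -> list[str]:
    """Return keywords from file content, ignoring # comments and blank lines."""
    keywords = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        keywords.append(line)
    return keywords


if __name__ == "__main__":
    example = "# Lines starting with # are comments\nmycompany\nmybrand\nmyproduct\n"
    print(parse_keywords(example))
```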

VirusTotal API Key (vt_key.txt)

To use --vt or --subdomains, add your API key:

# Get your free API key at https://www.virustotal.com
your_api_key_here

openSquat API Key (api_key.txt)

Required for --premium and --api. Create an api_key.txt file in the working directory:

# Get your key at https://opensquat.com
# Lines starting with # are ignored; the first non-comment line is used.
os_your_key_here

The CLI resolves the key in this order: --api-key flag → $OPENSQUAT_API_KEY environment variable → api_key.txt file. The env var and file methods are preferred over the CLI flag in shared environments, since CLI arguments are visible via ps.


🤖 Automation

Run daily via crontab:

# pip-installed (recommended): every day at 8 AM; feeds update ~7:30 AM UTC
0 8 * * * cd /path/to/workdir && opensquat -k keywords.txt -o results.json -t json

# Repo checkout: invoke opensquat.py directly with python3
0 8 * * * cd /path/to/opensquat && python3 opensquat.py -k keywords.txt -o results.json -t json

The cd into a working directory matters if you rely on api_key.txt (resolved from the current directory) or want results.json written to a specific place.


📋 CLI Reference

| Argument | Default | Description |
|---|---|---|
| -k, --keywords | keywords.txt | Keywords file to search |
| -o, --output | results.txt | Output filename |
| -t, --type | txt | Output format: txt, json, csv |
| -c, --confidence | 1 | Confidence level (0-4). In --api mode this is auto-mapped to fuzziness (-c 3 and -c 4 both → high). |
| -d, --domains | - | Use local domain file instead of downloading |
| -u, --url | opensquat feed | URL to download domain feed |
| --dns | - | Enable Quad9 DNS validation |
| --doppelganger | - | Doppelganger-only mode (keyword in domain + reachability check) |
| --ct | - | Search Certificate Transparency logs |
| --phishing | - | Cross-reference phishing database |
| --subdomains | - | Fetch subdomains via VirusTotal |
| --portcheck | - | Check for open ports 80/443 |
| --vt | - | Validate against VirusTotal |
| --premium | - | Premium Feed mode: use the paid NRD feed (requires openSquat API key) |
| --api | - | Premium API mode: query the openSquat lookalike REST API per keyword (no local feed) |
| --api-key | - | openSquat API key (or set $OPENSQUAT_API_KEY, or use api_key.txt) |
| --api-fuzziness | (from -c) | Premium API mode: exact, low, high, or auto |
| --api-history-days | - | Premium API mode: NRD history window in days (clipped to plan cap) |
| --api-max-results | - | Premium API mode: max results per keyword (clipped to plan cap) |
| --api-rate-limit | (unlimited) | Premium API mode: max outbound requests per second across all workers |

🤝 Contributing

We welcome contributions! See our Contributing Guide for details.

  • 🐛 Report bugs via GitHub Issues
  • 💡 Request features by opening an issue
  • 🔧 Submit PRs for bug fixes or enhancements
  • 📝 Release notes: see the CHANGELOG for what's new in each version

👤 Author

Andre Tenreiro - LinkedIn · PGP Key


📜 License

This project is licensed under the GNU GPL v3.

About

openSquat is an open-source tool for detecting domain look-alikes by searching newly registered domains that might be impersonating legitimate domains and brands.
