WaybackWebSift

A tool that scrapes emails, phone numbers, and links from a given URL, either passively from archived sources or actively by fetching the URL directly. This project is a Python rewrite of WebSift by s-r-e-e-r-a-j.

Features

Scraping Emails: Extract emails from visible text as well as from mailto: links.
Scraping Phone Numbers: Extract phone numbers found in visible text and from tel: links.
Scraping Links: Extract HTTP and HTTPS links from the page.
Passive Recon: Fetch content from archived sources using the Wayback Machine or archive.is.
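
The extraction features above can be illustrated with a minimal sketch that uses only the Python standard library. This is not the script's actual implementation: the names ContactParser and scrape, and both regular expressions, are assumptions made for illustration, and the patterns are deliberately simple.

```python
import re
from html.parser import HTMLParser

# Illustrative patterns only; real emails and phone numbers have many
# more valid forms than these cover.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


class ContactParser(HTMLParser):
    """Hypothetical helper: collects emails, phone numbers, and HTTP(S) links."""

    def __init__(self):
        super().__init__()
        self.emails = set()
        self.phones = set()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith("mailto:"):
            # Drop the scheme and any "?subject=..." query part.
            self.emails.add(href[len("mailto:"):].split("?")[0])
        elif href.startswith("tel:"):
            self.phones.add(href[len("tel:"):])
        elif href.startswith(("http://", "https://")):
            self.links.add(href)

    def handle_data(self, data):
        # Text nodes; this sketch does not skip <script> or <style> content.
        self.emails.update(EMAIL_RE.findall(data))
        self.phones.update(m.strip() for m in PHONE_RE.findall(data))


def scrape(html):
    """Parse an HTML string and return the populated parser."""
    parser = ContactParser()
    parser.feed(html)
    parser.close()
    return parser


sample = (
    '<p>Contact: info@example.com or +1 555-010-9999</p>'
    '<a href="mailto:sales@example.com?subject=Hi">Mail</a>'
    '<a href="tel:+15550100000">Call</a>'
    '<a href="https://example.com/about">About</a>'
)
result = scrape(sample)
print(sorted(result.emails))
print(sorted(result.phones))
print(sorted(result.links))
```

Using sets removes duplicates when the same address appears both in the page text and in a mailto: link.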

Requirements

The project requires Python 3 and the packages listed in requirements.txt, which is provided for easy installation.

Installation

Clone the repository:

git clone https://github.com/yetanotherf0rked/waybackwebsift.git
cd waybackwebsift

Install the required packages:

pip install -r requirements.txt

Usage

Run the main script:

python waybackwebsift.py

Follow the interactive prompts to choose the URL, the archive source (if any), the data to scrape, and whether to save the results to a folder of your choice.

Known Issues

archive.today responds to requests with a 302 redirect and a timeout before the archived URL is returned. The script does not support this yet.
Links extracted from archived pages are prefixed with the archive's URL.
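
For Wayback Machine snapshots, rewritten links typically take the form https://web.archive.org/web/<timestamp>/<original URL>. One possible workaround for the link issue, which is not part of the script, is to strip that archive portion in a post-processing step. The sketch below assumes that URL shape; the function name strip_wayback_prefix is hypothetical, and archive.today links use a different format that it does not handle.

```python
import re

# Assumed Wayback Machine snapshot URL shape:
#   https://web.archive.org/web/<timestamp><optional modifier>/<original URL>
# where the modifier is two letters plus an underscore, such as "id_".
WAYBACK_PREFIX_RE = re.compile(
    r"^https?://web\.archive\.org/web/\d{1,14}(?:[a-z]{2}_)?/(.+)$"
)


def strip_wayback_prefix(link):
    """Return the original URL for a Wayback snapshot link, else the link unchanged."""
    match = WAYBACK_PREFIX_RE.match(link)
    return match.group(1) if match else link


archived = "https://web.archive.org/web/20200101000000/https://example.com/page"
print(strip_wayback_prefix(archived))
```

Links that do not match the assumed pattern are returned untouched, so the function can be applied to every extracted link regardless of the source.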

License

MIT
