zahanzo/SigHunter

SigHunter: High-Performance Binary Pattern Analyzer

Project Overview

SigHunter is a dual-mode analysis tool written in C++ that scans compiled binaries and live process memory for specific hexadecimal patterns, commonly known as "signatures". Developed as a final project for CS50, it is built for reverse engineers and security analysts who need to identify compiler artifacts, packers, or malicious code fragments within Windows Portable Executable (PE) files and live RAM.

Unlike a simple string search, SigHunter is aware of the binary's structure. It parses the PE header to map every section, but the Global Deep Scan itself covers every byte of the file and does not depend on the header being intact. Structural corruption (such as XOR obfuscation of the header) therefore does not stop it from finding payloads, even ones hidden in the file overlay.

Core Features

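  • Dynamic Memory Analysis (RAM Scan): Attach the engine directly to a live Process ID (PID) to extract and scan raw memory using the Windows API (ReadProcessMemory). Because a packed program has to unpack itself in memory before it runs, this can reveal signatures that an on-disk packer hides.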
  • Global Deep Scan: SigHunter scans the entire byte array (from 0x0 to EOF). Matches are automatically tagged with their exact memory section or flagged as [In: OVERLAY/UNKNOWN] if found outside mapped regions.
  • PE Header Parsing: Automatically detects Windows PE files, validates the MZ and PE signatures, and calculates the exact bounds of all file sections.
  • Wildcard Support: Supports dynamic byte patterns (e.g., 60 E8 ?? ?? ?? ??) to match instructions with variable memory offsets or addresses.
  • Noise Reduction & Logging: Implements a "Smart Trim" feature that limits terminal output for high-frequency matches to prevent flooding, while allowing full data export to a raw log file.
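
The section tagging described above can be sketched as a simple range lookup. This is only an illustration: the README names a SectionRange type but does not list its fields, so the members below and the tagOffset helper are assumptions, not the project's actual code.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical layout: SectionRange is named in the README, but these
// members are assumed for the sake of the example.
struct SectionRange {
    std::string name;   // e.g. ".text"
    uint32_t    start;  // file offset of the first byte (PointerToRawData)
    uint32_t    size;   // number of bytes on disk (SizeOfRawData)
};

// Returns the name of the section containing `offset`, or
// "OVERLAY/UNKNOWN" when no mapped section covers it.
std::string tagOffset(const std::vector<SectionRange>& sections,
                      uint64_t offset) {
    for (const SectionRange& s : sections) {
        const uint64_t begin = s.start;
        const uint64_t end   = begin + s.size;  // exclusive upper bound
        if (offset >= begin && offset < end) {
            return s.name;
        }
    }
    return "OVERLAY/UNKNOWN";
}
```

With sections .text at 0x400 (0x200 bytes) and .data at 0x600 (0x100 bytes), an offset of 0x5FF is tagged .text, while 0x700 falls past the last section and is tagged OVERLAY/UNKNOWN.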

Technical Architecture

1. Project Structure

  • Scanner.hpp: Defines the engine's core data structures (SigByte, Signature, MatchRecord, SectionRange) and the Scanner class blueprint.
  • Scanner.cpp: The implementation layer containing the PE parser logic, memory reading, hexadecimal string compilation, and the sliding-window search algorithm.
  • SigHunter.cpp: The CLI entry point that handles argument parsing, input validation, and engine initialization.
  • signatures.txt: A modular external database following the format: [CATEGORY]|PATTERN|DESCRIPTION.
  • PoC's/: A folder containing Proof of Concept (PoC) material, including Python XOR obfuscators, a UPX-packed program, and a program that, when run, inserts a signature directly into its own memory and prints its PID, its base address, and the address where the payload was injected, so you can analyze it dynamically.
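
As an illustration of the signatures.txt format and the hexadecimal string compilation mentioned above, here is a minimal sketch. The field layouts of SigByte and Signature, and the function names compilePattern and parseSignatureLine, are assumptions made for this example; the real Scanner.cpp may differ. The category field is kept verbatim, brackets included, because the README does not say whether the brackets are literal.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Assumed shapes: the README names these types but not their fields.
struct SigByte {
    uint8_t value;     // literal byte; ignored when wildcard is true
    bool    wildcard;  // true for a "??" token
};

struct Signature {
    std::string          category;
    std::string          description;
    std::vector<SigByte> bytes;
};

// Converts "60 E8 ?? ??" into SigByte entries.
// Returns false on an empty pattern or a malformed token.
bool compilePattern(const std::string& pattern, std::vector<SigByte>& out) {
    out.clear();
    std::istringstream tokens(pattern);
    std::string tok;
    while (tokens >> tok) {
        if (tok == "??") {
            out.push_back({0, true});
            continue;
        }
        if (tok.size() != 2 ||
            !std::isxdigit(static_cast<unsigned char>(tok[0])) ||
            !std::isxdigit(static_cast<unsigned char>(tok[1]))) {
            return false;
        }
        out.push_back({static_cast<uint8_t>(std::stoul(tok, nullptr, 16)), false});
    }
    return !out.empty();
}

// Splits one "[CATEGORY]|PATTERN|DESCRIPTION" line into a Signature.
bool parseSignatureLine(const std::string& line, Signature& sig) {
    const std::size_t first = line.find('|');
    if (first == std::string::npos) return false;
    const std::size_t second = line.find('|', first + 1);
    if (second == std::string::npos) return false;

    if (!compilePattern(line.substr(first + 1, second - first - 1), sig.bytes)) {
        return false;
    }
    sig.category    = line.substr(0, first);
    sig.description = line.substr(second + 1);
    return true;
}
```

Rejecting malformed lines instead of guessing keeps a typo in the database from silently turning into a pattern that matches the wrong bytes.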

2. The Scanning Engine

The search logic utilizes a high-performance sliding window approach.

  • Static Mode: Loads the target binary into a std::vector<uint8_t> in RAM, preventing I/O bottlenecking during multi-signature scans.
  • Dynamic Mode: Uses EnumProcessModules and GetModuleInformation to determine the live image size, then copies the live process memory directly into the scanning buffer.
  • Comparison Logic: The engine iterates through the memory buffer. If a wildcard flag (??) is detected in the compiled signature, the engine skips that specific byte comparison, ensuring a match for dynamic instructions.
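
The comparison logic above can be sketched as a naive sliding window with a wildcard skip. The SigByte layout and the findMatches name are assumptions for illustration; this is not taken from Scanner.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed shape of a compiled signature byte.
struct SigByte {
    uint8_t value;     // literal byte; ignored when wildcard is true
    bool    wildcard;  // true for a "??" position
};

// Returns the start offset of every window in `buffer` that matches `sig`.
std::vector<std::size_t> findMatches(const std::vector<uint8_t>& buffer,
                                     const std::vector<SigByte>& sig) {
    std::vector<std::size_t> hits;
    if (sig.empty() || sig.size() > buffer.size()) return hits;

    const std::size_t last = buffer.size() - sig.size();  // last valid start
    for (std::size_t i = 0; i <= last; ++i) {
        bool ok = true;
        for (std::size_t j = 0; j < sig.size(); ++j) {
            // Wildcard positions are skipped; literal bytes must be equal.
            if (!sig[j].wildcard && buffer[i + j] != sig[j].value) {
                ok = false;
                break;
            }
        }
        if (ok) hits.push_back(i);
    }
    return hits;
}
```

The early size check keeps the subtraction from underflowing when the signature is longer than the buffer. The inner loop stops at the first mismatching literal byte, so most windows are rejected after one or two comparisons.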

3. Cross-Platform PE Parser

To report match locations accurately, SigHunter implements its own PE parser:

  • Validation: Checks the MZ signature (0x4D 0x5A), follows the e_lfanew pointer to the NT headers, and checks the PE signature there.
  • Section Mapping: Iterates through the Section Table to identify all segments (e.g., .rdata, .data), extracting PointerToRawData and SizeOfRawData. This allows the engine to accurately report exactly where a malicious signature is hiding.
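
A minimal sketch of such a header walk, using only byte reads and shifts, might look like the following. The offsets come from the published PE/COFF layout (e_lfanew at 0x3C, a 20-byte COFF header after the 4-byte PE signature, 40-byte section headers); the SectionRange fields and the helper names are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct SectionRange {            // assumed field layout
    std::string name;
    uint32_t    start;           // PointerToRawData
    uint32_t    size;            // SizeOfRawData
};

// Bounds-checked little-endian reads.
static bool readU16(const std::vector<uint8_t>& b, std::size_t off, uint16_t& out) {
    if (off + 2 > b.size()) return false;
    out = static_cast<uint16_t>(b[off] | (b[off + 1] << 8));
    return true;
}

static bool readU32(const std::vector<uint8_t>& b, std::size_t off, uint32_t& out) {
    if (off + 4 > b.size()) return false;
    out = static_cast<uint32_t>(b[off]) |
          (static_cast<uint32_t>(b[off + 1]) << 8) |
          (static_cast<uint32_t>(b[off + 2]) << 16) |
          (static_cast<uint32_t>(b[off + 3]) << 24);
    return true;
}

// Fills `out` with one entry per section header; returns false if the
// buffer does not look like a well-formed PE file.
bool parseSections(const std::vector<uint8_t>& file, std::vector<SectionRange>& out) {
    out.clear();
    if (file.size() < 0x40 || file[0] != 0x4D || file[1] != 0x5A) return false;  // "MZ"

    uint32_t lfanew = 0;
    if (!readU32(file, 0x3C, lfanew)) return false;        // e_lfanew
    const std::size_t pe = lfanew;
    if (pe + 24 > file.size()) return false;               // signature + COFF header
    if (file[pe] != 'P' || file[pe + 1] != 'E' ||
        file[pe + 2] != 0 || file[pe + 3] != 0) return false;

    uint16_t numSections = 0, optSize = 0;
    if (!readU16(file, pe + 6, numSections)) return false;  // NumberOfSections
    if (!readU16(file, pe + 20, optSize)) return false;     // SizeOfOptionalHeader

    const std::size_t table = pe + 24 + optSize;            // first section header
    for (uint16_t i = 0; i < numSections; ++i) {
        const std::size_t hdr = table + static_cast<std::size_t>(i) * 40;
        if (hdr + 40 > file.size()) return false;
        SectionRange s;
        for (std::size_t j = 0; j < 8 && file[hdr + j] != 0; ++j) {
            s.name.push_back(static_cast<char>(file[hdr + j]));
        }
        readU32(file, hdr + 16, s.size);    // SizeOfRawData (bounds checked above)
        readU32(file, hdr + 20, s.start);   // PointerToRawData
        out.push_back(s);
    }
    return true;
}
```

Because every read is bounds-checked against the buffer, a truncated or deliberately corrupted header makes the function return false instead of reading out of range. The caller can then fall back to scanning the raw bytes without section tags.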

Usage

Build

To compile the project using a standard C++ compiler (like g++ or MSVC):

```bash
g++ SigHunter.cpp Scanner.cpp -o sighunter.exe -lpsapi
```

(Note: -lpsapi is required for the Windows Process Status API used in RAM scanning).

Execution

1. Static Scan (File on Disk)

Run a static scan with full log export:

```bash
./sighunter.exe -t target_binary.exe -w signatures.txt -o full_report.txt
```

2. Dynamic Scan (Live Process Memory)

Run a scan directly against a running Process ID (PID), passing the PID after -p:

```bash
./sighunter.exe -p <PID> -w signatures.txt -o mem_report.txt
```

How to Test (Proof of Concept)

To comply with security policies and avoid distributing pre-compiled or potentially flagged executables (.exe), the tests/ directory contains the source code and automation scripts to build the test environment locally.

  1. Navigate to the tests/ folder.
  2. Run the provided batch script to generate the targets:

```bash
./setup_tests.bat
```

This script requires g++ and Python 3. It will automatically:
  • Compile the hello.cpp source code into a base binary (hello_clean.exe).
  • Execute a custom Python script (PoC_corrupt.py) that uses an XOR cipher to corrupt the PE header and injects a custom 0xRobert payload into the file overlay (hello_infected.exe).
  • Compile Poc.cpp into Poc.exe for a dynamic PoC example.

You can then run sighunter.exe against these generated binaries, or run the dynamic PoC (Poc.exe) and scan its PID to see the dynamic RAM extraction in action.

Design Decisions

  • Memory vs. Disk: Loading the binary into a RAM buffer was prioritized to ensure that multi-signature scans are almost instantaneous.
  • Zero-Dependency Parsing: The PE parser was written from scratch using bitwise operations rather than the structures declared in <windows.h>, so the static analysis is portable and can examine Windows binaries on Linux systems.
  • UI/UX for Analysts: The "Smart Trim" logic was a critical addition; during testing, standard compiler padding generated over 14,000 matches. Trimming this noise in the terminal while logging it to a file ensures the tool is usable for human analysts.

About the Author

Robert Souza Lages is an undergraduate student in Systems Analysis and Development at UniSenac (Pelotas, Brazil). This project reflects a deep interest in low-level software engineering, C++, memory management, and the internal structures of operating systems.

About

A high-performance C++ binary analysis engine for deep static scanning and live process memory inspection. (CS50 Final Project)
