This document describes the current boundaries of memdump-toolkit: what it cannot do, where detection has gaps, and what constraints are inherent to offline minidump analysis.
The toolkit parses .dmp files via the minidump library. It cannot process:
- Full crash dumps (complete memory images with kernel data)
- Linux/macOS core dumps (`core.*` files)
- VMware memory snapshots (`.vmem`, `.vmss`)
- Hyper-V saved state files (`.vsv`, `.bin`)
- Volatility-compatible raw memory images
- Hibernation files (`hiberfil.sys`)
Each dump represents one process. There is no cross-process correlation — you cannot trace an injection chain (process A injected into process B) from a single dump. Analyzing multiple dumps requires running the tool separately on each and manually correlating the IOC exports.
The toolkit only has the memory-resident state of the process. It cannot access the original on-disk files for comparison. This is the single largest architectural constraint and blocks several detection techniques (see §3).
Inline hooks (JMP patches at function entry points) and IAT hooks (import table modifications) are invisible without the on-disk originals to diff against. pe-sieve detects these by comparing in-memory module bytes against the on-disk PE, page by page. Since minidumps do not include the on-disk files, this comparison is not possible.
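To make the missing capability concrete, the following is a minimal sketch of a page-by-page comparison, assuming both the in-memory module bytes and a correctly mapped and relocated copy of the on-disk PE were available. The function name `diff_pages` and the synthetic buffers are illustrative only; this is not how pe-sieve or memdump-toolkit is implemented.

```python
PAGE_SIZE = 0x1000  # typical x86/x64 page size


def diff_pages(in_memory: bytes, reference: bytes, page_size: int = PAGE_SIZE) -> list[int]:
    """Return the offsets of pages whose bytes differ between the two images.

    Assumes `reference` is the on-disk PE already mapped to its virtual layout
    with relocations applied; a raw file read would differ on almost every page.
    """
    length = min(len(in_memory), len(reference))
    changed = []
    for offset in range(0, length, page_size):
        if in_memory[offset:offset + page_size] != reference[offset:offset + page_size]:
            changed.append(offset)
    return changed


# Synthetic example: a 3-page "module" with a 5-byte JMP patch in the second page.
clean = bytes(3 * PAGE_SIZE)
patched = bytearray(clean)
patched[0x1010:0x1015] = b"\xE9\x00\x10\x00\x00"  # JMP rel32 at a hypothetical function entry
patched = bytes(patched)

print([hex(offset) for offset in diff_pages(patched, clean)])
```

With only a minidump there is no `reference` buffer to pass in, which is why this class of check is out of reach.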
The stack frame walker (CHECK 9) identifies return addresses outside known modules, which catches classic shellcode-on-stack scenarios. It does not detect return-oriented programming (ROP) within legitimate modules — where every return address points into a valid DLL but the sequence of gadgets forms a malicious payload. ROP detection would require control flow analysis of the return address chain against known gadget databases.
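The check described above can be sketched as a range lookup. This is a simplified illustration, not the toolkit's actual CHECK 9 code: the function names, the `(base, size, name)` tuple layout, and the addresses are all assumptions, and module ranges are assumed not to overlap.

```python
from bisect import bisect_right


def owning_module(addr, bases, ordered):
    """Return the name of the module containing addr, or None."""
    index = bisect_right(bases, addr) - 1
    if index >= 0:
        base, size, name = ordered[index]
        if base <= addr < base + size:
            return name
    return None


def unbacked_return_addresses(return_addrs, modules):
    """Return the addresses that fall outside every (base, size, name) module range."""
    ordered = sorted(modules)
    bases = [module[0] for module in ordered]
    return [addr for addr in return_addrs if owning_module(addr, bases, ordered) is None]


# Hypothetical module list and candidate return addresses.
modules = [
    (0x7FF600000000, 0x20000, "app.exe"),
    (0x7FFA10000000, 0x1F0000, "ntdll.dll"),
]
shellcode_stack = [0x7FF600001234, 0x000001A2B3C40010, 0x7FFA10012345]
rop_chain = [0x7FFA10001111, 0x7FFA10002222, 0x7FFA10003333]

print([hex(addr) for addr in unbacked_return_addresses(shellcode_stack, modules)])
print(unbacked_return_addresses(rop_chain, modules))
```

The second call returns an empty list: every gadget address lies inside `ntdll.dll`, so a range check alone has nothing to flag.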
Executable MEM_MAPPED regions outside modules receive LOW severity. These could be legitimate memory-mapped sections (fonts, locale data) or injected code. Without the backing file path (which minidumps do not always include), there is no way to distinguish between the two.
The stack frame walker now uses .pdata/.xdata unwind information (Phase 0) for precise x64 stack walking, falling back to frame pointer chains and heuristic scan when unwind data is unavailable. Current limitations:
- x64 only: 32-bit processes use SEH exception chains (a different format), not `.pdata`. These still rely on frame pointer walking + stack scan.
- Requires readable `.pdata`: if the module's `.pdata` section is not captured in the dump (truncated dump), the walker falls back to heuristic methods.
- Prolog/epilog edge case: if a thread is suspended mid-prolog or mid-epilog, the unwind delta may be slightly wrong. This is rare in practice (threads are typically in function bodies or system calls).
The headerless PE scanner finds binaries with zeroed MZ headers by locating section table patterns. It has these limits:
- Requires at least 2 intact section headers with valid names or printable ASCII
- If the attacker zeroes the entire PE header AND the section table, recovery is not possible
- Scan window is capped at 8 KB (0x2000) from segment start — PEs with unusually large optional headers pushing the section table beyond this offset will be missed (rare in practice)
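The limits above follow from how a section-table search works. The sketch below is one possible way to implement such a search, not the toolkit's actual scanner: the plausibility rules (printable name, non-zero page-aligned virtual address, non-zero characteristics), the 4-byte scan step, and all function names are assumptions made for illustration.

```python
import struct

# IMAGE_SECTION_HEADER: Name[8], VirtualSize, VirtualAddress, SizeOfRawData,
# PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics (40 bytes).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")
SCAN_WINDOW = 0x2000  # 8 KB cap from segment start
MIN_SECTIONS = 2      # at least 2 intact headers required


def plausible_name(raw: bytes) -> bool:
    """Non-empty, NUL-padded, printable ASCII section name."""
    name = raw.rstrip(b"\x00")
    return bool(name) and all(0x20 <= byte < 0x7F for byte in name)


def plausible_header(buf: bytes, offset: int) -> bool:
    if offset + SECTION_HEADER.size > len(buf):
        return False
    name, vsize, vaddr, raw_size, _raw_ptr, *_unused, chars = SECTION_HEADER.unpack_from(buf, offset)
    # Assumed heuristics: page-aligned RVA, some size, some characteristics.
    return (plausible_name(name) and vaddr != 0 and vaddr % 0x1000 == 0
            and chars != 0 and bool(vsize or raw_size))


def find_section_table(segment: bytes):
    """Return (offset, header_count) for the first plausible run, or None."""
    limit = min(len(segment), SCAN_WINDOW)
    for start in range(0, limit, 4):
        count = 0
        while plausible_header(segment, start + count * SECTION_HEADER.size):
            count += 1
        if count >= MIN_SECTIONS:
            return start, count
    return None


def make_header(name: bytes, vaddr: int, size: int, chars: int) -> bytes:
    return SECTION_HEADER.pack(name, size, vaddr, size, vaddr, 0, 0, 0, 0, chars)


# Synthetic segment: zeroed PE header followed by two intact section headers.
table = make_header(b".text", 0x1000, 0x200, 0x60000020) + make_header(b".data", 0x2000, 0x200, 0xC0000040)
segment = bytes(0x1F8) + table + bytes(0x400)
print(find_section_table(segment))
```

A segment with the same table placed beyond the 0x2000 window, or with the table itself zeroed, yields `None`, which mirrors the second and third limits listed above.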
Memory-dumped DLL hashes do not match their on-disk originals due to relocation, IAT patching, and page zeroing applied by the Windows loader. This means:
- On-disk hash databases (VirusTotal, etc.) will not match memory-dumped files
- The `--known-good` feature requires hashes computed from memory dumps of clean systems, not from on-disk files
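A small synthetic example shows why a single loader fixup is enough to break hash matching. The buffer, base addresses, and `apply_relocations` helper below are invented for illustration and do not parse a real PE relocation table.

```python
import hashlib
import struct


def apply_relocations(image: bytes, fixup_offsets, delta: int) -> bytes:
    """Add delta to each 64-bit absolute pointer, as a base relocation would."""
    out = bytearray(image)
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<Q", out, offset)
        struct.pack_into("<Q", out, offset, (value + delta) & 0xFFFFFFFFFFFFFFFF)
    return bytes(out)


preferred_base = 0x180000000   # hypothetical ImageBase stored in the file
actual_base = 0x7FFA10000000   # hypothetical load address chosen by ASLR

on_disk = bytearray(0x100)
struct.pack_into("<Q", on_disk, 0x40, preferred_base + 0x2000)  # one absolute pointer
on_disk = bytes(on_disk)

in_memory = apply_relocations(on_disk, [0x40], actual_base - preferred_base)

disk_hash = hashlib.sha256(on_disk).hexdigest()
memory_hash = hashlib.sha256(in_memory).hexdigest()
print(disk_hash == memory_hash)
```

Only 8 of the 256 bytes change, yet the two digests are unrelated, so a known-good set built from on-disk files cannot match dumped modules.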
The C2 hunt scans raw process memory for indicators (URLs, IPs, hostnames, private keys) but cannot validate findings against external data sources:
- No PCAP/network capture correlation
- No DNS log cross-referencing
- No firewall rule validation
- No file system artifact inspection
Extracted IOCs (hashes, IPs, URLs) are not checked against external threat intelligence feeds. Integration with VirusTotal, AbuseIPDB, OTX, or similar services would transform output from "suspicious indicators" to "confirmed known-bad indicators" but requires network access and API keys.
The toolkit identifies what is suspicious (injection indicators, malicious modules, C2 artifacts) but does not reconstruct the order of events. PE timestamps and module load order provide hints, but a true attack timeline would require correlating with ETW traces, event logs, or Sysmon data — none of which are present in a minidump.
Module analysis (Step 3) uses ProcessPoolExecutor to analyze binaries across CPU cores (up to 4 workers). Each worker compiles YARA rules once on startup, so there is a per-worker initialization cost. For dumps with fewer than 4 binaries, analysis runs sequentially. On single-core systems or restricted environments where multiprocessing is unavailable, the toolkit falls back to sequential automatically.
Community rulesets (6 repositories, thousands of rule files) are compiled individually per scan. There is no rule caching between binaries or between runs. Pre-compiling rules into a binary cache on first use would significantly reduce per-binary scan time.
Large known-good hash sets (millions of entries) loaded as a Python set[str] consume significant RAM. For very large hash sets, a bloom filter or SQLite-backed lookup would be more appropriate than in-memory storage.
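As a rough illustration of the SQLite-backed alternative, the sketch below stores hashes in an indexed table and queries them one at a time instead of holding the full set in RAM. The table name, function names, and sample hashes are assumptions; the toolkit does not currently ship this.

```python
import sqlite3


def build_hash_db(path: str, hashes) -> sqlite3.Connection:
    """Create (or open) a hash database and insert the given SHA-256 hex strings."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS known_good (sha256 TEXT PRIMARY KEY) WITHOUT ROWID"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO known_good VALUES (?)",
        ((h.lower(),) for h in hashes),
    )
    conn.commit()
    return conn


def is_known_good(conn: sqlite3.Connection, sha256_hex: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM known_good WHERE sha256 = ? LIMIT 1", (sha256_hex.lower(),)
    ).fetchone()
    return row is not None


# ":memory:" keeps the example self-contained; a real deployment would use a file path.
sample = ["AA" * 32, "bb" * 32]
conn = build_hash_db(":memory:", sample)
print(is_known_good(conn, "aa" * 32), is_known_good(conn, "cc" * 32))
```

Lookups go through the primary-key index, so memory use stays roughly constant regardless of how many hashes are stored, at the cost of a query per binary.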
These limitations are inherent to analyzing a minidump after the fact and cannot be solved by the toolkit itself; some can only be mitigated at capture time or with a different data source:
| Limitation | Why | What would fix it |
|---|---|---|
| No disk baseline comparison | Minidumps contain only in-memory state | Live-process scanning (pe-sieve) or full memory image + disk access |
| No handle/object information | MiniDumpNormal does not include handle data | MiniDumpWithHandleData flag at capture time |
| No kernel memory | Minidumps are user-mode only | Full memory forensics (Volatility) |
| Incomplete stack memory | Not all minidump types include full thread stacks | MiniDumpWithFullMemory at capture time |
| No network connection state | TCP/UDP table is not in the minidump | Capture with ProcDump /ma + netstat, or use Volatility |
| Relocated module hashes | Windows loader applies fixups before capture | Hash against the in-memory image, not the on-disk file |
These are concrete improvements that would close the most impactful gaps:
| Priority | Feature | Impact | Effort |
|---|---|---|---|
| High | Threat intel enrichment (VT, OTX, AbuseIPDB) | Confirms whether IOCs are known-bad | Medium |
| High | Multi-dump correlation | Traces injection chains across processes | High |
| DONE | |||
| DONE | |||
| DONE | |||
| Low | Volatility integration | Supports full memory images | High |
| DONE | |||
| Low | SIGMA/YARA rule generation | Auto-generate detection rules from findings | Medium |