This reference documents the public API for the 30+ modules in .claude/helpers/. These modules provide the foundational data access, analysis taxonomies, and infrastructure used by skills, agents, and commands.
Provides read-only access to per-binary analysis SQLite databases.
- IndividualAnalysisDB(db_path: str)
  - get_file_info() -> FileInfoRecord
  - get_function_by_id(func_id: int) -> FunctionRecord
  - get_function_by_name(name: str) -> FunctionRecord
  - search_functions(name_contains: str, limit: int = 50, offset: int = 0) -> Page[FunctionRecord]
  - get_all_functions(limit: int = 100, offset: int = 0) -> Page[FunctionRecord]
- FunctionRecord(dataclass)
  - Fields: id, name, mangled_name, signature, decompiled_code, assembly_code, inbound_xrefs, outbound_xrefs, strings, globals, dangerous_api_calls, loop_analysis, stack_frame
- FileInfoRecord(dataclass)
  - Fields: file_name, file_hash, architecture, security_features, exports, entry_points, tls_callbacks
- parse_json_safe(data: str | dict | None) -> dict | list | None
  - Parses JSON strings from DB columns into Python objects.
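To illustrate the parse_json_safe contract, here is a minimal sketch rather than the helper's actual implementation. The name parse_json_safe_sketch is hypothetical, and returning None on malformed input is an assumption; the reference only states the accepted and returned types.

```python
import json


def parse_json_safe_sketch(data):
    """Hypothetical stand-in for parse_json_safe (behaviour assumed, not confirmed)."""
    if data is None:
        return None
    # Already-decoded containers pass through unchanged.
    if isinstance(data, (dict, list)):
        return data
    try:
        parsed = json.loads(data)
    except (TypeError, ValueError):
        # Assumption: malformed JSON yields None instead of raising.
        return None
    # Only container types are returned, matching the documented return type.
    return parsed if isinstance(parsed, (dict, list)) else None
```

Under these assumptions, a caller can pass a raw column value straight through without checking whether it is NULL, a JSON string, or an already-decoded object.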
Manages the tracking database for all analyzed modules.
- AnalyzedFilesDB(db_path: str)
  - get_by_file_name(name: str) -> list[AnalyzedFileRecord]
  - get_complete() -> list[AnalyzedFileRecord]
  - get_by_hash(file_hash: str) -> AnalyzedFileRecord | None
- AnalyzedFileRecord(dataclass)
  - Fields: file_name, file_hash, analysis_db_path, status, created_at
Centralized resolution for database file paths.
- resolve_db_path(db_path: str, workspace_root: Path) -> str
  - Resolves a DB path relative to workspace_root, falling back to extracted_dbs/.
- resolve_db_path_auto(db_path: str) -> str
  - Same as resolve_db_path, but auto-detects the workspace root from the helpers/ location.
- resolve_module_db(module_name_or_path: str, workspace_root: Path, *, require_complete: bool = True) -> str | None
  - Resolves a module name or .db path to an absolute DB path.
- resolve_module_db_auto(module_name_or_path: str, *, require_complete: bool = True) -> str | None
  - Same as resolve_module_db, but auto-detects the workspace root.
- resolve_tracking_db(workspace_root: Path) -> str | None
  - Returns the path to analyzed_files.db (checks extracted_dbs/, then the workspace root).
- resolve_tracking_db_auto() -> str | None
  - Same as resolve_tracking_db, but auto-detects the workspace root.
High-performance function-to-file resolution using function_index.json.
- load_function_index(module_name: str) -> dict
  - Loads the function index for a specific module.
- lookup_function(index: dict, name: str) -> dict | None
  - Resolves a function name to its metadata (file, library tag, ID).
- filter_by_library(index: dict, functions: list[str]) -> list[str]
  - Filters out functions tagged as library code (WIL, STL, CRT, etc.).
- is_application_function(index: dict, name: str) -> bool
  - Returns True if the function is not tagged as library code.
Unified function lookup across multiple modules.
- resolve_function(db: IndividualAnalysisDB, identifier: str | int) -> FunctionRecord | None
  - Resolves a function by its name or integer ID.
- search_functions_by_pattern(db: IndividualAnalysisDB, pattern: str) -> list[FunctionRecord]
  - Searches functions using substring or regex patterns.
Multi-dimensional search across module databases.
- unified_search.py (Standalone Script)
  - Dimensions: name, signature, string, api, dangerous, class, export
  - Match Modes: substring, regex, fuzzy
  - Usage: python .claude/helpers/unified_search.py <db> --query <term>
Classification of Win32/NT APIs into functional and security categories.
- classify_api(api_name: str) -> str | None
  - Returns the functional category (e.g., file_io, registry, network).
- classify_api_security(api_name: str) -> str | None
  - Returns the security impact category (e.g., privilege_escalation, data_leakage).
- classify_api_fingerprint(api_name: str) -> str | None
  - Returns a coarse fingerprint bucket ("com", "rpc", "security", "crypto") for module-level density counting.
- get_dangerous_api_set() -> set[str]
  - Returns a set of all APIs classified as security-sensitive.
- DISPATCH_KEYWORDS: tuple of function-name substrings suggesting dispatch/routing behaviour.
- strip_import_prefix(api_name: str) -> str
  - Removes IDA import-thunk prefixes (__imp_, _imp_, j_, cs:) from an API name.
- IMP_PREFIX_RE -- Compiled regex for import-prefix stripping.
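The prefix stripping described above can be sketched with a single anchored regex. This is an illustration only: the names IMP_PREFIX_SKETCH_RE and strip_import_prefix_sketch are hypothetical, and the assumption that stacked prefixes (such as cs:__imp_) are all removed is not confirmed by this reference.

```python
import re

# Assumption: prefixes may be stacked (e.g. "cs:__imp_CreateFileW"), so the
# pattern matches one or more of them at the start of the name.
IMP_PREFIX_SKETCH_RE = re.compile(r"^(?:__imp_|_imp_|j_|cs:)+")


def strip_import_prefix_sketch(api_name: str) -> str:
    """Return api_name with any leading import-thunk prefixes removed."""
    return IMP_PREFIX_SKETCH_RE.sub("", api_name)
```

Normalizing names this way before calling a classifier such as classify_api means a thunk and its target land in the same category.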
Mappings for C/C++ type sizes and IDA-to-C type conversions.
- TYPE_SIZES: dict[str, int] (e.g., BYTE: 1, DWORD: 4)
- IDA_TO_C_TYPE: dict[str, str] (e.g., _BYTE: unsigned char)
- SIZE_TO_C_TYPE: dict[int, str] (e.g., 4: uint32_t)
Directed graph construction and traversal for function xrefs.
- CallGraph
  - from_functions(functions: list[FunctionRecord]) -> CallGraph
  - reachable_from(func_id: int, max_depth: int = 5) -> set[int]
  - callers_of(func_id: int, max_depth: int = 5) -> set[int]
  - find_path(start_id: int, end_id: int) -> list[int] | None
  - get_stats() -> dict (node count, edge count, SCC count)
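A depth-limited traversal like reachable_from is commonly written as a breadth-first search. The sketch below assumes an adjacency mapping of function ID to callee IDs and excludes the start node from the result; both choices, and the name reachable_from_sketch, are assumptions rather than documented CallGraph behaviour.

```python
from collections import deque


def reachable_from_sketch(edges: dict[int, set[int]], func_id: int, max_depth: int = 5) -> set[int]:
    """Return IDs reachable from func_id in at most max_depth call hops."""
    seen: set[int] = set()
    queue = deque([(func_id, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes already at the depth limit
        for callee in edges.get(node, ()):
            # Assumption: the start node itself is not reported as reachable.
            if callee not in seen and callee != func_id:
                seen.add(callee)
                queue.append((callee, depth + 1))
    return seen
```

Running the same search over reversed edges would give a callers_of equivalent.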
Resolution of external function calls across analyzed modules.
- CrossModuleGraph(tracking_db: AnalyzedFilesDB)
  - resolve_external_call(caller_module: str, callee_name: str) -> FunctionWithModuleInfo | None
  - build_cross_module_chain(start_func: str, start_module: str, depth: int = 3) -> list
Filesystem-based result caching with DB mtime validation.
- get_cached(db_path: str, operation: str, params: dict = None) -> dict | None
  - Retrieves a cached result if the DB mtime matches and TTL is valid.
- cache_result(db_path: str, operation: str, data: dict, params: dict = None) -> None
  - Atomically writes a result to the cache directory.
- clear_cache(module_name: str = None) -> None
  - Clears the cache for a specific module or the entire runtime.
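The combination of mtime validation, TTL, and atomic writes can be sketched with the standard library alone. Everything here is an assumption made for illustration: the function names, the explicit cache_file argument (the real helpers derive the location from db_path, operation, and params), the entry layout, and the one-hour default TTL.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def cache_result_sketch(cache_file: Path, db_path: str, data: dict) -> None:
    """Store data with the DB's current mtime, replacing the file atomically."""
    entry = {"db_mtime": os.path.getmtime(db_path), "cached_at": time.time(), "data": data}
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_name = tempfile.mkstemp(dir=cache_file.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(entry, handle)
    os.replace(tmp_name, cache_file)


def get_cached_sketch(cache_file: Path, db_path: str, ttl_seconds: float = 3600.0):
    """Return cached data, or None if missing, stale, or expired."""
    try:
        entry = json.loads(cache_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if entry.get("db_mtime") != os.path.getmtime(db_path):
        return None  # the DB changed since the result was cached
    if time.time() - entry.get("cached_at", 0.0) > ttl_seconds:
        return None  # entry is older than the TTL
    return entry["data"]
```

Writing to a temporary file and renaming it means a concurrent reader sees either the old entry or the new one, never a partially written file.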
Subprocess management and dynamic module loading.
- run_skill_script(skill: str, script: str, args: list[str]) -> dict
  - Executes a skill script as a subprocess and returns the parsed JSON output.
- load_skill_module(skill: str, script: str) -> module
  - Dynamically imports a skill script as a Python module.
- find_skill_script(skill: str, script: str) -> Path
  - Resolves the absolute path to a skill script.
Standardized JSON output for skill scripts.
- emit_json(data: dict, *, status: str = "ok") -> None
  - Writes a dict to stdout wrapped with a "status" key.
- emit_json_list(key: str, items: list, *, extra: dict = None) -> None
  - Writes a list payload under key, wrapped with "status".
- should_force_json(args) -> bool
  - Returns True when --json was passed or --workspace-dir is set.
Structured JSON error reporting to stderr.
- emit_error(message: str, code: str) -> None
  - Writes {"error": message, "code": code} to stderr and exits with code 1.
- log_error(message: str, code: str) -> None
  - Writes the error JSON to stderr without exiting.
- log_warning(message: str) -> None
  - Writes a warning message to stderr.
Throttled progress reporting for long-running operations.
- ProgressReporter(total: int, operation: str)
  - update(current: int) -> None
  - status_message(msg: str) -> None
- progress_iter(iterable, total: int = None, operation: str = "") -> iterator
  - Wraps an iterable with automatic progress reporting.
Parsing of Microsoft C++ mangled names.
- parse_class_from_mangled(mangled_name: str) -> dict | None
  - Extracts class_name, method_name, namespaces, and role (e.g., constructor).
Access to pre-computed module fingerprints.
- load_module_profile(module_name: str) -> dict
- get_noise_ratio(profile: dict) -> float
- get_technology_flags(profile: dict) -> dict[str, bool] (e.g., com, rpc, security)
Efficient loading of multiple function records.
- batch_extract_function_data(db: IndividualAnalysisDB, func_ids: list[int]) -> list[dict]
- batch_resolve_functions(db: IndividualAnalysisDB, identifiers: list[str | int]) -> list[FunctionRecord]
WinRT server index built from extraction data across four access contexts (caller IL x server privilege).
- WinrtAccessContext(enum) -- HIGH_IL_ALL, HIGH_IL_PRIVILEGED, MEDIUM_IL_ALL, MEDIUM_IL_PRIVILEGED
- WinrtMethod(dataclass) -- access, type, name, file; properties: short_name, class_name, binary_name
- WinrtInterface(dataclass) -- name, guid, methods: list[WinrtMethod], pseudo_idl: list[str]; property: method_count
- WinrtServer(dataclass) -- server class metadata with computed properties:
  - is_out_of_process, is_in_process, runs_as_system, has_permissive_sddl, is_remote_activatable, is_base_trust
  - risk_tier(context) -> str -- compute risk tier for a given access context
  - best_risk_tier -> str -- highest risk across all contexts
  - to_dict() -> dict
- WinrtIndex -- queryable index:
  - load(data_root) -- load all four access contexts
  - get_servers_for_module(name) -> list[WinrtServer]
  - get_servers_by_class(class_name) -> WinrtServer | None
  - get_procedures_for_module(name) -> list[str]
  - is_winrt_procedure(module, func_name) -> bool
  - get_interfaces_for_module(name) -> list[WinrtInterface]
  - get_methods_for_class(class_name) -> list[WinrtMethod]
  - search_methods(pattern) -> list[WinrtMethod]
  - get_access_contexts_for_class(class_name) -> set[WinrtAccessContext]
  - get_privileged_surface(caller_il) -> list[WinrtServer]
  - get_servers_by_risk(tier) -> list[WinrtServer]
  - summary() -> dict
- get_winrt_index(force_reload: bool = False) -> WinrtIndex -- cached singleton
- invalidate_winrt_index() -- clear cached index
COM server index built from extraction data across four access contexts (caller IL x server privilege).
- ComAccessContext(enum) -- HIGH_IL_ALL, HIGH_IL_PRIVILEGED, MEDIUM_IL_ALL, MEDIUM_IL_PRIVILEGED
- ComMethod(dataclass) -- access, type, name, file, interface_name; properties: short_name, class_name, binary_name
- ComInterface(dataclass) -- name, guid, methods: list[ComMethod], pseudo_idl: list[str]; property: method_count
- ComServer(dataclass) -- CLSID metadata with computed properties:
  - is_out_of_process (includes DLL surrogate), is_in_process, runs_as_system, has_permissive_launch, has_permissive_access
  - is_remote_activatable, is_trusted_marshaller, can_elevate, auto_elevation
  - risk_tier(context) -> str -- compute risk tier for a given access context
  - best_risk_tier -> str -- highest risk across all contexts
  - to_dict() -> dict
- ComIndex -- queryable index:
  - load(data_root) -- load all four access contexts
  - get_servers_for_module(name) -> list[ComServer]
  - get_server_by_clsid(clsid) -> ComServer | None
  - get_procedures_for_module(name) -> list[str]
  - is_com_procedure(module, func_name) -> bool
  - get_interfaces_for_module(name) -> list[ComInterface]
  - get_methods_for_clsid(clsid) -> list[ComMethod]
  - search_methods(pattern) -> list[ComMethod]
  - get_access_contexts_for_clsid(clsid) -> set[ComAccessContext]
  - get_privileged_surface(caller_il) -> list[ComServer]
  - get_servers_by_risk(tier) -> list[ComServer]
  - get_elevatable_servers() -> list[ComServer]
  - get_servers_by_service(name) -> list[ComServer]
  - find_servers_for_interface(iid) -> list[ComServer]
  - summary() -> dict
- get_com_index(force_reload: bool = False) -> ComIndex -- cached singleton
- invalidate_com_index() -- clear cached index
Integrity checking for analysis databases.
- validate_analysis_db(db_path: str) -> ValidationResult
- quick_validate(db_path: str) -> bool
Session ID resolution and scratchpad path management.
- resolve_session_id(stdin_data: dict) -> str
  - Resolves the current session ID from environment variables or the hook protocol's stdin JSON payload. Resolution priority:
    1. AGENT_SESSION_ID env var
    2. conversation_id from stdin (Cursor)
    3. session_id from stdin (Claude Code)
    4. UUID4 fallback
- scratchpad_path(session_id: str) -> Path
  - Returns the path to the session-scoped scratchpad file.
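The resolution priority listed above can be sketched as a first-non-empty lookup. The function name resolve_session_id_sketch and the environ parameter (added so the example can run without touching the real environment) are hypothetical, and treating empty strings as missing is an assumption.

```python
import os
import uuid


def resolve_session_id_sketch(stdin_data: dict, environ=None) -> str:
    """Pick the first available session ID in the documented priority order."""
    environ = os.environ if environ is None else environ
    candidates = (
        environ.get("AGENT_SESSION_ID"),     # 1. environment variable
        stdin_data.get("conversation_id"),   # 2. Cursor payload
        stdin_data.get("session_id"),        # 3. Claude Code payload
    )
    for candidate in candidates:
        if candidate:  # assumption: empty values count as missing
            return str(candidate)
    return str(uuid.uuid4())                 # 4. random fallback
```

The UUID4 fallback yields a different ID on every call, so a caller that needs a stable scratchpad path would resolve once and reuse the value.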
Regex-based extraction of function calls, arguments, and parameter usage from IDA decompiled C/C++ code.
- extract_function_calls(code: str, *, keywords: frozenset[str] = _DEFAULT_KEYWORDS) -> list[dict]
  - Extracts call sites from decompiled code. Handles multi-line calls by joining lines when parentheses are unbalanced. Each dict has: function_name, line_number, line, arguments, result_var.
- discover_calls_with_xrefs(code: str, xrefs: list[dict], *, keywords: frozenset[str] = _DEFAULT_KEYWORDS) -> list[dict]
  - Uses DB simple_outbound_xrefs as ground truth for call discovery and enriches each call with argument expressions from the regex parser. Preferred over extract_function_calls alone.
- split_arguments(args_str: str) -> list[str]
  - Splits comma-delimited argument strings while respecting nested () and [].
- find_param_in_calls(code: str, param_name: str, *, keywords: frozenset[str] = _DEFAULT_KEYWORDS) -> list[dict]
  - Finds calls where a named parameter appears in an argument expression. Each dict has: function_name, arg_position, arg_expression, line_number, is_direct.
- extract_balanced_parens(text: str, start: int = 0) -> str | None
  - Extracts content from balanced parentheses starting at text[start].
Scans x64 assembly code for struct/class field accesses to infer memory layouts. Assembly is the sole evidence source -- it provides deterministic sizes from instruction operands.
- scan_assembly_struct_accesses(asm: str) -> list[dict]
  - Scans x64 assembly for [reg+offset] memory access patterns with ptr-size inference. Returns dicts with: base, byte_offset, size, param_num, source, line_num.
- scan_batch_struct_accesses(code: str, type_sizes: dict[str, int]) -> list[dict]
  - Batch-lift style scanning: returns base, offset, size, type_name, pattern for indexed, direct, and zero-offset accesses.
- merge_struct_fields(fields: list[dict]) -> list[dict]
  - Merges overlapping field accesses from multiple functions into a unified layout sorted by offset.
- parse_signature_params(signature: str) -> dict[str, str]
  - Parses C-style function signature parameter names and types into a {name: type} dict.
Unified finding schema for normalizing results across all vulnerability scanners (taint, memory corruption, logic).
- Finding(dataclass) -- Scanner-agnostic vulnerability finding.
  - Fields: function_name, function_id, module, source_type, source_category, sink, sink_category, severity, score, exploitability_score, exploitability_rating, verification_status, guards, path, evidence_lines, summary, extra
  - to_dict() -> dict
  - dedup_key (property) -> str -- Deduplication key: function_id::sink::source_category.
  - path_signature (property) -> str -- SHA-256 hash prefix of sorted path elements.
- from_taint_finding(finding: dict, func_info: dict | None = None) -> Finding
  - Converts a taint-analysis finding dict to a unified Finding.
- from_memory_finding(finding: dict) -> Finding
  - Converts a MemCorruptionFinding dict to a unified Finding.
- from_logic_finding(finding: dict) -> Finding
  - Converts a LogicFinding dict to a unified Finding.
- from_verified_finding(verified: dict) -> Finding
  - Converts a VerificationResult dict to a unified Finding (handles both memory and logic verified outputs).
- normalize_scanner_output(data: dict, source_type: str) -> list[Finding]
  - Extracts findings from a scanner's JSON output and normalizes them. Handles both raw and verified finding lists.
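The two derived properties on Finding can be illustrated with a reduced dataclass. FindingSketch is hypothetical and carries only the fields the properties need; the "|" separator and the 16-character prefix length are assumptions, since this reference states only that the signature is a SHA-256 hash prefix of the sorted path elements.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FindingSketch:
    """Reduced, hypothetical stand-in for Finding."""
    function_id: int
    sink: str
    source_category: str
    path: list[str] = field(default_factory=list)

    @property
    def dedup_key(self) -> str:
        # Documented format: function_id::sink::source_category
        return f"{self.function_id}::{self.sink}::{self.source_category}"

    @property
    def path_signature(self) -> str:
        # Sorting makes the signature independent of path element order.
        joined = "|".join(sorted(self.path))
        # Assumption: 16 hex characters; the real prefix length is not stated.
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
```

Two findings that share a dedup_key but differ in path_signature describe the same function and sink reached along different paths.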
Merges, deduplicates, and ranks findings across multiple scanner outputs.
- merge_findings(*scanner_outputs: tuple[dict, str]) -> list[Finding]
  - Merges findings from multiple scanners. Each arg is (data_dict, source_type). Returns deduplicated, score-sorted list.
- deduplicate(findings: list[Finding], *, max_per_key: int = 3) -> list[Finding]
  - Removes duplicate findings (same function + sink + category). Keeps up to max_per_key distinct paths per dedup key, sorted by score.
- rank(findings: list[Finding]) -> list[Finding]
  - Sorts findings by composite score descending. Uses exploitability_score if available, severity as tiebreaker.
- findings_summary(findings: list[Finding]) -> dict
  - Produces summary: total count, by_severity, by_source, top_score.
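The max_per_key behaviour of deduplicate can be sketched over plain dicts. This is a sketch under stated assumptions, not the module's implementation: the function name is hypothetical, findings are represented as dicts with function_id, sink, source_category, score, and path keys, and two findings are treated as the same path when their sorted path elements match.

```python
from collections import defaultdict


def deduplicate_sketch(findings: list[dict], *, max_per_key: int = 3) -> list[dict]:
    """Keep the best-scoring distinct paths per (function, sink, category)."""
    grouped = defaultdict(list)
    for finding in findings:
        key = (finding["function_id"], finding["sink"], finding["source_category"])
        grouped[key].append(finding)

    kept = []
    for group in grouped.values():
        seen_paths = set()
        # Visit higher scores first so the best path of each kind survives.
        for finding in sorted(group, key=lambda f: f["score"], reverse=True):
            if len(seen_paths) >= max_per_key:
                break
            signature = tuple(sorted(finding["path"]))
            if signature in seen_paths:
                continue  # same path already kept with a higher score
            seen_paths.add(signature)
            kept.append(finding)
    return sorted(kept, key=lambda f: f["score"], reverse=True)
```

Keeping several distinct paths per key, rather than one, preserves alternative routes to the same sink for later triage.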
Cross-report finding comparison for AI vulnerability scanners.
- discover_reports(reports_dir: Path, scan_type: str | None = None) -> list[ReportMeta]
  - Finds .findings.json companion files in a reports directory, sorted by timestamp (newest first).
- load_findings_json(path: Path) -> dict
  - Loads and validates a .findings.json companion file. Raises FileNotFoundError or ValueError.
- compare_findings(current: dict, previous: dict) -> ComparisonResult
  - Compares findings between two scan reports. Matches by vulnerability_type + primary_function. Returns recurring, new, missed, severity changes, verdict conflicts, remediation changes, coverage delta.
- format_comparison_section(result: ComparisonResult, previous_report_path: str | None, previous_timestamp: str | None) -> str
  - Generates a markdown "## Previous Findings Comparison" section for appending to scan reports.
x64 fastcall register mappings and assembly width constants.
- PARAM_REGISTERS: dict[int, set[str]] -- Parameter number (1-based) to register alias set (e.g., 1: {"rcx", "ecx", "cx", "cl", "ch"}).
- REGISTER_TO_PARAM: dict[str, int] -- Reverse lookup: register name to parameter number.
- PARAM_REGS_X64 -- Backward-compatible alias for REGISTER_TO_PARAM.
- ASM_REG_SIZES: dict[str, int] -- Register name to byte width (e.g., "rax": 8, "eax": 4).
- ASM_PTR_SIZES: dict[str, int] -- Instruction operand width qualifiers (e.g., "byte": 1, "qword": 8).
- STACK_REGS: frozenset[str] -- Stack/frame registers excluded from struct-field inference.
- param_name_for(param_number: int) -> str
  - Returns IDA-style positional parameter name (a1, a2, ...).
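The relationship between the forward and reverse register tables can be sketched as follows. The first entry matches the example given above; the alias sets for parameters 2 to 4 are filled in from the Microsoft x64 calling convention (RCX, RDX, R8, R9) and are an assumption about the real table's contents. All names ending in _SKETCH or _sketch are hypothetical.

```python
# Microsoft x64 convention: the first four integer/pointer arguments are
# passed in RCX, RDX, R8, R9. Alias sets beyond entry 1 are assumed.
PARAM_REGISTERS_SKETCH: dict[int, set[str]] = {
    1: {"rcx", "ecx", "cx", "cl", "ch"},
    2: {"rdx", "edx", "dx", "dl", "dh"},
    3: {"r8", "r8d", "r8w", "r8b"},
    4: {"r9", "r9d", "r9w", "r9b"},
}

# Reverse lookup derived from the forward table, so the two cannot drift apart.
REGISTER_TO_PARAM_SKETCH: dict[str, int] = {
    reg: number for number, regs in PARAM_REGISTERS_SKETCH.items() for reg in regs
}


def param_name_for_sketch(param_number: int) -> str:
    """Return the IDA-style positional name, e.g. 1 -> "a1"."""
    return f"a{param_number}"
```

With these tables, a register seen in an instruction operand can be mapped back to the decompiler's parameter name in two lookups.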
Parameter surface metadata from C-style signatures.
- describe_parameter_surface(signature) -> dict with keys: param_count, has_buffer_pointer, has_string_pointer, has_size_param, has_buffer_size_pair, has_handle, has_com_interface, has_struct_pointer, has_flags_param, pointer_param_count, characteristics
- PARAM_TYPE_PATTERNS -- 11 regex patterns mapping Windows types to categories
- BUFFER_SIZE_PAIR_PATTERNS -- 3 compiled regexes detecting buffer+size parameter pairs
SDDL ACE parsing with Deny support and effective permission computation.
- ParsedACE(dataclass)
  - Fields: ace_type ("A" Allow or "D" Deny), flags, rights, object_guid, inherit_object_guid, account_sid
- parse_sddl_aces(sddl: str) -> list[ParsedACE]
  - Parses all ACEs from an SDDL string in evaluation order.
- effective_permissions_for_sid(sddl: str, sid: str, *, permissive_sids: set[str] | None = None) -> tuple[bool, str]
  - Determines whether a SID has effective access after Deny evaluation. Returns (has_access, reason).
- is_permissive_sddl(sddl: str) -> bool
  - Checks whether any permissive SID (WD, AC, AU, IU) has effective access. Correctly handles Deny overrides.
- PERMISSIVE_SIDS: frozenset of well-known permissive SID abbreviations.
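An SDDL ACE is a parenthesized, semicolon-separated record whose first six fields line up with the ParsedACE fields above. The sketch below parses those records and applies a deliberately simplified first-match evaluation. It is an illustration under assumptions: all names are hypothetical, rights masks and inheritance flags are ignored, every parenthesized ACE in the string is treated alike, and conditional ACEs (which contain nested parentheses) are not supported.

```python
import re
from dataclasses import dataclass

# Matches one "(...)" ACE body without nested parentheses.
ACE_SKETCH_RE = re.compile(r"\(([^()]*)\)")


@dataclass
class ParsedACESketch:
    ace_type: str
    flags: str
    rights: str
    object_guid: str
    inherit_object_guid: str
    account_sid: str


def parse_sddl_aces_sketch(sddl: str) -> list[ParsedACESketch]:
    """Parse ACEs in the order they appear in the SDDL string."""
    aces = []
    for body in ACE_SKETCH_RE.findall(sddl):
        parts = body.split(";")
        if len(parts) >= 6:
            aces.append(ParsedACESketch(*parts[:6]))
    return aces


def has_access_sketch(sddl: str, sid: str) -> tuple[bool, str]:
    """Simplified: the first Allow or Deny ACE naming the SID decides."""
    for ace in parse_sddl_aces_sketch(sddl):
        if ace.account_sid != sid:
            continue
        if ace.ace_type == "D":
            return False, f"denied by ACE for {sid}"
        if ace.ace_type == "A":
            return True, f"allowed by ACE for {sid}"
    return False, f"no ACE for {sid}"
```

Evaluating ACEs in order is what lets an earlier Deny override a later Allow for the same SID.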
Pre-execution validation for slash command arguments.
- CommandValidationResult(dataclass)
  - Fields: ok, errors, error_codes, warnings, resolved
  - add_error(msg: str, code: ErrorCode | str) -> None
  - add_warning(msg: str) -> None
- validate_module(module_name: str, workspace_root: Path | None = None, *, allow_code_only: bool = False) -> CommandValidationResult
  - Validates module existence and DB accessibility. On success, result.resolved["db_path"] contains the absolute DB path.
- validate_function_arg(db_path: str, function_ref: str) -> CommandValidationResult
  - Validates that a function reference resolves in the given DB. On success, result.resolved["function"] contains the resolved record.
- validate_depth_param(value: Any, max_depth: int = 20) -> CommandValidationResult
  - Validates a depth parameter is a positive integer within bounds.
- validate_command_args(command_name: str, args: dict[str, Any], workspace_root: Path | None = None) -> CommandValidationResult
  - Dispatches to per-command validators based on command_name. Validates module, function, depth, and command-specific flags.
- command_preflight(command_name: str, module: str | None = None, function: str | None = None, **kwargs) -> CommandValidationResult
  - Convenience wrapper: validates and resolves all arguments in one call.
Schema parsing and validation for headless batch pipeline YAML definitions.
- StepConfig(dataclass, frozen) -- Metadata for a supported pipeline step.
  - Fields: name, kind (StepKind), description, goal, valid_options
- StepDef(dataclass, frozen) -- A parsed step entry from the YAML file.
  - Fields: name, options, config
- PipelineSettings(dataclass, frozen) -- Execution settings after config/YAML merge.
  - Fields: continue_on_error, max_workers, step_timeout, parallel_modules, max_module_workers, no_cache
  - module_workers (property) -> int
- PipelineDef(dataclass, frozen) -- Fully parsed pipeline definition.
  - Fields: name, source_path, modules, steps, settings, output
- ResolvedModule(dataclass, frozen) -- Module name resolved to a concrete DB path.
  - Fields: module_name, db_path
- STEP_REGISTRY: dict[str, StepConfig] -- Registry of all supported pipeline step names.
- load_pipeline(yaml_path: str | Path) -> PipelineDef
  - Parses a YAML pipeline definition file into a typed PipelineDef.
- resolve_modules(modules: list[str] | Literal["all"], workspace_root: Path) -> list[ResolvedModule]
  - Resolves module names (or "all") to concrete DB paths.
- validate_pipeline(definition: PipelineDef, workspace_root: Path) -> ValidationResult
  - Validates a parsed pipeline: checks steps exist in registry, modules resolve, options are valid.
- render_output_path(template: str, module_name: str, workspace_root: Path) -> str
  - Renders an output directory path from a template with {module}, {timestamp} placeholders.
Execution engine for headless batch pipelines.
- StepResult(dataclass) -- Outcome of one pipeline step.
  - Fields: step_name, status, elapsed_seconds, workspace_path, error, data
- ModuleResult(dataclass) -- Execution summary for a single module.
  - Fields: module_name, db_path, status, elapsed_seconds, step_results, errors
- BatchResult(dataclass) -- Execution summary for a full batch run.
  - Fields: pipeline_name, source_path, output_dir, status, dry_run, settings, modules, total_elapsed_seconds
- execute_module(module: ResolvedModule, steps: list[StepDef], settings: PipelineSettings, batch_dir: str | Path) -> ModuleResult
  - Executes all pipeline steps for a single module.
- execute_pipeline(definition: PipelineDef, workspace_root: Path) -> BatchResult
  - Executes a full batch pipeline: resolves modules, dispatches steps (with optional module-level parallelism), writes manifest and summary.
- dispatch_goal_step(step: StepDef, module: ResolvedModule, settings: PipelineSettings, batch_dir: Path) -> StepResult
  - Dispatches a goal-type step (triage, security) to the triage-coordinator agent.
- dispatch_scan_step(step: StepDef, module: ResolvedModule, settings: PipelineSettings, batch_dir: Path) -> StepResult
  - Dispatches a security-scan step (memory, logic scanners).
- dispatch_skill_step(step: StepDef, module: ResolvedModule, settings: PipelineSettings, batch_dir: Path) -> StepResult
  - Dispatches a skill-group step (classify, callgraph, taint, dossiers, entrypoints).
- write_batch_manifest(batch_dir: str | Path, definition: PipelineDef, progress: dict) -> None
  - Writes/updates the batch manifest file with current progress.
- write_batch_summary(batch_dir: str | Path, batch_result: BatchResult) -> str
  - Writes the final batch summary JSON and returns its path.
Run-directory I/O primitives for multi-step workflow handoff.
- create_run_dir(module_name: str, goal: str) -> str
  - Creates and returns a new workspace run directory path under .claude/workspace/.
- list_runs(module: str | None = None, goal: str | None = None, limit: int | None = 10) -> list[dict]
  - Lists workspace runs, optionally filtered by module or goal. Returns manifest metadata per run.
- write_results(run_dir: str | Path, step_name: str, full_data: Any, summary_data: Any) -> dict[str, str]
  - Writes full results.json and summary.json for a step. Returns paths dict.
- read_results(run_dir: str | Path, step_name: str) -> Any
  - Reads and returns full results JSON for a step (with workspace envelope).
- read_step_payload(run_dir: str | Path, step_name: str) -> Any
  - Reads and returns the unwrapped skill output payload (envelope stripped).
- read_summary(run_dir: str | Path, step_name: str) -> Any
  - Reads and returns summary JSON for a step.
- get_step_paths(run_dir: str | Path, step_name: str) -> dict[str, str]
  - Returns paths for a step's results and summary files (no I/O).
- update_manifest(run_dir: str | Path, step_name: str, status: str, summary_path: str | Path) -> None
  - Updates the manifest with step status and summary path reference.
- summarize_json_payload(payload: Any) -> dict
  - Produces a compact preview of a JSON payload (key counts, list lengths, scalar truncation).
- utc_iso() -> str -- Returns current UTC time as ISO-8601 string.
- safe_name(value: str, fallback: str = "item") -> str -- Sanitizes a string for use in file paths.
- coerce_path(value: str | Path) -> Path -- Resolves a path to absolute.
- MANIFEST_FILE, RESULTS_FILE, SUMMARY_FILE -- Filename constants.
Workspace step setup bootstrap reducing boilerplate for skill scripts.
- prepare_step(run_dir: str | Path, step_name: str) -> dict[str, str]
  - Creates step subdirectory and returns paths for output files (step_name, step_path, results_path, summary_path).
- complete_step(run_dir: str | Path, step_name: str, full_data: Any, summary_data: Any, status: str = "success") -> dict[str, str]
  - Writes step results + summary and updates the manifest in one call.
Validates workspace handoff compliance for run directories.
- WorkspaceValidationResult(dataclass)
  - Fields: valid, run_dir, issues, manifest, step_count
  - to_dict() -> dict
- validate_workspace_run(run_dir: str | Path) -> WorkspaceValidationResult
  - Validates: run directory exists, manifest.json is valid, steps have status and summary_path, referenced files exist, each step directory has results.json and summary.json.
Shared orchestration helpers for agent scripts.
- AgentStep(dataclass, frozen) -- Description of a skill-script invocation.
  - Fields: name, skill_name, script_name, args, timeout, json_output, workspace_dir, workspace_step, max_retries
- AgentStepResult(dataclass) -- Execution result for one step.
  - Fields: name, skill_name, script_name, success, elapsed_seconds, exit_code, error, stdout, stderr, json_data
  - to_dict() -> dict
- AgentBase -- Shared runner wrapper for agent skill invocations.
  - run_skill_script_result(skill_name, script_name, args, *, timeout, json_output, workspace_dir, workspace_step, max_retries, warn_on_failure) -> dict
  - run_skill_script(skill_name, script_name, args, *, timeout, workspace_dir, workspace_step, max_retries) -> dict | list | None
- AgentOrchestrator -- Lightweight step execution with retry and circuit-breaker.
  - __init__(runner: AgentBase | None, *, max_workers: int = 4, failure_threshold: int | None = None)
  - run_step(step: AgentStep) -> AgentStepResult
  - results (property) -> list[AgentStepResult]
  - summary() -> dict -- Aggregate statistics: total, failed, elapsed, steps.
PE import/export table index across all analyzed modules for loader-level dependency resolution.
- ExportEntry(dataclass, frozen)
  - Fields: module, db_path, name, ordinal, is_forwarded, forwarded_to
- ImportEntry(dataclass, frozen)
  - Fields: importing_module, source_module, function_name, is_delay_loaded, ordinal
- ImportExportIndex -- Queryable index (context manager).
  - __init__(tracking_db: str | None = None, workspace_root: str | Path | None = None, *, max_workers: int = 8)
  - who_exports(function_name: str) -> list[ExportEntry] -- Modules whose PE export table contains the function.
  - who_imports(function_name: str, *, source_module: str | None = None) -> list[ImportEntry] -- Modules that import the function.
  - module_consumers(module_name: str) -> dict[str, list[str]] -- Modules that import from module_name, grouped by importing module.
  - module_suppliers(module_name: str) -> dict[str, list[str]] -- Modules that module_name imports from, grouped by supplier.
  - resolve_forwarder_chain(module: str, function_name: str) -> list[ExportEntry] -- Follows PE forwarded export chains to the final implementation.
  - dependency_graph() -> dict[str, set[str]] -- Module-to-module dependency edges from PE import tables.
  - module_export_list(module_name: str) -> list[ExportEntry] -- All exports for a module.
  - summary() -> dict -- Aggregate statistics (module count, export count, import count).
  - Context manager: with ImportExportIndex() as idx: ...
IDA type size mappings and C type translation tables. Expands the brief entry in section 3.
- TYPE_SIZES: dict[str, int] -- IDA type name to byte size (e.g., "_BYTE": 1, "DWORD": 4, "_QWORD": 8). Covers _BYTE, BYTE, char, _WORD, WORD, short, _DWORD, DWORD, int, LONG, HRESULT, _QWORD, QWORD, __int64, and unsigned variants.
- IDA_TO_C_TYPE: dict[str, str] -- IDA type name to C standard type for header generation (e.g., "_DWORD": "uint32_t", "HRESULT": "HRESULT").
- SIZE_TO_C_TYPE: dict[int, str] -- Field byte-size to default C type (e.g., 1: "uint8_t", 4: "uint32_t", 8: "uint64_t").
Shared SQL utilities for safe LIKE queries.
- escape_like(value: str) -> str
  - Escapes SQL LIKE meta-characters (\, %, _) so the value is matched literally. Callers must append ESCAPE '\' to the LIKE clause.
- LIKE_ESCAPE: str -- The SQL ESCAPE '\' clause fragment to append to LIKE expressions.
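The need for both the escaping and the ESCAPE clause is easiest to see against an in-memory SQLite table. The sketch below re-implements the escaping under hypothetical names (escape_like_sketch, LIKE_ESCAPE_SKETCH); the functions table and its rows are made up for the example.

```python
import sqlite3


def escape_like_sketch(value: str) -> str:
    """Escape LIKE meta-characters so value matches literally."""
    # The backslash goes first so later replacements are not double-escaped.
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


# The SQL fragment is: ESCAPE '\'
LIKE_ESCAPE_SKETCH = "ESCAPE '\\'"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE functions (name TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO functions VALUES (?)",
    [("get_info",), ("getXinfo",), ("100%_done",)],
)


def search(term: str) -> list[str]:
    """Substring search where term is matched literally."""
    pattern = f"%{escape_like_sketch(term)}%"
    sql = f"SELECT name FROM functions WHERE name LIKE ? {LIKE_ESCAPE_SKETCH} ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]
```

Without the escaping, the underscore in a term such as get_ would act as a single-character wildcard and also match getXinfo.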
Centralized logging configuration for the runtime.
- configure_logging() -> None
  - Sets up the helpers logger hierarchy with a stderr handler. Safe to call multiple times. Level controlled by DEEPEXTRACT_LOG_LEVEL environment variable (default WARNING).
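An idempotent, environment-driven logging setup of the kind described above might look like the following. This is a sketch only: the function name, the logger_name parameter, the marker attribute used to detect an existing handler, the message format, and returning the logger (the documented function returns None) are all assumptions.

```python
import logging
import os
import sys


def configure_logging_sketch(logger_name: str = "helpers") -> logging.Logger:
    """Attach one stderr handler and set the level from the environment."""
    logger = logging.getLogger(logger_name)
    level_name = os.environ.get("DEEPEXTRACT_LOG_LEVEL", "WARNING").upper()
    level = logging.getLevelName(level_name)  # int for known level names
    if not isinstance(level, int):
        level = logging.WARNING  # unknown names fall back to the default
    logger.setLevel(level)
    # Idempotence: only attach the stderr handler the first time.
    if not any(getattr(h, "_sketch_marker", False) for h in logger.handlers):
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        handler._sketch_marker = True
        logger.addHandler(handler)
    return logger
```

Checking for an existing handler before adding one is what makes repeated calls safe; without it, each call would duplicate every log line.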