Attack Surface Graph

samuele edited this page Apr 18, 2026 · 10 revisions

RedAmon uses a Neo4j graph database as the single source of truth for every finding. The graph stores the complete topology of the target's attack surface as an interconnected knowledge graph, enabling both visual exploration in the webapp and intelligent querying by the AI agent.


Node Types

The graph contains 22 node types organized into seven categories.

Infrastructure Nodes

Represent the network topology:

| Node | Key Properties | Description |
|------|----------------|-------------|
| Domain | name, registrar, creation_date, expiration_date, WHOIS data | Root domain with full WHOIS information |
| Subdomain | name, has_dns_records, status, status_codes, http_live_url_count | Discovered hostname with HTTP liveness (status: "resolved", "no_http", or HTTP code like "200", "404") |
| IP | address, version, is_cdn, cdn_name, asn | Resolved IP address with CDN/ASN metadata |
| Port | number, protocol, state | Open port on an IP |
| Service | name, product, version, banner | Running service with version info |
| ExternalDomain | domain, sources, redirect_from_urls, redirect_to_urls, status_codes_seen, times_seen | Out-of-scope domain encountered during recon (redirects, crawling, historical scans) |
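
Because Subdomain.status mixes two keywords with raw HTTP codes, consumers usually need to tell the cases apart. The following is a minimal sketch of one way to do that; the function name and the "unknown" fallback are assumptions for illustration, not part of RedAmon.

```python
from typing import Optional, Tuple


def parse_subdomain_status(status: str) -> Tuple[str, Optional[int]]:
    """Split a Subdomain.status string into (kind, http_code).

    Hypothetical helper: "resolved" and "no_http" are passed through,
    numeric strings in the HTTP range become ("http", code), and
    anything else is reported as "unknown".
    """
    if status in ("resolved", "no_http"):
        return status, None
    if status.isascii() and status.isdigit() and 100 <= int(status) <= 599:
        return "http", int(status)
    return "unknown", None
```

A caller could then treat any ("http", code) result as a live HTTP host, regardless of whether the code is 200 or 404.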

Web Application Nodes

Represent the application layer:

| Node | Key Properties | Description |
|------|----------------|-------------|
| BaseURL | url, status_code, title, server, response_time_ms, resolved_ip | Live HTTP endpoint with response metadata |
| Endpoint | path, method, has_parameters, is_form, source | Discovered URL path with HTTP method |
| Parameter | name, position (query/body/header/path), is_injectable | Input parameter, flagged when vulnerable |

Technology & Security Nodes

Represent detected software and security posture:

| Node | Key Properties | Description |
|------|----------------|-------------|
| Technology | name, version, categories, confidence, detected_by, known_cve_count | Framework, library, or server |
| Header | name, value, is_security_header | HTTP response header |
| Certificate | subject_cn, issuer, not_after, san, tls_version | TLS certificate details |
| DNSRecord | type (A/AAAA/MX/NS/TXT/SOA), value, ttl | DNS record |

Vulnerability & Exploitation Nodes

Represent security findings and successful attacks:

| Node | Key Properties | Description |
|------|----------------|-------------|
| Vulnerability | id, name, severity (lowercase), source (nuclei/gvm/security_check/nmap_nse), curl_command | Scanner finding with evidence |
| CVE | id, cvss, severity (uppercase), description, published | Known vulnerability from NVD |
| MitreData | cve_id, cwe_id, cwe_name, abstraction | CWE weakness mapping |
| Capec | capec_id, name, likelihood, severity, execution_flow | Common attack pattern |
| ChainFinding | finding_type, severity, title, evidence, confidence | EvoGraph: agent discovery (replaces legacy Exploit node); see EvoGraph |
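
Vulnerability.severity is stored in lowercase while CVE.severity is uppercase, so a direct string comparison between the two never matches. A small sketch of case-insensitive ranking follows; the five-level scale and the helper name are assumptions, since the page does not list the allowed severity values.

```python
# Assumed severity scale; the actual set of values used by RedAmon
# is not listed on this page.
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def severity_rank(value: str) -> int:
    """Return a comparable rank for a severity string in any letter case.

    Unrecognised values rank below everything else (-1).
    """
    return SEVERITY_RANK.get(value.strip().lower(), -1)


def same_severity(vulnerability_severity: str, cve_severity: str) -> bool:
    """Compare a Vulnerability severity (lowercase) with a CVE severity (uppercase)."""
    return severity_rank(vulnerability_severity) == severity_rank(cve_severity)
```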

JS Reconnaissance Nodes

Represent analyzed JavaScript files and their findings (hierarchical structure):

| Node | Key Properties | Description |
|------|----------------|-------------|
| JsReconFinding (js_file) | id, finding_type='js_file', title, source_url, is_uploaded | Analyzed JS file -- parent node for all findings from that file |
| JsReconFinding (finding) | id, finding_type, severity, confidence, title, evidence | Individual finding. finding_type ∈ {dependency_confusion, source_map_exposure, dom_sink, framework, dev_comment, email, internal_ip, object_reference, cloud_asset, external_domain}. Type-specific extras: cloud_asset -> cloud_provider/cloud_asset_type; external_domain -> times_seen/sample_urls; object_reference -> potential_idor (UUID v4 heuristic, not actual IDOR detection) |

Each JS file becomes a JsReconFinding node with finding_type='js_file'. All findings, secrets, and endpoints from that file are linked to it -- not directly to Domain/BaseURL. Secrets and endpoints reuse the existing Secret and Endpoint node types with source='js_recon'. external_domain findings are the exception: they link directly to Domain since they have no single parent JS file.
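
The potential_idor flag on object_reference findings is described as a UUID v4 heuristic. RedAmon's actual matching logic is not shown here; the sketch below is one plausible form of such a check, with a hypothetical function name, that matches only the version-4 layout (a literal 4 opening the third group and 8, 9, a, or b opening the fourth).

```python
import re

# UUID version 4: xxxxxxxx-xxxx-4xxx-[89ab]xxx-xxxxxxxxxxxx
UUID_V4 = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)


def has_uuid_v4(text: str) -> bool:
    """Return True when the text contains something shaped like a UUID v4.

    This only recognises the identifier format. It says nothing about
    whether the referenced object is actually accessible without
    authorisation, which is why it is a heuristic and not IDOR detection.
    """
    return UUID_V4.search(text) is not None
```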

TruffleHog Secret Scanning Nodes

Represent findings from TruffleHog secret scanning:

Node Key Properties Description
TrufflehogScan id, target_org, started_at, completed_at, total_findings, total_repositories Scan metadata and statistics
TrufflehogRepository id, name A scanned GitHub repository
TrufflehogFinding detector_name, verified, redacted, repository, file, commit, line, link Individual secret finding with verification status

Relationship Chain

The graph connects nodes through directed relationships that mirror real infrastructure:

Domain ──HAS_SUBDOMAIN──> Subdomain
Domain ──HAS_EXTERNAL_DOMAIN──> ExternalDomain
Subdomain ──RESOLVES_TO──> IP
IP ──HAS_PORT──> Port
Port ──RUNS_SERVICE──> Service
Port ──SERVES_URL──> BaseURL
Service ──POWERED_BY──> BaseURL
BaseURL ──HAS_ENDPOINT──> Endpoint
BaseURL ──USES_TECHNOLOGY──> Technology
BaseURL ──HAS_HEADER──> Header
BaseURL ──HAS_CERTIFICATE──> Certificate
Endpoint ──HAS_PARAMETER──> Parameter
Technology ──HAS_KNOWN_CVE──> CVE
CVE ──HAS_CWE──> MitreData
MitreData ──HAS_CAPEC──> Capec
Vulnerability ──FOUND_AT──> Endpoint
Vulnerability ──AFFECTS_PARAMETER──> Parameter
Vulnerability ──HAS_CVE──> CVE
IP/Subdomain/Domain ──HAS_VULNERABILITY──> Vulnerability

Nmap enrichment (service detection + NSE scripts):
Service ──USES_TECHNOLOGY──> Technology
Port ──HAS_TECHNOLOGY──> Technology
Vulnerability ──AFFECTS──> Port
Vulnerability ──FOUND_ON──> Technology

JS Reconnaissance (hierarchical: parent -> file -> findings):
BaseURL ──HAS_JS_FILE──> JsReconFinding(js_file)     [pipeline-crawled JS]
Domain  ──HAS_JS_FILE──> JsReconFinding(js_file)     [uploaded JS files]
JsReconFinding(js_file) ──HAS_JS_FINDING──> JsReconFinding
JsReconFinding(js_file) ──HAS_SECRET──> Secret        [source='js_recon']
JsReconFinding(js_file) ──HAS_ENDPOINT──> Endpoint    [source='js_recon']
Domain ──HAS_JS_FINDING──> JsReconFinding(external_domain)   [no single parent JS file]

TruffleHog secret scanning:
Domain ──HAS_TRUFFLEHOG_SCAN──> TrufflehogScan
TrufflehogScan ──HAS_REPOSITORY──> TrufflehogRepository
TrufflehogRepository ──HAS_FINDING──> TrufflehogFinding

EvoGraph bridges (attack chain → recon graph):
AttackChain ─ ─CHAIN_TARGETS─ ─> IP / Subdomain
ChainStep ─ ─STEP_TARGETED─ ─> IP / Port
ChainStep ─ ─STEP_EXPLOITED─ ─> CVE
ChainFinding ─ ─FOUND_ON─ ─> IP / Subdomain
ChainFinding ─ ─FINDING_RELATES_CVE─ ─> CVE
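
To see how the core relationships compose into traversals, the chain can be treated as a small directed schema graph. The sketch below encodes only the first group of relationships listed above (without the Nmap, JS Reconnaissance, TruffleHog, and EvoGraph additions) and uses breadth-first search to find the shortest sequence of relationship types between two node labels. The function and constant names are illustrative assumptions.

```python
from collections import deque
from typing import List, Optional

# (source label, relationship type, target label), core chain only.
SCHEMA_EDGES = [
    ("Domain", "HAS_SUBDOMAIN", "Subdomain"),
    ("Domain", "HAS_EXTERNAL_DOMAIN", "ExternalDomain"),
    ("Subdomain", "RESOLVES_TO", "IP"),
    ("IP", "HAS_PORT", "Port"),
    ("Port", "RUNS_SERVICE", "Service"),
    ("Port", "SERVES_URL", "BaseURL"),
    ("Service", "POWERED_BY", "BaseURL"),
    ("BaseURL", "HAS_ENDPOINT", "Endpoint"),
    ("BaseURL", "USES_TECHNOLOGY", "Technology"),
    ("BaseURL", "HAS_HEADER", "Header"),
    ("BaseURL", "HAS_CERTIFICATE", "Certificate"),
    ("Endpoint", "HAS_PARAMETER", "Parameter"),
    ("Technology", "HAS_KNOWN_CVE", "CVE"),
    ("CVE", "HAS_CWE", "MitreData"),
    ("MitreData", "HAS_CAPEC", "Capec"),
    ("Vulnerability", "FOUND_AT", "Endpoint"),
    ("Vulnerability", "AFFECTS_PARAMETER", "Parameter"),
    ("Vulnerability", "HAS_CVE", "CVE"),
    ("IP", "HAS_VULNERABILITY", "Vulnerability"),
    ("Subdomain", "HAS_VULNERABILITY", "Vulnerability"),
    ("Domain", "HAS_VULNERABILITY", "Vulnerability"),
]


def relationship_path(start: str, goal: str) -> Optional[List[str]]:
    """Shortest list of relationship types leading from start to goal.

    Follows relationships in their stored direction only and returns
    None when the goal label cannot be reached.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        label, path = queue.popleft()
        if label == goal:
            return path
        for src, rel, dst in SCHEMA_EDGES:
            if src == label and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [rel]))
    return None
```

For example, relationship_path("Domain", "Port") walks HAS_SUBDOMAIN, RESOLVES_TO, HAS_PORT, which is the same chain a Cypher MATCH would spell out.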

Vulnerability Source Differences

Vulnerabilities connect differently depending on their source:

| Source | Connection Pattern |
|--------|--------------------|
| Nuclei (web application) | Linked via FOUND_AT to the Endpoint and AFFECTS_PARAMETER to the vulnerable Parameter |
| GVM (network level) | Linked via HAS_VULNERABILITY directly to IP and Subdomain nodes |
| Nmap NSE (service level) | Linked via AFFECTS to Port, FOUND_ON to Technology, and HAS_CVE to CVE |
| Security checks (DNS/email/headers) | Linked via HAS_VULNERABILITY to Subdomain or Domain |
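
Since the surrounding context of a Vulnerability depends on its source, a query for that context has to pick the right relationships. The sketch below maps each source value (as listed for the Vulnerability.source property) to the patterns from the table and assembles a Cypher query string; the function name, variable names, and parameter names ($source, $userId, $projectId) are assumptions.

```python
from typing import Dict, List

# Cypher path patterns per Vulnerability.source, taken from the table above.
CONNECTION_PATTERNS: Dict[str, List[str]] = {
    "nuclei": [
        "(v)-[:FOUND_AT]->(:Endpoint)",
        "(v)-[:AFFECTS_PARAMETER]->(:Parameter)",
    ],
    "gvm": [
        "(:IP)-[:HAS_VULNERABILITY]->(v)",
        "(:Subdomain)-[:HAS_VULNERABILITY]->(v)",
    ],
    "nmap_nse": [
        "(v)-[:AFFECTS]->(:Port)",
        "(v)-[:FOUND_ON]->(:Technology)",
        "(v)-[:HAS_CVE]->(:CVE)",
    ],
    "security_check": [
        "(:Subdomain)-[:HAS_VULNERABILITY]->(v)",
        "(:Domain)-[:HAS_VULNERABILITY]->(v)",
    ],
}


def context_query(source: str) -> str:
    """Build a Cypher query returning a vulnerability with its source-specific context."""
    patterns = CONNECTION_PATTERNS[source]  # KeyError for an unknown source
    lines = [
        "MATCH (v:Vulnerability {source: $source, "
        "user_id: $userId, project_id: $projectId})"
    ]
    for index, pattern in enumerate(patterns):
        # OPTIONAL MATCH keeps the vulnerability even if one link is absent.
        lines.append(f"OPTIONAL MATCH p{index} = {pattern}")
    returned = ", ".join(f"p{index}" for index in range(len(patterns)))
    lines.append(f"RETURN v, {returned}")
    return "\n".join(lines)
```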

Multi-Tenant Design

Every node includes user_id and project_id properties. All queries are automatically scoped to the current user and project — the AI agent never generates tenant filters itself, preventing accidental cross-project data access.

Query Pattern

MATCH (d:Domain {user_id: $userId, project_id: $projectId})
-[:HAS_SUBDOMAIN]->(s:Subdomain)
-[:RESOLVES_TO]->(ip:IP)
-[:HAS_PORT]->(p:Port)
RETURN d, s, ip, p
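
How RedAmon injects the tenant scope is not detailed on this page. One simple way to guarantee that the agent cannot choose the tenant is to bind $userId and $projectId server-side, overwriting anything supplied with the generated query. The sketch below illustrates that idea; scoped_params is a hypothetical helper.

```python
from typing import Any, Dict, Mapping

# The query pattern above, kept as a parameterised string.
DOMAIN_TO_PORT_QUERY = """
MATCH (d:Domain {user_id: $userId, project_id: $projectId})
      -[:HAS_SUBDOMAIN]->(s:Subdomain)
      -[:RESOLVES_TO]->(ip:IP)
      -[:HAS_PORT]->(p:Port)
RETURN d, s, ip, p
"""


def scoped_params(
    agent_params: Mapping[str, Any], user_id: str, project_id: str
) -> Dict[str, Any]:
    """Merge agent-supplied parameters with the trusted tenant context.

    The tenant values always win, so a generated query cannot point
    $userId or $projectId at another user's or project's data.
    """
    params = dict(agent_params)
    params["userId"] = user_id
    params["projectId"] = project_id
    return params
```

With the official Neo4j Python driver, the result would typically be passed as the parameters argument of session.run(DOMAIN_TO_PORT_QUERY, params).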

How the AI Agent Uses the Graph

Before taking any offensive action, the agent queries the graph to build situational awareness:

  1. Attack surface mapping — queries the Domain → Subdomain → IP → Port → Service chain
  2. Technology-CVE correlation — traverses Technology → CVE relationships, prioritizing by CVSS score
  3. Injectable parameter discovery — queries Parameter nodes flagged as is_injectable: true
  4. Exploit feasibility assessment — cross-references ports, services, and CVEs to find matching Metasploit modules
  5. Post-exploitation context — after exploiting, creates a ChainFinding(exploit_success) in the EvoGraph, bridged to the target IP and CVE
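
As an illustration of step 2, the Technology → CVE correlation could be expressed as the query below, with an equivalent client-side ranking for rows that have already been fetched. This is a sketch under assumptions: the query text, the $limit parameter, and the top_cves helper are not taken from RedAmon. The IS NOT NULL filter matters because Cypher sorts null values first in descending order.

```python
from typing import Any, Dict, List

TECH_CVE_QUERY = """
MATCH (t:Technology {user_id: $userId, project_id: $projectId})
      -[:HAS_KNOWN_CVE]->(c:CVE)
WHERE c.cvss IS NOT NULL
RETURN t.name AS technology, t.version AS version,
       c.id AS cve, c.cvss AS cvss
ORDER BY c.cvss DESC
LIMIT $limit
"""


def top_cves(rows: List[Dict[str, Any]], limit: int = 10) -> List[Dict[str, Any]]:
    """Rank already-fetched rows by CVSS score, highest first.

    Rows without a score are dropped, mirroring the WHERE clause above.
    """
    scored = [row for row in rows if row.get("cvss") is not None]
    return sorted(scored, key=lambda row: row["cvss"], reverse=True)[:limit]
```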

The text-to-Cypher system includes 25+ example query patterns and automatically retries with error context on failure (up to 3 attempts).
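
The retry behaviour can be pictured as a loop that feeds the previous error back into query generation. The sketch below assumes two caller-supplied callables, generate(question, error) and execute(query); both are stand-ins for whatever the real system uses, and only the three-attempt limit comes from the description above.

```python
from typing import Any, Callable, Optional


def run_with_retry(
    question: str,
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], Any],
    max_attempts: int = 3,
) -> Any:
    """Generate a Cypher query, run it, and retry with the error as context.

    generate receives the question plus the previous error message
    (None on the first attempt); execute runs the query and raises on
    failure. After max_attempts failures the last error is re-raised
    wrapped in a RuntimeError.
    """
    error: Optional[str] = None
    for _ in range(max_attempts):
        query = generate(question, error)
        try:
            return execute(query)
        except Exception as exc:  # any execution error becomes retry context
            error = str(exc)
    raise RuntimeError(f"query failed after {max_attempts} attempts: {error}")
```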


Graph Visualization

The graph is visualized in the Red Zone:

  • 2D mode — force-directed layout with pan, zoom, and node selection
  • 3D mode — WebGL rendering with rotation for large graphs
  • Color coding — each node type has a distinct color
  • Filtering — use the bottom bar to show/hide specific node types
  • Node drawer — click any node to see all properties

EvoGraph Bridge

The recon graph is connected to RedAmon's EvoGraph (Evolutive Attack Chain Graph) through bridge relationships. EvoGraph tracks everything the AI agent does during exploitation sessions (every tool execution, finding, decision, and failure) and links each record back to the recon graph nodes it relates to.

| Relationship | From (EvoGraph) | To (Recon Graph) | Purpose |
|--------------|-----------------|------------------|---------|
| CHAIN_TARGETS | AttackChain | IP / Subdomain / Port / CVE / Domain | Attack chain's target |
| STEP_TARGETED | ChainStep | IP / Subdomain / Port | Step's target infrastructure |
| STEP_EXPLOITED | ChainStep | CVE | CVE this step attempted to exploit |
| STEP_IDENTIFIED | ChainStep | Technology | Technology identified during this step |
| FOUND_ON | ChainFinding | IP / Subdomain | Where the finding was discovered |
| FINDING_RELATES_CVE | ChainFinding | CVE | CVE related to the finding |
| CREDENTIAL_FOR | ChainFinding | Service / Port | Service/port the credential works on |

This means you can traverse from any recon graph node to see all attack chain activity that targeted it — enabling cross-session intelligence queries like "what has been tried against this IP?" or "which CVEs have been successfully exploited?".
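
A question such as "what has been tried against this IP?" amounts to following every bridge relationship that can point at an IP node. The sketch below derives those bridges from the table above and assembles a Cypher query from them; the helper names, result aliases, and the $address parameter are illustrative assumptions.

```python
from typing import List, Tuple

# (relationship, EvoGraph source label, allowed recon target labels)
BRIDGES = [
    ("CHAIN_TARGETS", "AttackChain", ("IP", "Subdomain", "Port", "CVE", "Domain")),
    ("STEP_TARGETED", "ChainStep", ("IP", "Subdomain", "Port")),
    ("STEP_EXPLOITED", "ChainStep", ("CVE",)),
    ("STEP_IDENTIFIED", "ChainStep", ("Technology",)),
    ("FOUND_ON", "ChainFinding", ("IP", "Subdomain")),
    ("FINDING_RELATES_CVE", "ChainFinding", ("CVE",)),
    ("CREDENTIAL_FOR", "ChainFinding", ("Service", "Port")),
]


def bridges_into(label: str) -> List[Tuple[str, str]]:
    """List (relationship, source label) pairs that can target the given recon label."""
    return [(rel, src) for rel, src, targets in BRIDGES if label in targets]


def ip_activity_query() -> str:
    """Build a Cypher query collecting all attack chain activity for one IP."""
    lines = [
        "MATCH (n:IP {address: $address, user_id: $userId, project_id: $projectId})"
    ]
    returned = ["n"]
    for index, (rel, src) in enumerate(bridges_into("IP")):
        lines.append(f"OPTIONAL MATCH (a{index}:{src})-[:{rel}]->(n)")
        # DISTINCT removes duplicates created by stacking OPTIONAL MATCH clauses.
        returned.append(f"collect(DISTINCT a{index}) AS {rel.lower()}")
    lines.append("RETURN " + ", ".join(returned))
    return "\n".join(lines)
```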

Full details: See EvoGraph — Attack Chain Evolution for the complete attack chain graph schema, node types, and cross-session learning.

