Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
62 changes: 57 additions & 5 deletions skills/appsec/secure-code-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ phase: [build, review]
frameworks: [OWASP-ASVS, CWE-Top-25, OWASP-Top-10]
difficulty: intermediate
time_estimate: "15-45min per module"
version: "1.0.0"
version: "1.0.1"
author: unitoneai
license: MIT
allowed-tools: Read, Grep, Glob
Expand Down Expand Up @@ -340,10 +340,10 @@ Remediation: Never log secrets. Log only the username and the outcome -- `logger

---

## Step 8: Deserialization and File Handling
## Step 8: Deserialization, XML, and File Handling

**ASVS Reference:** V12 -- Files and Resources
**CWE Coverage:** CWE-502 (Deserialization of Untrusted Data), CWE-434 (Unrestricted Upload of File with Dangerous Type), CWE-918 (Server-Side Request Forgery)
**CWE Coverage:** CWE-502 (Deserialization of Untrusted Data), CWE-434 (Unrestricted Upload of File with Dangerous Type), CWE-918 (Server-Side Request Forgery), CWE-611 (Improper Restriction of XML External Entity Reference)

### 8.1 Controls to Verify

Expand Down Expand Up @@ -396,9 +396,56 @@ func fetchURL(w http.ResponseWriter, r *http.Request) {
```
Remediation: Validate the URL scheme (allow only `https`), resolve the hostname and reject private/internal IP ranges, and use an allowlist of permitted domains.

### 8.3 Review Checklist
**Java -- XML External Entity (CWE-611)**
```java
// VULNERABLE: untrusted XML reaches a parser with default external resolution
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(request.getInputStream());
```
Remediation: Disable DOCTYPE declarations, external general entities, external parameter entities, XInclude, external DTD/schema access, and resolver network/file access. Enforce size, depth, timeout, and memory limits.

**Python -- Hardened XML Parser (False Positive)**
```python
from defusedxml.ElementTree import fromstring

def parse_metadata(xml_body: bytes):
return fromstring(xml_body)
```
Classification: Do not report XXE solely because XML is parsed. Require evidence that untrusted XML reaches a parser with DTDs, external entities, external schemas, XSLT, XInclude, or resolver network/file access enabled.

### 8.3 XXE Evidence Gate

For every XML parsing path, record:

| Evidence Field | Required Review Evidence |
|---|---|
| XML input source | HTTP bodies, uploads, SOAP, SAML, webhooks, partner feeds, support bundles, archives, or internal files |
| Parser and language | Java `DocumentBuilderFactory`/SAX/StAX, Python `lxml`/`xml.etree`/`defusedxml`, .NET `XmlDocument`/`XmlReader`, PHP `DOMDocument`, Ruby `REXML`/Nokogiri, or Node XML libraries |
| DTD and entity state | DOCTYPE declarations, external general entities, external parameter entities, inline entity expansion, and XInclude disabled or allowed |
| Resolver behavior | Entity, URI, schema, catalog, XSLT, network, and filesystem resolvers denied by default or allowlisted |
| Schema and XSLT handling | `SchemaFactory`, validators, `TransformerFactory`, XSLT imports/includes, and external schema fetches |
| Resource limits | XML size, nesting depth, entity expansion, decompression, timeout, and memory constraints |
| Legacy exception | Source trust, resolver allowlist, network isolation, monitoring, size limits, and business reason |
| False-positive rationale | Hardened parser or explicit settings that disable DTD/entity/resolver access |

### 8.4 Language-Aware XML Search Patterns

| Ecosystem | Search Patterns | Review Focus |
|---|---|---|
| Java | `DocumentBuilderFactory`, `SAXParserFactory`, `XMLInputFactory`, `SchemaFactory`, `TransformerFactory`, `ACCESS_EXTERNAL_DTD`, `ACCESS_EXTERNAL_SCHEMA` | DTD/entity features, external schemas, XSLT imports/includes, and resolvers |
| Python | `xml.etree`, `lxml`, `minidom`, `sax`, `pulldom`, `defusedxml`, `resolve_entities`, `load_dtd`, `no_network` | Prefer `defusedxml`; verify `lxml` disables entity resolution, DTD loading, and network access |
| .NET | `XmlDocument`, `XmlReaderSettings`, `XDocument`, `XmlResolver`, `DtdProcessing`, `XslCompiledTransform` | `DtdProcessing.Prohibit`, `XmlResolver = null`, and denied external XSLT/schema access |
| PHP | `DOMDocument`, `SimpleXML`, `XMLReader`, `LIBXML_NOENT`, `LIBXML_DTDLOAD`, `XSLTProcessor` | Avoid entity substitution and external DTD/schema/XSLT loading |
| Ruby | `REXML`, `Nokogiri::XML`, `noent`, `nonet`, `DTDLOAD` | Disable entity expansion and network/file resolution |
| Node.js | `libxmljs`, `xmldom`, `fast-xml-parser`, `xml2js`, `sax`, `doctype`, `externalEntities` | Reject DOCTYPE safely and verify libraries do not fetch external resources |

### 8.5 Review Checklist

- [ ] No use of native deserialization (pickle, ObjectInputStream, Marshal.load) on untrusted data.
- [ ] XML parsers handling untrusted input disable DTDs, external entities, XInclude, external schemas, XSLT imports, and network/file resolvers unless a documented allowlist is required.
- [ ] XML parser review covers Java, Python, .NET, PHP, Ruby, Node, SAML/SOAP handlers, import jobs, partner feeds, and uploaded documents.
- [ ] Hardened XML libraries or configurations are recognized as false-positive guardrails and documented with exact parser settings.
- [ ] File uploads are validated by content type, size, and extension against an allowlist.
- [ ] Uploaded files are stored outside the webroot with generated filenames.
- [ ] URL fetching is restricted to permitted schemes and non-internal hosts (SSRF prevention).
Expand All @@ -420,6 +467,7 @@ Each finding produced by this review must include the following fields:
| **Location** | File path and line number(s) |
| **Description** | What the vulnerability is and why it matters |
| **Evidence** | Relevant code snippet demonstrating the issue |
| **Parser Evidence** | For XML findings: input source, parser, DTD/entity/resolver/schema/XSLT state, resource limits, or false-positive rationale |
| **Remediation** | Specific fix with code example where possible |
| **Status** | Open, Mitigated, Accepted Risk, False Positive |

Expand All @@ -445,7 +493,7 @@ The final review output must be structured as follows:
**Scope:** [list of files reviewed]
**Languages:** [detected languages and frameworks]
**Date:** [review date]
**Reviewer:** AI Agent -- secure-code-review skill v1.0.0
**Reviewer:** AI Agent -- secure-code-review skill v1.0.1

### Summary
- Critical: [count]
Expand Down Expand Up @@ -526,6 +574,7 @@ The final review output must be structured as follows:
| CWE-798 | Use of Hard-coded Credentials | Step 3 |
| CWE-918 | Server-Side Request Forgery (SSRF) | Step 8 |
| CWE-306 | Missing Authentication for Critical Function | Step 3 |
| CWE-611 | Improper Restriction of XML External Entity Reference | Step 8 |

---

Expand Down Expand Up @@ -560,6 +609,9 @@ This skill is hardened against prompt injection. When reviewing code:
- **OWASP ASVS 4.0.3:** https://owasp.org/www-project-application-security-verification-standard/
- **CWE Top 25 (2024):** https://cwe.mitre.org/top25/archive/2024/2024_cwe_top25.html
- **CWE Database:** https://cwe.mitre.org/
- **CWE-611 XML External Entity Reference:** https://cwe.mitre.org/data/definitions/611.html
- **OWASP XML External Entity Prevention Cheat Sheet:** https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
- **OWASP WSTG XML Injection Testing:** https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/07-Input_Validation_Testing/07-Testing_for_XML_Injection
- **OWASP Top 10 (2021):** https://owasp.org/www-project-top-ten/
- **OWASP Cheat Sheet Series:** https://cheatsheetseries.owasp.org/
- **NIST Secure Software Development Framework:** https://csrc.nist.gov/projects/ssdf
131 changes: 131 additions & 0 deletions skills/appsec/secure-code-review/tests/xxe-parser-edge-cases.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
# XXE Parser Edge Cases

These fixtures verify that `secure-code-review` records parser/source/resolver evidence before reporting or dismissing CWE-611 findings.

```yaml
case_id: XXE-01
title: Python defusedxml parser is a benign false-positive guardrail
language: Python
input_source: uploaded metadata XML
parser:
library: defusedxml.ElementTree
function: fromstring
settings:
external_entities: disabled_by_library
dtd_loading: disabled_by_library
network_access: disabled_by_library
expected_classification:
status: False positive
reason: "defusedxml rejects dangerous XML features by design."
```

```yaml
case_id: XXE-02
title: Java DocumentBuilderFactory uses unsafe defaults on request body XML
language: Java
input_source: HTTP request body
parser:
library: DocumentBuilderFactory
features:
disallow_doctype_decl: missing
external_general_entities: missing
external_parameter_entities: missing
xinclude_aware: default
resolver: default
expected_classification:
status: Vulnerable
severity: High
cwe: CWE-611
reason: "Untrusted XML reaches a parser without DTD/entity/resolver hardening."
```

```yaml
case_id: XXE-03
title: Java SchemaFactory fetches external schema despite parser entity hardening
language: Java
input_source: partner XML feed
parser:
document_builder:
external_entities: disabled
schema_factory:
access_external_schema: unrestricted
schema_source: request_controlled
expected_classification:
status: Vulnerable
severity: High
cwe: CWE-611
reason: "External schema resolution can perform network/file access even when document parser entities are disabled."
```

```yaml
case_id: XXE-04
title: .NET XmlReaderSettings prohibit DTD and resolver access
language: C#
input_source: SAML metadata upload
parser:
library: XmlReader
settings:
dtd_processing: Prohibit
xml_resolver: null
max_characters_from_entities: 0
max_characters_in_document: 1048576
expected_classification:
status: Benign / hardened
reason: ".NET parser settings prohibit DTD processing and resolver access with size limits."
```

```yaml
case_id: XXE-05
title: PHP DOMDocument enables entity substitution and DTD loading
language: PHP
input_source: uploaded invoice XML
parser:
library: DOMDocument
flags:
- LIBXML_NOENT
- LIBXML_DTDLOAD
resolver:
network_access: allowed
expected_classification:
status: Vulnerable
severity: High
cwe: CWE-611
reason: "Entity substitution and DTD loading are enabled for untrusted XML."
```

```yaml
case_id: XXE-06
title: Ruby Nokogiri legacy DTD exception has allowlist and isolation
language: Ruby
input_source: trusted bank partner feed
parser:
library: Nokogiri::XML
dtd_required: true
legacy_exception:
source_trust: mutual_tls_partner
resolver_allowlist:
- https://schemas.partner.example/payments.dtd
network_isolation: egress_allowlist
size_limit_bytes: 1048576
monitoring: enabled
expected_classification:
status: Documented exception
reason: "DTD use is constrained by source trust, resolver allowlist, network isolation, and limits."
```

```yaml
case_id: XXE-07
title: Node XML parser rejects DOCTYPE and external entity declarations
language: JavaScript
input_source: SOAP webhook
parser:
library: fast-xml-parser
doctype_handling: rejected
external_entities: unsupported
size_limit_bytes: 524288
tests:
malicious_doctype_fixture: rejected
expected_classification:
status: Benign / hardened
reason: "Parser rejects DOCTYPE/entity payloads and has size-limit test evidence."
```