Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions .jules/sentinel.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,3 +39,11 @@
**Prevention:**
1. Parse URLs and check hostnames against `localhost` and private IP ranges using `ipaddress` module.
2. Enforce strict length limits on user inputs (e.g., profile IDs) to prevent resource exhaustion or buffer abuse.

## 2025-01-24 - [SSRF via DNS Resolution]

Check notice

Code scanning / Remark-lint (reported by Codacy)

Warn when references to undefined definitions are found. Note

[no-undefined-references] Found reference to undefined definition

Check notice

Code scanning / Remark-lint (reported by Codacy)

Warn when shortcut reference links are used. Note

[no-shortcut-reference-link] Use the trailing [] on reference links
**Vulnerability:** The `validate_folder_url` function performed checks on the initial hostname but did not resolve domains to their IP addresses. This allowed attackers to bypass private IP blocklists by using a custom domain (e.g., `local.test`) that resolves to a private IP (e.g., `127.0.0.1`), leading to SSRF.
**Learning:** Validating hostnames without DNS resolution is insufficient for SSRF protection. Attackers can easily map public domains to internal IPs (DNS Rebinding/Pinning).
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term "DNS Rebinding/Pinning" is potentially confusing. DNS Rebinding is an attack technique, while DNS Pinning is a defense mechanism. Consider clarifying this to "DNS Rebinding attacks" for accuracy, as the context discusses how attackers can bypass validation by controlling DNS resolution.

Suggested change
**Learning:** Validating hostnames without DNS resolution is insufficient for SSRF protection. Attackers can easily map public domains to internal IPs (DNS Rebinding/Pinning).
**Learning:** Validating hostnames without DNS resolution is insufficient for SSRF protection. Attackers can easily map public domains to internal IPs (DNS rebinding attacks).

Copilot uses AI. Check for mistakes.
**Prevention:**
1. Resolve domains to their IP addresses using `socket.getaddrinfo`.
2. Validate all resolved IPs against a blocklist of private/loopback ranges.
3. Fail closed if DNS resolution fails.
16 changes: 14 additions & 2 deletions main.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
import sys
import time
import re
import socket
import concurrent.futures
import threading
import ipaddress
Expand Down Expand Up @@ -209,8 +210,19 @@
log.warning(f"Skipping unsafe URL (private IP): {sanitize_for_log(url)}")
return False
except ValueError:
# Not an IP literal, it's a domain.
pass
# Not an IP literal, it's a domain. Resolve to check for private IPs.
try:
# Resolve hostname to IPs
addr_infos = socket.getaddrinfo(hostname, None)
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DNS resolution using socket.getaddrinfo is synchronous and blocking, which can cause performance issues. Each URL validation will block the calling thread while performing DNS lookups, which could significantly slow down the application, especially when processing multiple URLs. Consider adding a timeout parameter to prevent hanging on slow DNS servers, or document this performance characteristic if it's acceptable for the use case.

Copilot uses AI. Check for mistakes.
for family, kind, proto, canonname, sockaddr in addr_infos:

Check warning

Code scanning / Prospector (reported by Codacy)

Unused variable 'family' (unused-variable) Warning

Unused variable 'family' (unused-variable)

Check notice

Code scanning / Pylint (reported by Codacy)

Unused variable 'proto' Note

Unused variable 'proto'

Check notice

Code scanning / Pylint (reported by Codacy)

Unused variable 'kind' Note

Unused variable 'kind'

Check notice

Code scanning / Pylint (reported by Codacy)

Unused variable 'family' Note

Unused variable 'family'

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Unused variable 'kind' Note

Unused variable 'kind'

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Unused variable 'family' Note

Unused variable 'family'

Check notice

Code scanning / Pylint (reported by Codacy)

Unused variable 'canonname' Note

Unused variable 'canonname'

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Unused variable 'proto' Note

Unused variable 'proto'

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Unused variable 'canonname' Note

Unused variable 'canonname'
ip_str = sockaddr[0]
resolved_ip = ipaddress.ip_address(ip_str)
if resolved_ip.is_private or resolved_ip.is_loopback:
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation only checks is_private and is_loopback, but doesn't check for other potentially dangerous IP address categories. Consider also checking is_link_local (e.g., 169.254.0.0/16), is_multicast, and is_reserved to provide more comprehensive SSRF protection against access to special-use IP addresses.

Copilot uses AI. Check for mistakes.
log.warning(f"Skipping unsafe URL (domain {sanitize_for_log(hostname)} resolves to private IP {resolved_ip}): {sanitize_for_log(url)}")

Check warning

Code scanning / Prospector (reported by Codacy)

Use lazy % formatting in logging functions (logging-fstring-interpolation) Warning

Use lazy % formatting in logging functions (logging-fstring-interpolation)

Check warning

Code scanning / Pylint (reported by Codacy)

Line too long (159/100) Warning

Line too long (159/100)

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (159/100) Warning

Line too long (159/100)

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Use lazy % formatting in logging functions Note

Use lazy % formatting in logging functions
return False
except socket.gaierror:
log.warning(f"Could not resolve hostname {sanitize_for_log(hostname)}, treating as unsafe.")

Check warning

Code scanning / Prospector (reported by Codacy)

Use lazy % formatting in logging functions (logging-fstring-interpolation) Warning

Use lazy % formatting in logging functions (logging-fstring-interpolation)

Check warning

Code scanning / Pylint (reported by Codacy)

Line too long (108/100) Warning

Line too long (108/100)

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (108/100) Warning

Line too long (108/100)

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Use lazy % formatting in logging functions Note

Use lazy % formatting in logging functions
Comment on lines +223 to +224
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The socket.gaierror exception only catches DNS resolution failures, but socket.getaddrinfo can also raise other exceptions like socket.timeout or OSError. While the outer try-except block at line 227 will catch these, it's less specific and the error message won't accurately reflect a DNS resolution failure. Consider catching a broader set of socket-related exceptions (OSError, socket.timeout) alongside socket.gaierror for more specific error handling.

Suggested change
except socket.gaierror:
log.warning(f"Could not resolve hostname {sanitize_for_log(hostname)}, treating as unsafe.")
except (socket.gaierror, socket.timeout, OSError) as exc:
log.warning(f"Could not resolve hostname {sanitize_for_log(hostname)} (socket error: {exc}), treating as unsafe.")

Copilot uses AI. Check for mistakes.
return False
Comment on lines +213 to +225
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This DNS resolution approach is vulnerable to Time-of-Check-Time-of-Use (TOCTOU) race conditions. An attacker could use DNS rebinding where the domain initially resolves to a safe IP during validation, but later resolves to a private IP when the actual request is made. While this fix improves security, be aware that it doesn't fully protect against sophisticated DNS rebinding attacks. Consider implementing additional protections such as DNS pinning (caching resolved IPs and using them for the request) or validating IPs at request time if httpx supports custom DNS resolution.

Suggested change
# Not an IP literal, it's a domain. Resolve to check for private IPs.
try:
# Resolve hostname to IPs
addr_infos = socket.getaddrinfo(hostname, None)
for family, kind, proto, canonname, sockaddr in addr_infos:
ip_str = sockaddr[0]
resolved_ip = ipaddress.ip_address(ip_str)
if resolved_ip.is_private or resolved_ip.is_loopback:
log.warning(f"Skipping unsafe URL (domain {sanitize_for_log(hostname)} resolves to private IP {resolved_ip}): {sanitize_for_log(url)}")
return False
except socket.gaierror:
log.warning(f"Could not resolve hostname {sanitize_for_log(hostname)}, treating as unsafe.")
return False
# Not an IP literal; avoid DNS-based validation here to prevent
# time-of-check vs time-of-use (TOCTOU) issues with DNS rebinding.
# Additional enforcement should be done at request time (e.g. via
# httpx transport / DNS configuration) if private IP access must be
# restricted for resolved domains.
pass

Copilot uses AI. Check for mistakes.

except Exception as e:
log.warning(f"Failed to validate URL {sanitize_for_log(url)}: {e}")
Expand Down
Loading