update: egress scan and URL to WASM version

nocomplexity · nocomplexity · commit 4bfa462a187c · 2026-04-14T21:40:28.000+02:00
diff --git a/module5/egressdetection.md b/module5/egressdetection.md
@@ -0,0 +1,132 @@
+---
+title: Data Exfiltration Detection
+short_title: Data Egress scan 
+---
+
+
+Detecting potential data exfiltration caused by Python programs is important.
+
+:::{important} 
+Detecting Data Exfiltration in Python Code that Uses Telemetry, Remote Analytics, and SaaS Integrations
+
+This is an essential step in **mitigating security risks**.
+:::
+
+
+## Why Python Data Exfiltration Detection Matters
+In Static Application Security Testing (SAST), identifying interactions with remote services is a fundamental requirement. A robust security audit must prioritize data exfiltrationâ€”the unauthorized or undocumented transfer of informationâ€”as a primary risk factor.
+
+## Understanding Data Egress
+
+Data egress occurs when information travels from your secure internal perimeter to an external destination. In a Python context, this includes the public internet, third-party cloud environments, partner networks, or SaaS integrations.
+
+## Legitimate vs. Malicious Intent
+
+In Python development, outbound data flow is often a core functional requirement. Modern applications rely on authorized egress paths for:
+- Communication: Sending automated emails or notifications.
+- Integration: Delivering API responses to external consumers.
+- Infrastructure: Syncing database backups to remote cloud storage.
+
+However, Python's flexibility makes it a prime candidate for advanced exfiltration techniques. Malicious actors or compromised dependencies can hide unauthorized data transfers within seemingly benign traffic, often bypassing standard network-level detection.
+
+
+**The Fallacy of "Anonymous" Collection**
+
+While many Python telemetry modules claim anonymity, **privacy risks** persist. If the backend systems are closed-source, they rely on **security by obscurity**, violating a core security principle. 
+
+:::{danger} 
+Telemetry and various Python analytics and remote monitoring modules often collect more metadata than documented, sending private data to unknown, potentially vulnerable services.
+:::
+
+## How to Check for Data Exfiltration
+
+**Python Code Audit** includes functionality to detect potential data exfiltration risks. This feature is available through:
+
+- the [CLI interface](userguide), and
+
+- the [API](apidocs/modules).
+
+Using the Python Code Audit CLI interface:
+The egress detection function can be activated with the following command:
+
+```bash
+codeaudit filescan <pythonfile|package-name|directory> [OUTPUTFILE]
+```
+
+**Report Output**
+
+In the generated HTML report, each analysed file is evaluated for potential data exfiltration to external services.
+
+If a potential risk is detected, the report will display:
+> *&#9888;&#65039; External Egress Risk: Detected outbound connection logic or API keys that may facilitate data egress.*
+
+The report also highlights the exact lines of code that triggered the detection.
+
+:::{tip} 
+**Always review discovered modules carefully.**
+
+In the report, under the section:
+
+`> View used modules in this file.`
+
+the HTML report lists all modules detected per file. Understanding each module is criticalâ€”some are strong indicators of possible data exchange with external systems. Review them to assess potential security or privacy risks.
+:::
+
+If no external egress risks are identified, the report will display:
+> *&#x2705; No logic for connecting to remote services found. Risk of data exfiltration to external systems is low.*
+
+
+
+:::{important} 
+No tool can provide 100% guarantees. This applies to Python Code Audit as well as to any other security analysis tool.
+:::
+
+
+:::{admonition} Scope and Intent of **Python Code Audit Egress Scanning**
+:class: note
+
+It is critical to distinguish between Data Egress Detection and Secret Scanning. While both are vital components of a secure development lifecycle, they address entirely different threat vectors.
+:::
+
+
+The **Python Code Audit** egress detection functionality is **NOT** designed to identify secrets within your source code.
+
+Understanding the Difference:
+* Data Egress Detection: Focuses on the destination and mechanism of data leaving your environment (e.g., identifying telemetry hooks, hidden API calls, or SaaS integrations).
+
+* Secret Scanning: Focuses on the credentials themselves (e.g., hardcoded API keys, passwords, or certificates), whether they are plaintext or obfuscated.
+
+
+## The "Shift-Left" Advantage
+Detecting exfiltration at the network level is reactive and expensive. It often fails when traffic is encrypted or blended with legitimate SaaS calls. Moving detection to the code level (Shift-Left) is more cost-effective and provides:
+
+1. Supply Chain Integrity: Auditing third-party libraries before integration. If a library contains undocumented "phone home" logic, it can be blocked early.
+2. Defense in Depth: Perimeter tools (Firewalls, DLP, CASBs) are essential but not infallible. Source code detection adds a vital internal layer of defense.
+
+Security Mandate: From a **Zero Trust** standpoint, organisations must verify if telemetry is present in their Python code and ensure all associated risks are mitigated through code, systems, and management processes.
+
+## Assessing the Security Risks
+
+Telemetry represents a deliberate hole in your network perimeter. When Python applications implement advanced tracking without granular consent, they transition from a "utility" to a significant security liability.
+
+1. Sensitive Data Leakage
+
+Telemetry often captures more than just "events." Without rigorous sanitization, these streams can include:
+- **PII (Personally Identifiable Information):** Usernames, IP addresses, and location data.
+- **Secrets in Logs:** Authentication tokens or database strings caught in stack traces.
+- **Business Logic:** Proprietary metadata revealing internal infrastructure.
+
+
+2. Expanded Attack Surface
+Every external API endpoint is a potential point of failure.
+
+-  **Unauthenticated Endpoints:** Many telemetry "sinks" lack robust auth, making them easy targets for interception.
+-  **Library Vulnerabilities:** The telemetry module itself may contain vulnerabilities (e.g., RCE or path traversal) that grant attackers a foothold.
+
+3. The "When, Not If" Data Breach
+
+Data sent to a third party is only as secure as their defenses.
+
+- **Loss of Custody**: Once data leaves your perimeter, you lose the ability to protect it.
+- **Transparency Gaps:** You are dependent on the provider to detect and report breachesâ€”a process that often takes months.
+
diff --git a/module5/installation.md b/module5/installation.md
@@ -8,6 +8,26 @@ For this course we use the best FOSS (Free and Open Source) SAST Tool for Python
 Python Code Audit is compatible with both Unix-based systems (Linux/macOS) and Windows.
 
 
+## Use the browser-based version
+
+To access the local browser-based version of Python Code Audit, follow the link below:
+
+{button}`Launch webbased version <https://nocomplexity.com/codeauditapp/dashboardapp.html>`
+
+
+The browser-based (WASM) version allows you to run **Python Code Audit** directly in your web browser without installing anything. This means you can quickly validate and inspect packages hosted on PyPI.org in a safe and isolated environment. It is especially useful for learning, quick checks, and reviewing package integrity before downloading or installing them locally.
+
+
+:::{note} 
+Validating local Python files and directories is not possible with the browser based version.
+:::
+
+
+## Install the package locally
+
+In order to make use of all the functionality of **Python Code Audit**, you must install the Python package locally.
+For this course and for regular security validation it is advised to use the full version.
+
 To install Python Code Audit, run the following command in your terminal or command prompt:
 ```Python
 pip install -U codeaudit
diff --git a/toc.yml b/toc.yml
@@ -34,6 +34,7 @@ project:
           - file: module5/complexitycheck.md
           - file: module5/sastscan.md
           - file: module5/sast_boundaries.md
+          - file: module5/egressdetection.md
           - file: module5/paretoprinciple.md
     - file: module6/module6_overview.md     #dropdown menu    
       children: