Skip to content

🛡️ Sentinel: [CRITICAL] Fix sensitive data exposure in logs#172

Closed
abhimehro wants to merge 1 commit intomainfrom
sentinel-log-redaction-14313453863725986533
Closed

🛡️ Sentinel: [CRITICAL] Fix sensitive data exposure in logs#172
abhimehro wants to merge 1 commit intomainfrom
sentinel-log-redaction-14313453863725986533

Conversation

@abhimehro
Copy link
Owner

🚨 Severity: CRITICAL
💡 Vulnerability: Credentials in URLs (Basic Auth and sensitive query parameters) were logged to stdout/stderr in plain text when using sanitize_for_log.
🎯 Impact: Attackers with access to logs could steal credentials or API tokens.
🔧 Fix: Updated sanitize_for_log in main.py to redact Basic Auth credentials and sensitive query parameters (token, key, secret, password, auth, access_token, api_key).
✅ Verification: Added new test case in tests/test_log_sanitization.py. Verified with reproduction script repro_log_leak.py.


PR created automatically by Jules for task 14313453863725986533 started by @abhimehro

🚨 Severity: CRITICAL
💡 Vulnerability: Credentials in URLs (Basic Auth and sensitive query parameters) were logged to stdout/stderr in plain text when using `sanitize_for_log`.
🎯 Impact: Attackers with access to logs could steal credentials or API tokens.
🔧 Fix: Updated `sanitize_for_log` in `main.py` to redact Basic Auth credentials and sensitive query parameters (token, key, secret, password, auth, access_token, api_key).
✅ Verification: Added new test case in `tests/test_log_sanitization.py`. Verified with reproduction script `repro_log_leak.py`.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
@google-labs-jules
Copy link

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings February 8, 2026 10:48
@trunk-io
Copy link

trunk-io bot commented Feb 8, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

@github-actions github-actions bot added the python label Feb 8, 2026
@gemini-code-assist
Copy link

Summary of Changes

Hello @abhimehro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a crucial security enhancement by preventing the exposure of sensitive credentials and API tokens in application logs. It achieves this by significantly improving the log sanitization mechanism, ensuring that confidential information within URLs is automatically redacted, thereby mitigating a critical data leakage risk.

Highlights

  • Critical Security Fix: This pull request addresses a critical security vulnerability where sensitive data, specifically Basic Auth credentials and various API tokens/secrets in URLs, were being logged in plain text.
  • Enhanced Log Sanitization: The sanitize_for_log function in main.py has been updated to automatically redact Basic Auth credentials from URLs and sensitive query parameters (e.g., token, key, secret, password) before logging.
  • New Test Cases: New test cases have been added in tests/test_log_sanitization.py to thoroughly verify that the updated sanitize_for_log function correctly redacts credentials and sensitive query parameters, including case-insensitive matching.
  • Security Documentation: A new markdown file, .jules/sentinel.md, has been added to serve as a journal for documenting critical security learnings and patterns within the codebase.
Changelog
  • .jules/sentinel.md
    • Added a new file to document critical security learnings and patterns.
  • main.py
    • Modified sanitize_for_log to redact Basic Auth credentials from URLs using a regular expression.
    • Implemented redaction for sensitive query parameters (token, key, secret, password, auth, access_token, api_key) in URLs, with case-insensitive matching.
  • tests/test_log_sanitization.py
    • Added test_sanitize_for_log_redacts_credentials to verify the redaction of Basic Auth credentials.
    • Added tests to confirm redaction of sensitive query parameters.
    • Included a test case to ensure case-insensitive redaction of query parameters.
Activity
  • The pull request was automatically created by Jules.
  • The task associated with this PR was initiated by @abhimehro.
  • The PR description clearly outlines the critical severity, vulnerability, impact, and the specific fix implemented, along with verification steps.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

self.assertIn("[REDACTED]", sanitized_param)

# Test Case Insensitivity
url_with_token = "https://example.com/folder.json?TOKEN=mytoken"

Check notice

Code scanning / Bandit

Possible hardcoded password: 'https://example.com/folder.json?TOKEN=mytoken' Note test

Possible hardcoded password: 'https://example.com/folder.json?TOKEN=mytoken'
s = s.replace(TOKEN, "[REDACTED]")

# Redact Basic Auth in URLs (e.g. https://user:pass@host)
s = re.sub(r"://[^/@]+@", "://[REDACTED]@", s)

Check warning

Code scanning / Pylint (reported by Codacy)

Variable name "s" doesn't conform to snake_case naming style Warning

Variable name "s" doesn't conform to snake_case naming style

# Redact sensitive query parameters
sensitive_keys = r"token|key|secret|password|auth|access_token|api_key"
s = re.sub(

Check warning

Code scanning / Pylint (reported by Codacy)

Variable name "s" doesn't conform to snake_case naming style Warning

Variable name "s" doesn't conform to snake_case naming style
self.assertIn("[REDACTED]", sanitized_param)

# Test Case Insensitivity
url_with_token = "https://example.com/folder.json?TOKEN=mytoken"

Check notice

Code scanning / Bandit (reported by Codacy)

Possible hardcoded password: 'https://example.com/folder.json?TOKEN=mytoken' Note test

Possible hardcoded password: 'https://example.com/folder.json?TOKEN=mytoken'
s = s.replace(TOKEN, "[REDACTED]")

# Redact Basic Auth in URLs (e.g. https://user:pass@host)
s = re.sub(r"://[^/@]+@", "://[REDACTED]@", s)

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Variable name "s" doesn't conform to snake_case naming style Warning

Variable name "s" doesn't conform to snake_case naming style

# Redact sensitive query parameters
sensitive_keys = r"token|key|secret|password|auth|access_token|api_key"
s = re.sub(

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Variable name "s" doesn't conform to snake_case naming style Warning

Variable name "s" doesn't conform to snake_case naming style
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a critical security vulnerability by redacting sensitive credentials from logs. However, the implementation is incomplete as it misses sensitive data in URL fragments (e.g., #access_token=...) and fails to apply sanitize_for_log to exception objects (e) in several logging statements throughout main.py (e.g., lines 537, 632, 670). This means credentials could still leak in error logs, which the PR intended to prevent. Additionally, consider improving query parameter redaction to handle empty values and expanding the test suite to cover more edge cases.

Comment on lines +160 to +165
s = re.sub(
r"([?&])(" + sensitive_keys + r")=[^&\s]+",
r"\1\2=[REDACTED]",
s,
flags=re.IGNORECASE,
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The regex for redacting sensitive query parameters should also include the fragment separator #. Sensitive data like access_token is frequently passed in the URL fragment (after the # symbol) and will not be redacted by the current implementation. Additionally, the current regex [^&\s]+ doesn't handle cases where the value is empty (e.g., ?token=&...), which could lead to incomplete redaction. Consider changing + to * to correctly redact such cases.

Suggested change
s = re.sub(
r"([?&])(" + sensitive_keys + r")=[^&\s]+",
r"\1\2=[REDACTED]",
s,
flags=re.IGNORECASE,
)
s = re.sub(
r"([?&#])(" + sensitive_keys + r")=[^&\s]+",
r"\1\2=[REDACTED]",
s,
flags=re.IGNORECASE,
)

url_with_token = "https://example.com/folder.json?TOKEN=mytoken"
sanitized_token = main.sanitize_for_log(url_with_token)
self.assertNotIn("mytoken", sanitized_token)
self.assertIn("[REDACTED]", sanitized_token)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current tests are a great start! To make them more robust, I'd suggest a few things:

  1. Make existing assertions more specific. For example, instead of self.assertIn("[REDACTED]"), you could check for the full redacted key-value pair like self.assertIn("secret=[REDACTED]").
  2. Add test cases for more scenarios, such as empty sensitive values and multiple sensitive parameters in one URL.

Here's a suggestion to add a couple more test cases:

Suggested change
self.assertIn("[REDACTED]", sanitized_token)
self.assertIn("[REDACTED]", sanitized_token)
# Test multiple sensitive params and mixed params
url_multi = "https://example.com/api?id=123&token=abc&name=user&api_key=def"
sanitized_multi = main.sanitize_for_log(url_multi)
self.assertIn("id=123", sanitized_multi)
self.assertIn("name=user", sanitized_multi)
self.assertNotIn("token=abc", sanitized_multi)
self.assertNotIn("api_key=def", sanitized_multi)
self.assertIn("token=[REDACTED]", sanitized_multi)
self.assertIn("api_key=[REDACTED]", sanitized_multi)
# Test empty sensitive param
url_empty = "https://example.com/api?token=&id=123"
sanitized_empty = main.sanitize_for_log(url_empty)
self.assertIn("token=[REDACTED]", sanitized_empty)
self.assertIn("id=123", sanitized_empty)

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR strengthens log sanitization to prevent leaking credentials embedded in URLs, aligning with the project’s existing focus on safe terminal/log output.

Changes:

  • Extend sanitize_for_log to redact Basic Auth credentials in URLs and redact common sensitive query parameters.
  • Add a unit test covering Basic Auth and sensitive query parameter redaction.
  • Add a .jules/sentinel.md journal stub for documenting critical security learnings.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
main.py Expands sanitize_for_log to redact URL credentials and sensitive query params.
tests/test_log_sanitization.py Adds coverage for URL credential/query-param redaction behavior.
.jules/sentinel.md Introduces a security journal template for future audit notes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +159 to +165
sensitive_keys = r"token|key|secret|password|auth|access_token|api_key"
s = re.sub(
r"([?&])(" + sensitive_keys + r")=[^&\s]+",
r"\1\2=[REDACTED]",
s,
flags=re.IGNORECASE,
)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query-parameter redaction pattern ...=[^&\s]+ will also consume trailing punctuation when a URL is followed by characters like ), ., ,, etc. in the surrounding log message (because those chars are not & or whitespace). That can unintentionally remove non-secret context from logs. Consider parsing URLs with urllib.parse and rewriting query values, or tightening the regex so it stops at common URL terminators (e.g., # and trailing punctuation) while preserving surrounding text.

Copilot uses AI. Check for mistakes.
Comment on lines +155 to +165
# Redact Basic Auth in URLs (e.g. https://user:pass@host)
s = re.sub(r"://[^/@]+@", "://[REDACTED]@", s)

# Redact sensitive query parameters
sensitive_keys = r"token|key|secret|password|auth|access_token|api_key"
s = re.sub(
r"([?&])(" + sensitive_keys + r")=[^&\s]+",
r"\1\2=[REDACTED]",
s,
flags=re.IGNORECASE,
)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description mentions verification with a repro_log_leak.py reproduction script, but that file isn’t included in this PR (and doesn’t appear to exist in the repository root). Either add the script to the PR (if it’s intended to be versioned) or update the PR description to avoid referring to a non-existent artifact.

Copilot uses AI. Check for mistakes.
Comment on lines 149 to +156
def sanitize_for_log(text: Any) -> str:
"""Sanitize text for logging, ensuring TOKEN is redacted and control chars are escaped."""
s = str(text)
if TOKEN and TOKEN in s:
s = s.replace(TOKEN, "[REDACTED]")

# Redact Basic Auth in URLs (e.g. https://user:pass@host)
s = re.sub(r"://[^/@]+@", "://[REDACTED]@", s)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanitize_for_log’s docstring no longer reflects its behavior: it says it only redacts TOKEN/control chars, but the function now also redacts Basic Auth credentials and sensitive query parameters. Please update the docstring to describe the full set of redactions so callers know what guarantees it provides.

Copilot uses AI. Check for mistakes.
Copilot AI added a commit that referenced this pull request Feb 8, 2026
…itization, add dry-run plan details

Incorporates the best changes from 36 Jules PRs, addressing review feedback:

Bolt (Performance) - from PR #173:
- Pre-compile PROFILE_ID_PATTERN and RULE_PATTERN at module level
- Use compiled patterns in is_valid_profile_id_format, validate_profile_id, and is_valid_rule
- Supersedes PRs: #140, #143, #152, #155, #158, #161, #167, #170, #173

Sentinel (Security) - from PR #172 with review feedback:
- Enhance sanitize_for_log to redact Basic Auth credentials in URLs
- Redact sensitive query parameters (token, key, secret, password, etc.)
- Handle fragment separators (#) per Gemini Code Assist review
- Use [^&#\s]* pattern per Copilot reviewer suggestion
- Update docstring per reviewer suggestion
- Supersedes PRs: #142, #145, #148, #151, #154, #157, #160, #169, #172

Palette (UX) - from PR #174 with lint fixes:
- Add print_plan_details function for dry-run visibility
- Fix duplicate render_progress_bar definition bug
- Supersedes PRs: #139, #141, #144, #147, #150, #153, #156, #159, #162, #165, #168, #171, #174

Also: #146, #149, #164 (parallel folder deletion) and #166 (auto-fix .env perms) are independent features not consolidated here.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
@abhimehro abhimehro closed this Feb 9, 2026
@abhimehro abhimehro deleted the sentinel-log-redaction-14313453863725986533 branch February 9, 2026 00:20
abhimehro added a commit that referenced this pull request Feb 9, 2026
…itization, add dry-run plan details

Incorporates the best changes from 36 Jules PRs, addressing review feedback:

Bolt (Performance) - from PR #173:
- Pre-compile PROFILE_ID_PATTERN and RULE_PATTERN at module level
- Use compiled patterns in is_valid_profile_id_format, validate_profile_id, and is_valid_rule
- Supersedes PRs: #140, #143, #152, #155, #158, #161, #167, #170, #173

Sentinel (Security) - from PR #172 with review feedback:
- Enhance sanitize_for_log to redact Basic Auth credentials in URLs
- Redact sensitive query parameters (token, key, secret, password, etc.)
- Handle fragment separators (#) per Gemini Code Assist review
- Use [^&#\s]* pattern per Copilot reviewer suggestion
- Update docstring per reviewer suggestion
- Supersedes PRs: #142, #145, #148, #151, #154, #157, #160, #169, #172

Palette (UX) - from PR #174 with lint fixes:
- Add print_plan_details function for dry-run visibility
- Fix duplicate render_progress_bar definition bug
- Supersedes PRs: #139, #141, #144, #147, #150, #153, #156, #159, #162, #165, #168, #171, #174

Also: #146, #149, #164 (parallel folder deletion) and #166 (auto-fix .env perms) are independent features not consolidated here.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants