⚡ Bolt: Optimize sanitize_for_log performance #352

Merged
abhimehro merged 6 commits into main from bolt/sanitize-log-optimization-12108142997497083847
Feb 19, 2026

@abhimehro
Owner

This PR introduces a performance optimization to the sanitize_for_log function in main.py. By adding guard clauses that check for the presence of specific delimiters (:// for basic auth, ?/&/# for query params) before executing expensive regex substitutions, we significantly reduce the CPU overhead for the vast majority of log messages that do not require sanitization.
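The guard-clause idea described above can be sketched as follows. This is a minimal illustration, not the code from main.py: the PR names `_BASIC_AUTH_PATTERN` and `_SENSITIVE_PARAM_PATTERN` but does not show their definitions, so the regexes, the parameter names (`token`, `key`, `secret`, `password`) and the `[REDACTED]` placeholder below are all assumptions.

```python
import re

# Hypothetical patterns: the real definitions in main.py are not shown in the PR.
_BASIC_AUTH_PATTERN = re.compile(r"://[^/\s:@]+:[^/\s@]+@")
_SENSITIVE_PARAM_PATTERN = re.compile(
    r"([?&#](?:token|key|secret|password)=)[^&#\s]*", re.IGNORECASE
)


def sanitize_for_log(text) -> str:
    """Redact credentials, skipping regex work when the delimiters are absent."""
    s = str(text)
    # Basic auth can only match if "://" is present, so this check is a cheap filter.
    if "://" in s:
        s = _BASIC_AUTH_PATTERN.sub("://[REDACTED]@", s)
    # A sensitive parameter must be preceded by "?", "&" or "#".
    if "?" in s or "&" in s or "#" in s:
        s = _SENSITIVE_PARAM_PATTERN.sub(r"\1[REDACTED]", s)
    return s
```

The guards are only safe because each checked substring is a necessary part of any match of the corresponding pattern; if a pattern could match without the delimiter, the guard would silently skip a needed redaction.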

A benchmark test tests/test_sanitize_perf.py has been added to verify the performance gain and prevent regression.

Benchmark Results (50k iterations):

  • Simple text: 0.1290s -> 0.0555s (~57% less time, roughly 2.3x faster)
  • Complex text: ~0.22s (no significant change)
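A benchmark of this shape could be written with `timeit`. The sketch below is not the contents of tests/test_sanitize_perf.py, which the PR does not show; the pattern, the sample strings and the helper names `unguarded`, `guarded` and `bench` are hypothetical, and only the 50k iteration count comes from the description above.

```python
import re
import timeit

# Hypothetical stand-in for the basic-auth pattern in main.py.
_PATTERN = re.compile(r"://[^/\s:@]+:[^/\s@]+@")


def unguarded(s: str) -> str:
    """Always run the regex substitution."""
    return _PATTERN.sub("://[REDACTED]@", s)


def guarded(s: str) -> str:
    """Run the regex only when the required delimiter is present."""
    return _PATTERN.sub("://[REDACTED]@", s) if "://" in s else s


def bench(func, text: str, iterations: int = 50_000, repeat: int = 3) -> float:
    """Return the best wall-clock time in seconds for `iterations` calls."""
    return min(timeit.repeat(lambda: func(text), number=iterations, repeat=repeat))


if __name__ == "__main__":
    samples = {
        "simple": "Processed 128 rules in folder Default",
        "complex": "Fetching https://user:pass@example.com/list?page=2",
    }
    for label, text in samples.items():
        before = bench(unguarded, text)
        after = bench(guarded, text)
        print(f"{label}: {before:.4f}s -> {after:.4f}s")
```

Taking the minimum over several repeats reduces noise from other processes, but absolute timings will still vary by machine.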

PR created automatically by Jules for task 12108142997497083847 started by @abhimehro

💡 What: Added early string checks to sanitize_for_log before running expensive regex substitutions.
🎯 Why: sanitize_for_log is called for every log message. Most messages don't contain sensitive patterns. Regex execution is expensive.
📊 Impact: Reduces CPU time for simple log messages by ~57% (0.129s -> 0.055s for 50k calls).
🔬 Measurement: Verified with tests/test_sanitize_perf.py.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 18, 2026 14:45

@trunk-io

trunk-io bot commented Feb 18, 2026

😎 Merged manually by @abhimehro - details.

@github-actions

👋 Development Partner is reviewing this PR. Will provide feedback shortly.

@gemini-code-assist

Summary of Changes

Hello @abhimehro, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the performance of the sanitize_for_log utility by introducing early exit conditions. By intelligently bypassing computationally intensive regular expression operations for log entries that do not require sanitization, the change aims to reduce CPU overhead. A dedicated benchmark has been included to validate these improvements and ensure ongoing performance stability.

Highlights

  • Performance Optimization: Implemented guard clauses in the sanitize_for_log function to prevent expensive regex substitutions when log messages do not contain specific delimiters, significantly improving performance for common cases.
  • Performance Benchmarking: Added a new benchmark test (tests/test_sanitize_perf.py) to measure the performance gains of the sanitize_for_log function and to prevent future regressions.

Changelog

  • .jules/bolt.md
    • Added a new learning entry detailing the benefits of avoiding regex on simple strings for performance.
  • main.py
    • Modified sanitize_for_log to include conditional checks for basic auth URLs and sensitive query parameters before applying regex substitutions.
    • Added comments explaining the optimization strategy for regex application.
  • tests/test_sanitize_perf.py
    • Added a new Python script to benchmark the performance of the sanitize_for_log function under simple and complex text scenarios.

Activity

  • PR created automatically by Jules for task 12108142997497083847, started by @abhimehro.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a great optimization for sanitize_for_log. Using guard clauses to avoid expensive regex operations on simple strings is a solid performance improvement. The addition of a benchmark test is also a great initiative. I've left one suggestion on the benchmark test to make it more effective at preventing regressions automatically in a CI environment. Overall, this is a high-quality change.
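The suggestion itself is not reproduced in this thread. As a sketch of one common way to make a timing benchmark usable in CI (an assumption, not necessarily what the reviewer proposed), the test can compare the guarded path against an unguarded baseline on the same machine instead of asserting an absolute number of seconds. The pattern, helper names and the 1.5x tolerance are hypothetical.

```python
import re
import timeit

# Hypothetical stand-in for the basic-auth pattern in main.py.
_PATTERN = re.compile(r"://[^/\s:@]+:[^/\s@]+@")


def baseline(s: str) -> str:
    """Unguarded version: always runs the regex."""
    return _PATTERN.sub("://[REDACTED]@", s)


def guarded(s: str) -> str:
    """Guarded version: skips the regex when "://" is absent."""
    return _PATTERN.sub("://[REDACTED]@", s) if "://" in s else s


def best_time(func, text: str, number: int = 20_000, repeat: int = 5) -> float:
    """Best of `repeat` runs, each timing `number` calls."""
    return min(timeit.repeat(lambda: func(text), number=number, repeat=repeat))


def test_guard_is_not_slower_on_simple_text():
    text = "Processed 128 rules in folder Default"
    # Behaviour must be identical before timing is worth comparing.
    assert guarded(text) == baseline(text)
    # Relative check with a generous tolerance, so slow CI runners
    # do not fail the build on absolute wall-clock noise.
    assert best_time(guarded, text) <= 1.5 * best_time(baseline, text)
```

A relative threshold still depends on timing, so it can be noisy on heavily loaded runners; the equality assertion is the part that reliably guards correctness.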

Copilot AI left a comment

Pull request overview

Optimizes sanitize_for_log’s hot path in main.py by avoiding regex substitutions when the input can’t possibly contain Basic Auth or sensitive query params, aiming to reduce CPU overhead in common logging scenarios.

Changes:

  • Add delimiter guard clauses before running _BASIC_AUTH_PATTERN.sub and _SENSITIVE_PARAM_PATTERN.sub in sanitize_for_log.
  • Add a new tests/test_sanitize_perf.py intended to benchmark sanitization performance.
  • Document the optimization rationale in .jules/bolt.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| main.py | Adds cheap string checks to skip unnecessary regex work in sanitize_for_log. |
| tests/test_sanitize_perf.py | Introduces a sanitization “perf test” (currently time-based). |
| .jules/bolt.md | Records the “avoid regex on simple strings” optimization learning/action. |

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

💡 What:
1. Implemented strict Content-Type validation in `_gh_get` to fix CI failure in `test_reject_xml`.
2. Updated existing tests (`test_cache_optimization.py`, `test_disk_cache.py`) to include valid Content-Type headers in mocks.
3. Retained the log sanitization performance optimization.

🎯 Why:
- CI was failing because `test_reject_xml` expected a ValueError but none was raised (missing validation logic).
- `unittest.mock` patching in CI flagged `TypeError` due to missing patch context, but the root cause was the missing implementation which I've now added.
- Existing tests broke when validation was added because their mocks were incomplete.

📊 Impact:
- Fixes CI pipeline.
- Adds security validation against invalid content types (HTML/XML).
- Maintains performance improvements for logging.

🔬 Measurement:
- `python3 -m pytest tests/` passes all 100 tests.
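The Content-Type check described in this commit message might look roughly like the sketch below. The actual code inside `_gh_get` is not shown in the PR, so the helper name `validate_content_type`, the allow-list (only `application/json` here) and the error message are assumptions; the only behaviour taken from the source is that HTML/XML responses should raise `ValueError`.

```python
from typing import Mapping
from unittest.mock import MagicMock

# Assumed allow-list: the PR only states that HTML/XML must be rejected.
_ALLOWED_CONTENT_TYPES = ("application/json",)


def validate_content_type(headers: Mapping[str, str]) -> None:
    """Raise ValueError unless the Content-Type header is on the allow-list."""
    raw = ""
    # Header names are case-insensitive, so compare them lowercased.
    for name, value in headers.items():
        if name.lower() == "content-type":
            raw = value
            break
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = raw.split(";", 1)[0].strip().lower()
    if media_type not in _ALLOWED_CONTENT_TYPES:
        raise ValueError(f"Unexpected Content-Type: {raw!r}")


# A mocked response needs a valid header once validation is strict,
# which is why the existing test mocks had to be updated.
response = MagicMock()
response.headers = {"Content-Type": "application/json; charset=utf-8"}
validate_content_type(response.headers)  # passes without raising
```

A missing header is treated as invalid in this sketch, which matches the description of existing mocks breaking until they supplied a Content-Type.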

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
@github-actions bot added the documentation and configuration labels on Feb 19, 2026

@abhimehro abhimehro merged commit d19ec52 into main Feb 19, 2026
9 of 11 checks passed
@abhimehro abhimehro deleted the bolt/sanitize-log-optimization-12108142997497083847 branch February 19, 2026 02:19
Labels

configuration, documentation, python

2 participants