⚡ Bolt: Parallelize folder deletion and optimize regex#149
⚡ Bolt: Parallelize folder deletion and optimize regex#149
Conversation
- Replaced sequential folder deletion loop with concurrent execution using `ThreadPoolExecutor` - Pre-compiled regex pattern for `is_valid_rule` - Added tests for parallel deletion and regex optimization Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Merging to
|
Summary of ChangesHello @abhimehro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant performance enhancements to the synchronization process. By parallelizing folder deletions, it transforms a sequential bottleneck into a more efficient concurrent operation. Additionally, it optimizes regex validation by pre-compiling the pattern, which reduces overhead in frequently called functions. These changes collectively aim to deliver a faster and more responsive sync experience, particularly during the initial setup or 'clean slate' phase. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| # Case 14: is_valid_rule uses pre-compiled regex | ||
| def test_is_valid_rule_uses_regex(monkeypatch): | ||
| m = reload_main_with_env(monkeypatch) | ||
| assert m.is_valid_rule("valid.rule") is True |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| def test_is_valid_rule_uses_regex(monkeypatch): | ||
| m = reload_main_with_env(monkeypatch) | ||
| assert m.is_valid_rule("valid.rule") is True | ||
| assert m.is_valid_rule("invalid rule") is False |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| m = reload_main_with_env(monkeypatch) | ||
| assert m.is_valid_rule("valid.rule") is True | ||
| assert m.is_valid_rule("invalid rule") is False | ||
| assert m.is_valid_rule("invalid;rule") is False |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| assert m.is_valid_rule("invalid rule") is False | ||
| assert m.is_valid_rule("invalid;rule") is False | ||
| # Check that RULE_PATTERN is available | ||
| assert hasattr(m, "RULE_PATTERN") |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| assert m.is_valid_rule("invalid;rule") is False | ||
| # Check that RULE_PATTERN is available | ||
| assert hasattr(m, "RULE_PATTERN") | ||
| assert m.RULE_PATTERN.pattern == r"^[a-zA-Z0-9.\-_:*\/]+$" |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| m.sync_profile("pid", folder_urls, no_delete=False) | ||
|
|
||
| # Verification | ||
| assert mock_delete.call_count == 2 |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| has_deletion_pool = True | ||
| break | ||
|
|
||
| assert ( |
Check notice
Code scanning / Bandit
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| monkeypatch.setattr(m, "delete_folder", mock_delete) | ||
|
|
||
| # Mock other calls to prevent errors | ||
| monkeypatch.setattr(m, "get_all_existing_rules", MagicMock(return_value=set())) |
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
|
|
||
| # Mock other calls to prevent errors | ||
| monkeypatch.setattr(m, "get_all_existing_rules", MagicMock(return_value=set())) | ||
| monkeypatch.setattr(m, "countdown_timer", MagicMock()) |
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| # Mock other calls to prevent errors | ||
| monkeypatch.setattr(m, "get_all_existing_rules", MagicMock(return_value=set())) | ||
| monkeypatch.setattr(m, "countdown_timer", MagicMock()) | ||
| monkeypatch.setattr(m, "_process_single_folder", MagicMock(return_value=True)) |
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| monkeypatch.setattr(m, "countdown_timer", MagicMock()) | ||
| monkeypatch.setattr(m, "_process_single_folder", MagicMock(return_value=True)) | ||
|
|
||
| # Mock fetching |
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| monkeypatch.setattr(m, "_process_single_folder", MagicMock(return_value=True)) | ||
|
|
||
| # Mock fetching | ||
| mock_validate = MagicMock(return_value=True) |
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
test_main.py
Outdated
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| @@ -1,3 +1,4 @@ | |||
| import concurrent.futures | |||
Check warning
Code scanning / Pylint (reported by Codacy)
Missing module docstring Warning test
| # Mock delete_folder to track calls | ||
| mock_delete = MagicMock(return_value=True) | ||
| monkeypatch.setattr(m, "delete_folder", mock_delete) | ||
|
|
Check warning
Code scanning / Pylint (reported by Codacy)
Missing function docstring Warning test
| mock_validate.cache_clear = MagicMock() | ||
| monkeypatch.setattr(m, "validate_folder_url", mock_validate) | ||
|
|
||
| def fetch_side_effect(url): |
Check warning
Code scanning / Pylint (reported by Codacy)
Missing function docstring Warning test
| # 2. Deletion (max_workers=5) | ||
| # 3. Processing (max_workers=1) | ||
|
|
||
| has_deletion_pool = False |
Check warning
Code scanning / Pylint (reported by Codacy)
Missing function docstring Warning test
| @@ -1,3 +1,4 @@ | |||
| import concurrent.futures | |||
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Missing module docstring Warning test
| # Mock delete_folder to track calls | ||
| mock_delete = MagicMock(return_value=True) | ||
| monkeypatch.setattr(m, "delete_folder", mock_delete) | ||
|
|
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Missing function or method docstring Warning test
| mock_delete = MagicMock(return_value=True) | ||
| monkeypatch.setattr(m, "delete_folder", mock_delete) | ||
|
|
||
| # Mock other calls to prevent errors |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Variable name "m" doesn't conform to snake_case naming style Warning test
| mock_validate.cache_clear = MagicMock() | ||
| monkeypatch.setattr(m, "validate_folder_url", mock_validate) | ||
|
|
||
| def fetch_side_effect(url): |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Missing function or method docstring Warning test
There was a problem hiding this comment.
Code Review
This pull request introduces two solid performance optimizations: pre-compiling a regular expression and parallelizing folder deletion. The changes are well-implemented and include corresponding tests, which is great to see. The pre-compilation of the regex in is_valid_rule is a standard and effective optimization. Parallelizing the I/O-bound deletion tasks using a ThreadPoolExecutor should significantly improve the performance of the sync process, especially for profiles with many folders. I have a couple of minor suggestions to improve maintainability by replacing hardcoded values with constants.
|
|
||
| if folders_to_delete: | ||
| # Parallel delete to speed up the "clean slate" phase | ||
| with concurrent.futures.ThreadPoolExecutor(max_workers=5) as delete_executor: |
There was a problem hiding this comment.
To improve maintainability and avoid magic numbers, it's best to define 5 as a constant at the top of the file (e.g., DELETE_WORKERS = 5 in the constants section) and use the constant here. This makes it easier to find and change in the future.
# In constants section
DELETE_WORKERS = 5
# In sync_profile
with concurrent.futures.ThreadPoolExecutor(max_workers=DELETE_WORKERS) as delete_executor:|
|
||
| has_deletion_pool = False | ||
| for call_args in calls: | ||
| if call_args.kwargs.get("max_workers") == 5: |
There was a problem hiding this comment.
This assertion hardcodes the value 5, creating a tight coupling with the implementation. If the number of workers is extracted into a constant in main.py (as suggested in my other comment), this test should reference that constant to remain robust against future changes. For example:
if call_args.kwargs.get("max_workers") == m.DELETE_WORKERS:There was a problem hiding this comment.
Pull request overview
This PR implements two performance optimizations to speed up the sync process: pre-compiling a regex pattern used in rule validation and parallelizing folder deletion during the "clean slate" phase.
Changes:
- Pre-compiled the regex pattern used in
is_valid_ruleto avoid repeated compilation overhead in tight loops - Parallelized folder deletion in
sync_profileusing ThreadPoolExecutor with 5 workers to reduce latency during cleanup - Added tests to verify the regex pre-compilation and parallel deletion behavior
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| main.py | Added pre-compiled RULE_PATTERN constant and updated is_valid_rule to use it; refactored folder deletion to use parallel execution with ThreadPoolExecutor |
| test_main.py | Added test_is_valid_rule_uses_regex to verify regex pre-compilation; added test_sync_profile_parallel_deletion to verify parallel folder deletion behavior |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| if folders_to_delete: | ||
| # Parallel delete to speed up the "clean slate" phase | ||
| with concurrent.futures.ThreadPoolExecutor(max_workers=5) as delete_executor: |
There was a problem hiding this comment.
Consider using max_workers=3 for DELETE operations instead of 5, consistent with POST operations which also modify state. Line 945 mentions "aggressive rate limits" for POST operations, and DELETE operations may face similar rate limiting. While GETs use 5 workers (line 682), that's because they're "usually cheaper" on rate limits. This would maintain the parallelization benefit while being more conservative about rate limits.
| with concurrent.futures.ThreadPoolExecutor(max_workers=5) as delete_executor: | |
| with concurrent.futures.ThreadPoolExecutor(max_workers=3) as delete_executor: |
…itization, add dry-run plan details Incorporates the best changes from 36 Jules PRs, addressing review feedback: Bolt (Performance) - from PR #173: - Pre-compile PROFILE_ID_PATTERN and RULE_PATTERN at module level - Use compiled patterns in is_valid_profile_id_format, validate_profile_id, and is_valid_rule - Supersedes PRs: #140, #143, #152, #155, #158, #161, #167, #170, #173 Sentinel (Security) - from PR #172 with review feedback: - Enhance sanitize_for_log to redact Basic Auth credentials in URLs - Redact sensitive query parameters (token, key, secret, password, etc.) - Handle fragment separators (#) per Gemini Code Assist review - Use [^&#\s]* pattern per Copilot reviewer suggestion - Update docstring per reviewer suggestion - Supersedes PRs: #142, #145, #148, #151, #154, #157, #160, #169, #172 Palette (UX) - from PR #174 with lint fixes: - Add print_plan_details function for dry-run visibility - Fix duplicate render_progress_bar definition bug - Supersedes PRs: #139, #141, #144, #147, #150, #153, #156, #159, #162, #165, #168, #171, #174 Also: #146, #149, #164 (parallel folder deletion) and #166 (auto-fix .env perms) are independent features not consolidated here. Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
…itization, add dry-run plan details Incorporates the best changes from 36 Jules PRs, addressing review feedback: Bolt (Performance) - from PR #173: - Pre-compile PROFILE_ID_PATTERN and RULE_PATTERN at module level - Use compiled patterns in is_valid_profile_id_format, validate_profile_id, and is_valid_rule - Supersedes PRs: #140, #143, #152, #155, #158, #161, #167, #170, #173 Sentinel (Security) - from PR #172 with review feedback: - Enhance sanitize_for_log to redact Basic Auth credentials in URLs - Redact sensitive query parameters (token, key, secret, password, etc.) - Handle fragment separators (#) per Gemini Code Assist review - Use [^&#\s]* pattern per Copilot reviewer suggestion - Update docstring per reviewer suggestion - Supersedes PRs: #142, #145, #148, #151, #154, #157, #160, #169, #172 Palette (UX) - from PR #174 with lint fixes: - Add print_plan_details function for dry-run visibility - Fix duplicate render_progress_bar definition bug - Supersedes PRs: #139, #141, #144, #147, #150, #153, #156, #159, #162, #165, #168, #171, #174 Also: #146, #149, #164 (parallel folder deletion) and #166 (auto-fix .env perms) are independent features not consolidated here. Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
- Parallelize folder deletion using ThreadPoolExecutor with 3 workers - Use conservative worker count (3) instead of 5 to respect API rate limits - Add DELETE_WORKERS constant for easy configuration - Sanitize both folder names and exceptions in error logging to prevent log injection - Use lazy % formatting for logging (best practice) - Rename exception variable from 'e' to 'exc' for clarity - Add comprehensive test coverage for parallel deletion Addresses feedback from PRs #146, #149 Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
- Parallelize folder deletion using ThreadPoolExecutor with 3 workers - Use conservative worker count (3) instead of 5 to respect API rate limits - Add DELETE_WORKERS constant for easy configuration - Sanitize both folder names and exceptions in error logging to prevent log injection - Use lazy % formatting for logging (best practice) - Rename exception variable from 'e' to 'exc' for clarity - Add comprehensive test coverage for parallel deletion Addresses feedback from PRs #146, #149 Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
sync_profileusingThreadPoolExecutor(max 5 workers).is_valid_rule.test_sync_profile_parallel_deletiontest case.PR created automatically by Jules for task 4728167391881213641 started by @abhimehro