⚡ Bolt: Avoid thread pool overhead for small rule updates#186
⚡ Bolt: Avoid thread pool overhead for small rule updates#186
Conversation
This change checks if there is only one batch of rules to push (<= 500 rules) and processes it synchronously, bypassing the overhead of creating a ThreadPoolExecutor. This optimization benefits small folder updates, which are common. Also includes fixes for test suite stability (module reloading issues in tests). Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Merging to
|
|
👋 Development Partner is reviewing this PR. Will provide feedback shortly. |
Summary of ChangesHello @abhimehro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the efficiency of rule processing by introducing a conditional execution path that bypasses the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable optimization to avoid ThreadPoolExecutor overhead for single-batch rule updates, a common scenario. The change is well-implemented and accompanied by new performance tests that verify the intended behavior. The associated refactoring of test patching logic also improves the overall robustness of the test suite.
I have two suggestions for improvement. First, the optimization in main.py has led to some code duplication, which I've detailed in a comment. Second, a change to the .jules/bolt.md journal file appears to have accidentally removed existing content. Addressing these points would make this solid PR even better.
| ## 2025-02-23 - Python Module Patching Flakiness | ||
| **Learning:** Patching global variables (like `USE_COLORS`) in a module can fail if previous tests manipulated `sys.modules` (e.g., `del sys.modules['main']`) or used `sys.path.append` to re-import the module from a different path. This results in the test holding a reference to an old module object while `patch` modifies the new one. | ||
| **Action:** Avoid `sys.path.append` in tests if possible. Use `patch.object(module, 'var', val)` instead of `patch('module.var', val)` where `module` is imported inside the test function to ensure it targets the current `sys.modules` entry. |
There was a problem hiding this comment.
| if total_batches == 1: | ||
| # Avoid thread pool overhead for single batch (very common case) | ||
| result = process_batch(1, batches[0]) | ||
| if result: | ||
| successful_batches = 1 | ||
| existing_rules.update(result) | ||
|
|
||
| render_progress_bar( | ||
| successful_batches, | ||
| total_batches, | ||
| f"Folder {sanitize_for_log(folder_name)}", | ||
| ) | ||
| else: | ||
| with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor: | ||
| futures = { | ||
| executor.submit(process_batch, i, batch): i | ||
| for i, batch in enumerate(batches, 1) | ||
| } | ||
|
|
||
| for future in concurrent.futures.as_completed(futures): | ||
| result = future.result() | ||
| if result: | ||
| successful_batches += 1 | ||
| existing_rules.update(result) | ||
|
|
||
| render_progress_bar( | ||
| successful_batches, | ||
| total_batches, | ||
| f"Folder {sanitize_for_log(folder_name)}", | ||
| ) | ||
| for future in concurrent.futures.as_completed(futures): | ||
| result = future.result() | ||
| if result: | ||
| successful_batches += 1 | ||
| existing_rules.update(result) | ||
|
|
||
| render_progress_bar( | ||
| successful_batches, | ||
| total_batches, | ||
| f"Folder {sanitize_for_log(folder_name)}", | ||
| ) |
There was a problem hiding this comment.
While this optimization for a single batch is a good idea, it has introduced significant code duplication. The logic for processing a batch result (updating successful_batches and existing_rules, and rendering the progress bar) is now repeated in both the if total_batches == 1 block and the else block's loop.
This duplication makes the code harder to maintain, as any change to this logic will need to be applied in two places. Consider refactoring this by extracting the common result-handling logic into a separate (possibly nested) function.
| def test_print_plan_details_no_colors(capsys): | ||
| """Test print_plan_details output when colors are disabled.""" | ||
| with patch("main.USE_COLORS", False): | ||
| import main |
Check warning
Code scanning / Prospector (reported by Codacy)
Import outside toplevel (main) (import-outside-toplevel) Warning test
| def test_print_plan_details_empty_folders(capsys): | ||
| """Test print_plan_details with no folders.""" | ||
| with patch("main.USE_COLORS", False): | ||
| import main |
Check warning
Code scanning / Prospector (reported by Codacy)
Import outside toplevel (main) (import-outside-toplevel) Warning test
| def test_print_plan_details_with_colors(capsys): | ||
| """Test print_plan_details output when colors are enabled.""" | ||
| with patch("main.USE_COLORS", True): | ||
| import main |
Check warning
Code scanning / Prospector (reported by Codacy)
Import outside toplevel (main) (import-outside-toplevel) Warning test
| def test_print_plan_details_no_colors(capsys): | ||
| """Test print_plan_details output when colors are disabled.""" | ||
| with patch("main.USE_COLORS", False): | ||
| import main |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Import outside toplevel (main) Warning test
| def test_print_plan_details_empty_folders(capsys): | ||
| """Test print_plan_details with no folders.""" | ||
| with patch("main.USE_COLORS", False): | ||
| import main |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Import outside toplevel (main) Warning test
| def test_print_plan_details_with_colors(capsys): | ||
| """Test print_plan_details output when colors are enabled.""" | ||
| with patch("main.USE_COLORS", True): | ||
| import main |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Import outside toplevel (main) Warning test
| @@ -0,0 +1,77 @@ | |||
|
|
|||
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Missing module docstring Warning test
| # Assumes pytest is running from root, so 'main' is importable directly | ||
| import main | ||
|
|
||
| class TestPushRulesPerformance(unittest.TestCase): |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Missing class docstring Warning test
| self.profile_id = "test-profile" | ||
| self.folder_name = "test-folder" | ||
| self.folder_id = "test-folder-id" | ||
| self.do = 0 |
Check warning
Code scanning / Pylintpython3 (reported by Codacy)
Attribute name "do" doesn't conform to snake_case naming style Warning test
| @@ -0,0 +1,77 @@ | |||
|
|
|||
Check warning
Code scanning / Pylint (reported by Codacy)
Missing module docstring Warning test
| # Assumes pytest is running from root, so 'main' is importable directly | ||
| import main | ||
|
|
||
| class TestPushRulesPerformance(unittest.TestCase): |
Check warning
Code scanning / Pylint (reported by Codacy)
Missing class docstring Warning test
| self.profile_id = "test-profile" | ||
| self.folder_name = "test-folder" | ||
| self.folder_id = "test-folder-id" | ||
| self.do = 0 |
Check warning
Code scanning / Pylint (reported by Codacy)
Attribute name "do" doesn't conform to snake_case naming style Warning test
There was a problem hiding this comment.
Pull request overview
Optimizes rule pushing by avoiding ThreadPoolExecutor setup when only a single batch of rules is being sent (the common case), and adds/adjusts tests to validate the behavior and reduce patching/import flakiness.
Changes:
- Update
push_rules()to process a single batch inline and only create a thread pool for multiple batches. - Add a performance/regression test asserting
ThreadPoolExecutoris not used for a single batch. - Adjust tests to patch
USE_COLORSviapatch.objectand remove asys.pathmanipulation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
main.py |
Skips thread pool creation when total_batches == 1, retaining parallel execution for multi-batch updates. |
tests/test_push_rules_perf.py |
Adds tests asserting executor usage behavior for single vs multiple batches. |
tests/test_plan_details.py |
Imports main inside tests and patches USE_COLORS via patch.object for robustness. |
tests/test_cache_optimization.py |
Comments out repo-root sys.path modification (but leaves a now-misleading comment). |
.jules/bolt.md |
Replaces prior journal content with a new entry about module patching flakiness. |
| # Mock submit to return a Future | ||
| future = concurrent.futures.Future() | ||
| future.set_result(["processed"]) | ||
| mock_executor_instance.submit.return_value = future | ||
|
|
There was a problem hiding this comment.
This test configures submit() to always return the same Future. Because push_rules() builds a dict keyed by futures, duplicate futures collapse into one entry, so only one batch is observed as completed and push_rules() will log an error / return False even though the executor was used. Use distinct futures per batch (e.g., side_effect) and assert submit was called total_batches times (and ideally that push_rules() returns True) so the test validates the intended behavior without producing error logs.
| # Add root to path to import main | ||
| sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) | ||
| # sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) | ||
|
|
||
| import main |
There was a problem hiding this comment.
The comment says the test adds the repo root to sys.path, but the sys.path.append(...) line is now commented out. Please update/remove the comment (and consider removing now-unused imports like sys/os if they’re no longer needed) to avoid misleading future maintainers about how main is imported in this test module.
| ## 2025-02-23 - Python Module Patching Flakiness | ||
| **Learning:** Patching global variables (like `USE_COLORS`) in a module can fail if previous tests manipulated `sys.modules` (e.g., `del sys.modules['main']`) or used `sys.path.append` to re-import the module from a different path. This results in the test holding a reference to an old module object while `patch` modifies the new one. | ||
| **Action:** Avoid `sys.path.append` in tests if possible. Use `patch.object(module, 'var', val)` instead of `patch('module.var', val)` where `module` is imported inside the test function to ensure it targets the current `sys.modules` entry. |
There was a problem hiding this comment.
This change replaces the entire prior Bolt journal with a single entry, which loses historical context and past learnings. If the intent is to add a new lesson, consider appending this entry (or moving older entries to an archive section) rather than deleting the existing content.
| # Assumes pytest is running from root, so 'main' is importable directly | ||
| import main | ||
|
|
There was a problem hiding this comment.
main is imported at module import time, which can make tests flaky when other tests delete/reload sys.modules['main'] and can also trigger main.py import-time side effects during collection. Prefer importing main inside each test method (or in setUp) and then patching via patch.object(main, ...) to ensure the test always targets the currently-imported module object.
|
Closing: superseded by #192 (latest thread pool overhead optimization PR). |
💡 What: Skip ThreadPoolExecutor creation when pushing a single batch of rules.
🎯 Why: Thread creation and context management adds overhead for small tasks. Most rule updates are small (< 500 rules).
📊 Impact: Eliminates thread pool overhead for the majority of folder updates.
🔬 Measurement: Verified with
tests/test_push_rules_perf.pywhich assertsThreadPoolExecutoris not called for single batch.PR created automatically by Jules for task 3623770900998418111 started by @abhimehro