Merged
4 changes: 4 additions & 0 deletions .jules/bolt.md
@@ -51,3 +51,7 @@
 ## 2024-05-24 - [Skip Validation for Known Data]
 **Learning:** Performing expensive validation (e.g. regex) on data that is already known to be valid (e.g. exists in trusted remote state) is redundant. Checking existence in a local set (O(1)) before validation avoids CPU overhead for duplicates.
 **Action:** In filtering loops, check "is already processed/known" before "is valid", especially if "valid" implies "safe to process" and "known" implies "already processed".
+
+## 2026-02-04 - [Optimize Buffer for Large Downloads]
+**Learning:** When downloading large files (e.g., blocklists), an HTTP library's default chunk size may be small, causing many loop iterations and list appends. Increasing the chunk size (e.g., to 16 KiB) reduces the per-chunk CPU overhead of an otherwise I/O-bound download.
+**Action:** When using `iter_bytes()` or similar streaming methods for large resources, explicitly set a larger `chunk_size` (e.g., 16384) to reduce CPU usage and improve throughput.
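
The effect of chunk size on loop overhead can be illustrated without any network I/O. The sketch below is only an illustration: `rechunk` and `count_iterations` are hypothetical helpers standing in for a streaming iterator, not part of any HTTP library, and the 1 KiB figure is an assumed small chunk size chosen for comparison.

```python
from typing import Iterator


def rechunk(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield successive slices of `data`, each at most `chunk_size` bytes.

    Hypothetical stand-in for a streaming response iterator.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def count_iterations(data: bytes, chunk_size: int) -> int:
    """Count the loop iterations needed to consume `data` chunk by chunk."""
    return sum(1 for _ in rechunk(data, chunk_size))


payload = bytes(1024 * 1024)  # 1 MiB of zero bytes as a dummy download

small = count_iterations(payload, 1024)       # assumed small chunk size
large = count_iterations(payload, 16 * 1024)  # chunk size used in this change
print(small, large)  # prints "1024 64"
```

For the same payload, the larger chunk size needs 16 times fewer iterations, and therefore 16 times fewer list appends and size checks in a loop like the one changed in `main.py`.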
3 changes: 2 additions & 1 deletion main.py
@@ -1000,7 +1000,8 @@ def _gh_get(url: str) -> Dict:
 # 2. Stream and check actual size
 chunks = []
 current_size = 0
-for chunk in r.iter_bytes():
+# Optimization: Use 16KB chunks to reduce loop overhead/appends for large files
+for chunk in r.iter_bytes(chunk_size=16 * 1024):
     current_size += len(chunk)
     if current_size > MAX_RESPONSE_SIZE:
         raise ValueError(
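
The hunk above is truncated, so here is a minimal, self-contained sketch of the same size-capped accumulation pattern. It makes several assumptions: `read_capped` is a hypothetical helper name, the 10 MiB value of `MAX_RESPONSE_SIZE` is a placeholder (the real constant is not shown in the diff), and the error message is invented for illustration.

```python
from typing import Iterable, List

# Placeholder cap; the actual value in main.py is not visible in this diff.
MAX_RESPONSE_SIZE = 10 * 1024 * 1024


def read_capped(chunks: Iterable[bytes], max_size: int = MAX_RESPONSE_SIZE) -> bytes:
    """Join streamed chunks, aborting as soon as the running total exceeds max_size."""
    collected: List[bytes] = []
    current_size = 0
    for chunk in chunks:
        current_size += len(chunk)
        if current_size > max_size:
            # Hypothetical message; the original is cut off in the diff.
            raise ValueError(f"Response exceeds {max_size} bytes")
        collected.append(chunk)
    return b"".join(collected)
```

Assuming `r` is a streaming response whose `iter_bytes` accepts `chunk_size`, as in the diff, the helper could be called as `read_capped(r.iter_bytes(chunk_size=16 * 1024))`. Because the check runs once per chunk, a larger chunk size means fewer checks, at the cost of reading up to one extra chunk before an oversized response is rejected.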