fix: NotebookEdit hook bypass, gitleaks tests, scan_text early-exit#5
Conversation
…early-exit - SEC-001: add NotebookEdit to HOOK_MATCHER so secrets in Jupyter notebook cells are scanned (hook.py already handled the tool, but claude.py never registered the matcher) - ARCH-001: add max_findings parameter to scan_text/scan_file for early-exit on obviously-compromised files; default None preserves backward compat - TEST-001: add TestGitleaksIntegration covering _scan_with_gitleaks field mapping, timeout handling, and scan_path deduplication by fingerprint Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0cdfa5244e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| SETTINGS_PATH = Path.home() / ".claude" / "settings.json" | ||
| HOOK_COMMAND = "gitshield-claude-hook" | ||
| HOOK_MATCHER = "Write|Edit|Bash" | ||
| HOOK_MATCHER = "Write|Edit|Bash|NotebookEdit" |
There was a problem hiding this comment.
Migrate existing Claude hook matcher to include NotebookEdit
Updating HOOK_MATCHER only affects fresh installs; existing users who already have a gitshield-claude-hook entry with the old matcher (Write|Edit|Bash) are still bypassable for NotebookEdit. install_hook() exits early when _is_installed() finds any gitshield command, so it never rewrites the matcher for already-installed hooks, leaving the security fix unapplied for upgraded installations.
Useful? React with 👍 / 👎.
| if max_findings is not None and len(findings) >= max_findings: | ||
| break |
There was a problem hiding this comment.
Enforce max_findings limit during pattern matching
The new max_findings guard is checked only once per input line, so a single line that matches multiple patterns can still append several findings after the limit is reached (e.g., max_findings=1 can return more than one finding). That violates the documented “stop scanning once this many findings are accumulated” behavior and weakens the intended short-circuit for heavily compromised content.
Useful? React with 👍 / 👎.
Remove find_git_root, Optional, and tempfile unused imports flagged by ruff. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add migration path for NotebookEdit matcher in install_hook() so existing users get the updated matcher without reinstalling - Remove unguarded regex search in _gitignore_regex_is_safe to prevent ReDoS on main thread; always use thread-based timeout - Harden batch parser in _scan_staged: bounds check before header parse, skip non-blob entries while advancing pos correctly - Fix max_findings overshoot in scan_text by trimming results post-loop - Log warning on OSError when reading .gitshieldignore so users know their suppressions are not being applied Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
NotebookEdittoHOOK_MATCHERinclaude.py— the hook handler already processedNotebookEditevents, but the matcher string never registered for them, so Claude Code silently skipped scanning secrets written to Jupyter notebook cellsmax_findings: Optional[int] = Nonetoscan_textandscan_file— provides an early-exit for obviously-compromised files; defaultNoneis fully backward compatibleTestGitleaksIntegrationtotests/test_scanner_unit.pycovering_scan_with_gitleaksfield mapping, timeout handling (returns[]), empty-report handling, andscan_pathdeduplication-by-fingerprint merge logicTest plan
pytest tests/test_scanner_unit.py— 17 tests including 4 new gitleaks testspytest tests/test_claude_install.py— hook matcher constant verifiedpytest tests/— all 171 tests pass🤖 Generated with Claude Code