🛡️ Sentinel: Enforce strict whitelist for rule validation#136
Conversation
Replaced fragile blacklist with strict regex whitelist to prevent injection. Added comprehensive security tests. Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Merging to
|
Summary of ChangesHello @abhimehro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly bolsters the security of rule validation by transitioning from a fragile blacklist to a robust whitelist mechanism. The Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| """ | ||
| Tests the is_valid_rule function against a strict whitelist of inputs. | ||
| """ | ||
| assert is_valid_rule(rule) == expected_validity, f"Failed for rule: {rule}" |
Check notice
Code scanning / Bandit (reported by Codacy)
Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
| 1. Implement strict validation on all data items from external lists (`is_valid_rule`). | ||
| 2. Filter out items containing dangerous characters (`<`, `>`, `"` etc.) or control codes. | ||
|
|
||
| ## 2026-05-24 - [Weak Blacklist Validation for Rules] |
Check notice
Code scanning / Remark-lint (reported by Codacy)
Warn when references to undefined definitions are found. Note
| 1. Implement strict validation on all data items from external lists (`is_valid_rule`). | ||
| 2. Filter out items containing dangerous characters (`<`, `>`, `"` etc.) or control codes. | ||
|
|
||
| ## 2026-05-24 - [Weak Blacklist Validation for Rules] |
Check notice
Code scanning / Remark-lint (reported by Codacy)
Warn when shortcut reference links are used. Note
There was a problem hiding this comment.
Code Review
This pull request makes a significant security improvement by replacing a fragile blacklist with a strict whitelist for rule validation. The change in is_valid_rule effectively mitigates a class of injection vulnerabilities. The accompanying tests are thorough, though I've pointed out a couple of cases that seem to validate incorrect rule formats. I've also suggested a minor performance optimization by pre-compiling the new regular expression. Overall, this is a solid and important change.
| ("http://example.com", True), # Safe from injection (contains only allowed chars) | ||
| ("example.com/path", True), # Allowed (CIDR uses /) |
There was a problem hiding this comment.
These two test cases seem to assert that invalid rule formats are valid, which could lead to incorrect behavior or mask future bugs.
http://example.comis a URL, not a valid domain/IP/CIDR rule for DNS filtering.example.com/pathis also not a valid rule format. The/is intended for CIDR notation with IPs, not for paths with domains.
These should likely be expected to fail validation. This would also mean tightening the regex in is_valid_rule to be more semantically strict, which would make the function more robust.
| ("http://example.com", True), # Safe from injection (contains only allowed chars) | |
| ("example.com/path", True), # Allowed (CIDR uses /) | |
| ("http://example.com", False), # Not a valid rule format (URL, not domain/IP) | |
| ("example.com/path", False), # Not a valid rule format (domain with path) |
| if any(c in dangerous_chars for c in rule): | ||
| # Strict whitelist to prevent injection | ||
| # ^[a-zA-Z0-9.\-_:*\/]+$ | ||
| if not re.match(r"^[a-zA-Z0-9.\-_:*\/]+$", rule): |
There was a problem hiding this comment.
This is a great security improvement! For a performance optimization, consider pre-compiling this regular expression at the module level. Since is_valid_rule is called inside a loop for every rule, pre-compiling avoids the overhead of parsing the regex pattern on each call.
You could do something like this:
# At the top of main.py
_VALID_RULE_RE = re.compile(r"^[a-zA-Z0-9.\-_:*\/]+$")
# Then in is_valid_rule()
if not _VALID_RULE_RE.match(rule):
return FalseThere was a problem hiding this comment.
Pull request overview
This PR strengthens input validation for rule processing by replacing a fragile blacklist approach with a strict whitelist regex pattern, significantly improving security against injection attacks.
Changes:
- Replaced blacklist validation in
is_valid_rule()with a whitelist regex pattern (^[a-zA-Z0-9.\-_:*\/]+$) - Added comprehensive test coverage for both valid inputs (domains, IPs, IPv6, CIDR, wildcards) and malicious inputs (injection payloads)
- Documented the vulnerability and fix in the sentinel log
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| main.py | Replaced blacklist with strict whitelist regex in is_valid_rule() function |
| tests/test_security.py | Added parametrized test cases covering valid and malicious input patterns |
| .jules/sentinel.md | Documented the vulnerability, learning, and prevention measures |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
✅ Merged via local merge and push to
main.Changes integrated:
^[a-zA-Z0-9.\-_:*\/]+$).jules/sentinel.mdwith the security learningTest Results: 56/56 tests passing.