Skip to content

🛡️ Sentinel: [MEDIUM] Fix missing input validation on external JSON data#409

Merged
abhimehro merged 1 commit intomainfrom
sentinel-validation-fix-13083465383994297818
Feb 22, 2026
Merged

🛡️ Sentinel: [MEDIUM] Fix missing input validation on external JSON data#409
abhimehro merged 1 commit intomainfrom
sentinel-validation-fix-13083465383994297818

Conversation

@abhimehro
Copy link
Owner

🚨 Severity: MEDIUM
💡 Vulnerability: The application processes external JSON data (from URLs) without validating that rules is a list and rule_groups is a list of dictionaries. This can lead to crashes (DoS) if malformed data is encountered.
🎯 Impact: Application crash (AttributeError/TypeError) when processing malicious or malformed blocklists.
🔧 Fix: Added strict type checking in validate_folder_data for rules and rule_groups.
✅ Verification: Added test_validate_folder_data_structure in test_main.py covering invalid inputs. Run python3 -m pytest test_main.py to verify.


PR created automatically by Jules for task 13083465383994297818 started by @abhimehro

- Added validation for `rules` and `rule_groups` types in `validate_folder_data` to prevent crashes.
- Added comprehensive unit tests in `test_main.py`.
- Fixed existing test mocks to handle `clear_cache` argument.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
@google-labs-jules
Copy link

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings February 20, 2026 11:01
@trunk-io
Copy link

trunk-io bot commented Feb 20, 2026

😎 Merged manually by @abhimehro - details.

@github-actions
Copy link

👋 Development Partner is reviewing this PR. Will provide feedback shortly.

# 1. Invalid 'rules' type (string instead of list)
invalid_rules = valid_base.copy()
invalid_rules["rules"] = "not_a_list"
assert m.validate_folder_data(invalid_rules, "url") is False

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
invalid_rules = valid_base.copy()
invalid_rules["rules"] = "not_a_list"
assert m.validate_folder_data(invalid_rules, "url") is False
assert "rules" in str(mock_log.error.call_args)

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
# 2. Invalid 'rule_groups' type (string instead of list)
invalid_rg_type = valid_base.copy()
invalid_rg_type["rule_groups"] = "not_a_list"
assert m.validate_folder_data(invalid_rg_type, "url") is False

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
invalid_rg_type = valid_base.copy()
invalid_rg_type["rule_groups"] = "not_a_list"
assert m.validate_folder_data(invalid_rg_type, "url") is False
assert "rule_groups" in str(mock_log.error.call_args)

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
# 3. Invalid 'rule_groups' content (list of strings instead of dicts)
invalid_rg_content = valid_base.copy()
invalid_rg_content["rule_groups"] = ["not_a_dict"]
assert m.validate_folder_data(invalid_rg_content, "url") is False

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
invalid_rg_content = valid_base.copy()
invalid_rg_content["rule_groups"] = ["not_a_dict"]
assert m.validate_folder_data(invalid_rg_content, "url") is False
assert "must be an object" in str(mock_log.error.call_args)

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
# 4. Invalid 'rules' inside 'rule_groups'
invalid_rg_rules = valid_base.copy()
invalid_rg_rules["rule_groups"] = [{"rules": "not_a_list"}]
assert m.validate_folder_data(invalid_rg_rules, "url") is False

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
invalid_rg_rules = valid_base.copy()
invalid_rg_rules["rule_groups"] = [{"rules": "not_a_list"}]
assert m.validate_folder_data(invalid_rg_rules, "url") is False
assert "must be a list" in str(mock_log.error.call_args)

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
# 5. Valid cases
valid_rules = valid_base.copy()
valid_rules["rules"] = [{"PK": "rule1"}]
assert m.validate_folder_data(valid_rules, "url") is True

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.

valid_rg = valid_base.copy()
valid_rg["rule_groups"] = [{"rules": [{"PK": "rule1"}]}]
assert m.validate_folder_data(valid_rg, "url") is True

Check notice

Code scanning / Bandit

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
@gemini-code-assist
Copy link

Summary of Changes

Hello @abhimehro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a medium-severity vulnerability by enhancing input validation for external JSON data. Previously, malformed 'rules' or 'rule_groups' could lead to application crashes. The changes introduce explicit type checks to ensure data integrity, thereby improving the application's resilience and preventing potential denial-of-service scenarios.

Highlights

  • Security Fix: Implemented strict type checking for 'rules' and 'rule_groups' fields within external JSON data processed by validate_folder_data to prevent application crashes from malformed input.
  • Test Coverage: Added a new comprehensive test case, test_validate_folder_data_structure, to verify the robustness of the input validation logic against various invalid data structures.
Changelog
  • main.py
    • Added validation to ensure 'rules' is a list if present in external JSON data.
    • Added validation to ensure 'rule_groups' is a list if present in external JSON data.
    • Implemented checks to confirm each item within 'rule_groups' is a dictionary.
    • Added validation for 'rules' within each 'rule_group' item to ensure it is a list.
  • test_main.py
    • Added mock_args.clear_cache = False to existing test setup functions.
    • Introduced test_validate_folder_data_structure to test various invalid and valid structures for 'rules' and 'rule_groups' in validate_folder_data.
Activity
  • PR created automatically by Jules for task 13083465383994297818, initiated by @abhimehro.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

if not isinstance(data["rule_groups"], list):
log.error(
f"Invalid data from {sanitize_for_log(url)}: 'rule_groups' must be a list."
)

Check warning

Code scanning / Prospector (reported by Codacy)

Use lazy % formatting in logging functions (logging-fstring-interpolation) Warning

Use lazy % formatting in logging functions (logging-fstring-interpolation)
main.py Outdated

Check warning

Code scanning / Pylint (reported by Codacy)

Variable name "rg" doesn't conform to snake_case naming style Warning

Variable name "rg" doesn't conform to snake_case naming style
main.py Outdated

Check warning

Code scanning / Pylint (reported by Codacy)

Line too long (101/100) Warning

Line too long (101/100)
main.py Outdated

Check warning

Code scanning / Pylint (reported by Codacy)

Line too long (108/100) Warning

Line too long (108/100)
main.py Outdated

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Variable name "rg" doesn't conform to snake_case naming style Warning

Variable name "rg" doesn't conform to snake_case naming style
main.py Outdated

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (101/100) Warning

Line too long (101/100)
main.py Outdated

Check warning

Code scanning / Pylintpython3 (reported by Codacy)

Line too long (108/100) Warning

Line too long (108/100)
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}] must be an object."
)
return False
if "rules" in rg and not isinstance(rg["rules"], list):

Check notice

Code scanning / Pylintpython3 (reported by Codacy)

Use lazy % formatting in logging functions Note

Use lazy % formatting in logging functions
invalid_rg_content["rule_groups"] = ["not_a_dict"]
assert m.validate_folder_data(invalid_rg_content, "url") is False
assert "must be an object" in str(mock_log.error.call_args)
mock_log.reset_mock()

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
assert m.validate_folder_data(invalid_rg_content, "url") is False
assert "must be an object" in str(mock_log.error.call_args)
mock_log.reset_mock()

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
invalid_rg_rules["rule_groups"] = [{"rules": "not_a_list"}]
assert m.validate_folder_data(invalid_rg_rules, "url") is False
assert "must be a list" in str(mock_log.error.call_args)
mock_log.reset_mock()

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
assert m.validate_folder_data(invalid_rg_rules, "url") is False
assert "must be a list" in str(mock_log.error.call_args)
mock_log.reset_mock()

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
valid_rules["rules"] = [{"PK": "rule1"}]
assert m.validate_folder_data(valid_rules, "url") is True

valid_rg = valid_base.copy()

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
assert m.validate_folder_data(valid_rules, "url") is True

valid_rg = valid_base.copy()
valid_rg["rule_groups"] = [{"rules": [{"PK": "rule1"}]}]

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
test_main.py Outdated

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
test_main.py Outdated

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
test_main.py Outdated

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
test_main.py Outdated

Check notice

Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR attempts to fix a medium-severity security vulnerability where the application processes external JSON data from blocklist URLs without validating the structure of rules and rule_groups fields. The fix adds type checking to ensure these fields are lists and that rule_groups items are dictionaries. However, the validation is incomplete and leaves critical type confusion vulnerabilities unaddressed.

Changes:

  • Added validation to check that rules is a list (if present)
  • Added validation to check that rule_groups is a list of dictionaries (if present)
  • Added test coverage for the new validation logic
  • Updated existing test mocks to include the clear_cache attribute

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
main.py Adds structural validation for rules and rule_groups fields in validate_folder_data function, but fails to validate individual rule items are dictionaries
test_main.py Adds test cases for new validation logic and updates existing test mocks with clear_cache attribute
Comments suppressed due to low confidence (2)

test_main.py:776

  • This test duplicates and extends functionality already tested in tests/test_folder_validation.py. Following the codebase convention, validation tests for validate_folder_data should be placed in tests/test_folder_validation.py rather than in the root test_main.py file.

Consider moving this test to tests/test_folder_validation.py to maintain consistency with the existing test organization, where security and validation tests are organized by function/feature in the tests/ directory.

def test_validate_folder_data_structure(monkeypatch):
    """Test validation of 'rules' and 'rule_groups' structures."""
    m = reload_main_with_env(monkeypatch)
    mock_log = MagicMock()
    monkeypatch.setattr(m, "log", mock_log)

    valid_base = {
        "group": {"group": "ValidFolder"}
    }

    # 1. Invalid 'rules' type (string instead of list)
    invalid_rules = valid_base.copy()
    invalid_rules["rules"] = "not_a_list"
    assert m.validate_folder_data(invalid_rules, "url") is False
    assert "rules" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 2. Invalid 'rule_groups' type (string instead of list)
    invalid_rg_type = valid_base.copy()
    invalid_rg_type["rule_groups"] = "not_a_list"
    assert m.validate_folder_data(invalid_rg_type, "url") is False
    assert "rule_groups" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 3. Invalid 'rule_groups' content (list of strings instead of dicts)
    invalid_rg_content = valid_base.copy()
    invalid_rg_content["rule_groups"] = ["not_a_dict"]
    assert m.validate_folder_data(invalid_rg_content, "url") is False
    assert "must be an object" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 4. Invalid 'rules' inside 'rule_groups'
    invalid_rg_rules = valid_base.copy()
    invalid_rg_rules["rule_groups"] = [{"rules": "not_a_list"}]
    assert m.validate_folder_data(invalid_rg_rules, "url") is False
    assert "must be a list" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 5. Valid cases
    valid_rules = valid_base.copy()
    valid_rules["rules"] = [{"PK": "rule1"}]
    assert m.validate_folder_data(valid_rules, "url") is True

    valid_rg = valid_base.copy()
    valid_rg["rule_groups"] = [{"rules": [{"PK": "rule1"}]}]
    assert m.validate_folder_data(valid_rg, "url") is True

test_main.py:776

  • The test suite is missing a critical test case: verifying that individual rule items must be dictionaries, not just that 'rules' is a list. The vulnerability can still be triggered with payloads like {"group": {"group": "X"}, "rules": ["string1", "string2"]} or {"group": {"group": "X"}, "rule_groups": [{"rules": [123, 456]}]}.

Add test cases for:

  1. rules containing non-dict items (e.g., strings, integers)
  2. rule_groups[i].rules containing non-dict items
def test_validate_folder_data_structure(monkeypatch):
    """Test validation of 'rules' and 'rule_groups' structures."""
    m = reload_main_with_env(monkeypatch)
    mock_log = MagicMock()
    monkeypatch.setattr(m, "log", mock_log)

    valid_base = {
        "group": {"group": "ValidFolder"}
    }

    # 1. Invalid 'rules' type (string instead of list)
    invalid_rules = valid_base.copy()
    invalid_rules["rules"] = "not_a_list"
    assert m.validate_folder_data(invalid_rules, "url") is False
    assert "rules" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 2. Invalid 'rule_groups' type (string instead of list)
    invalid_rg_type = valid_base.copy()
    invalid_rg_type["rule_groups"] = "not_a_list"
    assert m.validate_folder_data(invalid_rg_type, "url") is False
    assert "rule_groups" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 3. Invalid 'rule_groups' content (list of strings instead of dicts)
    invalid_rg_content = valid_base.copy()
    invalid_rg_content["rule_groups"] = ["not_a_dict"]
    assert m.validate_folder_data(invalid_rg_content, "url") is False
    assert "must be an object" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 4. Invalid 'rules' inside 'rule_groups'
    invalid_rg_rules = valid_base.copy()
    invalid_rg_rules["rule_groups"] = [{"rules": "not_a_list"}]
    assert m.validate_folder_data(invalid_rg_rules, "url") is False
    assert "must be a list" in str(mock_log.error.call_args)
    mock_log.reset_mock()

    # 5. Valid cases
    valid_rules = valid_base.copy()
    valid_rules["rules"] = [{"PK": "rule1"}]
    assert m.validate_folder_data(valid_rules, "url") is True

    valid_rg = valid_base.copy()
    valid_rg["rule_groups"] = [{"rules": [{"PK": "rule1"}]}]
    assert m.validate_folder_data(valid_rg, "url") is True

Comment on lines +1115 to +1118
# Validate 'rules' if present (must be a list)
if "rules" in data and not isinstance(data["rules"], list):
log.error(f"Invalid data from {sanitize_for_log(url)}: 'rules' must be a list.")
return False
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation checks if 'rules' is a list, but doesn't validate that each item in the list is a dictionary. The code at lines 2138, 2152, and 2258 uses r.get("PK") which will throw an AttributeError if r is not a dict (e.g., if it's a string, int, or None). This leaves the DoS vulnerability partially unfixed.

Add validation to ensure each rule item is a dict:

if "rules" in data:
    if not isinstance(data["rules"], list):
        log.error(f"Invalid data from {sanitize_for_log(url)}: 'rules' must be a list.")
        return False
    for i, rule in enumerate(data["rules"]):
        if not isinstance(rule, dict):
            log.error(
                f"Invalid data from {sanitize_for_log(url)}: rules[{i}] must be an object."
            )
            return False

Copilot uses AI. Check for mistakes.
Comment on lines +1133 to +1137
if "rules" in rg and not isinstance(rg["rules"], list):
log.error(
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules must be a list."
)
return False
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the 'rules' validation above, this code validates that rules within rule_groups is a list, but doesn't validate that each item in that list is a dictionary. The code at line 2138 uses r.get("PK") which will throw an AttributeError if any rule item is not a dict.

Add validation to ensure each rule item within rule_groups is a dict:

if "rules" in rg:
    if not isinstance(rg["rules"], list):
        log.error(
            f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules must be a list."
        )
        return False
    for j, rule in enumerate(rg["rules"]):
        if not isinstance(rule, dict):
            log.error(
                f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules[{j}] must be an object."
            )
            return False
Suggested change
if "rules" in rg and not isinstance(rg["rules"], list):
log.error(
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules must be a list."
)
return False
if "rules" in rg:
if not isinstance(rg["rules"], list):
log.error(
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules must be a list."
)
return False
# Ensure each rule within the group is an object (dict),
# because later code treats each rule as a mapping (e.g., rule.get(...)).
for j, rule in enumerate(rg["rules"]):
if not isinstance(rule, dict):
log.error(
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules[{j}] must be an object."
)
return False

Copilot uses AI. Check for mistakes.
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces crucial input validation for external JSON data, specifically ensuring that 'rules' and 'rule_groups' are of the expected list and dictionary types, respectively. This directly addresses a potential Denial of Service vulnerability by preventing application crashes due to malformed data. The addition of comprehensive unit tests for these validation checks further strengthens the code's robustness. The changes are well-implemented and directly tackle the identified security concern.

Comment on lines +1115 to +1118
# Validate 'rules' if present (must be a list)
if "rules" in data and not isinstance(data["rules"], list):
log.error(f"Invalid data from {sanitize_for_log(url)}: 'rules' must be a list.")
return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This validation correctly checks if 'rules' is present and, if so, ensures it's a list. This is a good step towards preventing TypeError if a non-list value is provided, as described in the PR description.

Comment on lines +1120 to +1137
# Validate 'rule_groups' if present (must be a list of dicts)
if "rule_groups" in data:
if not isinstance(data["rule_groups"], list):
log.error(
f"Invalid data from {sanitize_for_log(url)}: 'rule_groups' must be a list."
)
return False
for i, rg in enumerate(data["rule_groups"]):
if not isinstance(rg, dict):
log.error(
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}] must be an object."
)
return False
if "rules" in rg and not isinstance(rg["rules"], list):
log.error(
f"Invalid data from {sanitize_for_log(url)}: rule_groups[{i}].rules must be a list."
)
return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The validation for 'rule_groups' is thorough, checking both the top-level type (must be a list) and iterating through its elements to ensure each is a dictionary. Furthermore, it correctly validates the 'rules' key within each rule group. This nested validation is critical for preventing crashes when processing complex, malformed JSON structures.

mock_args.dry_run = False
mock_args.no_delete = False
mock_args.plan_json = None
mock_args.clear_cache = False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Adding mock_args.clear_cache = False ensures that the test environment for test_interactive_prompts_show_hints accurately reflects the default state of clear_cache when not explicitly set by the user. This prevents unintended side effects or test failures if the default behavior of clear_cache were to change or if the test were run in an environment where clear_cache might be implicitly true.

mock_args.dry_run = False
mock_args.no_delete = False
mock_args.plan_json = None
mock_args.clear_cache = False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the previous comment, explicitly setting mock_args.clear_cache = False in test_interactive_input_extracts_id ensures test isolation and predictability. This is good practice for robust unit testing.

Comment on lines +730 to +775
def test_validate_folder_data_structure(monkeypatch):
"""Test validation of 'rules' and 'rule_groups' structures."""
m = reload_main_with_env(monkeypatch)
mock_log = MagicMock()
monkeypatch.setattr(m, "log", mock_log)

valid_base = {
"group": {"group": "ValidFolder"}
}

# 1. Invalid 'rules' type (string instead of list)
invalid_rules = valid_base.copy()
invalid_rules["rules"] = "not_a_list"
assert m.validate_folder_data(invalid_rules, "url") is False
assert "rules" in str(mock_log.error.call_args)
mock_log.reset_mock()

# 2. Invalid 'rule_groups' type (string instead of list)
invalid_rg_type = valid_base.copy()
invalid_rg_type["rule_groups"] = "not_a_list"
assert m.validate_folder_data(invalid_rg_type, "url") is False
assert "rule_groups" in str(mock_log.error.call_args)
mock_log.reset_mock()

# 3. Invalid 'rule_groups' content (list of strings instead of dicts)
invalid_rg_content = valid_base.copy()
invalid_rg_content["rule_groups"] = ["not_a_dict"]
assert m.validate_folder_data(invalid_rg_content, "url") is False
assert "must be an object" in str(mock_log.error.call_args)
mock_log.reset_mock()

# 4. Invalid 'rules' inside 'rule_groups'
invalid_rg_rules = valid_base.copy()
invalid_rg_rules["rule_groups"] = [{"rules": "not_a_list"}]
assert m.validate_folder_data(invalid_rg_rules, "url") is False
assert "must be a list" in str(mock_log.error.call_args)
mock_log.reset_mock()

# 5. Valid cases
valid_rules = valid_base.copy()
valid_rules["rules"] = [{"PK": "rule1"}]
assert m.validate_folder_data(valid_rules, "url") is True

valid_rg = valid_base.copy()
valid_rg["rule_groups"] = [{"rules": [{"PK": "rule1"}]}]
assert m.validate_folder_data(valid_rg, "url") is True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This new test function test_validate_folder_data_structure is excellent. It covers various invalid scenarios for rules and rule_groups, including incorrect types and malformed nested structures. This directly verifies the effectiveness of the input validation added in main.py and ensures the application's resilience against malformed external JSON data. The use of mock_log to assert error messages is also a good approach.

@abhimehro abhimehro merged commit 0534ce5 into main Feb 22, 2026
79 checks passed
@abhimehro abhimehro deleted the sentinel-validation-fix-13083465383994297818 branch February 22, 2026 05:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants