🛡️ Sentinel: Enforce strict Content-Type validation by abhimehro · Pull Request #344 · abhimehro/ctrld-sync

abhimehro · 2026-02-18T10:59:28Z

🛡️ Sentinel: Security Enhancement - Enforce Content-Type validation

Context:
The application fetches blocklists from remote URLs and parses each response as JSON, but it previously did not validate the Content-Type header. As a result, it could process unexpected content (e.g., a captive-portal or error page) whenever the body happened to be valid JSON, or fail with a confusing parse error when it was not.

Changes:

Modified _gh_get in main.py to enforce that the Content-Type header contains one of: application/json, text/json, or text/plain.
If the content type is invalid (e.g., text/html), the request is rejected with a clear ValueError, preventing further processing.
Updated existing tests in tests/test_cache_optimization.py and tests/test_disk_cache.py to include valid Content-Type headers in their mocks.
Added tests/test_content_type.py to verify the new validation logic.

Impact:

Prevents the application from processing non-JSON content types.
Improves error reporting for misconfigured URLs or network interception (captive portals).
Enhances defense-in-depth for input validation.

PR created automatically by Jules for task 2047347595139016233 started by @abhimehro

- Added Content-Type validation to `_gh_get` in `main.py`.
- Updated existing tests to comply with new validation.
- Added new test file `tests/test_content_type.py`.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>

google-labs-jules · 2026-02-18T10:59:30Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

trunk-io · 2026-02-18T10:59:31Z

😎 Merged manually by @abhimehro.

github-actions · 2026-02-18T10:59:40Z

👋 Development Partner is reviewing this PR. Will provide feedback shortly.

gemini-code-assist · 2026-02-18T10:59:42Z

Summary of Changes

Hello @abhimehro, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a crucial security enhancement by enforcing Content-Type validation when the application fetches remote blocklists. Previously, the system would parse any content as JSON if it was syntactically valid, even if the Content-Type header indicated something else. This change prevents the processing of potentially malicious or unintended content, such as HTML from a captive portal, thereby improving the application's robustness and security posture against unexpected network responses.

Highlights

Security Enhancement: Implemented strict Content-Type validation within the _gh_get function to prevent the application from processing unexpected content types when fetching blocklists.
Content-Type Enforcement: The _gh_get function now checks whether the Content-Type header contains application/json, text/json, or text/plain, and raises a ValueError for any other type.
Test Updates: Existing tests in test_cache_optimization.py and test_disk_cache.py were updated to include valid Content-Type headers in their mock responses to align with the new validation.
New Test Suite: A new test file, test_content_type.py, was added to specifically verify the correct behavior of the Content-Type validation logic, including both allowed and rejected types.
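One detail worth spelling out: the check in the diff is a substring test (`t in ct`), so headers with parameters such as `application/json; charset=utf-8` pass, but so would a value that merely contains an allowed type. The sketch below contrasts that behaviour with an exact comparison of the parsed media type. The function names are hypothetical and the exact-match variant is not part of this PR; it is only an illustration of the difference.

```python
ALLOWED_TYPES = ("application/json", "text/json", "text/plain")


def media_type(header_value: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return header_value.split(";", 1)[0].strip().lower()


def is_allowed_substring(header_value: str) -> bool:
    # Mirrors the containment test used in the PR diff.
    ct = header_value.lower()
    return any(t in ct for t in ALLOWED_TYPES)


def is_allowed_exact(header_value: str) -> bool:
    # Hypothetical stricter variant: compare the parsed media type exactly.
    return media_type(header_value) in ALLOWED_TYPES
```

Both variants accept `application/json; charset=utf-8`. They differ on a value like `text/plain-evil`, which the substring test accepts and the exact comparison rejects.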

Changelog

main.py
- Added Content-Type validation logic in _gh_get for both initial requests and retries.
- Introduced a ValueError if the Content-Type header does not match allowed types (application/json, text/json, text/plain).
tests/test_cache_optimization.py
- Imported the httpx library.
- Updated mock HTTP responses to explicitly include Content-Type: application/json in headers.
tests/test_content_type.py
- Added a new test file to verify Content-Type validation.
- Included tests to ensure application/json and text/plain are allowed.
- Added tests to confirm text/html and application/xml are correctly rejected with a ValueError.
tests/test_disk_cache.py
- Imported the httpx library.
- Updated mock HTTP responses to explicitly include Content-Type: application/json in headers.
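For readers who want a feel for the test changes, here is a rough, self-contained sketch of how a mocked response carrying a Content-Type header could be checked for both accepted and rejected types. `check_response` and `make_response` are hypothetical stand-ins, not the code in `main.py` or the PR's test files, and the mock uses a plain dict for headers, whereas a real httpx response exposes case-insensitive headers.

```python
import unittest
from unittest.mock import MagicMock

ALLOWED_TYPES = ("application/json", "text/json", "text/plain")


def check_response(response) -> None:
    """Hypothetical stand-in for the validation inside _gh_get."""
    ct = response.headers.get("content-type", "").lower()
    if not any(t in ct for t in ALLOWED_TYPES):
        raise ValueError(f"Invalid Content-Type: {ct}")


def make_response(content_type: str) -> MagicMock:
    # Plain dict with a lowercase key; assumed simplification of real headers.
    response = MagicMock()
    response.headers = {"content-type": content_type}
    return response


class ContentTypeSketch(unittest.TestCase):
    def test_allowed(self):
        for ct in ("application/json", "text/plain; charset=utf-8"):
            with self.subTest(ct=ct):
                check_response(make_response(ct))  # should not raise

    def test_rejected(self):
        for ct in ("text/html", "application/xml"):
            with self.subTest(ct=ct):
                with self.assertRaises(ValueError) as cm:
                    check_response(make_response(ct))
                self.assertIn("Invalid Content-Type", str(cm.exception))
```

Using `subTest` keeps each content type as its own reported case, so a single failing type does not hide the others.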

Copilot

Pull request overview

This PR enhances security by enforcing strict Content-Type validation when fetching blocklist data from remote URLs, preventing the application from processing unexpected content types like HTML from captive portals or error pages.

Changes:

Added Content-Type header validation to the _gh_get function in both the normal response path and the 304 retry path
Updated existing tests to include valid Content-Type headers in their mocks
Added comprehensive test coverage for the new validation logic

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
main.py	Added Content-Type validation logic (lines 941-948 and 1006-1014) to enforce that responses contain application/json, text/json, or text/plain
tests/test_content_type.py	New test file verifying Content-Type validation accepts valid types (application/json, text/plain with charset) and rejects invalid types (text/html, application/xml)
tests/test_cache_optimization.py	Updated mock response to include Content-Type: application/json header
tests/test_disk_cache.py	Updated mock response to include Content-Type: application/json header

gemini-code-assist

Code Review

This pull request introduces a valuable security enhancement by enforcing Content-Type validation when fetching remote blocklists. The implementation is correct and is accompanied by good test coverage, including a new dedicated test file. My review includes a couple of suggestions to improve maintainability by reducing code duplication in main.py and to enhance the readability of the new tests.

gemini-code-assist · 2026-02-18T11:07:40Z

main.py

+            # Security: Enforce Content-Type to be JSON or text
+            # This prevents processing of unexpected content (e.g., HTML from captive portals)
+            ct = r.headers.get("content-type", "").lower()
+            allowed_types = ("application/json", "text/json", "text/plain")
+            if not any(t in ct for t in allowed_types):
+                raise ValueError(
+                    f"Invalid Content-Type from {sanitize_for_log(url)}: {ct}. "
+                    f"Expected one of: {', '.join(allowed_types)}"
+                )


This Content-Type validation logic is nearly identical to the block on lines 941-948. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider extracting this logic into a helper function that can be called from both locations.
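A minimal sketch of what such a helper might look like, keeping the same substring check and error message as the diff above. The name `validate_content_type` is hypothetical, and `sanitize_for_log` is replaced here by a placeholder so the snippet runs on its own; the project's real function may behave differently.

```python
ALLOWED_CONTENT_TYPES = ("application/json", "text/json", "text/plain")


def sanitize_for_log(value: str) -> str:
    """Placeholder for the project's own sanitize_for_log (assumed behaviour)."""
    return repr(value)


def validate_content_type(headers, url: str) -> None:
    """Raise ValueError unless the Content-Type header contains an allowed type."""
    ct = headers.get("content-type", "").lower()
    if not any(t in ct for t in ALLOWED_CONTENT_TYPES):
        raise ValueError(
            f"Invalid Content-Type from {sanitize_for_log(url)}: {ct}. "
            f"Expected one of: {', '.join(ALLOWED_CONTENT_TYPES)}"
        )
```

Both call sites in `_gh_get` could then be reduced to a single line along the lines of `validate_content_type(r.headers, url)`, so the allowed list and the message live in one place.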

gemini-code-assist · 2026-02-18T11:07:40Z

tests/test_content_type.py

+        try:
+            main._gh_get("https://example.com/malicious.html")
+            # If it doesn't raise, we fail the test (once fixed)
+            # But for TDD, we expect this to fail AFTER the fix.
+            # For now, let's assert that it *should* raise ValueError
+        except ValueError as e:
+            self.assertIn("Invalid Content-Type", str(e))
+            return
+
+        # If we are here, no exception was raised.
+        # This confirms the vulnerability (or lack of validation).
+        # We can mark this as "expected failure" or just print it.
+        # For now, I'll fail the test so I can see it pass later.
+        self.fail("Should have raised ValueError for text/html Content-Type")


Now that the validation logic is implemented, this test can be simplified. Using the assertRaises context manager, as you've done in test_reject_xml, would make this test more concise and readable. The comments related to TDD are also no longer necessary.

        with self.assertRaises(ValueError) as cm:
            main._gh_get("https://example.com/malicious.html")
        self.assertIn("Invalid Content-Type", str(cm.exception))

Copilot AI review requested due to automatic review settings February 18, 2026 10:59

github-actions bot added the python label Feb 18, 2026

Copilot started reviewing on behalf of abhimehro February 18, 2026 10:59

Copilot AI reviewed Feb 18, 2026

gemini-code-assist bot reviewed Feb 18, 2026

github-actions bot mentioned this pull request Feb 18, 2026

[repo-status] Daily Repo Status - February 18, 2026 🌟 #351

Closed

abhimehro and others added 2 commits February 18, 2026 13:10

Update main.py

b4b98a8

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update test_content_type.py

8a4b0f4

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update test_content_type.py

c2cd144

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update test_content_type.py

edc9177

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update test_content_type.py

795ff6f

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Merge branch 'main' into sentinel/content-type-validation-20473475951…

077488a

…39016233

abhimehro merged commit b6e9558 into main Feb 19, 2026
13 of 15 checks passed

abhimehro deleted the sentinel/content-type-validation-2047347595139016233 branch February 19, 2026 00:41
