🛡️ Sentinel: [HIGH] Fix XSS by sanitizing go-readability output with bluemonday#135
🛡️ Sentinel: [HIGH] Fix XSS by sanitizing go-readability output with bluemonday#135
Conversation
- Adds github.com/microcosm-cc/bluemonday dependency - Initializes global htmlSanitizer with UGCPolicy - Sanitizes contentBuf in handler before formatting - Adds api/xss_test.go to verify fix and prevent regression - Updates .jules/sentinel.md with learning Co-authored-by: lucasew <15693688+lucasew@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Summary of ChangesHello @lucasew, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request implements a critical security enhancement by introducing robust HTML sanitization to mitigate a high-severity Cross-Site Scripting (XSS) vulnerability. By integrating the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request effectively mitigates a high-severity XSS vulnerability by sanitizing HTML output using bluemonday. The approach is sound, and the inclusion of a dedicated test case is excellent for verifying the fix. My review includes a couple of suggestions for improvement: one to enhance the efficiency and readability of the sanitization logic, and another to address a potential race condition in the new test code to improve test suite robustness. Overall, this is a well-executed security enhancement.
api/index.go
Outdated
| sanitized := htmlSanitizer.SanitizeBytes(contentBuf.Bytes()) | ||
| contentBuf.Reset() | ||
| contentBuf.Write(sanitized) |
There was a problem hiding this comment.
This sequence of sanitizing, resetting, and writing to the buffer can be simplified and made more idiomatic. Using SanitizeReader reads from the original buffer and returns a new buffer containing the sanitized content. This is cleaner than modifying the buffer in place and avoids multiple separate operations.
| sanitized := htmlSanitizer.SanitizeBytes(contentBuf.Bytes()) | |
| contentBuf.Reset() | |
| contentBuf.Write(sanitized) | |
| contentBuf = htmlSanitizer.SanitizeReader(contentBuf) |
| originalClient := httpClient | ||
| httpClient = &http.Client{ | ||
| Timeout: httpClientTimeout, | ||
| } | ||
| defer func() { httpClient = originalClient }() |
There was a problem hiding this comment.
Modifying the global httpClient variable here is not safe for parallel execution. If tests are run in parallel in the future (e.g., with t.Parallel()), this could cause data races. To make this test safe for parallelism, you should protect the modification of the global variable with a mutex. A longer-term solution would be to refactor the code to avoid a global HTTP client and instead use dependency injection.
- Adds github.com/microcosm-cc/bluemonday dependency - Initializes global htmlSanitizer with UGCPolicy - Sanitizes contentBuf in handler before formatting - Adds api/xss_test.go to verify fix and prevent regression - Updates .jules/sentinel.md with learning - Fixes unused parameter lint error in api/xss_test.go Co-authored-by: lucasew <15693688+lucasew@users.noreply.github.com>
- Adds github.com/microcosm-cc/bluemonday dependency - Initializes global htmlSanitizer with UGCPolicy - Sanitizes contentBuf in handler before formatting - Adds api/xss_test.go to verify fix and prevent regression - Updates .jules/sentinel.md with learning - Fixes unused parameter lint error in api/xss_test.go - Runs go mod tidy to update dependencies correctly Co-authored-by: lucasew <15693688+lucasew@users.noreply.github.com>
- Adds github.com/microcosm-cc/bluemonday dependency - Initializes global htmlSanitizer with UGCPolicy - Sanitizes contentBuf in handler before formatting - Adds api/xss_test.go to verify fix and prevent regression - Updates .jules/sentinel.md with learning - Fixes unused parameter lint error in api/xss_test.go - Runs go mod tidy to update dependencies correctly Co-authored-by: lucasew <15693688+lucasew@users.noreply.github.com>
- Adds github.com/microcosm-cc/bluemonday dependency - Initializes global htmlSanitizer with UGCPolicy - Sanitizes contentBuf in handler before formatting - Adds api/xss_test.go to verify fix and prevent regression - Updates .jules/sentinel.md with learning - Fixes unused parameter lint error in api/xss_test.go - Runs go mod tidy to update dependencies correctly - Explicitly ignores errors in tests and handler to satisfy strict linters Co-authored-by: lucasew <15693688+lucasew@users.noreply.github.com>
Severity
High
Vulnerability
The application uses
go-readabilityto parse and clean up web articles. However,go-readabilityis not a strict security sanitizer and may preserve dangerous HTML attributes (likeonerror) or schemes (likejavascript:). This allows a malicious or compromised website to serve content that, when processed by our API and rendered to a user, executes arbitrary JavaScript in the user's browser (Stored/Reflected XSS).Impact
An attacker can craft a URL that, when viewed through this reader application, executes malicious scripts. This could lead to session hijacking, phishing, or other client-side attacks against users of the reader.
Fix
Implemented strict HTML sanitization using
github.com/microcosm-cc/bluemondaywith theUGCPolicy. The sanitization is applied immediately after thego-readabilityrendering step, ensuring that thecontentBufused for all output formats (HTML, JSON, Markdown) is free of XSS vectors.Verification
Added
api/xss_test.gowhich:<img onerror=...>and<a href="javascript:...">.onerrororjavascript:.go test ./apito verify.PR created automatically by Jules for task 1476203048334350304 started by @lucasew