TF-4524 Fix blank plain-text email caused by backslash-hex false-positive in HTML sanitizer #4527
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis pull request implements a defense-in-depth fix for a plain-text email rendering bug where certain backslash-hex escape patterns caused message bodies to become blank. The solution introduces a new Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/test/utils/sanitize_plain_text_html_output_transformer_test.dart`:
- Line 12: The shared test helper currently trims every transformer output via
the process helper: change String process(String input) =>
transformer.process(input, htmlEscape).trim(); to return the raw transformer
output without trimming so whitespace-preservation regressions are detectable;
locate and edit the process helper function and update any tests that depended
on trimmed output to explicitly call .trim() where appropriate (references:
process helper, transformer.process, htmlEscape).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7748a983-ddd9-4f5f-9941-a96e0a2239cb
📒 Files selected for processing (4)
core/lib/presentation/utils/html_transformer/text/sanitize_plain_text_html_output_transformer.dartcore/test/utils/sanitize_plain_text_html_output_transformer_test.dartdocs/adr/0089-fix-plain-text-email-blank-on-backslash-hex-patterns.mdlib/features/email/data/local/html_analyzer.dart
…lse-positive Signed-off-by: dab246 <tdvu@linagora.com>
|
This PR has been deployed to https://linagora.github.io/tmail-flutter/4527. |
…izer false-positive Signed-off-by: dab246 <tdvu@linagora.com>
…izer false-positive
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/test/utils/html_transform_text_plain_test.dart`:
- Around line 103-126: The tests use brittle substring checks like
contains('<script>'), contains('<img '), contains('<div ') which can miss
equivalent malicious variants; update the XSS assertions in the group to use
boundary-aware tag matchers (e.g., regex or token-boundary checks) against the
transform output so they reliably detect any opening tag forms (reference the
transform(...) calls in the three tests) — replace the plain contains assertions
with matchers that assert absence of an opening tag pattern like `<\s*script\b`,
`<\s*img\b`, and `<\s*div\b` (or equivalent boundary-based matchers) to ensure
robust detection.
In `@core/test/utils/plain_text_transformer_pipeline_test.dart`:
- Around line 113-117: The test "SHOULD escape and block <script> tags injected
in plain text" currently checks only for the exact substring "<script>", which
misses variants like "<script type='...'>"; update the assertion to use a regex
boundary check (e.g., look for "<script\b" case-insensitive) against the
pipeline output produced by runPipeline so the expect in that test asserts that
out does not match /<script\b/i, ensuring any <script ...> forms are caught;
keep the test name and use the existing runPipeline output variable.
In `@test/features/email/data/local/html_analyzer_text_plain_test.dart`:
- Around line 96-105: The tests for
transformPlain(TextPlainEmailCorpus.scriptTag) and
transformPlain(TextPlainEmailCorpus.imgTracker) only check for exact substrings
like "<script>" and "<img " which miss variants with attributes or different
spacing; change the assertions to assert the transformed content does not match
a tag-opening pattern instead (e.g. use a RegExp for "<\\s*script\\b" and
"<\\s*img\\b" or the language's equivalent negative match) so any "<script...>"
or "<img...>" or self-closing variants are caught; update the two tests to use
these regex-based non-matching checks against the returned .content from
transformPlain.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d5f2217f-522b-4c35-bbdf-19724b17e6fa
📒 Files selected for processing (10)
core/test/fixtures/email_content_corpus.dartcore/test/fixtures/html_email_corpus.dartcore/test/utils/html_transform_text_html_test.dartcore/test/utils/html_transform_text_plain_test.dartcore/test/utils/plain_text_transformer_pipeline_test.darttest/features/email/data/local/html_analyzer_text_html_test.darttest/features/email/data/local/html_analyzer_text_plain_test.darttest/fixtures/email_html_corpus.darttest/fixtures/html_email_corpus.darttest/fixtures/text_plain_email_corpus.dart
✅ Files skipped from review due to trivial changes (2)
- test/fixtures/text_plain_email_corpus.dart
- test/fixtures/email_html_corpus.dart
…L sanitizer false-positive Signed-off-by: dab246 <tdvu@linagora.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/fixtures/text_plain_email_corpus.dart (1)
22-33: 💤 Low valueConsider using raw multiline strings for improved readability.
The current concatenation of raw strings (for backslash preservation) with regular strings (for newline interpretation) is correct but less readable than alternatives.
♻️ Refactor suggestion: use raw multiline strings
- static const String phpStackTrace = - r'Fatal error: Uncaught \App\Http\Exception\NotFound' '\n' - r'Stack trace: `#0` \App\Router\Dispatcher::dispatch()' '\n' - 'In file: /var/www/html/public/index.php on line 42'; + static const String phpStackTrace = r'''Fatal error: Uncaught \App\Http\Exception\NotFound +Stack trace: `#0` \App\Router\Dispatcher::dispatch() +In file: /var/www/html/public/index.php on line 42''';Apply the same pattern to
phpNamespaceWithUrl:- static const String phpNamespaceWithUrl = - 'On Monday, Alice wrote:\n' - r"throw new \App\DB\Exception\AuthFailed('access denied');" '\n' - 'See https://jira.example.com/issues/1234 for context.'; + static const String phpNamespaceWithUrl = r'''On Monday, Alice wrote: +throw new \App\DB\Exception\AuthFailed('access denied'); +See https://jira.example.com/issues/1234 for context.''';🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/fixtures/text_plain_email_corpus.dart` around lines 22 - 33, The phpNamespaceWithUrl constant (and similar string constants like phpStackTrace/windowsFilePath/goPackagePath) uses concatenated raw and regular strings which is correct but hard to read; replace those concatenations with a single raw multiline Dart string (e.g., r"""...""" or r'''...''') so backslashes are preserved and newlines are explicit in the literal—update the declaration of phpNamespaceWithUrl to use one raw multiline string containing the whole value (including the \App\DB\Exception text, the newline, and the URL) and apply the same pattern to phpStackTrace, windowsFilePath, and goPackagePath where appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/fixtures/text_plain_email_corpus.dart`:
- Around line 22-33: The phpNamespaceWithUrl constant (and similar string
constants like phpStackTrace/windowsFilePath/goPackagePath) uses concatenated
raw and regular strings which is correct but hard to read; replace those
concatenations with a single raw multiline Dart string (e.g., r"""...""" or
r'''...''') so backslashes are preserved and newlines are explicit in the
literal—update the declaration of phpNamespaceWithUrl to use one raw multiline
string containing the whole value (including the \App\DB\Exception text, the
newline, and the URL) and apply the same pattern to phpStackTrace,
windowsFilePath, and goPackagePath where appropriate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 60eea130-1482-43d0-947b-8800c002689f
📒 Files selected for processing (2)
core/test/utils/standardize_html_sanitizing_transformers_test.darttest/fixtures/text_plain_email_corpus.dart
… by HTML sanitizer false-positive Signed-off-by: dab246 <tdvu@linagora.com>
… caused by HTML sanitizer false-positive Signed-off-by: dab246 <tdvu@linagora.com>
…t email caused by HTML sanitizer false-positive Signed-off-by: dab246 <tdvu@linagora.com>
There was a problem hiding this comment.
Gates Passed
3 Quality Gates Passed
See analysis details in CodeScene
Quality Gate Profile: The Bare Minimum
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.
Issue
#4524
Problem
Plain-text emails containing
\XXpatterns — such as PHP/Go namespace paths like\Sabre\DAV\Exception\Forbidden— were rendered as a completely blank white screen.Root cause was a four-level chain:
StandardizeHtmlSanitizingTransformers(an HTML sanitizer) was incorrectly placed first in thetextPlainpipeline — plain text is not HTML.NodeSanitizer._isEncodedJs()indart-neatsusesRegExp(r'\\[0-9a-f]{2}')to detect unicode-escape obfuscation. This matches any single\XXpair, so\DAin\Sabre\DAVtriggers it._shouldStripText()callsnode.remove()→ entire text node deleted.body.innerHtmlbecomes empty → blank screen.Solution
Rebuilt the
textPlaintransformer pipeline with three dedicated layers:SanitizeAutolinkHtmlTransformersSanitizePlainTextHtmlOutputTransformer(new)<a>tags with safehref(validated viaUri.tryParse) and attribute allowlist; unwrap everything elsePersistPreformattedTextTransformerThe HTML sanitizer (
StandardizeHtmlSanitizingTransformers) is removed from the plain-text path entirely.It remains in the HTML path where it belongs.
Related
Demo
Summary by CodeRabbit
Bug Fixes
New Features
Tests