Skip to content

fix(content_disposition): enforce RFC 7230 quoted-pair grammar strictly#53

Merged
nao1215 merged 1 commit into
mainfrom
fix/issue-50-quoted-pair-rfc7230
May 20, 2026
Merged

fix(content_disposition): enforce RFC 7230 quoted-pair grammar strictly#53
nao1215 merged 1 commit into
mainfrom
fix/issue-50-quoted-pair-rfc7230

Conversation

@nao1215

@nao1215 nao1215 commented May 20, 2026

Copy link
Copy Markdown
Owner

Summary

multipartkit/content_disposition.parse previously honoured every \X escape inside a quoted-string by silently dropping the backslash and keeping the second character, even when X was outside the RFC 7230 §3.2.6 quoted-pair grammar (HTAB / SP / VCHAR / obs-text). That let NUL, CR, LF, and other ASCII control bytes flow into the decoded name / filename, enabling NUL-smuggling against downstream C-string consumers. This PR aligns the parser with the RFC and rejects illegal quoted-pairs with the new Error(InvalidQuotedPair(_)).

Changes

  • src/multipartkit/error.gleam: add InvalidQuotedPair(String) variant with a doc-comment that explains the grammar and the security motivation.
  • src/multipartkit/internal/text.gleam: add read_token_or_quoted_strict plus its supporting QuotedFault (QuotedSyntax / QuotedPairInvalid) and a quoted-pair second-character validator that allows HTAB, SP, VCHAR (%x21-7E), and obs-text (%x80-FF) while rejecting NUL, LF, CR, the remaining %x00-1F control bytes, and DEL (%x7F). The existing lenient read_token_or_quoted / read_quoted_string is left untouched so Content-Type parsing in header.gleam keeps its current behaviour.
  • src/multipartkit/content_disposition.gleam: route the parameter-value parse through the strict variant, mapping QuotedPairInvalid to Error(InvalidQuotedPair(original)) and QuotedSyntax to the existing Error(InvalidContentDisposition(original)). The module doc-comment now documents the RFC 7230 §3.2.6 contract.
  • test/regression_cd_quoted_pair_test.gleam: the eight acceptance tests from issue content_disposition.parse over-eager quoted-pair unescape — drops backslash before non-VCHAR (NUL, CR, LF) silently #50 — canonical \\ and \" round-trips, the VCHAR-escape backslash-drop case, NUL / CR / LF rejection, a no-escape baseline, and a UTF-8 name regression check.
  • CHANGELOG.md: one-line ### Changed entry under ## [Unreleased].

Design Decisions

Picked option A (strict RFC 7230 compliance) from the issue: rejecting bad escapes upstream means callers never see a name or filename carrying smuggled control bytes, which closes the security angle the issue called out without forcing every caller to add their own sanitiser. The strict parser lives next to the existing lenient one rather than replacing it because Content-Type parameter parsing (header.gleam) is out of scope for #50 and shares the helper — keeping both variants avoids unrelated behaviour changes there. The second-character validator enumerates the rejected ASCII code points explicitly with \u{...} literals so the grammar boundaries are obvious at the call site, and any grapheme outside that block (including all multi-byte UTF-8) is treated as obs-text and allowed, matching the RFC.

Closes #50

@coderabbitai

coderabbitai Bot commented May 20, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@nao1215 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 11 minutes and 58 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1eba6476-a7ed-4ea5-a6bd-545ba7549677

📥 Commits

Reviewing files that changed from the base of the PR and between 567591f and 2000da4.

📒 Files selected for processing (5)
  • CHANGELOG.md
  • src/multipartkit/content_disposition.gleam
  • src/multipartkit/error.gleam
  • src/multipartkit/internal/text.gleam
  • test/regression_cd_quoted_pair_test.gleam
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-50-quoted-pair-rfc7230

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@nao1215 nao1215 merged commit ab187f8 into main May 20, 2026
5 checks passed
@nao1215 nao1215 deleted the fix/issue-50-quoted-pair-rfc7230 branch May 20, 2026 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

content_disposition.parse over-eager quoted-pair unescape — drops backslash before non-VCHAR (NUL, CR, LF) silently

1 participant