fix(content_disposition): enforce RFC 7230 quoted-pair grammar strictly#53
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
multipartkit/content_disposition.parsepreviously honoured every\Xescape inside a quoted-string by silently dropping the backslash and keeping the second character, even whenXwas outside the RFC 7230 §3.2.6quoted-pairgrammar (HTAB / SP / VCHAR / obs-text). That letNUL,CR,LF, and other ASCII control bytes flow into the decodedname/filename, enablingNUL-smuggling against downstream C-string consumers. This PR aligns the parser with the RFC and rejects illegal quoted-pairs with the newError(InvalidQuotedPair(_)).Changes
src/multipartkit/error.gleam: addInvalidQuotedPair(String)variant with a doc-comment that explains the grammar and the security motivation.src/multipartkit/internal/text.gleam: addread_token_or_quoted_strictplus its supportingQuotedFault(QuotedSyntax/QuotedPairInvalid) and aquoted-pairsecond-character validator that allowsHTAB,SP,VCHAR(%x21-7E), andobs-text(%x80-FF) while rejectingNUL,LF,CR, the remaining%x00-1Fcontrol bytes, andDEL(%x7F). The existing lenientread_token_or_quoted/read_quoted_stringis left untouched so Content-Type parsing inheader.gleamkeeps its current behaviour.src/multipartkit/content_disposition.gleam: route the parameter-value parse through the strict variant, mappingQuotedPairInvalidtoError(InvalidQuotedPair(original))andQuotedSyntaxto the existingError(InvalidContentDisposition(original)). The module doc-comment now documents the RFC 7230 §3.2.6 contract.test/regression_cd_quoted_pair_test.gleam: the eight acceptance tests from issue content_disposition.parse over-eager quoted-pair unescape — drops backslash before non-VCHAR (NUL, CR, LF) silently #50 — canonical\\and\"round-trips, the VCHAR-escape backslash-drop case,NUL/CR/LFrejection, a no-escape baseline, and a UTF-8nameregression check.CHANGELOG.md: one-line### Changedentry under## [Unreleased].Design Decisions
Picked option A (strict RFC 7230 compliance) from the issue: rejecting bad escapes upstream means callers never see a
nameorfilenamecarrying smuggled control bytes, which closes the security angle the issue called out without forcing every caller to add their own sanitiser. The strict parser lives next to the existing lenient one rather than replacing it because Content-Type parameter parsing (header.gleam) is out of scope for #50 and shares the helper — keeping both variants avoids unrelated behaviour changes there. The second-character validator enumerates the rejected ASCII code points explicitly with\u{...}literals so the grammar boundaries are obvious at the call site, and any grapheme outside that block (including all multi-byte UTF-8) is treated asobs-textand allowed, matching the RFC.Closes #50