Escape inner > when emitting verbatim tag suffixes#58
Merged
Conversation
A verbatim tag suffix (`!<...>`) containing an inner `>` followed by a tag terminator emitted YAML that closed the tag early and failed to reparse. Percent-escape `>` to `%3E` in the verbatim form so the emitted tag matches what the parser un-escapes on read. Widen the tag property-test generator to produce suffixes containing indicator characters, and add regression coverage for awkward suffixes.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A verbatim tag suffix (
!<...>) containing an inner>followed by a tag terminator was emitted with the>left literal, which closed the verbatim tag early and produced YAML that failed to reparse. For example,parse_str("!<%3E:G> value")parses a node with tag suffix>:G, but re-emitting it produced!<>:G> value, which the parser then rejected with "tag must be separated from the node value by whitespace."Fix
push_uri_escaped_tag_suffixnow percent-escapes>to%3Ein the verbatim form. The parser already un-escapes%3Eback to>(decode_tag_uri_escapes), so emit and parse stay inverse. Handle-form escaping (colons) is unchanged, and no other currently-literal characters are over-escaped.arb_tag) is widened to generate suffixes containing indicator characters (including>,:, space, and flow indicators); this had been narrowed in a way that hid the bug. The round-trip property now exercises these suffixes and passes because of the emitter fix.>:G,a>:b,>>:c,a>:b c) and for the parse-reachable!<%3E:G>repro.Testing
cargo build --all-features,cargo clippy --all-features(clean)cargo test --all-features(all suites pass)parser_propertiesround-trip property run repeatedly withPROPTEST_CASES=4000to exercise the widened, randomized generator