[format.string.escaped] does not specify boundary conditions for sequences of ill-formed code units #80

[\[format.string.escaped\]p2.2](https://eel.is/c++draft/format.string.escaped#2.2) states:
> For each code unit sequence X in S that either encodes a single character, is a shift sequence, or is a sequence of ill-formed code units, processing is in order as follows:

What constitutes a "sequence of ill-formed code units" is not specified. That is fine for implementation-defined encodings, but a precise definition could be specified for UTF-8, UTF-16, and UTF-32.

[Unicode PR-121](https://www.unicode.org/review/pr-121.html) provides a definition for "entire ill-formed subsequence" that is a good candidate for how a "sequence of ill-formed code units" might be defined:
> In these policy statements, "entire ill-formed subsequence" refers to all code units in the ill-formed subsequence up to but not including the start of the next well-formed code unit sequence.
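
To make the PR-121 boundary rule concrete, here is a minimal C++ sketch of how a UTF-8 string might be split into well-formed code unit sequences and "entire ill-formed subsequences". It is an illustration only, not proposed wording and not what any implementation does today. The names `well_formed_length`, `Segment`, and `segment_utf8` are hypothetical, and the byte ranges are assumed to follow the well-formed UTF-8 byte sequence table in the Unicode Standard (Table 3-7).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: length of the well-formed UTF-8 code unit sequence
// that starts at s[i], or 0 if no well-formed sequence starts there.
std::size_t well_formed_length(std::string_view s, std::size_t i) {
    auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
    // True if s[k] exists and lies in [lo, hi].
    auto in = [&](std::size_t k, unsigned lo, unsigned hi) {
        return k < s.size() && byte(k) >= lo && byte(k) <= hi;
    };
    if (i >= s.size()) return 0;
    const unsigned char b0 = byte(i);
    if (b0 <= 0x7F) return 1;
    if (b0 >= 0xC2 && b0 <= 0xDF) return in(i + 1, 0x80, 0xBF) ? 2 : 0;
    if (b0 >= 0xE0 && b0 <= 0xEF) {
        const unsigned lo = (b0 == 0xE0) ? 0xA0 : 0x80;  // reject overlong forms
        const unsigned hi = (b0 == 0xED) ? 0x9F : 0xBF;  // reject surrogates
        return (in(i + 1, lo, hi) && in(i + 2, 0x80, 0xBF)) ? 3 : 0;
    }
    if (b0 >= 0xF0 && b0 <= 0xF4) {
        const unsigned lo = (b0 == 0xF0) ? 0x90 : 0x80;  // reject overlong forms
        const unsigned hi = (b0 == 0xF4) ? 0x8F : 0xBF;  // reject > U+10FFFF
        return (in(i + 1, lo, hi) && in(i + 2, 0x80, 0xBF) &&
                in(i + 3, 0x80, 0xBF)) ? 4 : 0;
    }
    return 0;  // 0x80..0xC1 and 0xF5..0xFF never start a well-formed sequence
}

// Hypothetical result type: one well-formed sequence or one ill-formed run.
struct Segment {
    std::size_t offset;
    std::size_t length;
    bool well_formed;
};

// Splits s so that each ill-formed run extends up to, but not including,
// the start of the next well-formed code unit sequence (the PR-121 wording).
std::vector<Segment> segment_utf8(std::string_view s) {
    std::vector<Segment> out;
    std::size_t i = 0;
    while (i < s.size()) {
        if (const std::size_t n = well_formed_length(s, i)) {
            out.push_back({i, n, true});
            i += n;
        } else {
            const std::size_t start = i;
            do {
                ++i;
            } while (i < s.size() && well_formed_length(s, i) == 0);
            assert(i > start);  // an ill-formed run holds at least one code unit
            out.push_back({start, i - start, false});
        }
    }
    return out;
}
```

Under this sketch, the input `"a\xC3(\xE0\x80\x80z"` splits into five segments: `a`, the lone lead byte `C3` (ill-formed), `(`, the three code units `E0 80 80` as a single ill-formed run, and `z`. Other policies, such as treating each ill-formed code unit as its own sequence, would draw the boundaries differently, which is the ambiguity this issue is about.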
