Commit 7f6fdbc

docs: document length-prefixed bytes arrays and null-terminated strings
1 parent d2f5430 commit 7f6fdbc

2 files changed

Lines changed: 67 additions & 0 deletions

docs/guide.md

Lines changed: 65 additions & 0 deletions
@@ -269,6 +269,8 @@ class Chars:

### Bytes arrays

#### Fixed-length

Fixed-length byte arrays can be represented in both size modes by annotating a
field with `typing.Annotated` and a positive length. The field's unpacked Python
representation will be a `bytes` object zero-padded or truncated to the
@@ -287,6 +289,69 @@ class FixedLength:
FixedLength(fixed=b'Hello, wor')
```

!!! tip "Tip: null-terminated strings"

    Fixed-length `bytes` arrays are truncated to the exact length specified in
    the `Annotated` argument. If you require `bytes` arrays to always be
    null-terminated (e.g. for passing to a C API), add a [`PadAfter`
    annotation](#manual-padding) to the field:

    ```python
    @dcs.dataclass_struct()
    class FixedLengthNullTerminated:
        # Equivalent to `unsigned char[11]` in C
        fixed: Annotated[bytes, 10, dcs.PadAfter(1)]
    ```

    ```python
    >>> FixedLengthNullTerminated(b"0123456789A").pack()
    b'0123456789\x00'
    ```

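As a rough illustration of the packed layout, the sketch below uses the standard library's `struct` module directly. It assumes the annotated field corresponds to the format `10sx` (a 10-byte `s` field followed by one pad byte `x`); that mapping is an assumption made for illustration, not something stated in this guide.

```python
import struct

# Assumed stand-in for `Annotated[bytes, 10, dcs.PadAfter(1)]`:
# a 10-byte string field followed by a single zero pad byte.
FORMAT = "=10sx"

# Input longer than 10 bytes is truncated; the pad byte is always zero.
packed = struct.pack(FORMAT, b"0123456789A")
print(packed)       # b'0123456789\x00'
print(len(packed))  # 11, matching `unsigned char[11]`

# Unpacking skips the pad byte and returns only the 10 data bytes.
(value,) = struct.unpack(FORMAT, packed)
print(value)        # b'0123456789'
```

Because the pad byte lies outside the 10 data bytes, the terminator is present even when the data fills the whole field.
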
#### Length-prefixed

One issue with fixed-length `bytes` arrays is that data shorter than the fixed
length is zero-padded when packed, and the padding is retained when unpacking to
the Python type:

```python
>>> packed = FixedLength(b'Hello').pack()
>>> packed
b'Hello\x00\x00\x00\x00\x00'
>>> FixedLength.from_packed(packed)
FixedLength(fixed=b'Hello\x00\x00\x00\x00\x00')
```

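The padding cannot always be removed safely after the fact. A minimal sketch using the standard library's `struct` module (with `10s` assumed as a stand-in for the 10-byte field, and `roundtrip_and_strip` a hypothetical helper written for this example) suggests why: stripping trailing null bytes also discards nulls that were part of the original data.

```python
import struct


def roundtrip_and_strip(data: bytes) -> bytes:
    """Pack into a 10-byte field, unpack, then strip trailing null bytes."""
    (unpacked,) = struct.unpack("10s", struct.pack("10s", data))
    return unpacked.rstrip(b"\x00")


# Text-like data survives the round trip once the padding is stripped.
text = roundtrip_and_strip(b"Hello")
print(text)    # b'Hello'

# This payload legitimately ends in a null byte, which stripping loses.
binary = roundtrip_and_strip(b"\x01\x02\x00")
print(binary)  # b'\x01\x02'
```
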
An alternative is to use *length-prefixed arrays*, also known as [*Pascal
strings*](https://en.wikipedia.org/wiki/Pascal_string). These store the length
of the array in the first byte, so the longest array that can be stored without
truncation is 255 bytes. To use length-prefixed arrays, annotate a `bytes` field
with [`LengthPrefixed`][dataclasses_struct.LengthPrefixed]:

```python
from typing import Annotated

@dcs.dataclass_struct()
class PascalStrings:
    s: Annotated[bytes, dcs.LengthPrefixed(10)]  # (1)!
```

1. The length passed to `LengthPrefixed` must be between 2 and 256 inclusive.

```python
>>> packed = PascalStrings(b"Hello").pack()
>>> packed
b'\x05Hello\x00\x00\x00\x00'
>>> PascalStrings.from_packed(packed)
PascalStrings(s=b'Hello')
```

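The same byte layout can be reproduced with the `p` (Pascal string) format character of the standard library's `struct` module. This sketch assumes that `LengthPrefixed(10)` is comparable to the format `10p`; it illustrates the layout only and is not taken from the library's implementation.

```python
import struct

# `10p` occupies 10 bytes in total: 1 length byte plus 9 data bytes.
packed = struct.pack("10p", b"Hello")
print(packed)     # b'\x05Hello\x00\x00\x00\x00'

# The first byte records how many of the following bytes are real data.
print(packed[0])  # 5

# Unpacking uses the length byte to drop the padding.
(value,) = struct.unpack("10p", packed)
print(value)      # b'Hello'
```
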
!!! note

    The size passed to [`LengthPrefixed`][dataclasses_struct.LengthPrefixed] is
    the size of the packed representation of the field *including the size
    byte*, so the maximum length the array can be without truncation is one less
    than the size.

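To make the size accounting concrete, here is a hand-written sketch of a length-prefixed layout. `pack_length_prefixed` and `unpack_length_prefixed` are hypothetical helpers written for this example and are not part of `dataclasses_struct`. Truncating over-long data to `size - 1` bytes is modeled on the `p` format of the standard `struct` module, and is assumed rather than confirmed for this library.

```python
def pack_length_prefixed(data: bytes, size: int) -> bytes:
    """Pack `data` into exactly `size` bytes: 1 length byte plus payload."""
    if not 2 <= size <= 256:
        raise ValueError("size must be between 2 and 256 inclusive")
    # Only size - 1 bytes are available for data; the rest is cut off.
    payload = data[: size - 1]
    # Length byte first, then the payload zero-padded to fill the field.
    return bytes([len(payload)]) + payload.ljust(size - 1, b"\x00")


def unpack_length_prefixed(packed: bytes) -> bytes:
    """Read the length byte and return that many data bytes."""
    length = packed[0]
    return packed[1 : 1 + length]


# A size of 4 leaves room for only 3 data bytes.
packed = pack_length_prefixed(b"Hello", 4)
print(packed)                          # b'\x03Hel'
print(unpack_length_prefixed(packed))  # b'Hel'
```

With the maximum size of 256, the payload can be at most 255 bytes, which is the largest value a single length byte can hold.
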
### Fixed-length arrays

Fixed-length arrays can be represented by annotating a `list` field with

mkdocs.yml

Lines changed: 2 additions & 0 deletions
@@ -40,6 +40,8 @@ markdown_extensions:
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.superfences
  - pymdownx.details
  - pymdownx.magiclink
  - admonition
watch:
  - dataclasses_struct
