feat(examples): add usage examples with integration tests #156

Merged: kessplas merged 33 commits into staging from tonyknap/examples on Apr 16, 2026
@texastony

Examples

KMS Keyring Put/Get (kms_keyring_put_get_example.py)

  • Basic encrypt/decrypt roundtrip with KMS Keyring
  • Demonstrates encryption context bound to S3 bucket and key
  • Includes sample KMS key policy for encryption context validation
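A minimal sketch of how a KMS key policy statement can require a specific encryption context, assuming the context carries the bucket and object key. The context key names (`bucket`, `key`), the bucket name, the object key, and the role ARN are hypothetical placeholders and are not taken from the example file; only the `kms:EncryptionContext:<context-key>` condition key format comes from KMS itself.

```python
import json

# Hypothetical values for illustration only.
BUCKET = "amzn-s3-demo-bucket"
OBJECT_KEY = "reports/2026/q1.bin"

# Assumed shape of the encryption context bound to one S3 object.
encryption_context = {"bucket": BUCKET, "key": OBJECT_KEY}


def context_condition(context):
    """Build a StringEquals condition that requires every context pair to match."""
    return {
        "StringEquals": {
            f"kms:EncryptionContext:{name}": value for name, value in context.items()
        }
    }


statement = {
    "Sid": "AllowDecryptOnlyForThisObject",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/ExampleReader"},
    "Action": "kms:Decrypt",
    "Resource": "*",
    "Condition": context_condition(encryption_context),
}

print(json.dumps(statement, indent=2))
```

With a statement of this shape, a decrypt request whose encryption context names a different bucket or key would not match the condition.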

Legacy Decrypt (legacy_decrypt_example.py)

  • Decrypt V1 objects encrypted by older S3 Encryption Clients
  • Demonstrates enable_legacy_wrapping_algorithms and enable_legacy_unauthenticated_modes
  • Uses REQUIRE_ENCRYPT_ALLOW_DECRYPT commitment policy

Delayed Auth Streaming (delayed_auth_streaming_example.py)

  • Streaming decryption of large files with enable_delayed_authentication
  • Reads decrypted content in 1 MB chunks without buffering entire object
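The chunked read pattern described above can be sketched roughly as follows. This is not the example file itself: `io.BytesIO` stands in for the decrypted response body, and `consume_in_chunks` is a hypothetical helper name. Any object with a `read(amt)` method would work the same way.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024  # 1 MB, matching the chunk size in the description


def consume_in_chunks(body, sink, chunk_size=CHUNK_SIZE):
    """Copy body to sink one chunk at a time; return (bytes_copied, sha256_hex)."""
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = body.read(chunk_size)
        if not chunk:  # empty read signals end of stream
            break
        sink.write(chunk)
        digest.update(chunk)
        total += len(chunk)
    return total, digest.hexdigest()


# Stand-in for a decrypted streaming body (assumption, not the real client).
fake_body = io.BytesIO(b"\x00" * (3 * CHUNK_SIZE + 123))
sink = io.BytesIO()
copied, checksum = consume_in_chunks(fake_body, sink)
print(copied, checksum)
```

Only one chunk is held in memory at a time; the sink here is an in-memory buffer for brevity, where a real consumer would more likely write to a file.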

Instruction File (instruction_file_example.py)

  • Instruction file decryption with default and custom suffixes (xfail, #152)

CI

  • Added examples step to python-integ.yml workflow
  • Registered examples pytest mark in pyproject.toml

Test Results

  • 3 passed, 1 xfailed

texastony added 15 commits March 6, 2026 11:13
Introduce BufferedDecryptingStream that wraps the S3 StreamingBody and
decrypts lazily on first read. No plaintext is released until the entire
ciphertext is read and the GCM auth tag is verified, matching the Java
S3EC's BufferedCipherSubscriber behavior.

- Add stream.py with BufferedDecryptingStream (read, iter_chunks, close)
- Pipeline returns BufferedDecryptingStream instead of decrypted bytes
- Event handler passes stream directly as parsed["Body"]
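To illustrate the buffered contract (nothing is released until the whole ciphertext is read and the tag checks out), here is a standard-library sketch. It is not the PR's stream.py: a toy XOR keystream with a truncated HMAC tag stands in for AES-GCM so the block runs without third-party packages, and every name in it (`BufferedSketch`, `toy_encrypt`, `_keystream_xor`) is hypothetical.

```python
import hashlib
import hmac
import io

TAG_LEN = 16


def _keystream_xor(key, data):
    """Toy stream cipher (NOT AES-GCM): XOR data with a SHA-256 derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def toy_encrypt(key, plaintext):
    """Return ciphertext with a 16-byte tag appended, mirroring the GCM layout."""
    ciphertext = _keystream_xor(key, plaintext)
    return ciphertext + hmac.new(key, ciphertext, hashlib.sha256).digest()[:TAG_LEN]


class BufferedSketch:
    """Decrypts lazily on first read; releases plaintext only after tag verification."""

    def __init__(self, raw, key):
        self._raw = raw
        self._key = key
        self._plain = None  # becomes an io.BytesIO once the tag has been verified

    def _decrypt_all(self):
        blob = self._raw.read()
        ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
        expected = hmac.new(self._key, ciphertext, hashlib.sha256).digest()[:TAG_LEN]
        if not hmac.compare_digest(tag, expected):
            raise ValueError("authentication tag mismatch")
        self._plain = io.BytesIO(_keystream_xor(self._key, ciphertext))

    def read(self, amt=None):
        if self._plain is None:
            self._decrypt_all()
        return self._plain.read(amt)
```

The cost of this contract is memory: the full object is held before the first byte is returned, which is the trade-off the delayed-authentication mode relaxes.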
…ations

Add enable_delayed_authentication field to S3EncryptionClientConfig,
defaulting to False. Includes duvet specification citations from
client.md#enable-delayed-authentication.
…eaming

Add DelayedAuthDecryptingStream that releases plaintext incrementally
via AES-GCM cipher.update() before tag verification. The GCM tag (last
16 bytes) is held back and verified on stream exhaustion via
finalize_with_tag(). Matches Java S3EC CipherSubscriber pattern.
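The hold-back idea in this commit can be sketched as below, under loud assumptions: `ToyDecryptor` is a stand-in that only mimics an `update()` / `finalize_with_tag()` interface (single-byte XOR plus a truncated HMAC, not AES-GCM), and `DelayedAuthSketch` is a hypothetical class rather than the PR's implementation.

```python
import hashlib
import hmac
import io

TAG_LEN = 16


class ToyDecryptor:
    """Stand-in exposing update()/finalize_with_tag(); NOT real AES-GCM."""

    def __init__(self, key):
        self._mask = key[0]
        self._mac = hmac.new(key, digestmod=hashlib.sha256)

    def update(self, data):
        self._mac.update(data)  # authenticate the ciphertext as it streams by
        return bytes(b ^ self._mask for b in data)

    def finalize_with_tag(self, tag):
        if not hmac.compare_digest(tag, self._mac.digest()[:TAG_LEN]):
            raise ValueError("authentication tag mismatch")
        return b""


class DelayedAuthSketch:
    """Releases plaintext incrementally while withholding the trailing tag bytes."""

    def __init__(self, raw, decryptor, tag_length=TAG_LEN):
        self._raw, self._dec, self._tag_len = raw, decryptor, tag_length
        self._held = b""  # the most recent <= tag_length bytes seen so far
        self._done = False

    def read(self, amt):
        if self._done:
            return b""
        data = self._raw.read(amt)
        buf = self._held + data
        if not data:  # underlying stream exhausted: the held bytes are the tag
            self._done = True
            self._dec.finalize_with_tag(buf)
            return b""
        if len(buf) <= self._tag_len:  # cannot yet tell ciphertext from tag
            self._held = buf
            return b""
        self._held = buf[-self._tag_len:]
        return self._dec.update(buf[:-self._tag_len])
```

Note that a tag failure surfaces only on the final read, after the plaintext has already been handed to the caller. Reads no larger than the tag length can also return b"" before the stream is exhausted, which is the behavior the design note later in this timeline discusses.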
… handler

- Pipeline.decrypt() accepts enable_delayed_authentication param and
  returns DelayedAuthDecryptingStream when True, BufferedDecryptingStream
  when False. Raises error if param is None (must be explicitly set).
- Event handler passes config flag to pipeline.
- Remove duplicated defaults from pipeline params — config is single
  source of truth.
- Update unit tests to pass instruction_suffix explicitly.
…delayed auth

Replace individual delayed-auth tests with pytest.mark.parametrize
covering both buffered and delayed-auth modes across ascii, empty,
unicode, utf-8, latin-1, binary data, and no-body cases.
…n citations

Add unit tests that verify the behavioral contract of both stream modes:
- DelayedAuthDecryptingStream releases plaintext before GCM tag verification
- BufferedDecryptingStream withholds all plaintext until tag is verified

Includes duvet type=test citations for enable-delayed-authentication spec.
…layed-auth streaming

- Add 50 MB V2 delayed-auth streaming decryption test against static object
- Add 50 MB V3 test (skipped, V3 not yet implemented)
- Add 61 GiB V2/V3 placeholder tests marked @pytest.mark.large (skipped,
  static objects not yet created)
- Parametrize existing instruction file tests with buffered/delayed-auth modes
…h plaintext

- Fix pipeline decrypt() docstring to reflect both return types
- Add assertion that full delayed-auth stream output matches expected plaintext
- Remove 61 GiB V2/V3 placeholder tests (static objects not yet created)
- Remove large pytest mark registration (no longer used)
BufferedDecryptingStream and DelayedAuthDecryptingStream now take a
decryptor object and tag_length instead of raw key/nonce. This makes
them reusable across algorithm suites (GCM, key-committing GCM, CBC).
… algorithm suite dispatch

- Merge staging's key commitment, commitment policy, algorithm suite
  config, CBC decryption, and V3 decryption into the streaming branch
- Algorithm suite dispatch now returns streaming decryptors for all paths
  (GCM, key-committing GCM, CBC) instead of eager in-memory decryption
- Add unpadder support to streams for CBC PKCS7 padding removal
- Update all tests to pass enable_delayed_authentication and use .read()
  on stream results
- Add cipher_tag_length_bytes and cipher_block_size_bytes properties to AlgorithmSuite
- Replace hardcoded GCM_TAG_LENGTH and PKCS7(128) with algorithm suite properties
- Remove dead code: _decrypt_cbc_content()
- Make _make_decrypting_stream and _decrypt_kc_gcm_content static methods
- Remove GCM_TAG_LENGTH constant from stream.py
- Make tag_length required (no default) on DelayedAuthDecryptingStream
@texastony texastony marked this pull request as draft March 20, 2026 17:46
@texastony texastony changed the base branch from staging to tonyknap/feat-buffered-decryption-aes-gcm March 20, 2026 17:46
@texastony texastony marked this pull request as ready for review March 20, 2026 17:47
- Make instruction_suffix and enable_delayed_authentication positional args
- Move duvet annotation to BufferedDecryptingStream return
- Hardcode CBC to always stream (no auth tag, matches Java behavior)
- Move duvet annotations from _decrypt_kc_gcm_content to _decrypt_kc_gcm_streaming
- Remove unused _decrypt_kc_gcm_content method
- Fix DelayedAuthDecryptingStream CBC unpadding (peek + incremental unpadder)
- Add CBC unit tests for both stream types (roundtrip, chunked, finalize, padding)
- Add delayed authentication mode integration test with duvet citation
Split DelayedAuthDecryptingStream into DelayedAuthCBCDecryptingStream
and DelayedAuthGCMDecryptingStream. CBC and GCM are mutually exclusive
paths — CBC uses an unpadder with no auth tag, GCM uses a rolling tag
buffer with no padding — so the single-class design carried impossible
field combinations and conditional branching in read().

All three stream classes (BufferedDecryptingStream and the two new
delayed-auth classes) now extend botocore's StreamingBody with
@define(slots=False), inheriting iter_chunks, iter_lines, __iter__,
and __next__ for free.

Updated _make_decrypting_stream dispatch in pipelines.py and test
constructors in test_stream.py.
# Delayed-Auth Streams: Empty Read Behavior

## Problem

DelayedAuthGCMDecryptingStream.read(amt) can return b"" mid-stream before
the stream is exhausted. This happens when the read size is small relative
to the GCM tag length (16 bytes) — the stream can't distinguish ciphertext
from the trailing auth tag until it has accumulated more than tag_length bytes.

Example with 20 bytes of ciphertext+tag, read(7):

1. read(7) → 7 bytes buffered, <= 16 → returns b""
2. read(7) → 14 bytes buffered, <= 16 → returns b""
3. read(7) → only 6 bytes remain, 20 bytes buffered, > 16 → splits ciphertext/tag → returns 4 bytes of plaintext

In Python, read() returning b"" conventionally signals EOF. This breaks
common patterns like: while chunk := stream.read(7), which exits on the
first empty read, before any plaintext is returned or the tag is verified.

DelayedAuthCBCDecryptingStream does not have this issue — CBC cipher.update()
always produces output when given input.
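The byte accounting in the example above can be reproduced with a few lines of arithmetic. This is a sketch of the rolling tag buffer's bookkeeping only (no cryptography involved), and `trace_reads` is a hypothetical helper name.

```python
def trace_reads(total_len, amt, tag_len=16):
    """Return how many plaintext bytes each read(amt) call would release."""
    released = []
    consumed = 0  # bytes pulled from the underlying stream so far
    held = 0      # bytes currently withheld because they might be the tag
    while consumed < total_len:
        got = min(amt, total_len - consumed)
        consumed += got
        buffered = held + got
        out = max(0, buffered - tag_len)  # only bytes beyond the tag window are safe
        held = buffered - out
        released.append(out)
    return released


print(trace_reads(20, 7))   # [0, 0, 4]
print(trace_reads(20, 17))  # [1, 3]
```

The first trace matches the three steps listed above: two empty reads followed by 4 bytes of plaintext. The second shows that once the read size exceeds the tag length, every read that obtains data also releases some.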

## Java Behavior

Java's CipherSubscriber (the delayed-auth equivalent) does the same thing.
When cipher.update() returns null/empty, it explicitly sends an empty
ByteBuffer downstream. This is fine in Java's reactive streams model where
empty emissions are normal signaling. In Python's read() API, it's surprising.

## Options Considered

1. Keep as-is, document it — match Java semantics.
2. Loop internally in read() — more Pythonic, but violates io.py Reader.read
   contract: "If size is specified, at most size items will be read."
3. Require minimum read size (chosen) — raise if amt < tag_length + 1.

## ESDK-Python Comparison

ESDK-Python's StreamDecryptor never has this problem because it decrypts at
the frame level. Each frame has its own IV and tag, so authentication is
per-frame. S3EC operates on a single non-framed GCM ciphertext where the tag
is simply appended — the stream must separate tag from ciphertext on the fly.
- Refactor DelayedAuthGCMDecryptingStream to use content_length instead
  of rolling tag buffer and peek-ahead
- Add ContentLength validation in on_get_object_after_call
- Pass content_length through pipeline to all stream constructors
- Rename stream classes: BufferedDecryptingGCMStream → GCMBufferedDecryptingStream,
  DelayedAuthCBCDecryptingStream → CBCDecryptingStream,
  DelayedAuthGCMDecryptingStream → GCMDelayedAuthDecryptingStream
- Track _amount_read for progress in all three streams
- Remove minimum read size restriction
- GCMBufferedDecryptingStream.__enter__ returns self for consistent
  context manager behavior across all stream classes
- GCMDelayedAuthDecryptingStream raises on content_length < tag_length
- Clarify content_length comment as ciphertext content length
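A rough sketch of why a known content length removes the need for a rolling tag buffer: the ciphertext/tag boundary can be computed up front, so each read returns ciphertext (which a real stream would pass through the decryptor) until the boundary, and the tag is read in one piece afterward. `CiphertextTagSplitter` is a hypothetical class that performs no decryption, and it does not handle a truncated underlying stream.

```python
import io


class CiphertextTagSplitter:
    """Splits a `ciphertext || tag` body using a known ciphertext content length."""

    def __init__(self, raw, content_length, tag_length=16):
        if content_length < tag_length:
            raise ValueError("content_length is shorter than the auth tag")
        self._raw = raw
        self._tag_length = tag_length
        self._ciphertext_left = content_length - tag_length
        self._amount_read = 0  # progress counter, as tracked in the commits above
        self.tag = None        # populated once the ciphertext is exhausted

    def read(self, amt=None):
        if amt is None:
            want = self._ciphertext_left
        else:
            want = min(amt, self._ciphertext_left)
        chunk = self._raw.read(want) if want else b""
        self._ciphertext_left -= len(chunk)
        self._amount_read += len(chunk)
        if self._ciphertext_left == 0 and self.tag is None:
            # Boundary reached: the remaining bytes are the tag, ready to verify.
            self.tag = self._raw.read(self._tag_length)
        return chunk


body = bytes(range(20))  # 4 bytes of "ciphertext" followed by a 16-byte "tag"
stream = CiphertextTagSplitter(io.BytesIO(body), content_length=len(body))
first = stream.read(7)
second = stream.read(7)
print(first, second, stream.tag)
```

With the same 20-byte body used in the design note, read(7) returns the 4 ciphertext bytes on the first call and b"" only once the stream is truly exhausted, so no minimum read size is needed.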
- KMS Keyring put/get roundtrip with encryption context
- Legacy V1 object decryption with enable_legacy_wrapping_algorithms
- Delayed authentication streaming decryption for large files
- Instruction file decryption with default and custom suffixes (xfail, #152)
- Register examples pytest mark in pyproject.toml
- Add examples step to CI workflow
…to get_object kwarg

Move instruction_file_suffix from a client-level config attribute to a
per-request keyword argument (InstructionFileSuffix) on get_object().
This allows a single S3EncryptionClient to use different instruction file
suffixes per request, matching the spec requirement that custom suffixes
be supported on GetObject requests.

The suffix is passed through thread-local context to the plugin event
handler, following the same pattern as EncryptionContext.

Resolves #152.
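The thread-local hand-off described in this commit might look roughly like the sketch below. It is an illustration of the pattern only: `get_object` and `on_get_object_after_call` here are simplified stand-ins rather than the client's real methods, `instruction_file_suffix` and `current_suffix` are hypothetical helpers, and the `.instruction` default is an assumption.

```python
import threading
from contextlib import contextmanager

_request_context = threading.local()
DEFAULT_SUFFIX = ".instruction"  # assumed default, for illustration


@contextmanager
def instruction_file_suffix(suffix):
    """Expose a per-request suffix to code running later on the same thread."""
    previous = getattr(_request_context, "suffix", None)
    _request_context.suffix = suffix
    try:
        yield
    finally:
        _request_context.suffix = previous  # restore, so requests do not leak state


def current_suffix():
    return getattr(_request_context, "suffix", None) or DEFAULT_SUFFIX


def on_get_object_after_call(key):
    """Stand-in event handler: it has no access to get_object's kwargs."""
    return key + current_suffix()


def get_object(key, InstructionFileSuffix=None):
    """Stand-in for the client call that accepts the per-request kwarg."""
    with instruction_file_suffix(InstructionFileSuffix):
        return on_get_object_after_call(key)


print(get_object("data.bin"))
print(get_object("data.bin", InstructionFileSuffix=".custom-suffix-instruction"))
```

Because the value lives in thread-local storage and is restored on exit, one client instance can serve requests with different suffixes without the setting leaking between calls or across threads.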
Base automatically changed from tonyknap/feat-buffered-decryption-aes-gcm to staging April 2, 2026 17:15
@texastony texastony force-pushed the tonyknap/examples branch from 8a7b73d to 016767a Compare April 2, 2026 17:21

@kessplas kessplas left a comment

Requesting changes to #159

Comment thread examples/src/delayed_auth_streaming_example.py

chunks = []
while True:
    chunk = body.read(CHUNK_SIZE)

Related to my other comment, we should probably wrap this with a try / except to demonstrate the behavior where an error is thrown during read.
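One possible shape for the suggested try / except, sketched with a stand-in stream. `FailsAtEnd` imitates a delayed-auth body whose final read raises when tag verification fails; the `ValueError` type is an assumption, since the exception the client actually raises is not shown here, and `read_all_or_nothing` is a hypothetical helper.

```python
import io

CHUNK_SIZE = 1024 * 1024


class FailsAtEnd(io.BytesIO):
    """Stand-in body that raises once exhausted, as a failed tag check would."""

    def read(self, amt=-1):
        chunk = super().read(amt)
        if not chunk:
            raise ValueError("authentication tag mismatch")
        return chunk


def read_all_or_nothing(body, chunk_size=CHUNK_SIZE):
    """Collect chunks, discarding everything if the stream raises mid-read."""
    chunks = []
    try:
        while True:
            chunk = body.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    except Exception:
        # With delayed authentication, chunks read so far are unauthenticated
        # plaintext and must not be used once verification fails.
        chunks.clear()
        raise
    return b"".join(chunks)


print(read_all_or_nothing(io.BytesIO(b"abc" * 10), chunk_size=8))
```

The point the example would demonstrate is that earlier chunks were already delivered by the time the error appears, so the caller is responsible for rolling back whatever it did with them.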

Comment thread examples/src/instruction_file_example.py
# The client will fetch "<key>.custom-suffix-instruction" for the encryption metadata.
custom_config = S3EncryptionClientConfig(
    keyring=keyring,
    instruction_file_suffix=".custom-suffix-instruction",

This needs to be updated to account for #159

Comment thread examples/src/kms_keyring_put_get_example.py
Comment thread examples/test/test_i_instruction_file_example.py Outdated
@texastony texastony requested a review from kessplas April 9, 2026 18:18

@kessplas kessplas left a comment

LGTM!

@kessplas kessplas merged commit 09f3b1e into staging Apr 16, 2026
5 checks passed
@texastony texastony deleted the tonyknap/examples branch April 16, 2026 18:21
@texastony texastony restored the tonyknap/examples branch April 16, 2026 18:21