Add caps to free-form executable content fields#147
Conversation
foxpatch-aleph
left a comment
There was a problem hiding this comment.
The PR adds useful input validation constraints to prevent resource exhaustion. However, there are redundant field definitions in child classes that override parent fields with identical constraints, and type inconsistencies where child classes use Optional[Dict] instead of Optional[Dict[str, Any]]. Additionally, no tests are added to verify the new constraints work correctly.
aleph_message/models/execution/instance.py (line 39): The metadata and authorized_keys fields are already defined in BaseExecutableContent with the same constraints. These overrides are redundant and create maintenance overhead. Consider removing them and inheriting from the parent class instead. Also, Optional[Dict] should be Optional[Dict[str, Any]] for type consistency with the parent.
aleph_message/models/execution/program.py (line 81): Same issue as InstanceContent: metadata and authorized_keys are redundantly overridden from BaseExecutableContent with identical constraints. Remove these and inherit from the parent. Also fix the type from Optional[Dict] to Optional[Dict[str, Any]].
aleph_message/tests/test_models.py (line 1): No tests are added to verify the new constraints. Consider adding tests that verify ValidationError is raised when: metadata exceeds 256 entries, authorized_keys exceeds 256 items or 8192 chars, variables exceeds 256 entries or key/value length limits, volumes exceeds 256 items, replaces exceeds 128 chars, and volume labels exceed 256 chars.
cd444f9 to
a1c0b7c
Compare
foxpatch-aleph
left a comment
There was a problem hiding this comment.
The PR adds useful constraints to prevent abuse of free-form fields, but has a significant issue: child classes (InstanceContent, ProgramContent) redundantly redefine metadata and authorized_keys fields that are already defined in the parent class BaseExecutableContent. This creates type inconsistencies (Optional[Dict] vs Optional[Dict[str, Any]]) and unnecessary duplication. The redundant field definitions should be removed from child classes to properly inherit from the parent. Additionally, no tests were added to verify the new validation constraints work correctly.
aleph_message/models/execution/instance.py (line 39): This metadata field is already defined in BaseExecutableContent with proper type Optional[Dict[str, Any]]. Redefining it here with Optional[Dict] loses type information. Remove this redundant definition to inherit from parent.
aleph_message/models/execution/instance.py (line 41): This authorized_keys field is already defined in BaseExecutableContent. Remove this redundant definition. The parent already has max_length=MAX_AUTHORIZED_KEYS and proper description.
aleph_message/models/execution/program.py (line 81): Both metadata and authorized_keys are already defined in BaseExecutableContent. Remove these redundant definitions from ProgramContent to properly inherit from the parent class.
aleph_message/tests/test_models.py (line 1): Consider adding tests to verify the new constraints work correctly. For example: test that a metadata dict with >256 entries raises ValidationError, test that an authorized_keys list with >256 entries raises ValidationError, test that a variable key >128 chars raises ValidationError, etc.
a1c0b7c to
9f2831d
Compare
foxpatch-aleph
left a comment
There was a problem hiding this comment.
This PR correctly adds validation caps to execution content fields using Pydantic's Field constraints. The implementation is sound: max_length properly constrains list lengths, dict key counts, and string lengths. The Annotated types for VariableKey/Value and AuthorizedKey correctly enforce per-item constraints. However, there are redundant field redeclarations in InstanceContent and ProgramContent that shadow inherited fields from BaseExecutableContent, and the PR lacks tests to verify the new constraints work as expected.
aleph_message/models/execution/instance.py (line 39): The metadata and authorized_keys fields (lines 39-45) are re-declared here but are already defined in BaseExecutableContent. This is redundant. Consider removing these redeclarations to avoid potential confusion and maintain DRY principles. Note: The type annotation Optional[Dict] here differs from the parent's Optional[Dict[str, Any]].
aleph_message/models/execution/program.py (line 81): Similarly, metadata and authorized_keys (lines 81-84) are re-declared despite being inherited from BaseExecutableContent. These can be removed to reduce redundancy.
aleph_message/tests/test_models.py (line 1): Consider adding tests for the new validation constraints. For example: test that metadata with >256 entries fails, test that authorized_keys with >256 keys fails, test that a key >8192 chars fails, test that variables with >256 entries fails, test that variable keys >128 chars or values >4096 chars fail, test that volumes >256 fail, test that replaces >128 chars fails, test that volume comment/mount/name >256 chars fails. This ensures the constraints are properly enforced.
|
Thanks for the review.
|
foxpatch-aleph
left a comment
There was a problem hiding this comment.
This PR correctly adds caps to free-form executable content fields to prevent abuse. The implementation uses Pydantic's max_length constraint appropriately for both collections (lists/dicts) and strings. The constants are clearly defined, the tests comprehensively cover boundary conditions (at limit and over limit), and all 9 new tests pass. The refactoring that removes duplicate field definitions from InstanceContent and ProgramContent is correct since they now properly inherit from BaseExecutableContent.
aleph_message/models/execution/abstract.py (line 28): Consider adding a docstring comment explaining why these specific limits were chosen (e.g., 'SSH keys typically max at ~4KB for RSA-4096, so 8192 provides headroom').
aleph_message/tests/test_models.py (line 779): Minor nit: The test comment says 'exercising comment is enough' but consider also testing mount on EphemeralVolume and name on PersistentVolume explicitly for clarity, even though they share the same constant.
Cap metadata (256 entries), authorized_keys (256 x 8192 chars), variables (256 x 128/4096), volumes (256 entries), replaces (128 chars) and volume comment/mount/name (256 chars).
The overrides were needed when BaseExecutableContent declared these fields as required (no default); the previous commit on this branch added default=None upstream, so the subclass redeclarations no longer carry their weight. They also weakened the type (Optional[Dict] vs the parent's Optional[Dict[str, Any]] with VariableKey/VariableValue annotations) and would let the caps drift between parent and child. Remove the redeclarations and prune the now-unused imports. The payment override stays because the parent field has no default.
One test per constraint, each exercising the at-limit pass and the over-limit rejection, keyed on the named constants so the assertions track the caps if they move: - metadata entries - authorized_keys list length and per-item length - variables entries, key length, value length - volumes list length - replaces length - volume label length (single test on EphemeralVolume.comment, since comment/mount/name share MAX_VOLUME_LABEL_LENGTH) Existing InstanceContent/ProgramContent fixtures cover the "valid" baseline, so these tests focus on the negative boundary.
a359493 to
f56ca52
Compare
foxpatch-aleph
left a comment
There was a problem hiding this comment.
This PR correctly implements caps on free-form executable content fields to prevent abuse. The implementation uses Pydantic's validation features appropriately with Annotated types for individual field lengths and max_length for collection sizes. All 9 new boundary tests pass and verify both the limit and over-limit cases. The refactoring in instance.py and program.py removes duplicate field definitions that are now properly inherited from BaseExecutableContent. No regressions in existing tests.
Cap metadata (256 entries), authorized_keys (256 x 8192 chars), variables (256 x 128/4096), volumes (256 entries), replaces (128 chars) and volume comment/mount/name (256 chars).
Test plan
hatch -e testing run pytest(unchanged — same 4 pre-existing network-flaky tests fail on main)