refactor: python package to be compatible with proto3 envelope#1849
Conversation
Reviewer's GuideRefactors the messaging stack and Python client to use a proto3 Envelope-based runtime message-type registry, replacing the old build-time index generation, updates build/test wiring to the new proto3 layout, and improves documentation/comments throughout the C++ messaging headers and tests. Sequence diagram for Python client initialization using proto3 EnvelopesequenceDiagram
actor User
participant MessageLib
participant Connection
participant envelope_pb2 as envelope_pb2_module
participant Envelope as Envelope_message
participant Server
User->>MessageLib: create MessageLib(...)
MessageLib->>Connection: __init__(...)
activate Connection
MessageLib->>Connection: registerEnvelope(envelope_pb2_module, "Envelope")
Connection->>envelope_pb2_module: getattr(envelope_pb2_module, "Envelope")
envelope_pb2_module-->>Connection: Envelope_message
Connection->>Envelope_message: read DESCRIPTOR.fields
loop for each message field
Connection->>Connection: store msg_type, desc in maps
end
Connection-->>MessageLib: mappings ready
deactivate Connection
MessageLib->>Connection: sendRecv(VersionRequest(), 10000)
activate Connection
Connection->>envelope_pb2_module: create VersionRequest()
Connection->>Connection: lookup msg_type via descriptor
Connection->>Envelope_message: wrap VersionRequest into Envelope
Connection->>Server: send framed Envelope
Server-->>Connection: reply Envelope frame
Connection->>Envelope_message: unwrap inner message
Connection-->>MessageLib: VersionReply, msg_type
deactivate Connection
MessageLib->>Connection: sendRecv(GetAuthStatusRequest(), 10000)
Connection->>Server: send framed Envelope
Server-->>Connection: AuthStatusReply Envelope
Connection-->>MessageLib: AuthStatusReply, msg_type
MessageLib-->>User: initialized, auth status known
Class diagram for updated Python client message-type registry and usageclassDiagram
direction LR
class Connection {
-logger _logger
-dict~int,Descriptor~ _msg_desc_by_type
-dict~string,Descriptor~ _msg_desc_by_name
-dict~Descriptor,int~ _msg_type_by_desc
+__del__()
+registerEnvelope(envelope_module, envelope_class_name="Envelope")
+registerProtocol(msg_module)
..other existing methods..
}
class MessageLib {
-Connection _conn
-string new_client_avail
-bool _auth
-string _uid
+__init__(...)
+manualAuthByPassword(uid, password)
+manualAuthByToken(token)
+getDailyMessage()
+sendRecv(msg, timeout=None)
..other existing methods..
}
class envelope_pb2 {
<<module>>
+class Envelope
+class VersionRequest
+class GetAuthStatusRequest
+class AuthenticateByPasswordRequest
+class AuthenticateByTokenRequest
+class DailyMessageRequest
..other message types..
}
class Envelope_py {
<<protobuf message>>
+DESCRIPTOR
+fields
}
class Descriptor {
<<protobuf Descriptor>>
+string name
+int number
+Descriptor message_type
+repeated FieldDescriptor fields
}
Connection --> Descriptor : uses in mappings
Connection o--> envelope_pb2 : during registerEnvelope
envelope_pb2 --> Envelope_py : class Envelope
Envelope_py --> Descriptor : DESCRIPTOR
MessageLib --> Connection : composition
MessageLib ..> envelope_pb2 : uses message classes
class anon_pb2 {
<<deprecated module>>
}
class auth_pb2 {
<<deprecated module>>
}
MessageLib ..x anon_pb2 : replaced by envelope_pb2
MessageLib ..x auth_pb2 : replaced by envelope_pb2
note for Connection "registerProtocol is deprecated; registerEnvelope inspects Envelope.DESCRIPTOR.fields and builds type maps at runtime"
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In
registerEnvelope, consider adding basic validation/error handling (e.g., check thatenvelope_class_nameexists on the module and that it has aDESCRIPTOR) and avoid silently doing nothing or raising an obscureAttributeErrorwhen the wrong module/class is passed. - When deriving message types from the envelope descriptor, you may want to restrict registration to only the fields that are intended as payloads (for example, by checking
field.containing_oneofor a naming convention) to avoid accidentally registering internal or future metadata fields as message types.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `registerEnvelope`, consider adding basic validation/error handling (e.g., check that `envelope_class_name` exists on the module and that it has a `DESCRIPTOR`) and avoid silently doing nothing or raising an obscure `AttributeError` when the wrong module/class is passed.
- When deriving message types from the envelope descriptor, you may want to restrict registration to only the fields that are intended as payloads (for example, by checking `field.containing_oneof` or a naming convention) to avoid accidentally registering internal or future metadata fields as message types.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
….com:ORNL/DataFed into 1845-DAPS-refactor-repo-server-to-use-proto3
…ub.com:ORNL/DataFed into 1847-DAPS-refactor-python-client-to-use-proto3
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The MessageAttribute::toString documentation now calls out all four enum values but the implementation still only handles ID and KEY; either extend the function to cover STATE and CORRELATION_ID or adjust the comment to avoid implying they are supported.
- ProtoBufMap::requiresAuth still uses a hard-coded set of anonymous message names; consider deriving this from the same envelope/type registry used elsewhere so that new or renamed message types don't silently get the wrong auth behavior.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The MessageAttribute::toString documentation now calls out all four enum values but the implementation still only handles ID and KEY; either extend the function to cover STATE and CORRELATION_ID or adjust the comment to avoid implying they are supported.
- ProtoBufMap::requiresAuth still uses a hard-coded set of anonymous message names; consider deriving this from the same envelope/type registry used elsewhere so that new or renamed message types don't silently get the wrong auth behavior.
## Individual Comments
### Comment 1
<location> `python/datafed_pkg/datafed/Connection.py:164-171` </location>
<code_context>
+ #
# @param msg_module - Protobuf module (imported *_pb2 module)
#
def registerProtocol(self, msg_module):
- # Message descriptors are stored by name created by protobuf compiler
- # A custom post-proc tool generates and appends _msg_name_to_type with
- # defined DataFed-sepcific numer message types
-
+ import warnings
+ warnings.warn(
+ "registerProtocol() is deprecated, use registerEnvelope() instead",
+ DeprecationWarning,
+ stacklevel=2,
+ )
for name, desc in sorted(msg_module.DESCRIPTOR.message_types_by_name.items()):
msg_t = msg_module._msg_name_to_type[name]
self._msg_desc_by_type[msg_t] = desc
</code_context>
<issue_to_address>
**issue (bug_risk):** Deprecated registerProtocol still relies on _msg_name_to_type even though the generator was removed.
Because `pyproto_add_msg_idx.py` was removed, this now assumes `msg_module._msg_name_to_type` exists when it may not, causing `AttributeError` at runtime for any remaining `registerProtocol` callers. Please either add a compatibility path that derives the mapping without `_msg_name_to_type`, or make this method fail fast with a clear error (or remove it) so consumers don’t hit a confusing runtime failure.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def registerProtocol(self, msg_module): | ||
| # Message descriptors are stored by name created by protobuf compiler | ||
| # A custom post-proc tool generates and appends _msg_name_to_type with | ||
| # defined DataFed-sepcific numer message types | ||
|
|
||
| import warnings | ||
| warnings.warn( | ||
| "registerProtocol() is deprecated, use registerEnvelope() instead", | ||
| DeprecationWarning, | ||
| stacklevel=2, | ||
| ) | ||
| for name, desc in sorted(msg_module.DESCRIPTOR.message_types_by_name.items()): |
There was a problem hiding this comment.
issue (bug_risk): Deprecated registerProtocol still relies on _msg_name_to_type even though the generator was removed.
Because pyproto_add_msg_idx.py was removed, this now assumes msg_module._msg_name_to_type exists when it may not, causing AttributeError at runtime for any remaining registerProtocol callers. Please either add a compatibility path that derives the mapping without _msg_name_to_type, or make this method fail fast with a clear error (or remove it) so consumers don’t hit a confusing runtime failure.
…-to-use-proto3' into 1847-DAPS-refactor-python-client-to-use-proto3
* [DAPS-1850] - refactor web server proto3 (#1852)
c531a4b
into
1845-DAPS-refactor-repo-server-to-use-proto3
Ticket
Description
How Has This Been Tested?
Artifacts (if appropriate):
Tasks
Summary by Sourcery
Adopt a proto3 envelope-based messaging scheme across C++ and Python components, deriving message type IDs from the Envelope descriptor at runtime and aligning framing, mapping, and tests with the new protocol.
Enhancements:
Build:
Tests: