`amber/src/main/scala/org/apache/texera/amber/engine/architecture/sendsemantics/partitioners/Partitioner.scala` defines the `NetworkOutputBuffer` class, the per-receiver batched-tuple sender that every concrete `Partitioner` subclass uses to push data downstream. It has stateful buffer and flush semantics that nothing currently pins:
| Path | Behavior to pin |
| --- | --- |
| `addTuple` below `batchSize` | Tuple appended to the internal buffer; nothing sent yet. |
| `addTuple` at the exact `batchSize` boundary | Auto-flush: a single `DataFrame` carrying every buffered tuple is sent to the receiver. |
| Multiple auto-flush cycles | Each batch produces exactly one `DataFrame`; tuples don't leak across batches; sequence numbers and receiver stay consistent. |
| `flush()` on a non-empty buffer | One `DataFrame` sent; buffer reset to empty. |
| `flush()` on an empty buffer | No-op (no `DataFrame` sent); pin this so a regression that sends an empty `DataFrame` per call breaks here. |
| `sendState` | Flush-bookended: any pending tuples drain first (in their own `DataFrame`), then a `StateFrame` is sent, then a trailing `flush()` runs (a no-op, since the post-state buffer is already empty). The bookend flushes preserve state ordering relative to data. |
| Constructor params | `batchSize` defaults to `ApplicationConfig.defaultDataTransferBatchSize` and can be overridden; `to` and `dataOutputPort` are exposed as `val`. |
The test wires a real `NetworkOutputGateway` with a recording handler, so the assertions exercise the production codepath end-to-end (sequence number assignment, channel-id construction, payload type) rather than mocking the gateway.
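The behaviors in the table above can be sketched against a minimal self-contained model. Note the assumptions: `BufferModel`, the `Int` tuple payloads, the simplified `DataFrame`/`StateFrame` case classes, and the `sendTo` callback standing in for the gateway are all illustrative stand-ins, not the real Amber types; the actual test would construct `NetworkOutputBuffer` with a real `NetworkOutputGateway` as described.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-ins for the real Amber payload types (assumption: names only).
sealed trait Payload
final case class DataFrame(tuples: Seq[Int]) extends Payload
final case class StateFrame(marker: String) extends Payload

// Hypothetical model of the batching semantics described in the table above.
final class BufferModel(batchSize: Int, sendTo: Payload => Unit) {
  private val buffer = ArrayBuffer.empty[Int]

  def addTuple(t: Int): Unit = {
    buffer += t
    if (buffer.size >= batchSize) flush() // auto-flush at the batch boundary
  }

  def flush(): Unit =
    if (buffer.nonEmpty) {               // flushing an empty buffer is a no-op
      sendTo(DataFrame(buffer.toList))   // exactly one DataFrame per batch
      buffer.clear()
    }

  def sendState(marker: String): Unit = {
    flush()                              // drain pending tuples in their own DataFrame
    sendTo(StateFrame(marker))
    flush()                              // trailing flush; no-op, buffer already empty
  }
}

object PinnedSemanticsDemo {
  def main(args: Array[String]): Unit = {
    val sent = ArrayBuffer.empty[Payload] // recording "handler"
    val buf  = new BufferModel(batchSize = 3, sent += _)

    buf.addTuple(1); buf.addTuple(2)      // below batchSize: nothing sent yet
    assert(sent.isEmpty)

    buf.addTuple(3)                       // exact boundary: auto-flush fires
    assert(sent == Seq(DataFrame(Seq(1, 2, 3))))

    buf.flush()                           // empty buffer: no DataFrame sent
    assert(sent.size == 1)

    buf.addTuple(4)
    buf.sendState("epoch-1")              // flush-bookended state send
    assert(
      sent == Seq(DataFrame(Seq(1, 2, 3)), DataFrame(Seq(4)), StateFrame("epoch-1"))
    )
    println("all pinned behaviors hold: " + sent.size + " payloads sent")
  }
}
```

The recording-handler pattern here mirrors the intended test wiring: instead of mocking, every payload the buffer emits is captured in order, so batch boundaries, no-op flushes, and data-before-state ordering can all be asserted directly.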
Priority
P3 - Low
Task Type