Support stream DataFrame interface in iotdb python client #17035
base: master
Pull request overview
This pull request adds a streaming DataFrame interface to the IoTDB Python client, enabling efficient retrieval of large query results in configurable chunks.
Changes:
- Introduced `has_next_df()` and `next_df()` methods to `SessionDataSet` for iterating over query results as DataFrames
- Implemented internal buffering mechanism in `IoTDBRpcDataSet` with `next_dataframe()` method that returns chunks of exactly `fetch_size` rows
- Refactored `result_set_to_pandas()` into `_process_buffer()` and `_build_dataframe()` helper methods to support both batch and streaming DataFrame operations
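The buffering idea described above can be illustrated with a minimal sketch. This is not the PR's implementation: `ChunkBuffer`, `has_next()`, `next_chunk()`, and the `Row` type are hypothetical names, and plain lists of rows stand in for DataFrames so the example runs without pandas or an IoTDB server. It assumes the server returns batches of arbitrary size and that the client re-chunks them into pieces of `fetch_size` rows; in this sketch the final chunk may be shorter, which the PR text does not confirm either way.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[int, float]  # hypothetical (timestamp, value) row


class ChunkBuffer:
    """Re-chunks variable-sized fetch batches into chunks of fetch_size rows."""

    def __init__(self, batches: Iterable[List[Row]], fetch_size: int) -> None:
        if fetch_size <= 0:
            raise ValueError("fetch_size must be positive")
        self._batches: Iterator[List[Row]] = iter(batches)
        self._fetch_size = fetch_size
        self._buffer: List[Row] = []  # plays the role of an internal row buffer

    def _fill(self) -> None:
        # Pull batches until a full chunk is buffered or the source is exhausted.
        while len(self._buffer) < self._fetch_size:
            batch = next(self._batches, None)
            if batch is None:
                break
            self._buffer.extend(batch)

    def has_next(self) -> bool:
        self._fill()
        return bool(self._buffer)

    def next_chunk(self) -> Optional[List[Row]]:
        self._fill()
        if not self._buffer:
            return None
        # Hand out at most fetch_size rows and keep the remainder buffered.
        chunk = self._buffer[: self._fetch_size]
        self._buffer = self._buffer[self._fetch_size:]
        return chunk
```

Keeping the remainder in the buffer is what decouples the chunk size seen by the caller from the batch size delivered by the underlying fetch.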
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| iotdb-client/client-py/iotdb/utils/iotdb_rpc_dataset.py | Added streaming buffer (__df_buffer), implemented next_dataframe() method for chunked DataFrame retrieval, refactored DataFrame construction into reusable helper methods, added Optional type import |
| iotdb-client/client-py/iotdb/utils/SessionDataSet.py | Added public API methods has_next_df() and next_df() that delegate to underlying RPC dataset streaming implementation |
| iotdb-client/client-py/table_model_session_example.py | Demonstrated usage of new streaming API with has_next_df()/next_df() pattern |
| iotdb-client/client-py/session_example.py | Demonstrated usage of new streaming API with has_next_df()/next_df() pattern |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files (coverage diff):
| Metric | master | #17035 | +/- |
|---|---|---|---|
| Coverage | 39.28% | 39.28% | -0.01% |
| Complexity | 212 | 212 | |
| Files | 5104 | 5104 | |
| Lines | 341484 | 341490 | +6 |
| Branches | 43520 | 43522 | +2 |
| Hits | 134149 | 134148 | -1 |
| Misses | 207335 | 207342 | +7 |
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
This pull request introduces a new streaming DataFrame interface to the IoTDB Python client, enabling users to efficiently fetch large query results in manageable chunks. The main changes add support for iterating over result sets by DataFrame, buffering data internally, and updating example scripts to demonstrate the new API.
Streaming DataFrame API:
- Added `has_next_df()` and `next_df()` methods to `SessionDataSet` to allow users to check for and retrieve the next chunk of results as a DataFrame.
- Added internal buffering to `IoTDBRpcDataSet` with `__df_buffer`, and provided methods `_has_buffered_data()` and `next_dataframe()` to manage and return DataFrames of size `fetch_size`.
- Refactored DataFrame construction into `_process_buffer()` and `_build_dataframe()` to support both streaming and batch DataFrame construction.

Example and usability improvements:
- Updated `session_example.py` and `table_model_session_example.py` to demonstrate usage of the new streaming DataFrame API.
- Imported `Optional` to support new method signatures.
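The `has_next_df()`/`next_df()` pattern mentioned above might be consumed roughly as follows. This is a sketch under stated assumptions, not code from the PR: `for_each_chunk` and `FakeDataSet` are hypothetical helpers, and the fake yields plain lists where the real `SessionDataSet` would yield DataFrames, so the loop can be run without a server.

```python
from typing import Any, Callable, List


def for_each_chunk(dataset: Any, handle: Callable[[Any], None]) -> int:
    """Drain a dataset exposing has_next_df()/next_df(); return the chunk count."""
    count = 0
    while dataset.has_next_df():
        # Only one chunk is held at a time, so memory use stays bounded.
        handle(dataset.next_df())
        count += 1
    return count


class FakeDataSet:
    """Hypothetical stand-in for SessionDataSet, used only for this demo."""

    def __init__(self, chunks: List[Any]) -> None:
        self._chunks = list(chunks)

    def has_next_df(self) -> bool:
        return bool(self._chunks)

    def next_df(self) -> Any:
        return self._chunks.pop(0)


if __name__ == "__main__":
    seen: List[Any] = []
    n = for_each_chunk(FakeDataSet([[1, 2], [3, 4], [5]]), seen.append)
    print(n, seen)
```

With a real session, the `dataset` argument would be the result set returned by a query call, and `handle` would receive one DataFrame per iteration.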