-
Notifications
You must be signed in to change notification settings - Fork 4.2k
GH-49644: [Python] Support converting list of multi-dimensional arrays to FixedShapeTensor #50203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
84154db
e695d01
daed54f
832acee
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -908,8 +908,20 @@ class PyListConverter : public ListConverter<T, PyConverter, PyConverterTrait> { | |||||
|
|
||||||
| Status AppendNdarray(PyObject* value) { | ||||||
| PyArrayObject* ndarray = reinterpret_cast<PyArrayObject*>(value); | ||||||
| OwnedRef flattened; | ||||||
| if (PyArray_NDIM(ndarray) != 1) { | ||||||
| return Status::Invalid("Can only convert 1-dimensional array values"); | ||||||
| // GH-49644: a fixed-size list (e.g. the storage of a fixed-shape tensor) | ||||||
| // can be built from a multi-dimensional array by flattening it in C | ||||||
| // order. The total number of elements must still match the list size, | ||||||
| // which the builder validates below. 0-dimensional arrays and | ||||||
| // variable-sized lists remain restricted to 1-dimensional values. | ||||||
|
Comment on lines
+913
to
+917
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This could be less verbose.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. okay sir 👍 |
||||||
| if (PyArray_NDIM(ndarray) < 2 || this->list_type_->id() != Type::FIXED_SIZE_LIST) { | ||||||
| return Status::Invalid("Can only convert 1-dimensional array values"); | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should probably say something like:
Suggested change
|
||||||
| } | ||||||
| flattened.reset(PyArray_Ravel(ndarray, NPY_CORDER)); | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why did you decided to use
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. PyArray_Ravel(ndarray, NPY_CORDER) returns a 1-D, C-order view, which lets me reuse the existing 1-D append paths unchanged (the typed fast paths and the PySequence fallback). It's zero-copy when the input is already C-contiguous and only copies for non-contiguous/Fortran-order input, where a copy is unavoidable to get C-order data anyway, which also matches the fixed-shape-tensor's row-major storage. Alternatives: PyArray_Flatten(NPY_CORDER) gives the same result but always copies; a manual NpyIter/buffer loop could avoid even the view allocation but would mean duplicating the typed-append fast paths. Ravel was the smallest change with the best reuse/performance balance.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you have a look at I am not sure which would fit better but am interested in what you think.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also to note here - PyArray_Ravel seems to default to C order. So we should document somewhere that no matter the input arrays order we will return C order. (@AlenkaF please doublecheck me here :) )
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please look into @AlenkaF proposal here, it might be better to have the explicit approach here.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @AlenkaF PyArray_Ravel returns a 1-D C-order view so I could reuse the existing 1-D append paths, but per the discussion I'll switch to the explicit PyArray_CheckFromAny + NPY_ARRAY_C_CONTIGUOUS approach and read PyArray_DATA directly, it's more idiomatic here and lets me handle byte order in the same call.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rok Will do, switching to the explicit PyArray_CheckFromAny + NPY_ARRAY_C_CONTIGUOUS approach and reading PyArray_DATA directly, matching the existing pattern. That should also fold in the byte-order handling.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great, I think having more control can be helpful. Thanks for looking into it!
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you 🙏, i'm currently taking a look at the reference |
||||||
| RETURN_IF_PYERROR(); | ||||||
| value = flattened.obj(); | ||||||
| ndarray = reinterpret_cast<PyArrayObject*>(value); | ||||||
| } | ||||||
| if (PyArray_ISBYTESWAPPED(ndarray)) { | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder how |
||||||
| // TODO | ||||||
|
|
||||||
Uh oh!
There was an error while loading. Please reload this page.