Skip to content

Raise error when multiple IPC streams are received#23

Open
pmm-motif wants to merge 1 commit intoduckdb:mainfrom
pmm-motif:arrow-scan-multi-ipcs
Open

Raise error when multiple IPC streams are received#23
pmm-motif wants to merge 1 commit intoduckdb:mainfrom
pmm-motif:arrow-scan-multi-ipcs

Conversation

@pmm-motif
Copy link

@pmm-motif pmm-motif commented Nov 25, 2023

It's perhaps an obscure case, but I had issues with using Node.js DuckDB library with register_buffer() when multiple Arrow Table streams were passed as an argument. This use-case somewhat works with DuckDB WASM (it's possible to append additional Arrow Tables) so I initially thought it's some sort of bug.

However, as I was investigating this, I realized that I'm probably using it incorrectly in the first place (though it's not clearly documented). I.e. when multiple IPC Buffers are provided, they must be all part of the same Arrow IPC Stream.

The underlying Arrow IPC Reader seems to be silently stopping when getting into EOS. What I'm proposing in this PR is detecting when more messages are available after getting EOS and throwing an exception rather than just silently consuming these.

It's plausible that I'm missing something or that this check should go Apache Arrow instead. Curious of your thoughts on this

Thank you!

@pmm-motif pmm-motif force-pushed the arrow-scan-multi-ipcs branch from 503d783 to 9981d03 Compare November 25, 2023 19:49
@pmm-motif pmm-motif changed the title Raise error when multiple IPC tables are received Raise error when multiple IPC streams are received Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant