I've been handed a database in which one of the fields contains XML documents as compressed blobs (ZIP/PK archives). Rather than writing each blob out to disk and reading it back in with read_xml, I tried using the archive package's bindings to libarchive to read it directly. It works great. However, while trying to close the connections properly afterward, I learned that R crashes if I (incorrectly) close the rawConnection object before calling read_xml.
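# Minimal reproduction: raw bytes wrapped in a connection and handed to archive
xml_text <- '<text>Hi</text>'
blob <- charToRaw(xml_text)
blob_con <- rawConnection(blob)
spec_con <- archive::archive_read(blob_con)
# Closing the underlying raw connection before reading is what triggers the crash
close(blob_con)
xml2::read_xml(spec_con)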
The error I get when R crashes is:
Error in (function (con, rw = "") : invalid connection
terminate called after throwing an instance of 'cpp11::unwind_exception'
what(): std::exception
I honestly don't know whether this is an archive problem or an xml2 problem, so I'm happy to repost elsewhere if needed. I don't know enough about the internals to tell where the problem actually lies; read_xml is simply the call that crashes at the moment.
I'm on R 4.5.1 with xml2 1.3.8 and archive version 1.1.12.
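For comparison, here is a minimal sketch of the ordering that avoids the crash: read first, then close the underlying raw connection. The helper name `read_xml_blob` is hypothetical and only for illustration. I'm assuming nothing about whether archive or xml2 close any of these connections themselves, so both cleanup calls are wrapped in `try()` to tolerate a connection that has already been closed.

```r
# Hypothetical helper (not part of archive or xml2): parse XML from a raw blob
# while keeping the underlying rawConnection alive until reading has finished.
read_xml_blob <- function(blob) {
  blob_con <- rawConnection(blob)
  # Registered first, so it runs last on exit. try() guards against the
  # connection already having been closed by something downstream (an
  # assumption, since I don't know the internals).
  on.exit(try(close(blob_con), silent = TRUE), add = TRUE)

  spec_con <- archive::archive_read(blob_con)
  # after = FALSE puts this cleanup ahead of the one above, so the archive
  # connection is closed before the raw connection it reads from.
  on.exit(try(close(spec_con), silent = TRUE), add = TRUE, after = FALSE)

  xml2::read_xml(spec_con)
}

doc <- read_xml_blob(charToRaw('<text>Hi</text>'))
```

This only works around the ordering mistake on my side. It doesn't change the fact that closing the raw connection too early takes down the whole R session instead of raising an ordinary error.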