Split source XML files to reduce XML-to-SPARQL ETL memory consumption #6

- [ ] Split each source file into a folder of individual record files, using a streaming file splitter.
- [ ] Refactor the XML-to-SPARQL pipeline to load record files individually from these folders and pass each one to the RDF conversion XSLT.
- [ ] Pass the record's type to the conversion XSLT as a parameter (replacing the [file type recognition code](https://github.com/NationalMuseumAustralia/Collection-API-ETL/blob/master/emu-to-crm.xsl#L17-L33) in the XSLT).
- [ ] Replace the [stylesheet which marks some Piction images as preferred](https://github.com/NationalMuseumAustralia/Collection-API-ETL/blob/c8138c3e4b6564424f0368657dcfa3d21fe6117a/add-preferred-tag-to-piction-images.xsl) with an equivalent SPARQL update query.
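
As a rough illustration of the first task, a streaming splitter could look something like the sketch below. It is written in Python purely for illustration and is not the pipeline's actual implementation. The function name `split_records`, the element name `record`, and the output file naming scheme are all hypothetical; the real source files may use different element names or XML namespaces.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_records(source_path, out_dir, record_tag="record"):
    """Write each direct child of the root named `record_tag` to its own file.

    `record_tag` is an assumed element name. For namespaced sources it
    would need the "{namespace-uri}local-name" form used by ElementTree.
    Returns the number of record files written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    count = 0
    depth = 0
    root = None

    # iterparse reads the file incrementally instead of building the
    # whole tree up front, which is what keeps memory use low.
    for event, elem in ET.iterparse(source_path, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first element opened is the document root
            depth += 1
            continue

        depth -= 1
        # After the decrement, depth == 1 means `elem` is a direct child
        # of the root, so nested elements with the same name are ignored.
        if depth == 1 and elem.tag == record_tag:
            count += 1
            elem.tail = None  # drop inter-record whitespace
            target = out / f"record-{count:06d}.xml"
            ET.ElementTree(elem).write(
                str(target), encoding="utf-8", xml_declaration=True
            )
            # Detach the finished record so it can be garbage collected.
            root.remove(elem)

    return count
```

Calling `split_records("source.xml", "records/")` (both paths hypothetical) would fill the folder with one small file per record. Because each finished record is detached from the root as soon as it is written, peak memory should depend on the largest single record rather than on the size of the whole source file.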
