Split source XML files to reduce XML-to-SPARQL ETL memory consumption #6

@Conal-Tuohy

  • Split each source file into a folder of individual record files, using a streaming file splitter.
  • Refactor the XML-to-SPARQL pipeline to load record files individually from these folders and pass each one to the RDF conversion XSLT.
  • Pass the record's type to the conversion XSLT as a parameter, replacing the file-type recognition code in the XSLT.
  • Replace the stylesheet that marks some Piction images as preferred with an equivalent SPARQL update query.
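
To illustrate the first step, here is a minimal Python sketch of a streaming splitter. It is not the project's actual splitter: the function name `split_records`, the record element name `record`, and the numbered output filenames are all assumptions made for the example. The idea is only that each top-level record is written out and discarded as soon as it has been read, so memory use does not grow with the size of the source file.

```python
import os
import xml.etree.ElementTree as ET


def split_records(source, out_dir, record_tag="record"):
    """Stream `source` and write each top-level <record_tag> to its own file.

    `record_tag` is a hypothetical element name; for namespaced sources it
    would need the "{namespace-uri}local-name" form that ElementTree uses.
    Returns the number of record files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    depth = 0
    root = None
    # "start" and "end" events let us track nesting depth, so that only
    # direct children of the document element are treated as records.
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # first start event is the document element
            depth += 1
            continue
        depth -= 1
        if depth == 1 and elem.tag == record_tag:
            count += 1
            elem.tail = None  # do not copy inter-record whitespace
            path = os.path.join(out_dir, f"{count:06d}.xml")
            ET.ElementTree(elem).write(
                path, encoding="utf-8", xml_declaration=True
            )
            # Detach the finished record so it can be garbage-collected.
            root.remove(elem)
    return count
```

A pipeline along the lines described above could then iterate over the files in `out_dir`, passing each one to the conversion XSLT together with a parameter naming the record type.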
