Transformations
Pipelines, document transformation, file format conversion.
ETL (Extract, Transform, Load): structured data and smaller data volumes, better quality control and data governance.
ELT (Extract, Load, Transform): large data volumes, unstructured data, complex transformations, reduced latency and faster data loading.
Cleansing, data wrangling.
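The practical difference between ETL and ELT is where the transform step runs. The following is a minimal sketch, not a reference implementation: it assumes an in-memory SQLite database stands in for the target store, and `RAW_ROWS`, `clean`, `etl` and `elt` are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical raw source rows: (name, amount as text), with messy whitespace and case.
RAW_ROWS = [(" alice ", "10.5"), ("BOB", "3"), ("  Carol", "7.25")]


def clean(name, amount):
    """Transform step: trim and lower-case the name, parse the amount."""
    return name.strip().lower(), float(amount)


def etl(rows):
    """ETL: transform in application code first, then load only clean rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (name TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", [clean(n, a) for n, a in rows])
    return db.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()


def elt(rows):
    """ELT: load the raw rows as-is, then transform inside the target store."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    db.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    # The transform runs as SQL in the database, after loading.
    db.execute(
        "CREATE TABLE sales AS "
        "SELECT LOWER(TRIM(name)) AS name, CAST(amount AS REAL) AS amount "
        "FROM raw_sales"
    )
    return db.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
```

Both functions end with the same clean table. In the ELT variant the raw rows stay in `raw_sales`, so they can be re-transformed later without going back to the source.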
XSLT data transformations
| Input Documents | Transformation Application | Output Documents |
|---|---|---|
| XML input, data to transform >>>> | | |
| | >>>> XSLT processor & bespoke code >>>> | Result documents, output |
| XSLT code, transformation rules >>>> | | |
XProc XML pipelines <todo: table as for xslt>
Other file processing pipelines
Transformation
- XML W3C, WS input, throughput, output
- XProc W3C, WP, WS, an XML transformation language, pipelines, xml, xhtml, text, json, binary
- XSLT W3C, WP, transformation rules
- Regular expressions - supported in XSLT 2.0, aka rational expressions,
Data model throughput
- XSD W3C, schema for xml docs,
- XSL W3C, stylesheet for xml docs,
- XQuery W3C, querying xml docs
- XPath W3C, navigating xml docs
- XDM - XQuery & XPath Data Model
- XSL FO W3C, output to PDF, PostScript, to be deprecated? discontinued in 2013?
- XForms W3C, programmatic,
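Data input - an XSLT 1.0 processor will only take well-formed XML as data input; XSLT 2.0 can also read plain text with unparsed-text() and XSLT 3.0 can read JSON with json-to-xml()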
- XML
- RDB tables, as XML
- GIS, as XML
- XML output from others sources
To data i/o - the linked data world of things; for an XSLT 1.0 processor, non-XML input must first be parsed and transformed to XML by other means,
- JSON WP, serialisation
- CSV WP, flat file, comma separated
- RDF W3C, WP, SPO triples
- OWL W3C, WP, ontology
- HTML IETF, WP, presentation
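Non-XML input such as CSV has to become XML before an XSLT 1.0 processor can use it. The following is a minimal sketch of that pre-processing step using only the Python standard library. The function name `csv_to_xml` and the element names `rows` and `row` are assumptions made for illustration, and the sketch assumes the CSV header names are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Convert CSV text with a header line into an XML string.

    Assumes every header name is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for field, value in record.items():
            # One child element per CSV column, named after the header.
            ET.SubElement(row, field).text = value
    return ET.tostring(root, encoding="unicode")


sample = "greeting,target\nhello,world\n"
print(csv_to_xml(sample))
# prints <rows><row><greeting>hello</greeting><target>world</target></row></rows>
```

The resulting XML can then be handed to any XSLT processor as an ordinary source document.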
To print
- PDF Adobe
- PostScript Adobe?
To image?
- PNG?
- A N Other?
XSLT 3.0 support for
- Java
- .NET
- C/C++
- Python
- PHP
- Node.js
Browser
- XSLT 3.0 JavaScript lib can be hosted in browser
- Modern browser native support for XSLT 1.0
Tools
- Cocoon, WP, ASF,
- XML Calabash WS open source XProc processor,
- MorganaXProc WS, open source XProc processor,
- XSLT Tester,
- XSLT Fiddle
- XSDEditor, free open source,
Libs
XSLT libs
- libxslt - C, XSLT 1.0, & more, GNOME
- libxml2 - C, XML parsing, GNOME
- Saxon - Java, JavaScript, .NET, open/closed source, XSLT 3.0, ... WS
- Xalan - Java, C++, XSLT 1.0, XPath 1.0, ASF, WS
XML libs
- StAX - Java, streaming XML processing, pull based parser, large XML docs, no XSLT | XQuery engines, no XSD validation
- SAX - Java, XML processing, push based parser
- Xerces2 - Java, XML schema validation, XSD 1.1
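The push versus pull distinction can be sketched in Python, whose standard library offers rough analogues of the two Java APIs: `xml.sax` is push based, and `xml.etree.ElementTree.iterparse` lets the caller pull events in its own loop. This is an illustration of the two styles under those assumptions, not of StAX or SAX themselves; `DOC`, `ItemHandler`, `push_items` and `pull_items` are hypothetical names.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET

DOC = "<items><item>a</item><item>b</item></items>"


class ItemHandler(xml.sax.ContentHandler):
    """Push style: the parser calls these methods as it meets each event."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self.items.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign.
        if self._in_item:
            self.items[-1] += content

    def endElement(self, name):
        if name == "item":
            self._in_item = False


def push_items(doc):
    handler = ItemHandler()
    xml.sax.parseString(doc.encode("utf-8"), handler)
    return handler.items


def pull_items(doc):
    """Pull style: the caller asks for the next event in its own loop."""
    items = []
    source = io.BytesIO(doc.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            items.append(elem.text)
            elem.clear()  # release the element's content to keep memory low
    return items
```

Both functions return the same list for `DOC`. The pull version keeps control flow in the caller, which is often easier to follow when only part of a large document is needed.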
CSV libs
- Apache Commons CSV - Java, CSV file parsing and generation.
JSON libs <todo: consider moving>
- Jackson GH, WP, WS, Java, serialisation, JSON, Avro, BSON, CBOR, CSV, Smile, Properties, Protobuf, TOML, XML, and YAML
- Gson GH, WP Java, serialization, JSON, Google
Command line processors
- many of the software libraries listed elsewhere.
- xsltproc - Linux/macOS
- xml2json - Linux/macOS
- xsltjson - ?
- msxsl - Windows - discontinued, posed security threat
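Examples (xsltproc implements XSLT 1.0, so the source document must be XML; a CSV file has to be converted to XML first)
- xsltproc transform.xsl hello_world.xml > hello_world_out.xml
- xsltproc transform.xsl hello_world.xml | xml2json -i xml -o json > hello_world.json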
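The XML to JSON step at the end of such a pipeline can also be sketched in Python. There is no single standard mapping from XML to JSON, so the convention below is an assumption made for illustration and is not claimed to match the output of xml2json: leaf elements become strings, other elements become objects keyed by child tag, repeated child tags collect into a list, and attributes and mixed content are ignored. The names `element_to_obj` and `xml_to_json` are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def element_to_obj(elem):
    """Map an element to a JSON-ready value using the assumed convention."""
    children = list(elem)
    if not children:
        return elem.text or ""
    obj = {}
    for child in children:
        value = element_to_obj(child)
        if child.tag in obj:
            # Second or later occurrence of this tag: collect into a list.
            if not isinstance(obj[child.tag], list):
                obj[child.tag] = [obj[child.tag]]
            obj[child.tag].append(value)
        else:
            obj[child.tag] = value
    return obj


def xml_to_json(xml_text):
    """Parse XML text and serialise it as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_obj(root)}, sort_keys=True)


doc = "<rows><row><greeting>hello</greeting></row><row><greeting>hi</greeting></row></rows>"
print(xml_to_json(doc))
```

A mapping that has to round-trip attributes, namespaces or mixed content would need a richer convention than this one.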
File extensions
- xml, xaml, xsl, xslt, xsd, dtd, xul, kml, svg, mxml, xsml,
- json, n3, owl, rdf, rdfs, ttl, ...
- html, xhtml, pdf,
- fo?
- <todo: others to list, >
References