
Transformations

York Earwaker edited this page Sep 26, 2024 · 38 revisions


Pipelines, document transformation, file format conversion,

ETL (Extract, Transform, Load): structured data and smaller data volumes, better quality control and data governance

ELT (Extract, Load, Transform): large data volumes, unstructured data, complex transformations, reduced latency and faster data loading
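The difference in ordering between ETL and ELT can be sketched with an in-memory SQLite database. This is only an illustration: the sample rows, the `clean` helper and the table names are made up for the example, and a real ELT target would normally be a data warehouse rather than SQLite.

```python
import sqlite3

# Made-up raw input: names with stray whitespace, ages as text.
RAW_ROWS = [("  Alice ", "42"), ("bob", "n/a"), ("Carol", "17")]

def clean(name, age):
    """Transform step: trim and capitalise the name, parse the age or use None."""
    return name.strip().title(), int(age) if age.isdecimal() else None

def etl(rows):
    """Extract, Transform, Load: only cleaned rows ever reach the database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE people (name TEXT, age INTEGER)")
    db.executemany("INSERT INTO people VALUES (?, ?)", [clean(n, a) for n, a in rows])
    return db.execute("SELECT name, age FROM people ORDER BY name").fetchall()

def elt(rows):
    """Extract, Load, Transform: raw rows are loaded first, then transformed in SQL."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_people (name TEXT, age TEXT)")
    db.executemany("INSERT INTO raw_people VALUES (?, ?)", rows)
    db.execute(
        """
        CREATE TABLE people AS
        SELECT upper(substr(trim(name), 1, 1)) || lower(substr(trim(name), 2)) AS name,
               CASE WHEN age <> '' AND age NOT GLOB '*[^0-9]*'
                    THEN CAST(age AS INTEGER) END AS age
        FROM raw_people
        """
    )
    return db.execute("SELECT name, age FROM people ORDER BY name").fetchall()
```

Both functions end with the same cleaned table. With ELT the raw rows also stay available in `raw_people`, so they can be transformed again later.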

Cleansing, data wrangling,

XSLT data transformations

Input Documents Transformation Application Output Documents
XML Input, data to transform >>>>
>>>> XSLT Processor, & bespoke code >>>> Result Documents, output
XSLT Code, transformation rules >>>>
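
The flow in the table (input document plus transformation rules go into a processor, a result document comes out) can be mimicked in plain Python. This is an analogy only, not XSLT: the `RULES` mapping is a hypothetical stand-in for template rules, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Input document: the data to transform (made-up sample).
INPUT_XML = "<greetings><greeting lang='en'>Hello, world</greeting></greetings>"

# "Transformation rules": a hypothetical stand-in for XSLT template rules,
# mapping an input element name to the output element name it becomes.
RULES = {"greetings": "ul", "greeting": "li"}

def transform(node, rules):
    """Recursively build the result tree, renaming elements per the rules.

    Text content is copied; attributes and tail text are dropped.
    """
    out = ET.Element(rules.get(node.tag, node.tag))
    out.text = node.text
    for child in node:
        out.append(transform(child, rules))
    return out

def run(xml_text, rules):
    """The 'processor': parse the input, apply the rules, serialise the result."""
    result = transform(ET.fromstring(xml_text), rules)
    return ET.tostring(result, encoding="unicode")
```

Calling `run(INPUT_XML, RULES)` returns `<ul><li>Hello, world</li></ul>`. A real XSLT processor matches rules with XPath patterns rather than a name lookup.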

XProc XML pipelines <todo: table as for xslt>

Other file processing pipelines

Transformation

  • XML W3C, WS input, throughput, output
  • XProc W3C, WP, WS, an XML transformation language, pipelines, xml, xhtml, text, json, binary
  • XSLT W3C, WP, transformation rules
  • Regular expressions - supported in XSLT 2.0, aka rational expressions,

Data model throughput

  • XSD W3C, schema for xml docs,
  • XSL W3C, stylesheet for xml docs,
  • XQuery W3C, querying xml docs
  • XPath W3C, navigating xml docs
  • XDM - XQuery & XPath Data Model
  • XSL FO W3C, output to PDF, PostScript, to be deprecated? discontinued in 2013?
  • XForms W3C, programmatic,
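
As a small illustration of XPath-style navigation, Python's standard `xml.etree.ElementTree` supports a limited XPath subset (not full XPath 1.0). The sample document and its book titles are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document.
DOC = ET.fromstring(
    "<library>"
    "<book year='1999'><title>XSLT Basics</title></book>"
    "<book year='2017'><title>XSLT 3.0 Notes</title></book>"
    "</library>"
)

# All title elements anywhere below the root.
titles = [t.text for t in DOC.findall(".//title")]

# Titles of books whose year attribute equals 2017 (attribute predicate).
recent = [t.text for t in DOC.findall("./book[@year='2017']/title")]
```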

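Data input - an XSLT 1.0 processor will only take well-formed XML as data input; XSLT 2.0 adds unparsed-text() and XSLT 3.0 adds json-to-xml() for reading non-XML sources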

  • XML
  • RDB tables, as XML
  • GIS, as XML
  • XML output from others sources

To data i/o - the linked data world of things; non-XML-conformant things must first be parsed and transformed to XML, by means other than the XSLT processor when it only supports XSLT 1.0:

  • JSON WP, serialisation
  • CSV WP, flat file, comma-separated
  • RDF W3C, WP, SPO triples
  • OWL W3C, WP, ontology
  • HTML IETF, WP, presentation
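
One way to get non-XML data into XML before handing it to an XSLT processor is a short pre-processing script. Below is a minimal sketch for CSV using only the Python standard library. The function name, the `rows`/`row` element names and the sample data are assumptions for illustration, and the code assumes every CSV header is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Parse CSV text and return an XML string with one element per row."""
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Assumes each header is already a valid XML element name.
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")

# Made-up sample input.
sample = "greeting,target\nhello,world\n"
```

For the sample, `csv_to_xml(sample)` returns `<rows><row><greeting>hello</greeting><target>world</target></row></rows>`, which can then be saved and transformed with XSLT.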

To print

  • PDF Adobe
  • PostScript Adobe?

To image?

  • PNG?
  • A N Other?

XSLT 3.0 support for

  • Java
  • .NET
  • C/C++
  • Python
  • PHP
  • Node.js

Browser

  • XSLT 3.0 JavaScript lib can be hosted in browser
  • Modern browser native support for XSLT 1.0

Tools

  • Cocoon, WP, ASF,
  • XML Calabash WS open source XProc processor,
  • MorganaXProc WS, open source XProc processor,
  • XSLT Tester,
  • XSLT Fiddle
  • XSDEditor, free open source,

Libs

XSLT libs

  • libxslt - C, XSLT 1.0, & more, GNOME
  • libxml2 - C, XML parsing, GNOME
  • Saxon - Java, JavaScript, .NET, open/closed source, XSLT 3.0, ... WS
  • Xalan - Java, C++, XSLT 1.0, XPath 1.0, ASF, WS

XML libs

  • StAX - Java, streaming XML processing, pull-based parser, large XML docs, no XSLT | XQuery engines, no XSD validation
  • SAX - Java, XML processing, push-based parser
  • Xerces2 - Java, XML schema validation, XSD 1.1
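
The push/pull distinction can be shown with Python's standard library, used here as an analogue of the Java APIs rather than the Java APIs themselves. `xml.sax` is push-based like SAX. `ElementTree.iterparse` is pulled by the caller's loop, which is roughly the StAX style. The sample XML and the function names are made up.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET

XML = "<items><item>a</item><item>b</item></items>"

class ItemCounter(xml.sax.ContentHandler):
    """Push style (SAX): the parser calls our handler for every start tag."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1

def count_push(xml_text):
    """Hand control to the parser; it pushes events into the handler."""
    handler = ItemCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.count

def count_pull(xml_text):
    """Pull style: our loop asks the parser for the next event."""
    count = 0
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            count += 1
            elem.clear()  # drop the element's text and children once handled
    return count
```

Both functions return 2 for the sample. Neither approach needs the whole document in memory as a tree, which is what makes them suitable for large XML docs.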

CSV libs

  • Apache Commons CSV - Java, CSV file parsing and generation

JSON libs <todo: consider moving>

  • Jackson GH, WP, WS, Java, serialisation, JSON, Avro, BSON, CBOR, CSV, Smile, Properties, Protobuf, TOML, XML, and YAML
  • Gson GH, WP Java, serialization, JSON, Google

Command line processors

  • many of the software libraries listed elsewhere.
  • xsltproc - Linux/macOS
  • xml2json - Linux/macOS
  • xsltjson - ?
  • msxsl - Windows - discontinued, posed a security threat

Examples

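  • xsltproc transform.xsl hello_world.xml > hello_world_out.xml (xsltproc is XSLT 1.0 and needs XML input, so a CSV file must be converted to XML first)
  • xsltproc transform.xsl hello_world.xml | xml2json -i xml -o json > hello_world.json (the xml2json options depend on which xml2json tool is installed)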
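
Where no xml2json tool is available, the XML to JSON step can be approximated with a few lines of Python. This is a rough sketch with a deliberately naive mapping. The function names are made up, attributes are ignored, and repeated child names overwrite one another.

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Convert an element to a dict; leaf elements become their text."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        # Assumes child names are unique within a parent; repeats overwrite.
        result[child.tag] = element_to_dict(child)
    return result

def xml_to_json(xml_text):
    """Parse XML text and return a JSON string keyed by the root element name."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)}, sort_keys=True)
```

For example, `xml_to_json("<message><greeting>hello</greeting><target>world</target></message>")` returns `{"message": {"greeting": "hello", "target": "world"}}`.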

File extensions

  • xml, xaml, xsl, xslt, xsd, dtd, xul, kml, svg, mxml, xsml,
  • json, n3, owl, rdf, rdfs, ttl, ...
  • html, xhtml, pdf,
  • fo?
  • <todo: others to list, >

References

  • Comparison of XML editors, WP
  • XML Schema editors, WP
