CONFERENCE PAPER

XML Pipeline Performance

This paper describes advanced methods for optimizing XML pipeline performance. It extends our earlier, smaller study of ‘Filter Pipeline Performance’ on the saxon-help email list.

Proving Pipeline Performance

We investigate how to improve XML pipeline performance, concentrating on the mechanisms used to interlink the various pipeline components, such as XML parsers, XSLT filter stages, XML comparison, Java filters based on the SAX XMLFilter interface, and serialization.

We have looked at optimisation techniques, particularly those related to using event-based and streaming filters coded in Java. Our findings indicate which approaches work best for building XML processing pipelines.
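To make the idea of interlinked, event-based filters concrete, the following is a minimal sketch of a pipeline that chains a parser, two Java filters based on the SAX XMLFilter interface, and a serializer, using only the JAXP and SAX classes shipped with the JDK. The class name FilterPipelineSketch and the two filters (DropDebugFilter, UpperCaseTextFilter) are hypothetical illustrations and are not taken from the paper; the s9api configurations measured in the paper are not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class FilterPipelineSketch {

    /** Hypothetical filter: drops every element named "debug" together with its content. */
    static class DropDebugFilter extends XMLFilterImpl {
        private int depth = 0; // greater than 0 while inside a dropped element

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (depth > 0 || "debug".equals(local)) { depth++; return; }
            super.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth > 0) { depth--; return; }
            super.endElement(uri, local, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (depth == 0) super.characters(ch, start, length);
        }
    }

    /** Hypothetical filter: upper-cases all character data that reaches it. */
    static class UpperCaseTextFilter extends XMLFilterImpl {
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length);
        }
    }

    /** Runs parser -> DropDebugFilter -> UpperCaseTextFilter -> serializer as one event stream. */
    static String run(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader parser = spf.newSAXParser().getXMLReader();

        // Each filter takes the previous stage as its parent, so events flow
        // from the parser through both filters without building a tree.
        DropDebugFilter drop = new DropDebugFilter();
        drop.setParent(parser);
        UpperCaseTextFilter upper = new UpperCaseTextFilter();
        upper.setParent(drop);

        // An identity Transformer is used here purely as the serialization stage.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(
                new SAXSource(upper, new InputSource(new StringReader(xml))),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<log><debug>x</debug><msg>hello</msg></log>"));
    }
}
```

The design point of interest is the linking mechanism: every stage is connected by setParent, and the last filter is handed to the serializer as a SAXSource, so no intermediate document is materialised between stages. How this compares in speed and memory with other ways of connecting the same stages is the question the paper addresses.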

(Please note that the performance figures presented predate the Saxon 9.3 release, which addresses some of the issues discussed.)

Within this Conference Paper we:

  • Look at how run times and memory sizes affect the performance of different techniques for building pipelines.
  • Discuss XML pipeline optimisation techniques and best practices.
  • Review performance at various stages of the pipeline configuration.

We found the best performance using custom Java code and s9api. As well as offering performance benefits, the s9api interface is easier to understand and generally more flexible than JAXP.
