Xerces Support

1. Introduction

For many years, the Apache Xerces XML parser has been distributed with all of our products. Although it has not been required, it has been our recommended XML parser. In the Core 7.2 release, some new features require Xerces as the parser, with a minimum version of 2.9.0, which is the version distributed with our product releases. This document outlines which features require Xerces and explains how to use a different parser if you wish to do so.

2. New Features

2.1. DCP - Document Comparator Pipeline

DCP is the DocumentComparator's equivalent of the PipelinedComparator's DXP, although it is defined using an XML Schema rather than a DTD. Processing these schema-defined XML files relies on validation features that are only available in Xerces 2.9.0 and above.

2.2. Whitespace Handling

The latest version of Core includes improvements to the handling of whitespace. These improvements required adding detection of 'ignorable whitespace' to the existing com.deltaxml.pipe.filters.LexicalPreservation code. This detection requires access to methods available only in Xerces 2.9.0 and above.
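For readers unfamiliar with the term, the following is a minimal sketch of how 'ignorable whitespace' surfaces through the standard SAX API. It does not show the Xerces-specific methods that LexicalPreservation relies on, which are not described here. The class name IgnorableWhitespaceDemo and the sample document are illustrative assumptions. A validating parser reports whitespace inside element-only content through the ignorableWhitespace callback rather than through characters.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: not part of the Core API.
final class IgnorableWhitespaceDemo {

    // The DTD declares that root has element-only content, so the
    // whitespace around <item> is 'ignorable'.
    static final String SAMPLE =
        "<!DOCTYPE root [\n"
        + "<!ELEMENT root (item)>\n"
        + "<!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<root>\n"
        + "  <item>text</item>\n"
        + "</root>";

    /** Counts characters delivered by each of the two SAX text callbacks. */
    static final class Counter extends DefaultHandler {
        int ignorable;
        int significant;

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            ignorable += length;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            significant += length;
        }
    }

    /** Parses the given XML with a validating parser and returns the counts. */
    static Counter count(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validating parsers must report ignorable whitespace
            SAXParser parser = factory.newSAXParser();
            Counter counter = new Counter();
            parser.parse(new InputSource(new StringReader(xml)), counter);
            return counter;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        Counter c = count(SAMPLE);
        System.out.println("ignorable whitespace characters: " + c.ignorable);
        System.out.println("significant characters: " + c.significant);
    }
}
```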

3. Configuration changes

3.1. Configuration Properties

The com.deltaxml.config.JavaPlatform.useSAXParserFactory property is now set to 'false' by default. This means that when a SAXParser is created inside Core, it is done using an explicit class name that loads Xerces.

In the .NET API, the explicit instantiation of a Xerces class is the only option available.
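As an illustration of the difference between the two settings in the Java API, the sketch below contrasts JAXP factory lookup with instantiation by explicit class name. It is not Core's actual implementation. The class ParserFactoryChoice is hypothetical, and the use of org.apache.xerces.jaxp.SAXParserFactoryImpl (the JAXP factory class shipped in xercesImpl.jar) as the explicit class name is an assumption.

```java
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;

// Illustrative only: not part of the Core API.
final class ParserFactoryChoice {

    // Assumption: the JAXP factory class shipped in xercesImpl.jar.
    static final String XERCES_FACTORY = "org.apache.xerces.jaxp.SAXParserFactoryImpl";

    /**
     * true  -> let JAXP locate a factory (system property, service provider, JVM default).
     * false -> load the Xerces factory by its explicit class name.
     */
    static SAXParserFactory create(boolean useSAXParserFactory) {
        if (useSAXParserFactory) {
            return SAXParserFactory.newInstance();
        }
        // A null class loader means the thread's context class loader is used.
        return SAXParserFactory.newInstance(XERCES_FACTORY, null);
    }

    /** Returns the factory class name, or a message if it cannot be loaded. */
    static String describe(boolean useSAXParserFactory) {
        try {
            return create(useSAXParserFactory).getClass().getName();
        } catch (FactoryConfigurationError e) {
            return "unavailable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("JAXP lookup:    " + describe(true));
        System.out.println("explicit class: " + describe(false));
    }
}
```

When xercesImpl.jar is not on the classpath, the explicit-class branch reports the factory as unavailable, while the JAXP lookup falls back to whatever parser the JVM provides.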

3.2. Classpath

While the jar files for the command-line tool and the GUI have always declared a classpath that includes Xerces, deltaxml.jar did not. The classpath declared by deltaxml.jar has now been updated to include xercesImpl.jar.

4. What can I do without Xerces?

While the com.deltaxml.cores9api.PipelinedComparatorS9 and com.deltaxml.cores9api.DocumentComparator classes now require Xerces, some of our legacy classes can still be used without it.

The com.deltaxml.core.PipelinedComparator can still be used without Xerces as the parser, as long as LexicalPreservation functionality is not used. If LexicalPreservation is used without Xerces, a ClassNotFoundException will be thrown with details about the required Xerces classes. To configure the use of a different parser, please see the 'Using a different parser' section below.
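To find out in advance whether Xerces is available, rather than waiting for a ClassNotFoundException, a check along the following lines may help. This is a sketch and not part of the Core API. The class name XercesCheck is hypothetical, and the check relies on the org.apache.xerces.impl.Version class that Xerces-J provides, accessed reflectively so that the code also compiles and runs when Xerces is absent.

```java
import java.util.Optional;

// Illustrative only: not part of the Core API.
final class XercesCheck {

    /**
     * Returns the Xerces version string if Xerces is on the classpath,
     * or an empty Optional if it cannot be found.
     */
    static Optional<String> xercesVersion() {
        try {
            Class<?> version = Class.forName("org.apache.xerces.impl.Version");
            // getVersion() is a static method, so the receiver is null.
            Object value = version.getMethod("getVersion").invoke(null);
            return Optional.ofNullable(value).map(Object::toString);
        } catch (ReflectiveOperationException e) {
            // Class or method not found, or not callable: treat as unavailable.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(xercesVersion()
            .map(v -> "Found " + v)
            .orElse("Xerces is not on the classpath"));
    }
}
```

The returned string still needs to be compared against the 2.9.0 minimum by the caller; the sketch only reports what is present.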

5. Using a different parser

Three changes must be made in order to use a different parser:

  1. Update the configuration properties to use the SAXParserFactory
  2. Remove or rename the existing xercesImpl.jar
  3. Optionally add a different parser to the classpath

1. The com.deltaxml.config.JavaPlatform.useSAXParserFactory property must be set to true. See the 'Configuration Properties' section above for more information on this property.

2. Even with the property changed, the existing xercesImpl.jar will still be loaded from the deltaxml.jar classpath. To prevent this, either move or rename the xercesImpl.jar in the Core release directory.

3. This step is optional. Completing the two steps above causes the JVM's built-in parser to be used, which is not recommended. To use a different parser instead, make its jar files available on the classpath. If those jar files do not advertise a SAXParserFactory implementation, you will need to set the standard JAXP system property javax.xml.parsers.SAXParserFactory to the name of the parser's factory class.
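To confirm which parser the steps above actually select, a small check such as the following can be run with the same classpath as your application. It is a sketch under stated assumptions: the class name ParserSelection is hypothetical, and it uses only the standard JAXP lookup, not any Core API.

```java
import javax.xml.parsers.SAXParserFactory;

// Illustrative only: not part of the Core API.
final class ParserSelection {

    // Standard JAXP system property naming the SAXParserFactory implementation.
    static final String FACTORY_PROPERTY = "javax.xml.parsers.SAXParserFactory";

    /** Returns the class name of the factory that JAXP lookup selects. */
    static String activeFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // Optionally name a factory class as the first argument; this is
        // equivalent to passing -Djavax.xml.parsers.SAXParserFactory=... to the JVM.
        if (args.length > 0) {
            System.setProperty(FACTORY_PROPERTY, args[0]);
        }
        System.out.println("SAXParserFactory in use: " + activeFactory());
    }
}
```

The same property can be set on the command line, for example java -Djavax.xml.parsers.SAXParserFactory=com.example.MySAXParserFactory ParserSelection, where com.example.MySAXParserFactory is a placeholder for the factory class of your chosen parser.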