Technical Specification

Overview

DeltaXML Merge merges three or more well-formed XML 1.0 files and generates a single XML file describing the differences between them. The file representing the differences is known as a delta file. This specification uses the term 'file', but the XML inputs and outputs may use other representations, including strings, in-memory trees or event streams.

The DeltaXML Merge software provides a procedural interface that can be embedded in other Java-based or, in some but not all releases, .NET-based software.

Delta Files

A DeltaXML delta file has the same basic structure as the files that have been compared, with some additional attributes and elements. An XML namespace (the DeltaXML namespace) distinguishes these additional elements and attributes from those found in the input files. The delta file includes unchanged elements and attributes. The delta file provides a structured representation of the input files as a single file in which common data is shared.

XML Processing

DeltaXML Merge is built on DeltaXML Core and handles whitespace in the same way.

Comments and processing instructions can be preserved so that they appear in the delta file. Internal parsed general entities can be expanded or preserved. CDATA sections can be expanded or preserved.

DeltaXML handles namespaces and will detect elements in the same namespace even if the namespace prefix values are different. An element or attribute in a namespace may have a different namespace prefix in the delta file from that used in the input file.
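
The prefix-independent comparison described above can be illustrated with a short sketch. It uses Python's standard xml.etree.ElementTree rather than any DeltaXML interface, and the namespace URI and the helper name expanded_name are assumptions made for the example only.

```python
import xml.etree.ElementTree as ET

# Two fragments that bind different prefixes to the same namespace URI.
# The URI is an illustrative placeholder, not one used by the product.
doc_a = '<a:item xmlns:a="http://example.com/ns">x</a:item>'
doc_b = '<b:item xmlns:b="http://example.com/ns">x</b:item>'


def expanded_name(element):
    """Return (namespace URI, local name) for an ElementTree element."""
    tag = element.tag
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


elem_a = ET.fromstring(doc_a)
elem_b = ET.fromstring(doc_b)

# The prefixes differ, but the expanded names are identical.
same_name = expanded_name(elem_a) == expanded_name(elem_b)
print(same_name)  # prints True
```

Comparing on the expanded name rather than the prefixed name is what allows a prefix to change between an input file and the delta file without affecting the result.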

Merge Process

DeltaXML Merge merges the XML files, taking account of the tree structure of the files and identifying corresponding elements in the files. Corresponding elements will have the same element local name and namespace and will have corresponding parent elements. The root elements of the files must have the same local name and namespace. DeltaXML Merge determines the alignment at each level in the tree structure between the files. The alignment algorithm determines the longest common subsequence of corresponding elements. The alignment algorithm gives precedence to elements that are exactly equal over those that have just the same element name and namespace.
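
As an illustration of alignment at a single level of the tree, the following sketch computes a weighted longest common subsequence over two lists of sibling elements. It is not the product's algorithm: the weighting scheme, the simplified notion of exact equality and the names deep_equal and align are all assumptions chosen to show one way of giving exactly equal elements precedence over elements that share only a name and namespace.

```python
import xml.etree.ElementTree as ET


def deep_equal(x, y):
    """Exact equality of two elements: name, attributes, text and children.

    Simplification for this sketch: surrounding whitespace is ignored and
    tail text (mixed content) is not compared.
    """
    return (x.tag == y.tag
            and x.attrib == y.attrib
            and (x.text or "").strip() == (y.text or "").strip()
            and len(x) == len(y)
            and all(deep_equal(c, d) for c, d in zip(x, y)))


def align(xs, ys):
    """Return index pairs (i, j) aligning xs[i] with ys[j], in document order."""
    n, m = len(xs), len(ys)
    # One exact match outweighs any possible number of name-only matches.
    exact_weight = min(n, m) + 1

    def score(x, y):
        if x.tag != y.tag:  # tag holds namespace and local name together
            return 0
        return exact_weight if deep_equal(x, y) else 1

    # best[i][j] is the best total score for aligning xs[i:] with ys[j:].
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            s = score(xs[i], ys[j])
            paired = best[i + 1][j + 1] + s if s else 0
            best[i][j] = max(paired, best[i + 1][j], best[i][j + 1])

    # Walk the table forwards to recover one optimal alignment.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        s = score(xs[i], ys[j])
        if s and best[i][j] == best[i + 1][j + 1] + s:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif best[i][j] == best[i + 1][j]:
            i += 1
        else:
            j += 1
    return pairs


old = ET.fromstring("<r><p>one</p><p>two</p><q/></r>")
new = ET.fromstring("<r><p>two</p><q/></r>")
pairs = align(list(old), list(new))
print(pairs)  # prints [(1, 0), (2, 1)]
```

In this example a name-only alignment could pair the first p of the old file with the only p of the new file. The weighting instead pairs the two p elements whose content is identical and reports the first p as unmatched.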

The XML inputs are loaded into DeltaXML Merge in order. The order is recorded in an attribute on the root element of the merged delta file.

For a delta with type 'merge-concurrent', one input file is considered to be the common ancestor from which the other input files have been derived. As each successive file is loaded into the delta, it is first aligned with the common ancestor, and this alignment takes precedence over alignment between this file and other files previously loaded into the delta.

DeltaXML Merge can use key values, identified to the software using an attribute in the DeltaXML namespace, to identify corresponding elements in the inputs. Alignment of elements with the same namespace, local name and key will take precedence in the alignment process over other alignment criteria. Elements with different keys in the files will not be considered to correspond.
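
The effect of keys on correspondence can be sketched as a simple predicate. The namespace URI and the attribute name key below are placeholders for illustration and are not taken from this specification. Treating a keyed element and an unkeyed element as non-corresponding is likewise an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI and attribute name, used only for illustration.
NS = "http://example.com/placeholder-delta-ns"
KEY_ATTR = "{%s}key" % NS


def may_correspond(x, y):
    """True when two elements share namespace, local name and key.

    Assumption of this sketch: two unkeyed elements may correspond, while
    a keyed element never corresponds to an unkeyed one.
    """
    return x.tag == y.tag and x.get(KEY_ATTR) == y.get(KEY_ATTR)


a = ET.fromstring(f'<item xmlns:d="{NS}" d:key="a">old text</item>')
b = ET.fromstring(f'<item xmlns:k="{NS}" k:key="a">new text</item>')
c = ET.fromstring(f'<item xmlns:d="{NS}" d:key="b">old text</item>')

print(may_correspond(a, b))  # prints True: same name and key, content differs
print(may_correspond(a, c))  # prints False: same name and content, keys differ
```

The second comparison shows why keys take precedence: the elements a and c have identical content, but the differing keys prevent them from being treated as the same item.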

DeltaXML Merge treats elements as ordered, i.e. a change in order is identified as a change. Optionally any element can be identified to DeltaXML as orderless, using an attribute in the DeltaXML namespace which must be present in all files. In this case the child elements may appear in any order in the files and DeltaXML will match corresponding elements. Within an orderless element, a corresponding element is an element with the same name, namespace and key or an element that is exactly equal through its tree structure. Orderless elements must have element-only content.
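
Matching within an orderless element can be sketched as follows, assuming the children of the orderless element have already been collected from two files. The key attribute name and namespace URI are placeholders, detection of the orderless marker itself is omitted, and the greedy pairing strategy is an assumption of the example rather than a description of the product.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI and attribute name, used only for illustration.
NS = "http://example.com/placeholder-delta-ns"
KEY_ATTR = "{%s}key" % NS


def deep_equal(x, y):
    """Exact equality through the tree (whitespace-trimmed text, tails ignored)."""
    return (x.tag == y.tag
            and x.attrib == y.attrib
            and (x.text or "").strip() == (y.text or "").strip()
            and len(x) == len(y)
            and all(deep_equal(c, d) for c, d in zip(x, y)))


def corresponds(x, y):
    """Keyed elements match on name and key; unkeyed ones on exact equality."""
    key_x, key_y = x.get(KEY_ATTR), y.get(KEY_ATTR)
    if key_x is not None or key_y is not None:
        return x.tag == y.tag and key_x == key_y
    return deep_equal(x, y)


def match_orderless(xs, ys):
    """Pair each element of xs with the first unused corresponding element of ys."""
    unused = list(range(len(ys)))
    pairs = []
    for i, x in enumerate(xs):
        for j in unused:
            if corresponds(x, ys[j]):
                pairs.append((i, j))
                unused.remove(j)
                break  # stop scanning once x has a partner
    return pairs


old = ET.fromstring(
    f'<set xmlns:d="{NS}"><m d:key="1">x</m><n>hello</n><n>bye</n></set>')
new = ET.fromstring(
    f'<set xmlns:d="{NS}"><n>bye</n><m d:key="1">y</m><n>hi</n></set>')

pairs = match_orderless(list(old), list(new))
print(pairs)  # prints [(0, 1), (2, 0)]
```

The keyed m elements are paired despite their different content and positions, the two identical n elements are paired by exact equality, and the remaining n elements are left unmatched.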

DeltaXML ignores the order of attributes. Changes to attributes are represented using elements in the DeltaXML namespace.
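
An order-insensitive attribute comparison can be sketched with ordinary dictionaries. The function name attribute_changes and the returned structure are assumptions for illustration. The sketch does not reproduce the elements that the delta file uses to represent attribute changes.

```python
import xml.etree.ElementTree as ET


def attribute_changes(old, new):
    """Map each changed attribute name to its (old value, new value) pair.

    A value of None means the attribute is absent on that side.
    """
    names = sorted(set(old.attrib) | set(new.attrib))
    return {name: (old.get(name), new.get(name))
            for name in names
            if old.get(name) != new.get(name)}


# Same attributes in a different order: no change is reported.
first = ET.fromstring('<e a="1" b="2"/>')
second = ET.fromstring('<e b="2" a="1"/>')
print(attribute_changes(first, second))  # prints {}

# One attribute removed, one modified, one added.
before = ET.fromstring('<e id="1" lang="en" rev="3"/>')
after = ET.fromstring('<e rev="4" id="1" title="t"/>')
changes = attribute_changes(before, after)
print(changes)
```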

PCDATA items can be treated as a whole or subdivided into words.
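
The difference between the two treatments can be illustrated with Python's standard difflib module. Splitting on whitespace and the use of difflib.SequenceMatcher are assumptions of this sketch. The product's own word-boundary rules are not described in this specification.

```python
import difflib


def whole_change(old_text, new_text):
    """Treat the text as a single item: it is either unchanged or replaced."""
    return [] if old_text == new_text else [("replace", old_text, new_text)]


def word_changes(old_text, new_text):
    """Split the text into words and report only the words that differ."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    return [(op, old_words[i1:i2], new_words[j1:j2])
            for op, i1, i2, j1, j2 in matcher.get_opcodes()
            if op != "equal"]


old_text = "The quick brown fox"
new_text = "The slow brown fox"

print(whole_change(old_text, new_text))
# prints [('replace', 'The quick brown fox', 'The slow brown fox')]
print(word_changes(old_text, new_text))
# prints [('replace', ['quick'], ['slow'])]
```

Word-level subdivision yields a smaller, more precise change at the cost of a more detailed delta.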

System Requirements

DeltaXML Merge requires either:

  • A Java Standard Edition JRE version 6.0 or later. We test on Solaris (Intel 64 bit), Mac OSX (Intel 64 bit) and Windows Server (Intel 64 bit) platforms. For support, any reported problem should be reproducible on at least one of these platforms.
  • The Microsoft .NET Framework version 3.5 or later. NOTE: the ability to run under .NET is not available in some releases of the product.

Patent granted 2001270901; EP1325432; 60134999.7; US8,196,135B2; CA 2416876; US 8,423,518 B2; EP2174238; 602008031420.0. Patents pending 1315520.5, 14275178.3, 14/474,377.