DeltaXML Core Delta Format (deltaV1)

1. DeltaXML Core Delta Format Description (deltaV1 for versions of DeltaXML Core prior to 5.0 (June 2008))

NOTE: If you are using DeltaXML Core version 5.0 or later please refer to the documentation on deltaV2 format.

The DeltaXML Delta format is a representation of the changes between two XML documents. A Delta can be re-combined with either of the original documents to generate the other, i.e. 'old'+delta to generate 'new' or 'new'-delta to generate 'old'. Delta files have the same overall structure as the files being compared, with a few additional attributes and elements. These special attributes and elements are introduced to represent the differences between the files.

In this description we denote the input documents as 'old' and 'new'.

2. Namespace and Prefix

The namespace for a Delta document is http://www.deltaxml.com/ns/well-formed-delta-v1 and the preferred prefix is deltaxml: .

Namespaces in the delta file are declared on the top (root) element. In general, the namespace prefixes used will be the same as those used in the input files. If the input files use different prefixes for the same namespace, the first one encountered will be adopted in the delta file. For all versions prior to 4.2, all elements that have a namespace are assigned a prefix in the delta file, and this is true even if some elements in the input files did not have prefixes. In cases where there is no prefix defined in the input files, a prefix will be generated, for example p0: .

3. Elements and Attributes

This is a list of the elements used by this format:

Element name: deltaxml:PCDATAmodify
Content: A sequence of one deltaxml:PCDATAold and one deltaxml:PCDATAnew element.
Purpose: To indicate a change to the parsed character data (PCDATA) within an element.

Element name: deltaxml:PCDATAold
Content: PCDATA, i.e. text.
Purpose: To record a text item that appeared in the 'old' input document.

Element name: deltaxml:PCDATAnew
Content: PCDATA, i.e. text.
Purpose: To record a text item that appeared in the 'new' input document.

Element name: deltaxml:exchange
Content: A sequence of one deltaxml:old and one deltaxml:new element.
Purpose: To indicate an exchange, at an equivalent place in the two input documents, of two elements or of an element and a PCDATA string. A deltaxml:exchange will be used whenever two items in the files being compared are deemed to correspond with each other, because of their positions in the files, but they have a different type. For some processing of a delta it may be convenient to remove this wrapper element and a filter is provided to do this.

Element name: deltaxml:old
Content: A single element or PCDATA string.
Purpose: To record an item that appeared in the 'old' input document.

Element name: deltaxml:new
Content: A single element or PCDATA string.
Purpose: To record an item that appeared in the 'new' input document.

This is a list of the attributes used by this format:

Attribute name: deltaxml:delta
Content: One of the values add, delete, unchanged, WFmodify or WFmodifyUnordered.
  The value "add" means the element appears only in the new document.
  The value "delete" means the element appears only in the old document.
  The value "unchanged" means that the element appears in both documents and there are no differences in attributes or child elements in the two documents.
  The value "WFmodify" (Well Formed modify) means that the element appears in both old and new documents and is different.
  The value "WFmodifyUnordered" is similar to WFmodify except that it is used for an element which is orderless, i.e. it had an attribute deltaxml:ordered="false".
Purpose: To indicate how the containing element has been changed.

Attribute name: deltaxml:new-attributes
Content: A list of the added or modified attributes (name and value) as they appear in the new input document. If an attribute also appears in deltaxml:old-attributes then it has been modified. An attribute that appears only in deltaxml:new-attributes has been added.
Purpose: To show changes to attributes.

Attribute name: deltaxml:old-attributes
Content: A list of the deleted or modified attributes (name and value) as they appear in the old input document. If an attribute also appears in deltaxml:new-attributes then it has been modified. An attribute that appears only in deltaxml:old-attributes has been deleted.
Purpose: To show changes to attributes.

Attribute name: deltaxml:ordered
Content: Either 'true' (the default), meaning all child elements are ordered, or 'false', meaning the child elements will be compared as if their order is not important (orderless). This is a control attribute and is not subject to change.
Purpose: May be used in the input documents to indicate whether or not the order of the child elements is significant.

Attribute name: deltaxml:key
Content: Any string which represents a key for the enclosing element within the context of the parent element. This is a control attribute and is not subject to change.
Purpose: May be used in the input documents to provide a key for an element, in order to identify correspondence between two elements at the same hierarchical level in each of the two input documents.

4. Description

There is no DTD or Schema for a Delta document, but the Delta will have the same look and feel as the original documents. A set of simple rules applies to the Delta format.

Elements, attributes and text that are identified by DeltaXML as common to both input documents are shared in the Delta. A subtree that is unchanged between the two documents will appear in the Delta almost exactly as it appeared in the original documents.

Added, deleted or changed attributes are encoded and their values are delimited using a single character. The character used to delimit attribute values will generally be a double quote (represented in the delta file as the entity &quot;), a single quote or a vertical bar. The character is picked according to the content of the attribute value, i.e. if the value contains a double quote then the double quote cannot be used as the delimiter. If a changed attribute value includes all of the delimiter characters, this will cause an error. From Version 2.4, the possible choice of delimiters has been increased to include: "'|~%^+`/\$?,;!
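The delimiter selection described above can be sketched in a few lines of Python. This is only an illustration, not DeltaXML's implementation: the names choose_delimiter and encode_attribute are hypothetical, and the order in which candidates are tried is assumed to be the order in which the characters are listed above.

```python
# Candidate delimiters in the order listed in the text.
# Treating this as the preference order is an assumption.
DELIMITERS = "\"'|~%^+`/\\$?,;!"


def choose_delimiter(value: str) -> str:
    """Return the first candidate delimiter that does not occur in value."""
    for candidate in DELIMITERS:
        if candidate not in value:
            return candidate
    # Mirrors the documented error case: the value uses every delimiter.
    raise ValueError("attribute value contains every available delimiter")


def encode_attribute(name: str, value: str) -> str:
    """Encode one attribute as name=<delim>value<delim> (hypothetical helper)."""
    delimiter = choose_delimiter(value)
    return f"{name}={delimiter}{value}{delimiter}"


print(encode_attribute("age", "37"))           # prints age="37"
print(encode_attribute("title", 'say "hi"'))   # prints title='say "hi"'
```

When the encoded string is written into an XML attribute, a double-quote delimiter is serialised as &quot;, which is why it appears as an entity in a delta file.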

The handling of whitespace needs to be understood to avoid unexpected results. Whitespace is considered significant in XML except when a DTD or Schema is provided and the parser can identify some whitespace, e.g. between elements, as ignorable. In this case, DeltaXML ignores it. It is often best to remove all extra whitespace before comparison using one of the standard filters provided with DeltaXML.

Note that there is no representation of 'move' where an element is repositioned within its siblings. Such situations are represented using the delete, add or exchange options shown above. 

5. Rules

5.1. Summary of Delta format

  1. The root element has a deltaxml:delta attribute with a value showing whether or not the two documents are the same.
  2. The deltaxml:delta attribute takes one of five allowed values.
  3. Every element will have a deltaxml:delta attribute unless it lies inside an element whose deltaxml:delta value is add, delete or unchanged, because there can be no changes to the child elements or text of such an element.
  4. The value of each deltaxml:delta attribute will be consistent with the deltaxml:delta attribute on its parent, i.e. the value of the deltaxml:delta attribute on the parent will either be deltaxml:delta="WFmodifyUnordered" or deltaxml:delta="WFmodify". Note that child elements of any element with deltaxml:delta="add", deltaxml:delta="delete" or deltaxml:delta="unchanged" will not have a deltaxml:delta attribute.
  5. An element with deltaxml:delta="delete" will appear with all its attributes and child elements exactly as it was in the old input document.
  6. An element with deltaxml:delta="add" will appear with all its attributes and child elements exactly as it was in the new input document.
  7. No child elements or attributes (except deltaxml:key and deltaxml:ordered attributes) from the input documents will appear on an element with deltaxml:delta="unchanged" unless the delta is a full context delta.
  8. A child element of an element with deltaxml:delta="WFmodifyUnordered" can only have an attribute deltaxml:delta="WFmodify" if it also has a deltaxml:key attribute.
  9. Any text (PCDATA) that is different in the two input documents will appear as a grand-child within either a deltaxml:exchange element or a deltaxml:PCDATAmodify element.
  10. Unchanged attributes of any element remain as attributes and appear only in the full context delta.
  11. Changed attributes are held in the deltaxml:old-attributes and deltaxml:new-attributes attributes.
  12. The deltaxml:key attribute and other control attributes remain as attributes and always have the same value as in the input files.
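Rules 1 to 4 lend themselves to a mechanical check. The following Python sketch walks a parsed delta and reports elements that break them. It is a partial check under stated assumptions: the function name check_delta is hypothetical, elements in the deltaxml namespace (such as deltaxml:exchange) are skipped rather than examined, and rule 3 is read as "only the root and the children of a WFmodify or WFmodifyUnordered element carry deltaxml:delta".

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DELTA_ATTR = f"{{{DELTA_NS}}}delta"
ALLOWED = {"add", "delete", "unchanged", "WFmodify", "WFmodifyUnordered"}
MODIFIED = {"WFmodify", "WFmodifyUnordered"}


def check_delta(root):
    """Return a list of violations of rules 1-4 (a partial check only)."""
    problems = []

    def visit(element, delta_expected):
        if element.tag.startswith(f"{{{DELTA_NS}}}"):
            return  # deltaxml:* wrapper elements are not examined in this sketch
        value = element.get(DELTA_ATTR)
        if delta_expected and value not in ALLOWED:
            problems.append(f"{element.tag}: missing or invalid deltaxml:delta ({value!r})")
        elif not delta_expected and value is not None:
            problems.append(f"{element.tag}: unexpected deltaxml:delta ({value!r})")
        # Only children of a modified element carry their own deltaxml:delta.
        children_expected = delta_expected and value in MODIFIED
        for child in element:
            visit(child, children_expected)

    visit(root, True)  # rule 1: the root element always carries deltaxml:delta
    return problems


good = ET.fromstring(
    f'<a xmlns:deltaxml="{DELTA_NS}" deltaxml:delta="WFmodify">'
    '<b deltaxml:delta="unchanged"/><c deltaxml:delta="add"><d/></c></a>'
)
bad = ET.fromstring(
    f'<a xmlns:deltaxml="{DELTA_NS}" deltaxml:delta="add">'
    '<b deltaxml:delta="unchanged"/></a>'
)
print(check_delta(good))  # prints []
print(check_delta(bad))   # reports the deltaxml:delta found on <b>
```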

5.2. Full Context Delta

A delta file normally represents just the changes between two files, and does not include data that has not changed. DeltaXML provides an option to generate a 'full delta' which includes unchanged data. The 'full delta' provides a structured representation of two files within a single file where the common data is shared. 

6. Examples

6.1. Examples of Delta for Elements

Document A ('old') is shown first, followed by Document B ('new'):

<example>
  <person>
    <firstName/>

  </person>
</example>

<example>
  <person>
    <firstName/>
    <lastName/>
  </person>
</example>

And the Delta for this will be as follows:

Delta (comments follow the listing):

<example deltaxml:delta="WFmodify">
  <person deltaxml:delta="WFmodify">
    <firstName deltaxml:delta="unchanged" />
    <lastName deltaxml:delta="add" />
  </person>
</example>

Element <lastName> is added in the 'new' document.
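A delta like this one can be read with any namespace-aware XML parser. The sketch below uses Python's standard xml.etree.ElementTree to list the elements marked add or delete. The function name summarise is hypothetical, and the namespace declaration on the root element has been added to the sample so that it parses on its own.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DELTA_ATTR = f"{{{DELTA_NS}}}delta"

# The delta from this example, with the namespace declared on the root element.
DELTA = f"""
<example xmlns:deltaxml="{DELTA_NS}" deltaxml:delta="WFmodify">
  <person deltaxml:delta="WFmodify">
    <firstName deltaxml:delta="unchanged" />
    <lastName deltaxml:delta="add" />
  </person>
</example>
"""


def summarise(root):
    """Return (path, value) pairs for every element marked add or delete."""
    changes = []

    def visit(element, path):
        here = f"{path}/{element.tag}"
        value = element.get(DELTA_ATTR)
        if value in ("add", "delete"):
            changes.append((here, value))
            return  # descendants of an added or deleted element carry no delta
        for child in element:
            visit(child, here)

    visit(root, "")
    return changes


print(summarise(ET.fromstring(DELTA)))
# prints [('/example/person/lastName', 'add')]
```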

6.2. Examples of Delta for Text

Document A ('old') is shown first, followed by Document B ('new'):

<example>
  <person>
    <firstName>J</firstName>
    <lastName>Smith</lastName>
  </person>
</example>

<example>
  <person>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
  </person>
</example>

And the  Delta for this will be as follows:

Delta (comments follow the listing):

<example deltaxml:delta="WFmodify">
  <person deltaxml:delta="WFmodify">
    <firstName deltaxml:delta="WFmodify">
      <deltaxml:PCDATAmodify>
        <deltaxml:PCDATAold>J</deltaxml:PCDATAold>
        <deltaxml:PCDATAnew>John</deltaxml:PCDATAnew>
      </deltaxml:PCDATAmodify>
    </firstName>
    <lastName deltaxml:delta="unchanged">Smith</lastName>
  </person>
</example>

The text in <firstName> is "J" in the old document and "John" in the new.

The text in <lastName> is the same in both documents, and is shown here because this is a Full Context delta.
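Text changes recorded this way can be collected by looking for deltaxml:PCDATAmodify elements. The following is a minimal Python sketch using the standard library. The function name text_changes is hypothetical, the namespace declaration has been added to the sample so that it parses on its own, and text changes wrapped in deltaxml:exchange are not handled.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
NS = {"deltaxml": DELTA_NS}

# The delta from this example, with the namespace declared on the root element.
DELTA = f"""
<example xmlns:deltaxml="{DELTA_NS}" deltaxml:delta="WFmodify">
  <person deltaxml:delta="WFmodify">
    <firstName deltaxml:delta="WFmodify">
      <deltaxml:PCDATAmodify>
        <deltaxml:PCDATAold>J</deltaxml:PCDATAold>
        <deltaxml:PCDATAnew>John</deltaxml:PCDATAnew>
      </deltaxml:PCDATAmodify>
    </firstName>
    <lastName deltaxml:delta="unchanged">Smith</lastName>
  </person>
</example>
"""


def text_changes(root):
    """Return (containing element, old text, new text) for each PCDATAmodify."""
    results = []
    for parent in root.iter():
        for modify in parent.findall("deltaxml:PCDATAmodify", NS):
            old = modify.findtext("deltaxml:PCDATAold", default="", namespaces=NS)
            new = modify.findtext("deltaxml:PCDATAnew", default="", namespaces=NS)
            results.append((parent.tag, old, new))
    return results


print(text_changes(ET.fromstring(DELTA)))
# prints [('firstName', 'J', 'John')]
```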

6.3. Examples of Delta for Attributes

Document A ('old') is shown first, followed by Document B ('new'):

<example>
  <person gender="M" age="36">
    <firstName>J</firstName>
  </person>
</example>

<example>
  <person gender="M" age="37">
    <firstName>J</firstName>
  </person>
</example>

And the Delta for this will be as follows:

Delta (comments follow the listing):

<example deltaxml:delta="WFmodify">
  <person deltaxml:delta="WFmodify"
      deltaxml:new-attributes="age=&quot;37&quot;"
      deltaxml:old-attributes="age=&quot;36&quot;"
      gender="M">
    <firstName deltaxml:delta="unchanged">J</firstName>
  </person>
</example>

The attribute 'gender' is unchanged and so appears as a regular attribute.

The attribute 'age' has a value of 36 in the old document and 37 in the new document.
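The encoded attribute lists can be decoded and compared to classify each attribute as modified, added or deleted. The Python sketch below is illustrative only: decode_attributes and attribute_changes are hypothetical names, and the assumption that multiple encoded attributes are separated by whitespace is not confirmed by this description. In the sample, the double-quote delimiters are written as &quot; so that the delta is well-formed XML.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
OLD_ATTRS = f"{{{DELTA_NS}}}old-attributes"
NEW_ATTRS = f"{{{DELTA_NS}}}new-attributes"

# The delta from this example, with the namespace declared on the root element.
DELTA = f"""
<example xmlns:deltaxml="{DELTA_NS}" deltaxml:delta="WFmodify">
  <person deltaxml:delta="WFmodify"
      deltaxml:new-attributes="age=&quot;37&quot;"
      deltaxml:old-attributes="age=&quot;36&quot;"
      gender="M">
    <firstName deltaxml:delta="unchanged">J</firstName>
  </person>
</example>
"""


def decode_attributes(encoded):
    """Decode name=<delim>value<delim> pairs (whitespace-separated: assumed)."""
    attributes = {}
    position = 0
    while position < len(encoded):
        if encoded[position].isspace():
            position += 1
            continue
        equals = encoded.index("=", position)
        delimiter = encoded[equals + 1]  # whichever delimiter the encoder chose
        end = encoded.index(delimiter, equals + 2)
        attributes[encoded[position:equals]] = encoded[equals + 2:end]
        position = end + 1
    return attributes


def attribute_changes(element):
    """Classify each changed attribute as modified, added or deleted."""
    old = decode_attributes(element.get(OLD_ATTRS, ""))
    new = decode_attributes(element.get(NEW_ATTRS, ""))
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        if name in old and name in new:
            changes[name] = ("modified", old[name], new[name])
        elif name in new:
            changes[name] = ("added", None, new[name])
        else:
            changes[name] = ("deleted", old[name], None)
    return changes


person = ET.fromstring(DELTA).find("person")
print(attribute_changes(person))
# prints {'age': ('modified', '36', '37')}
```

The unchanged attribute gender is not reported because it is an ordinary attribute of the element rather than part of either encoded list.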