Two and Three Document DeltaV2 Format

1. DeltaXML DeltaV2 Format Description

The DeltaXML DeltaV2 format is a representation of two or more XML documents in a single document. Any of the original documents can be extracted from the Delta.

The base version for the format described here is Version 2.0. In addition, specifically labelled sections describe the extensions added in Version 2.1. This later version supports the recording of overlapping changes (from formatting elements) and is only output by the document comparator introduced in DeltaXML Core 7.0 (currently in Beta).

When the format is applied to two documents, these input documents are denoted A and B, and with three documents the inputs are denoted A, B and C.

2. Namespaces and Prefixes

Three namespaces are used in the DeltaV2 format to represent change; they are summarized in the following table:

usual prefix | namespace URI | purpose
deltaxml | http://www.deltaxml.com/ns/well-formed-delta-v1 | Elements and attributes used to represent change between the inputs.
dxa | http://www.deltaxml.com/ns/non-namespaced-attribute | The namespace of an element, used to represent an attribute, which was not in a namespace in one or both input files.
dxx | http://www.deltaxml.com/ns/xml-namespaced-attribute | The namespace of an element, used to represent an attribute in the XML namespace (corresponding to the URI: http://www.w3.org/XML/1998/namespace and always bound to the prefix xml:). Such attributes include: xml:space, xml:id, xml:base and xml:lang.

The v1 component of the namespace URI is not a mistake.  This URI was initially chosen when opinions on versioning of XML vocabularies and namespaces were in their infancy; we now agree with the mainstream opinion that new versions of a format/language should keep the same namespace.
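As an illustration, a consumer written in Python might bind these three namespaces as shown below. Only the URIs and the usual prefixes come from the table above; the constant names and the qname helper are assumptions made for this sketch and are not part of any DeltaXML API.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the table above; the Python names are illustrative.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DXA_NS = "http://www.deltaxml.com/ns/non-namespaced-attribute"
DXX_NS = "http://www.deltaxml.com/ns/xml-namespaced-attribute"

# Register the usual prefixes so that serialised output uses them.
ET.register_namespace("deltaxml", DELTAXML_NS)
ET.register_namespace("dxa", DXA_NS)
ET.register_namespace("dxx", DXX_NS)


def qname(namespace, local_name):
    """Return the '{uri}local' form that ElementTree uses for qualified names."""
    return "{%s}%s" % (namespace, local_name)


# Qualified name of the deltaxml:deltaV2 attribute, ready for elem.get(...).
DELTA_V2 = qname(DELTAXML_NS, "deltaV2")
```

Because the namespace URI keeps its v1 component in every version of the format, these constants would not need to change between Version 2.0 and Version 2.1.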

3. Elements and Attributes

This is a list of the elements used by this format:

Element name | Content | Purpose
deltaxml:attributes | One or more elements, each of which has a local name and namespace corresponding to an attribute belonging to the parent element. | Details any differences between the attributes associated with the parent element.
deltaxml:attributeValue | CDATA representing the value of an attribute. | To record an attribute value that appeared in one or more of the input documents.
deltaxml:textGroup | One or more deltaxml:text elements. | This element contains the variants of this segment of text.
deltaxml:text | PCDATA, i.e. text. | To record a text item that appeared in one or more of the input documents.

This is a list of the attributes used by this format:

Attribute name | Content | Purpose
deltaxml:deltaV2 | For a delta of two documents, one of the following values: A, B, A=B, A!=B. For a delta of three documents, one of the following values: A, B, C, A=B, A!=B, B=C, B!=C, A=C, A!=C, A=B=C, A!=B!=C, A=B!=C, A=C!=B, A!=B=C. | Details the documents in which this data item appeared. If it appeared in more than one document, this attribute also indicates whether the data items were the same or different. For example, deltaxml:deltaV2="A=B" means that the element appears in documents A and B and is the same in both.
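The structure of these values can be made explicit with a small parser. The following is a minimal sketch, not a DeltaXML API: the function names are hypothetical, and it assumes that "!=" separates groups of documents whose data differs while "=" joins documents whose data is the same, as the example value above suggests.

```python
VALID_DOCUMENTS = ("A", "B", "C")


def parse_delta_value(value):
    """Split a deltaV2 value such as 'A=C!=B' into groups of equal documents."""
    # '!=' separates groups that differ; '=' joins documents that are the same.
    groups = [group.split("=") for group in value.split("!=")]
    seen = set()
    for group in groups:
        for document in group:
            if document not in VALID_DOCUMENTS or document in seen:
                raise ValueError("not a deltaV2 value: %r" % value)
            seen.add(document)
    return groups


def documents_in(value):
    """Return the sorted list of documents in which the data item appeared."""
    return sorted(doc for group in parse_delta_value(value) for doc in group)


print(parse_delta_value("A=C!=B"))  # [['A', 'C'], ['B']]
print(documents_in("A!=B"))         # ['A', 'B']
```

The sketch checks the shape of a value only; it does not check that the value is one of the fourteen listed for three documents or the four listed for two.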

On the root element the following attributes must appear:

Attribute name | Content | Purpose
deltaxml:version | This must be "2.0". | This indicates the version of the delta format.
deltaxml:content-type | This must have one of the values: full-context, changes-only or merge-concurrent. | This indicates if the delta document contains just the changes (changes-only) or if the data that is unchanged in all the documents is also present (full-context).

On each deltaxml:attributes element the following attribute/value must be present:

Attribute name | Content | Purpose
deltaxml:ordered | This must be "false". | Indicates that the child elements, used to represent attributes, can appear in any order.

4. New Attributes added in Version 2.1

In Version 2.1 new attributes have been added to support the representation of changes in element structure with respect to content. In certain document comparison scenarios (e.g. when processing formatting elements) this can improve granularity. A fuller description and samples can be found in the Overlapping Hierarchies in DeltaV2 document.

The following table summarises the function and contents of each of these new attributes:

Attribute name | Content | Purpose
deltaxml:deltaTag | An alphabetically sorted list of one or more comma-separated document identifiers, e.g. "A", "B", "A,B". | Indicates that the element tag was present in the document(s) identified, and all of its content is as included in this element in the delta file.
deltaxml:deltaTagStart | As above. | Indicates that the element tag was present in the document(s) identified, but in that document the element contained not only the content here but more content. The additional content follows in zero or more elements marked deltaxml:deltaTagMiddle and ends with an element marked deltaxml:deltaTagEnd.
deltaxml:deltaTagMiddle | As above. | Indicates that the content of this element was included in an element marked deltaxml:deltaTagStart which precedes this element. There is additional content, which follows in an element marked deltaxml:deltaTagMiddle or deltaxml:deltaTagEnd.
deltaxml:deltaTagEnd | As above. | Indicates that the content of this element was included in an element marked deltaxml:deltaTagStart which precedes this element, and that this element contains the last part of the content for that element.
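The content shared by all four attributes is a sorted, comma-separated list, which differs from the "=" and "!=" syntax of deltaxml:deltaV2. A minimal sketch of reading such a list follows; the function name is hypothetical and the checks reflect only what the table states (non-empty identifiers, alphabetical order).

```python
def parse_document_list(value):
    """Parse a value such as 'A,B' into ['A', 'B'], checking the stated ordering."""
    documents = value.split(",")
    if not all(documents):
        raise ValueError("empty document identifier in %r" % value)
    # The table requires an alphabetically sorted list; duplicates are also rejected.
    if documents != sorted(set(documents)):
        raise ValueError("identifiers must be sorted and unique: %r" % value)
    return documents


print(parse_document_list("A,B"))  # ['A', 'B']
```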

5. Description

There is no DTD or Schema for a Delta, but the Delta will have the same look and feel as the original documents. There is a set of simple rules which apply to the Delta format. In general terms, the Delta generated from a set of documents will be a union of these documents in the sense that all the data that appears in any of the documents will also appear in the Delta.

Elements, attributes and text that are identified by DeltaXML as common to two or more of the documents are shared in the Delta. A subtree that appears unchanged in one or more documents will appear in the Delta almost exactly as it appeared in the original document(s).

6. Schematron Rules

The definitive version of these rules is contained in a Schematron rule document: deltaV2-schematron.xml. Note: this document does not yet support the new Version 2.1 attributes.

The same rules file is normally used for both the 2-input and 3-input deltas.  Only one of the rules, the one which specifies the deltaV2 allowable characters, needs to be changed should a schema be required to differentiate between the 2-way and 3-way deltas.  Should a specific 2-way delta be required the rule would be:

<report test="matches(@deltaxml:deltaV2, '[^AB!=]')">Delta value contains invalid characters</report>
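An application that validates outside Schematron could approximate the same check in Python. This is a sketch under the assumption that the XPath matches() call above behaves as an unanchored search, which corresponds to re.search; the function name is hypothetical.

```python
import re

# Same character class as the Schematron rule: anything other than A, B, ! or =.
INVALID_TWO_WAY = re.compile(r"[^AB!=]")


def has_invalid_two_way_chars(value):
    """True if a deltaV2 value contains a character not allowed in a 2-way delta."""
    return INVALID_TWO_WAY.search(value) is not None


print(has_invalid_two_way_chars("A!=B"))  # False
print(has_invalid_two_way_chars("A=C"))   # True
```

Like the rule it mirrors, this tests characters only, so a malformed value such as "=A" would not be reported.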

Should an application need to handle both 2-way and 3-way delta results and differentiate between them, it can do so by looking at the characters contained in the deltaxml:deltaV2 attribute on the root node of a delta file. The root node will always have a delta attribute and it will always reflect the number of inputs.
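A minimal sketch of that check in Python follows. It assumes that "reflects the number of inputs" means the root value of a 3-way delta always mentions document C; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

DELTA_V2 = "{http://www.deltaxml.com/ns/well-formed-delta-v1}deltaV2"


def input_count(root):
    """Return 2 or 3 depending on the documents named in the root deltaV2 value."""
    value = root.get(DELTA_V2)
    if value is None:
        raise ValueError("root element has no deltaxml:deltaV2 attribute")
    # Assumption: only a 3-way delta names document C on its root element.
    return 3 if "C" in value else 2


two_way = ET.fromstring(
    '<example xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"'
    ' deltaxml:deltaV2="A!=B"/>'
)
print(input_count(two_way))  # 2
```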

7. Full Context Delta

In an analogous manner to the original Delta format, a Delta can either show just changes or can include all the data from the original documents. When only changes are shown, the content of any unchanged element is not reproduced in the delta. This applies to any element with a deltaxml:deltaV2 attribute equal to A=B or, for three documents, A=B=C.
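The value that marks an element as unchanged therefore depends on the number of inputs. The sketch below, with hypothetical function names, lists the elements of a delta that are not marked as unchanged; the sample delta is the one from section 9.1 with the namespace declaration added so that it parses on its own.

```python
import xml.etree.ElementTree as ET

DELTA_V2 = "{http://www.deltaxml.com/ns/well-formed-delta-v1}deltaV2"


def unchanged_value(input_count):
    """Return the deltaV2 value meaning 'same in every input document'."""
    return "A=B" if input_count == 2 else "A=B=C"


def changed_elements(root, input_count):
    """Return, in document order, the marked elements that are not unchanged."""
    unchanged = unchanged_value(input_count)
    return [e for e in root.iter() if e.get(DELTA_V2) not in (None, unchanged)]


DELTA = (
    '<example xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"'
    ' deltaxml:deltaV2="A!=B">'
    '<person deltaxml:deltaV2="A!=B">'
    '<firstName deltaxml:deltaV2="A=B"/>'
    '<lastName deltaxml:deltaV2="B"/>'
    '<tel deltaxml:deltaV2="A"/>'
    '</person></example>'
)
tags = [e.tag for e in changed_elements(ET.fromstring(DELTA), 2)]
print(tags)  # ['example', 'person', 'lastName', 'tel']
```

Note that in a 3-way delta the value A=B does not mean unchanged, because the element is then absent from document C.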

8. Compatibility with original Delta format

The original delta format handled two documents only. The values of the deltaxml:deltaV2 attribute correspond with the deltaxml:delta attribute as follows: add becomes B, delete becomes A, unchanged becomes A=B and WFmodify becomes A!=B. The value WFmodifyUnordered is also A!=B but the deltaxml:ordered="false" attribute remains on the element so that knowledge of the unordered content remains in the delta document.
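The correspondence can be written down as a lookup table. This is an illustrative sketch only; the dictionary and function names are hypothetical, and the mapping is exactly the one listed above.

```python
# Original deltaxml:delta value -> DeltaV2 deltaxml:deltaV2 value.
LEGACY_TO_DELTA_V2 = {
    "add": "B",
    "delete": "A",
    "unchanged": "A=B",
    "WFmodify": "A!=B",
    "WFmodifyUnordered": "A!=B",  # deltaxml:ordered="false" stays on the element
}


def to_delta_v2(legacy_value):
    """Translate an original-format delta value; raises KeyError if unknown."""
    return LEGACY_TO_DELTA_V2[legacy_value]


print(to_delta_v2("add"))  # B
```

The mapping is not reversible on its own, since WFmodify and WFmodifyUnordered both become A!=B and are told apart only by the deltaxml:ordered attribute.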

Attribute changes are now handled within markup to make processing easier. There is no longer a representation of an exchange between two elements or an element and text item.

9. Examples for Two Documents

9.1. Examples of Elements in Delta

Document A Document B

<example>
  <person>
    <firstName/>

    <tel/>
  </person>
</example>

<example>
  <person>
    <firstName/>
    <lastName/>

  </person>
</example>

And the Delta for this will be as follows:

Delta Comments

<example deltaxml:deltaV2="A!=B">
 <person deltaxml:deltaV2="A!=B">
  <firstName deltaxml:deltaV2="A=B"/>
  <lastName deltaxml:deltaV2="B" />
  <tel deltaxml:deltaV2="A" />
 </person>
</example>

Element <firstName> appears in both documents, A and B, and is the same in both.

Element <lastName> appears in document B only.

Element <tel> appears in document A only.

9.2. Examples of Text in Delta 

Document A Document B

<example>
  <person>
    <firstName>J</firstName>
    <lastName>Smith</lastName>
  </person>
</example>

<example>
  <person>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
  </person>
</example>

And the Delta for this will be as follows:

Delta Comments

<example deltaxml:deltaV2="A!=B">
 <person deltaxml:deltaV2="A!=B">
  <firstName deltaxml:deltaV2="A!=B">
    <deltaxml:textGroup deltaxml:deltaV2="A!=B">
      <deltaxml:text deltaxml:deltaV2="A">J</deltaxml:text>
      <deltaxml:text deltaxml:deltaV2="B">John</deltaxml:text>
    </deltaxml:textGroup>
   </firstName>
   <lastName deltaxml:deltaV2="A=B">Smith</lastName>
 </person>
</example>

The text in <firstName> is "J" in document A and "John" in document B.

The text in <lastName> is the same in both documents.

9.3. Examples of Attributes in Delta 

Document A Document B

<example>
  <person gender="M" age="36">
    <firstName>J</firstName>
  </person>
</example>

<example>
  <person gender="M" age="37">
    <firstName>J</firstName>
  </person>
</example>

And the Delta for this will be as follows:

Delta Comments

<example deltaxml:deltaV2="A!=B">
 <person deltaxml:deltaV2="A!=B" gender="M">
  <deltaxml:attributes deltaxml:deltaV2="A!=B">
   <dxa:age deltaxml:deltaV2="A!=B">
    <deltaxml:attributeValue deltaxml:deltaV2="A">36</deltaxml:attributeValue>
    <deltaxml:attributeValue deltaxml:deltaV2="B">37</deltaxml:attributeValue>
   </dxa:age>
  </deltaxml:attributes>
  <firstName deltaxml:deltaV2="A=B">J</firstName>
 </person>
</example>

The attribute 'gender' is unchanged and so appears as a regular attribute.

The attribute 'age' has a value of 36 in document A and 37 in B.

Element <firstName> appears now as the second child of <person>.
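Taken together, the three examples above suggest how one of the original documents could be recovered from a full-context delta. The following Python sketch is an illustration under stated assumptions, not the DeltaXML implementation: all function names are hypothetical, an element without a deltaxml:deltaV2 attribute is assumed to belong to every document its parent belongs to, whitespace is copied through without normalisation, and the namespace declarations (omitted from the samples above) have been added so that the delta parses.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DXA_NS = "http://www.deltaxml.com/ns/non-namespaced-attribute"
DXX_NS = "http://www.deltaxml.com/ns/xml-namespaced-attribute"
XML_NS = "http://www.w3.org/XML/1998/namespace"
D = "{%s}" % DELTA_NS  # Clark-notation prefix for deltaxml:* names


def in_document(elem, doc):
    """True if elem is marked as present in doc, or carries no marker at all."""
    value = elem.get(D + "deltaV2")
    return value is None or doc in value


def append_text(out, text):
    """Append text after whatever content out already has."""
    if not text:
        return
    if len(out):
        out[-1].tail = (out[-1].tail or "") + text
    else:
        out.text = (out.text or "") + text


def apply_attributes(attributes_elem, out, doc):
    """Copy onto out the attribute values that belong to doc."""
    for attr in attributes_elem:
        if attr.tag.startswith("{" + DXA_NS + "}"):
            name = attr.tag.split("}", 1)[1]  # attribute had no namespace
        elif attr.tag.startswith("{" + DXX_NS + "}"):
            name = "{%s}%s" % (XML_NS, attr.tag.split("}", 1)[1])
        else:
            name = attr.tag  # attribute keeps its own namespace
        for value in attr.findall(D + "attributeValue"):
            if in_document(value, doc):
                out.set(name, value.text or "")


def extract(elem, doc):
    """Rebuild the subtree rooted at elem as it appeared in document doc."""
    out = ET.Element(elem.tag)
    for name, value in elem.attrib.items():
        if not name.startswith(D):  # drop deltaxml:* bookkeeping attributes
            out.set(name, value)
    append_text(out, elem.text)
    for child in elem:
        if child.tag == D + "attributes":
            apply_attributes(child, out, doc)
        elif child.tag == D + "textGroup":
            for text in child.findall(D + "text"):
                if in_document(text, doc):
                    append_text(out, text.text)
        elif in_document(child, doc):
            out.append(extract(child, doc))
        append_text(out, child.tail)
    return out


# The attribute example from section 9.3, written without indentation.
DELTA = (
    '<example xmlns:deltaxml="' + DELTA_NS + '" xmlns:dxa="' + DXA_NS + '"'
    ' deltaxml:deltaV2="A!=B"><person deltaxml:deltaV2="A!=B" gender="M">'
    '<deltaxml:attributes deltaxml:deltaV2="A!=B"><dxa:age deltaxml:deltaV2="A!=B">'
    '<deltaxml:attributeValue deltaxml:deltaV2="A">36</deltaxml:attributeValue>'
    '<deltaxml:attributeValue deltaxml:deltaV2="B">37</deltaxml:attributeValue>'
    '</dxa:age></deltaxml:attributes>'
    '<firstName deltaxml:deltaV2="A=B">J</firstName></person></example>'
)
document_a = extract(ET.fromstring(DELTA), "A")
document_b = extract(ET.fromstring(DELTA), "B")
print(document_a.find("person").get("age"))  # 36
print(document_b.find("person").get("age"))  # 37
```

The sketch relies on document identifiers being single letters, so that a substring test on the deltaV2 value is sufficient. It is not expected to work on a changes-only delta, where the content of unchanged elements is not present.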

10. Examples for Three Documents

10.1. Examples of Elements in Delta

Document A Document B Document C

<example>
  <person>
    <firstName/>
    <lastName/>
    <tel/>
  </person>
</example>

<example>
  <person>
    <firstName/>
    <lastName/>

  </person>
</example>

<example>
  <person>
    <firstName/>

  </person>
</example>

And the Delta for this will be as follows:

Delta Comments

<example deltaxml:deltaV2="A!=B!=C">
 <person deltaxml:deltaV2="A!=B!=C">
  <firstName deltaxml:deltaV2="A=B=C"/>
  <lastName deltaxml:deltaV2="A=B" />
  <tel deltaxml:deltaV2="A" />
 </person>
</example>

Element <lastName> appears in two documents, A and B, and is the same in both.

Element <tel> appears in only one document, A.

10.2. Examples of Text in Delta

Document A Document B Document C

<example>
  <person>
    <firstName>J</firstName>
    <lastName>Smith</lastName>
  </person>
</example>

<example>
  <person>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
  </person>
</example>

<example>
  <person>
    <firstName>J</firstName>
    <lastName>Smith</lastName>
  </person>
</example> 

And the Delta for this will be as follows:

Delta Comments

<example deltaxml:deltaV2="A!=B!=C">
 <person deltaxml:deltaV2="A!=B!=C">
  <firstName deltaxml:deltaV2="A=C!=B">
    <deltaxml:textGroup deltaxml:deltaV2="A=C!=B">
      <deltaxml:text deltaxml:deltaV2="A=C">J</deltaxml:text>
      <deltaxml:text deltaxml:deltaV2="B">John</deltaxml:text>
    </deltaxml:textGroup>
   </firstName>
   <lastName deltaxml:deltaV2="A=B=C">Smith</lastName>
 </person>
</example>

The text in <firstName> is "J" in both A and C.

The text in <firstName> is "John" in document B.

The text in <lastName> is the same in all documents.
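With three inputs, a single deltaxml:text element can belong to more than one document. As a sketch, the variants of a text group could be collected as follows; the function name is hypothetical, and the code assumes that the deltaV2 value on a deltaxml:text element is either a single document or documents joined by "=", as in the example above.

```python
import xml.etree.ElementTree as ET

D = "{http://www.deltaxml.com/ns/well-formed-delta-v1}"


def text_variants(text_group):
    """Map each distinct text variant to the list of documents that contain it."""
    variants = {}
    for text in text_group.findall(D + "text"):
        documents = text.get(D + "deltaV2").split("=")  # 'A=C' -> ['A', 'C']
        variants.setdefault(text.text or "", []).extend(documents)
    return variants


GROUP = ET.fromstring(
    '<deltaxml:textGroup'
    ' xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"'
    ' deltaxml:deltaV2="A=C!=B">'
    '<deltaxml:text deltaxml:deltaV2="A=C">J</deltaxml:text>'
    '<deltaxml:text deltaxml:deltaV2="B">John</deltaxml:text>'
    '</deltaxml:textGroup>'
)
print(text_variants(GROUP))  # {'J': ['A', 'C'], 'John': ['B']}
```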

10.3. Examples of Attributes in Delta

Document A Document B Document C

<example>
  <person gender="M" age="36">
    <firstName>J</firstName>
  </person>
</example>

<example>
  <person gender="M" age="37">
    <firstName>J</firstName>
  </person>
</example>

<example>
  <person gender="M">
    <firstName>J</firstName>
  </person>
</example>

And the Delta for this will be as follows:

Delta Comments

<example deltaxml:deltaV2="A!=B!=C">
 <person deltaxml:deltaV2="A!=B!=C" gender="M">
  <deltaxml:attributes deltaxml:deltaV2="A!=B!=C">
   <dxa:age deltaxml:deltaV2="A!=B">
    <deltaxml:attributeValue deltaxml:deltaV2="A">36</deltaxml:attributeValue>
    <deltaxml:attributeValue deltaxml:deltaV2="B">37</deltaxml:attributeValue>
   </dxa:age>
  </deltaxml:attributes>
  <firstName deltaxml:deltaV2="A=B=C">J</firstName>
 </person>
</example>

The attribute 'gender' is unchanged and so appears as a regular attribute.

The attribute 'age' has a value of 36 in document A and 37 in B, and does not appear in document C.

Element <firstName> appears now as the second child of <person>.
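An attribute that is missing from one input simply has no deltaxml:attributeValue for that document. The sketch below, with a hypothetical function name, reports the value of one attribute per document and uses None for a document in which the attribute did not appear. The namespace declarations are added to the sample so that it parses on its own.

```python
import xml.etree.ElementTree as ET

D = "{http://www.deltaxml.com/ns/well-formed-delta-v1}"


def attribute_values(attr_elem, documents=("A", "B", "C")):
    """Return {document: value}, with None where the attribute was absent."""
    values = dict.fromkeys(documents)  # every document starts as None
    for value in attr_elem.findall(D + "attributeValue"):
        for document in value.get(D + "deltaV2").split("="):
            values[document] = value.text or ""
    return values


AGE = ET.fromstring(
    '<dxa:age xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"'
    ' xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"'
    ' deltaxml:deltaV2="A!=B">'
    '<deltaxml:attributeValue deltaxml:deltaV2="A">36</deltaxml:attributeValue>'
    '<deltaxml:attributeValue deltaxml:deltaV2="B">37</deltaxml:attributeValue>'
    '</dxa:age>'
)
print(attribute_values(AGE))  # {'A': '36', 'B': '37', 'C': None}
```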

11. DeltaV2 Version 2.1 Samples

A set of samples specific to Version 2.1 is provided in the document Overlapping Hierarchies in DeltaV2. The included samples illustrate how the new attributes introduced to DeltaV2 can be used in practice by supporting DeltaXML products.