The DeltaV2 Merge Format

1. Introduction

The XML output from DeltaXML Merge conforms to an extended version of the DeltaV2 format, which is common to other DeltaXML products. This section describes the parts of DeltaV2 that are particularly significant in the DeltaXML Merge context; please read the DeltaV2 Reference for a full description of the format.

2. Features of the DeltaV2 output

  • Contains all of the data from all of the input files; any of the input files can be extracted without loss.
  • Structure follows that of the input files (e.g. the root element will be the same).
  • Describes all of the changes to elements, text and attributes using an expanded deltaV2 format that supports n-way merge.
  • DeltaV2 version identifiers are specified through the API prior to processing and appear in the output.
  • Format is versatile and optimised for further processing to resolve differences.

3. DeltaV2 attributes

The deltaxml:deltaV2 attributes in the DeltaV2 format contain a sequence of one or more 'version identifiers' joined by the '=' character or the '!=' character-pair.

The deltaV2 attributes conform to the following rules:

  • Where document versions have the same content at the level of the attribute, they are referred to as being within an 'equality group', and the version identifiers within this group are joined using the '=' character.
  • Document versions with different content are separated by the '!=' character-pair.
  • Within equality groups and between equality groups (i.e. groups of versions separated with '!=') the versions are ordered according to an ordering sequence specified in the deltaxml:version-order attribute defined on the document root element. The version-order attribute contains a comma-separated list of version identifiers.

A sample deltaV2 attribute value from Merge:

ancestor=anna!=ben=chris!=david
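
The rules above can be illustrated with a minimal Python sketch that splits a deltaV2 attribute value into its equality groups and checks the groups against a version-order list. The function names parse_delta_v2 and follows_version_order are hypothetical and not part of any DeltaXML API. The ordering check also assumes that 'between equality groups' means the groups are ordered by their first member, which is one reading of the rule rather than a confirmed definition.

```python
def parse_delta_v2(value):
    """Split a deltaV2 attribute value into equality groups.

    Groups are separated by '!=' and members of a group are joined by '='.
    """
    return [group.split("=") for group in value.split("!=")]


def follows_version_order(groups, version_order):
    """Check the groups against a version-order list (hypothetical helper).

    Assumption: members are ordered within each group, and the groups
    themselves are ordered by their first member.
    """
    rank = {version: index for index, version in enumerate(version_order)}
    within = all(
        rank[a] < rank[b]
        for group in groups
        for a, b in zip(group, group[1:])
    )
    firsts = [rank[group[0]] for group in groups]
    between = all(a < b for a, b in zip(firsts, firsts[1:]))
    return within and between


groups = parse_delta_v2("ancestor=anna!=ben=chris!=david")
order = "ancestor,anna,ben,chris,david".split(",")
print(groups)
print(follows_version_order(groups, order))
```

For the sample value, the sketch yields three equality groups: ancestor and anna share content, ben and chris share different content, and david differs from both.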

4. Version Identifiers

Version identifiers are user-specified labels assigned to the common ancestor and each revision document. An identifier must be supplied each time a new document is added and each new identifier must have a unique value.

4.1. Choosing values for version identifiers

Version identifiers may be user-specified or machine generated (provided they meet the constraints outlined below). For example, the revision numbers or hash values used in a version control system could be used.

4.2. Constraints on version identifiers

Version identifiers should conform to the NMTOKEN production rule defined in the XML Specification. The same production rule is used in both the XML 1.0 and XML 1.1 specifications. This production rule allows many Unicode characters, but prohibits the '!' (hex value 0x21) and '=' (hex value 0x3D) characters, which are used as the version separators discussed above, as well as space characters.
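
As a rough illustration of these constraints, the following Python sketch checks candidate identifiers before they are handed to a merge. The character ranges are transcribed from the NameChar production of the XML 1.0 (Fifth Edition) specification; the function names is_valid_version_id and check_version_ids are hypothetical helpers, not part of the DeltaXML API, and the product may apply its own validation.

```python
import re

# NameStartChar ranges from the XML 1.0 (Fifth Edition) specification.
_NAME_START = (
    r":A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# Extra characters that NameChar allows in addition to NameStartChar.
_NAME_EXTRA = r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

# Nmtoken ::= (NameChar)+
NMTOKEN = re.compile("[" + _NAME_START + _NAME_EXTRA + "]+")


def is_valid_version_id(identifier):
    """Return True if the identifier matches the Nmtoken production."""
    return NMTOKEN.fullmatch(identifier) is not None


def check_version_ids(identifiers):
    """Return a list of problems: invalid or repeated identifiers."""
    problems = []
    seen = set()
    for identifier in identifiers:
        if not is_valid_version_id(identifier):
            problems.append(f"invalid identifier: {identifier!r}")
        if identifier in seen:
            problems.append(f"duplicate identifier: {identifier!r}")
        seen.add(identifier)
    return problems


print(check_version_ids(["ancestor", "r1234", "a=b", "r1234"]))
```

Because '!', '=', the comma and the space are all outside the NameChar ranges, an identifier that passes this check cannot be confused with a separator in a deltaV2 attribute or in the version-order list.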

5. Other new functionality in the DeltaXML Merge variant

  • Supports n-way merge (instead of 2-way and 3-way for Core and Sync respectively)
  • DeltaV2 version identifiers specified using the API appear in the result deltaV2 attributes instead of the single characters in DeltaXML Core and Sync.
  • The version-order attribute is used to provide a persistent record of the order in which documents were added to the merger object.
  • Additional attributes, showing the results of a merge analysis, may also be present in the result.

5.1. New elements in DeltaXML Merge

A new element, the content group (deltaxml:contentGroup), is used to describe changes involving entity references, processing instructions and comments. This element is modelled on the deltaxml:textGroup element, but relaxes the restriction that the child elements (in this case deltaxml:content, the equivalent of deltaxml:text) must only contain text() nodes.

The contentGroup provides alternative content that appears in similar positions in each of the files. The deltaxml:content child elements provide the alternative content and their deltaV2 attributes indicate which of the input files contained that content.

The following example indicates how a contentGroup is used to show that different entity references are used in corresponding locations in the merge inputs:

<p deltaxml:deltaV2="ancestor=edit1!=edit2">Example of entity preservation
  <deltaxml:contentGroup deltaxml:deltaV2="ancestor=edit1!=edit2">
    <deltaxml:content deltaxml:deltaV2="ancestor=edit1">&t1;</deltaxml:content>
    <deltaxml:content deltaxml:deltaV2="edit2">&t2;</deltaxml:content>
  </deltaxml:contentGroup></p>
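
One way to consume a contentGroup is to pick the deltaxml:content child whose deltaV2 attribute names a given version. The Python sketch below does this with the standard library. It rests on several assumptions: the namespace URI bound to the deltaxml prefix is taken to be the one shown in DELTAXML_NS, the helper names versions_in and content_for_version are hypothetical, and the sample replaces the entity references with plain text because a standalone parser would reject undeclared entities.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; check your own output.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Simplified sample: plain text stands in for the entity references.
SAMPLE = f"""<p xmlns:deltaxml="{DELTAXML_NS}" deltaxml:deltaV2="ancestor=edit1!=edit2">Example
  <deltaxml:contentGroup deltaxml:deltaV2="ancestor=edit1!=edit2">
    <deltaxml:content deltaxml:deltaV2="ancestor=edit1">first</deltaxml:content>
    <deltaxml:content deltaxml:deltaV2="edit2">second</deltaxml:content>
  </deltaxml:contentGroup></p>"""


def versions_in(delta_v2):
    """Return every version identifier named in a deltaV2 value."""
    return set(delta_v2.replace("!=", "=").split("="))


def content_for_version(group, version):
    """Return the deltaxml:content child that applies to the version."""
    for child in group.findall(f"{{{DELTAXML_NS}}}content"):
        delta_v2 = child.get(f"{{{DELTAXML_NS}}}deltaV2", "")
        if version in versions_in(delta_v2):
            return child
    return None


root = ET.fromstring(SAMPLE)
group = root.find(f"{{{DELTAXML_NS}}}contentGroup")
for version in ("ancestor", "edit1", "edit2"):
    print(version, content_for_version(group, version).text)
```

The function returns None when no child mentions the requested version, which would correspond to a version that has no content at that position.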

6. Differences from the shared format

  • DeltaV2 member values: User-selected version identifier strings are used instead of auto-generated single characters.
  • Number of members: There may be more than 3 members in a DeltaV2 attribute (to support n-way merge).
  • The required top-level attribute deltaxml:content-type will have different values in Merge results. The value 'merge-concurrent' represents concurrent editing. Future versions of the Merge product will also support a 'travelling draft' model where there is not necessarily the concept of a common ancestor version. It is likely the value 'merge-consecutive' will be used for this algorithm. Other values may also be introduced for subsequent Merge developments.
  • DeltaXML Merge does not use any of the version 2.1 features that are supported by DeltaXML Core.