Mixed Ordered and Orderless Data

1. Issues with using deltaxml:ordered="false"

The deltaxml:ordered="false" attribute indicates that the element is orderless, i.e. its child elements can appear in any order and the order does not affect the meaning. When DeltaXML compares such elements, it treats the child elements as a set rather than an ordered list. This creates a problem when only some of the elements are orderless. For example, you might have a list of phone numbers in a contact record like this:

<records>
 <contact>
  <name>John Smith</name>
  <addressLine>25 Green Lane</addressLine>
  <addressLine>London</addressLine>
  <addressLine>UK</addressLine>
  <phone type="office">+44 200 1234 567</phone>
  <phone type="fax">+44 200 1234 568</phone>
  <phone type="mobile">+44 200 1234 569</phone>
 </contact>
 ...
</records>

If we place a deltaxml:ordered="false" attribute on the contact element, then all of its child elements will be treated as orderless. Here, however, the addressLine elements are ordered and only the phone elements are orderless, so adding this attribute to contact will not work.

2. Managing ordered and orderless sub-elements

There are a number of ways to handle a combination of ordered and orderless sub-elements correctly. One is to create a new element to contain all the phone elements and assign deltaxml:ordered="false" to that element. The drawback of this solution is that it introduces artificial elements into the input files, which then need to be stripped out after the comparison.
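
As an illustration of the container approach, the contact record might look like the sketch below before comparison. The wrapper name phones is hypothetical (any element name not otherwise used in the data would do), and the namespace URI bound to the deltaxml prefix is assumed to be the one used by your DeltaXML release; check it against your installation.

```xml
<contact xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">
 <name>John Smith</name>
 <addressLine>25 Green Lane</addressLine>
 <addressLine>London</addressLine>
 <addressLine>UK</addressLine>
 <!-- Hypothetical wrapper: added before comparison, removed afterwards. -->
 <phones deltaxml:ordered="false">
  <phone type="office">+44 200 1234 567</phone>
  <phone type="fax">+44 200 1234 568</phone>
  <phone type="mobile">+44 200 1234 569</phone>
 </phones>
</contact>
```

Only the children of phones are treated as orderless; the addressLine elements remain direct children of contact and keep their order.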

Another method is simply to sort these elements before the comparison, based on some suitable value. In this case, the type attribute provides such a value. This can be achieved with the following XSLT code.

<xsl:template match="phone[position()=1]">
  <!-- At the first phone, output all sibling phones sorted by type. -->
  <xsl:for-each select="../phone">
    <xsl:sort select="@type"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:for-each>
</xsl:template>

<xsl:template match="phone[position()>1]"/>

What is happening here is that as soon as the first phone element is found, all phone elements are collected together, sorted and output in the sorted order. The second template ensures that phone elements are not duplicated in the output file.
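
The two templates rely on something else in the stylesheet copying attributes, text and all other elements. A minimal sketch of a complete stylesheet, assuming XSLT 1.0 and a standard identity template (neither is stated in the text above), could look like this:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (an assumption): copies everything that no
       more specific template matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- First phone of each parent: emit all sibling phones, sorted. -->
  <xsl:template match="phone[position()=1]">
    <xsl:for-each select="../phone">
      <xsl:sort select="@type"/>
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:for-each>
  </xsl:template>

  <!-- Later phones were already output above, so suppress them. -->
  <xsl:template match="phone[position()>1]"/>

</xsl:stylesheet>
```

Applied to the sample contact, this should emit the phone elements in the order fax, mobile, office, at the position of the original first phone element and after the unchanged addressLine elements.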

We could also choose to make the type attribute into a key, so that phone elements with the same type will always be compared with each other. This is a good idea if the type value is unique within the contact element. This is achieved as follows:

<xsl:template match="phone[position()=1]">
  <xsl:for-each select="../phone">
    <xsl:sort select="@type"/>
    <!-- Copy the current phone, using its type attribute as the key. -->
    <xsl:call-template name="copy-and-add-key">
      <xsl:with-param name="primary-key" select="@type"/>
    </xsl:call-template>
  </xsl:for-each>
</xsl:template>

And the template for copying and adding the key would be:

<!-- The deltaxml prefix must be declared on the xsl:stylesheet element. -->
<xsl:template name="copy-and-add-key">
  <xsl:param name="primary-key"/>
  <xsl:variable
      name="normalized-primary-key"
      select="normalize-space($primary-key)"/>
  <xsl:copy>
    <!-- Add the key only when the normalized value is non-empty. -->
    <xsl:if test="$normalized-primary-key">
      <xsl:attribute name="deltaxml:key">
        <xsl:value-of select="$normalized-primary-key"/>
      </xsl:attribute>
    </xsl:if>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
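
For the sample contact, the keyed and sorted result would look roughly like the sketch below. This is an illustration rather than captured output: attribute order and the placement of the namespace declaration depend on the XSLT processor (it will typically appear on each phone element rather than on contact), and the namespace URI is assumed to be the one your DeltaXML release expects.

```xml
<contact xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">
 <name>John Smith</name>
 <addressLine>25 Green Lane</addressLine>
 <addressLine>London</addressLine>
 <addressLine>UK</addressLine>
 <!-- Phones are now sorted by type and each carries its type as a key. -->
 <phone deltaxml:key="fax" type="fax">+44 200 1234 568</phone>
 <phone deltaxml:key="mobile" type="mobile">+44 200 1234 569</phone>
 <phone deltaxml:key="office" type="office">+44 200 1234 567</phone>
</contact>
```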

Many variations of this are possible. In this example, all phone elements are sorted but we could select only those in the contact element. Another variation would be to have a secondary key to be used if the primary one was not present.
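
The secondary-key variation could be sketched as follows. The secondary-key parameter is hypothetical and not part of the template shown earlier; the fragment is meant to replace copy-and-add-key inside a stylesheet that already declares the xsl and deltaxml prefixes.

```xml
<xsl:template name="copy-and-add-key">
  <xsl:param name="primary-key"/>
  <!-- Hypothetical fallback, used only when the primary key is empty. -->
  <xsl:param name="secondary-key"/>
  <xsl:variable name="normalized-primary-key"
                select="normalize-space($primary-key)"/>
  <xsl:variable name="normalized-secondary-key"
                select="normalize-space($secondary-key)"/>
  <xsl:copy>
    <xsl:choose>
      <xsl:when test="$normalized-primary-key">
        <xsl:attribute name="deltaxml:key">
          <xsl:value-of select="$normalized-primary-key"/>
        </xsl:attribute>
      </xsl:when>
      <xsl:when test="$normalized-secondary-key">
        <xsl:attribute name="deltaxml:key">
          <xsl:value-of select="$normalized-secondary-key"/>
        </xsl:attribute>
      </xsl:when>
      <!-- Neither key present: copy the element without a key. -->
    </xsl:choose>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
```

A caller might pass the phone number itself as the fallback, for example by adding an xsl:with-param with name="secondary-key" and select="." next to the primary-key parameter. To restrict the sorting to phones inside contact elements, the match patterns could be narrowed to contact/phone[position()=1] and contact/phone[position()>1].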

The advantage of this approach is that minimal changes are made to the input file, so the differences generated are easier to understand.