Comparing XML Schema

1. Introduction

DeltaXML is capable of comparing XML files that have some of their elements 'unordered', i.e. the elements in the two files being compared may appear in any order (referred to as an orderless comparison). Specific attributes need to be added to the files to achieve this, and one convenient way to do this is to use XSL.

This "how to" guide uses this ability to compare XML Schema definition files. It steps through a number of worked examples based on XML Schemas so that you can see how to use XSL stylesheets to automatically add keys to your XML Schema to allow it to be processed using the DeltaXML Core differencing engine.

For further details of how the orderless comparison works in DeltaXML, please see How to Compare Orderless Elements.

1.1. White space in examples

Most of the examples in this paper are pretty-printed, so the white space is present for display purposes only and is not part of the XML files. It is good practice when comparing files to remove all insignificant whitespace first, e.g. using an XSL script.
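A whitespace-removal script of this kind can be sketched as follows. This is a minimal XSLT 1.0 illustration rather than the script shipped with DeltaXML: it drops every whitespace-only text node and copies everything else unchanged.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop whitespace-only text nodes from every element in the input -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy all other nodes and attributes unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Only text nodes that consist entirely of whitespace are removed; text that contains other characters keeps its leading and trailing spaces. If whitespace inside elements such as <documentation> matters to you, an xsl:preserve-space declaration for those elements may be more appropriate.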

2. Worked example using XML Schema

Our intention is to provide an XSL filter that will add the required attributes for a specific type of XML file. As an example, we will use XML Schema and develop an XSL filter to add keys to any XML Schema file so that DeltaXML will be able to compare two XML Schema files where, for example, the complex type definitions may appear in any order. In addition, after we have added these attributes, DeltaXML will be able to determine that changes in the order of items in a <choice> element are not important and can be ignored. In this way, using a simple XSL filter, DeltaXML can be configured to provide a very intelligent comparison of XML Schema files.

The first step is to determine where keys need to be added. By default, XML data is taken to be ordered, so we need to indicate to DeltaXML the cases where the elements are not ordered. We therefore begin by examining the DTD or Schema to find where this is the case (do not get confused here: as we are using XML Schema as an example, this means we will look at the DTD for XML Schema or the Schema for XML Schema). We will use the DTD in this case as it is more compact, but the same principles would apply if a Schema had been used. Unfortunately, Schema does not have any way to indicate whether particular elements should be ordered or not, so it is necessary to use domain knowledge to determine this.

We can examine the elements in alphabetical order, having expanded out all the entities so that we see how the real content of the element is defined. In the examples below, some analysis has already been done to determine which elements or content particles are unordered.

2.1. <all> element

Example 1. Definition of the 'all' element

<!ELEMENT all
(annotation?, element*)>

In the case of the <all> element, its contents consist of a single optional annotation and zero or more <element> elements. The <element> elements are not ordered - they could appear in any order and would have the same meaning.

This means that for every occurrence of an <all> element in our schema files being compared, we need to add a deltaxml:ordered="false" attribute. This can be achieved in XSL as follows.

Example 2. Adding a deltaxml:ordered="false" attribute to all <all> elements

<xsl:template match="xsd:all">
  <xsl:copy>
    <!-- add a deltaxml:ordered="false" attribute, then copy all other attributes -->
    <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>
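The template in Example 2 relies on namespace declarations and on an identity template that copies everything it does not match explicitly. A minimal sketch of a complete stylesheet around it is shown here, assuming XSLT 1.0; the deltaxml namespace URI is the one that appears in the filtered output in Example 28.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template: copies any node or attribute that no other
       template matches more specifically -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Mark every <all> element as unordered. The new attribute is
       created before the existing attributes and children are copied -->
  <xsl:template match="xsd:all">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The templates in the remaining examples can be added to this skeleton as they are developed.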

The single <annotation> element should appear first, but we cannot enforce this with DeltaXML. We can however ensure that the annotation elements are matched up by giving them the same key. This can be achieved by adding a template in the XSL to match the <annotation> element (there will always be at most one of these) within an <all> element and add this key, as follows.

Example 3. Adding a deltaxml:key="single" attribute to <annotation> within <all>

<xsl:template match="xsd:all/xsd:annotation">
  <xsl:copy>
    <!-- add a deltaxml:key="single" attribute, 
         then copy all other attributes -->
    <xsl:attribute name="deltaxml:key">single</xsl:attribute>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>

Notice here that we do not always want to do this because annotation is in some cases allowed to occur many times, so this template matches only <annotation> elements within <all> elements. The reason for using the value "single" for the key is simply that it enables us to re-use this template for other similar situations, as you will see later.

Next, we need to add deltaxml:key attributes for the <element> elements within <all>. The key to these will be either the @ref (the attribute named "ref") or the @name attribute. This means that the value of the deltaxml:key attribute needs to be a copy of the value of whichever of these attributes is present. As it is illegal for both @name and @ref to appear together, we can represent this in XSL as follows.

Example 4. Adding a deltaxml:key="XX" attribute to <element> within <all>

<xsl:template match="xsd:all/xsd:element"> 
  <xsl:copy> 
    <xsl:choose>
      <xsl:when test="@name">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
      </xsl:when>
      <xsl:when test="@ref">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@ref"/></xsl:attribute>
      </xsl:when>
    </xsl:choose>
    <xsl:apply-templates select="@* | node()"/> 
  </xsl:copy> 
</xsl:template> 

We will see later that there are other situations where we need to do the same thing, and this can be achieved by changing the match attribute on this template.

2.2. <annotation> element

The next element that has repeated items in its content is <annotation>.

Example 5. Definition of <annotation>

<!ELEMENT annotation
  (appinfo | documentation)*>

In this case the content is ordered, so we do not need to add any deltaxml:ordered attribute to this element.

2.3. <appinfo> element

Example 6. Definition of <appinfo>

<!ELEMENT appinfo ANY>

The <appinfo> element has ANY content which allows both text and elements. This is considered to be ordered and so no deltaxml:ordered attribute needs to be added.

2.4. <attributeGroup> element

The <attributeGroup> element is a little more complicated.

Example 7. Definition of <attributeGroup>

<!ELEMENT attributeGroup  
 (annotation?,  
  (attribute | attributeGroup)*, anyAttribute?)> 
 For CP:  
 (attribute | attributeGroup)* 
  This CP is not ordered. 
  Key for element 'attribute': @name @ref 
  Key for element 'attributeGroup': @name @ref 

Here we have an optional <annotation> element and an optional <anyAttribute> element in addition to the repeated <attribute> and <attributeGroup> elements. Because some of the repeated items are unordered we need to indicate that the <attributeGroup> element is unordered, by adding a deltaxml:ordered="false" attribute to it.

We can treat the optional <annotation> element in the same way as before, but we do not need to create a new template; we can simply change the match on the existing one. In fact, we can change it at the same time to cater also for the <anyAttribute> element in this situation.

Example 8. Modifying the template match to add deltaxml:key="single" attribute

<xsl:template match="xsd:all/xsd:annotation | xsd:attributeGroup/xsd:annotation | 
 xsd:attributeGroup/xsd:anyAttribute"> 
  <xsl:copy> 
    <!-- add a deltaxml:key="single" attribute, then copy all other attributes --> 
    <xsl:attribute name="deltaxml:key">single</xsl:attribute> 
    <xsl:apply-templates select="@* | node()"/> 
  </xsl:copy> 
</xsl:template> 

The <attribute> and <attributeGroup> elements have keys of @name or @ref in the same way as the <element> element above. So, again, we can change the match value to cater for these situations also.

Example 9. Modifying the template match to add deltaxml:key="XX" attribute

<xsl:template match="xsd:all/xsd:element | xsd:attributeGroup/xsd:attribute |  
                     xsd:attributeGroup/xsd:attributeGroup"> 
  <xsl:copy> 
    <xsl:choose>
      <xsl:when test="@name">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
      </xsl:when>
      <xsl:when test="@ref">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@ref"/></xsl:attribute>
      </xsl:when>
    </xsl:choose>
    <xsl:apply-templates select="@* | node()"/> 
  </xsl:copy> 
</xsl:template> 

However, there is a problem here and the above will not work correctly. The problem is that an <attributeGroup> element may be matched by two templates now, depending on where it appears. If it appears within another <attributeGroup> element, it will match the second template, otherwise it will match the first. But we need to apply a deltaxml:ordered="false" attribute in both cases, and this is not done in the second case.

The solution is to split the second template into two, one for orderless elements which will add the deltaxml:ordered="false" attribute, the other for ordered elements which will not do this.

Example 10. Template to add deltaxml:key="XX" attribute for orderless elements

<xsl:template match="xsd:all/xsd:element |  
                     xsd:attributeGroup/xsd:attributeGroup"> 
  <xsl:copy> 
    <xsl:attribute name="deltaxml:ordered">false</xsl:attribute> 
    <xsl:choose>
      <xsl:when test="@name">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
      </xsl:when>
      <xsl:when test="@ref">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@ref"/></xsl:attribute>
      </xsl:when>
    </xsl:choose>
    <xsl:apply-templates select="@* | node()"/> 
  </xsl:copy>
</xsl:template> 

Example 11. Template to add deltaxml:key="XX" attribute for ordered elements

<xsl:template match="xsd:attributeGroup/xsd:attribute"> 
  <xsl:copy> 
    <xsl:choose>
      <xsl:when test="@name">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
      </xsl:when>
      <xsl:when test="@ref">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@ref"/></xsl:attribute>
      </xsl:when>
    </xsl:choose>  
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template> 

Now the templates should behave as expected.
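Because the same xsl:choose block is repeated in each of these templates, one possible refinement is to move it into a named template. The sketch below is an illustrative XSLT 1.0 alternative, not part of the original filter; the template name add-name-or-ref-key is a hypothetical name chosen for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical helper: adds deltaxml:key from @name or, failing
       that, from @ref. Adds nothing if neither attribute is present -->
  <xsl:template name="add-name-or-ref-key">
    <xsl:choose>
      <xsl:when test="@name">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
      </xsl:when>
      <xsl:when test="@ref">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@ref"/></xsl:attribute>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

  <!-- Orderless elements: mark as unordered, then add the key -->
  <xsl:template match="xsd:all/xsd:element |
                       xsd:attributeGroup/xsd:attributeGroup">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:call-template name="add-name-or-ref-key"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Ordered elements: add the key only -->
  <xsl:template match="xsd:attributeGroup/xsd:attribute">
    <xsl:copy>
      <xsl:call-template name="add-name-or-ref-key"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

xsl:call-template does not change the current node, so the tests on @name and @ref inside the helper still refer to the element being copied.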

2.5. <choice> element

The <choice> element has <annotation> as a child and this can be dealt with as before.

Example 12. Definition of <choice>

<!ELEMENT choice  
 (annotation?,  
  (element | group | choice | sequence | any)*)> 
 For CP:  
 (element | group | choice | sequence | any)* 
  This CP is not ordered. 
  Key for element 'any': @id 
  Key for element 'choice': @id 
  Key for element 'element': @name @ref 
  Key for element 'group': @name @ref 
  Key for element 'sequence': @id 

The repeating <element> and <group> elements can also make use of previously-defined templates. The other repeating items can be identified by their @id attribute. As this is of type ID it will be unique across the whole file, which means it will also be unique within this element and so can be used as a key. The following new templates will achieve this. We need two templates to cater for ordered and unordered elements as discussed above.

Example 13. Templates to add deltaxml:key attribute using @id value

<xsl:template match="xsd:choice/xsd:any | xsd:choice/xsd:sequence">
  <xsl:copy>
    <xsl:if test="@id">
      <xsl:attribute name="deltaxml:key"><xsl:value-of select="@id"/></xsl:attribute>
    </xsl:if>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template> 
 
<xsl:template match="xsd:choice/xsd:choice"> 
  <xsl:copy>
    <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
    <xsl:if test="@id">
      <xsl:attribute name="deltaxml:key"><xsl:value-of select="@id"/></xsl:attribute>
    </xsl:if>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template> 

2.6. <complexType> element

The <complexType> element has a more complex structure.

Example 14. Definition of <complexType>

<!ELEMENT complexType  
 (annotation?,  
  (simpleContent | complexContent |  
   ((all | choice | sequence | group)?,  
    (attribute | attributeGroup)*, anyAttribute?)))> 
 For CP:  
 (attribute | attributeGroup)* 
  This CP is not ordered. 
  Key for element 'attribute': @name @ref 
  Key for element 'attributeGroup': @name @ref 

The repeating group can be handled using existing templates. The other items occur as single elements and so again can make use of an existing template.

2.7. <documentation> element

The <documentation> element has ANY content and is ordered. Nothing needs to be added to the XSL filter.

Example 15. Definition of <documentation>

<!ELEMENT documentation ANY> 
 ANY content, always ordered. 

2.8. <element> element

The repeating items in <element> are unordered.

Example 16. Definition of <element>

<!ELEMENT element  
 (annotation?,  
  (complexType | simpleType)?,  
  (unique | key | keyref)*)> 
 For CP:  
 (unique | key | keyref)* 
  This CP is not ordered. 
  Key for element 'key': @name 
  Key for element 'keyref': @name 
  Key for element 'unique': @name 

We need a new template to cater for these three items and we need to amend the existing 'single' element template to add the <annotation>, <complexType> and <simpleType> to it.

Example 17. Templates to add deltaxml:key attribute using @name value

<xsl:template match="xsd:element/xsd:unique"> 
  <xsl:copy>
    <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
    <xsl:if test="@name">
      <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
    </xsl:if>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template> 
 
<xsl:template match="xsd:element/xsd:key | xsd:element/xsd:keyref"> 
  <xsl:copy>
    <xsl:if test="@name">
      <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
    </xsl:if>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template> 
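The amendment to the 'single' template mentioned above is not shown in the examples, so the following is only a sketch of how it might look in XSLT 1.0. It extends the match from Example 8 and, following the filtered output in Example 28, handles <complexType> in a separate template so that it also receives deltaxml:ordered="false".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Children that occur at most once and are themselves ordered -->
  <xsl:template match="xsd:all/xsd:annotation |
                       xsd:attributeGroup/xsd:annotation |
                       xsd:attributeGroup/xsd:anyAttribute |
                       xsd:element/xsd:annotation |
                       xsd:element/xsd:simpleType">
    <xsl:copy>
      <xsl:attribute name="deltaxml:key">single</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- <complexType> occurs at most once within <element>, but its own
       content is unordered, so it needs both attributes -->
  <xsl:template match="xsd:element/xsd:complexType">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:attribute name="deltaxml:key">single</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

This is the same split between ordered and orderless elements that was needed for the @name and @ref keys in Examples 10 and 11.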

2.9. <extension> element

For the <extension> element, we can modify existing templates in a similar way to changes for <complexType>.

Example 18. Definition of <extension>

<!ELEMENT extension  
 ((all | choice | sequence | group)?,  
  (attribute | attributeGroup)*, anyAttribute?)> 
 For CP:  
 (attribute | attributeGroup)* 
  This CP is not ordered. 
  Key for element 'attribute': @name @ref 
  Key for element 'attributeGroup': @name @ref 

2.10. <key> element

The <key> element is ordered, so no changes are required to the template.

Example 19. Definition of <key>

<!ELEMENT key  
 (annotation?, selector, field+)> 
 For CP: field+ 
  This CP is ordered. 

2.11. <keyref> element

The <keyref> element is ordered, so no changes are needed to the XSL filter.

Example 20. Definition of <keyref>

<!ELEMENT keyref  
 (annotation?, selector, field+)> 
 For CP: field+ 
  This CP is ordered. 

2.12. <redefine> element

The <redefine> element is not ordered. The <annotation> element has no key and this means that any changes to an <annotation> element will result in a delete and an add rather than a modify. In general it is good practice to ensure that all unordered elements have some key to avoid this problem.

Example 21. Definition of <redefine>

<!ELEMENT redefine  
 (annotation | simpleType | complexType | attributeGroup | group)*> 
 For CP:  
 (annotation | simpleType | complexType | attributeGroup | group)* 
  This CP is not ordered. 
  Key for element 'attributeGroup': @name 
  Key for element 'complexType': @name 
  Key for element 'group': @name 
  Key for element 'simpleType': @name 
  No key for these elements: annotation 

The key @name can make use of the existing template that adds this key.

2.13. <restriction> element

The repeated items in <restriction> are not ordered, but there are only keys for two of the elements. Any changes to other elements will be handled as a delete and an add.

Example 22. Definition of <restriction>

<!ELEMENT restriction  
 (annotation?,  
  (all | choice | sequence | group |  
   (simpleType?,  
    (minInclusive | minExclusive | maxInclusive | maxExclusive | totalDigits | 
          fractionDigits | pattern | enumeration | whiteSpace | length | 
          maxLength | minLength)*))?,  
  (attribute | attributeGroup)*, anyAttribute?)> 
 For CP:  
 (minInclusive | minExclusive | maxInclusive | maxExclusive | totalDigits | 
          fractionDigits | pattern | enumeration | whiteSpace | length | 
          maxLength | minLength)* 
  This CP is not ordered. 
  No key for these elements: enumeration fractionDigits length maxExclusive 
          maxInclusive maxLength minExclusive minInclusive minLength pattern
          totalDigits whiteSpace 
 For CP:  
 (attribute | attributeGroup)* 
  This CP is not ordered. 
  Key for element 'attribute': @name @ref 
  Key for element 'attributeGroup': @name @ref 

2.14. <schema> element

The <schema> element has a rather convoluted DTD specification due to the wish to have <annotation> at any position. However, apart from <annotation>, most of the repeated elements have keys. It is not possible with DeltaXML to preserve the order of the <annotation> elements, but this is generally not important because <annotation> is permitted within each of the other elements.

Example 23. Definition of <schema>

<!ELEMENT schema  
 ((include | import | redefine | annotation)*,  
  ((simpleType | complexType | element | attribute | attributeGroup | group | notation),
          annotation*)*)> 
 For CP:  
 (include | import | redefine | annotation)* 
  This CP is not ordered. 
  Key for element 'import': @namespace 
  Key for element 'include': @schemaLocation 
  Key for element 'redefine': @schemaLocation 
  No key for these elements: annotation 
 For CP:  
 ((simpleType | complexType | element | attribute | attributeGroup | group | notation),
          annotation*)* 
  This CP is not ordered. 
  Key for element 'attribute': @name 
  Key for element 'attributeGroup': @name 
  Key for element 'complexType': @name 
  Key for element 'element': @name 
  Key for element 'group': @name 
  Key for element 'notation': @name 
  Key for element 'simpleType': @name 
 For CP: annotation* 
  This CP is ordered. 
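The keys listed for the first content particle of <schema> could be added along the following lines. This is a sketch in XSLT 1.0 based only on the analysis above, not the template set from the full filter; the named top-level components (keyed by @name) would be handled by extending the @name templates in the same way.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The children of <schema> may appear in any order -->
  <xsl:template match="xsd:schema">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- <import> is keyed by @namespace when that attribute is present -->
  <xsl:template match="xsd:schema/xsd:import">
    <xsl:copy>
      <xsl:if test="@namespace">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@namespace"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- <include> is keyed by @schemaLocation -->
  <xsl:template match="xsd:schema/xsd:include">
    <xsl:copy>
      <xsl:if test="@schemaLocation">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@schemaLocation"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- <redefine> is keyed by @schemaLocation and is itself unordered -->
  <xsl:template match="xsd:schema/xsd:redefine">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:if test="@schemaLocation">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@schemaLocation"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```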

2.15. <sequence> element

The <sequence> element is ordered, so no changes are needed to the XSL filter.

Example 24. Definition of <sequence>

<!ELEMENT sequence  
 (annotation?,  
  (element | group | choice | sequence | any)*)> 
 For CP:  
 (element | group | choice | sequence | any)* 
  This CP is ordered. 

2.16. <union> element

The <union> element is not ordered and we can make simple modifications to existing templates to cater for this.

Example 25. Definition of <union>

<!ELEMENT union  
 (annotation?, simpleType*)> 
 For CP: simpleType* 
  This CP is not ordered. 
  Key for element 'simpleType': @name 

2.17. <unique> element

The <unique> element is not ordered and a new template is needed to cater for the @xpath key.

Example 26. Definition of <unique>

<!ELEMENT unique  
 (annotation?, selector, field+)> 
 For CP: field+ 
  This CP is not ordered. 
  Key for element 'field': @xpath 
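A template for the @xpath key might look like the following sketch (XSLT 1.0, written for illustration rather than taken from the full filter). It assumes that the enclosing <unique> element has already been given deltaxml:ordered="false", as is done for xsd:element/xsd:unique in Example 17.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Key each <field> within <unique> by the value of its @xpath -->
  <xsl:template match="xsd:unique/xsd:field">
    <xsl:copy>
      <xsl:if test="@xpath">
        <xsl:attribute name="deltaxml:key"><xsl:value-of select="@xpath"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The single <annotation> and <selector> children of <unique> could be given the "single" key by extending the match of the template in Example 8.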

3. Using the XSL filter

We can now look at the effect of this filter by considering some examples. File t1a.xml is a simple (and not correct or complete) Schema file.

Example 27. File t1a.xml

<?xml version='1.0'?> 
<schema xmlns='http://www.w3.org/2001/XMLSchema'> 
 <element name='test1'> 
  <complexType> 
    <all> 
    <annotation> 
    <documentation>Some documentation</documentation> 
    </annotation> 
     <element ref='A' minOccurs='1' maxOccurs='1'/> 
     <element ref='B' minOccurs='1' maxOccurs='1'/> 
     <element ref='C' minOccurs='1' maxOccurs='1'/> 
    </all> 
  </complexType> 
 </element> 
 <element name='test2'> 
  <complexType> 
    <sequence> 
     <element ref='A'/> 
     <element ref='B'/> 
     <element ref='C'/> 
    </sequence> 
  </complexType> 
 </element> 
 <element name='test3'> 
  <complexType> 
    <choice> 
     <element ref='A' /> 
     <element ref='B' /> 
     <element ref='C' /> 
    </choice> 
  </complexType> 
 </element> 
</schema> 

It is worth looking at the same file after it has passed through the XSL input filter we have developed. This is shown below.

Example 28. File t1a.xml after it has been filtered with XSL (pretty-printed)

<?xml version="1.0" encoding="UTF-8"?> 
<schema xmlns="http://www.w3.org/2001/XMLSchema" 
        xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1" 
        deltaxml:ordered="false"> 
  <element deltaxml:ordered="false" deltaxml:key="test1" name="test1"> 
    <complexType deltaxml:ordered="false" deltaxml:key="single"> 
      <all deltaxml:ordered="false" deltaxml:key="single"> 
        <annotation deltaxml:key="single"> 
          <documentation> 
            Some documentation 
          </documentation> 
        </annotation> 
        <element deltaxml:key="A" ref="A" minOccurs="1" maxOccurs="1" /> 
        <element deltaxml:key="B" ref="B" minOccurs="1" maxOccurs="1" /> 
        <element deltaxml:key="C" ref="C" minOccurs="1" maxOccurs="1" /> 
      </all> 
    </complexType> 
  </element> 
  <element deltaxml:ordered="false" deltaxml:key="test2" name="test2"> 
    <complexType deltaxml:ordered="false" deltaxml:key="single"> 
      <sequence deltaxml:key="single"> 
        <element ref="A" /> 
        <element ref="B" /> 
        <element ref="C" /> 
      </sequence> 
    </complexType> 
  </element> 
  <element deltaxml:ordered="false" deltaxml:key="test3" name="test3"> 
    <complexType deltaxml:ordered="false" deltaxml:key="single"> 
      <choice deltaxml:ordered="false" deltaxml:key="single"> 
        <element deltaxml:key="A" ref="A" /> 
        <element deltaxml:key="B" ref="B" /> 
        <element deltaxml:key="C" ref="C" /> 
      </choice> 
    </complexType> 
  </element> 
</schema> 

This shows how the deltaxml:ordered and deltaxml:key attributes have been added to the file.

We can now make some changes to this file which are not 'real' changes and which we would like to be ignored by the comparator. First, we can change the order of the <element> definitions. Second, we can change the order of the elements in the <choice> in "test3". The new file is shown below.

Example 29. File t1b.xml

<?xml version='1.0'?> 
<schema xmlns='http://www.w3.org/2001/XMLSchema'> 
 <element name='test3'> 
  <complexType> 
    <choice> 
     <element ref='B' /> 
     <element ref='C' /> 
     <element ref='A' /> 
    </choice> 
  </complexType> 
 </element> 
 <element name='test2'> 
  <complexType> 
    <sequence> 
     <element ref='A'/> 
     <element ref='B'/> 
     <element ref='C'/> 
    </sequence> 
  </complexType> 
 </element> 
 <element name='test1'> 
  <complexType> 
    <all> 
    <annotation> 
    <documentation>Some documentation</documentation> 
    </annotation> 
     <element ref='A' minOccurs='1' maxOccurs='1'/> 
     <element ref='B' minOccurs='1' maxOccurs='1'/> 
     <element ref='C' minOccurs='1' maxOccurs='1'/> 
    </all> 
  </complexType> 
 </element> 
</schema> 

If we do not use the XSL filter we have developed, we get a large number of changes as shown below (the white space has been normalized before the comparison process).

Example 30. Comparison without using the XSL filter

<schema deltaxml:deltaV2="A!=B" deltaxml:version="2.0" deltaxml:content-type="changes-only">
  <element deltaxml:deltaV2="A!=B">
    <deltaxml:attributes deltaxml:deltaV2="A!=B">
      <dxa:name deltaxml:deltaV2="A!=B">
        <deltaxml:attributeValue deltaxml:deltaV2="A">
          test1
        </deltaxml:attributeValue>
        <deltaxml:attributeValue deltaxml:deltaV2="B">
          test3
        </deltaxml:attributeValue>
      </dxa:name>
    </deltaxml:attributes>
    <complexType deltaxml:deltaV2="A!=B">
      <all deltaxml:deltaV2="A">
        <annotation>
          <documentation>
            Some documentation
          </documentation>
        </annotation>
        <element ref="A" minOccurs="1" maxOccurs="1" />
        <element ref="B" minOccurs="1" maxOccurs="1" />
        <element ref="C" minOccurs="1" maxOccurs="1" />
      </all>
      <choice deltaxml:deltaV2="B">
        <element ref="B" />
        <element ref="C" />
        <element ref="A" />
      </choice>
    </complexType>
  </element>
  <element deltaxml:deltaV2="A=B" />
  <element deltaxml:deltaV2="A!=B">
    <deltaxml:attributes deltaxml:deltaV2="A!=B">
      <dxa:name deltaxml:deltaV2="A!=B">
        <deltaxml:attributeValue deltaxml:deltaV2="A">
          test3
        </deltaxml:attributeValue>
        <deltaxml:attributeValue deltaxml:deltaV2="B">
          test1
        </deltaxml:attributeValue>
      </dxa:name>
    </deltaxml:attributes>
    <complexType deltaxml:deltaV2="A!=B">
      <choice deltaxml:deltaV2="A">
        <element ref="A" />
        <element ref="B" />
        <element ref="C" />
      </choice>
      <all deltaxml:deltaV2="B">
        <annotation>
          <documentation>
            Some documentation
          </documentation>
        </annotation>
        <element ref="A" minOccurs="1" maxOccurs="1" />
        <element ref="B" minOccurs="1" maxOccurs="1" />
        <element ref="C" minOccurs="1" maxOccurs="1" />
      </all>
    </complexType>
  </element>
</schema> 

If we now use the new XSL filter, then the result is as shown below.

Example 31. Comparison using the XSL filter (no changes as expected)

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
<schema deltaxml:deltaV2="A=B" deltaxml:version="2.0" deltaxml:content-type="changes-only" /> 

This shows no changes, as expected, because all we have done is to change the order of some of the elements where this is not significant.

If any 'real' changes are made these will be seen in the result.

4. Summary of steps for XSL template definition

The worked example above shows how an XSL template can be developed for a specific XML structure. These steps can be summarized as follows.

4.1. Expand DTD or Schema

It is first necessary to expand any entities or Schema structures that are used in the DTD or schema so that the real structure can be seen. If this step is not done it is likely that repeating content particles will not be correctly identified.

4.2. Identify repeating content particles

Each repeating content particle should be identified.

4.3. Determine which content particles are unordered

For each identified content particle, determine whether it is ordered or not. Note that MIXED and ANY content is always considered ordered because of the presence of PCDATA.

4.4. For each unordered content particle, add deltaxml:ordered attribute

For any element that has an unordered repeating content particle within it, a deltaxml:ordered="false" attribute needs to be added. Only a single template is needed for this but remember to add this attribute also in any other template that handles these elements.

4.5. For each unordered content particle, determine keys

For each unordered content particle, determine, for each element that is repeated, the keys that are suitable to identify the element. The key may be a constant value, a single attribute, a combination of attributes or the content of a child element. With XSL it is possible to cater for complex keys because the full power of the language is available to set up the deltaxml:key attribute.
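The two more complex kinds of key can be sketched as follows in XSLT 1.0. The element and attribute names used here (contacts, person, firstName, lastName, orders, order, id) are hypothetical and are not part of XML Schema or of the worked example; they serve only to show the technique.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical containers whose children may appear in any order -->
  <xsl:template match="contacts | orders">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Key built from a combination of two attributes. The separator
       keeps different name pairs from producing the same joined string -->
  <xsl:template match="contacts/person">
    <xsl:copy>
      <xsl:attribute name="deltaxml:key">
        <xsl:value-of select="concat(@lastName, '|', @firstName)"/>
      </xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Key taken from the text content of a child element -->
  <xsl:template match="orders/order">
    <xsl:copy>
      <xsl:attribute name="deltaxml:key">
        <xsl:value-of select="normalize-space(id)"/>
      </xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```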

4.6. Check the behaviour of the script

The worked example above discusses some of the issues in developing the XSL stylesheet to behave correctly. A set of test data examples will ensure that the effect that you require has been achieved. As XSL uses a pattern-matching mechanism, it is easy to get the wrong effects if a template is selected that you had not intended.
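One way to reduce the risk of an unintended template being selected is to state the priority of overlapping templates explicitly. The sketch below (XSLT 1.0) revisits the <attributeGroup> case from section 2.4. The priority values are an assumption added for illustration: under the default XSLT 1.0 rules the more specific pattern would already be preferred (default priority 0.5 against 0), but writing the values out makes the intent visible to the next reader.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- General rule: any <attributeGroup> is unordered -->
  <xsl:template match="xsd:attributeGroup" priority="1">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Specific rule: a nested <attributeGroup> is unordered and keyed.
       The higher priority means this template wins when both match -->
  <xsl:template match="xsd:attributeGroup/xsd:attributeGroup" priority="2">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered">false</xsl:attribute>
      <xsl:choose>
        <xsl:when test="@name">
          <xsl:attribute name="deltaxml:key"><xsl:value-of select="@name"/></xsl:attribute>
        </xsl:when>
        <xsl:when test="@ref">
          <xsl:attribute name="deltaxml:key"><xsl:value-of select="@ref"/></xsl:attribute>
        </xsl:when>
      </xsl:choose>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Running the filter over a small test file and inspecting the added attributes, as in Example 28, remains the most direct check that each element was handled by the template you intended.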

5. Conclusions

This paper shows how to develop an XSL filter to make DeltaXML work intelligently for a particular XML format. Simple keys can easily be represented using modifications of the worked example above. For more complex situations, the full power of XSL is available to configure the input files with additional attributes to drive DeltaXML in special ways according to the ordering and keys of elements.

The additional attributes can be stripped out using an output filter or left in for further processing. For any re-combination with the original files, the attributes will need to be present in both files.
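An output filter that strips the added attributes could be sketched as follows. This is an illustrative XSLT 1.0 stylesheet, not the filter supplied with DeltaXML, and it removes only the two attributes that the input filter in this paper adds.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template: copy everything that is not matched below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An empty template produces no output, so these two attributes
       are dropped wherever they occur -->
  <xsl:template match="@deltaxml:key | @deltaxml:ordered"/>

</xsl:stylesheet>
```

Any other attributes in the deltaxml namespace are left in place, and the namespace declaration itself may still appear on the copied elements.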

The full XSLT filter is provided in the examples that are in the evaluation code download.