Comparing Orderless Elements

1. Introduction

1.1. Overview

Any element in an XML file can be declared as a container for orderless elements. DeltaXML will match up and compare the child elements of such a container regardless of the order in which they appear. What counts about orderless elements is membership in a set, not relative position. Arbitrary rearrangement leaves the meaning of the set unaffected.

To summarise, for comparing orderless elements, we are concerned with two types of element:

Orderless Container
This is an element that has child elements arranged in an order that is not considered significant.
Orderless Element
This is an element whose position relative to its siblings is not regarded as significant because its parent element is an orderless container.
In the example below the ’addressList’ element is the ‘orderless container’, whereas the ‘person’ elements within ‘addressList’ are ‘orderless elements’.

1.2. Using Keys

To detect changes in sets or orderless elements, DeltaXML must correlate matching elements in spite of rearrangements. Keys help DeltaXML perform this correlation. Keys remove any need for sorting. (Orderless comparisons have been supported since version 2.0 of DeltaXML.)

The special-purpose attributes called keys simplify DeltaXML comparisons in many situations. The structure of XML is not completely rigid; as with machine parts, a little flexibility lends strength. For instance XML permits both ordered and orderless data. Keys help DeltaXML track changes in both categories. Keys increase both the quality of the matches made and the runtime speed of DeltaXML comparisons. For orderless data keys should be provided whenever possible to get the best results.

Key attributes impose no requirements on data formats. Extensible Stylesheet Language (XSLT) scripts can generate keys on the fly, whenever needed, and discard them in post-processing. The original XML remains unaffected by their use.

Two practical solutions are presented here, with each using a different comparator and method for customising a comparison:

  • Pipelined Comparator (DXP) - Uses a filter pipeline defined by an XML file called a 'DXP' to customise the comparison.
  • Document Comparator - Uses Java API calls to customise a pre-existing pipeline with a number of extension points.

1.3. Removing 'ignorable' whitespace

One point that should be noted, before reading further, is that it is very important to remove whitespace only PCDATA nodes during orderless comparison (for example using the com.deltaxml.pipe.filters.NormalizeSpace Java XML Filter). Also, note that an orderless container must not contain PCDATA as an immediate child: it will cause an exception (com.deltaxml.api.UnorderedElementContainingPCDATAException). The reason is that while elements are separated unambiguously by tags, if two PCDATA items are put next to each other they are in effect merged into one PCDATA item, so orderless PCDATA makes no sense.

2. How to declare an orderless container

Ordering is often unimportant during data capture or storage. For example, entries in an electronic address book can be stored in any order. Assume that we have the following entries stored as our address data and wish to track any modifications using DeltaXML:

Example 1: an address book XML document (documentA.xml in the sample directory)

<addressList>
   <person customerid="15">
     <name>Joe Bloggs</name>
     <email>jblogs@msn.com</email>
     <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
   </person>
  <person customerid="14">
     <name>John Doe</name>
     <email>jdoe@hotmail.com</email>
     <address>
       <line>1234 East Beverley Drive</line>
       <line>Phoenix</line>
       <zip>AZ 12345</zip>
     </address>
   </person>
   <person customerid="62">
     <name>Joe Bloggs</name>
     <email>jblogs@hotmail.com</email>
     <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
  </person>
</addressList>

The first step is to tell DeltaXML that child elements of <addressList> are orderless elements. Or, to put it differently: to identify <addressList> elements as orderless containers.

Example 2: Adding the orderless attribute using an XSLT template (part of address-key.xsl in the sample directory)

<xsl:template match="addressList">
    <xsl:copy>
        <xsl:attribute name="deltaxml:ordered" select="'false'"/>
        <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
</xsl:template>

The attribute deltaxml:ordered="false" is the signal. It tells the DeltaXML comparison program that the set of <person> elements nested within the <addressList> element are orderless. (Actually, any direct child element of <addressList> will be considered orderless, not just <person>s.)

The second step is to provide DeltaXML with a means of matching elements across the two input files. In our example the id attribute can serve as a matching key. Two elements, one from each input file, are considered to match if, and only if, their respective keys match. When the keys match DeltaXML scans the element in question and records any changes found within its contents. Note how we record the key information using an attribute that is clearly flagged as being in the deltaxml namespace:

Example 3: Adding keys using an XSLT template (part of address-key.xsl in the sample directory)

<xsl:template match="person">
    <xsl:copy>
        <xsl:attribute name="deltaxml:key" select="@customerid"/>
        <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
</xsl:template>

While it might seem more sensible to reference the id attribute than to duplicate its value, concrete reasons exist for such a protocol. Keys can exhibit much more complexity than shown here.

The final result after applying these templates together is the following file.

Example 4: the keyed, orderless address book

<addressList xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             deltaxml:ordered="false">
   <person customerid="15" deltaxml:key="15">
     <name>Joe Bloggs</name>
     <email>jblogs@msn.com</email>
     <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
   </person>
  <person customerid="14" deltaxml:key="14">
     <name>John Doe</name>
     <email>jdoe@hotmail.com</email>
     <address>
       <line>1234 East Beverley Drive</line>
       <line>Phoenix</line>
       <zip>AZ 12345</zip>
     </address>
   </person>
   <person customerid="62" deltaxml:key="62">
     <name>Joe Bloggs</name>
     <email>jblogs@hotmail.com</email>
     <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
  </person>
</addressList>

3. Comparing orderless elements with keys

3.1. Ignoring data shuffling

DeltaXML is now equipped to perform an orderless comparison. The test will involve a simple rearrangement of the <person> elements:

Example 5: Rearranged address book (documentB.xml in the sample directory)

<addressList>
  <person customerid="15">
    <name>Joe Bloggs</name>
    <email>jblogs@msn.com</email>
    <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
  </person>
  <person customerid="62">
    <name>Joe Bloggs</name>
    <email>jblogs@hotmail.com</email>
    <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
  </person>
  <person customerid="14">
    <name>John Doe</name>
    <email>jdoe@hotmail.com</email>
    <address>
       <line>1234 East Beverley Drive</line>
       <line>Phoenix</line>
       <zip>AZ 12345</zip>
     </address>
  </person>
</addressList>

The comparison result shows no changes, as expected:

Example 6: Delta showing no differences between original and rearranged address books

<addressList xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"             
             deltaxml:deltaV2="A=B"
             deltaxml:ordered="false"
             deltaxml:version="2.0"
             deltaxml:content-type="full-context">
   <person deltaxml:key="15" customerid="15">
      <name>Joe Bloggs</name>
      <email>jblogs@msn.com</email>
      <address>
         <line>1234 Green Lane</line>
         <line>London</line>
         <zip>SW1 7WJ</zip>
      </address>
   </person>
   <person deltaxml:key="14" customerid="14">
      <name>John Doe</name>
      <email>jdoe@hotmail.com</email>
      <address>
         <line>1234 East Beverley Drive</line>
         <line>Phoenix</line>
         <zip>AZ 12345</zip>
      </address>
   </person>
   <person deltaxml:key="62" customerid="62">
      <name>Joe Bloggs</name>
      <email>jblogs@hotmail.com</email>
      <address>
         <line>1234 Green Lane</line>
         <line>London</line>
         <zip>SW1 7WJ</zip>
      </address>
   </person>
</addressList>

3.2. Detecting changes after shuffling

The preceding example showed that DeltaXML correctly ignores rearrangement of orderless sets. The next file introduces actual changes to address book:

Example 7: Address book with data change (documentC.xml in the sample directory)

<addressList>
  <person customerid="14">
    <name>John Doe</name>
    <email>jdoe@hotmail.com</email>
    <address>
       <line>1234 E. Beverley Drive</line>
       <line>Phoenix</line>
       <zip>AZ 12345 6789</zip>
     </address>
  </person>
  <person customerid="15">
    <name>Joe Bloggs</name>
    <email>jblogs@hotmail.com</email>
    <address>
       <line>1234 Green Lane</line>
       <line>London</line>
       <zip>SW1 7WJ</zip>
     </address>
  </person>
</addressList>

Now DeltaXML has something more to report, because embedded data has changed:

Example 8: Delta showing data change detected in the orderless set

<addressList xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"             
             deltaxml:deltaV2="A!=B"
             deltaxml:ordered="false"
             deltaxml:version="2.0"
             deltaxml:content-type="full-context">
   <person deltaxml:deltaV2="A!=B" deltaxml:key="15" customerid="15">
      <name deltaxml:deltaV2="A=B">Joe Bloggs</name>
      <email deltaxml:deltaV2="A!=B">
         <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">jblogs@msn.com</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">jblogs@hotmail.com</deltaxml:text>
         </deltaxml:textGroup>
      </email>
      <address deltaxml:deltaV2="A=B">
         <line>1234 Green Lane</line>
         <line>London</line>
         <zip>SW1 7WJ</zip>
      </address>
   </person>
   <person deltaxml:deltaV2="A!=B" customerid="14" deltaxml:key="14">
      <name deltaxml:deltaV2="A=B">John Doe</name>
      <email deltaxml:deltaV2="A=B">jdoe@hotmail.com</email>
      <address deltaxml:deltaV2="A!=B">
         <line deltaxml:deltaV2="A!=B">
            <deltaxml:textGroup deltaxml:deltaV2="A!=B">
               <deltaxml:text deltaxml:deltaV2="A">1234 East Beverley Drive</deltaxml:text>
               <deltaxml:text deltaxml:deltaV2="B">1234 E. Beverley Drive</deltaxml:text>
            </deltaxml:textGroup>
         </line>
         <line deltaxml:deltaV2="A=B">Phoenix</line>
         <zip deltaxml:deltaV2="A!=B">
            <deltaxml:textGroup deltaxml:deltaV2="A!=B">
               <deltaxml:text deltaxml:deltaV2="A">AZ 12345</deltaxml:text>
               <deltaxml:text deltaxml:deltaV2="B">AZ 12345 6789</deltaxml:text>
            </deltaxml:textGroup>
         </zip>
      </address>
   </person>
   <person deltaxml:deltaV2="A" deltaxml:key="62" customerid="62">
      <name>Joe Bloggs</name>
      <email>jblogs@hotmail.com</email>
      <address>
         <line>1234 Green Lane</line>
         <line>London</line>
         <zip>SW1 7WJ</zip>
      </address>
   </person>
</addressList>

In this case there were changes to <person> elements. DeltaXML accurately detected and recorded the change even though the order of elements was different.

3.3. Rules for keys in orderless comparisons

The following rules apply to the use of key attributes in the orderless case:

  • Orderless containers must be assigned a deltaxml:ordered="false" attribute.
  • XML elements are considered ordered by default (in the absence of this attribute on their parent).
  • Two elements that are identical in all respects except the deltaxml:ordered attribute are considered distinct and will not be compared with each other. This unusual event triggers a warning in DeltaXML verbose mode.
  • Orderless containers may not contain parsed character data.
  • Any element may be designated as an orderless container, regardless of the ordering status of its parent or child elements.

3.4. Rules for orderless elements lacking keys

Generally, elements that are members of an orderless set of elements should all have a key attribute. Orderless sets with a mixture of keyed and unkeyed elements may yield obscure results during comparisons.

Using no keys at all is preferable to a mixed case. Nevertheless, DeltaXML's rules for the mixed case are simple:

  • If the comparison file contains an identical element (perfect match, but neither element has a key attribute) DeltaXML assumes these are actually the same. No change is recorded.
  • If no identical element exists, the keyless element is considered a delta change. It is recorded as an addition or deletion.

3.5. Anchoring position of one element in an orderless set

Orderless sets sometimes incorporate a header element, such as <setheader> below:

Example 9: Orderless set with header

<MyDataItems xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             deltaxml:ordered="false"> 
   <setheader> 
      <date>20 January 2003</date> 
      <owner>MegaCorp</owner> 
   </setheader> 
   <item> Some data </item> 
   <item> More data </item> 
   <item> Extra data </item> 
</MyDataItems> 

Technically <setheader> is a member of the orderless set <MyDataItems>. Yet unlike the <item> members, <setheader> will not move around. It has a special status within the orderless set. So we would like DeltaXML to match it across revisions, not treating changes to this one element in the same way as changes to other members. Therefore we assign a deltaxml:key attribute to the <setheader> element:

Example 10: Orderless set with keyed header

<MyDataItems xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             deltaxml:ordered="false"> 
   <setheader deltaxml:key="headerkey"> 
      <date>20 January 2003</date> 
      <owner>MegaCorp</owner> 
   </setheader> 
   <item> Some data </item> 
   <item> More data </item> 
   <item> Extra data </item> 
</MyDataItems>

Of course, a better design would employ keys for all elements:

Example 11: Orderless set with keyed header and keyed elements

<MyDataItems xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             deltaxml:ordered="false"> 
   <setheader deltaxml:key="headerkey"> 
      <date>20 January 2003</date> 
      <owner>MegaCorp</owner> 
   </setheader> 
   <item deltaxml:key="q43"> Some data </item> 
   <item deltaxml:key="342"> More data </item> 
   <item deltaxml:key="09r"> Extra data </item> 
</MyDataItems>

3.6. Ideas for designing keys

The key to an element can take many forms, including:

  • a child element
  • part of a child element
  • a combination of attributes
  • a combination of attributes and child elements
  • a random number such as a GUID
  • arbitrary data from external sources (other files, databases, user input, etc).

The choices are almost unlimited. Moreover, an element class may apply different key formulations to different instances of that class. For example, some instances could use child data while others use attribute data.

Key generation has to be equally flexible. One possibility is to manufacture keys as part of the XML file. In this case they are stored as a permanent component of the XML data. However this technique leaves behind a footprint that may be inappropriate. It is often better to factor DeltaXML-specific data out. This factoring is possible by creating DeltaXML keys on a temporary, as-needed basis. A simple XSL script can generate keys just prior to DeltaXML processing, while another strips them upon completion. A separate paper on the use of XSL Filters details this technique.

We refer to key assignment based on element data as data-driven. For example, consider the bare address list without any identifiers or keys:

Example 12: Address book with no "id" attributes or keys

<addressList>
  <person>
    <firstName>John</firstName>
    <lastName>Smith</lastName> 
    <address>
      <line>26 High Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0JK</zip>
    </address>        
  </person>
  <person>
    <firstName>Adam</firstName>
    <lastName>Smith</lastName> 
    <address>
      <line>26 High Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0JK</zip>
    </address>
  </person>
  <person>
    <firstName>David</firstName>
    <lastName>Jones</lastName>
    <address>
      <line>12 New Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0PL</zip>
    </address>
  </person>
  <person>
    <firstName>Mark</firstName>
    <lastName>Lane</lastName>
    <address>
      <line>99 Bow Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0LL</zip>
    </address>
  </person> 
</addressList>

We expect the combination of firstName, lastName, and postCode to form a unique key. Such a key can be easily generated by XSL using XPath expressions. The results might look as follows:

Example 13: Data-driven keys added to address book elements

<addressList xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             deltaxml:ordered="false">
  <person deltaxml:key="John Smith WR13 0JK">
    <firstName>John</firstName>
    <lastName>Smith</lastName> 
    <address>
      <line>26 High Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0JK</zip>
    </address>        
  </person>
  <person deltaxml:key="Adam Smith WR13 0JK">
    <firstName>Adam</firstName>
    <lastName>Smith</lastName> 
    <address>
      <line>26 High Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0JK</zip>
    </address>
  </person>
  <person deltaxml:key="David Jones WR13 0PL">
    <firstName>David</firstName>
    <lastName>Jones</lastName>
    <address>
      <line>12 New Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0PL</zip>
    </address>
  </person>
  <person deltaxml:key="Mark Lane WR13 0LL">
    <firstName>Mark</firstName>
    <lastName>Lane</lastName>
    <address>
      <line>99 Bow Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0LL</zip>
    </address>
  </person> 
</addressList>

This new design is more natural than id attributes because the keys derive from actual data. Furthermore the filter supports key generation on the fly, leaving permanent data untouched by DeltaXML markup. The XSL definition for this is presented below:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
                version="2.0">
  <xsl:output method="xml" indent="no" />

  <xsl:template match="@*, node()">
    <xsl:copy>
      <xsl:apply-templates select="@*,node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="person">
    <xsl:copy>
      <xsl:attribute name="deltaxml:key"
                     select="concat(firstName/text(), ' ', lastName/text(), ' ', address/zip/text())"/>
      <xsl:apply-templates select="@*,node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="addressList">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ordered" select="'false'"/>
      <xsl:apply-templates select="@*,node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

In the above example, the deltaxml:ordered="false" attribute is added to the addressList element as specified in the template's match attribute. If you wanted it to apply to a range of different elements the XSL match attribute could be extended to include further XPaths, for example

<xsl:template match="addressList | elem1 | elem2/elem3">
  <xsl:copy>
    <xsl:attribute name="deltaxml:ordered" select="'false'"/>
    <xsl:apply-templates select="@*,node()"/>
  </xsl:copy>
</xsl:template> 

Here the attribute would be added to all <addressList> elements, all <elem1> elements, and <elem3> elements whose immediate parents where <elem2>.

It should be noted that we only suggest adding the deltaxml:ordered="false" element to data where order really does not matter. If you have an element whose sub-elements are a mixture of ordered and orderless elements and wish to manage these then please see our "how to" reference on "How to configure DeltaXML to handle mixed ordered and orderless data within a single element".

3.7. Nesting orderless sets

Orderless containers may nest, but child elements do not inherit the orderless quality automatically. It must be declared explicitly. For example, each person in the address list might have several phone numbers:

Example 14: Nested orderless sets

<addressList xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1" deltaxml:ordered="false">
  <person>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
    <address>
      <line>26 High Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0JK</zip>
    </address>
    <phoneNumbers deltaxml:ordered="false">
      <phone>1 234</phone>
      <phone>2 345</phone>
      <phone>3 456</phone>
    </phoneNumbers>
  </person>
  <person>
    <firstName>David</firstName>
    <lastName>Jones</lastName>
    <address>
      <line>12 New Street</line>
      <line>Newtown</line>
      <line>Malvern</line>
      <zip>WR13 0PL</zip>
    </address>
    <phoneNumbers deltaxml:ordered="false">
      <phone>4 567</phone>
      <phone>5 678</phone>
      <phone>6 789</phone>
    </phoneNumbers>
  </person>
</addressList>

Comparison works the same no matter how deeply nested the elements. The rules for keys are the same across all levels.

3.8. Should everything be treated as orderless?

The default XML behaviour is that element ordering is significant. The previous example data, for example, the order of the <line> elements in the <address> is significant. Print them out in the wrong order and your postman, or rather the automated post-office sorting machines, may have a few problems! Thus by default you should not try to treat XML data as orderless.

Another example, imagine a simple XML vector graphics format:

  <line> 
    <point x="0" y="0"/> 
    <point x="1" y="1"/> 
  </line> 
 
  <closedPolygon> 
    <point x="0" y="0"/> 
    <point x="1" y="0"/> 
    <point x="1" y="1"/> 
    <point x="0.5" y="1.5"/> 
    <point x="0" y="1"/> 
  </closedPolygon>

Now in the case of a line, perhaps it would be fine to add deltaxml:ordered="false" to the <line> element. It doesn't matter which end you start at the resulting lines are the same and so the comparator should perhaps treat them so. But the <closedPolygon> is a different matter, reorder the points and you end up with different shapes. These should not be compared as equal.

So what approach should you take? We believe that understanding the semantics of the data is important. In some cases good 'clues' are present in DTDs, XMLSchema or RNG grammars for the data, for example the lack of a * or + repetition operator is a good indication not to add orderless. But there are cases where human expertise is needed to complement the grammars.

3.9. Orderless Presentation

When presenting the results of an orderless comparison, we have choices about whether to favour the order of elements in the A document or the B document, and also whether 'adds' should be shown before or after 'deletes'. The OrderlessPresentation comparator property is used to control this presentation order, this is illustrated in the following extract from the sample DXP:

<comparatorProperties>
  <property name="http://deltaxml.com/api/property/orderlessPresentation" 
            literalValue="a_matches_deletes_adds"/>

</comparatorProperties>

The DocumentComparator allows control of the OrderlessPresentation property through the OutputFormatConfiguration class, as shown in the following extract from the Java code sample:

DocumentComparator dc= new DocumentComparator();
...
dc.getOutputFormatConfiguration().setOrderlessPresentationMode(
                                  OrderlessPresentationMode.B_DELETES);

4. Orderless Comparison - with Document Comparator

The Pipelined Comparator works well with the xml data as shown in the above examples. The Document Comparator is an extension to the Pipelined Comparator designed specifically for comparing documents. For example, the html documents having the meta sections. Sample inputs documents are shown below.

Document 1: HTML document
  <html>
    <head>
      <title>Meta Tags Example</title>
      <meta name="keywords" content="Keyword1, Keyword2"/>
      <meta name="description" content="Description of the webpage"/>
      <meta name="revised" content="3/7/2014"/>
    </head>
    <body>
      <p>Hello</p>
    </body>
  </html>
Document 2: HTML document
  <html>
    <head>
      <title>Meta Tags Example</title>
      <meta name="keywords" content="Keyword1, Keyword2, Keyword3"/>
      <meta name="description" content="Information of the webpage"/>
      <meta name="revised" content="3/3/2015"/>
    </head>
    <body>
      <p>Hello</p>
    </body>
  </html>

This sample contains the file OrderlessComparisonSample.java or OrderlessComparisonSample.cs (depending on which Core variant is in use). This contains the source code to handle the orderless comparison using the Document Comparator. It uses the input filter (meta-section-key.xsl) which adds keys to meta elements and a wrapper around meta sections.

Data-driven keys added to meta elements with a wrapper around meta sections having deltaxml:ordered attribute
  <html>
    <head>
        <title>Meta Tags Example </title>
        <meta-wrapper xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
            deltaxml:ordered="false">
            <meta deltaxml:key="keywords" name="keywords" content="Keyword1, Keyword2 " />
            <meta deltaxml:key="description" name="description" content="Description of the webpage" />
            <meta deltaxml:key="revised" name="revised" content="3/7/2014" />
        </meta-wrapper>
    </head>
    <body>
        <p>Hello</p>
    </body>
  </html>  

5. Running the sample

The sample code illustrates two methods of performing orderless element comparison. The first method uses the Pipelined Comparator, and specifies input and output filters with a DXP file. The second method uses the API exposed by the Document Comparator.

5.1. Using Java

If you have Ant installed, use the build script provided to run the complete sample. Simply type the following command. This will generate output for the two methods in the PipelinedComparatorResult and DocumentComparatorResult directories.

ant run

If you don't have Ant installed, you can run the sample from a command line by issuing the commands listed below from the sample directory (ensuring that you use the correct directory and class path separators for your operating system).

Pipelined Comparator (DXP)

To run just the Pipelined Comparator sample and produce the output files resultAB.xml and resultAC.xml in the PipelinedComparatorResult directory, type the following command.

ant run-dxp

The commands to run only the Pipelined Comparator sample are as follows.

java -jar ../../command.jar compare orderless documentA.xml documentB.xml resultAB.xml
java -jar ../../command.jar compare orderless documentA.xml documentC.xml resultAC.xml

Document Comparator (API)

To run just the Document Comparator API sample and produce the output files resultAB.xml and resultAC.xml in the DocumentComparatorResult directory, type the following command.

ant run-dc

The commands to compile and run only the Document Comparator sample code are as follows.

mkdir bin
javac -cp bin:../../deltaxml.jar:../../saxon9pe.jar -d bin ./src/java/com/deltaxml/samples/OrderlessComparisonSample.java
java -cp bin:../../deltaxml.jar:../../saxon9pe.jar:../../icu4j.jar:../../resolver.jar:./bin/com/deltaxml/samples/ com.deltaxml.samples.OrderlessComparisonSample documentA.xml documentB.xml resultAB.xml
java -cp bin:../../deltaxml.jar:../../saxon9pe.jar:../../icu4j.jar:../../resolver.jar:./bin/com/deltaxml/samples/ com.deltaxml.samples.OrderlessComparisonSample documentA.xml documentC.xml resultAC.xml

To clean up the sample directory, run the following Ant command.

ant clean

5.2. Using C#

Pipelined Comparator (DXP)

Here, a batch file is used to execute the comparison, the command to run the Pipelined Comparator sample is simply:

run.bat

Document Comparator (API)

A Visual Studio solution and project are provided in the dotnet-api directory. You can run this in the usual way from Visual Studio.