First Steps with XML Data Compare

XML Data Compare is the new comparison product from DeltaXML. It is optimised for XML that contains structured data, as opposed to XML documents with more narrative, free-flowing content. You do not need XSLT filters to tweak the results: a config file alone controls how the comparison is made and how the output is presented. Before continuing, get your evaluation trial.

Start Comparing XML Data

Once you have your evaluation copy you are ready to go!

Set up your REST server (see the REST user guide) and compare two of your data files using an empty config file.

The config file may require some thought about namespaces and about what you need to adjust, and there are several samples to help you on your way. You can always start with an ‘empty’ config file and add to it as necessary. There is more information on the page Basics – Configuration File, and a full definition in the Configuration File Schema Guide.

The samples are on Bitbucket; the whole project is here.

Postman provides access to the REST API, or you can use the command-line Java sample to run the comparison. Whichever way you choose, it is simple to modify the config file and review the result file to see the changes that are produced. You can also develop your solution in whatever language you prefer.

To specify the nodes that are to be treated differently, the config file uses XPath expressions. I use an XML editor such as oXygen to check that I am using the right XPath expression. In oXygen, there is an XPath/XQuery builder window that can be used for this.
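If you do not have an XML editor to hand, a few lines of Python can serve as a rough check that an XPath expression selects the nodes you expect before you put it in the config file. This is only a sketch: the sample document, the namespace URI and the helper name select are invented for illustration, and Python's ElementTree supports only a subset of XPath, so more complex expressions are still better tested in a full XPath tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; substitute one of your own input files.
SAMPLE = """<orders xmlns="http://example.com/orders">
  <order id="A1"><price>10.00</price></order>
  <order id="B2"><price>12.50</price></order>
</orders>"""

root = ET.fromstring(SAMPLE)

# Map a prefix of your choice to the namespace used in the data.
ns = {"o": "http://example.com/orders"}


def select(xpath):
    """Return the elements matched by an XPath expression relative to the root."""
    return root.findall(xpath, ns)


# With the namespace prefix, both price elements are selected.
prices = [el.text for el in select(".//o:order/o:price")]
print(prices)

# Without the prefix nothing matches, because the elements are namespaced.
unprefixed = select(".//order/price")
print(len(unprefixed))
```

The second query shows why namespaces deserve some thought: an expression that looks right can silently select nothing if the prefix is missing.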

Change the Config File When:

  • Numeric data is present and shows up as different even when the difference is insignificant. See numeric tolerances.
  • Sequences of elements have a key that could be used to match pairs of records. If the matching is not accurate enough without this key, see ordered and unordered comparison to improve the results.
  • You wish the comparison to treat each block of text as a single node and show the whole block as being different even when just one word differs. See word by word.
  • Different types of whitespace – single or multiple spaces, tabs and newlines – are significant within the dataset. By default, whitespace is normalised. See normalize whitespace.
  • Differences are shown in certain elements but you would prefer them to be ignored. See ignore changes.
  • You want to change the result file to see changes only or to view a side-by-side folding report in a browser. See output format.
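To make three of these ideas concrete, the sketch below illustrates in plain Python what a numeric tolerance, whitespace normalisation and key-based record matching mean conceptually. It is not how XML Data Compare implements them and it does not use the product's config syntax; the function names, the tolerance value and the sample records are all hypothetical.

```python
import math


def numbers_equal(a: str, b: str, tolerance: float = 0.01) -> bool:
    """Treat two numeric strings as equal if they differ by at most `tolerance`."""
    return math.isclose(float(a), float(b), abs_tol=tolerance, rel_tol=0.0)


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs and newlines to single spaces and trim the ends."""
    return " ".join(text.split())


def pair_by_key(left, right, key):
    """Pair records from two sequences by a key field, regardless of their order.

    Returns (left_record, right_record) tuples; None marks a record that is
    missing on that side.
    """
    right_by_key = {rec[key]: rec for rec in right}
    left_keys = {rec[key] for rec in left}
    pairs = [(rec, right_by_key.get(rec[key])) for rec in left]
    pairs += [(None, rec) for rec in right if rec[key] not in left_keys]
    return pairs


# A small difference is ignored, a large one is not.
print(numbers_equal("10.00", "10.004"))
print(numbers_equal("10.00", "10.5"))

# Mixed whitespace collapses to single spaces.
print(normalize_whitespace("  total \t due\n today "))

# Records are matched on "id" even though the order differs between inputs.
left = [{"id": "A1", "qty": "1"}, {"id": "B2", "qty": "2"}]
right = [{"id": "B2", "qty": "3"}, {"id": "C3", "qty": "4"}]
for pair in pair_by_key(left, right, "id"):
    print(pair)
```

In the product, each of these behaviours is switched on or tuned through the config file for the nodes selected by your XPath expressions, rather than written as code.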

Keep Reading