When to use XML Data Compare

Robin La Fontaine

Posted on 14th May 2021

In the beginning…

XML Compare has always been our flagship product: flexible and powerful enough to address the varied needs of XML comparison. It has grown over the years in response to customer feedback and new requirements. It is used to compare both XML data and XML documents, but it is fair to say that most of its development has focused on documents. As a result, in some areas it has become more difficult to use for data-focused XML.

Why XML Data Compare?

Our aims in developing XML Data Compare were twofold. First, we wanted to make it easier to use, without the need for software development or programming skills; hence a simple XML configuration file rather than the need to write or modify XSLT code. This ‘no-code’ approach reduces the learning curve.

Second, we wanted to make it more powerful in the way that it handles data. For example, attributes are more important in XML data, and sometimes they carry all of the information. We therefore need to take account of attributes when we align the elements in two files. By ‘align’ we mean identifying which element in one file corresponds best with a particular element in the other file. Also, for data, the order of elements is often not important, so a set of child elements can be aligned to a corresponding set in any order.
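To make the idea of attribute-aware, order-independent alignment concrete, here is a minimal Python sketch. It is an illustration only and not how XML Data Compare works internally: the scoring rule, the greedy pairing and the function names `similarity` and `align_children` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def similarity(a, b):
    """Score two elements: 0 if the tags differ, otherwise 1 plus one
    point for every attribute name/value pair the elements share."""
    if a.tag != b.tag:
        return 0
    shared = set(a.attrib.items()) & set(b.attrib.items())
    return 1 + len(shared)


def align_children(parent_a, parent_b):
    """Greedily pair children of parent_a with children of parent_b,
    ignoring their order. Returns (pairs, only_in_a, only_in_b)."""
    candidates = []
    for i, a in enumerate(parent_a):
        for j, b in enumerate(parent_b):
            score = similarity(a, b)
            if score > 0:
                candidates.append((score, i, j))
    # Best score first; ties fall back to document position.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    used_a, used_b, pairs = set(), set(), []
    for _score, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((parent_a[i], parent_b[j]))

    only_a = [a for i, a in enumerate(parent_a) if i not in used_a]
    only_b = [b for j, b in enumerate(parent_b) if j not in used_b]
    return pairs, only_a, only_b


# Hypothetical sample data: same items, different order, one addition.
old = ET.fromstring(
    '<stock><item sku="A1" qty="5"/><item sku="B2" qty="9"/></stock>'
)
new = ET.fromstring(
    '<stock><item sku="B2" qty="7"/><item sku="A1" qty="5"/>'
    '<item sku="C3" qty="1"/></stock>'
)
pairs, removed, added = align_children(old, new)
```

In this sample the two `A1` items pair up despite sitting in different positions, the two `B2` items pair up with a changed `qty`, and `C3` is reported as present only in the second file. A greedy pass like this can make poor choices on larger or more ambiguous inputs, which is one reason a real comparison engine needs something more sophisticated.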

The benefits of XML Data Compare

The main benefit of XML Data Compare is much simpler access to improved handling of both attributes and orderless data. In fact, we have developed a completely new comparison ‘engine’ that takes a heuristic approach to comparing orderless data, and the results are very good. A significant benefit of this new approach concerns keys: previously it was necessary to specify key values in order to align two similar but not identical elements, and these keys are no longer needed. The new comparison engine discovers these keys and uses them, even coping with duplicate or multiple keys by taking a heuristic approach to give the best possible result. Alas, there will always be situations where the heuristic approach does not work, but we will continue to strive for perfection!
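As a rough illustration of what discovering keys might involve, the Python sketch below looks for attributes that could serve as keys among a set of sibling elements. It is a deliberately simplified assumption for this post, not the product's actual algorithm: the function name `candidate_keys` and its ranking rule are hypothetical, and unlike the real engine it only considers attributes whose values are unique, so it does not cope with duplicate keys.

```python
import xml.etree.ElementTree as ET


def candidate_keys(siblings_a, siblings_b):
    """Return attribute names that could act as keys, best candidate first.

    An attribute qualifies if every sibling on both sides carries it and
    its values are unique within each side. Candidates are ranked by how
    many values the two sides have in common.
    """
    def unique_values(siblings, name):
        values = [el.get(name) for el in siblings]
        if None in values or len(set(values)) != len(values):
            return None  # missing on some element, or duplicated
        return set(values)

    names = set()
    for el in list(siblings_a) + list(siblings_b):
        names.update(el.attrib)

    ranked = []
    for name in sorted(names):
        values_a = unique_values(siblings_a, name)
        values_b = unique_values(siblings_b, name)
        if values_a is not None and values_b is not None:
            ranked.append((len(values_a & values_b), name))
    ranked.sort(key=lambda r: (-r[0], r[1]))
    return [name for overlap, name in ranked if overlap > 0]


# Hypothetical sample data.
a = ET.fromstring(
    '<people><person id="7" team="red" name="Ann"/>'
    '<person id="8" team="red" name="Bob"/></people>'
)
b = ET.fromstring(
    '<people><person id="8" team="blue" name="Bob"/>'
    '<person id="7" team="red" name="Anne"/></people>'
)
keys = candidate_keys(a.findall("person"), b.findall("person"))
```

Here `id` ranks first because both of its values appear in both files, `name` ranks second because only one value survives unchanged, and `team` is rejected because its value is repeated in the first file.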

REST with XML Data Compare

We have also taken the opportunity with XML Data Compare to offer a REST interface so that it is easy to access: just send the files and a configuration file, and the result is returned. By offering this as a SaaS service, we intend to make it easier to get up and running, with no need to worry about software updates, which will all be handled as part of the service.

XML Compare vs XML Data Compare

So how should you choose which one to use? My advice would be to try XML Data Compare first unless your XML really is mainly a text document. If it is a genuine text document at heart, remember that we have special products to handle DocBook and DITA, so go to them first if your document is one of those. If you cannot find what you want, or if you have a particular challenge, then please get in touch.

Characteristic | Typical in XML Documents | Typical in XML Data
Main content is text narrative | Very typical | Not typical
Text content includes elements to show formatting (‘mixed’ content) | Very typical – if so you should use XML Compare | Do not use XML Data Compare in this case
XML attributes mainly contain meta-data about the text content | Typical | Not typical
XML attributes contain important data content | Unlikely | Often the case but not always – XML Data Compare takes attributes into account when aligning elements and can show word/token changes within attributes
The order or sequence of elements must be preserved | Definitely true | May be true but likely to be some exceptions
The order or sequence of elements is not critical to the meaning of the data | Not true | This can often be true for data and XML Data Compare is much better suited
Currently using XSLT to process the XML for publication | Typical – XSLT knowledge is required when using XML Compare | Possible but XSLT expertise not needed for XML Data Compare (knowledge of XPath is useful)
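
If you are unsure whether your XML contains ‘mixed’ content, a quick check along the following lines may help. This is a rough Python sketch written for this post, not part of either product; the function name `has_mixed_content` is hypothetical, and the check only looks at whether text sits alongside child elements.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(root):
    """Return True if any element has both child elements and
    non-whitespace text before or between those children."""
    for el in root.iter():
        children = list(el)
        if not children:
            continue
        # Text before the first child.
        if el.text and el.text.strip():
            return True
        # Text following any child, still inside this element.
        if any(child.tail and child.tail.strip() for child in children):
            return True
    return False


# Hypothetical samples: a narrative paragraph and a data record.
doc = ET.fromstring("<para>Press <b>Save</b> to continue.</para>")
data = ET.fromstring("<order id='1'>\n  <line sku='A1' qty='2'/>\n</order>")

doc_is_mixed = has_mixed_content(doc)
data_is_mixed = has_mixed_content(data)
```

The paragraph sample is flagged as mixed because text surrounds the `b` element, while the order sample is not, since only indentation whitespace separates its elements. A positive result suggests the file is document-like; it is only one of the characteristics in the comparison above.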
