XML Compare has always been our flagship product: flexible and powerful enough to address the varied needs of XML comparison. It has grown over the years in response to customer feedback and new requirements. It is used to compare both XML data and XML documents, but most of its development has focused on documents. As a result, it has become more difficult to use for data-focused XML in some areas.
So our aims in developing XML Data Compare were twofold. First, we wanted to make it easier to use without the need for software development or programming skills – hence a simple XML configuration file rather than the need to write or modify XSLT code. This ‘no-code’ approach reduces the learning curve.
Second, we wanted to make it more powerful in the way it handles data. For example, attributes are more important in XML data, and sometimes they carry all of the information. We therefore need to take attributes into account when we align the elements in two files. By ‘align’ we mean identifying which element in one file corresponds best with a particular element in the other file. Also, for data, the order of elements is often not important, so a set of child elements can be aligned to a corresponding set in any order. Sometimes small changes in values are not significant and you want to focus on the changes that are. XML Data Compare includes numeric tolerances so that you can specify which differences to ignore.
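To make the ideas of attribute-aware alignment, orderless matching and numeric tolerance concrete, here is a minimal Python sketch. It is an illustration of the concepts only, not the algorithm used inside XML Data Compare; the function names (`values_match`, `similarity`, `align_orderless`), the greedy matching strategy and the sample `<prices>` documents are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def values_match(a: str, b: str, tolerance: float) -> bool:
    """Two values match if identical, or if both are numbers within the tolerance."""
    if a == b:
        return True
    try:
        return abs(float(a) - float(b)) <= tolerance
    except ValueError:
        # At least one value is not numeric, so only exact equality counts.
        return False


def similarity(x: ET.Element, y: ET.Element, tolerance: float) -> float:
    """Fraction of attribute names (across both elements) whose values match."""
    if x.tag != y.tag:
        return 0.0
    names = set(x.attrib) | set(y.attrib)
    if not names:
        return 1.0
    hits = sum(
        1
        for n in names
        if n in x.attrib and n in y.attrib
        and values_match(x.attrib[n], y.attrib[n], tolerance)
    )
    return hits / len(names)


def align_orderless(parent_a: ET.Element, parent_b: ET.Element, tolerance: float = 0.0):
    """Greedily pair each child of parent_a with its best unused match in parent_b,
    ignoring document order. Unmatched children are paired with None."""
    remaining = list(parent_b)
    pairs = []
    for child in parent_a:
        best = max(remaining, key=lambda c: similarity(child, c, tolerance), default=None)
        if best is not None and similarity(child, best, tolerance) > 0:
            remaining.remove(best)
            pairs.append((child, best))
        else:
            pairs.append((child, None))
    pairs.extend((None, leftover) for leftover in remaining)
    return pairs


# Hypothetical sample data: same items in a different order, with small price changes.
doc_a = ET.fromstring(
    '<prices><item sku="A1" price="9.99"/><item sku="B2" price="5.00"/></prices>'
)
doc_b = ET.fromstring(
    '<prices><item sku="B2" price="5.001"/><item sku="A1" price="12.50"/></prices>'
)

pairs = align_orderless(doc_a, doc_b, tolerance=0.01)
summary = [(a.get("sku"), b.get("sku"), similarity(a, b, 0.01)) for a, b in pairs]
print(summary)
```

In this sketch the two `B2` items align as fully equal because their prices differ by less than the tolerance, while the two `A1` items still align on `sku` but show a genuine price change. A greedy pass is the simplest choice for illustration; it can make poor pairings when several siblings are similar to one another.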
XML Data Compare makes improved handling of both attributes and orderless data much simpler to access. In fact, we have developed a completely new comparison ‘engine’ that takes a heuristic approach to comparing orderless data, and the results are very good. A significant benefit of this new approach is that keys are no longer needed: previously it was necessary to specify key values in order to align two similar but not identical elements. The new comparison engine discovers these keys and uses them, even coping with duplicate or multiple keys by taking a heuristic approach to give the best possible result. Alas, there will always be situations where the heuristic approach does not work, but we will continue to strive for perfection!
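As a rough illustration of what ‘discovering keys’ can mean, the Python sketch below looks for attributes whose values are unique among sibling elements in both files and ranks them by how many values the two files share. This is a simplified assumption for explanation, not the heuristic used by the XML Data Compare engine; the function name `candidate_keys` and the sample `<staff>` documents are invented for the example, and it does not attempt to handle duplicate or multiple keys.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def candidate_keys(parent_a: ET.Element, parent_b: ET.Element) -> list:
    """Return attribute names that could act as keys for the children of the
    two parents: values must be unique among siblings in each file.
    Candidates are ranked by the number of values the two files share."""
    names = {n for parent in (parent_a, parent_b) for child in parent for n in child.attrib}
    scores = {}
    for name in sorted(names):
        vals_a = [c.get(name) for c in parent_a if name in c.attrib]
        vals_b = [c.get(name) for c in parent_b if name in c.attrib]
        unique = all(
            count == 1
            for vals in (vals_a, vals_b)
            for count in Counter(vals).values()
        )
        if unique and vals_a and vals_b:
            scores[name] = len(set(vals_a) & set(vals_b))
    # Highest number of shared values first; ties broken alphabetically.
    return sorted(scores, key=lambda n: (-scores[n], n))


# Hypothetical sample data: 'dept' repeats, 'id' is stable, 'name' changes for one person.
staff_a = ET.fromstring(
    '<staff><person id="7" dept="ops" name="Ann"/>'
    '<person id="8" dept="ops" name="Bob"/></staff>'
)
staff_b = ET.fromstring(
    '<staff><person id="8" dept="ops" name="Robert"/>'
    '<person id="7" dept="it" name="Ann"/></staff>'
)

keys = candidate_keys(staff_a, staff_b)
print(keys)
```

Here `dept` is rejected because it repeats within the first file, and `id` ranks above `name` because both of its values appear in both files.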
We have also taken the opportunity with XML Data Compare to offer a REST interface so that it is easy to access: just send the files and a configuration file, and the result is returned. By offering this as a SaaS product, we intend to make it easier to get up and running, with no need to worry about software updates because they are all handled as part of the service.
So how should you choose which one to use? My advice would be to try XML Data Compare first unless your XML really is mainly a text document. If it is a genuine text document at heart then remember that we have special products to handle DocBook and DITA so go first to them if your document is one of those. If you cannot find what you want, or if you have a particular challenge, then please get in touch.
| Characteristic | Typical in XML Documents | Typical in XML Data |
|---|---|---|
| Main content is text narrative | Very typical | Not typical |
| Text content includes elements to show formatting (‘mixed’ content) | Very typical – if so you should use XML Compare | Do not use XML Data Compare in this case |
| XML attributes mainly contain metadata about the text content | Typical | Not typical |
| XML attributes contain important data content | Unlikely | Often the case but not always – XML Data Compare takes attributes into account when aligning elements and can show word/token changes within attributes |
| The order or sequence of elements must be preserved | Definitely true | May be true but likely to be some exceptions |
| The order or sequence of elements is not critical to the meaning of the data | Not true | This can often be true for data and XML Data Compare is much better suited |
| Currently using XSLT to process the XML for publication | Typical – XSLT knowledge is required when using XML Compare | Possible but XSLT expertise not needed for XML Data Compare (knowledge of XPath is useful) |