Over the many years we’ve been providing software for XML comparison, the same question has come up from time to time: can I compare format ‘X’ against format ‘Y’? Until recently, our answer has always been that DeltaXML software is designed to compare two versions of the same format, e.g., DocBook against DocBook, DITA against DITA, or your internal format against your internal format.
We’ve spent some time exploring the reason behind that recurring question, aiming to understand the ‘why’ of comparing format X with format Y. It often means that content has been converted from X to Y, and there is still a need to understand content changes regardless of the differences in structure. Content conversion (also known as migration) isn’t the only scenario where comparing different XML formats is required. Others include:
- Restructuring a document – the requirement is to compare the content while ignoring the structural changes.
- Avoiding noise – you want to ignore changes to attributes and elements and focus only on content changes.
- Managing third-party content – incoming content is in a different format, and there is a requirement to understand how the content differs from what’s held internally.
All these scenarios have one thing in common – the ‘content’ is what’s important, rather than the structure or format that holds it.
This is where ConversionQA comes in. It is primarily designed to check that a content conversion has been achieved without unforeseen content change, and it allows comparison of content held in entirely different XML formats. Initially, we’re focusing on Microsoft Word, DITA maps, and any single-file XML format, e.g., DocBook or XHTML. The comparison ignores the element structure of the inputs and compares just the content. Any differences are highlighted in a simple HTML report, pointing you to where your conversion scripts may need updating.
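To make the idea of a content-only comparison concrete, here is a minimal sketch in Python. It is not how ConversionQA is implemented. It only illustrates the general principle, using the standard library, and the function names `content_words` and `content_changes` are hypothetical. It assumes two well-formed, single-file XML inputs, strips all markup, and compares the remaining words.

```python
import difflib
import xml.etree.ElementTree as ET


def content_words(xml_text: str) -> list[str]:
    """Return the document's text as a flat list of words, ignoring all markup."""
    root = ET.fromstring(xml_text)
    # itertext() yields every text fragment in document order. Joining with a
    # space keeps words in adjacent block elements apart; the trade-off
    # (an assumption of this sketch) is that a word split by inline markup
    # would be treated as two words.
    return " ".join(root.itertext()).split()


def content_changes(old_xml: str, new_xml: str) -> list[tuple[str, str, str]]:
    """List (operation, old words, new words) for each content difference."""
    old, new = content_words(old_xml), content_words(new_xml)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, " ".join(old[i1:i2]), " ".join(new[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only inserts, deletes and replacements
    ]


# Illustrative inputs: the same content held in DocBook-like and XHTML-like markup.
docbook = (
    "<article><title>Setup Guide</title>"
    "<para>Install the <emphasis>latest</emphasis> release.</para></article>"
)
xhtml_same = (
    "<html><body><h1>Setup Guide</h1>"
    "<p>Install the <em>latest</em> release.</p></body></html>"
)
xhtml_changed = (
    "<html><body><h1>Setup Guide</h1>"
    "<p>Install the <em>newest</em> release.</p></body></html>"
)

print(content_changes(docbook, xhtml_same))     # structure differs, content does not
print(content_changes(docbook, xhtml_changed))  # one word changed during conversion
```

In this sketch, two documents with entirely different element names produce no differences as long as their words match, while a single changed word is reported on its own. A real conversion check would also need to deal with things this sketch ignores, such as namespaces, entities, and content carried in attributes.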