DeltaXML Support FAQ

This file records answers to the most frequently asked questions received by our e-mail support service. Further technical questions are answered in our Technical FAQ.

1. General questions

1.1. Do I need a Schema or DTD?

No. However, if you have a DTD, an XML Schema or a RELAX NG grammar, your XML parser can work out which blocks of whitespace can be ignored, and this will usually result in a better comparison. The XML specification states that whitespace is significant, so DeltaXML will otherwise compare it. Another way to remove spurious whitespace is to use the normalize-space.xsl input filter.
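To illustrate what a whitespace-normalizing input filter might do, here is a minimal sketch run through the standard JAXP API. It is not the normalize-space.xsl file shipped with DeltaXML, whose contents are not reproduced in this FAQ; the stylesheet, the class name and the sample input are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class WhitespaceFilterSketch {

    // Hypothetical stylesheet: an identity copy, except that every text node
    // is passed through normalize-space(). Whitespace-only text nodes become
    // empty strings and so vanish from the output.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      // Explicit priority so this template wins over the node() match above.
      + "<xsl:template match='text()' priority='1'>"
      + "<xsl:value-of select='normalize-space(.)'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the sketch filter to an XML string and returns the result. */
    public static String normalize(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Whitespace filter failed", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in input with indentation and padded text.
        System.out.println(normalize("<a>\n  <b>  hello   world </b>\n</a>"));
    }
}
```

A filter this blunt also trims the edges of text nodes in mixed content, so it suits data-oriented XML better than running prose with inline markup.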

1.2. Why does the delta file have newlines in the tags?

These newlines are used to make changes easier to inspect in a normal text editor. They will always be stripped by an XML processor because they are inside element tags. The delta file has no added whitespace. If newlines were placed outside the markup tags they would be considered significant by an XML parser. If you want to view the file in a pretty-printed format, apply indentation output properties with the PipelinedComparator, use an identity XSLT transform, or view it in an XML-aware browser.
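The identity-transform route can be sketched with the standard JAXP API alone. This is only an illustration for producing a viewing copy, it does not use the PipelinedComparator, and the class name and sample input are hypothetical stand-ins for a real delta file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PrettyPrintDelta {

    /** Re-serializes XML with indentation using the JAXP identity transformer. */
    public static String indent(String xml) {
        try {
            // newTransformer() with no stylesheet argument gives an identity transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.INDENT, "yes");
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not indent XML", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in input; a real delta file would be read from disk instead.
        System.out.println(indent("<a><b>text</b><c/></a>"));
    }
}
```

Because indentation adds whitespace outside the tags, the indented copy is best treated as a viewing aid rather than as input to further processing.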

1.3. How can I configure DeltaXML, for example to ignore some attributes?

The question: What I was looking for was some sort of configuration mechanism for controlling what is compared, much as you can with traditional diff.

For example, I might know that my content management system adds attributes to elements that it uses to track where things are in the database - these values are arbitrary and not meaningful to the information as a whole, so it would be nice to be able to ignore such attributes.

The answer: DeltaXML uses XSL for this type of configuration because it is:

  1. standard

  2. completely flexible

  3. well understood.

So, for example, to remove whitespace you can filter the input files with the normalize-space.xsl input filter.

Similarly, to remove an attribute, just add a template to the XSL to take it out. This allows complete flexibility and typically requires only a few lines of code. Our paper on "Configuring DeltaXML to Ignore Elements" explains techniques for using XSLT to ignore differences in specified elements or attributes.
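A minimal sketch of such an attribute-removing filter follows, run here through the standard JAXP API. The attribute name cms-id, the class name and the sample document are hypothetical; the stylesheet is a plain identity transform with one extra empty template, not a file taken from the DeltaXML release.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StripAttributeFilter {

    // Identity copy plus an empty template for the (hypothetical) cms-id
    // attribute. The more specific match has the higher default priority,
    // so cms-id attributes are silently dropped.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='@cms-id'/>"
      + "</xsl:stylesheet>";

    /** Returns a copy of the input with every cms-id attribute removed. */
    public static String stripAttribute(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Attribute filter failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripAttribute(
            "<doc cms-id=\"42\"><p cms-id=\"7\" class=\"x\">hi</p></doc>"));
    }
}
```

Applying the same filter to both inputs before comparison means the tracking attribute never reaches the comparator, so it cannot be reported as a change.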

More importantly, you can add attributes that control how DeltaXML matches elements, using keys to identify corresponding elements. You can also specify that the children of a particular element should be treated as orderless.

1.4. What is an 'orderless' comparison, and how would I use it?

XML documents are generally ordered. In other words, the order of elements within the file is significant and if two elements are exchanged you expect this to be reported as a difference. However, in many files the order of certain elements is not significant and it is useful to compare files without worrying about element order. For example, the elements in an XML Schema <choice> element are orderless as each must refer to a different type of element. DeltaXML can be instructed to ignore order within specific elements, at different levels in the hierarchy, by adding attributes to identify those that contain orderless element sets. More details of these techniques can be found in our paper on "How to compare orderless elements".

1.5. How do I guarantee as-expected results?

Keys are a feature added in DeltaXML Version 2.2 that enable users to control comparisons precisely. When two files are compared, the DeltaXML comparison function will, at each level in the XML tree structure, first look for elements with DeltaXML keys. These are always matched first, before any 'best match' algorithm is applied to match up the remaining element sets. You can see the advantage of this by considering the effect of adding keys to paragraphs in an XHTML document, as demonstrated in the example of XHTML comparison provided in our DeltaNet Test Drive.

Keys can be applied both to ordered and orderless data. They are particularly important when your data is orderless.

1.6. Do I need to change my XML data to use keys and orderless comparison?

No. The best way to apply keys is to write an XSLT stylesheet that adds keys and identifies orderless element sets. In this way you can continue to use your data as it is and do not need to modify it. Section 3.6 of "How to compare orderless elements" shows an example XSLT script for automatically adding the necessary deltaxml:ordered and deltaxml:key attributes to XML elements.
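The following is a rough sketch of that approach, not the script from the paper. It assumes a hypothetical document with a contacts element whose person children carry an id attribute, and it assumes that the deltaxml prefix is bound to http://www.deltaxml.com/ns/well-formed-delta-v1 and that an orderless set is marked with deltaxml:ordered="false"; check both assumptions against the documentation for your release.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class AddKeysFilter {

    // Assumed namespace URI for the deltaxml prefix (verify for your release).
    static final String DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1";

    // Identity copy, plus two templates for the hypothetical vocabulary:
    // contacts is marked orderless, and each person is keyed on its id.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:deltaxml='" + DELTA_NS + "'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='contacts'>"
      + "<xsl:copy>"
      + "<xsl:attribute name='deltaxml:ordered'>false</xsl:attribute>"
      + "<xsl:apply-templates select='@*|node()'/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='person'>"
      + "<xsl:copy>"
      + "<xsl:attribute name='deltaxml:key'><xsl:value-of select='@id'/></xsl:attribute>"
      + "<xsl:apply-templates select='@*|node()'/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Returns a copy of the input with the control attributes added. */
    public static String addKeys(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Key filter failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addKeys(
            "<contacts><person id=\"p1\">Ann</person><person id=\"p2\">Bob</person></contacts>"));
    }
}
```

The source files on disk stay untouched; only the filtered copies passed to the comparison carry the extra attributes.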

1.7. Can I compare comments and processing instructions?

Yes. Comments and processing instructions are normally ignored by DeltaXML. However, if you want them to be included all you need to do is to convert them into markup before comparison, using XSLT. We can provide example XSLT stylesheets to do this. After the comparison, simply convert the added elements back into comments and processing instructions.
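As a rough sketch of the idea, the two stylesheets below turn comments and processing instructions into elements and back again. These are not the example stylesheets mentioned above; the element names comment-markup and pi-markup, the class name and the sample input are all hypothetical, and a real reverse filter would also have to cope with whatever markup the comparison adds to those elements.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CommentMarkupSketch {

    // Wraps the given templates in a stylesheet that otherwise copies everything.
    static String stylesheet(String templates) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output omit-xml-declaration='yes'/>"
             + "<xsl:template match='@*|node()'>"
             + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
             + "</xsl:template>"
             + templates
             + "</xsl:stylesheet>";
    }

    // Pre-filter: comments and PIs become (hypothetical) elements.
    // priority='1' makes these templates win over the node() match.
    static final String TO_MARKUP = stylesheet(
        "<xsl:template match='comment()' priority='1'>"
      + "<comment-markup><xsl:value-of select='.'/></comment-markup>"
      + "</xsl:template>"
      + "<xsl:template match='processing-instruction()' priority='1'>"
      + "<pi-markup name='{name()}'><xsl:value-of select='.'/></pi-markup>"
      + "</xsl:template>");

    // Post-filter: the elements are turned back into comments and PIs.
    static final String FROM_MARKUP = stylesheet(
        "<xsl:template match='comment-markup'>"
      + "<xsl:comment><xsl:value-of select='.'/></xsl:comment>"
      + "</xsl:template>"
      + "<xsl:template match='pi-markup'>"
      + "<xsl:processing-instruction name='{@name}'>"
      + "<xsl:value-of select='.'/>"
      + "</xsl:processing-instruction>"
      + "</xsl:template>");

    /** Runs one stylesheet (given as a string) over an XML string. */
    public static String apply(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Filter failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<doc><!-- reviewed --><?render page-break?><p>text</p></doc>";
        String marked = apply(TO_MARKUP, input);
        System.out.println(marked);
        System.out.println(apply(FROM_MARKUP, marked));
    }
}
```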

1.8. I want to compare my text documents to show changes to individual words. Can DeltaXML do that?

Yes. The compare-detailed sub-command of the command.jar command-line driver provides easy access to file comparison on a word-by-word basis. The WordByWord sample included in the DeltaXML Core 3.0 and subsequent releases demonstrates how this functionality could be incorporated into your application.

A more specialized filter pipeline is also available for XHTML. Rather than producing a differences report, this pipeline produces an XHTML result with the differences represented in-line. It is available through the compare-xhtml sub-command of command.jar and is documented in the Guide to Using Filters with DeltaXML, with appropriate source code included in the samples/xsl-filters subdirectory of the Core 3.0 release.

1.9. Will DeltaXML handle large XML files, e.g. 10 MB?

Yes. Our page "How to compare large files" describes techniques for handling them.

1.10. What is the latest release?

Please see the release notes for Core: http://www.deltaxml.com/core/current/docs/release-notes.html and for Sync: http://www.deltaxml.com/sync/current/docs/release-notes.html

2. Using the Command-line evaluation software

2.1. How can I generate an HTML file to view the changes in my browser?

There are two ways to do this, depending on whether you want to see the changes in the context of all the unchanged data, or whether you just want to see the changes, for example when you are comparing two large files. To view changes in context, you can generate a full delta using a command such as:

java -jar /DeltaXMLCore-3_0/command.jar compare doc1.xml doc2.xml differences.html

For changes only, the compare command is used in conjunction with the --changes-only option, as follows:

java -jar /DeltaXMLCore-3_0/command.jar compare --changes-only doc1.xml doc2.xml
    changes.html

2.2. How can I generate an XML file suitable for further processing?

There are two ways to do this, depending on whether you want to pass on all the contents of the updated files to the next application, or whether you just want to pass it details of the changes that have been made, for example when you are comparing changes to two large files. To generate a full delta in XML, add the --raw-xml-output option to the command:

java -jar /DeltaXMLCore-3_0/command.jar compare --raw-xml-output doc1.xml doc2.xml
    differences.xml

For changes only, the compare command is used in conjunction with the --changes-only option, as follows:

java -jar /DeltaXMLCore-3_0/command.jar compare --raw-xml-output --changes-only
    doc1.xml doc2.xml changes.xml

Alternatively, if you wish to ensure that the delta file can be recombined with the original files by preserving their whitespace, you can use the compare-raw command:

java -jar /DeltaXMLCore-3_0/command.jar compare-raw doc1.xml doc2.xml changes.xml

2.3. Can I preserve whitespace?

If non-ignorable whitespace is significant for your applications, you can prevent whitespace from being normalized by adding the --preserve-whitespace option to the compare command, or by using the compare-raw command if you only want to record changes. To generate a full delta, use the compare command with both the --raw-xml-output and --preserve-whitespace options:

java -jar /DeltaXMLCore-3_0/command.jar compare --raw-xml-output --preserve-whitespace
    doc1.xml doc2.xml changes.xml

To generate a changes-only file use the compare-raw command:

java -jar /DeltaXMLCore-3_0/command.jar compare-raw doc1.xml doc2.xml changes.xml

2.4. Can I compare HTML files?

If your HTML files are created as well-formed XML (i.e. if they conform to W3C's XHTML 1.0 specification) you can use DeltaXML's compare-html command to compare them:

java -jar /DeltaXMLCore-3_0/command.jar compare-html doc1.xhtml doc2.xhtml
    differences.xhtml

2.5. Can I compare XML Schemas or DTDs?

You cannot compare XML DTDs directly using DeltaXML because they are not well-formed XML files. You can, however, compare two W3C XML Schemas, or any other form of schema that is encoded as well-formed XML (e.g. RELAX NG schemas, or schemas that have been generated from DTDs using a conversion tool).

DeltaXML has a special compare-schema command for optimizing the comparison of W3C XML Schemas. When this command is used the file is pre-filtered to ensure that element names and other unique characteristics are turned into deltaxml:keys and that attributes are added to ensure that the schema is treated as an orderless set of elements. The command has the form:

java -jar /DeltaXMLCore-3_0/command.jar compare-schema schema-v1.xml schema-v2.xml
    differences.html

The --raw-xml-output and --changes-only options can be used in conjunction with the compare-schema command, but the --preserve-whitespace option cannot be used when comparing schemas.

2.6. Can I use deltas to regenerate source files?

Changes that have been generated using the compare-raw command can be used for 'round-trip' processing, using either forward recombination with the original source file to regenerate the revised file, or backward recombination with the revised file to regenerate the original file. The commands required take the form:

java -jar /DeltaXMLCore-3_0/command.jar recombine-forward doc1.xml changes.xml
    regenerated-2.xml
java -jar /DeltaXMLCore-3_0/command.jar recombine-reverse doc2.xml changes.xml
    regenerated-1.xml

2.7. How can I use XSL in conjunction with DeltaXML?

Details of how to use XSL stylesheets as pre- or post-filters to DeltaXML to form processing pipelines are provided in our paper on Configuring DeltaXML with XSL Filters. A number of sample stylesheets are included in the samples/xsl-filters subdirectory of the release.

2.8. What do pre-filters do?

A pre-filter is an XSLT stylesheet which is applied to both input files before they are compared. This stylesheet can, for instance, remove whitespace or add orderless or key attributes to control the comparison process. You can also use it to remove elements which you do not want to be compared, or convert comments or processing instructions to elements so that these will be compared.

2.9. What do post-filters do?

Post-filters are typically used to remove markup added by pre-filters, or to remove markup generated by DeltaXML that is not required during subsequent stages of processing.

3. Using the API

3.1. How can I integrate DeltaXML with my applications?

Details of how DeltaXML's XMLComparator can be used within a Java application are provided in the Introduction to DeltaXML guide. A full set of Javadoc documentation is provided in the download. A copy of the latest documentation for the DeltaXML Core API is also available online.

3.2. Is the code threadable?

Multiple comparators can be used by different threads at the same time. See the package and class descriptions in the Javadoc for more information.

3.3. Can the API do everything the command-line version does?

Yes, and a lot more. The stylesheets activated by the various options provided in the evaluation software are included in the API package, so you can quickly build equivalent processes and add features of your own. You will also be able to run additional stylesheets at the start and end of your processing sequence, to ensure that all relevant parts of your input files are processed and that the results are presented in the format most suitable for your users and applications.

3.4. Do you offer any support for regression testing?

Yes. We provide an isEqual method which quickly returns a boolean result and which is very useful in regression tests.