Move detection when comparing XML files

May 2024 – DeltaXML brings advanced move detection to XML comparison for enhanced document precision.

XML Compare has always excelled at structural comparisons, ensuring that changes in element order are accurately identified. However, many users face the challenge of tracking content blocks, such as paragraphs, that move within a document. We are excited to introduce the latest feature in our XML Compare product: “Move Detection.” This new capability addresses that challenge by accurately tracking content that has been relocated within a document, even when unique identifiers are not present.

The ability to detect finer-grained changes, such as the movement of paragraphs or sections, is crucial across various industries. Whether you are managing extensive standards documents, facilitating seamless communication between authors and editors, or ensuring the integrity of legal contracts, the precision offered by move detection is invaluable.

Our new “Move Detection” feature offers enhanced configurability and flexibility, allowing you to identify and track moved content with ease.

Benefits of enhanced move detection during comparison

  • Detects moves without the need for unique identifiers, making it easier to track changes in documents without additional markup.
  • Produces cleaner comparison results by identifying moved content, reducing the clutter of added and deleted elements.
  • Lets users enable or disable move detection, configure which elements to track, and set the scope of detection based on their specific needs.
  • Facilitates clearer communication between authors and editors by indicating moved content instead of deletions and additions.
  • Gives users confidence in the accuracy of their document comparisons by providing detailed move detection and change identification.
  • Helps most with documents that undergo extensive content reorganisation, making it easier to track and review changes over long release cycles.

How will moves be shown within my results?

If you are familiar with DeltaXML, you know that the strength of our comparison solutions lies in their robust configuration and integration capabilities, largely enabled by our intuitive deltaV2 result file. At its core, the deltaV2 format simplifies the comparison of ‘A’ and ‘B’ documents by combining them into a single document. In this format, deltaxml:deltaV2 attributes (within the DeltaXML namespace) are added to elements where differences exist. These attributes can take one of the following values: A, B, A=B, or A!=B. Here, ‘A’ or ‘B’ signifies the document source, while the ‘=’ or ‘!=’ separator indicates whether the matching source elements are identical or different.

Once move handling is enabled, the Move Source is identified by a @deltaxml:move-id attribute and the Move Target by a @deltaxml:move-idref attribute that has the same value as the @deltaxml:move-id. We call this a Move Pair.

With the introduction of move detection, a new attribute, deltaxml:movedText, is added to elements where moves have occurred. This attribute highlights the moved content, ensuring that both the original and new locations are clearly identified within the deltaV2 file.

By maintaining results in this XML format, you retain the flexibility to use XSLT to transform these results into any format or process that suits your needs. This approach ensures that the powerful insights provided by move detection are easily integrated into your existing workflows and systems, enhancing your ability to manage and review document changes efficiently.
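To make the Move Pair idea concrete, here is a minimal Python sketch that reads a deltaV2-style fragment and matches each Move Source to its Move Target. The sample XML is hand-written for illustration and is not real product output; the namespace URI, the element names and the helper name find_move_pairs are assumptions, so check them against your own result files.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify it in your own result files.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hand-written, simplified fragment in the style of a deltaV2 result (illustrative only).
SAMPLE = f"""
<doc xmlns:deltaxml="{DELTAXML_NS}" deltaxml:deltaV2="A!=B">
  <p deltaxml:deltaV2="A" deltaxml:move-id="m1">Safety notice</p>
  <p deltaxml:deltaV2="A=B">Introduction</p>
  <p deltaxml:deltaV2="B" deltaxml:move-idref="m1">Safety notice</p>
</doc>
"""


def find_move_pairs(root):
    """Return {move id: (source element, target element)} for each complete Move Pair."""
    id_attr = f"{{{DELTAXML_NS}}}move-id"
    idref_attr = f"{{{DELTAXML_NS}}}move-idref"
    # Index sources by their move-id and targets by their move-idref.
    sources = {el.get(id_attr): el for el in root.iter() if el.get(id_attr)}
    targets = {el.get(idref_attr): el for el in root.iter() if el.get(idref_attr)}
    # A pair exists only when the same value appears on both sides.
    return {mid: (sources[mid], targets[mid]) for mid in sources if mid in targets}


root = ET.fromstring(SAMPLE.strip())
pairs = find_move_pairs(root)
for move_id, (source, target) in pairs.items():
    print(move_id, "moved:", source.text)
```

The same lookup could equally be written in XSLT; Python is used here only to keep the sketch short and self-contained.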

Why is move detection so beneficial?

Detecting content moves is essential for maintaining the integrity and readability of documents. Without move detection, relocated content might be marked as deleted from its original location and added elsewhere, creating confusion and clutter in the comparison results. By accurately identifying moves, this feature ensures a cleaner, more intuitive output, highlighting genuine changes in the content.

Enable or Disable Move Detection

The “Move Detection” feature can be optionally enabled or disabled based on user preferences. This allows you to decide when move detection is necessary for your documents, giving you greater control over the comparison process.

Identify Move Candidates

Users have full control over which elements should be considered for move detection through XPath configuration. This allows for both broad and precise selection criteria. For example, a technical writer updating a large manual can configure the system to detect moves in paragraphs but exclude those containing images by using an XPath such as //p[not(descendant::image)]. This specificity ensures that only relevant content moves are tracked, streamlining the review process.
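To see which elements an expression like //p[not(descendant::image)] would select, the sketch below approximates it with Python's standard library. ElementTree supports only a subset of XPath, so the predicate is written as a plain Python filter. The sample document and the function name move_candidates are made up for illustration and do not show how XML Compare itself is configured.

```python
import xml.etree.ElementTree as ET

# Made-up sample manual: one plain paragraph, one with an image, one nested plain paragraph.
SAMPLE = """
<manual>
  <p>Plain paragraph</p>
  <p>Paragraph with a figure <image href="pump.png"/></p>
  <section><p>Nested <b>plain</b> paragraph</p></section>
</manual>
"""


def move_candidates(root):
    """Approximate //p[not(descendant::image)] using ElementTree calls."""
    # root.iter("p") visits every p element in document order, like //p.
    # p.find(".//image") is None when the paragraph has no image descendant.
    return [p for p in root.iter("p") if p.find(".//image") is None]


root = ET.fromstring(SAMPLE.strip())
candidates = move_candidates(root)
print(["".join(p.itertext()) for p in candidates])
```

The paragraph containing the image is left out, so it would never be reported as a move under this selection.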

Configure a ‘Move Class’

The move class feature enables users to limit the scope of move detection to specific sections of a document. By setting a move class with an XPath such as ancestor::section/@id or ancestor::section/title/text(), users can ensure that moves are only recognised within defined boundaries. This is particularly useful for standards organisations that need to track changes within individual sections of long documents, ensuring that content moves are accurately captured without generating unnecessary noise.
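The effect of a move class can be pictured as a grouping key: two candidates are only allowed to pair up when their keys match. The Python sketch below uses the id of the enclosing section as the key, in the spirit of ancestor::section/@id. It illustrates the idea only, not the product's implementation; the sample document and the names move_classes and may_pair are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample with two sections, each with its own id.
SAMPLE = """
<doc>
  <section id="s1"><p>Alpha</p><p>Beta</p></section>
  <section id="s2"><p>Gamma</p></section>
</doc>
"""


def move_classes(root):
    """Map each p element to a class key taken from its enclosing section's id."""
    classes = {}
    for section in root.iter("section"):
        for p in section.iter("p"):
            # Sections are visited outermost first, so with nested sections
            # the innermost section's id is written last and is the one kept.
            classes[p] = section.get("id")
    return classes


def may_pair(a, b, classes):
    """Two candidates may form a move only when both have the same class key."""
    key = classes.get(a)
    return key is not None and key == classes.get(b)


root = ET.fromstring(SAMPLE.strip())
classes = move_classes(root)
alpha, beta, gamma = list(root.iter("p"))
print(may_pair(alpha, beta, classes), may_pair(alpha, gamma, classes))
```

With this key, a paragraph that leaves section s1 for section s2 would be reported as a deletion and an addition rather than as a move.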

Optionally Remove the Move Source

For users who prefer a cleaner comparison output, this feature allows the removal of the original location of moved content, displaying only the new location (the ‘move target’). This option is beneficial in scenarios such as collaborative editing, where an author might want to see only the final position of edited content without the clutter of its previous location. For example, in academic publishing, this can help streamline the review of moved content in research papers, making it easier to focus on the current structure.
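As a rough picture of what a result looks like once the move source is gone, the sketch below post-processes a deltaV2-style fragment in Python and deletes every element that carries a move-id. This is an illustration of the outcome, not the product option itself; the sample fragment, the namespace URI and the function name remove_move_sources are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify it in your own result files.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
MOVE_ID = f"{{{DELTAXML_NS}}}move-id"

# Hand-written, simplified fragment in the style of a deltaV2 result (illustrative only).
SAMPLE = f"""
<doc xmlns:deltaxml="{DELTAXML_NS}" deltaxml:deltaV2="A!=B">
  <p deltaxml:deltaV2="A" deltaxml:move-id="m1">Safety notice</p>
  <p deltaxml:deltaV2="A=B">Introduction</p>
  <p deltaxml:deltaV2="B" deltaxml:move-idref="m1">Safety notice</p>
</doc>
"""


def remove_move_sources(root):
    """Remove each element that carries move-id (the Move Source); return the count."""
    # ElementTree has no parent pointers, so build a child-to-parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    removed = 0
    for el in list(root.iter()):
        if el.get(MOVE_ID) is not None and el in parent_of:
            # Note: any text that follows the removed element (its tail) is dropped too.
            parent_of[el].remove(el)
            removed += 1
    return removed


root = ET.fromstring(SAMPLE.strip())
removed = remove_move_sources(root)
print(removed, [p.text for p in root.iter("p")])
```

Only the Move Target remains, still marked with its move-idref, so a later step can continue to present it as moved content.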

Advanced Configuration Options

For those needing more granular control, advanced configuration options are available. These settings allow you to define the circumstances under which a move candidate is included. For instance, you can choose between an ‘unrestricted’ mode, where even content moved within deleted sections is detected, and a ‘restricted’ mode, where only directly marked deletions are considered for moves. This advanced configurability ensures the move detection process can be tailored to fit specific document management needs.
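One way to picture the difference between the two modes is sketched below in Python. It rests on an interpretation that is an assumption here: a “directly marked” deletion is an element that itself carries deltaxml:deltaV2="A", while content “within deleted sections” sits inside such an element without a marker of its own. The sample fragment, the namespace URI and the function deleted_candidates are hypothetical and do not reproduce the product's logic.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify it in your own result files.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DELTA = f"{{{DELTAXML_NS}}}deltaV2"

# Hand-written fragment: one directly deleted paragraph, one inside a deleted section.
SAMPLE = f"""
<doc xmlns:deltaxml="{DELTAXML_NS}" deltaxml:deltaV2="A!=B">
  <p deltaxml:deltaV2="A">Directly deleted paragraph</p>
  <section deltaxml:deltaV2="A">
    <p>Paragraph inside a deleted section</p>
  </section>
  <p deltaxml:deltaV2="A=B">Unchanged paragraph</p>
</doc>
"""


def deleted_candidates(root, mode="restricted"):
    """Collect deleted p elements under one of two hypothetical modes."""
    found = []

    def walk(el, inside_deleted):
        directly_deleted = el.get(DELTA) == "A"
        if el.tag == "p":
            # restricted: only paragraphs marked as deleted themselves.
            # unrestricted: also paragraphs nested in a deleted ancestor.
            if directly_deleted or (mode == "unrestricted" and inside_deleted):
                found.append(el)
        for child in el:
            walk(child, inside_deleted or directly_deleted)

    walk(root, False)
    return found


root = ET.fromstring(SAMPLE.strip())
restricted = deleted_candidates(root, "restricted")
unrestricted = deleted_candidates(root, "unrestricted")
print(len(restricted), len(unrestricted))
```

Under this reading, the unrestricted mode considers more candidates, which can find more moves at the cost of a wider search.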

Let’s get comparing

If you’re already a DeltaXML customer, you can access this new “Move Detection” feature by simply updating to the latest version. Our comprehensive documentation will guide you through everything you need to know to make the most of this powerful enhancement.

For those new to DeltaXML, we invite you to trial our products today. Take advantage of our free samples and discover how easily you can manage your changing content.

If you have any questions or would like a demo, please don’t hesitate to get in touch. We’re here to help you make the most of your XML comparison processes.

We’d love to hear your feedback on this feature or any ideas you may have for future improvements, so please share your thoughts in the comments section below. Your input is important in helping us make our solutions even better for you. Thank you for your continued support and collaboration. To make sure you never miss a new feature, sign up to our newsletter.
