Change Tracking Standards

Or

Why is there no XML standard for change tracking?

This page describes DeltaXML's involvement with various standards for change tracking of XML documents.
We have contributed to change-tracking standards work at OpenDocument and at W3C, and have developed representations for XML change tracking that allow a sequence of edits to a document to be represented in a generic way.

Is change-tracking the same as comparison?

At DeltaXML, we focus exclusively on comparison and change representation for structured data and documents: this is not the same as change tracking, but the two are closely related. Both seek to represent, in some way, how a document has changed.

Change tracking is inherently ordered, i.e. there is an order to the changes that have been made. Change representation for comparison is not ordered: the changes are shown, but there may be no temporal order given, so there is no knowledge of whether one change pre-dates another.

OpenDocument (ODF and ODT) Standards

We were asked to recommend how to improve change tracking in the OpenDocument Format (ODF) or, more specifically, the OpenDocument Text Format (ODT). Our proposal was evaluated in 2012 alongside two other proposals. The evaluation resulted in the choice of a representation based on operations, known as MCT. See this report for more details: https://www.oasis-open.org/committees/download.php/46485/Change-Tracking-Select-Committee-Report-13July2012.pdf

It is interesting to review why this operational approach was selected. The promises were great though the approach was, in our view, not appropriate. There is universal agreement that change tracking is a difficult problem, and the committee held two different views of how to approach it, which are outlined briefly below.

Two approaches to representing change tracking

The operational view (MCT) says ‘tell me what operations have been performed, and I can undo them’. The generic change-tracking approach (GCT) says ‘tell me what has changed to get to this version, and I can give you any previous version of the document’. But these two do not sit comfortably together. Given a GCT representation of change, MCT has to work out what operation was performed, and that is difficult. Given an MCT representation, GCT has to work out what the effect of that operation was, and that too is difficult.

The current version of ODT, version 1.2 dated September 2011 https://www.oasis-open.org/standards#opendocumentv1.2, uses an approach that is closer to GCT than MCT. Microsoft Word also works in a similar way. A third proposal to the committee, ECT, was targeted at extending the current change tracking to cover new areas.

The principle of the MCT approach was to shift this complexity from the format into the application, so that an operation such as ‘delete column 4 in this table’ could be specified simply and easily in the format, and the application would know what to do. Following this through, the result is a standard for editing operations rather than a standard for representing change in a document. Also, for every operation the inverse needs to be defined, so that to undo ‘delete column 4 in this table’, the inverse operation ‘add this column between columns 3 and 4 in this table’ has to be applied, and therefore the deleted data has to be preserved. This is not easy to specify, and not easy to test, which is perhaps why it is taking so long to evolve into a standard.
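The need to define an inverse for every operation can be illustrated with a minimal sketch. The class names DeleteColumn and InsertColumn and the list-of-rows table model are assumptions made for illustration only; they are not taken from the MCT proposal or from any ODF markup.

```python
from dataclasses import dataclass
from typing import List

Table = List[List[str]]  # hypothetical table model: a list of rows of cell text


@dataclass
class DeleteColumn:
    """Hypothetical 'delete column' operation (index is 0-based)."""
    index: int

    def apply(self, table: Table) -> "InsertColumn":
        # Remove the cell from every row and keep the removed cells,
        # because the inverse operation cannot work without them.
        removed = [row.pop(self.index) for row in table]
        return InsertColumn(self.index, removed)


@dataclass
class InsertColumn:
    """Hypothetical inverse: put the preserved cells back at the same index."""
    index: int
    cells: List[str]

    def apply(self, table: Table) -> DeleteColumn:
        for row, cell in zip(table, self.cells):
            row.insert(self.index, cell)
        return DeleteColumn(self.index)


table = [["a1", "b1", "c1", "d1"],
         ["a2", "b2", "c2", "d2"]]
original = [row[:] for row in table]

undo = DeleteColumn(3).apply(table)      # 'delete column 4' (1-based)
after_delete = [row[:] for row in table]
undo.apply(table)                        # applying the inverse restores the table
```

Even in this toy form, the delete operation has to carry the removed cells forward, and each new kind of operation would need its own inverse defined and tested in the same way.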

It is interesting that the MCT approach (according to the examples in the report referenced above) represents a document as the start document and then all the edits that have been performed on it. Therefore, to extract the end document out of it, all these edits need to be processed; it is not possible simply to ignore the change tracking information. A significant benefit of the current change tracking, and of GCT, was that it could be ignored completely and the result was the final document.
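The difference in how the end document is obtained can be sketched as follows. Both data structures are hypothetical simplifications invented for this example (a document is just a list of words); neither reflects real ODF, GCT, or MCT markup.

```python
# GCT-style sketch (assumed model): the stored content IS the end document,
# and the change records sit alongside it.
gct_doc = {
    "content": ["Hello", "brave", "world"],
    "changes": [{"kind": "insert", "position": 1, "text": "brave"}],
}

# MCT-style sketch (assumed model): the start document plus a log of edits.
mct_doc = {
    "start": ["Hello", "world"],
    "operations": [{"op": "insert", "position": 1, "text": "brave"}],
}


def gct_final(doc):
    # A reader with no change-tracking support simply ignores "changes".
    return list(doc["content"])


def gct_previous(doc):
    # Recovering the earlier version means reversing the recorded changes.
    words = list(doc["content"])
    for change in reversed(doc["changes"]):
        if change["kind"] == "insert":
            del words[change["position"]]
        elif change["kind"] == "delete":
            words.insert(change["position"], change["text"])
    return words


def mct_final(doc):
    # Every operation must be processed in order to reach the end document.
    words = list(doc["start"])
    for op in doc["operations"]:
        if op["op"] == "insert":
            words.insert(op["position"], op["text"])
        elif op["op"] == "delete":
            del words[op["position"]]
    return words
```

In this sketch, gct_final does no work beyond copying the content, whereas mct_final cannot produce the end document without understanding every operation in the log.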

It is perhaps, in retrospect, not surprising that the two approaches have each gone in separate directions. The complexity of defining MCT means that the OpenDocument community still has (as at mid-2017) no standard to use.

W3C Standards

A W3C Community Group was formed in 2012 https://www.w3.org/community/change/. There was a fair bit of activity initially but this has waned and there has been very little since mid-2015.

There is keen interest amongst those who support standardisation in having a standard in this area. The main, and very understandable, reason is that those who are developing XML schemas in different areas do not want to burden their standard with complex markup to represent changes. It would be much better, they quite correctly argue, if there were a standard way to handle change in any XML format.

On the other hand, the many XML editing tools that exist have each developed their own way to handle change tracking, often using processing instructions (PIs). Their vendors have little motivation to inter-operate with competing editing tools, and interoperability does not seem important enough to their customers for them to insist on it. There may also be a competitive advantage in having better change tracking. Although change tracking in normal document editing is well established, it is still in development for structured documents. This may suggest that it is premature to have a standard in this area.

For DeltaXML, a standard would be useful but lack of a standard is not a big problem because we can generate tracked changes in whatever format is needed, from our generic change representation.

Papers and Articles presented by DeltaXML at various conferences

Approaches to Change tracking in XML

XML Prague – March 2010

This paper is an introduction to tracking change in XML documents and data. Presented at XML Prague 2010, held March 13th and 14th, 2010, Prague, Czech Republic.

View

Representing Changes in Open Document Format

Submission to OASIS ODF Technical Committee – July 2010

View

Representing Change Tracking in XML Markup

XML Prague – February 2013

This paper introduces a proposal for a standard mechanism for representing tracked changes in XML. It was presented at XML Prague 2013, held February 9th and 10th, 2013.

View

Standard Change Tracking for XML

Balisage Confluence – August 2014

This paper explores the advantages of using a generic approach to representing tracked changes in XML and the benefits of having a standard XML solution. It refers to work done for the OpenDocument (ODF) standards and the W3C ‘Change’ Community Group.

View