Concurrent Merge Analysis

1. Introduction

In the deltaV2 format (produced by the merge and three way merge objects) the deltaV2 attribute accurately describes the contributions of the input files to the merge result. However, subsequent processing is simplified if these deltas are classified to describe the types of change which they represent. The current classification scheme describes changes in the result file as an add, a delete or a modify.

This classification has a number of uses. For example, it could be used to style changes in a publication, using strike through, red/green background or change bars to indicate change visually. Another use-case would be to drive a GUI associated with change display.

Because the ancestor version can be considered older than the other edits in the merge, we have a temporal frame of reference, and so the concepts of add and delete are defined relative to the ancestor version. This choice leads to the following definitions for change categorisation:

  1. add: something that does not exist in the ancestor, but does in one or more edits.
  2. delete: something that exists in the ancestor, but is missing in at least one of the edits.
  3. modify: other changes not already classified as an add or a delete.
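The three definitions can be read as a small decision function over the set of versions in which an item is present. The following Python sketch is only an illustration of the definitions, not the product's implementation; the function name, its parameters and the version identifiers are hypothetical, and it assumes it is only called for items that are already known to have changed.

```python
def classify(present_in, ancestor, edits):
    """Classify a changed item from the set of versions that contain it.

    present_in: set of version identifiers in which the item exists
    ancestor:   identifier of the common ancestor version
    edits:      identifiers of all edited versions taking part in the merge
    """
    in_ancestor = ancestor in present_in
    in_some_edit = any(e in present_in for e in edits)
    missing_in_some_edit = any(e not in present_in for e in edits)

    # add: not in the ancestor, but present in one or more edits
    if not in_ancestor and in_some_edit:
        return "add"
    # delete: in the ancestor, but missing in at least one edit
    if in_ancestor and missing_in_some_edit:
        return "delete"
    # modify: any other change
    return "modify"


# Hypothetical version identifiers, used purely for illustration.
EDITS = ["Anna", "Ben", "Chris"]
print(classify({"Chris"}, "Original", EDITS))                             # add
print(classify({"Original", "Anna", "Chris"}, "Original", EDITS))         # delete
print(classify({"Original", "Anna", "Ben", "Chris"}, "Original", EDITS))  # modify
```

Because the delete test runs before the modify fallback, an item that is both deleted in one edit and altered in another is still reported as a delete.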

Users familiar with the deltaV2 format will be aware that change propagates up an XML tree. For example, a change to a word in a paragraph affects the paragraph itself and each ancestor element, such as section or body, all the way to the root of the tree. This is represented in deltaV2 through the presence of a != separator between the version identifiers. Although similar logic could have been applied to the modify state in our change classification, we have avoided propagating modify up the XML tree as we did not consider it useful in the use-cases discussed above. It is therefore only applied to leaf-level changes, specifically text changes (represented using textGroup) and attribute changes. The add and delete classifications are applied both to these leaf-level changes and, more generally, to elements within the tree. The following table summarizes where the change classifications will appear.

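Classification value    Applied to change types
add                     elements, text (textGroup), attributes
delete                  elements, text (textGroup), attributes
modify                  text (textGroup), attributes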

The analyzed deltaV2 output from a 'concurrent' merge augments the default deltaV2 output by adding a deltaxml:edit-type attribute. The deltaxml prefix/namespace is the same as that used for deltaV2 attributes. Other than the additional attributes, the analyzed result is identical to the deltaV2 representation.

2. Examples

2.1. A simple word deletion

Consider a paragraph containing the 'quick brown fox...' pangram being concurrently edited by Anna, Ben and Chris (we will use these names as the version identifiers and use 'Original' to represent the common ancestor). Let us suppose that Ben deletes the word quick in his edit. The deltaV2 representation of this would be as follows:

<p deltaxml:deltaV2="Original=Anna=Chris!=Ben">The 
  <deltaxml:textGroup deltaxml:deltaV2="Original=Anna=Chris">
    <deltaxml:text deltaxml:deltaV2="Original=Anna=Chris">quick</deltaxml:text>
  </deltaxml:textGroup> brown fox jumps over the lazy dog.</p>

The analyzed result is similar, but the textGroup used to represent the deleted word will be marked with a delete edit-type, for example:

<p deltaxml:deltaV2="Original=Anna=Chris!=Ben">The 
  <deltaxml:textGroup deltaxml:deltaV2="Original=Anna=Chris" deltaxml:edit-type="delete">
    <deltaxml:text deltaxml:deltaV2="Original=Anna=Chris">quick</deltaxml:text>
  </deltaxml:textGroup> brown fox jumps over the lazy dog.</p>
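A consumer of the analyzed result, such as a styling step or a change GUI, would typically start by locating the elements that carry the edit-type attribute. The following Python sketch shows one way to do this with the standard library. The function name is hypothetical, and the namespace URI bound to the deltaxml prefix is an assumption that should be checked against the namespace declaration in your own result files.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify it against your result files.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
EDIT_TYPE = f"{{{DELTAXML_NS}}}edit-type"
DELTA_V2 = f"{{{DELTAXML_NS}}}deltaV2"


def classified_changes(xml_text):
    """Return (local element name, deltaV2 value, edit-type) for each classified element."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        edit_type = elem.get(EDIT_TYPE)
        if edit_type is not None:
            # Strip the '{namespace}' part that ElementTree puts in front of the tag name.
            local_name = elem.tag.rsplit("}", 1)[-1]
            found.append((local_name, elem.get(DELTA_V2), edit_type))
    return found


# The analyzed paragraph from this example, with the namespace declared explicitly.
SAMPLE = (
    f'<p xmlns:deltaxml="{DELTAXML_NS}" deltaxml:deltaV2="Original=Anna=Chris!=Ben">The '
    '<deltaxml:textGroup deltaxml:deltaV2="Original=Anna=Chris" deltaxml:edit-type="delete">'
    '<deltaxml:text deltaxml:deltaV2="Original=Anna=Chris">quick</deltaxml:text>'
    '</deltaxml:textGroup> brown fox jumps over the lazy dog.</p>'
)

print(classified_changes(SAMPLE))
# [('textGroup', 'Original=Anna=Chris', 'delete')]
```

Only the textGroup is reported. The enclosing p element records the change in its deltaV2 attribute but carries no edit-type of its own.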

2.2. A complex deletion

In this example we will still have Ben deleting the word quick, but let us also assume that Chris changes that word to fast. Here is the deltaV2:

<p deltaxml:deltaV2="Original=Anna!=Ben!=Chris">The 
  <deltaxml:textGroup deltaxml:deltaV2="Original=Anna!=Chris">
    <deltaxml:text deltaxml:deltaV2="Original=Anna">quick</deltaxml:text>
    <deltaxml:text deltaxml:deltaV2="Chris">fast</deltaxml:text>
  </deltaxml:textGroup> brown fox jumps over the lazy dog.</p>

The analyzed result still uses a delete edit-type for this slightly more complex case:

<p deltaxml:deltaV2="Original=Anna!=Ben!=Chris">The 
  <deltaxml:textGroup deltaxml:deltaV2="Original=Anna!=Chris" deltaxml:edit-type="delete">
    <deltaxml:text deltaxml:deltaV2="Original=Anna">quick</deltaxml:text>
    <deltaxml:text deltaxml:deltaV2="Chris">fast</deltaxml:text>
  </deltaxml:textGroup> brown fox jumps over the lazy dog.</p>
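To see why this case is harder to resolve than the simple deletion, it can help to expand the textGroup into a per-version view. The Python sketch below does this for the textGroup shown above. The helper names are hypothetical, the namespace URI is an assumption, and the list of all versions is supplied by the caller because the textGroup itself does not mention versions that have no text.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify it against your result files.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DELTA_V2 = f"{{{DELTAXML_NS}}}deltaV2"
TEXT = f"{{{DELTAXML_NS}}}text"


def versions_in(delta_v2):
    """Split a deltaV2 value such as 'Original=Anna!=Chris' into its version identifiers."""
    # Split on '!=' first so that the '=' inside '!=' is not treated as a separator.
    return [v for group in delta_v2.split("!=") for v in group.split("=")]


def per_version_text(text_group, all_versions):
    """Map every version to the text it holds in this textGroup, or None if it has none."""
    view = {version: None for version in all_versions}
    for branch in text_group.findall(TEXT):
        for version in versions_in(branch.get(DELTA_V2)):
            view[version] = branch.text
    return view


GROUP = ET.fromstring(
    f'<deltaxml:textGroup xmlns:deltaxml="{DELTAXML_NS}" '
    'deltaxml:deltaV2="Original=Anna!=Chris" deltaxml:edit-type="delete">'
    '<deltaxml:text deltaxml:deltaV2="Original=Anna">quick</deltaxml:text>'
    '<deltaxml:text deltaxml:deltaV2="Chris">fast</deltaxml:text>'
    '</deltaxml:textGroup>'
)

print(per_version_text(GROUP, ["Original", "Anna", "Ben", "Chris"]))
# {'Original': 'quick', 'Anna': 'quick', 'Ben': None, 'Chris': 'fast'}
```

Ben has no text at this position while Chris has different text, so there are three competing outcomes rather than a single accept or reject choice.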

2.3. A simple addition

In this example the word 'very' will be added by one of the editors, 'Chris'. The analysed result with the edit-type attribute is shown:

<p deltaxml:deltaV2="Original=Anna=Ben!=Chris">The 
  <deltaxml:textGroup deltaxml:deltaV2="Chris" deltaxml:edit-type="add">
    <deltaxml:text deltaxml:deltaV2="Chris">very </deltaxml:text>
  </deltaxml:textGroup>quick brown fox jumps over the lazy dog.</p>

2.4. A simple modification

In this example the word 'quick' will be changed by Chris to 'fast', but unlike the complex deletion example above this is the only change here. The analysed result with the edit-type attribute is shown:

<p deltaxml:deltaV2="Original=Anna=Ben!=Chris">The 
  <deltaxml:textGroup deltaxml:deltaV2="Original=Anna=Ben!=Chris" deltaxml:edit-type="modify">
    <deltaxml:text deltaxml:deltaV2="Original=Anna=Ben">quick</deltaxml:text>
    <deltaxml:text deltaxml:deltaV2="Chris">fast</deltaxml:text>
  </deltaxml:textGroup> brown fox jumps over the lazy dog.</p>

This is a simple modification as there are only two deltaxml:text 'branches' or children in the above deltaxml:textGroup. More complex scenarios such as different words being changed in the different inputs would still be a modify, but would not be simple.

3. Notes

  1. The first example could be described as a simple deletion. When used in the context of the rule-based processing result we define a simple change as one marked with an add or delete edit type and where the deltaV2 attribute does not contain !=. In the case of a textGroup or attribute change this would correspond to there being exactly one deltaxml:text or deltaxml:attributeValue possibility. It is therefore possible to give a simple accept or reject type response in a change GUI with a single predictable result.
  2. The second example is of a complex change. This is defined in rule processing as having the add/delete edit type and also having a deltaV2 attribute containing !=. This form of conflict is often resolved interactively.
  3. The second, complex example is still classified as a deletion, even though the word is also modified. When we categorize change, add and delete override modify.
  4. The simple examples above use textGroups and distinguish between simple and complex deletions. The same approach is applied to addition (the presence of != implies a complex addition) and it can also be applied to attribute change represented using deltaxml:attributes. The simple/complex categorization can also apply to elements. Consider, for example, the case where one editor deletes a paragraph and another adds a word to that paragraph: this would be a complex delete. Another way of thinking about categorizing element adds or deletes is to consider that a simple add or delete will have no descendant change in that subtree. A complex element add/delete will contain nested change.
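The simple/complex rule in notes 1 and 2 depends only on the edit-type and on whether the deltaV2 attribute contains the != separator, so it can be sketched in a few lines of Python. The function name is hypothetical and the sketch covers only the add and delete rule stated in these notes; it returns None for modify, which the notes do not define in these terms.

```python
def change_complexity(edit_type, delta_v2):
    """Return 'simple' or 'complex' for an add or delete, or None for other edit types."""
    if edit_type not in ("add", "delete"):
        return None
    # A '!=' separator means the versions disagree about this item.
    return "complex" if "!=" in delta_v2 else "simple"


print(change_complexity("delete", "Original=Anna=Chris"))   # simple  (example 2.1)
print(change_complexity("delete", "Original=Anna!=Chris"))  # complex (example 2.2)
print(change_complexity("add", "Chris"))                    # simple  (example 2.3)
```

A change GUI could use such a check to offer a plain accept or reject control for simple changes and fall back to an interactive resolution view for complex ones.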

4. Enabling the merge analysis

The merge analysis is performed as part of the output processing stage whenever the result type is set to analyzed (or rule-resolved) deltaV2, so this result type needs to be set before the output processing stage is run. When using the API, the result type must be set before invoking the extractAll method on a ConcurrentMerge object. The setResultType method can be used with ConcurrentMerge.MergeResultType.ANALYZED_DELTAV2 as its parameter value. Similar rules apply to the ThreeWayMerge object and its corresponding result type.

The command line tool provides the result-type optional parameter; setting this to analyzed-deltav2 will provide a result with the classification attributes.