Rule Based Processing

1. Introduction

Two types of merge output, the DELTAV2 and ANALYZED_DELTAV2 result types, are complete representations of all of the content in the merge inputs. The SIMPLIFIED_DELTAV2 result type is a simplified form of DELTAV2 that also represents all of the content in the merge inputs, but with the loss of some of the change information. When content is deleted, for example, the result includes both information about the deletion and the deleted content itself.

However, automatically accepting simple changes, such as additions and deletions, into a result is a common use case for line-based merge algorithms, and similar processing is useful with the XML-based algorithms used in our merge products.

We call this rule-based processing because a set of rules is used to determine which types of change are automatically applied. This processing is used when the user selects the RULE_PROCESSED_DELTAV2 result type for a merge. Without any user-specified processing rules, the processing engine will, by default, process simple add changes so that the content is added. Similarly, simple delete changes are applied so that the deleted content is removed from the result, and simple (leaf) modifications are applied. The processing rules control which changes are displayed to the user (for example, for subsequent interactive checking or resolution). Put another way, the display rules control which changes are not automatically applied or converted. The SIMPLIFIED_RULE_PROCESSED_DELTAV2 result type is a simplified form of RULE_PROCESSED_DELTAV2.

1.1. Motivation

The motivation for developing a rule-based processing system follows from the design of the line-based merge or 'diff3' algorithms used in software version control systems. These systems typically accept non-conflicting changes, so that lines that are changed but not in conflict are merged into the result.

However, our solution is more flexible: rules can be used to control where the non-conflicting changes are applied, whereas line-based algorithms typically do not provide any such configuration.

1.2. Example

The following is an example of a deltaxml:textGroup used to show fine-grained changes to inline text. One of the inputs ('Ben' in the example) changes the word 'quick' to 'fast'. Without rule-based processing (when the result type is ANALYZED_DELTAV2) the output would be:

<p deltaxml:deltaV2="Original=Anna=Chris!=Ben">The 
  <deltaxml:textGroup deltaxml:deltaV2="Original=Anna=Chris!=Ben" deltaxml:edit-type="modify">
    <deltaxml:text deltaxml:deltaV2="Original=Anna=Chris">quick</deltaxml:text>
    <deltaxml:text deltaxml:deltaV2="Ben">fast</deltaxml:text>
  </deltaxml:textGroup> brown fox jumps over the lazy dog.</p>

The above is an example of a modification which can be rule processed. The default action of the rule processing system would be to remove the deltaxml:textGroup describing the change. The corresponding output would then be:

<p deltaxml:deltaV2="Original=Anna=Chris=Ben">The fast brown fox jumps over the lazy dog.</p>

In the above example the deltaxml:textGroup describing the change is removed and replaced with the literal text corresponding to the modification made by Ben (the word 'fast'). Note also that the deltaV2 attribute on the parent p element has been updated to reflect the fact that it no longer contains changes. This updating or correction of delta attributes is applied throughout the tree and, if appropriate, to the root element. In certain cases, such as when all changes are simple and are processed, the deltaV2 attribute on the root element of the result indicates that no changes are present in the final result. The root element delta can then be used to determine that there is no need for interactive change display or conflict resolution tools in these cases.
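
The root element check described above could be sketched as follows, using only the JDK's DOM parser. This is an illustration rather than part of the merge API: the class and method names are hypothetical, the namespace URI bound to the deltaxml prefix is an assumption that should be checked against your own result files, and the test for '!=' is inferred from the deltaV2 values shown in the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class RootDeltaCheck {
    // Assumed namespace URI for the deltaxml prefix; check it against your own result files.
    static final String DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1";

    /** Returns true if the root element's deltaV2 value contains "!=", meaning versions differ. */
    static boolean rootHasChanges(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so that getAttributeNS can find the attribute
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getAttributeNS(DELTAXML_NS, "deltaV2").contains("!=");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the merge result", e);
        }
    }
}
```

A caller might use rootHasChanges to decide whether a result needs to be opened in an interactive review tool at all.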

Just as deltaxml:textGroup elements are used in DELTAV2 merge output, deltaxml:versionGroup elements are used to represent the changes in SIMPLIFIED_DELTAV2 output. Rule processing can also be applied to the simplified output to produce SIMPLIFIED_RULE_PROCESSED_DELTAV2 output. The following is an example of a deltaxml:versionGroup:

<p deltaxml:deltaV2="ancestor=edit1!=edit2">The
<deltaxml:versionGroup>
   <deltaxml:versionContent deltaxml:deltaV2="ancestor=edit1">quick</deltaxml:versionContent>
   <deltaxml:versionContent deltaxml:deltaV2="edit2">fast</deltaxml:versionContent>
</deltaxml:versionGroup> brown fox jumps over the lazy dog.</p>

The default action of the rule processing system is to remove the deltaxml:versionGroup that describes the change and produce the same rule-processed result as above.
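
The deltaV2 attribute values in the examples above join version identifiers with '=' and '!='. As a minimal sketch, assuming that '=' joins versions with identical content and '!=' separates groups that differ (a reading inferred from the examples, not a full statement of the format), such a value can be split into groups of agreeing versions. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeltaV2Groups {
    /** Splits a value such as "Original=Anna=Chris!=Ben" into groups of versions that agree. */
    static List<List<String>> parse(String deltaV2) {
        List<List<String>> groups = new ArrayList<>();
        // Split on "!=" first, so that the remaining "=" signs only join equal versions.
        for (String group : deltaV2.split("!=")) {
            groups.add(Arrays.asList(group.split("=")));
        }
        return groups;
    }
}
```

For the value on the p element in the first example, parse returns two groups: Original, Anna and Chris in one, and Ben in the other.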

2. Rule configuration

There are currently six parameter settings available. Five of them control which changes are displayed and therefore not resolved in any way. The sixth, the version priority list, is used for conflict resolution based on version identifier priorities. If none of these settings are used, their default values ensure that some resolution still takes place when rule-based processing is applied.

Rule configuration is achieved using the RuleConfiguration object in the API. This has several methods that describe the configuration as discussed below. The DitaConcurrentMerge object has a setRuleConfiguration method that changes the current configuration. This is applied during rule processing in the extractAll methods when the DitaConcurrentMerge.MergeResultType has been set to RULE_PROCESSED_DELTAV2.

2.1. DisplayChangesTo

This rule configuration parameter setting specifies whether changes to specific elements or groups of elements are displayed or automatically resolved.

Any XPaths used are applied in the context of the entire 'output' document, after generation of the analysed deltaV2. Elements anywhere in the file can be selected. For example, the following XPath applies to every title element:

//title

An XPath can also be used to address a specific individual element using its 'id' attribute:

/topic/bodydiv[2]/p[@id='summary']

Note: the output XML tree is, in general, likely to have a different structure from that of at least one of its inputs. For example, an input could add (or delete) a paragraph. Therefore the 'sibling' index number should only be used to identify a position in the tree when the user is confident that these indexes will not change in the output.
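
The effect described in this note can be illustrated with the JDK's built-in XPath support, independently of the merge API. In the sketch below the class name, method name and sample documents are hypothetical; the XML in the usage note stands in for an input and for a merge output in which another version has added a paragraph.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathPositionDemo {
    /** Evaluates an XPath against an XML string and returns the string value of the first match. */
    static String select(String xml, String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }
}
```

Given an input of <topic><body><p>Intro</p><p id='summary'>Summary</p></body></topic>, the XPath /topic/body/p[2] selects the summary paragraph. If the output gains a new paragraph before it, the same XPath selects the new paragraph instead, whereas //p[@id='summary'] continues to select the summary.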

Using the XPath sequence concatenation operator (comma), it is possible to supply multiple XPaths using the setDisplayChangesTo method in the API, for example:

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
rc.setDisplayChangesTo("//title, /topic/bodydiv/p[1], //p[starts-with(@id, 'sum')]");
dcm.setRuleConfiguration(rc);

Selecting changes to text and attributes is a little more involved, as it requires some knowledge of the deltaV2 format. Specifically, changed text and attributes are contained in deltaxml:textGroup and deltaxml:attributes elements respectively. The following example illustrates how to display changes to text containing the word 'important' and changes to all 'id' attributes.

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
rc.setDisplayChangesTo(
  "//deltaxml:textGroup[contains(string(.), 'important')]," +
  "//deltaxml:attributes/dxa:id");
dcm.setRuleConfiguration(rc);

For further information on deltaxml:textGroup and deltaxml:attributes please see the deltaV2 format documentation.
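
To check which deltaxml:textGroup elements an XPath of this kind would match before supplying it to setDisplayChangesTo, it can be tried against a saved result using the JDK's XPath API. The following is a sketch under stated assumptions: the class and method names are hypothetical, and the namespace URI bound to the deltaxml prefix is an assumption that should be confirmed against your own files.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TextGroupXPathDemo {
    // Assumed namespace URI for the deltaxml prefix; verify against your own delta files.
    static final String DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1";

    /** Counts the nodes in the XML string that match the XPath, resolving the deltaxml prefix. */
    static int countMatches(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // prefixes in the XPath only resolve on a namespace-aware tree
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "deltaxml".equals(prefix) ? DELTAXML_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) { return null; }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }
}
```

Because string(.) concatenates all of the text inside a deltaxml:textGroup, the 'important' test matches if the word appears in any version's text within the group.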

2.2. DisplayChangesInvolving

Changes involving particular versions can be excluded from automatic resolution so that they are always displayed. This is achieved by setting the DisplayChangesInvolving parameter to the version identifiers whose changes are to be displayed (a comma-separated list, passed as a set of strings in the API). The following code can be used to see all the changes involving the 'Anna' and 'Chris' versions.

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
rc.setDisplayChangesInvolving(new HashSet<>(Arrays.asList("Anna", "Chris")));
dcm.setRuleConfiguration(rc);

2.3. DisplaySimpleAdds

The DisplaySimpleAdds parameter controls whether simple adds are displayed or automatically resolved. Its default value is false, which means that simple adds are automatically resolved. The following code can be used to display all simple adds.

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
rc.setDisplaySimpleAdds(true);
dcm.setRuleConfiguration(rc);

2.4. DisplaySimpleDeletes

The DisplaySimpleDeletes parameter controls whether simple deletes are displayed or automatically resolved. Its default value is false, which means that simple deletes are automatically resolved. The following code can be used to display all simple deletes.

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
rc.setDisplaySimpleDeletes(true);
dcm.setRuleConfiguration(rc);

2.5. DisplaySimpleModify

The DisplaySimpleModify parameter controls whether simple modifications are displayed or automatically resolved. Its default value is false, which means that simple modifications are automatically resolved. The following code can be used to display all simple modifications.

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
rc.setDisplaySimpleModify(true);
dcm.setRuleConfiguration(rc);

2.6. Version Priority List

This rule configuration setting specifies a priority list of version identifiers. Modifications and conflicts are automatically resolved based on the priority list and the versions involved in a change. The setting is a comma-separated list of version identifiers, with the first identifier in the list having the highest priority; in the API it is supplied as a list of strings.

DitaConcurrentMerge dcm= new DitaConcurrentMerge();
dcm.setResultType(MergeResultType.RULE_PROCESSED_DELTAV2);
RuleConfiguration rc= new RuleConfiguration();
List<String> priorityList= Arrays.asList("chris","anna","ben");
rc.setVersionPriorityList(priorityList);
dcm.setRuleConfiguration(rc);
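
The 'first identifier has the highest priority' ordering can be pictured with a small sketch. This is a toy model of the ordering only, not the resolution logic used by the product, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class PriorityPick {
    /** Returns the first identifier in priorityList that also appears in involved, if any. */
    static Optional<String> pick(List<String> priorityList, Set<String> involved) {
        // The list is scanned in order, so earlier entries win over later ones.
        return priorityList.stream().filter(involved::contains).findFirst();
    }
}
```

With the priority list "chris", "anna", "ben" and a change that involves only 'ben' and 'anna', pick returns 'anna', because it appears earlier in the list than 'ben'.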

2.7. Parameter precedence

The precedence order of the rule based resolver parameters is summarized as follows:

  1. The DisplayChangesTo and DisplayChangesInvolving settings override all other parameters.
  2. The DisplaySimpleAdds, DisplaySimpleDeletes and DisplaySimpleModify settings control whether resolution is applied to these categories of change for any elements that are not covered by the parameters in item 1.
  3. All of the display parameters (DisplayChangesTo, DisplayChangesInvolving, DisplaySimpleAdds, DisplaySimpleDeletes, DisplaySimpleModify) override the version priority list setting.
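
One reading of the precedence list above can be written as a small decision function. This is a toy model only: the enum values, the method signature, and the assumption that DisplaySimpleModify is handled in the same step as the other two simple-change flags are illustrative, and none of it is part of the RuleConfiguration API.

```java
final class PrecedenceSketch {
    enum Kind { SIMPLE_ADD, SIMPLE_DELETE, SIMPLE_MODIFY, OTHER }
    enum Action { DISPLAY, AUTO_RESOLVE, DEFER_TO_PRIORITY_LIST }

    /** Decides how one change is treated, checking the settings in precedence order. */
    static Action decide(boolean matchesDisplayChangesTo, boolean involvesDisplayedVersion,
                         Kind kind, boolean displayAdds, boolean displayDeletes,
                         boolean displayModify) {
        // 1. DisplayChangesTo and DisplayChangesInvolving override everything else.
        if (matchesDisplayChangesTo || involvesDisplayedVersion) {
            return Action.DISPLAY;
        }
        // 2. The simple-change flags decide for their own category of change.
        switch (kind) {
            case SIMPLE_ADD:    return displayAdds ? Action.DISPLAY : Action.AUTO_RESOLVE;
            case SIMPLE_DELETE: return displayDeletes ? Action.DISPLAY : Action.AUTO_RESOLVE;
            case SIMPLE_MODIFY: return displayModify ? Action.DISPLAY : Action.AUTO_RESOLVE;
            // 3. Anything not claimed above is left to the version priority list, if one is set.
            default:            return Action.DEFER_TO_PRIORITY_LIST;
        }
    }
}
```

For example, a simple add inside an element matched by DisplayChangesTo is displayed even when DisplaySimpleAdds is false, because the first check is reached before the flag is consulted.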