This post examines the ways in which the Git merge process can be extended, and explains why we’re suggesting that it’s more appropriate to integrate our tree-based merge tools as merge drivers rather than taking the more common route of providing a mergetool.

The merge (and also rebase) process in Git involves a number of components:

The ‘merge strategy’ is responsible for looking at all of the files and directories with an understanding of moves and renames, matching up the corresponding files, determining the appropriate ancestor and calling the merge driver on triples of files. In some cases the strategy will determine that a full merge is unnecessary and may, for example, perform a fast-forward merge. It is also possible to specify a strategy such as ‘ours’, which produces a result that takes all of the files from the current branch. In these cases it is not really a full merge and the merge driver may not be invoked.

The ‘merge driver’ receives three files corresponding to the ancestor and the two branches, loads the content into memory and is responsible for aligning their content and identifying any conflicts. Using its exit code, it signals to the invoking code whether there are any conflicts. Git then usually reports a message to the user, often using a line starting with “CONFLICT”. The diff3 merge driver in Git represents conflicts using a textual, line-based format consisting of marker lines made up of angle-bracket, equals or vertical-bar characters. The git config command can be used to configure different merge drivers, and the .gitattributes file can then select between them using the filenames or extensions of the files being merged. For example:

*.xml          merge=xmlmerge
*.xsl          merge=xmlmerge
*.json         merge=jsonmerge
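
To make the driver side of this concrete, here is a minimal sketch of a merge driver written in Python. It assumes the driver has been registered with something like git config merge.xmlmerge.driver "python3 xmlmerge.py %O %A %B", where the script name xmlmerge.py is hypothetical and %O, %A and %B are the placeholders Git replaces with the ancestor, current-branch and other-branch temporary files. The merge logic is a trivial whole-file comparison standing in for a real XML-aware algorithm and is not a description of any product. The part worth noticing is the contract: the result overwrites the %A file, and a non-zero exit code tells Git that conflicts remain.

```python
import sys
from pathlib import Path


def _with_newline(text: str) -> str:
    """Make sure non-empty text ends with a newline so markers sit on their own lines."""
    return text if text.endswith("\n") or not text else text + "\n"


def merge_whole_file(base: str, ours: str, theirs: str):
    """Trivial three-way merge at whole-file granularity.

    Returns (merged_text, has_conflict). This is a placeholder for a
    real tree-based algorithm, used only to illustrate the contract.
    """
    if ours == theirs or theirs == base:
        return ours, False      # both sides agree, or only our side changed
    if ours == base:
        return theirs, False    # only their side changed
    # Both sides changed differently: emit the familiar textual markers.
    merged = (
        "<<<<<<< ours\n" + _with_newline(ours)
        + "=======\n" + _with_newline(theirs)
        + ">>>>>>> theirs\n"
    )
    return merged, True


def main(argv):
    # Git substitutes %O %A %B with the names of temporary files.
    base_path, ours_path, theirs_path = (Path(p) for p in argv[1:4])
    merged, has_conflict = merge_whole_file(
        base_path.read_text(), ours_path.read_text(), theirs_path.read_text()
    )
    ours_path.write_text(merged)       # the result must overwrite the %A file
    return 1 if has_conflict else 0    # non-zero exit signals conflicts to Git


if __name__ == "__main__" and len(sys.argv) >= 4:
    sys.exit(main(sys.argv))
```

A real driver would replace merge_whole_file with its own alignment and conflict detection, while the file handling and exit code would stay much the same.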

The user typically needs to resolve any conflicts in each file before the merge operation can be completed and committed. It is possible to take the file with the markers produced by the driver, resolve the conflicts by editing that output in a text editor, and then mark the file as resolved (for example with git add). A ‘mergetool’ provides a graphical user interface to assist with the conflict resolution process, often allowing the user to select content from one of the branches, or possibly the ancestor, for each of the conflicting regions in the file.

We have found two common interaction patterns between merge drivers and merge tools:

  • The merge driver produces the line-based conflict markers and then the merge tool reads the result file from the driver, interprets the markers and provides the user with selection capabilities based on this interpretation. We know of two merge tools which take this approach: MS Visual Studio Code and TkDiff. Please do add comments if you know of any others.
  • The merge tool, when it is invoked, is supplied with the filename of the driver result, but also the names of the original inputs to the merge driver. It can then re-merge the inputs and perhaps base its user interface on internal data structures from its own merge algorithm. Examples of tools which re-merge and do not seem to use the driver results include: Araxis Merge, P4Merge, and OxygenXML.
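
The first pattern depends on the merge tool being able to recover the conflict regions from the driver's output. The following Python sketch shows roughly what that interpretation step involves. The Conflict class and parse_conflicts function are names invented for this illustration and are not taken from any of the tools mentioned; the marker prefixes are the ones Git's line-based output uses, with the vertical-bar section present only when the ancestor content is included.

```python
from dataclasses import dataclass, field


@dataclass
class Conflict:
    ours: list = field(default_factory=list)
    base: list = field(default_factory=list)    # only filled when an ancestor section is present
    theirs: list = field(default_factory=list)


def parse_conflicts(text: str):
    """Split driver output into plain lines (str) and Conflict regions."""
    parts, current, side = [], None, None
    for line in text.splitlines():
        if current is None and line.startswith("<<<<<<<"):
            current, side = Conflict(), "ours"      # start of a conflict region
        elif current is not None and line.startswith("|||||||"):
            side = "base"                           # optional ancestor section
        elif current is not None and line.startswith("======="):
            side = "theirs"
        elif current is not None and line.startswith(">>>>>>>"):
            parts.append(current)                   # end of the region
            current, side = None, None
        elif current is not None:
            getattr(current, side).append(line)     # content inside a region
        else:
            parts.append(line)                      # ordinary merged line
    return parts


sample = "<a>\n<<<<<<< HEAD\n<b x='1'/>\n=======\n<b x='2'/>\n>>>>>>> feature\n</a>\n"
for part in parse_conflicts(sample):
    print(part)
```

A tool following the second pattern would skip this step and work from the original input files instead, which Git makes available to a configured mergetool command as $BASE, $LOCAL and $REMOTE alongside $MERGED.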

It’s possible to integrate our XML and JSON aware merge products into the Git merge process as either a merge driver or a merge tool. We believe that the best approach is to integrate as a merge driver, for the following reasons.

Avoiding conflict confusion

The merge driver and merge tool should identify the same conflicts, i.e. behave in a consistent way. When processing XML with a line-based algorithm (such as diff3), changes such as those to attribute order might cause a conflict in the merge driver. In many workflows a conflicting file would cause the merge tool to be invoked in order to resolve the conflict. But if the merge tool then uses a tree-based XML or JSON aware algorithm, it would not identify these apparent conflicts, and the file may not have any conflicts at all. The unnecessary invocation of the merge tool may cause confusion for the user.
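
The attribute-order case can be illustrated with a small Python sketch using the standard library's ElementTree. The two snippets are made-up examples: imagine one branch where a serializer has reordered the attributes of an element. A line comparison sees a changed line, so any other edit to the same line on the other branch would be reported as a conflict, while a tree comparison sees no change at all. The same_tree helper is written for this illustration and ignores surrounding whitespace, which is a simplification that would not suit every XML vocabulary.

```python
import xml.etree.ElementTree as ET

ours = '<item id="1" status="draft"/>'
theirs = '<item status="draft" id="1"/>'    # same attributes, different order


def same_tree(a: ET.Element, b: ET.Element) -> bool:
    """Compare two elements recursively, ignoring attribute order."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib            # dict comparison is order-insensitive
        and (a.text or "").strip() == (b.text or "").strip()
        and (a.tail or "").strip() == (b.tail or "").strip()
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


line_view_differs = ours != theirs
tree_view_same = same_tree(ET.fromstring(ours), ET.fromstring(theirs))
print(line_view_differs, tree_view_same)    # prints: True True
```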

Improved non-conflicting results

A tree-based merge algorithm which is XML or JSON aware would normally produce well-formed XML or JSON results. However, this is not true of a line-based merge such as diff3, where the result may, for example, have mismatched element tags. These bad results will not necessarily be associated with a conflict: the mismatched tag may be non-conflicting. If the tree-aware algorithm is only used in the merge tool, it may never be invoked unless there is a conflict, and it is therefore possible for a bad result to go unnoticed.
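
As a hypothetical illustration, suppose the ancestor wraps three paragraphs in a single sec element. One branch deletes the two wrapper lines, and the other splits the section in two by inserting a closing and an opening sec tag after the first paragraph. The edits touch different lines, so a line-based merge could plausibly accept both without reporting a conflict and leave something like the text in the sketch below. The is_well_formed helper is a name invented here; it simply asks Python's ElementTree parser whether the merged text parses.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Assumed output of a line-based merge of the two branches described above:
# no conflict markers, but the sec tags no longer match up.
merged = """<doc>
<p>a</p>
</sec>
<sec>
<p>b</p>
<p>c</p>
</doc>
"""

print(is_well_formed(merged))
```

A check like this only catches the problem after the fact. A tree-based driver avoids producing such a result in the first place, because it never treats the start and end tags of an element as independent lines.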

An algorithm with a better understanding of the data and its semantics can make better alignment decisions. Again in non-conflicting situations it makes sense to have this better alignment performed at the merge driver stage.

Simpler software design

The separation of the merge algorithm and a conflict-resolving GUI can lead to simpler software design. It may be that merge tools find the textual markers insufficient for their needs and can provide a better experience by re-running a merge algorithm, but the merge architecture would be simpler if this were not necessary. This would avoid duplicated code and reduce the processing and IO required.

Let’s finish the post with a screenshot. Here’s one of our test files for an attribute conflict. We’ve used an XML aware merge process in the merge driver and it identifies the attribute conflict. In this case we’re using the same textual markers to annotate the conflict as diff3 uses, but we have reformatted the conflict so that it includes just the conflicting part of the XML tree and nothing more. If Visual Studio Code is used as a merge tool, it then provides the conflict handling capabilities shown just above each of the conflicting areas. The screenshot is part of ongoing experimentation with change representations that can be used to communicate between the merge driver and mergetool. We’re also looking at XML and JSON based markup, and we are planning to discuss this more in future blog posts or conference papers.