Merging tables

1. Introduction

DeltaXML Merge currently supports CALS table, HTML table, and DITA simple table processing.

The CALS table processing ensures that, when the input tables are syntactically and semantically valid (as defined by the OASIS CALS table model documentation), the result is a valid CALS table.

Similarly, the HTML table processing ensures that, when the input tables are valid according to the HTML-4 or HTML-5 documentation, the result is a valid HTML-4/5 table. Note that both inputs need to follow the same standard (i.e. both HTML-4 or both HTML-5).

Simple changes to a table, such as changing the contents of an entry or adding a row or column, are generally represented as fine-grained changes.

Some types of change, such as table entries that overlap or span multiple rows and columns, are difficult to represent at fine granularity whilst ensuring validity. In these cases, the changes are represented at row granularity (i.e. as groups of added/deleted rows) or even at whole-table granularity.

In the case of DITA simple tables, the syntactic constraints ensure that cells cannot overlap or span rows or columns, so changes are always represented at a fine-grained level of detail.

2. Change representation

The following table shows how different types of table change are represented in a merge result.

Type of Change | Applies to CALS tables | Applies to HTML tables | Fine grain changes | Groups of added/deleted rows | Table Duplication
Cell content change
Row addition/deletion
Column addition/deletion (Without columns alignment)
Column addition/deletion (With columns alignment)
Row span addition/deletion/modification
Column span addition/deletion/modification
2-dimensional span addition/deletion/modification
Invalid tables

3. Table processing configuration

In DeltaXML Merge, CALS table processing and HTML table processing are configured separately. The following sections describe how to turn table processing on and off and how to set the different CALS table processing modes.

3.1. CALS table

CALS table processing is enabled/disabled using setCalsTableProcessing.

Invalid CALS table behaviour

To ensure that only valid CALS tables are passed to our specialized CALS table processing, each input table is marked as either valid or invalid. The invalid CALS table behaviour parameter determines what type of processing is used for tables that are marked as invalid. The 'warning report mode' parameter configures how recoverable errors are reported.

Three options are provided:

  • fail: The fail option stops the comparison by throwing an appropriate exception (that includes the errors identified by the validity checker).
  • propagate up: The propagate up option ensures that changes to an invalid table (or more specifically 'tgroup') are represented at the table level.
  • compare as XML: The compare as XML option essentially compares the tables as if they were well-formed XML.

This can be configured using setInvalidCalsTableBehaviour.
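The three options can be pictured as a dispatch on each table's validity mark. The Java sketch below is a self-contained illustration and is not DeltaXML Merge code: the enum InvalidTableBehaviour, its constant names, the InvalidTableException class, the handle method and the sample error text are all hypothetical names invented here. The actual argument type accepted by setInvalidCalsTableBehaviour should be taken from the API documentation.

```java
import java.util.List;

// Illustrative model only: none of these types belong to the DeltaXML Merge API.
class InvalidTableBehaviourSketch {

    // Hypothetical constant names for the three documented options.
    enum InvalidTableBehaviour { FAIL, PROPAGATE_UP, COMPARE_AS_XML }

    // Hypothetical exception that carries the validity checker's errors.
    static class InvalidTableException extends RuntimeException {
        final List<String> errors;

        InvalidTableException(List<String> errors) {
            super("Invalid CALS table: " + String.join("; ", errors));
            this.errors = errors;
        }
    }

    // Returns a label describing the processing applied to one table.
    static String handle(boolean valid, List<String> errors, InvalidTableBehaviour behaviour) {
        if (valid) {
            // Valid tables always go to the specialised processing.
            return "specialised CALS table processing";
        }
        switch (behaviour) {
            case FAIL:
                // Stop the comparison and report what the checker found.
                throw new InvalidTableException(errors);
            case PROPAGATE_UP:
                return "change represented at table (tgroup) level";
            case COMPARE_AS_XML:
                return "compared as well-formed XML";
            default:
                throw new IllegalStateException("unknown behaviour: " + behaviour);
        }
    }

    public static void main(String[] args) {
        // Made-up error text, used only to exercise the sketch.
        List<String> errors = List.of("example validity error");
        System.out.println(handle(true, List.of(), InvalidTableBehaviour.FAIL));
        System.out.println(handle(false, errors, InvalidTableBehaviour.PROPAGATE_UP));
        System.out.println(handle(false, errors, InvalidTableBehaviour.COMPARE_AS_XML));
        try {
            handle(false, errors, InvalidTableBehaviour.FAIL);
        } catch (InvalidTableException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

In this model the chosen behaviour only matters for tables marked as invalid; a valid table is processed the same way under all three options.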

Warning report mode

This mode specifies the way in which invalid table warnings should be reported.

Different options such as comments, messages or processing instructions are available to report warnings.

This can be configured using setWarningReportMode.
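To make the difference between these carriers concrete, the Java sketch below renders one warning as an XML comment, a processing instruction, or a plain message. It shows only the general shape of each carrier and is not DeltaXML's actual output format: the enum WarningReportMode with these constant names, the report method, the processing instruction target table-warning and the sample warning text are assumptions made for illustration.

```java
// Illustrative model only: none of these types belong to the DeltaXML Merge API.
class WarningReportSketch {

    // Hypothetical constant names for the reporting options mentioned in the text.
    enum WarningReportMode { COMMENT, MESSAGE, PROCESSING_INSTRUCTION }

    // Renders a warning in the carrier selected by the mode.
    static String report(String warning, WarningReportMode mode) {
        switch (mode) {
            case COMMENT:
                // "--" may not appear inside an XML comment, so separate adjacent hyphens.
                return "<!-- " + warning.replaceAll("-(?=-)", "- ") + " -->";
            case PROCESSING_INSTRUCTION:
                // "?>" would end the processing instruction early, so break it up.
                return "<?table-warning " + warning.replace("?>", "? >") + "?>";
            case MESSAGE:
                // A message is reported outside the result document.
                return "WARNING: " + warning;
            default:
                throw new IllegalStateException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        String warning = "example table warning"; // made-up warning text
        for (WarningReportMode mode : WarningReportMode.values()) {
            System.out.println(mode + " -> " + report(warning, mode));
        }
    }
}
```

Comments and processing instructions travel inside the result document, whereas a message is delivered separately, which is the main practical difference when choosing between them.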

CALS table validation level

The CALS invalid table behaviour depends on the CALS table validation level.

The CALS table validation level can either be STRICT or RELAXED.

This can be configured using setCalsValidationLevel.

3.2. HTML table

HTML table processing is enabled/disabled using setHtmlTableProcessing.
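The settings in this section might be applied together along the following lines. This is a minimal Java sketch under stated assumptions: the setter names come from this document, but the TableSettings interface, the boolean and enum parameter types, the enum constant names and the RecordingSettings stub are stand-ins invented here so that the example runs without the product library. Consult the API documentation for the real class and signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins only: these are not the DeltaXML Merge classes.
class TableConfigSketch {

    // Assumed enum shapes; the real types and constants may differ.
    enum InvalidCalsTableBehaviour { FAIL, PROPAGATE_UP, COMPARE_AS_XML }
    enum WarningReportMode { COMMENT, MESSAGE, PROCESSING_INSTRUCTION }
    enum CalsValidationLevel { STRICT, RELAXED }

    // Setter names are taken from the text; parameter types are assumptions.
    interface TableSettings {
        void setCalsTableProcessing(boolean enabled);
        void setInvalidCalsTableBehaviour(InvalidCalsTableBehaviour behaviour);
        void setWarningReportMode(WarningReportMode mode);
        void setCalsValidationLevel(CalsValidationLevel level);
        void setHtmlTableProcessing(boolean enabled);
    }

    // Stub that records each call so the sketch can run on its own.
    static class RecordingSettings implements TableSettings {
        final List<String> calls = new ArrayList<>();

        @Override public void setCalsTableProcessing(boolean enabled) {
            calls.add("calsTableProcessing=" + enabled);
        }
        @Override public void setInvalidCalsTableBehaviour(InvalidCalsTableBehaviour behaviour) {
            calls.add("invalidCalsTableBehaviour=" + behaviour);
        }
        @Override public void setWarningReportMode(WarningReportMode mode) {
            calls.add("warningReportMode=" + mode);
        }
        @Override public void setCalsValidationLevel(CalsValidationLevel level) {
            calls.add("calsValidationLevel=" + level);
        }
        @Override public void setHtmlTableProcessing(boolean enabled) {
            calls.add("htmlTableProcessing=" + enabled);
        }
    }

    // CALS and HTML table processing are switched independently of each other.
    static void configure(TableSettings settings) {
        settings.setCalsTableProcessing(true);
        settings.setCalsValidationLevel(CalsValidationLevel.RELAXED);
        settings.setInvalidCalsTableBehaviour(InvalidCalsTableBehaviour.PROPAGATE_UP);
        settings.setWarningReportMode(WarningReportMode.COMMENT);
        settings.setHtmlTableProcessing(false);
    }

    public static void main(String[] args) {
        RecordingSettings settings = new RecordingSettings();
        configure(settings);
        settings.calls.forEach(System.out::println);
    }
}
```

The particular values chosen in configure are arbitrary; the point is that the CALS-specific modes sit alongside, and independent of, the HTML table switch.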