How to represent changes in JSON

A JSON value can be one of these:

  1. Literal value false, true or null
  2. number
  3. string
  4. object
  5. array

We will look at how changes to each of these can be represented.

Showing change in JSON Literals and numbers

Let’s say we change {“archived”: false} to {“archived”: true} . The standard JSON patch would represent this change as:

JSON Patch:
[{
  "op": "replace",
  "path": "/archived",
  "value": true
}]

This works fine for this simple example, but although it tells us what to replace and the new value it does not tell us the original value, so it works only in one direction. JSON Patch also suffers from the fact that each operation changes the state so it can get very complicated to understand. What we are looking for is a delta representation that will be symmetrical, so we can go in either direction, and this is shown below for this example.

{
   "archived": {"dx_delta": {
      "A": false,
      "B": true }}
}

This shows the original value, “A” which is false, as well as the new “B” value. It is a functional representation, indicating what has changed rather than how to change it. You can see also that it would extend to three versions (or more) quite trivially, we just add a “C” and then for four a “D” value.

This pattern works for changing any value to any other value, even if one or both is an object or an array. The value is just replaced by a dx_delta object with the relevant different values. This works for strings as well, though later in this blog we look at strings in more detail.

Showing change in JSON objects

The example above is for an object member, so you can see how this extends to showing changes in a object. For example if we change

{
   "info": "your data",
   "channel": "bbc 1",
   "name": "John Joe Smith"
}

to

{
   "info": "my data",
   "channel": "bbc 2",
   "name": "John Joe Smith"
}

we get

{
   "info": {"dx_delta": {
      "A": "your data",
      "B": "my data"}},
   "channel": {"dx_delta": {
      "A": "bbc 1",
      "B": "bbc 2"}},
   "name": "John Joe Smith"
}

Notice that “name” has not been changed but it still appears – in many contexts this is useful, so that we have the original JSON data with all the changes indicated. We call this a full-context delta. It is simple enough to remove items that are not changed, to give a changes-only delta – both adopt the same format for changes so can be processed in a similar way.

Although in this case we have both an “A” member and a “B” member for each dx_delta object, this might not be the case. If there was, for example, no “B” member then this means that the item was not present in the “B” document – note that it does not mean that it was null, it means it was not present at all. The array example below shows how this works.

The changes can be embedded anywhere in a JSON hierarchy so we can show the changes at the lowest level of granularity, which is usually what we need.

Showing change in JSON arrays

Objects and values are fairly straightforward for showing changes, but arrays present more of an issue. Consider the changes between [1,2,3,4] and [1,2,4,5]. As an aside, this is where JSON Patch is very difficult to understand, the relevant patch is:

JSON Patch:
[
 {
   "op": "add",
   "path": "/4",
   "value": 5
 },
 {
   "op": "remove",
   "path": "/2"
 }
]

This says add a value at the end (position 4, counting starts at zero), and then remove the item at position 2, i.e. the third item which is ‘3’. If the two “op” objects were reversed we would get something completely different. Patch is easier to understand if multiple changes to an array start at the end and work back towards the start.

So, a better delta would be:

[
   1,
   2,
   {"dx_delta": {"A": 3}},
   4,
   {"dx_delta": {"B": 5}}
]

This shows us that between the 2 and the 4 there was a 3 in the “A” document, and after the 4 there was a 5 in the “B” document. This is a lot easier to understand. You will probably have noticed that this is not the only possible delta for this situation. This also works:

[
   1,
   2,
   {"dx_delta": {"A": 3,"B": 4}},
   {"dx_delta": {"A": 4,"B": 5}}
]

Which of these two representations is needed depends on the nature of the data and whether item position is important or not. If position is important the second delta works best, otherwise the first one shows fewer changes.

Showing change in JSON strings

As mentioned above, JSON strings can be treated in the same way as literals and numbers. So we can show changes between:

{"description": "This is a good example of Word by Word processing"}

and

{"description": "This is a great example of Word by Word processing"}

as

{
   "description": {"dx_delta": {
      "A": "This is a good example of Word by Word processing",
      "B": "This is a great example of Word by Word processing"}}
}

That is quite correct and a simple representation. But it is sometimes useful to be able to represent the changes at a word level, so in this example we can see that only one word has changed. Here is a way to do that:

{
   "description": {"dx_delta_string": [
      "This is a ",
      {"dx_delta": {"A": "good","B": "great"}},
      " example of Word by Word processing"]}
}

Here we have replaced the string value with an object, a “dx_delta_string”. This is an array of strings where each member is either an unchanged chunk of text or a delta object showing the changes. So here we can now clearly see that only one word has changed. It is a more complex representation but gives the changes at a finer level of granularity. It is quite easy to re-construct either the “A” string or the “B” string by concatenating the array members and, where there is a dx_delta, choosing the relevant value.

Conclusions

The delta representation shown here has several important characteristics, all of which are advantages over the regular JSON Patch format. The first is that it is symmetrical and works as a ‘patch’ in both directions, i.e. to convert “A” to “B” or to convert “B” to “A”.

Secondly, the same format can represent the whole JSON structure with changes shown, or it can represent just the changes – showing changes-only is similar to a patch in terms of its functionality. The full-context delta can in fact be used as an ‘intelligent patch’ in that because it maintains context it is able to patch JSON documents that are not exactly equal to the “A” or the “B” document. More on that later.

The third characteristic is that this delta extends to representing changes in three or more JSON structures.

Finally, changes to strings can be described at the word level.


Try a 30 day free professional trial of DeltaJSON today →

Keep Reading