Any item that has a unique key member should ideally be represented as a member of an enclosing object, with the key value pulled out as the member name. Corresponding items are then matched by name, which makes the comparison unambiguous. See the contacts example below.
Arrays present more of a problem for comparison, because they are used for different purposes. For example, if an array represents an x,y coordinate, then the expectation is that [ 34, 56 ] is not the same as [ 56, 34 ]. However, if the array is being used as an unordered set of numbers, then those two arrays should be considered equal. Comparing by position and comparing as unordered items are therefore alternative approaches, to be chosen according to how the array data is interpreted.
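The two interpretations can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the unordered check assumes the items are hashable and treats the array as a multiset, so duplicates still count.

```python
from collections import Counter


def equal_by_position(a, b):
    """Treat the arrays as tuples: same items in the same positions."""
    return list(a) == list(b)


def equal_as_unordered(a, b):
    """Treat the arrays as unordered collections.

    Assumption: items are hashable (numbers, strings). Counter compares
    item counts, so [1, 1, 2] is not equal to [1, 2].
    """
    return Counter(a) == Counter(b)


coordinate_a = [34, 56]
coordinate_b = [56, 34]

print(equal_by_position(coordinate_a, coordinate_b))   # False
print(equal_as_unordered(coordinate_a, coordinate_b))  # True
```

Which function is appropriate cannot be decided from the JSON alone; it depends on what the array means in the application.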
Furthermore, comparing by position is not always what is needed when we use an array as a list, where the item order is significant. In this case, comparing [1,3,2,4,5] with [1,3,4,5] by position would give three differences: 2 != 4, 4 != 5 and 5 is a deleted item.
[ 1, 3, 2, 4, 5 ]
| | | | x
[ 1, 3, 4, 5 ]
A more intelligent ordered comparison might just say that 2 has been inserted.
[ 1, 3, 2, 4, 5 ]
| | + | |
[ 1, 3, 4, 5 ]
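The difference between the two approaches can be reproduced with Python's standard library. The sketch below is not a JSON comparison tool; `positional_diff` and `ordered_diff` are hypothetical helper names, and the ordered comparison relies on `difflib.SequenceMatcher`, which requires hashable items.

```python
from difflib import SequenceMatcher
from itertools import zip_longest

_MISSING = object()  # sentinel for "no item at this position"


def positional_diff(a, b):
    """Compare item by item at the same index."""
    diffs = []
    for i, (x, y) in enumerate(zip_longest(a, b, fillvalue=_MISSING)):
        if x is _MISSING:
            diffs.append((i, "added", y))
        elif y is _MISSING:
            diffs.append((i, "deleted", x))
        elif x != y:
            diffs.append((i, "changed", (x, y)))
    return diffs


def ordered_diff(a, b):
    """Align the two lists first, then report only the non-matching runs."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# By position: three differences, as described above.
print(positional_diff([1, 3, 2, 4, 5], [1, 3, 4, 5]))
# [(2, 'changed', (2, 4)), (3, 'changed', (4, 5)), (4, 'deleted', 5)]

# Aligned: a single insertion of 2.
print(ordered_diff([1, 3, 4, 5], [1, 3, 2, 4, 5]))
# [('insert', [], [2])]
```

Whether the single change is reported as an insertion or a deletion depends only on which list is taken as the original.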
Arrays are therefore the main source of difficulty when comparing JSON data.
When JSON is generated, arrays are often used where the data could equally be represented as objects. Converting such an array into an object may therefore be a sensible pre-comparison step, so that only ‘real’ changes are identified.
For example:
{"contacts": [
    {
        "id": "324",
        "first_name": "AN",
        "last_name": "Other"
    },
    {
        "id": "127",
        "first_name": "John",
        "last_name": "Doe"
    }
]}
would be much better represented for comparison purposes as:
{"contacts": {
    "324": {
        "first_name": "AN",
        "last_name": "Other"
    },
    "127": {
        "first_name": "John",
        "last_name": "Doe"
    }
}}
It may not look quite so natural, but the corresponding contacts will be aligned properly.
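A pre-comparison step of this kind might look like the following Python sketch. The helper name `key_array_by` is hypothetical, and the code assumes that every item carries the key member and that the key values are unique; it raises an error rather than silently overwriting a duplicate.

```python
def key_array_by(items, key="id"):
    """Turn a list of objects into one object keyed by each item's key member."""
    keyed = {}
    for item in items:
        name = str(item[key])  # JSON member names must be strings
        if name in keyed:
            raise ValueError(f"duplicate key value: {name!r}")
        # Copy every member except the key itself.
        keyed[name] = {k: v for k, v in item.items() if k != key}
    return keyed


doc = {
    "contacts": [
        {"id": "324", "first_name": "AN", "last_name": "Other"},
        {"id": "127", "first_name": "John", "last_name": "Doe"},
    ]
}

converted = {"contacts": key_array_by(doc["contacts"])}

# The same contacts in a different array order convert to an equal object,
# so a reordering is no longer reported as a change.
reordered = {"contacts": key_array_by(list(reversed(doc["contacts"])))}
print(converted == reordered)  # True
```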