… compare two JSON structures and pin-point … the differences – – Nicholas Ring – Google+
Posted by jpluimers on 2019/08/20
I’ve added a few WayBack/Archive.is links to the interesting comments by Zoë Peterson from Scooter Software (of Beyond Compare fame) at [WayBack] … compare two JSON structures and pin-point … the differences – – Nicholas Ring – Google+:
Beyond Compare 4 has an optional “JSON sorted” file format that uses jq to pretty print and sort JSON data before comparing it. It’s not included out of the box yet, but you can get a copy here:

If you’re interested in an actual algorithm and not just an app, I don’t have a suggestion handy, but could dig one up. Tree alignment is more complicated than sequence alignment and we did do research into it, but it was quite a few years ago and didn’t get incorporated into BC. XML alignment algorithms were being actively researched back in the aughts and they should trivially transfer to JSON.
…It looks like our research mostly ended around 2002, and I wasn’t personally involved in it, so I don’t know how helpful this will be, but here’s what I have:
XyDiff (C++)
“Detecting Changes in XML Documents” by Gregory Cobena, Serge Abiteboul, Amelie Marian
[WayBack] https://github.com/fdintino/xydiff

DiffXML (Java)
“CS4 Dissertation: XML Diff and Patch Utilities” by Adrian Mouat
[WayBack] http://diffxml.sourceforge.net/

X-Diff (C++/Java)
“X-Diff: An Effective Change Detection Algorithm for XML Documents” by Yuan Wang, David J DeWitt, Jin-Yi Cai
[WayBack] http://pages.cs.wisc.edu/~yuanwang/xdiff.html

DiffMk (Java/GPL)
[Archive.is] https://sourceforge.net/projects/diffmk/

XML-SemanticDiff (Perl)
[WayBack] http://search.cpan.org/dist/XML-SemanticDiff/

xmldiff (Python)
[WayBack] https://www.logilab.org/project/xmldiff
The general idea in the thread is that JSON, though not as formalised as XML, does have structure, so if you can normalise it, then XML ways of differencing should work.
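As a rough illustration of the "normalise first, then compare" idea, here is a minimal Python sketch that uses only the standard library. It is not what Beyond Compare or jq do internally: it simply re-emits both documents with sorted keys and fixed indentation, then runs an ordinary line diff. The function names `canonical_lines` and `json_text_diff` are made up for this example.

```python
import difflib
import json


def canonical_lines(text):
    """Parse JSON text and re-emit it with sorted keys and fixed indentation."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


def json_text_diff(left, right):
    """Return a unified line diff of the two canonical forms."""
    return list(
        difflib.unified_diff(
            canonical_lines(left),
            canonical_lines(right),
            fromfile="left",
            tofile="right",
            lineterm="",
        )
    )


if __name__ == "__main__":
    a = '{"b": 1, "a": [1, 2]}'
    b = '{"a": [1, 2], "b": 2}'
    # Key order differs, but only the changed value of "b" shows up.
    print("\n".join(json_text_diff(a, b)))
```

Sorting only reorders object keys; array elements keep their order, so a reordered array still shows up as a difference. This is a sequence diff on text, not the tree alignment discussed in the thread.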
Normalisation also means that you need to normalise any floating-point numbers, date/time values, escaping, quoting, etc. Maybe not for the faint of heart.
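To give an impression of what such value normalisation might involve, here is a hedged Python sketch. The rules in it are assumptions chosen for illustration, not anything from the thread: whole-number floats are treated as integers, and ISO 8601 timestamps that carry a UTC offset are rewritten to UTC. Escaping and quoting differences disappear as a side effect of parsing and re-serialising. The names `normalise` and `canonical` are hypothetical.

```python
import json
from datetime import datetime, timezone


def normalise(value):
    """Recursively rewrite values into one representation (illustrative rules)."""
    if isinstance(value, dict):
        return {key: normalise(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalise(item) for item in value]
    if isinstance(value, float) and value.is_integer():
        # Assumption: 1.0 and 1 should compare as equal.
        return int(value)
    if isinstance(value, str):
        try:
            # Which ISO 8601 forms are accepted depends on the Python version.
            stamp = datetime.fromisoformat(value)
        except ValueError:
            return value  # not a timestamp, keep the string untouched
        if stamp.tzinfo is None:
            return value  # no offset given: do not guess a time zone
        return stamp.astimezone(timezone.utc).isoformat()
    return value


def canonical(text):
    """Parse, normalise values, and re-emit with sorted keys."""
    data = normalise(json.loads(text))
    return json.dumps(data, sort_keys=True, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    a = '{"when": "2019-08-20T12:00:00+02:00", "n": 1.0}'
    b = '{"n": 1, "when": "2019-08-20T10:00:00+00:00"}'
    print(canonical(a) == canonical(b))
```

Whether any of these rules are right depends entirely on the data: a string that merely looks like a timestamp gets rewritten too, and very large or very precise numbers lose precision once parsed as floats.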
–jeroen