-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace the org.json dependency by openjson library #84
base: master
Are you sure you want to change the base?
Replace the org.json dependency by openjson library #84
Conversation
(https://github.com/openjson/openjson) - fixes license incompatibility, see https://wiki.debian.org/qa.debian.org/jsonevil https://lwn.net/Articles/707510/ - improves performance when writing JSON
I might be missing it but I don't see a claim in either article that the JSON license is actually incompatible with the Apache license, merely that it doesn't meet the FSF and OSI definitions of free or open source and therefore is not allowed to be used by official Apache Software Foundation projects and is treated as non-free by Debian. Although there's obviously an expectation that webarchive-commons is open source and therefore ideally all its dependencies are as well. This patch will have some level of impact in dependent applications such as Heritrix and OpenWayback. At very least both rely on the json.org library being pulled in as a transitive dependency. I'm not sure yet whether it will cause an API incompatibility, most of the usages in webarchive-commons seem internal. The danger is that this forces Heritrix to change to openjson and that in turn breaks everyone's custom modules. It's a pity they used a different package name as otherwise it might actually be drop in bytecode compatible. It appears openjson was forked from com.tdunning/json which actually does use the org.json package name: https://mvnrepository.com/artifact/com.tdunning/json/1.8 We could try using that instead if this change does lead to an API breakage headache. @ldko, @nlevitt, @anjackson do you have any thoughts? |
Hi @ato, the option to rely on Ted Dunning's package has been discussed in commoncrawl#17 and tdunning/open-json#13. It was the initial trial but turned out to be significantly slower. |
https://en.wikipedia.org/wiki/Douglas_Crockford#.22Good.2C_not_Evil.22
Maybe we can also get a special allowance to do evil? :) I added json-simple as a dependency to heritrix-contrib at one point: internetarchive/heritrix3#242 My feeling is pretty much that the license thing is a non-issue. Nobody's suing anybody over this. I haven't heard of anybody being prevented from using heritrix it. The only way I can imagine it becoming an issue is if people (i.e. us) start making noise about it. If this is really about performance, do we have any numbers from benchmarking? |
One point is that Apache projects (hosted by the ASF) must not depend on webarchive-commons resp. need to fix the license issue by managing transitive dependencies and replacing org.json by a bytecode compatible package. That's possible and was one of the trials in commoncrawl#16. @ato, I understand that the potential issues in downstream projects (namely heritrix3) can be blocker.
See the profiles in commoncrawl#16 - the JSON serialization is almost twice as fast, the entire WARC -> WAT conversion becomes 6-7% faster. |
Since the original discussion, Doug Crockford has dedicated the JSON library to the Public Domain, which should make it acceptable to most open source organizations. |
Replace the org.json dependency by openjson to