Publishing grobid-core to Maven central #59
Hello! Thanks for the issue! These are the problems I can remember so far:
I don't know whether Maven has a mechanism for downloading large resource files that are not artifacts in a Maven repository. If it does, we could consider hosting the resource files on Amazon S3, for example. We can greatly reduce the size of grobid-home by leaving out the CRF++ models and relying only on Wapiti. CRF++ is still included and can optionally be used, but it has no advantage over Wapiti. On the other hand, we might use another CRF library in the future, so the size of the resource files could also increase.
For these dependencies, Grobid currently uses a local file-based repo.
Hi @kermitt2, thank you for the quick reply and the context. Some comments from me below.
I don't think this is an issue for pushing Maven artifacts. I say this because
How is this licensed? I see it is copyrighted, but is it licensed? If the license is permissive enough, then maybe we can also make this available somewhere.
Do you have a link to the above project please?
Same as above. Do you have a link to the component(s) so I can check them out and see what the status of the error is? Thank you very much in advance.
Hi Sujen! Normally all these packages have a licence compatible with Grobid's licence, Apache 2, so they can be included directly with Grobid.
Now the more difficult part: the wipo-analysers has no project page. I've packaged everything as a Maven project, so I could create a GitHub repo for this. I will double-check with the WIPO people, but as it is Apache 2, that should be no problem. Thanks a lot for your effort!
Thanks @kermitt2 and @sujen1412. Right now we should be able to publish Wapiti, ImageIO, and the analyzers to the Central repo under the Grobid group, indicating these are used in conjunction with Grobid. If at some point those developer communities would like to take over management and ownership of publishing their artifacts to Central, they can, and we can update Grobid to use them at that time. For now, though, it's safe to publish the rest of the jars. GREAT library @kermitt2! We used it in DARPA Memex with great success.
Fix for Grobid #59 - Publishing to Maven Central
OK I got it working! https://issues.apache.org/jira/browse/TIKA-1699
@sujen1412 this issue can be closed.
@sujen1412 I am seeing build errors on Jenkins for Tika: https://builds.apache.org/job/tika-trunk-jdk1.7/822/ It seems that the other jars aren't in Central like we talked about. Can you please take care of that?
Many thanks @sujen1412 and @chrismattmann for your efforts to integrate GROBID in Apache Tika! And thank you Chris for your nice words. It's really a pleasure to see the library used and considered useful!
Hi @sujen1412, can you re-open this? We need either those jars published to Central or another mechanism here to integrate into Tika. One thing I was thinking of was just connecting to the GROBID server. See the discussion at http://issues.apache.org/jira/browse/TIKA-1699
OK I filed an issue to upload the Wapiti jar fork:
OK here is the issue for EUGFC ImageIO plugin: https://issues.sonatype.org/browse/OSSRH-17126
Here's the one for Language Detection: https://issues.sonatype.org/browse/OSSRH-17127
For Chasen CRFPP: https://issues.sonatype.org/browse/OSSRH-17128
Here's the WIPO analysers: https://issues.sonatype.org/browse/OSSRH-17129 That should be all of them.
@kermitt2 I tried to use the maven central grobid but maven does not build:
Do you know what might be going on? Can I add these locally?
Hello, these libraries ship locally with Grobid (under grobid/lib or grobid-core/lib), so they are not loaded from Maven Central when Grobid builds. You can look at grobid-core/pom.xml to see how the local repository is defined so that these dependencies resolve without trouble.
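For readers unfamiliar with the approach, here is a minimal sketch of how a project-local, file-based repository is typically declared in a pom.xml. It is not copied from grobid-core/pom.xml: the repository id, the `lib` directory, and the `com.example:example-local-lib:1.0` coordinates are all hypothetical placeholders.

```xml
<!-- Fragment to place inside <project>; every id and coordinate below is a placeholder. -->
<repositories>
  <repository>
    <!-- Hypothetical id; any unique name works. -->
    <id>project-local-lib</id>
    <!-- Points Maven at a directory inside the project instead of a remote server. -->
    <url>file://${project.basedir}/lib</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <!-- Hypothetical coordinates for a jar that is not on Maven Central. -->
    <groupId>com.example</groupId>
    <artifactId>example-local-lib</artifactId>
    <version>1.0</version>
  </dependency>
</dependencies>
```

With a declaration like this, Maven expects the jar in standard repository layout under that directory, which for the placeholder coordinates would be lib/com/example/example-local-lib/1.0/example-local-lib-1.0.jar.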
@kermitt2 Thanks, I fixed these issues by including the jars and fixing to
I think that, with the deployment on Bintray, we can now close this issue, can't we?
Fix for Grobid kermitt2#59 - Publishing to Maven Central Former-commit-id: 999c43a
I am trying to integrate grobid into Apache Tika for metadata extraction. It would be nice to have grobid-core published to Maven Central to make adding the dependency in pom.xml easier.