Switch OAI-PMH library in metafacture-biblio #360

Closed
fsteeg opened this issue Feb 23, 2021 · 1 comment · Fixed by #467

Comments

@fsteeg
Member

fsteeg commented Feb 23, 2021

In metafacture-biblio, we depend on org.dspace:oclc-harvester2:0.1.12 (see details).

It's the only version of the OCLC harvester published to Central (see https://mvnrepository.com/artifact/org.dspace/oclc-harvester2). There is a GitHub repo at https://github.com/OCLC-Research/oaiharvester2 which contains a slightly newer version, but that version is not published to Central.

We came across an issue in the library while using it from OERSI, caused by a call in HarvesterVerb, resulting in duplicate logging output (see workaround). With our current setup, we have no way to properly fix issues like this. We should either depend on the OCLC harvester in a way that allows us to make changes to the code, or switch to a new library.
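
To illustrate the kind of problem described above, here is a minimal sketch of how a library that adds its own console output at runtime can lead to duplicate log lines, and how a caller might clean up afterwards. It uses java.util.logging from the JDK only so that it runs without extra dependencies; that is an assumption for illustration, and the harvester and the actual OERSI workaround may use a different logging framework. The class name, the method names and the `libraryCall` stand-in are hypothetical and are not taken from the OCLC code.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Logger;

// Hypothetical sketch, not the OCLC harvester code.
final class DuplicateConsoleHandlerSketch {

    // Stand-in for a third-party call that configures logging on its own
    // by adding another console handler to an already configured logger.
    static void libraryCall(Logger logger) {
        logger.addHandler(new ConsoleHandler());
    }

    // Counts the console handlers attached directly to the given logger.
    static int countConsoleHandlers(Logger logger) {
        int count = 0;
        for (Handler handler : logger.getHandlers()) {
            if (handler instanceof ConsoleHandler) {
                count++;
            }
        }
        return count;
    }

    // Workaround: keep the first console handler, remove all later ones.
    // getHandlers() returns an array, so removing while looping is safe.
    static int removeExtraConsoleHandlers(Logger logger) {
        boolean seenFirst = false;
        int removed = 0;
        for (Handler handler : logger.getHandlers()) {
            if (handler instanceof ConsoleHandler) {
                if (seenFirst) {
                    logger.removeHandler(handler);
                    removed++;
                } else {
                    seenFirst = true;
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("harvest.sketch");
        logger.setUseParentHandlers(false);      // isolate from the root logger
        logger.addHandler(new ConsoleHandler()); // the application's own setup

        libraryCall(logger);                     // the library adds a second handler
        logger.info("harvest started");          // emitted once per handler

        removeExtraConsoleHandlers(logger);
        logger.info("harvest finished");         // emitted by the remaining handler
    }
}
```

A cleanup like this has to run after every call that might re-add the handler, which is why a fix inside the library itself would be preferable to a workaround in the calling code.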

The OCLC harvester is used in a lot of projects on GitHub, many of which incorporate the code into their repos. The newest maintained version of the original OCLC code seems to be in the oai-harvest-manager repo: https://github.com/clarin-eric/oai-harvest-manager/tree/master/src/main/java/ORG/oclc/oai/harvester2/verb. That repo, however, is not published to Central.

One option would be to set up a fork of the original OCLC repo with publishing to Central via GitHub Actions. This would already give us the possibility to make changes to the code. We could also ask the oai-harvest-manager folks to contribute their version to that repo.

Another option would be to switch to a different library, like XOAI, which is published to Central.

Discussed with @dr0i: as a first step, we should have a look at XOAI to see if that works for us.

tibdevelopment pushed a commit to TIBHannover/oersi-etl that referenced this issue Feb 23, 2021
Avoid duplicate log outputs after OAI-PMH harvester ran for the
first time (old library programmatically adds a console appender)

See https://gitlab.com/oersi/oersi-etl/-/issues/40

Remove previous workaround from b2ed6cb

See https://gitlab.com/oersi/oersi-etl/-/issues/53#note_514936670

Long-term solution will be implemented in Metafacture

See metafacture/metafacture-core#360
fsteeg added a commit that referenced this issue Mar 11, 2021
Causes downstream problems due to jitpack.io requirement

See #360
fsteeg added a commit that referenced this issue Mar 11, 2021
tibdevelopment pushed a commit to TIBHannover/oersi-etl that referenced this issue Mar 11, 2021
Build via script failed sporadically due to server issues,
now uses library via metafacture-biblio instead, see
metafacture/metafacture-core#360
fsteeg added a commit that referenced this issue Apr 20, 2021
Causes downstream problems due to jitpack.io requirement

See #360
fsteeg added a commit that referenced this issue Apr 20, 2021
@fsteeg fsteeg mentioned this issue Apr 20, 2021
tibdevelopment pushed a commit to TIBHannover/oersi-etl that referenced this issue Dec 22, 2021
- Set up logging to file for testing (log/oersi-etl.log)
- Downgrade metafacture-core dependencies to 5.3.0-rc2
(which works around logging-related issues with a third-party lib,
see metafacture/metafacture-core#360)
- Update slf4j and log4j dependencies to current releases
@fsteeg
Member Author

fsteeg commented Sep 16, 2022

It's the only version of the OCLC harvester published to Central (see https://mvnrepository.com/artifact/org.dspace/oclc-harvester2).

When revisiting this, I saw that there is now a 1.0.0 release, published on Aug 5, 2022 from https://github.com/DSpace/oclc-harvester2. Yay, thanks @tdonohue! I'll update the dependency and assign @dr0i in the PR.
