Harvesting geonode (pycsw) by OAI-PMH does not work #6242
Comments
Hi @kamil386, I'm not sure this is a Dataverse issue. When I click "Metadata search via OAI-PMH:" on the right side of the page here: http://master.demo.geonode.org/developer/ it takes me to http://master.demo.geonode.org/catalogue/csw?mode=oaipmh&verb=Identify which shows what appears to be an error. Can you reach out to the Geonode team to make sure this is working as expected before we start troubleshooting here? Thanks!
@djbrooke The Geonode team has just fixed that bug, but the same error still appears. Maybe the problem is that Dataverse misinterprets the oai: prefix in the tags, or that the additional required parameter "mode=oaipmh" in geonode/pycsw is not included in the Dataverse queries?
@kamil386 is right, there are no more errors at http://master.demo.geonode.org/catalogue/csw?mode=oaipmh&verb=Identify (weird Django errors last week) so I'm re-opening this issue.
Error Fix
@JingMa87 thanks for all the investigation. Do you think the bug is on the Dataverse side or the Geonode side (or both)?
@pdurbin I did some more investigation and updated my last comment. The issue is definitely not with Dataverse.
@JingMa87 thanks, from looking at our pom.xml we seem to be running a fork:
It looks like the code moved to https://github.com/DSpace/xoai and there's a version called 4.2.0.
@pdurbin I can test the newest version of the library with Dataverse, but I'm wondering what changes L.A. patched in. It might be that the newest version of XOAI doesn't have the features that L.A. patched in, so I would have to test for that.
@JingMa87 I mean, you certainly could but is this issue a high priority for you? If so, I can ask L.A. about that patch. If not, are there other issues we could re-direct your energy into? We really appreciate all the pull requests!
@pdurbin If any issues have more priority just let me know so I can discuss it with my coordinator. Do note that I'm a recently hired engineer for Data Archiving and Networked Services (DANS) in the Netherlands so I'm quite new to Dataverse. My current goal is to get to know the app more and in particular the harvesting client feature.
@JingMa87 welcome to the Dataverse community! Here are a few harvesting-related issues you might want to read through:
@jggautier thinks a lot about harvesting and might have some other issues in mind. I'd also like to bring it to @landreev's attention that we may have a future harvesting hacker in our midst. 😄 Thanks!
Thanks @pdurbin and hi @JingMa87. The issue IQSS/dataverse.harvard.edu#72 - about special characters in dataset metadata preventing other repositories from harvesting from Harvard Dataverse - is the harvesting-related issue most pressing to me right now. It's in Harvard Dataverse's GitHub repo, but as far as I know it's possible that other repositories would be or are being affected by it. That is, other repositories have characters in their metadata exports that are preventing others from harvesting from them. I did as much digging as I could, but wouldn't know how to proceed.
@jggautier Sounds like something I can look into! Also do you or @pdurbin think I can get permissions to add issues and PRs to a project? Otherwise I would have to ask a colleague to do it every time I do something.
@JingMa87 I just invited you to join https://github.com/orgs/IQSS/teams/dataverse-readonly . I hope that helps.
@pdurbin I validated GeoNode's repo URL on https://www.openarchives.org/Register/ValidateSite and they're giving me the same error caused by double question marks. I asked our functional manager to report this issue and a fix to GeoNode. Since the original poster of this issue is not commenting, I'd like to close it. Agreed?
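For readers unfamiliar with the "double question marks" symptom, here is a minimal Python sketch of one way it can arise. It assumes a hypothetical client that always appends `?verb=...` to whatever base URL it is given; the thread does not show where the duplication actually happens, and the function names below are illustrative, not part of Dataverse, XOAI, or pycsw.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Base URL from this issue; it already carries a query string.
BASE_URL = "http://master.demo.geonode.org/catalogue/csw?mode=oaipmh"


def naive_request_url(base_url, verb):
    """Hypothetical naive client: blindly appends '?verb=...'."""
    return f"{base_url}?verb={verb}"


def merged_request_url(base_url, **params):
    """Merge extra parameters into the query string the base URL already has."""
    parts = urlsplit(base_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


naive = naive_request_url(BASE_URL, "Identify")
merged = merged_request_url(BASE_URL, verb="Identify")

# With the naive URL, everything after the first '?' is one query string,
# so 'verb' is swallowed into the value of 'mode'.
naive_params = dict(parse_qsl(urlsplit(naive).query))
merged_params = dict(parse_qsl(urlsplit(merged).query))
```

In the naive case a server parsing the query string would see no `verb` parameter at all, which would be consistent with the "Missing 'verb' parameter" response quoted in the issue description, though that link is an inference rather than something confirmed in the thread.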
Harvesting geonode (pycsw) via OAI-PMH does not work. Details:
XML response from geonode (pycsw):
http://master.demo.geonode.org/catalogue/csw?mode=oaipmh
<!-- pycsw 2.4.0 --><oai:OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><oai:responseDate>2019-10-01T18:51:35Z</oai:responseDate><oai:request>http://master.demo.geonode.org/catalogue/csw?mode=oaipmh</oai:request><oai:error code="badArgument">Missing 'verb' parameter</oai:error></oai:OAI-PMH>
On the other hand, here is a valid XML response from Dataverse, which can be harvested without any problem:
https://demo.dataverse.org/oai
<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2019-10-01T23:52:01Z</responseDate><request>https://demo.dataverse.org/oai</request><error code="badVerb">Illegal verb</error></OAI-PMH>
Logs:
[2019-10-02T01:58:18.164+0200] [glassfish 4.1] [INFO] [] [edu.harvard.iq.dataverse.HarvestingClientsPage] [tid: _ThreadID=51 _ThreadName=jk-connector(4)] [timeMillis: 1569974298164] [levelValue: 800] [[ metadataformats: failed;received empty list from ListMetadataFormats]]
Dataverse version: 4.16
Screenshot of the dashboard's create harvesting client window:
Maybe it is related to the oai: prefix in the tags (it is the only difference between the two XML responses) and Dataverse can't process that XML.
Providing the full OAI URL of geonode also does not work:
http://master.demo.geonode.org/catalogue/csw?mode=oaipmh&verb=ListRecords&set=citable&metadataPrefix=oai_dc
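On the `oai:` prefix hypothesis above: to a namespace-aware XML parser, a prefixed element and an unprefixed element in the same default namespace are the same element. The Python sketch below illustrates this with the standard library. The `xmlns` declarations are an assumption added here, because the responses as pasted above omit them, and the snippets are trimmed to the root and error elements. This says nothing about how the XOAI library used by Dataverse parses responses.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Prefixed form, as in the pycsw response (xmlns:oai declaration assumed).
PREFIXED = (
    '<oai:OAI-PMH xmlns:oai="http://www.openarchives.org/OAI/2.0/">'
    "<oai:error code=\"badArgument\">Missing 'verb' parameter</oai:error>"
    "</oai:OAI-PMH>"
)

# Unprefixed form, as in the Dataverse response (default xmlns assumed).
UNPREFIXED = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    '<error code="badVerb">Illegal verb</error>'
    "</OAI-PMH>"
)


def error_codes(xml_text):
    """Return the 'code' attribute of every OAI-PMH error element."""
    root = ET.fromstring(xml_text)
    # ElementTree expands prefixes to '{namespace-uri}localname'.
    return [el.get("code") for el in root.iter(f"{{{OAI_NS}}}error")]


prefixed_root = ET.fromstring(PREFIXED).tag
unprefixed_root = ET.fromstring(UNPREFIXED).tag
```

Both roots resolve to the same qualified name, so the prefix alone should only matter to a client that matches tags by raw string rather than by namespace.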