Exclude others namespace from harvesting "oai_dc" metadata prefix #10837
… tags are ignored (cherry picked from commit 8514c7f)
FYI: You might want to look at/review #10836 which I think is doing something similar but more extensive.
@qqmyers I'm not sure there is a link. #10837 comes before in the If I missed something, could you shed some light on it for me?
Sorry - I agree it's not related. I just saw the note about skipping entries that would fail and wanted to make sure you saw the other PR, but looking at your code I see you're addressing problems in even reading the XML input.
Hi @pdurbin! Is there a chance this small PR could be included in version 6.5? 🙏
@jeromeroucou we moved it to "ready for review". Thanks for the PR!
@jeromeroucou Thank you for the PR! I am moving it into "ready for QA".
When you have a chance, please sync the branch with develop.
Our plan is to include this change in the 6.5 release next month.
testing passed, merging PR
What this PR does / why we need it:
This PR allows harvesting from repositories that expose metadata in additional namespaces.
Some repositories extend "oai_dc" with their own namespaces. For example, SEANOE exposes some of its metadata in the dct namespace. Below is the result of https://www.seanoe.org/oai/OAIHandler?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:seanoe.org:41307
Currently, this record can't be harvested because the following exception occurs:
We propose to ignore everything that is not in the dc namespace, which avoids the WstxParsingException.
Which issue(s) this PR closes:
No related issue found
Special notes for your reviewer:
Not really, but I have a suggestion for extending the scope of this pull request in a follow-up pull request (or issue): the ForeignMetadataFormatMapping could be made more flexible and used for more namespaces than dcterms. With this, we could add a mapping for the dct namespace.
Suggestions on how to test this:
Add a new harvesting client with the server https://www.seanoe.org/oai/OAIHandler and the set GROUP:EMSO.
Before this PR, all datasets end in error; with this PR, all datasets are imported.
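To try the idea of ignoring non-dc elements without running a full harvest, the following is a minimal, standalone sketch using the JDK's StAX API. It is not the code changed in this PR: the class DcNamespaceFilter and its method extractDcFields are hypothetical names, the input is assumed to be a bare oai_dc:dc fragment rather than a full OAI-PMH response, and the dct namespace URI used in any sample input is an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not a class from this PR.
final class DcNamespaceFilter {
    // Namespace of the Dublin Core element set carried by oai_dc.
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    // Namespace of the oai_dc:dc container element.
    static final String OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/";

    // Collects the text of dc elements; elements in any other namespace
    // are skipped together with their whole subtree.
    static Map<String, List<String>> extractDcFields(String xml) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String ns = reader.getNamespaceURI();
                if (OAI_DC_NS.equals(ns)) {
                    continue; // container element: descend into its children
                }
                if (DC_NS.equals(ns)) {
                    String name = reader.getLocalName();
                    String text = reader.getElementText().trim();
                    fields.computeIfAbsent(name, k -> new ArrayList<>()).add(text);
                } else {
                    skipElement(reader); // foreign namespace, e.g. dct
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse oai_dc record", e);
        }
        return fields;
    }

    // Advances the reader past the end tag matching the current start tag.
    private static void skipElement(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0 && reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }
}
```

Calling DcNamespaceFilter.extractDcFields on a fragment that mixes dc and dct elements should return only the dc fields, keyed by local name. Skipping the whole subtree, rather than only the foreign start tag, keeps nested content of an ignored element from being read by accident.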
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
No
Is there a release notes update needed for this change?:
A release note snippet has been added