Merge pull request #10837 from Recherche-Data-Gouv/harvest_exclude_invalid_tag

Exclude others namespace from harvesting "oai_dc" metadata prefix
ofahimIQSS authored Nov 19, 2024
2 parents bbf29cc + cc2a056 commit 42d00d1
Showing 2 changed files with 14 additions and 2 deletions.
@@ -0,0 +1,3 @@
Some repositories extend the "oai_dc" metadata prefix with their own namespaces. Harvesting datasets from such repositories was not possible, because an XML parsing error was raised.

The PR [#10837](https://github.com/IQSS/dataverse/pull/10837) allows these datasets to be harvested by skipping tags whose namespace is not "dc:" and harvesting only the metadata in the "dc" namespace.
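
The failure described above can be reproduced with a minimal sketch. It assumes a made-up oai_dc record that uses an `ex:` prefix without declaring it, and it reads the record with a loop shaped like the one this commit replaces. The class name, the record, and the `ex` prefix are hypothetical and not taken from Dataverse; the sketch uses whichever StAX implementation `XMLInputFactory.newInstance()` returns.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of Dataverse.
final class UndeclaredPrefixDemo {

    // Made-up record: the "ex" prefix is used but never declared.
    static final String RECORD =
        "<oai_dc:dc xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\""
        + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
        + "<dc:title>Sample dataset</dc:title>"
        + "<ex:note>repository-specific</ex:note>"
        + "<dc:creator>Doe, Jane</dc:creator>"
        + "</oai_dc:dc>";

    // Records the local name of every start element into `seen`.
    // Any XMLStreamException raised by next() propagates to the caller.
    static void readStrict(String xml, List<String> seen) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
        XMLStreamReader xmlr = factory.createXMLStreamReader(new StringReader(xml));
        for (int event = xmlr.next(); event != XMLStreamConstants.END_DOCUMENT; event = xmlr.next()) {
            if (event == XMLStreamConstants.START_ELEMENT) {
                seen.add(xmlr.getLocalName());
            }
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        try {
            readStrict(RECORD, seen);
            System.out.println("parsed without error: " + seen);
        } catch (XMLStreamException ex) {
            // Reached when the parser rejects the undeclared prefix.
            System.out.println("parsing stopped after " + seen + ": " + ex.getMessage());
        }
    }
}
```

With a namespace-aware parser, the exception is raised when the reader reaches `<ex:note>`, so `dc:creator` is never seen and the whole record is lost to the caller.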
@@ -205,8 +205,17 @@ public DatasetDTO processOAIDCxml(String DcXmlToParse) throws XMLStreamException

     private void processXMLElement(XMLStreamReader xmlr, String currentPath, String openingTag, ForeignMetadataFormatMapping foreignFormatMapping, DatasetDTO datasetDTO) throws XMLStreamException {
         logger.fine("entering processXMLElement; ("+currentPath+")");
-
-        for (int event = xmlr.next(); event != XMLStreamConstants.END_DOCUMENT; event = xmlr.next()) {
+
+        while (xmlr.hasNext()) {
+
+            int event;
+            try {
+                event = xmlr.next();
+            } catch (XMLStreamException ex) {
+                logger.warning("Error occurred in the XML parsing : " + ex.getMessage());
+                continue; // Skip Undeclared namespace prefix and Unexpected close tag related to com.ctc.wstx.exc.WstxParsingException
+            }
+
             if (event == XMLStreamConstants.START_ELEMENT) {
                 String currentElement = xmlr.getLocalName();
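
A standalone sketch of the tolerant read loop may help when reasoning about this hunk. It is not the Dataverse implementation: the class, method, and sample record are hypothetical, and it deliberately swaps `continue` for `break`. The hunk's `continue` relies on the parser named in its comment (`com.ctc.wstx`, Woodstox) being able to resume after the error; whether other StAX implementations can resume is an assumption this sketch avoids by keeping whatever was read before the first error. It also filters on the Dublin Core namespace URI, so elements from other declared namespaces are ignored.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Logger;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of Dataverse.
final class TolerantOaiDcDemo {

    private static final Logger logger = Logger.getLogger(TolerantOaiDcDemo.class.getName());

    // Namespace URI of the Dublin Core element set.
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Made-up record: the "ex" prefix is used but never declared.
    static final String RECORD =
        "<oai_dc:dc xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\""
        + " xmlns:dc=\"" + DC_NS + "\">"
        + "<dc:title>Sample dataset</dc:title>"
        + "<ex:note>repository-specific</ex:note>"
        + "<dc:creator>Doe, Jane</dc:creator>"
        + "</oai_dc:dc>";

    // Collects the text of elements in the dc namespace, keyed by local name.
    // A parse error from next() is logged and ends the loop; fields read
    // before the error are kept.
    static Map<String, String> readDcFields(String xml) throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        XMLStreamReader xmlr =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));

        while (xmlr.hasNext()) {
            int event;
            try {
                event = xmlr.next();
            } catch (XMLStreamException ex) {
                logger.warning("Error occurred in the XML parsing : " + ex.getMessage());
                break; // conservative variant: stop here instead of `continue`
            }

            if (event == XMLStreamConstants.START_ELEMENT
                    && DC_NS.equals(xmlr.getNamespaceURI())) {
                // getElementText() consumes the element up to its end tag.
                fields.put(xmlr.getLocalName(), xmlr.getElementText());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(readDcFields(RECORD));
    }
}
```

The trade-off is visible in the sample record: stopping at the first error keeps `dc:title` but never reaches `dc:creator`, whereas the `continue` in the hunk aims to pick up later `dc` elements as well, provided the parser can move past the offending tag.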
