JabRef produces fatal exceptions for files containing non-DOI-related strings such as 10:51 (a timestamp) and 10/B(C)/15 (an arbitrary designation/ID).
The ShortDOI parsing subsystem was improved in #6920 to fix failure cases, but it appears there are additional cases and strings (probably an arbitrarily high number) that produce fatal exceptions.
Given the difficulty of handling arbitrary strings in arbitrary documents, I suspect fixed pattern matching is unlikely to be sustainable in the long term, particularly if the behaviour for failing cases remains a fatal exception from which the user must manually recover (i.e., identify the document containing the string and exclude it from import).
I propose the behaviour be changed to fall through (not fail). If it is desirable to preserve the failure semantics, files/entries could be marked with a note or status indicating that the parsing produced a null result, though I'm not sure that is particularly valuable.
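To illustrate what I mean by fall-through, here is a minimal sketch. It is not JabRef's actual code: `LenientDoiFinder`, its `findInText` method, and the simplified DOI pattern are all hypothetical and written only for this example. The idea is that text with no plausible DOI yields an empty result instead of an exception, so the import can carry on.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not JabRef's DOI class.
final class LenientDoiFinder {

    // Simplified pattern (an assumption, not the full DOI syntax):
    // "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
    private static final Pattern DOI_PATTERN =
            Pattern.compile("\\b10\\.\\d{4,9}/[^\\s\"<>]+");

    private LenientDoiFinder() {
    }

    // Returns the first DOI-like substring, or Optional.empty() if none is found.
    // Never throws for arbitrary input text.
    static Optional<String> findInText(String text) {
        if (text == null) {
            return Optional.empty();
        }
        Matcher matcher = DOI_PATTERN.matcher(text);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }
}
```

With something along these lines, the importer would treat an empty result as "no DOI in this file" and continue with the remaining files, rather than aborting the whole import.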
Steps to reproduce the behavior
Prepare local PDF files with contents that contain strings that produce exceptions (see below)
Create New library
Run Tools -> Search for unlinked local files
Browse to folder containing local files -> Scan -> Import
Log Files
Log File
java.lang.IllegalArgumentException: 10/B(C)/15 is not a valid DOI/Short DOI.
at org.jabref@5.2.298/org.jabref.model.entry.identifier.DOI.<init>(Unknown Source)
at org.jabref@5.2.298/org.jabref.model.entry.identifier.DOI.findInText(Unknown Source)
at org.jabref@5.2.298/org.jabref.logic.importer.fileformat.PdfContentImporter.importDatabase(Unknown Source)
Log File
java.lang.IllegalArgumentException: 10:51 is not a valid DOI/Short DOI.
at org.jabref@5.2.298/org.jabref.model.entry.identifier.DOI.<init>(Unknown Source)
at org.jabref@5.2.298/org.jabref.model.entry.identifier.DOI.findInText(Unknown Source)
at org.jabref@5.2.298/org.jabref.logic.importer.fileformat.PdfContentImporter.importDatabase(Unknown Source)
Hi,
I wrote the short-doi improvement you mentioned.
The strings 10/B(C)/15 and 10:51 by themselves should no longer be interpreted as short DOIs. I will look into it and try to find out what's wrong.