-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crossref as a DOI provider (experimental) #10806
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is close to working but I think right now it mixes the way legacy providers and new multiple pid providers work w.r.t. settings. I've tried to leave comments about how to disentangle that, but am happy to talk through it as well. I also commented on where I think some test changes/additions are needed.
I also noted that we'll need to decide if the updates to DOI metadata coming in other PRs should also apply to CrossRef - if so we probably need a new issue once everything is merged.
|
||
|
||
// PROVIDER CROSSREF | ||
CROSSREF_URL(SCOPE_PID_PROVIDER, "url"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a SCOPE_PID_CROSSREF(SCOPE_PID_PROVIDER, "crossref"),
entry and then all of the rest should use that scope rather than SCOPE_PID_PROVIDER. The effect of that is to let you define jvm options of the form dataverse.pid.<id>.crossref.url
for each of the potentially many crossref providers (rather than having only one global setting for all crossref providers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. I had that but removed it to have 'dataverse.pid.< id >.url' without the .crossref.
I'll make that change
|
||
.. _dataverse.pid.crossref.password: | ||
|
||
dataverse.pid.crossref.password |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you edit this to move it to the multiple provide section, please add a note like the one on DataCite suggesting that more secure options should be used (versus having the password as a plain text jvm-option)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR was made before multiPid was supported/before docs were improved with that, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will modify the doc
@@ -617,6 +617,15 @@ this provider. | |||
- :ref:`dataverse.pid.ezid.username` | |||
- :ref:`dataverse.pid.ezid.password` | |||
|
|||
**JVM Options for CrossRef:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation, and some of the code, is for 'legacy' single pid configuration but I think we could/should only support handling the new multipid way of doing config. Although I think you've added it, I think it's reasonable to not support the legacy style at all for cross ref (since the idea of legacy was to support people who had already deployed with the old single provider jvm options and no one could have used CrossRef before now (except the developer's group).
To change the docs - see the DataCite section in the guide and add the info there (probably right after DataCite/before Handle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will modify the doc
@@ -188,6 +188,9 @@ private void loadProviders() { | |||
legacy = new FakeDOIProvider("legacy", "legacy", authority, shoulder, identifierGenerationStyle, | |||
dataFilePidFormat, "", ""); | |||
break; | |||
case "CrossRef": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should drop this. If you want to support legacy still, this section should be doing like the DataCite legacy section above to read the legacy-style options and feed them into the provider directly (creating a new CrossRefDOIProvider rather than calling a factory method). Again - probably better to drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove
import java.util.logging.Level; | ||
import java.util.logging.Logger; | ||
|
||
public class CrossRefDOIPidProvider extends AbstractDOIProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other DOI providers drop the Pid part of the name (or use DOI instead of Pid since it is the type of Pid they support), so this would be more consistent as CrossRefDOIProvider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renaming
} | ||
} | ||
|
||
class CrossRefMetadataTemplate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of scope perhaps for now - we should coordinate and see if the enhancements in #10615/#10632 should be applied to CrossRef as well (as part of that PR or a new one once they're merged). It's possible that CrossRef doesn't accept all the new metadata that gets added in those, but if it does, I assume we'd want to send it to CrossRef as well. @konradperlowski - any thoughts?
@JvmSetting(key = JvmSettings.PID_PROVIDER_LABEL, value = "CrossRef 1", varArgs = "crossref1") | ||
@JvmSetting(key = JvmSettings.PID_PROVIDER_TYPE, value = CrossRefDOIPidProvider.TYPE, varArgs = "crossref1") | ||
@JvmSetting(key = JvmSettings.PID_PROVIDER_AUTHORITY, value = "10.5072", varArgs = "crossref1") | ||
@JvmSetting(key = JvmSettings.PID_PROVIDER_SHOULDER, value = "FK2/", varArgs = "crossref1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are very close to the values for the provider ez1 above - perhaps changing to 10.5075 and/or to FK5/ to make a few more characters different would make it clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will use sample CrossRef values
@JvmSetting(key = JvmSettings.PID_PROVIDER_TYPE, value = CrossRefDOIPidProvider.TYPE, varArgs = "crossref1") | ||
@JvmSetting(key = JvmSettings.PID_PROVIDER_AUTHORITY, value = "10.5072", varArgs = "crossref1") | ||
@JvmSetting(key = JvmSettings.PID_PROVIDER_SHOULDER, value = "FK2/", varArgs = "crossref1") | ||
@JvmSetting(key = JvmSettings.PID_PROVIDER_MANAGED_LIST, value = "cr:20.20.20/FK2ABCDEF", varArgs ="crossref1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The values in the managed list for this provider need to be DOIs - of the form doi:10.5076/FK7ABCDEF, etc. (The hdl example above uses the standard Handle form.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making this an empty list
@@ -133,6 +145,7 @@ public static void setUpClass() throws Exception { | |||
pidProviderFactoryMap.put(HandlePidProvider.TYPE, new HandleProviderFactory()); | |||
pidProviderFactoryMap.put(FakeDOIProvider.TYPE, new FakeProviderFactory()); | |||
pidProviderFactoryMap.put(EZIdDOIProvider.TYPE, new EZIdProviderFactory()); | |||
pidProviderFactoryMap.put(CrossRefDOIPidProvider.TYPE, new CrossRefDOIProviderFactory()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that the test changes don't yet do anything because the code above doesn't look for jvm options with crossref1 in them. Once those changes are made, I think adding to the tests here to assure that a crossref doi - one with authority 10.5072 and shoulder FK2/ as currently defined, is recognized by a provider with the id crossref1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add that to JvmSettingsTest.java
This comment has been minimized.
This comment has been minimized.
CrossRef has two APIs that are used in Dataverse: | ||
|
||
The base URL of the `CrossRef <https://api.crossref.org>`_, | ||
used to mint and manage DOIs. Current valid values for ``dataverse.pid.*.datacite.mds-api-url`` are "https://mds.datacite.org" (production) and "https://mds.test.datacite.org" (testing, the default). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is about DataCite values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified the doc
@@ -2841,6 +2871,50 @@ should delete the old JVM option and the wrapped password alias, then recreate | |||
as shown for :ref:`dataverse.pid.datacite.password` but with the EZID alias | |||
name. | |||
|
|||
.. _dataverse.pid.crossref.url: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are about legacy support which is no longer implemented - they can all be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
CrossRefRESTfullClient client = getClient(); | ||
doiExists = client.testDOIExists(identifier.substring(identifier.indexOf(":") + 1)); | ||
} catch (Exception e) { | ||
logger.log(Level.INFO, identifier, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's called a lot just to see if the DOI exists. It's just a check before it creates one.
logger.log(Level.FINE, jsonMetadata); | ||
metadata.put("_status", mappedJson.get("status").toString()); | ||
} catch (RuntimeException e) { | ||
logger.log(Level.INFO, identifier, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same - a warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied the code from DOIDataCiteRegisterService which logs info level
} | ||
|
||
public String reserveIdentifier(String identifier, DvObject dvObject) throws IOException { | ||
logger.info("Crossref reserveIdentifier"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and probably fine for all of these that just tell you that you used the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
|
||
class CrossRefFileUtil { | ||
|
||
public static void close(InputStream in) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is just used with readAndClose where the stream comin in is in a try-with-resources statement. Do we need any of this (the finally/close parts)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed try catch finally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So close() is no longer used anywhere?
String data = EntityUtils.toString(response.getEntity(), encoding); | ||
if (response.getStatusLine().getStatusCode() != 200) { | ||
String errMsg = "Response from getMetadata: " + response.getStatusLine().getStatusCode() + ", " + data; | ||
logger.info(errMsg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more that should be .warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
@@ -133,6 +145,7 @@ public static void setUpClass() throws Exception { | |||
pidProviderFactoryMap.put(HandlePidProvider.TYPE, new HandleProviderFactory()); | |||
pidProviderFactoryMap.put(FakeDOIProvider.TYPE, new FakeProviderFactory()); | |||
pidProviderFactoryMap.put(EZIdDOIProvider.TYPE, new EZIdProviderFactory()); | |||
pidProviderFactoryMap.put(CrossRefDOIProvider.TYPE, new CrossRefDOIProviderFactory()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add to the end of the testDOIParsing() test:
String pid7String = "doi:10.11111/DVN/ABCDEF";
GlobalId pid7 = PidUtil.parseAsGlobalID(pid7String);
assertEquals(pid7String, pid7.asString());
assertEquals("crossref1", pid7.getProviderId());
That at least tests that the a CrossRef provider gets created by the factory and the authority/shoulder params are being set correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
@@ -21,11 +21,17 @@ void lookupSetting() { | |||
void lookupPidProviderSetting() { | |||
assertEquals("test", JvmSettings.DATACITE_USERNAME.lookup("datacite")); | |||
} | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this adds anything/suggest removing. The DataCite example is nominally just a test of the lookup(param) call and this just repeats that with another setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I suggested a little cleanup and a new test. I did look through the 'original' code that interacts with the CrossRef service and creates the required metadata, but did not dig into details of whether the API calls work/etc. (I assume it was working code when it came to us.)
Beyond that, I think it looks OK to move forward. It appears that we are not going to be able to acquire CrossRef credentials to test the code in the short term, which is basically like the EzID provider where we don't have creds to test (afaik), so we'll have to rely on users to report bugs.
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
Merged. As discussed at standup (and mentioned above) this support for Crossref as a DOI provider is experimental, especially because we were unable to obtain and credentials to test with. Community, please report bugs and create pull requests to fix them as our hands somewhat tied. I did run the code locally and was able to publish a dataset using the FAKE provider. I also confirmed that API tests are passing. |
What this PR does / why we need it:
This pr adds new doi provider - CrossRef. Implemented logic was mostly inspired by other providers such as DataCite and PermaLink.
Which issue(s) this PR closes: 8581
Special notes for your reviewer:
Suggestions on how to test this:
In order to test this you need to have test crossref account and set all required properties, described in doc/sphinx-guides/source/installation/config.rst file
Update: CrossRef does not create test accounts. This may need to be tested by @JacekChudzik
Does this PR introduce a user interface change? If mockups are available, please link/include them here: None
Is there a release notes update needed for this change?: included
Additional documentation: Installation configuration
https://dataverse-guide--10806.org.readthedocs.build/en/10806/installation/config.html#crossref-specific-settings