4499 - dataset version pids #9462
base: develop
This commit adds a new scope and setting to JvmSettings that enable configuring different modes for Dataset Version PIDs. These modes are defined in VersionPidMode. A test ensures they can be parsed. In addition, VersionPidMode also contains a fine-grained option to change the conduct of Dataverse collections and their datasets for these PIDs.
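As a rough illustration of what "parsable modes" could look like, here is a minimal sketch. The enum name `VersionPidModeSketch`, the setting values `off`, `major` and `minor`, and the fallback to `OFF` for an unset value are all assumptions made for this example and are not taken from the code in this PR.

```java
/** Hypothetical sketch of a global mode for dataset version PIDs; not the enum from this PR. */
enum VersionPidModeSketch {
    OFF("off"),
    MAJOR("major"),
    MINOR("minor");

    private final String key;

    VersionPidModeSketch(String key) {
        this.key = key;
    }

    /**
     * Parses a configured value case-insensitively.
     * Assumption: an unset or blank value means the feature is off.
     */
    static VersionPidModeSketch parse(String raw) {
        if (raw == null || raw.isBlank()) {
            return OFF;
        }
        String normalized = raw.trim().toLowerCase(java.util.Locale.ROOT);
        for (VersionPidModeSketch mode : values()) {
            if (mode.key.equals(normalized)) {
                return mode;
            }
        }
        // Failing loudly on typos keeps a misconfigured instance from silently running with "off".
        throw new IllegalArgumentException("Unknown version PID mode: " + raw);
    }
}
```

Rejecting unknown values instead of defaulting them is a design choice of this sketch; the actual setting may behave differently.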
…R official Also removing unused import of @transient
…JPA model Enable the database model to carry the configured conduct of a collection. Also enable JSON parser and printer to marshal the setting.
…d conduct IQSS#4499 This commit adds a public method DataverseServiceBean.wantsDatasetVersionPids() that determines how to deal with a dataset version (which belongs to a dataset that lives within a collection) in terms of "should a PID be registered/updated?". The background: when a dataset is published, this happens in the context of its owning Dataverse collection. It is important to take the conduct configured for that collection into account when deciding how to proceed with a version's PID.
These are placeholders for now, to be filled with actual code.
These are placeholders for now, to be extended with real code.
Hi @mreekie, this is not yet ready to be moved into any column other than my own. I did add a size, but obviously this is not the size of the estimated work for me, but my expectation of time spent on review, testing and Q&A at IQSS. Hope that's alright!
Making it NON NULL in database and Bean Validation.
…reusable Removing the unnecessary transformation via the ValidationError class and at the same time making it available to use from API endpoints to create nicely formatted JSON (error) responses.
For a first set of attributes (name, alias, description and, most important for the PR about IQSS#4499, the dataset version PID conduct), make an endpoint available that allows changes via simple PUT HTTP commands.
By adding a NotImplementedException, a subclass of UnsupportedOperationException, we can describe methods not yet implemented. This status might change in the future (or not). The important part about this introduction: this runtime exception is flagged as an "application exception". Without that flag, EJB exception inspection would treat it as a system exception (the default for runtime exceptions) and roll back the transaction. The idea is to allow the command engine to handle perfectly fine situations where some other component of Dataverse - e.g. the EJB of a PID provider - throws this exception. This way we can catch the exception and deal e.g. with dataset unlocks.
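The catch-and-unlock idea can be sketched in plain Java as follows. All names here (`NotImplementedSketchException`, `PidProviderSketch`, `PublishStepSketch`) are hypothetical stand-ins, the lock is modelled as a simple flag, and the application-exception marking is left out so the example runs without an EJB container.

```java
/**
 * Hypothetical stand-in for the exception described above. In a container it would
 * additionally be marked as an application exception; that is omitted here.
 */
class NotImplementedSketchException extends UnsupportedOperationException {
    NotImplementedSketchException(String message) {
        super(message);
    }
}

/** Hypothetical provider interface; not the real GlobalIdServiceBean. */
interface PidProviderSketch {
    void publicizeVersionIdentifier(String versionIdentifier);
}

final class PublishStepSketch {
    private PublishStepSketch() { }

    /**
     * Returns true if the provider made the identifier findable, false if the provider
     * does not implement version PIDs. The (simulated) dataset lock is released either way.
     */
    static boolean publicizeAndUnlock(PidProviderSketch provider,
                                      String versionIdentifier,
                                      java.util.concurrent.atomic.AtomicBoolean datasetLocked) {
        datasetLocked.set(true);
        try {
            provider.publicizeVersionIdentifier(versionIdentifier);
            return true;
        } catch (NotImplementedSketchException e) {
            // The provider does not support version PIDs: a normal situation, not a failure.
            return false;
        } finally {
            // Released on every path, including unexpected exceptions.
            datasetLocked.set(false);
        }
    }
}
```

For example, `PublishStepSketch.publicizeAndUnlock(id -> { throw new NotImplementedSketchException("no version PIDs"); }, "some-id", lock)` returns `false` and leaves `lock` cleared.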
…ServiceBean With the addition of NotImplementedException we can better express what's going on without rollback exceptions in EJB. The methods that will reach out to some provider now throw checked IOExceptions, as the communication might go sideways and the command engine needs to act accordingly. (This is the same behaviour now as for the normal object methods) Also extending JavaDoc descriptions and expectations a bit.
Now returning the same identifier for a minor version that has been assigned to the adjacent major version. As before, this can be changed by a provider. The tests have been changed accordingly.
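A small sketch of that default behaviour, assuming "adjacent major version" means the major version a minor version was derived from. The class name and the `.v<major>` suffix are invented for illustration, as is the example DOI; the identifier format used by the real providers is not shown in this PR text.

```java
/** Hypothetical helper; the identifier format below is made up for this sketch. */
final class VersionIdentifierSketch {
    private VersionIdentifierSketch() { }

    /**
     * Derives a version identifier from the dataset identifier and the version number.
     */
    static String identifierFor(String datasetIdentifier, long majorNumber, long minorNumber) {
        // minorNumber is deliberately ignored: 2.0, 2.1 and 2.2 all share the identifier of 2.0.
        return datasetIdentifier + ".v" + majorNumber;
    }
}
```

With this default, versions 2.0 and 2.3 of the same dataset map to one identifier, while 3.0 gets a new one. A provider that wants distinct identifiers for minor versions would replace this logic.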
… versions Identifiers for versions are done in two steps: 1. create it (if not existing) and 2. make it findable. The second step is done by GlobalIdServiceBean.publicizeIdentifier, and this commit now adds the interface method for the first step.
…ication With this commit, a first draft to create and release a version identifier from the command engine is added. FinalizeDatasetPublicationCommand is used after someone hits publish in the UI and currently takes care of the creation and publishing for dataset identifiers and files, now extended to look after version PIDs as well.
…base As minor versions might carry the identifier of their major version, we cannot set a unique constraint on the column. Removing this from the model as well as the Flyway migration.
Key takeaways from first feedback during tech hour:
Thanks @qqmyers @pdurbin and everyone else for commenting and sharing your thoughts!
After an elaborate chat with @landreev yesterday (thanks again!), let me write down the key insights we got out of it:
A couple minor points:
Thanks for continuing the feedback @qqmyers!
Ad 1) Yes, on purpose. The selection in the collection will be disabled if the feature is turned off globally (by an admin). No one can turn it on by accident. "Skip" on the other hand is an opt-out. If your parent collection has it on, you might want to turn it off for a certain subcollection. The decision on the collection is made by curators! This is a difference to selecting storage per collection.
Ad 3.3) Yes, one could try to make that happen in rollback. Let me add a note here that we don't do this for dataset PIDs or file PIDs, so maybe it's fine not to do it for versions as well. Also, in case things go sideways because the provider is returning 500s, it might be hard to ask for a deletion. Of course, we could add a service that retries deleting failed registrations. Might be neat for dataset PIDs and file PIDs as well. But that sounds like it is beyond the scope of this PR to me.
re: 1) I'm still a bit confused - I understand off/no version PIDs allowed as a setting (though just not having the setting defined could be 'off'), but does the setting then enforce whether major/minor PIDs are allowed at all? If not, the setting can just be binary. If it does, then the curator option shouldn't allow minor if minor isn't turned on, etc.
…being included in next release
Instead of providing an initialized value and always saving a value, make the model itself return "inherit" as default when the DB value is null. This is better for backward compat and searches.
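A minimal sketch of the getter-side default described in this commit message. The names `StoredConductSketch` and `CollectionRowSketch` are hypothetical, and the plain field stands in for a nullable database column; the real entity and its mapping are not reproduced here.

```java
/** Hypothetical conduct values; the real option names may differ. */
enum StoredConductSketch { INHERIT, SKIP, MAJOR, MINOR }

/** Hypothetical stand-in for the collection entity; the field mirrors a nullable column. */
class CollectionRowSketch {
    // null means "nothing was ever saved", which is what rows created before this feature contain.
    private StoredConductSketch conduct;

    /** Never returns null: a missing database value is read as INHERIT. */
    StoredConductSketch getConduct() {
        return conduct == null ? StoredConductSketch.INHERIT : conduct;
    }

    void setConduct(StoredConductSketch conduct) {
        this.conduct = conduct;
    }
}
```

Because the default lives in the getter, existing rows need no backfill, and a query for collections with an explicit choice can simply filter on non-null values.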
…s with info about minor or major version
…tVersionPids
- Incorporate feedback from first review
- Switch to a model where the admin sets limits and collections can override the setting but not exceed the limit
- Use admin settings as defaults if the collection chain does not provide a choice
- Also add unit testing for business logic
See also IQSS#9462 (comment)
See also IQSS#9462 (comment)
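The decision rule in this commit message could be sketched roughly like this. Everything below is an assumption for illustration: the enum and method names are hypothetical, the collection chain is passed as a plain list of choices from the owning collection up to the root, and the first non-inherit choice found is treated as the effective one.

```java
/** Hypothetical admin-level limit, ordered from least to most permissive. */
enum AdminLimitSketch { OFF, MAJOR, MINOR }

/** Hypothetical per-collection choice. */
enum CollectionChoiceSketch { INHERIT, SKIP, MAJOR, MINOR }

final class VersionPidDecisionSketch {
    private VersionPidDecisionSketch() { }

    /**
     * Decides whether a version should get a PID.
     *
     * @param limit                what the admin allows at most
     * @param minorVersion         true if the version being published is a minor version
     * @param chainFromOwnerToRoot choices of the owning collection and its ancestors, nearest first
     */
    static boolean wantsVersionPid(AdminLimitSketch limit,
                                   boolean minorVersion,
                                   CollectionChoiceSketch... chainFromOwnerToRoot) {
        if (limit == AdminLimitSketch.OFF) {
            return false; // globally disabled, collections cannot opt in
        }
        // The nearest explicit choice wins; INHERIT defers to the parent.
        CollectionChoiceSketch effective = CollectionChoiceSketch.INHERIT;
        for (CollectionChoiceSketch choice : chainFromOwnerToRoot) {
            if (choice != CollectionChoiceSketch.INHERIT) {
                effective = choice;
                break;
            }
        }
        if (effective == CollectionChoiceSketch.SKIP) {
            return false; // opt-out by a curator
        }
        if (!minorVersion) {
            return true; // major versions are covered by both MAJOR and MINOR
        }
        // Minor versions need the admin limit to allow them; a collection cannot exceed it.
        boolean adminAllowsMinor = limit == AdminLimitSketch.MINOR;
        // No explicit choice in the chain means the admin setting acts as the default.
        boolean collectionWantsMinor = effective == CollectionChoiceSketch.MINOR
                || effective == CollectionChoiceSketch.INHERIT;
        return adminAllowsMinor && collectionWantsMinor;
    }
}
```

For instance, with an admin limit of `MAJOR` a collection set to `MINOR` still gets no PID for a minor version, while a chain consisting only of `INHERIT` falls back to whatever the admin limit permits.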
@poikilotherm this PR is on your wishlist for 6.4 but there are a lot of merge conflicts. Also, do you still feel that a size of 30 is right? Thanks.
We are also interested in this feature. I saw above this feature was (potentially) planned for 6.4. Is it now planned for a later version?
What this PR does / why we need it:
This pull request adds the long-awaited option to generate PIDs for dataset versions.
TODOs:
DataverseServiceBean.wantsDatasetVersionPids()
GlobalIdServiceBean.publicizeIdentifier(DatasetVersion datasetVersion) and implementations
GlobalIdServiceBean and impl. classes)
Think about pre-registering the version PID like we do for datasets/files. No - this doesn't make sense, as we never know a priori if a version will be minor or major, and only major versions shall have a PID.
UpdateDvObjectPIDMetadataCommand
RegisterDvObjectCommand
FinalizeDatasetPublicationCommand
UpdateDatasetTargetURLCommand)
DatasetVersion.getJsonLd() to indicate this is a version, use version identifier and add relation to dataset via concept PID
Which issue(s) this PR closes:
Closes #4499
Special notes for your reviewer:
This is WIP.
Suggestions on how to test this:
This is WIP.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Some minor additions. Will be included once avail.
Is there a release notes update needed for this change?:
Yes, will be included.
Additional documentation:
None yet.