-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iqss/7140 google cloud archiver #7292
Iqss/7140 google cloud archiver #7292
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good but I left a couple comments.
``curl http://localhost:8080/api/admin/settings/:ArchiverClassName -X PUT -d "edu.harvard.iq.dataverse.engine.command.impl.GoogleCloudSubmitToArchiveCommand"`` | ||
|
||
``curl http://localhost:8080/api/admin/settings/:ArchiverSettings -X PUT -d ":":GoogleCloudBucket, :GoogleCloudProject"`` | ||
|
||
The Google Cloud Archiver defines two custom settings, both are required: | ||
|
||
\:GoogleCloudBucket - the name of the bucket to use. For example: | ||
|
||
``curl http://localhost:8080/api/admin/settings/:GoogleCloudBucket -X PUT -d "qdr-archive"`` | ||
|
||
\:GoogleCloudProject - the name of the project managing the bucket. For example: | ||
|
||
``curl http://localhost:8080/api/admin/settings/:GoogleCloudProject -X PUT -d "qdr-project"`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When all the BagIt export stuff was initially merged, it didn't dawn on me that we lost our comprehensive list of settings in one place.
I think it would be nice to document :ArchiverSettings in the big list and probably all the various sub-settings like :GoogleCloudBucket, :GoogleCloudProject, and the older ones (:BagItLocalPath, etc.).
From a code perspective, I think I'd also like to see strings like :GoogleCloudBucket, :GoogleCloudProject, :BagItLocalPath, etc. added to the "Key" enum in SettingsServiceBean.java. That way, developers can also have a comprehensive list of all settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add them to the guides list. The :ArchiverSettings is a Key already. For the others, I think earlier discussions questioned whether the archive-specific classes would end up being deployed separately, in which case having their settings in the core would be odd. Since that's not on the roadmap at this point, I could move them temporarily if you think its worth the effort.
return new Failure("Error in transferring DataCite.xml file to GoogleCloud", | ||
"GoogleCloud Submission Failure: incomplete metadata transfer"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do failures like this show up in the UI? If so, do they need to be translatable into languages other than English?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They don't appear anywhere except the log AFAIK, even if you run them as part of a post-publish workflow instead of the API. (Since they run asynchronously, the API just returns an OK, it's launched response.
The /api/datasets/{id}/actions/:publish also runs archiving, synchronously, if configured, and it does have internationalizable success/failure messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple typos
Co-authored-by: Philip Durbin <philip_durbin@harvard.edu>
Co-authored-by: Philip Durbin <philip_durbin@harvard.edu>
@pdurbin got 'em, thanks |
…alitativeDataRepository/dataverse.git into IQSS/7140-GoogleCloudArchiver
Smoke tested, tested local archive, and with Jim's help set up google cloud archive. Waiting some doc updates before merging. |
What this PR does / why we need it: This PR includes an archiver class to send Bags with datacite.xml and OAI_ORE metadata files to Google cloud storage that is analogous to the existing DuraCloud and Local File archivers. It also includes an update to the DuraCloud archiver to add some timing checks developed when testing the Google version that should improve robustness. There is also a one character fix in the documentation of the Local File archiver to replace a non-functional smart quote in a curl command needed to configure it (the smart quote breaks cut/paste from the doc).
The Google Cloud Archiver was developed by/for QDR but should be useful for other community members.
Which issue(s) this PR closes:
Closes #7140
Special notes for your reviewer:
Suggestions on how to test this: Testing requires a Google Cloud account - we can probably coordinate to add the config/key for QDR's dev bucket to an ansible-launched test machine and screen share to verify the Bag arrives at Google.
Does this PR introduce a user interface change? If mockups are available, please link/include them here: no
Is there a release notes update needed for this change?: could note that archiving to Google is possible
Additional documentation: