tweaks for HDC release notes IQSS#8611
pdurbin committed Sep 19, 2022
1 parent a6875f3 commit cd6edb4
Showing 3 changed files with 17 additions and 18 deletions.
doc/release-notes/8611-DataCommons-related-notes.md (29 changes: 12 additions & 17 deletions)

@@ -6,41 +6,38 @@ This release brings new features, enhancements, and bug fixes to the Dataverse S

### Harvard Data Commons Additions

-As reported at the 2022 Dataverse Community Meeting, the Harvard Data Commons project has supported a wide range of additions to the Dataverse software that improve support for Big Data, Workflows, Archiving, and Interaction with other repositories. In many cases, these additions build upon features developed within the Dataverse community by Borealis, DANS, QDR, and TDL and others. Highlights from this work include:
+As reported at the 2022 Dataverse Community Meeting, the [Harvard Data Commons](https://sites.harvard.edu/harvard-data-commons/) project has supported a wide range of additions to the Dataverse software that improve support for Big Data, Workflows, Archiving, and interaction with other repositories. In many cases, these additions build upon features developed within the Dataverse community by Borealis, DANS, QDR, TDL, and others. Highlights from this work include:

- Initial support for Globus file transfer to upload to and download from a Dataverse managed S3 store. The current implementation disables file restriction and embargo on Globus-enabled stores.
- Initial support for Remote File Storage. This capability, enabled via a new RemoteOverlay store type, allows a file stored in a remote system to be added to a dataset (currently only via API) with download requests redirected to the remote system. Use cases include referencing public files hosted on external web servers as well as support for controlled access managed by Dataverse (e.g. via restricted and embargoed status) and/or by the remote store.
-- Workflow (add Aday's notes here or reword to separate the Objective 2 work)
-- Support for Archiving to any S3 store using Dataverse's RDA-conformant BagIT file format (a BagPack).
+- Initial support for computational workflows, including a new metadata block and detected filetypes.
+- Support for archiving to any S3 store using Dataverse's RDA-conformant BagIT file format (a BagPack).
- Improved error handling and performance in archival bag creation and new options such as only supporting archiving of one dataset version.
-- Additions/corrections to the OAI-ORE metadata format (which is included in archival bags) such as referencing the name/mimeType/size/checksum/download URL of the original file for ingested files, the inclusion of metadata about the parent collection(s) of an archived dataset version, and use of the URL form of PIDs.
+- Additions/corrections to the OAI-ORE metadata format (which is included in archival bags) such as referencing the name/mimetype/size/checksum/download URL of the original file for ingested files, the inclusion of metadata about the parent collection(s) of an archived dataset version, and use of the URL form of PIDs.
- Display of archival status within the dataset page versions table, richer status options including success, pending, and failure states, with a complete API for managing archival status.
- Support for batch archiving via API as an alternative to the current options of configuring archiving upon publication or archiving each dataset version manually.
- Initial support for sending and receiving Linked Data Notification messages indicating relationships between a dataset and external resources (e.g., papers or other datasets) that can be used to trigger additional actions, such as the creation of a back-link to provide, for example, bi-directional linking between a published paper and a Dataverse dataset.
- A new capability to provide custom per-field instructions in dataset templates.
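The remote-file capability above is currently API-driven. As a rough sketch only (the base URL, dataset PID, and the `remote://` storageIdentifier format are all assumptions here, not the documented call for any particular installation), preparing a request to register a remotely hosted file might look like this:

```python
import json

# Hedged sketch: the endpoint shape follows the Dataverse native API's
# "add file to dataset" call, but the storageIdentifier format for a
# RemoteOverlay store is an assumption; check your store configuration.
BASE_URL = "https://demo.dataverse.org"    # assumption: your installation
PID = "doi:10.5072/FK2/EXAMPLE"            # assumption: your dataset PID

def build_add_remote_file_request(base_url, pid, remote_url):
    """Build the URL and JSON payload that would register a remote file."""
    endpoint = f"{base_url}/api/datasets/:persistentId/add?persistentId={pid}"
    payload = {
        # "remote" is a placeholder store label, not a real configured store.
        "storageIdentifier": f"remote://{remote_url}",
        "fileName": remote_url.rsplit("/", 1)[-1],
        "mimeType": "application/octet-stream",
    }
    return endpoint, payload

endpoint, payload = build_add_remote_file_request(
    BASE_URL, PID, "files.example.org/data/readings.csv")
print(endpoint)
print(json.dumps(payload, indent=2))
```

The actual POST (with an API token header) is omitted; the point is only that the remote system keeps the bytes while Dataverse records the metadata and redirects downloads.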

## Major Use Cases and Infrastructure Enhancements

Changes and fixes in this release include:

-- Administrators can configure an S3 store used in Dataverse to support users uploading/downloading files via Globus File Transfer (PR #8891)
+- Administrators can configure an S3 store used in Dataverse to support users uploading/downloading files via Globus File Transfer. (PR #8891)
- Administrators can configure a RemoteOverlay store to allow files that remain hosted by a remote system to be added to a dataset. (PR #7325)
- Administrators can configure the Dataverse software to send archival Bag copies of published dataset versions to any S3-compatible service. (PR #8751)
- Users can see information about a dataset's parent collection(s) in the OAI-ORE metadata export. (PR #8770)
-- Users and Administrators can now use the OAI-ORE metadata export to retrieve and assess the fixity of the the original file (for ingested tabular files) via the included checksum. (PR #8901)
-- Archiving via RDA-conformant Bags is more robust and is more configurable (PR #8773, #8747, #8699, #8609, #8606, #8610)
-- Users and administrators can see the archival status of the versions of the datasets they manage in the dataset page version table (PR #8748, #8696)
-- Administrators can configure messaging between their Dataverse installation and other repositories that may hold related resources or services interested in activity within that installation (PR #8775)
+- Users and administrators can now use the OAI-ORE metadata export to retrieve and assess the fixity of the original file (for ingested tabular files) via the included checksum. (PR #8901)
+- Archiving via RDA-conformant Bags is more robust and is more configurable. (PR #8773, #8747, #8699, #8609, #8606, #8610)
+- Users and administrators can see the archival status of the versions of the datasets they manage in the dataset page version table. (PR #8748, #8696)
+- Administrators can configure messaging between their Dataverse installation and other repositories that may hold related resources or services interested in activity within that installation. (PR #8775)
- Collection managers can create templates that include custom instructions on how to fill out specific metadata fields.
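The archival-status items above come with an API. As a sketch (the URL path and JSON field names here are assumptions inferred from these notes, not a documented contract), a client managing archival status might build its calls like this:

```python
# Hedged sketch of the archival-status API mentioned above. The path and
# the JSON body fields are assumptions based on the release notes (status
# states: success, pending, failure); consult the API guide for specifics.
def archival_status_request(base_url, dataset_id, version):
    """Return the status URL plus an example body a PUT might carry."""
    url = f"{base_url}/api/datasets/{dataset_id}/{version}/archivalStatus"
    body = {"status": "success", "message": "Bag stored in S3"}  # example
    return url, body

url, body = archival_status_request("http://localhost:8080", 42, "1.0")
print(url)
print(body["status"])
```

A GET on the same URL would read the status back, which is how the dataset page versions table can show it.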

## Notes for Dataverse Installation Administrators

-### Enabling experimental capabilities
+### Enabling Experimental Capabilities

-Several of the capabilities introduced in v5.12 are "experimental" in the sense that further changes and enhancements to these capabilities should be expected and that these changes may involve additional work, for those who use the initial implementations, when upgrading to newer versions of the Dataverse software. Administrators wishing to use them are encouraged to stay in touch, e.g. via the Dataverse Community Slack space, to understand the limits of current capabilties and to plan for future upgrades.
+Several of the capabilities introduced in v5.12 are "experimental" in the sense that further changes and enhancements to these capabilities should be expected and that these changes may involve additional work, for those who use the initial implementations, when upgrading to newer versions of the Dataverse software. Administrators wishing to use them are encouraged to stay in touch, e.g. via the Dataverse Community Slack space, to understand the limits of current capabilities and to plan for future upgrades.

## New JVM Options and DB Settings

@@ -73,8 +70,6 @@ Earlier versions of the archival bags included the ingested (tab-separated-value

## Complete List of Changes

-## Installation
-
If this is a new installation, please see our [Installation Guide](https://guides.dataverse.org/en/5.12/installation/). Please also contact us to get added to the [Dataverse Project Map](https://guides.dataverse.org/en/5.12/installation/config.html#putting-your-dataverse-installation-on-the-map-at-dataverse-org) if you have not done so already.
@@ -83,4 +78,4 @@ If this is a new installation, please see our [Installation Guide](https://guide

8\. Re-export metadata files (OAI_ORE is affected by the PRs in these release notes). Optionally, for those using the Dataverse software's BagIt-based archiving, re-archive dataset versions archived using prior versions of the Dataverse software. This will be recommended/required in a future release.

-9. Standard instructions for reinstalling the citation metadatablock. There are no new fields so solr changes/reindex aren't needed. This PR just adds an option to the list of publicationIdTypes
+9\. Standard instructions for reinstalling the citation metadatablock. There are no new fields so Solr changes/reindex aren't needed. This PR just adds an option to the list of publicationIdTypes
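Upgrade steps 8 and 9 above correspond to admin API calls. This sketch only builds the URLs so the curl equivalents are explicit; the base URL is an assumption (an installation reachable on localhost), and the citation.tsv path in the comment is illustrative:

```python
# Hedged sketch mapping the upgrade steps above to admin API URLs.
BASE = "http://localhost:8080"  # assumption: API reachable locally

reexport_url = f"{BASE}/api/admin/metadata/reExportAll"   # step 8
load_url = f"{BASE}/api/admin/datasetfield/load"          # step 9
# Step 9 curl-equivalent (TSV path is illustrative):
#   curl http://localhost:8080/api/admin/datasetfield/load \
#     -H "Content-type: text/tab-separated-values" \
#     -X POST --upload-file scripts/api/data/metadatablocks/citation.tsv
print(reexport_url)
print(load_url)
```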
doc/release-notes/8639-computational-workflow.md (4 changes: 3 additions & 1 deletion)

@@ -1,6 +1,8 @@
+NOTE: These "workflow" changes should be folded into "Harvard Data Commons Additions" in 8611-DataCommons-related-notes.md

## Adding Computational Workflow Metadata
The new Computational Workflow metadata block will allow depositors to effectively tag datasets as computational workflows.

To add the new metadata block, follow the instructions in the user guide: <https://guides.dataverse.org/en/latest/admin/metadatacustomization.html>

The location of the new metadata block tsv file is: `dataverse/scripts/api/data/metadatablocks/computational_workflow.tsv`
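Metadata block TSV files like the one named above are tab-separated files with rows grouped under section headers such as `#metadataBlock` and `#datasetField`. As an illustration only (the sample rows below are invented, not copied from the real computational_workflow.tsv), a toy parser makes the structure concrete:

```python
# Hedged sketch of the metadata block TSV layout: section header rows
# start with "#", and the data rows under each begin with a tab. The
# sample content is illustrative, not the real computational_workflow.tsv.
SAMPLE = """#metadataBlock\tname\tdisplayName
\tcomputationalworkflow\tComputational Workflow Metadata
#datasetField\tname\ttitle
\tworkflowType\tWorkflow Type
"""

def split_sections(tsv_text):
    """Group data rows (split on tabs) under their '#...' section header."""
    sections = {}
    current = None
    for line in tsv_text.splitlines():
        if line.startswith("#"):
            current = line.split("\t", 1)[0]
            sections[current] = []
        elif current:
            sections[current].append(line.split("\t"))
    return sections

sections = split_sections(SAMPLE)
print(sorted(sections))
```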
@@ -1,3 +1,5 @@
+NOTE: These "workflow" changes should be folded into "Harvard Data Commons Additions" in 8611-DataCommons-related-notes.md

The following file extensions are now detected:

wdl=text/x-workflow-description-language
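The `wdl=text/x-workflow-description-language` line above is an extension-to-MIME-type mapping. As an illustration of what extension-based detection does (a sketch, not the server's actual implementation, which keeps these mappings in its own configuration), only the `wdl` entry is taken from the notes:

```python
# Hedged sketch of extension-based file type detection. Only the "wdl"
# mapping comes from the release notes; the fallback type is assumed.
DETECTED_TYPES = {
    "wdl": "text/x-workflow-description-language",
}

def detect_mime(filename, default="application/octet-stream"):
    """Return the MIME type for a filename based on its extension."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return DETECTED_TYPES.get(ext, default)

print(detect_mime("pipeline.wdl"))  # text/x-workflow-description-language
```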
