4.17
Dataverse 4.17
This release brings new features, enhancements, and bug fixes to Dataverse. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.
Release Highlights
Dataset Level Explore Tools
Tools that integrate with Dataverse can now be launched from the dataset page! This makes it possible to develop and add tools that work across the entire dataset instead of single files. Tools to verify reproducibility and allow researchers to compute on an entire dataset will take advantage of this new infrastructure.
Performance Enhancements
Dataverse now allows installation administrators to configure the session timeout for logged in users using the new :LoginSessionTimeout
setting. (Session length for anonymous users has been reduced from 24 hours to 10 minutes.) Setting this lower will release system resources as configured and will result in better performance (less memory use) throughout a Dataverse installation.
Dataverse and Dataset pages have also been optimized to discard more of the objects they allocate immediately after the page load. Thus keeping less memory permanently tied up for the duration of the user's login session. These savings are especially significant in the Dataverse page.
Major Use Cases
Newly-supported use cases in this release include:
- As a user, I can launch and utilize external tools that allow me to work across the code, data, and other files in a dataset.
- As a user, I can add a footer to my dataverse to show the logo for a funder or other entity.
- As a developer, I can build external tools to verify reproducibility or allow computation.
- As a developer, I can check to see the impact of my proposed changes on memory utilization.
- As an installation administrator, I can make a quick configuration change to provide a better experience for my installation's users.
Notes for Dataverse Installation Administrators
Configurable User Session Timeout
Idle session timeout for logged-in users has been made configurable in this release.
The default is now set to 8 hours (this is a change from the previous default value of 24 hours).
If you want to change it, set the setting :LoginSessionTimeout to the new value in minutes.
For example, to reduce the timeout to 4 hours:
curl -X PUT -d 240 http://localhost:8080/api/admin/settings/:LoginSessionTimeout
Once again, this is the session timeout for logged-in users only. For the anonymous sessions the sessions are set to time out after the default session-timeout
value (also in minutes) in the web.xml of the Dataverse application, which is set to 10 minutes. You will most likely not ever need to change this, but if you do, configure it by editing the web.xml file.
Flexible Solr Schema, optionally reconfigure Solr
With this release, we moved all fields in Solr search index that relate to the default metadata schemas from schema.xml
to separate files. Custom metadata block configuration of the search index can be more easily automated that way. For details, see admin/metadatacustomization.html#updating-the-solr-schema.
This is optional, but all future changes will go to these files. It might be a good idea to reconfigure Solr now or be aware to look for changes to these files in the future, too. Here's how:
- You will need to replace or modify your
schema.xml
with the recent one (containing XML includes) - Copy
schema_dv_mdb_fields.xml
andschema_dv_mdb_copies.xml
to the same location as theschema.xml
- A re-index is not necessary as long no other changes happened, as this is only a reorganization of Solr fields from a single schema.xml file into multiple files.
In case you use custom metadata blocks, you might find the new updateSchemaMDB.sh
script beneficial. Again,
see http://guides.dataverse.org/en/4.17/admin/metadatacustomization.html#updating-the-solr-schema
Memory Benchmark Test
Developers and installation administrators can take advantage of new scripts to produce graphs of memory usage and garbage collection events. This is helpful for developers to investigate the implications of changes on memory usage and it is helpful for installation administrators to compare graphs across releases or time periods. For details see the scripts/tests/ec2-memory-benchmark
directory.
New Database Settings
:LoginSessionTimeout controls the session timeout (in minutes) for logged-in users.
Notes for Tool Developers and Integrators
New Features and Breaking Changes for External Tool Developers
The good news is that external tools can now be defined at the dataset level and there is new and improved documentation for external tool developers, linked below.
Additionally, the reserved words {datasetPid}
, {{filePid}
, and {localeCode}
were added. Please consider making it possible to translate your tool into various languages! The reserved word {datasetVersion}
has been made more flexible.
The bad news is that there are two breaking changes. First, tools must now define a "scope" of either "file" or "dataset" for the manifest to be successfully loaded into Dataverse. Existing tools in a Dataverse installations will be assigned a scope of "file" automatically by a SQL migration script but new installations of Dataverse will need to load an updated manifest file with this new "scope" variable.
Second, file level tools that did not previously define a "contentType" are now required to do so. In previously releases, file level tools that did not define a contentType were automatically given a contentType of "text/tab-separated-values" but now Dataverse will refuse to load the manifest file if contentType is not specified.
The Dataverse team has been reaching out to tool makers about these breaking changes and getting various tools working in the https://github.com/IQSS/dataverse-ansible repo. Thank you for your patience as the dust settles around the external tool framework.
For more information, check out new Building External Tools section of the API Guide.
Complete List of Changes
For the complete list of code changes in this release, see the 4.17 milestone in Github.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email support@dataverse.org.
Installation
If this is a new installation, please see our Installation Guide.
Upgrade
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.17.war
-
Restart glassfish
-
Update Citation Metadata Block
wget https://github.com/IQSS/dataverse/releases/download/v4.17/citation.tsv
curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"
If you have any trouble adding an external tool at the dataset level and see warnings about "contenttype" in server.log, it is recommended that you run the following SQL update from pull request #6460:
ALTER TABLE externaltool ALTER contenttype DROP NOT NULL;