Merge pull request #6488 from TexasDigitalLibrary/IQSS/6485
Iqss/6485 - Multiple File Stores
kcondon authored Feb 19, 2020
2 parents ad49b44 + 034787b commit bf79e64
Showing 34 changed files with 977 additions and 490 deletions.
36 changes: 36 additions & 0 deletions doc/release-notes/6485-multiple-stores.md
@@ -0,0 +1,36 @@
# Multiple Store Support
Dataverse can now be configured to store files in more than one place at the same time (multiple file, s3, and/or swift stores).

General information about this capability can be found in the <a href="http://guides.dataverse.org/en/latest/installation/config.html">Configuration Guide</a> - File Storage section.

**Upgrade Information:**

**Existing installations will need to make configuration changes to adopt this version, regardless of whether additional stores are to be added or not.**

Multistore support requires that each store be assigned a label, id, and type - see the documentation for a more complete explanation. For an existing store, the recommended upgrade path is to assign the store id based on its type, i.e. a 'file' store would get id 'file', and an 's3' store would get id 's3'.

With this choice, no manual changes to datafile 'storageidentifier' entries are needed in the database. (If you do not name your existing store using this convention, you will need to edit the database to maintain access to existing files!)

The following commands change the Glassfish JVM options to adapt an existing file or s3 store for this upgrade.

For a file store:

./asadmin create-jvm-options "\-Ddataverse.files.file.type=file"
./asadmin create-jvm-options "\-Ddataverse.files.file.label=file"
./asadmin create-jvm-options "\-Ddataverse.files.file.directory=<your directory>"

For an s3 store:

./asadmin create-jvm-options "\-Ddataverse.files.s3.type=s3"
./asadmin create-jvm-options "\-Ddataverse.files.s3.label=s3"
./asadmin delete-jvm-options "-Ddataverse.files.s3-bucket-name=<your_bucket_name>"
./asadmin create-jvm-options "-Ddataverse.files.s3.bucket-name=<your_bucket_name>"

Any additional S3 options you have set will need to be replaced as well, following the pattern in the last two lines above: delete the option that includes a '-' after 's3', then create the same option with that '-' replaced by a '.', using the same value you currently have configured.
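
The renaming pattern described above can be sketched as a small shell helper. This is only an illustration: the function name `migrate_s3_option` is hypothetical, and it prints the `asadmin` commands for review rather than running them, so the output should be checked against your own configuration before use.

```shell
#!/bin/sh
# Sketch: print the asadmin delete/create pair that moves one legacy S3 JVM
# option (dataverse.files.s3-<name>) to the new form (dataverse.files.s3.<name>).
# The function name is a hypothetical helper, not part of Dataverse or Glassfish.

migrate_s3_option() {
    old_name=$1   # e.g. dataverse.files.s3-bucket-name
    value=$2      # the value currently configured for that option

    # Refuse anything that is not a legacy 's3-' option.
    case $old_name in
        dataverse.files.s3-*) ;;
        *) echo "not a legacy s3 option: $old_name" >&2; return 1 ;;
    esac

    # Replace only the '-' that directly follows 's3' with a '.'.
    new_name="dataverse.files.s3.${old_name#dataverse.files.s3-}"

    # Print the commands instead of executing them.
    printf './asadmin delete-jvm-options "-D%s=%s"\n' "$old_name" "$value"
    printf './asadmin create-jvm-options "-D%s=%s"\n' "$new_name" "$value"
}

migrate_s3_option "dataverse.files.s3-bucket-name" "<your_bucket_name>"
```

Run as written, the last line prints the same delete/create pair shown for the bucket name above. Calling the function once per remaining legacy option gives a list of commands to run from the Glassfish `bin` directory.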

Once these options are set, restarting the Glassfish service is all that is needed to complete the change.

Note that the "\-Ddataverse.files.directory" option, if defined, continues to control where temporary files are stored (in the /temp subdir of that directory), independent of the location of any 'file' store defined above.
22 changes: 21 additions & 1 deletion doc/sphinx-guides/source/admin/dataverses-datasets.rst
@@ -38,7 +38,27 @@ Add Dataverse RoleAssignments to Child Dataverses

Recursively applies role assignments from a specified dataverse to all of the dataverses that have been created within it. This covers the users and groups holding roles on the specified dataverse that are in the set configured to be inheritable via the :InheritParentRoleAssignments setting. The response indicates success or failure and lists the individuals/groups and dataverses involved in the update. Only accessible to superusers. ::
curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias//addRoleAssignmentsToChildren
curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias/addRoleAssignmentsToChildren
Configure a Dataverse to store all new files in a specific file store
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To direct new files (uploaded when datasets are created or edited) for all datasets in a given dataverse to a specific store, specify the store via the API as shown below, or by editing the 'General Information' for a Dataverse on the Dataverse page. Only accessible to superusers. ::
curl -H "X-Dataverse-key: $API_TOKEN" -X PUT -d $storageDriverLabel http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver
The current driver can be seen using:

curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver

and can be reset to the default store with:

curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver

The available drivers can be listed with:

curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/storageDrivers
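
The per-dataverse calls above can be wrapped in a few shell functions, sketched here under stated assumptions: ``SERVER`` and ``API_TOKEN`` are set in the environment as in the curl examples, and the function names are hypothetical helpers rather than anything shipped with Dataverse.

```shell
#!/bin/sh
# Sketch: thin wrappers around the storageDriver admin endpoints shown above.
# SERVER and API_TOKEN must be set in the environment, as in the curl examples.
# All function names here are hypothetical helpers, not part of Dataverse.

storage_driver_url() {
    # $1 = dataverse alias
    printf 'http://%s/api/admin/dataverse/%s/storageDriver\n' "$SERVER" "$1"
}

get_storage_driver() {
    # Show the driver currently configured for the dataverse.
    curl -H "X-Dataverse-key: $API_TOKEN" "$(storage_driver_url "$1")"
}

set_storage_driver() {
    # $1 = dataverse alias, $2 = label of the store to use for new files
    curl -H "X-Dataverse-key: $API_TOKEN" -X PUT -d "$2" "$(storage_driver_url "$1")"
}

reset_storage_driver() {
    # Return the dataverse to the default store.
    curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE "$(storage_driver_url "$1")"
}
```

For example, ``set_storage_driver "$dataverse_alias" "$storageDriverLabel"`` issues the same PUT request as the first command in this section. Quoting the alias and label keeps values containing spaces or shell metacharacters intact.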


Datasets
--------
8 changes: 6 additions & 2 deletions doc/sphinx-guides/source/developers/big-data-support.rst
@@ -18,7 +18,7 @@ Install a DCM

Installation instructions can be found at https://github.com/sbgrid/data-capture-module/blob/master/doc/installation.md. Note that shared storage (POSIX or AWS S3) between Dataverse and your DCM is required. You cannot use a DCM with Swift at this time.

.. FIXME: Explain what ``dataverse.files.dcm-s3-bucket-name`` is for and what it has to do with ``dataverse.files.s3-bucket-name``.
.. FIXME: Explain what ``dataverse.files.dcm-s3-bucket-name`` is for and what it has to do with ``dataverse.files.s3.bucket-name``.
Once you have installed a DCM, you will need to configure two database settings on the Dataverse side. These settings are documented in the :doc:`/installation/config` section of the Installation Guide:

@@ -100,6 +100,7 @@ Optional steps for setting up the S3 Docker DCM Variant
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- Before: the default bucket for DCM to hold files in S3 is named test-dcm. It is coded into `post_upload_s3.bash` (line 30). Change to a different bucket if needed.
- Also note: with the new support for multiple file stores in Dataverse, DCM requires a store with id="s3" and will only work with this store.

- Add AWS bucket info to dcmsrv
- Add AWS credentials to ``~/.aws/credentials``
@@ -115,6 +116,9 @@ Optional steps for setting up the S3 Docker DCM Variant
- ``cd /opt/glassfish4/bin/``
- ``./asadmin delete-jvm-options "\-Ddataverse.files.storage-driver-id=file"``
- ``./asadmin create-jvm-options "\-Ddataverse.files.storage-driver-id=s3"``
- ``./asadmin create-jvm-options "\-Ddataverse.files.s3.type=s3"``
- ``./asadmin create-jvm-options "\-Ddataverse.files.s3.label=s3"``


- Add AWS bucket info to Dataverse
- Add AWS credentials to ``~/.aws/credentials``
@@ -132,7 +136,7 @@ Optional steps for setting up the S3 Docker DCM Variant

- S3 bucket for Dataverse

- ``/usr/local/glassfish4/glassfish/bin/asadmin create-jvm-options "-Ddataverse.files.s3-bucket-name=iqsstestdcmbucket"``
- ``/usr/local/glassfish4/glassfish/bin/asadmin create-jvm-options "-Ddataverse.files.s3.bucket-name=iqsstestdcmbucket"``

- S3 bucket for DCM (as Dataverse needs to do the copy over)
