Commit

documentation update
ErykKul committed Jan 25, 2023
1 parent 222b56a commit 635f32e
Showing 2 changed files with 10 additions and 6 deletions.
2 changes: 1 addition & 1 deletion doc/release-notes/9130-cleanup-storage.md
@@ -1,3 +1,3 @@
### Support for cleaning up files in datasets' storage

- Experimental feature: all the files stored in the Dataset storage location that are not in the file list of that Dataset can be removed with the new native API call (/api/datasets/$id/cleanStorage).
+ Experimental feature: the leftover files stored in the Dataset storage location that are not in the file list of that Dataset, but are named following the Dataverse technical convention for dataset files, can be removed with the new native API call [Cleanup storage of a Dataset](https://guides.dataverse.org/en/latest/api/native-api.html#cleanup-storage-api).
14 changes: 9 additions & 5 deletions doc/sphinx-guides/source/api/native-api.rst
Expand Up @@ -1518,28 +1518,32 @@ The fully expanded example above (without environment variables) looks like this
Cleanup storage of a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is an experimental feature: test it on your own system before using it in production, and make sure your backups are up to date before running it on production servers.
It is advised to first call this method with the ``dryrun`` parameter set to ``true``.
The response will then list the files that would be deleted, so you can inspect them manually before calling the method again with ``dryrun`` set to ``false`` (or omitted) to actually delete the files.
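As a minimal shell sketch of that inspect-before-delete step: the response shape used below is an assumption (the usual native-API envelope with a human-readable ``message`` under ``data``), so check your server's actual dry-run output before scripting against it.

```shell
# Hypothetical dry-run response (illustrative only -- real field contents may differ):
RESPONSE='{"status":"OK","data":{"message":"Found: a.txt, b.txt"}}'

# Pull the message field out with sed so the candidate files can be eyeballed
# before re-running the call with dryrun=false:
MSG=$(printf '%s' "$RESPONSE" | sed -n 's/.*"message":"\([^"]*\)".*/\1/p')
echo "$MSG"   # -> Found: a.txt, b.txt
```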

If your Dataverse installation has been configured to support direct uploads, or in some other situations,
you could end up with files in a dataset's storage location that are not linked to that dataset. Most commonly, this
happens when an upload fails partway through a transfer: for example, if a user starts a direct upload in the UI and leaves the page without clicking Cancel or Save,
Dataverse is never notified and never cleans up the files. Similarly, in the direct upload API, if the final /addFiles call is never made, the files are abandoned.

You might also want to remove cached export files or some temporary files, thumbnails, etc.

- All the files stored in the Dataset storage location that are not in the file list of that Dataset can be removed, as shown in the example below.
+ All the files stored in the Dataset storage location that are not in the file list of that Dataset (and follow the naming pattern of the dataset files) can be removed, as shown in the example below.

  .. code-block:: bash

    export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    export SERVER_URL=https://demo.dataverse.org
    export PERSISTENT_ID=doi:10.5072/FK2/J8SJZB
    export DRYRUN=true

-   curl -H "X-Dataverse-key: $API_TOKEN" -X GET "$SERVER_URL/api/datasets/:persistentId/cleanStorage?persistentId=$PERSISTENT_ID"
+   curl -H "X-Dataverse-key: $API_TOKEN" -X GET "$SERVER_URL/api/datasets/:persistentId/cleanStorage?persistentId=$PERSISTENT_ID&dryrun=$DRYRUN"

The fully expanded example above (without environment variables) looks like this:

  .. code-block:: bash

-   curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X GET "https://demo.dataverse.org/api/datasets/:persistentId/cleanStorage?persistentId=doi:10.5072/FK2/J8SJZB"
+   curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X GET "https://demo.dataverse.org/api/datasets/:persistentId/cleanStorage?persistentId=doi:10.5072/FK2/J8SJZB&dryrun=true"

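Putting the two calls together, a script might construct the dry-run URL first and only build the real one after the operator has reviewed the dry-run output. This is a sketch: the ``clean_url`` helper is our own illustration, not part of Dataverse, though the endpoint it assembles is the documented one.

```shell
#!/bin/sh
# clean_url: build the cleanStorage URL for a server, a persistent ID, and
# a dryrun flag. (Illustrative helper; only the endpoint path is from the docs.)
clean_url() {
  printf '%s/api/datasets/:persistentId/cleanStorage?persistentId=%s&dryrun=%s' "$1" "$2" "$3"
}

SERVER_URL=https://demo.dataverse.org
PERSISTENT_ID=doi:10.5072/FK2/J8SJZB

DRY=$(clean_url "$SERVER_URL" "$PERSISTENT_ID" true)
REAL=$(clean_url "$SERVER_URL" "$PERSISTENT_ID" false)
echo "$DRY"   # inspect this response first:  curl -H "X-Dataverse-key: $API_TOKEN" "$DRY"
echo "$REAL"  # then, only after review:      curl -H "X-Dataverse-key: $API_TOKEN" "$REAL"
```

Note that the full URL must stay quoted when passed to curl, since the unquoted ``&`` would otherwise send the command to the background.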
Adding Files To a Dataset via Other Tools
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
