diff --git a/doc/release-notes/9130-cleanup-storage.md b/doc/release-notes/9130-cleanup-storage.md index 71387a92db2..052123d5dd5 100644 --- a/doc/release-notes/9130-cleanup-storage.md +++ b/doc/release-notes/9130-cleanup-storage.md @@ -1,3 +1,3 @@ ### Support for cleaning up files in datasets' storage -Experimental feature: all the files stored in the Dataset storage location that are not in the file list of that Dataset can be removed with the new native API call (/api/datasets/$id/cleanStorage). \ No newline at end of file +Experimental feature: the leftover files stored in the Dataset storage location that are not in the file list of that Dataset, but are named following the Dataverse technical convetion for dataset files, can be removed with the new native API call [Cleanup storage of a Dataset](https://guides.dataverse.org/en/latest/api/native-api.html#cleanup-storage-api). \ No newline at end of file diff --git a/doc/sphinx-guides/source/api/native-api.rst b/doc/sphinx-guides/source/api/native-api.rst index 82023f7a060..f67c5a45174 100644 --- a/doc/sphinx-guides/source/api/native-api.rst +++ b/doc/sphinx-guides/source/api/native-api.rst @@ -1518,28 +1518,32 @@ The fully expanded example above (without environment variables) looks like this Cleanup storage of a Dataset ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This is an experimental feature and should be tested on your system before using it in production. +Also, make sure that your backups are up-to-date before using this on production servers. +It is advised to first call this method with the ``dryrun`` parameter set to ``true`` before actually deleting the files. +This will allow you to manually inspect the files that would be deleted if that parameter is set to ``false`` or is omitted (a list of the files that would be deleted is provided in the response). + If your Dataverse installation has been configured to support direct uploads, or in some other situations, you could end up with some files in the storage of a dataset that are not linked to that dataset directly. Most commonly, this could happen when an upload fails in the middle of a transfer, i.e. if a user does a UI direct upload and leaves the page without hitting cancel or save, Dataverse doesn't know and doesn't clean up the files. Similarly in the direct upload API, if the final /addFiles call isn't done, the files are abandoned. -You might also want to remove cached export files or some temporary files, thumbnails, etc. - -All the files stored in the Dataset storage location that are not in the file list of that Dataset can be removed, as shown in the example below. +All the files stored in the Dataset storage location that are not in the file list of that Dataset (and follow the naming pattern of the dataset files) can be removed, as shown in the example below. .. code-block:: bash export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx export SERVER_URL=https://demo.dataverse.org export PERSISTENT_ID=doi:10.5072/FK2/J8SJZB + export DRYRUN=true - curl -H "X-Dataverse-key: $API_TOKEN" -X GET "$SERVER_URL/api/datasets/:persistentId/cleanStorage?persistentId=$PERSISTENT_ID" + curl -H "X-Dataverse-key: $API_TOKEN" -X GET "$SERVER_URL/api/datasets/:persistentId/cleanStorage?persistentId=$PERSISTENT_ID&dryrun=$DRYRUN" The fully expanded example above (without environment variables) looks like this: .. code-block:: bash - curl -H X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X GET https://demo.dataverse.org/api/datasets/:persistentId/cleanStorage?persistentId=doi:10.5072/FK2/J8SJZB + curl -H X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X GET https://demo.dataverse.org/api/datasets/:persistentId/cleanStorage?persistentId=doi:10.5072/FK2/J8SJZB&dryrun=true Adding Files To a Dataset via Other Tools ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~