Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

6524 dataset size api #6609

Merged
merged 17 commits into from
Feb 6, 2020
Merged

6524 dataset size api #6609

merged 17 commits into from
Feb 6, 2020

Conversation

sekmiller
Copy link
Contributor

What this PR does / why we need it:
This PR add API endpoints for getting the storage size of a dataset and getting the sum of size of the files available for download in a particular version of a dataset.

Which issue(s) this PR closes:

Closes #6524

Special notes for your reviewer:
None

Suggestions on how to test this:
Note that the storage size api requires that the token passed has view unpublished. Likewise for the download size, if the version is :draft there must be a token passed with view unpublished permission. No token is required for a published version.

Does this PR introduce a user interface change?:
No

Is there a release notes update needed for this change?:
No. (could be included in "additional features/fixes."

Additional documentation:
added doc to native-apt.rst

@coveralls
Copy link

coveralls commented Feb 4, 2020

Coverage Status

Coverage decreased (-0.01%) to 19.46% when pulling 9478caa on 6524-dataset-size-api into 7d79fb3 on develop.

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks fine so I'm moving it to QA but heads up to @sekmiller that I left some comments and suggestions.


export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=xxxxxx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export ID=xxxxxx
export ID=42


.. code-block:: bash

curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx https://demo.dataverse.org/api/datasets/xxxxxx/storagesize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx https://demo.dataverse.org/api/datasets/xxxxxx/storagesize
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx https://demo.dataverse.org/api/datasets/42/storagesize


export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=xxxxxx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export ID=xxxxxx
export ID=42

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=xxxxxx
export VERSIONID=x.x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export VERSIONID=x.x
export VERSIONID=1.0


.. code-block:: bash

curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx https://demo.dataverse.org/api/datasets/xxxxxx/versions/x.x/downloadsize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx https://demo.dataverse.org/api/datasets/xxxxxx/versions/x.x/downloadsize
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx https://demo.dataverse.org/api/datasets/42/versions/1.0/downloadsize

long total = 0L;

if (dataset.isHarvested()) {
return 0L;
}

List<DataFile> filesToTalley = new ArrayList();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to nitpick but it's "tally" without an "e".

/**
* Returns the total byte size of the files in this dataset
*
* @param dataset
* @param countCachedExtras boolean indicating if the cached disposable extras should also be counted
* @param mode String indicating whether we are getting the result for storage (entire dataset) or download version based
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to document the accepted values for "mode".


private final Dataset dataset;
private final Boolean countCachedFiles;
private final String mode;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a String for "mode" and enum would be nice. More type safe that way. And auto documenting.

@djbrooke
Copy link
Contributor

djbrooke commented Feb 5, 2020

@pdurbin if there are suggested changes let's not start QA yet. @sekmiller feel free to respond to the feedback above and let me know if you want me to edit any docs.

@pdurbin
Copy link
Member

pdurbin commented Feb 5, 2020

@djbrooke that's fine but I want to re-emphasize that I don't see any changes as required. Again, they're just suggestions. 😄

@djbrooke
Copy link
Contributor

djbrooke commented Feb 5, 2020

@pdurbin thanks, yeah, I 100% agree and understand, but I don't want @kcondon to start QA and have things shift under him if @sekmiller plans to make the changes.

@pdurbin pdurbin removed their assignment Feb 5, 2020
@sekmiller
Copy link
Contributor Author

I addressed the coding suggestions that @pdurbin made, but I'm going to leave the doc as is for now since it is in line with what is already there for other api endpoints.

@sekmiller sekmiller removed their assignment Feb 6, 2020
@kcondon kcondon self-assigned this Feb 6, 2020
@kcondon kcondon merged commit 0348023 into develop Feb 6, 2020
@kcondon kcondon deleted the 6524-dataset-size-api branch February 6, 2020 20:21
@djbrooke djbrooke added this to the 4.20 milestone Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dataset Size API
5 participants