File Download: Allow user to download all files from a dataverse at once. #639

Closed
eaquigley opened this issue Jul 9, 2014 · 13 comments
Labels
Feature: API Guide; Feature: File Upload & Handling; Type: Feature (a feature request); User Role: Guest (anyone using the system, even without an account)

Comments

@eaquigley
Contributor


Author Name: Kevin Condon (@kcondon)
Original Redmine Issue: 4086, https://redmine.hmdc.harvard.edu/issues/4086
Original Date: 2014-06-06
Original Assignee: Gustavo Durand


This was requested by a dv admin, Janina:

Allow a user to download all files from a dataverse at once, see RT#179817

do you know if there is a way to download all of the files in our Dataverse at once?

@raprasad raprasad modified the milestone: Dataverse 4.0: In Review Jul 9, 2014
@scolapasta scolapasta modified the milestones: Beta 7 - Dataverse 4.0, In Review - Dataverse 4.0 Jul 15, 2014
@scolapasta
Contributor

Moved to 4.1 to decide if we actually want something like this (what if a dataverse has GBs and GBs of files?)

@kcondon
Contributor

kcondon commented Apr 20, 2015

Under the Download button, the UI currently says "All files from this dataset", but the option is grayed out.
A user asked whether this was broken (see RT #196683). Liz also added a ticket to allow selecting all files at once for download: #1988

@scolapasta scolapasta modified the milestones: In Review - Long Term, In Review - Short Term May 8, 2015
@raprasad
Contributor

Based on # of files + sizes, can there be a quick estimate of whether:

  • This can be done fairly fast (in seconds) for that user OR
  • Request into a queue
    • User sees message saying to expect an email + dv notification for when the download is ready.
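The estimate suggested above could be sketched roughly as follows. This is only an illustration, not Dataverse code: the names `estimate_seconds` and `plan_download`, the time limit, the assumed throughput, and the per-file overhead are all hypothetical values that an installation would have to measure and tune.

```python
from typing import Iterable

# Hypothetical tuning values (assumptions, not taken from Dataverse).
MAX_IMMEDIATE_SECONDS = 10.0                  # longest wait we accept for a direct download
ASSUMED_BYTES_PER_SECOND = 50 * 1024 * 1024   # assumed zip/stream throughput
ASSUMED_PER_FILE_OVERHEAD_SECONDS = 0.05      # assumed fixed cost per file


def estimate_seconds(file_sizes: Iterable[int]) -> float:
    """Rough time estimate from the number of files and their sizes in bytes."""
    sizes = list(file_sizes)
    transfer = sum(sizes) / ASSUMED_BYTES_PER_SECOND
    overhead = len(sizes) * ASSUMED_PER_FILE_OVERHEAD_SECONDS
    return transfer + overhead


def plan_download(file_sizes: Iterable[int]) -> str:
    """Return "immediate" for quick requests, "queued" for ones that need a notification."""
    if estimate_seconds(file_sizes) <= MAX_IMMEDIATE_SECONDS:
        return "immediate"
    return "queued"
```

A request classified as "queued" would be the case where the user sees the message about expecting an email and a notification.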

@scolapasta scolapasta removed their assignment Jan 27, 2016
@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016
@sbarbosadataverse

@scolapasta
This was requested by Harvard GSD just today and had to be passed to Kevin.

@kcondon
Contributor

kcondon commented May 9, 2016

Will bring up at GIRT to see whether it is realistic to pursue this.

@donsizemore
Contributor

Feature request (which might better belong in a separate issue?):

Thu-Mai often needs to download all files from a dataset in their original format. I'm looking into scripting this through the Native and Data Access APIs, but a GUI option for archivists would be extremely helpful.
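A minimal sketch of what such a script might look like, assuming the Native API file listing (`/api/datasets/:persistentId/versions/:latest/files`) and the Data Access API (`/api/access/datafile/{id}?format=original`) as described in the Dataverse API guides. The helper names, the base URL, and the assumed JSON shape (`data[].dataFile.id` and `data[].dataFile.filename`) are illustrative and should be checked against the installation in question. This is not the script mentioned in this thread.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode


def list_files_url(base_url: str, persistent_id: str) -> str:
    """Native API URL that lists the files in the latest version of a dataset."""
    query = urlencode({"persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/versions/:latest/files?{query}"


def original_file_url(base_url: str, file_id: int) -> str:
    """Data Access API URL requesting a file in its original (pre-ingest) format."""
    return f"{base_url.rstrip('/')}/api/access/datafile/{file_id}?format=original"


def file_entries(listing_json: str) -> List[Tuple[int, str]]:
    """Extract (file id, filename) pairs from a file-listing response (assumed shape)."""
    payload = json.loads(listing_json)
    return [
        (item["dataFile"]["id"], item["dataFile"]["filename"])
        for item in payload.get("data", [])
    ]
```

Each URL from `original_file_url` could then be fetched with, for example, `urllib.request.urlretrieve`. Restricted files would additionally need an API token, and `format=original` is intended for ingested tabular files, so its behavior for other file types should be verified first.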

@pdurbin
Member

pdurbin commented Apr 6, 2017

@donsizemore yeah, a separate issue might be nice. What you and Thu-Mai want is actually a smaller "user story" since it's limited to a single dataset. This issue is about a whole dataverse, which might contain sub-dataverses (which might contain sub-dataverses).
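To illustrate why the whole-dataverse case is larger, here is a rough sketch of walking nested dataverses to collect dataset IDs. It assumes a caller-supplied `fetch_contents` function (hypothetical) that returns the items of one dataverse as dictionaries with `type` and `id` keys, which is roughly what the Native API's `/api/dataverses/{id}/contents` endpoint is understood to return.

```python
from typing import Callable, Dict, List


def collect_dataset_ids(
    dataverse_id: int,
    fetch_contents: Callable[[int], List[Dict]],
) -> List[int]:
    """Walk a dataverse and all of its sub-dataverses, returning every dataset id found."""
    dataset_ids: List[int] = []
    seen = set()            # dataverses already visited, to avoid repeated work
    stack = [dataverse_id]  # explicit stack instead of recursion, so depth is not an issue
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for item in fetch_contents(current):
            if item.get("type") == "dataset":
                dataset_ids.append(item["id"])
            elif item.get("type") == "dataverse":
                stack.append(item["id"])
    return dataset_ids
```

Every dataset found this way would still need its own file listing and downloads, which is where the server load adds up.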

@pdurbin
Member

pdurbin commented May 15, 2017

@donsizemore this issue came up today. Weren't you saying in IRC that you cooked up a script? Want us to take a look? 😄

@pdurbin
Member

pdurbin commented May 15, 2017

"all mine does is download all files in a given dataset in original format writing out the original filename, but it could be smartly rewritten and extended" -- @donsizemore at http://irclog.iq.harvard.edu/dataverse/2017-04-07#i_51359

@pdurbin pdurbin added User Role: Guest Anyone using the system, even without an account and removed zTriaged labels Jun 30, 2017
@pdurbin
Member

pdurbin commented Oct 4, 2018

When we tried to estimate #4529, about downloading all files based on a persistent ID (DOI or Handle), we decided against implementing that feature due to concerns over performance problems: #4529 (comment)

This issue represents even more load on the server so by the same logic we wouldn't implement this either.

@djbrooke
Contributor

I'm going to close this for now. For the performance concerns, we could possibly revisit after implementing Lambda functions (#6093) that would take zipping datasets off the application server, or if we decide to pre-zip and store content in support of #6085. We'd need to take non-S3 installations into consideration.

@mankoff
Contributor

mankoff commented Jun 4, 2020

Hello. I'm interested in this feature (and commented recently on a related issue). I have a question after reading this thread:

Why is zipping required?

Based on my (limited, ancient) webserver admin experience, if the dataset is exposed as a folder, the individual files could be downloaded with wget. Compression can happen on the fly in the webserver (e.g. Apache), perhaps only for certain file types, or not at all, so the Dataverse does not need to bulk-zip everything before the download begins.
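As a rough client-side illustration of the file-by-file idea, each file can be copied in fixed-size chunks from a response stream to disk, so nothing has to be zipped or held fully in memory. The function name `stream_copy` and the chunk size are assumptions for this sketch. It works with any binary file-like source, such as the object returned by `urllib.request.urlopen`.

```python
from typing import BinaryIO


def stream_copy(source: BinaryIO, destination: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy source to destination in chunks; return the number of bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        destination.write(chunk)
        total += len(chunk)
    return total
```

The standard library's `shutil.copyfileobj` does the same job. The loop is spelled out here only to show that memory use stays bounded by the chunk size regardless of file size.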

@pdurbin
Member

pdurbin commented Jun 4, 2020

@mankoff I appreciate your out of the box thinking! Thanks for commenting on #4529 and #6505 as well! Let's move the conversation to one of those issues since they're still open. Alternatively, you're welcome to open a dedicated issue about this idea.

10 participants