Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unembargo process results in error due to incomplete uploads #2021

Open
kabilar opened this issue Sep 4, 2024 · 11 comments
Open

Unembargo process results in error due to incomplete uploads #2021

kabilar opened this issue Sep 4, 2024 · 11 comments
Labels
embargo Issues around embargo functionality UX Affects usability of the system

Comments

@kabilar
Copy link
Member

kabilar commented Sep 4, 2024

As reported by a user on August 27 via email to help@dandiarchive.org, when umembargoing a Dandiset an error occurs which turns out to be a result of incomplete uploads. The error message displayed to the user upon pressing UNEMBARGO is:

Sorry, something went wrong on our side (the developers have been notified). Please try that operation again later.

@jjnesbitt found that the API was returning a 400 error when attempting to unembargo. To resolve, Jake manually removed the uploads after confirmation from the user that the uploads were not necessary. We should catch this case of incomplete/interrupted uploads, provide this feedback directly to the user, and provide instructions on how to resolve.

Thank you.

@kabilar kabilar added UX Affects usability of the system embargo Issues around embargo functionality labels Sep 4, 2024
@alessandratrapani
Copy link

Hey there! I have 2 embargoed dandisets (001172 and 001170) where the GUI Unembargo button is inactive, but I don't have any active uploads. I don't know exactly where to check for incomplete/interrupted uploads. Can you please point me in the right direction?

@jjnesbitt
Copy link
Member

Hi Alessandra, we don't currently have a way for end users to easily check/clear incomplete uploads on a dandiset (although we are trying to improve this). I've checked and can confirm I see the following in progress / incomplete uploads:

  1. Dandiset 001172: 8 incomplete uploads, between ~35gb and ~50gb in size.
  2. Dandiset 001170: 20 incomplete uploads, between ~20gb and ~60gb in size.

If you'd like, I could clear these uploads, in order for you to proceed. If there is some missing data, you could either hold off on un-embargo until you've finished uploading the data, or un-embargo right away, and finish uploading afterwards. There would be no functional difference.

Please let me know what you'd like to do.

@alessandratrapani
Copy link

Yes, please proceed with clearing those uploads. I will discuss the next steps with the authors.
Do you have any suggestions on the uploading procedure to avoid incomplete uploads?

@jjnesbitt
Copy link
Member

Yes, please proceed with clearing those uploads. I will discuss the next steps with the authors.

Okay, this has been done. I can confirm that both of those dandisets are now available for un-embargo.

Do you have any suggestions on the uploading procedure to avoid incomplete uploads?

At the moment, not particularly. My guess is that its due to the process running the CLI being killed during upload, which leaves those uploads orphaned. So my only advice would be to gracefully shutdown the CLI if an upload is in progress (if that's the case here). @yarikoptic any advice on this front?

@bendichter
Copy link
Member

What do you mean by gracefully shutting down the CLI?

@jjnesbitt
Copy link
Member

What do you mean by gracefully shutting down the CLI?

This is where my lack of knowledge about the CLI comes into play, but I'm assuming that when a user cancels a running upload, the CLI attempts to handle that in a way that doesn't leave orphaned uploads. So if that process was hard killed, it wouldn't have a chance to do so. However, afaik we don't have endpoints to delete uploads in any way, so the only way the CLI might do that is finish any uploads in progress.

@bendichter
Copy link
Member

yes, though users will tend to run into this problem if they have slow or unreliable upload speeds and may not always be able to let an upload complete.

@jwodder, @yarikoptic do you think a reasonable solution to this might be to add a CLI function that clears any existing unfinished uploads?

@jjnesbitt
Copy link
Member

That function would still need API support, which I think is something we should provide anyway for the web client.

@bendichter
Copy link
Member

Sounds good to me

@yarikoptic
Copy link
Member

... the only way the CLI might do that is finish any uploads in progress.

neat idea, filed

But overall it would not be a solution: we do need to improve API for the life cycle of uploads - both for regular assets or zarrs. Related existing issues for zarrs

if here an issue is about regular assets (I assume, didn't tripple check) -- IMHO we should

  • add dandisets_uploads endpoint (/dandiset/{dandiset_pk}/uploads/) to list existing uploads for a dandiset
    • worth keeping in mind that there could still be incomplete Zarrs. So may be at once to allow for two (ATM) types of possible incomplete uploads -- blob or zarr.
  • add uploads_cancel endpoint (/uploads/{upload_id}/cancel/) to cancel an existing (blob) upload.

With that any UI/client can implement querying/canceling of uploads, and then "Unembargo" button could check/query for confirmation/clean things up to proceed to /unembargo/ functionality. A complimentary alternative is to make /unembargo/ to add a bool flag cleanup_uploads which would, if True, first automatically clean up all such incomplete uploads, while initial request error would provide information about such uploads. But I like that approach less.

@jjnesbitt
Copy link
Member

This problem will largely be addressed by the upcoming garbage collection, but I agree we should add that functionality to the API. Namely, listing and cancelling of existing uploads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
embargo Issues around embargo functionality UX Affects usability of the system
Projects
None yet
Development

No branches or pull requests

5 participants