Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

objstore.DownloadDir() doesn't correctly cleanup on failure #3941

Closed
pracucci opened this issue Mar 19, 2021 · 8 comments
Closed

objstore.DownloadDir() doesn't correctly cleanup on failure #3941

pracucci opened this issue Mar 19, 2021 · 8 comments

Comments

@pracucci
Copy link
Contributor

What happened:
We've seen an error like this:

msg="failed to remove file on partial dir download error" file=/data/compact/0@5679675083797525161-1614499200000-1614506400000/01EZM4R375DDZC1F52B9HV7S8G err="remove /data/compact/0@5679675083797525161-1614499200000-1614506400000/01EZM4R375DDZC1F52B9HV7S8G: directory not empty"

From a quick look at the DownloadDir() implementation (code) I think it doesn't cleanup cleanup on failure when the downloaded directory contains subdirectories too (eg. block chunks/).

I see a couple of issues:

  1. downloadedFiles is not populated when DownloadDir() is called recursively
  2. downloadedFiles is populated with the dst and not actually the downloaded file, but then we delete it we call os.Remove() (non recursive)

What you expected to happen:

Fixing the logic should be easy, but I'm not sure what's the desired behaviour. Do we expect DownloadDir() to delete everything including subdirectories?

@bwplotka
Copy link
Member

bwplotka commented Mar 19, 2021

Ideally it deletes all it downloaded if requested. WDYT?

It's sounds like a bug.

@GiedriusS
Copy link
Member

Could you please print the whole directory tree of when this happens if possible?

@pracucci
Copy link
Contributor Author

Ideally it deletes all it downloaded if requested. WDYT?

I do agree. I'm actually thinking whether it's easier to change DownloadDir() logic and not make the use of any recursion. Now we have the ability to Iter() objects recursively, so we could just leverage on it. This would simplify the fix.

@pracucci
Copy link
Contributor Author

pracucci commented Mar 19, 2021

Could you please print the whole directory tree of when this happens if possible?

Yeah. Unfortunately happened on a customer cluster and I just got this report (I wasn't actively debugging it).

@GiedriusS
Copy link
Member

Could you please print the whole directory tree of when this happens if possible?

Yeah. Unfortunately happened on a customer cluster and I just got this report (I wasn't actively debugging it).

I presume your customer doesn't run code from v0.19.0-rc.1? So, this issue is probably from v0.18.0 i.e. before #3031 ?

@saswatamcode
Copy link
Member

I think this bug would arise when the downloadedFile array is populated with a dst that hasn't been downloaded. This possibly should cause DownloadFile to return an error though and the array wouldn't be appended. Was there any other error just before this? It would be easier to figure out the flow then.

@stale
Copy link

stale bot commented Jun 2, 2021

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Jun 2, 2021
@stale
Copy link

stale bot commented Jun 16, 2021

Closing for now as promised, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Jun 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants