
azure sync memory leak #802

Closed

mikejdunphy opened this issue Dec 24, 2019 · 7 comments

@mikejdunphy

AzCopy 10.3.3 on Linux crashes the VM with a memory leak.
Running azcopy sync source destination --recursive.
It runs for a few hours, scans about 3.4 million files, and then crashes the system,
eating all swap and memory.
Reproduced twice.
The source has 53 million tiny files.

[memory and swap usage charts]

@JohnRusk
Member

Is it likely that many of the files already exist at the destination? If so, I believe it's a known bug that's on our backlog. Thanks for the clear charts and details.

@mikejdunphy
Author

mikejdunphy commented Dec 28, 2019 via email

JohnRusk self-assigned this Jan 5, 2020
@JohnRusk
Member

JohnRusk commented Jan 5, 2020

Hi @mikejdunphy

Sorry for the slow reply. I was away over Christmas/New Year.

My guess is that this is not actually a memory leak, but just relatively high memory usage. We've heard of a few other cases where it plateaus in the low GBs, i.e. a bit higher than the 2.x GB you are seeing.

Also, I should mention that there are actually two known issues with high file counts:

  • the one I mentioned above, which only relates to "copy" when using --overwrite=false and when lots of files already exist at the destination, and
  • another in which the progress reporting has a performance issue. I believe this one only has a significant effect once you get up to a few tens of millions of files (and even then, it's tolerable up to 50 million or so) -- but maybe somehow in your case it's having a greater effect at a lower file count than I expect. This issue can be mitigated by breaking the work up into separate jobs. E.g. if you have 20 directories under your top-level root directory, run 20 separate AzCopy jobs, one after the other, one for each directory. (It can help to script it; see the sketch after this list.)
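
As a rough illustration of what such a script could look like (a minimal sketch, assuming a bash shell; the source path, destination URL, and SAS token are placeholders, not values from this issue):

```bash
#!/usr/bin/env bash
# Sketch: split one large sync into one AzCopy job per top-level directory.
# SRC_ROOT, DST_ROOT, and the SAS token are placeholders to replace with your own values.
SRC_ROOT="/data/source"
DST_ROOT="https://<account>.blob.core.windows.net/<container>"
SAS="?<sas-token>"

for dir in "$SRC_ROOT"/*/; do
    name=$(basename "$dir")
    # Each invocation is a separate job, one after the other, one per directory.
    azcopy sync "$dir" "$DST_ROOT/$name$SAS" --recursive
done
```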

From your description, I'm not sure whether you're actually suffering from the perf issue for which breaking the work into separate jobs would help. It might just be a memory issue. I'd suggest that the following things might help:

  1. If you happen to be running a VM, increase the memory allocated to the VM.
  2. Set the environment variable AZCOPY_BUFFER_GB to 0.25. This may be enough to keep the memory usage within the bounds of what you already have in the machine (see the example after this list).
  3. Try breaking up the work, if possible, as described above.
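
For example (a minimal sketch, assuming bash on Linux; the source path and destination URL are placeholders, not values from this issue):

```bash
# Cap AzCopy's in-memory buffer before starting the sync.
# The path and URL below are placeholders.
export AZCOPY_BUFFER_GB=0.25
azcopy sync "/data/source" "https://<account>.blob.core.windows.net/<container>?<sas-token>" --recursive
```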

Finally, I'm puzzled by this:

I tried the “copy” but that’s not going to work either, it’s been running for a week and is still
“scanning” ..

When you use copy, it starts copying files as soon as the first 10,000 have been scanned. Did it report any throughput? Did it report any files completed?

@JohnRusk
Member

JohnRusk commented Jan 5, 2020

(BTW, I edited the above reply. The first draft mistakenly said that the first known perf issue applies to sync. It actually applies to copy with --overwrite=false.)

@mikejdunphy
Author

mikejdunphy commented Jan 6, 2020 via email

@JohnRusk
Member

JohnRusk commented Jan 6, 2020

A few other general perf tips for tiny files include:

  • Drop the logging level to be more concise, since very small files mean lots of logging at the default level. Use --log-level WARNING.
  • If you're running AzCopy in an environment with 4 CPUs or fewer, experiment with setting a higher concurrency level, by setting the environment variable AZCOPY_CONCURRENCY_VALUE to 128 or 256.
  • Consider turning off the length check. By default, AzCopy 10.3.x checks the length of the destination vs the length of the source after each file is copied. That costs you one extra IO operation per file, which is just a drop in the bucket when you have a small number of big files, but can be significant when you have a large number of small files. You can turn it off with --check-length=false. (The first three tips are combined in the sketch after this list.)
  • Consider the benefits (and costs) of using premium block blob storage instead of "normal" block blob storage. Premium is optimized for very small files. https://azure.microsoft.com/en-us/blog/premium-block-blob-storage-a-new-level-of-performance/
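
Put together, the first three tips look roughly like this (a sketch, assuming bash; the path, destination URL, and concurrency value are placeholders to adjust for your machine):

```bash
# Combine higher concurrency, quieter logging, and no post-copy length check.
# The path, destination URL, and concurrency value are placeholders.
export AZCOPY_CONCURRENCY_VALUE=256
azcopy sync "/data/source" "https://<account>.blob.core.windows.net/<container>?<sas-token>" \
    --recursive \
    --log-level WARNING \
    --check-length=false
```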

In some small-files cases, the first three tips combined can give you something like a doubling of throughput. The last one, premium block blobs, can double that again or better - but check the pricing, because it is priced differently. Performance-wise it is a very good choice for small blobs.

@gapra-msft
Member

Closing due to inactivity. Please open a new issue if you are still experiencing issues with AzCopy.
