
azure sync memory leak #802

Closed

mikejdunphy opened this issue Dec 24, 2019 · 7 comments

@mikejdunphy

AzCopy 10.3.3 on Linux crashes the VM with a memory leak.
Running azcopy sync source destination --recursive.
It runs for a few hours, scans about 3.4 million files, and then crashes the system,
eating all swap and memory.
Reproduced twice.
The source has 53 million tiny files.

[memory and swap usage charts]

@JohnRusk
Member

Is it likely that many of the files already exist at the destination? If so, I believe it's a known bug that's on our backlog. Thanks for the clear charts and details.

@mikejdunphy
Author

mikejdunphy commented Dec 28, 2019 via email

JohnRusk self-assigned this Jan 5, 2020
@JohnRusk
Member

JohnRusk commented Jan 5, 2020

Hi @mikejdunphy

Sorry for the slow reply. I was away over Christmas/New Year.

My guess is that this is not actually a memory leak, but just relatively high memory usage. We've heard of a few other cases where it plateaus in the low GBs, i.e. a bit higher than the 2.x GB you are seeing.

Also, I should mention that there are actually two known issues with high file counts:

  • the one I mentioned above, which only relates to "copy" when using --overwrite=false and when lots of files already exist at the destination, and
  • another in which the progress reporting has a performance issue. I believe this one only has a significant effect once you get up to a few tens of millions of files (and even then, it's tolerable up to 50 million or so) -- but maybe somehow in your case it's having a greater effect at a lower file count than I expect. This issue can be mitigated by breaking the work up into separate jobs. E.g. if you have 20 directories under your top-level root directory, run 20 separate AzCopy jobs, one after the other, one for each directory. (It can help to script it; see the sketch after this list.)
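
As a rough illustration of what such a script could look like (a minimal sketch, assuming a bash shell; the source path, destination URL, and SAS token are placeholders, not values from this issue):

```bash
#!/usr/bin/env bash
# Sketch: split one large sync into one AzCopy job per top-level directory.
# SRC_ROOT, DST_ROOT, and the SAS token are placeholders to replace with your own values.
SRC_ROOT="/data/source"
DST_ROOT="https://<account>.blob.core.windows.net/<container>"
SAS="?<sas-token>"

for dir in "$SRC_ROOT"/*/; do
    name=$(basename "$dir")
    # Each invocation is a separate job, one after the other, one per directory.
    azcopy sync "$dir" "$DST_ROOT/$name$SAS" --recursive
done
```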

From your description, I'm not sure whether you're actually suffering from the perf issue for which breaking the work into separate jobs would help. It might just be a memory issue. I'd suggest that the following things might help:

  1. If you happen to be running a VM, increase the memory allocated to the VM.
  2. Set the environment variable AZCOPY_BUFFER_GB to 0.25. This may be enough to keep the memory usage within the bounds of what you already have in the machine (see the example after this list).
  3. Try breaking up the work, if possible, as described above.
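
For example (a minimal sketch, assuming bash on Linux; the source path and destination URL are placeholders, not values from this issue):

```bash
# Cap AzCopy's in-memory buffer before starting the sync.
# The path and URL below are placeholders.
export AZCOPY_BUFFER_GB=0.25
azcopy sync "/data/source" "https://<account>.blob.core.windows.net/<container>?<sas-token>" --recursive
```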

Finally, I'm puzzled by this:

I tried the “copy” but that’s not going to work either, it’s been running for a week and is still
“scanning” ..

When you use copy, it starts copying files as soon as the first 10,000 have been scanned. Did it report any throughput? Did it report any files completed?

@JohnRusk
Member

JohnRusk commented Jan 5, 2020

(BTW, I edited the above reply. The first draft mistakenly said that the first known perf issue applies to sync. It actually applies to copy with --overwrite=false.)

@mikejdunphy
Author

mikejdunphy commented Jan 6, 2020 via email

@JohnRusk
Member

JohnRusk commented Jan 6, 2020

A few other general perf tips for tiny files include:

  • Drop the logging level to be more concise, since very small files mean lots of logging at the default level. Use --log-level WARNING.
  • If you're running AzCopy in an environment with 4 CPUs or fewer, experiment with setting a higher concurrency level, by setting the environment variable AZCOPY_CONCURRENCY_VALUE to 128 or 256.
  • Consider turning off the length check. By default, AzCopy 10.3.x checks the length of the destination vs the length of the source after each file is copied. That costs you one extra IO operation per file, which is just a drop in the bucket when you have a small number of big files, but can be significant when you have a large number of small files. You can turn it off with --check-length=false. (The first three tips are combined in the sketch after this list.)
  • Consider the benefits (and costs) of using premium block blob storage instead of "normal" block blob storage. Premium is optimized for very small files. https://azure.microsoft.com/en-us/blog/premium-block-blob-storage-a-new-level-of-performance/
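
Put together, the first three tips look roughly like this (a sketch, assuming bash; the path, destination URL, and concurrency value are placeholders to adjust for your machine):

```bash
# Combine higher concurrency, quieter logging, and no post-copy length check.
# The path, destination URL, and concurrency value are placeholders.
export AZCOPY_CONCURRENCY_VALUE=256
azcopy sync "/data/source" "https://<account>.blob.core.windows.net/<container>?<sas-token>" \
    --recursive \
    --log-level WARNING \
    --check-length=false
```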

In some small-files cases, the first three tips combined can give you something like a doubling of throughput. The last one, premium block blobs, can double that again or better - but check the pricing, because it is priced differently. Performance-wise it is a very good choice for small blobs.

@gapra-msft
Member

Closing due to inactivity. Please open a new issue if you are still experiencing issues with AzCopy.
