
Leak on Linux? #955

Closed

asilvas opened this issue Sep 20, 2017 · 64 comments

@asilvas

asilvas commented Sep 20, 2017

I've been troubleshooting a leak in https://github.com/asilvas/node-image-steam (it processes millions of images every day). I originally thought the leak was in my project, but after a number of heap dump checks I determined it isn't in V8.

To break the problem down into its simplest parts, I recorded the traffic in serial form so it can be replayed in a pure sharp script.

https://gist.github.com/asilvas/474112440535051f2608223c8dc2fcdf

npm i sharp request

curl https://gist.githubusercontent.com/asilvas/474112440535051f2608223c8dc2fcdf/raw/be4e593c6820c0246acf2dc9604012653d71c353/sharp.js > sharp.js
curl https://gist.githubusercontent.com/asilvas/474112440535051f2608223c8dc2fcdf/raw/be4e593c6820c0246acf2dc9604012653d71c353/sharp.log > sharp.log

node sharp.js http://img1.wsimg.com/isteam sharp.log

The script downloads these files on the fly, which avoids any FS caching that would bloat memory usage, and forwards the instructions (in sharp.log) directly to sharp, one at a time.

Memory usage climbs to 500MB+ within a few minutes (at least on Docker+CentOS) and seems to eventually plateau. On some systems I've seen over 2GB of usage. Processing only a single image at a time should keep memory usage fairly flat. Have you seen this before? Any ideas? I wasn't aware of anything sharp/vips does that should be triggering Linux's file caching.

Edit: While memory usage on Mac is still higher than I'd expect for a single image processed at a time (~160MB after a couple hundred images), it's nowhere near as high as on Linux, and it seems to plateau quickly. So it appears to be a Linux-only issue. Docker is also involved, so I'm not ruling that out either.

@lovell
Owner

lovell commented Sep 21, 2017

Hello, how is memory usage being measured? If RSS, please remember this includes free memory that has not (yet) been returned to the OS, which explains why different OSs report different RSS for the same task.

If you've not seen them, there have been quite a few related questions previously:
https://github.com/lovell/sharp/search?utf8=%E2%9C%93&q=rss+%22returned+to+the+OS%22&type=Issues
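A minimal sketch (not from the original thread) of watching RSS alongside the V8 heap with Node's built-in process.memoryUsage(); the 5-second interval is arbitrary:

// Log RSS vs V8 heap vs "external" (Buffers and other native allocations tracked by V8)
setInterval(() => {
  const { rss, heapUsed, external } = process.memoryUsage();
  const mb = (n) => (n / 1048576).toFixed(1) + 'MB';
  console.log(`rss=${mb(rss)} heapUsed=${mb(heapUsed)} external=${mb(external)}`);
}, 5000);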

@asilvas
Author

asilvas commented Sep 21, 2017

Yes, RSS is the main indicator. But I also look at buff/cache and available memory to better understand "free-able" memory, and those indicate this memory is never released back. The memory doesn't appear to reside in the V8 memory space, as indicated by dozens of heap dump tests.

Thanks for the links to the other issues that seem connected. I'm not entirely sure the issue is fully understood, though, and I'm not convinced it's an issue with sharp either, but I'm hoping we can work around the problem as the impact is quite significant. We're using ~5x the memory we should be, which becomes a big deal when you're serving (many) millions of requests/day.

I'll continue investigating the related cases. So far the only workaround I've found is to avoid using toBuffer.
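For context, toBuffer holds the whole encoded result in a JS Buffer; a hedged sketch of the streaming alternative, with file names used only as placeholders:

const fs = require('fs');
const sharp = require('sharp');

// sharp() with no input acts as a transform stream, so the encoded output can be
// piped straight to a writable stream instead of buffered via toBuffer()
fs.createReadStream('input.jpg')
  .pipe(sharp().resize(800).jpeg())
  .pipe(fs.createWriteStream('output.jpg'));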

@lovell
Owner

lovell commented Sep 21, 2017

It's worth subscribing to nodejs/node#1671 for V8 updates that will improve GC of Buffer objects.

If you're not already doing so, you might want to experiment with a different memory allocator such as jemalloc. You'll probably see less fragmentation, but that's still dealing with the effect rather than the cause.

@asilvas
Author

asilvas commented Sep 21, 2017

Will do, thanks.

Probably no surprise, but I was at least able to correlate usage with the concurrency setting.

In my isolated (sharp-only) test these were the findings:

0 concurrency: 276MB (should be detecting 4 in my local setup)
1 concurrency: 190MB
4 concurrency: 276MB
8 concurrency: 398MB (our prod env)
16 concurrency: 500MB
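
(For reference, the setting varied above is sharp's global libvips thread-pool size; a minimal sketch, with 4 chosen purely as an example value:)

const sharp = require('sharp');

sharp.concurrency(4);             // fix the libvips thread pool at 4 threads
console.log(sharp.concurrency()); // called with no argument, returns the current setting
// sharp.concurrency(0) resets to the default based on the detected CPUs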

@lovell
Owner

lovell commented Sep 22, 2017

Does the prod environment use 8 real CPU cores, or is this "vCPU" hyper-threading? If the latter, perhaps also experiment with halving concurrency to improve throughput (and reduce the memory effects).

@asilvas
Author

asilvas commented Sep 22, 2017

I am artificially limiting cores to keep memory in check, but at the cost of up to 30% slower response times. Temporary test.

@asilvas
Author

asilvas commented Sep 25, 2017

Feel free to close this as a duplicate of the others. But from everything I've learned of the problem so far, there doesn't seem to be any conclusive evidence that this is a Node and/or V8 issue. From the symptoms of the isolated tests I've run (as well as others'), it does seem to be an issue with sharp or vips, since run-away memory growth of this nature isn't a common problem in the Node community. I was able to verify this is not a case of GC'd memory not being released back to the OS: the memory was in use, and the reduction in available memory eventually resulted in memory allocation failures. But as I said, nothing conclusive either way.

I tried to investigate ways to resolve or work around the problem within sharp but was unsuccessful -- hopefully someone with more expertise in V8 native modules will have better luck.

@lovell
Owner

lovell commented Sep 25, 2017

"I was able to verify this is not a case of GC'd memory not being released back to the OS"

Could memory fragmentation explain this?

@asilvas
Author

asilvas commented Sep 25, 2017

I haven't proven or disproven that theory. But given the modest number of objects being processed to reach such high memory usage, it would take some pretty severe fragmentation to explain this.

Is your thought that the fragmentation is in V8 or in native space?

@lovell
Owner

lovell commented Sep 26, 2017

Are you using Node 8? If not, do you see the same RSS levels with it?

Were you able to try jemalloc? It provides useful debugging via malloc_stats_print.

Given you're using CentOS, have you tried disabling transparent huge pages?

@asilvas
Author

asilvas commented Sep 26, 2017

Not using Node 8 in prod, but yes, I was able to reproduce similar RSS levels (with 8.5.0) in the isolated test.

It might be a while before I can look into the other options, but I'll keep them in mind, thanks.

@lovell
Owner

lovell commented Oct 15, 2017

If sharpen or blur operations are being used then the small leak fixed in https://github.com/jcupitt/libvips/issues/771 may be related here.

@asilvas
Author

asilvas commented Oct 15, 2017

The test sample for this topic doesn't use those two operations so probably unrelated. But we do use them on occasion, thanks!

@trev-dev

I've found that running my sharp modules in a child_process spawn that exits once it's completed works really well for me. It keeps the memory load down.
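
(A minimal sketch of this approach; the script name, image paths and argument handling are hypothetical:)

// --- resize-child.js (hypothetical): does one piece of sharp work, then exits,
// returning all native memory to the OS along with the process ---
const sharp = require('sharp');
const [input, output, width] = process.argv.slice(2);
sharp(input)
  .resize(parseInt(width, 10))
  .toFile(output)
  .then(() => process.exit(0))
  .catch((err) => { console.error(err); process.exit(1); });

// --- parent process (hypothetical): spawn the child per job ---
const { execFile } = require('child_process');
execFile('node', ['resize-child.js', 'in.jpg', 'out.jpg', '800'], (err) => {
  if (err) console.error('resize failed:', err);
});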

@asilvas
Author

asilvas commented Nov 24, 2017

I was hoping to avoid spawning a child process, but it's something I had considered as well. It's manageable at the moment, so I'm holding out for now.

@lovell
Owner

lovell commented Dec 19, 2017

@asilvas The sharp tests just revealed a memory leak on one possible libvips error path when using toBuffer (or pipe output) with JPEG output - see https://github.com/jcupitt/libvips/pull/835

@asilvas
Author

asilvas commented Dec 19, 2017

Excellent find (and fix), @lovell! Thanks, these sorts of fixes make a big difference when processing millions of images. Any idea when this fix will be available?

@lovell
Owner

lovell commented Dec 19, 2017

@asilvas The next libvips v8.6.1 patch release should contain this fix, which then allows the release of sharp v0.19.0.

@kishorgandham

@lovell
Owner

lovell commented Feb 3, 2018

@asilvas Are you seeing an improvement with the latest libvips/sharp?

@asilvas
Author

asilvas commented Feb 3, 2018

In testing, will let you know next week.

@vinerz

vinerz commented Feb 7, 2018

I am currently having the same issue.

Stack information:
Heroku on Ubuntu Server 16.04.3
Node 9.5.0

Library versions:
LIBVIPS=8.6.2
LIBWEBP=0.6.1
LIBTIFF=4.0.9
LIBSVG=2.42.2
LIBGIF=5.1.4

Sharp:
Version: 0.19.0

I also use a lot of JPEG toBuffer.

[screenshot: 2018-02-07 04:19:27]

[screenshot: 2018-02-07 04:19:47]

Even under stress the V8 heap doesn't change, but the non-heap memory grows consistently on every request.

Before the update, my memory usage was steady at 300MB using libvips 7.42.3 and sharp 0.17.1.

@asilvas
Author

asilvas commented Feb 7, 2018

Seeing similar results after ~24 hours in production:

[screenshot]
[screenshot]

Overall memory usage patterns seem to be a bit improved, but still far higher than I'd expect (eventually reaching 2GB, perhaps related to the prior suggestions). I have noticed some perf improvements overall, though that could be due to the relatively short life of the new containers. I might revisit doing some more memory profiling at some point, but it'll have to wait for now.

I'll let you know if any new data surfaces.

@vinerz

vinerz commented Feb 7, 2018

@asilvas would you mind sharing the throughput of your servers, and whether they are processing many different images?

@asilvas
Author

asilvas commented Feb 7, 2018

@vinerz We generate over 20 million images per day, from millions of source images. Overall throughput is much higher, but that part is unrelated to this topic. Powered by https://github.com/asilvas/node-image-steam, and of course sharp+libvips.

@vinerz

vinerz commented Feb 7, 2018

Thanks for the answer!

Based on this information, I can see that my leak is growing much, much faster than yours, even though I'm manipulating only 200 thousand images per day from around 50 thousand different sources.

It might be related to the fact that I use toBuffer a lot in my code due to filters / resizing chains.

I'll try disabling libvips cache to see what happens.

@asilvas
Author

asilvas commented Feb 7, 2018

We use toBuffer for the final result of every image as well: https://github.com/asilvas/node-image-steam/blob/b96c3d39bc7b125f552b1cef0d1dfa05be3b488e/lib/processor/processor.js#L103

Our sharp options include:

cache: false
concurrency: 4
simd: true

I've toyed with the options quite a bit in the past, but it might be worth revisiting them with the recent changes/fixes.
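
(Those options map onto sharp's global helpers; a minimal sketch of applying them at startup:)

const sharp = require('sharp');

sharp.cache(false);   // disable the libvips operation cache
sharp.concurrency(4); // fixed libvips thread-pool size
sharp.simd(true);     // allow SIMD acceleration where available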

@vinerz

vinerz commented Feb 8, 2018

I modified the core app flow to use a single sharp object and changed all the toBuffer chains to a single stream piped directly to the Express response, but I'm getting the same memory results. It might be related to something else.

Currently using cache: false and concurrency: 2
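
(A hedged sketch of the stream-to-response flow described above; the route, source path and port are placeholders:)

const express = require('express');
const sharp = require('sharp');

const app = express();

app.get('/image/:width', (req, res) => {
  res.type('image/jpeg');
  sharp('source.jpg')                        // placeholder source image
    .resize(parseInt(req.params.width, 10))
    .jpeg()
    .on('error', () => res.status(500).end())
    .pipe(res);                              // stream out; no intermediate Buffer
});

app.listen(3000);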

@lovell
Owner

lovell commented Feb 9, 2018

@asilvas Thank you for the detailed updates!

@vinerz Your comments mention "Node 9.5.0" and "Before the update... using libvips 7.42.3 and sharp 0.17.1". I suspect you were using a different version of Node "before the update" too. If so, does returning to the previous version make any difference?

@egekhter

egekhter commented Jul 23, 2020

> @egekhter Whilst Node.js Worker Threads will probably help the single-threaded world of jimp, they won't offer much to help the multi-threaded world of sharp/libvips and can cause greater heap fragmentation faster.
>
> 30x worker threads each spawning 4x libuv threads each spawning 16x (c5.4xlarge vCPU) libvips threads is a concurrency of 1920 threads all allocating/freeing memory from the same pool.
>
> If you'd like to "manage" concurrency via Worker Threads, then try setting sharp.concurrency to 1 so libvips doesn't also try to do so.

Solved my problem with your help.

sharp.concurrency(1);

Up to 70x child processes and still have plenty of memory left.

Thanks for all your work!

[screenshot: 2020-07-23 4:07:50 PM]
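
(For anyone following the Worker Threads route quoted above, a minimal sketch; the file name and message shape are hypothetical, with sharp.concurrency(1) set inside the worker as suggested:)

// --- resize-thread.js (hypothetical worker) ---
const { parentPort } = require('worker_threads');
const sharp = require('sharp');

sharp.concurrency(1); // let the worker threads provide the parallelism, not libvips

parentPort.on('message', async ({ input, width }) => {
  const data = await sharp(input).resize(width).jpeg().toBuffer();
  parentPort.postMessage(data); // arrives in the main thread as a Uint8Array copy
});

// --- main thread (hypothetical) ---
const { Worker } = require('worker_threads');
const worker = new Worker('./resize-thread.js');
worker.once('message', (jpeg) => console.log('resized bytes:', jpeg.length));
worker.postMessage({ input: 'test.jpg', width: 800 });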

@lovell
Owner

lovell commented Mar 10, 2021

Please see #2607 for a change in the default concurrency for glibc-based Linux users that will be in v0.28.0.

@FoxxMD

FoxxMD commented May 5, 2022

Not related... but @vinerz, what application were you using to monitor memory in this comment? Is that Heroku's dashboard?

@vinerz

vinerz commented May 5, 2022

> Not related... but @vinerz, what application were you using to monitor memory in this comment? Is that Heroku's dashboard?

Hey @FoxxMD, that's New Relic's application monitor for node 😄

@alcidesbsilvaneto

> I switched from Debian to Alpine (and admittedly Node 11 to 12): [screenshot]
>
> Thanks to everyone who suggested this as a fix!

Still the solution for me in 2023.

daniellockyer added a commit to TryGhost/SDK that referenced this issue May 8, 2023
refs lovell/sharp#955 (comment)

- we've seen Ghost hogging memory whenever images are uploaded
- it seems to be due to an issue with memory fragmentation, because a
  heap snapshot of a container after the memory grows shows nothing in
  JS code
- we've been using jemalloc in production but it still seems to occur
- this change has been suggested on the referenced thread in order to
  improve fragmentation on top of using jemalloc
- can easily revert if it causes issues
@nandi95

nandi95 commented Sep 3, 2023

Went from:
[screenshot: 2023-09-03 21:33:26]
to:
[screenshot: 2023-09-03 23:48:59]

It's a definite improvement.

@rambo-panda

rambo-panda commented Dec 28, 2023

const sharp = require("sharp");

// Kick off 100 concurrent rotate-and-encode operations on the same source image
(async () => {
    await Promise.all([...Array(100)].map(() => sharp("./test.webp").rotate(90).toBuffer()));
})();
  • use libjemalloc.so, cache: true, concurrency: 1
    [screenshot]

  • use libjemalloc.so, cache: false, concurrency: 1
    [screenshot]

  • use libjemalloc.so, cache: true, concurrency: 16
    [screenshot]

  • use libjemalloc.so, cache: false, concurrency: 16
    [screenshot]

  • use glibc, cache: false, concurrency: 1
    [screenshot]

  • use glibc, cache: true, concurrency: 1
    [screenshot]

  • use glibc, cache: false, concurrency: 16
    [screenshot]

  • use glibc, cache: true, concurrency: 16
    [screenshot]
