
Memory leak #21

Closed
pierreinglebert opened this issue Apr 23, 2014 · 40 comments

@pierreinglebert
Contributor

I ran a batch using sharp on ~1000 files and it ended up using >300MB of RAM.

I created a little gist to "demonstrate" it: it downloads a large image and creates 800 thumbnails at different sizes.
On my MacBook Pro, when finished, the process is using 120MB of RAM (more if I increase the number of generated thumbnails).
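
The gist itself isn't reproduced here, but the shape of the test is roughly this (a minimal sketch using sharp's chainable API; the filename, sizes and count are illustrative):

var sharp = require('sharp');

// Generate 800 thumbnails at different sizes from one large source image
var remaining = 800;
for (var i = 0; i < 800; i++) {
  (function (width) {
    sharp('large-input.jpg')
      .resize(width, width)
      .toFile('thumb-' + width + '.jpg', function (err) {
        if (err) throw err;
        if (--remaining === 0) {
          console.log(process.memoryUsage()); // rss stays high once the batch completes
        }
      });
  })(100 + i);
}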

@lovell
Owner

lovell commented Apr 23, 2014

Are you able to share the gist, or a gist of the gist?

libvips manages its own internal file and operation cache, defaulting to a soft limit of 100MB. This soft limit can be changed via sharp.cache(limit). To check whether this is the source of the heap (mis-)usage, try setting it to 0 and see if the numbers reduce.
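
For example, to rule the cache out (a minimal sketch; per the above, sharp.cache(limit) sets the soft limit in MB):

var sharp = require('sharp');

// Disable libvips' operation cache before running the batch,
// then compare process.memoryUsage().rss with and without it
sharp.cache(0);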

Are Buffers involved? If so, heap memory allocated for Buffer objects may remain in unused virtual memory that V8 has decided it might need in the future. You could try Node's --expose-gc start-up flag, which exposes the global.gc() method; calling it forces V8 to free/compact unreferenced objects on the heap.
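
A minimal sketch of that check (the script name is illustrative):

// Run with: node --expose-gc memleak.js
// After the batch completes, force a GC and see how much memory remains in use
if (global.gc) {
  global.gc();
}
console.log(process.memoryUsage());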

Do you see the same problems with both Node 0.10 and 0.11? If things are better with 0.11 then you could have run into one of the problems targeted by nodejs/node-v0.x-archive#5720.

@pierreinglebert
Contributor Author

I created the gist just for this issue but forgot to paste the link :/ (I've updated my post.)

It seems to be Buffer-related: if I return an empty Buffer, it doesn't seem to leak.
I haven't investigated this deeply; I created the issue in case you had some leads.

I hit it on 0.10; I'll try node master.

@pierreinglebert
Contributor Author

Well, it's worse on node master (150MB vs 120MB), even after calling global.gc().

I think we need to go deeper.

@lovell
Owner

lovell commented Apr 24, 2014

Thanks for all the extra info Pierre, I've been able to reproduce this locally.

I see the process-level rss climbs approximately in line with the size of the generated Buffers whilst the V8-level heapUsed and heapTotal remain stable.
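
The kind of measurement involved looks like this (a sketch, not the exact instrumentation used):

// Log process-level vs V8-level memory every 5 seconds while the batch runs
var timer = setInterval(function () {
  var mem = process.memoryUsage();
  console.log(
    'rss=' + Math.round(mem.rss / 1048576) + 'MB',
    'heapTotal=' + Math.round(mem.heapTotal / 1048576) + 'MB',
    'heapUsed=' + Math.round(mem.heapUsed / 1048576) + 'MB'
  );
}, 5000);
timer.unref(); // don't let the logging keep the process alive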

I'll experiment with the version of NanNewBufferHandle that takes a FreeCallback to discover if the node::Buffer created at sharp.cc:309 is being correctly freed at GC time.

Failing that, valgrind is our friend.

@lovell lovell added the bug label Apr 24, 2014
@lovell
Owner

lovell commented Apr 24, 2014

Some initial findings...

Even when libvips cache limit is set to zero, it always holds on to at least one operation. I'll check with John to see if this is a feature or a bug.

This cache might account for up to half of the Resident Set Size (RSS) value.

I've run memleak.js through valgrind and there are a few small leaks I need to take a closer look at:

==27133== LEAK SUMMARY:
==27133== definitely lost: 5,299 bytes in 43 blocks
==27133== indirectly lost: 60,920 bytes in 87 blocks
==27133== possibly lost: 70,543 bytes in 135 blocks
==27133== still reachable: 7,290,817 bytes in 20,722 blocks

The RSS value can include previously allocated/used memory yet to be returned to the OS, which might not always mean there's a leak. Perhaps try running other memory-intensive processes at the same time to see if the RSS value drops.

@lovell
Owner

lovell commented Apr 25, 2014

I'll attempt to fix the inability to disable the operation cache via https://github.com/jcupitt/libvips/issues/119.

Another thought: the RSS value also includes shared libraries that could be hogging/leaking memory. For example, libvips uses liborc (the Optimized Inner Loop Runtime Compiler), which was updated last week (v0.4.19) to "fix many memory leaks".

@lovell
Owner

lovell commented May 6, 2014

The liborc library v0.4.18 is definitely leaking memory.

Valgrind test run with libvips compiled to use ORC v0.4.18:

==22480== LEAK SUMMARY:
==22480== definitely lost: 7,443 bytes in 124 blocks
==22480== indirectly lost: 160,734 bytes in 158 blocks

Valgrind test run with libvips --without-orc:

==19249== LEAK SUMMARY:
==19249== definitely lost: 4,563 bytes in 104 blocks
==19249== indirectly lost: 43,578 bytes in 118 blocks

Disabling liborc, or better still upgrading it to v0.4.19, is highly recommended.

@lovell lovell added triage and removed bug labels May 20, 2014
@jcupitt
Contributor

jcupitt commented Jun 2, 2014

It looks like orc-0.4.19 is freeing too much memory :-( see jcupitt/libvips@d990f5c

Apply that patch and live with the small leak, or disable orc.

@lovell
Owner

lovell commented Jun 2, 2014

Thanks John, are you planning to file a bug report upstream?

I'm going to recommend disabling liborc for now.

@jcupitt
Contributor

jcupitt commented Jun 2, 2014

Yes, I need to test against their git master and maybe make a tiny test case.

@jcupitt
Contributor

jcupitt commented Jun 4, 2014

I've managed to make a test-case for the orc crash and filed a bug upstream:

https://bugzilla.gnome.org/show_bug.cgi?id=731227

Follow that to track progress on this issue (hopefully).

Perhaps this should be split out into a separate issue in sharp; this is no longer really a memory leak.

@jcupitt
Contributor

jcupitt commented Jun 5, 2014

They've fixed the bug in orc, I now see clean runs with their patch. orc-0.4.20 should include this fix.

vips master now frees orc programs again, so the situation is:

  • vips master plus orc pre-0.4.19 will leak memory
  • vips master plus orc-0.4.19 will crash
  • vips master plus orc-0.4.20 and later will not crash and will have few leaks

@lovell
Owner

lovell commented Jun 8, 2014

Thank you @jcupitt. I'll wait until liborc v0.4.20 is available via the usual package manager suspects before recommending its use again.

@lovell
Owner

lovell commented Aug 26, 2014

Found three more leaks we can fix, one covered by 0d89131 and two by https://github.com/jcupitt/libvips/pull/164.

Everything else seems to be caused by Node-land code not freeing unreferenced Buffer objects during garbage collection.

lovell added a commit that referenced this issue Aug 26, 2014
@lovell
Owner

lovell commented Aug 26, 2014

The change in commit bac367b is to notify the underlying V8 engine of libvips' cache via its AdjustAmountOfExternalAllocatedMemory method.

This provides it with a more accurate picture of memory usage, ensuring garbage collection occurs at more relevant frequencies to prevent paging.

@lovell
Owner

lovell commented Aug 28, 2014

@pierreinglebert are you able to re-run your tests using the latest master branch of both libvips and sharp to ensure we've mopped up the spillages?

In addition, liborc v0.4.22 is now available and I can see a few recent commits to fix memory leaks.

lovell added a commit that referenced this issue Sep 2, 2014
Runs unit tests via valgrind
@lovell
Owner

lovell commented Sep 2, 2014

The unit tests can now be run via valgrind to check for memory leaks:

cd tests
./leak.sh

It uses a combination of libvips' suppressions file and its own, based on my research into this issue. Node.js, at least v0.10, doesn't GC the heap before exiting, hence all the lingering Buffer allocations.

Thanks for reporting this @pierreinglebert. I'm going to close this task now but feel free to re-open if you're still having problems.

@lovell lovell closed this as completed Sep 2, 2014
@eschwim

eschwim commented Sep 19, 2014

I'm having much the same problem @pierreinglebert was, however I'm not using a libvips version compiled with orc.

Running this small test script with 1M iterations (instead of the listed 1K) against a 2855x2855 JPEG will typically result in an OOM error after about 10 minutes (this is on an x86_64 CentOS 6.4 VM with 4 cores and 2GB of memory).

I've tried this with both libvips-7.40.8 and the latest libvips commit (compiled with the default configuration), as well as with sharp-0.6.2 and the latest sharp commit. I've run valgrind against the script, using your leak.sh as a template, and come up with these results.

@lovell
Owner

lovell commented Sep 19, 2014

Thanks @eschwim, I'll take a closer look. The consistency of "24,000 bytes in 1,000 blocks" involving vips_cache_operation_buildp seems like a good place to start.

@lovell lovell reopened this Sep 19, 2014
@jcupitt
Contributor

jcupitt commented Sep 19, 2014

libvips caches the most recent 1,000 operations by default; I wonder if that's part of it. You can use vips_cache_set_max() to change this:

http://www.vips.ecs.soton.ac.uk/supported/7.40/doc/html/libvips/VipsOperation.html#vips-cache-set-max

I can't remember which of the cache-limiting options sharp uses at the moment. It shouldn't be using 24kB per operation though; that seems much too high.

@lovell
Owner

lovell commented Sep 19, 2014

Hi @jcupitt, thanks for helping here. The cache max defaults are 100MB and 500 ops in sharp - see https://github.com/lovell/sharp/blob/master/src/sharp.cc#L851-L852

@eschwim's script calls sharp.cache(50, 200) to reduce this to 50MB and 200 ops.
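
For reference, that call looks like this (a sketch using the two-argument form mentioned above):

var sharp = require('sharp');

// Reduce libvips' cache from the defaults (100MB / 500 operations)
// to 50MB and 200 operations
sharp.cache(50, 200);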

@jcupitt
Contributor

jcupitt commented Sep 19, 2014

Ah OK. I wonder why it's exactly 1,000? It seems odd that it's such a neat number.

@eschwim

eschwim commented Sep 19, 2014

Just in case you want to repro, here's the exact valgrind call I made:

curl -O https://raw.githubusercontent.com/jcupitt/libvips/master/libvips.supp
curl -O https://raw.githubusercontent.com/lovell/sharp/master/tests/sharp.supp

G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind \
  --suppressions=libvips.supp --suppressions=sharp.supp \
  --leak-check=full --show-leak-kinds=definite,indirect,possible \
  node test_sharp_leak.js

I was using:
valgrind-3.9.0
node-0.10.32
libvips-6930318
sharp-ac85d88

With this input image

@lovell
Owner

lovell commented Sep 20, 2014

I've added a comment about a possible source of this leak in jcupitt/libvips@862d7f0

@jcupitt
Contributor

jcupitt commented Sep 20, 2014

There was an embarrassing double allocate, but that's only been there for 24 hours, and would only leak (I think) 20 bytes per operation. There's probably something nastier happening as well, unfortunately.

@lovell
Owner

lovell commented Sep 20, 2014

Thanks John, valgrind is no longer reporting these problems for me after your fix.

g_new( VipsOperationCacheEntry, 1 ) allocates 24 bytes and Eric's script runs 1,000 iterations: 24 × 1,000 = 24,000 bytes in 1,000 blocks, which is exactly what valgrind reported.

@eschwim please can you try again against libvips 7.40?

@jcupitt
Contributor

jcupitt commented Sep 20, 2014

I tried a couple of things. I ran vipsthumbnail from 7.40 head under valgrind and got a clean run with a 10k x 10k JPEG.

I then modified vipsthumbnail: I added vips_cache_set_max(0) and changed the main loop to run for 100000 iterations on the first argument. I set it going on a 10k x 10k jpeg and watched RES in top:

        loop    RES
        10      46000
        100     96000
        200     111000
        300     117000
        400     116000
        600     140000
        700     142000
        900     148000
        1200    150000
        1700    153000
        2400    155000
        2700    155000
        3000    156000
        4300    161000
        4600    161000
        4900    162000
        5900    163000
        6100    162000
        6300    162000

I tried with a 1k x 1k jpeg as well, which was a bit quicker:

        loop    RES
        1200    18832
        3300    18872
        5000    19012
        7000    19128
        9000    19200
        12000   19200
        16000   19236
        25000   19236
        30000   19236
        40000   19236

Which looks like the behaviour you'd expect from memory fragmentation.

@lovell
Owner

lovell commented Sep 20, 2014

Thanks John. Have you considered replacing occurrences of g_new(type, 1) with g_slice_new(type) (matched with g_slice_free instead of g_free) for "small" objects within libvips? It would be interesting to see the effect, if any, on performance.

@jcupitt
Contributor

jcupitt commented Sep 20, 2014

... I tried again with the cache enabled and it stabilised at 136012 after 2000 iterations. This is with Orc disabled.

@jcupitt
Contributor

jcupitt commented Sep 20, 2014

The standard linux malloc() already does slab allocation and has a magazine layer, so I'm not sure it would make a large difference there. And vips tries hard to not call malloc() and free() during processing, so it could only reduce setup cost.

The standard Windows malloc is extremely slow and primitive (because of compatibility problems) so it could well be useful. I'll add a note to the TODO.

@eschwim

eschwim commented Sep 20, 2014

Using libvips-26e9248:

==1569== LEAK SUMMARY:
==1569== definitely lost: 0 bytes in 0 blocks
==1569== indirectly lost: 0 bytes in 0 blocks
==1569== possibly lost: 0 bytes in 0 blocks
==1569== still reachable: 15,686,883 bytes in 216,645 blocks
==1569== suppressed: 5,627,727 bytes in 34,588 blocks

Yay! :) Thanks, guys!

@jcupitt
Contributor

jcupitt commented Sep 20, 2014

@lovell found it, but yes, good result!

@lovell
Owner

lovell commented Sep 20, 2014

Thanks all, closing this problem.

John: If you're not aware of it, jemalloc might be worth looking at if you ever go down the g_new replacement route, especially if Windows performance is of concern.

Coincidentally, Node.js 0.10 has its own slab allocator but this has been removed in 0.11 unstable (and the future 0.12 stable).

@lovell lovell closed this as completed Sep 20, 2014
@eschwim

eschwim commented Sep 20, 2014

Although, sadly, the original problem I was having still persists. Replacing the for loop in the original test script with a while(1) loop still results in the process eventually being reaped by the OOM killer. You can see the RSS balloon below:

[eric@test-box ~]$ node ./test_sharp_leak.js &
[eric@test-box ~]$ while [ 1 ]; do ps -eF | grep '[l]eak' | awk '{print $6}'; sleep 60; done
87188
405588
650688
891764
1129072
1368184
1623400
1690640 (1.69GB)
... etc. until the process is killed:

[eric@test-box ~]$ dmesg | tail -2
Out of memory: Kill process 4524 (node) score 909 or sacrifice child
Killed process 4524, UID 0, (node) total-vm:7601880kB, anon-rss:1587876kB, file-rss:852kB

@lovell
Owner

lovell commented Sep 21, 2014

Hi @eschwim, using while(1) in your script will build a queue within libuv containing (tens of) millions of instances of resize_baton to process. This will eventually exhaust memory.

You can monitor queue depth with sharp.counters().

Remembering that with the asynchronous power of Node.js comes great responsibility, perhaps try using async.parallelLimit(...) in your test.
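
A sketch of that bounded approach, assuming the async module (the filename, sizes and concurrency limit are illustrative):

var async = require('async');
var sharp = require('sharp');

// Build one task per thumbnail, then run at most 4 resizes at a time
var tasks = [];
for (var i = 0; i < 1000; i++) {
  (function (width) {
    tasks.push(function (done) {
      sharp('input.jpg').resize(width, width).toFile('thumb-' + width + '.jpg', done);
    });
  })(100 + i);
}

async.parallelLimit(tasks, 4, function (err) {
  if (err) throw err;
  // Both the queue and process counters should be back to zero here
  console.log('Done', sharp.counters());
});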

@eschwim

eschwim commented Sep 21, 2014

Ah yes, I definitely need to heed Uncle Ben's words more often. I'll give it a go; thanks!

@eschwim

eschwim commented Sep 22, 2014

Well, Uncle Ben is going to be very disappointed with me.

In an effort to debug my problems, I've attempted a couple of different ways of getting more synchronous behavior. But I have yet to come up with a solution that doesn't result in monotonic memory growth and eventual OOM death.

Perhaps more enlightening is this httpd script (adapted from your example). Querying this with a simple synchronous shell loop (yes | parallel -j1 curl -s localhost:8000 -o /dev/null) results in the same memory consumption behavior as the asynchronous loops. Also, the counter logs printed by the httpd script list at most one queued and one active process. This makes me wonder if there isn't something else going on behind the scenes that is eating up memory.

FWIW, I've taken a few heap dumps (using node-heapdump) and run them through the Chrome Dev Tools comparison tool, but nothing immediately jumps out at me.

Just for some background, I'm not doing this strictly as an academic exercise 😀 I have a web proxy that I've written (using sharp) that works fantastically, with the one exception that it falls over after a few million requests. 😢

@lovell
Owner

lovell commented Sep 23, 2014

Both the JS tests still create as many queued tasks as memory will allow before exhausting it. Perhaps try something like https://gist.github.com/lovell/f79476202c6cb9533f8e

With this example the RSS does slowly creep up, which could be caused by heap fragmentation.

cat /proc/pagetypeinfo provides some hints about low-level memory allocation.

You could try altering the MALLOC_MMAP_THRESHOLD_ environment variable, used for the heap vs mmap decision, from its 128*1024 default.

node-heapdump includes the V8-managed heap only, which probably won't help here.

@puzrin

puzrin commented Oct 26, 2015

FYI, we had to process 650,000 800x600 images and libvips segfaulted after 5 minutes until we disabled liborc in configure.

It's certainly worth automating the libvips build on install, or adding instructions to the README.

@lovell
Owner

lovell commented Oct 26, 2015

Thanks @puzrin - are you able to provide more details of the liborc-caused segfault, ideally with a stacktrace, over at #172?
