Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak when img onerror/onload callbacks reference img #1003

Merged
merged 1 commit into from
Sep 19, 2017

Conversation

zbjornson
Copy link
Collaborator

@zbjornson zbjornson commented Sep 16, 2017

Storing the onerror/onload callback functions in Persistent handles (here, as Nan::Callbacks) means they will never be GC'ed automatically; only when the Image destructor runs, which explicitly resets them. But, if those functions have references to the image, then the image will never be GC'ed. --> leak

I'm annoyed that I can't figure out how to do this from C++ after thinking about it all day. I have an open message on the v8-users list to see if there's a way, but this method also removes 87 lines of code. The only benefit to doing it in C++ is that we can type check the value and only set it if it's a function. Up to you if you want to wait a bit to see if someone responds.

Fixes #785
Fixes #150 (for real)


Code that repros the leak:

const fs = require("fs");
const Canvas = require(".");

const imgsrc = fs.readFileSync("./test/fixtures/clock.png");

const canvas = new Canvas.Canvas(15000, 15000);
const ctx = canvas.getContext("2d");

var invocations = 0;

// Causes leak:
for (var k = 0; k < 5; k++) {
  gc();
  console.log(process.memoryUsage().rss);
  for (var i = 0; i < 1000; i++) {
    const img = new Canvas.Image();
    img.onload = _ => {
      if (img.width > 0) invocations++;
      // img.src= null; // frees a bunch of memory but doesn't actually invoke ~Image
    }
    img.src = imgsrc;
  }
}

// Doesn't cause leak:
// for (var k = 0; k < 5; k++) {
//   gc();
//   console.log(process.memoryUsage().rss);
//   for (var i = 0; i < 10000; i++) {
//     const img = new Canvas.Image();
//     img.onload = _ => invocations++;
//     img.src = imgsrc;
//   }
// }

In master:
24535040
466857984
908058624
1347592192

This PR:
24571904
26157056
26300416
25919488
24920064

Storing the onerror/onload callbacks in Persistent handles means they will never be GC'ed. If those functions have references to the image, then the image will never be GC'ed either.

Fixes Automattic#785
Fixes Automattic#150 (for real)
@zbjornson
Copy link
Collaborator Author

I think I understand this better now (or I'm even more confused and this is all wrong, but at the moment I've convinced myself). The first paragraph of my OP is correct and explains the leak. In addition...

node-canvas's img.src= leads to synchronous invocation of onerror or onload. In browsers it's async, even if you assign a data: URL. Two consequences of this are that (a) onload/onerror must be assigned before img.src=, and (b) using onload is actually unnecessary because when img.src= returns, the image is loaded.

A third consequence is that img will still be in scope and not eligible for GC after the src is loaded. (That's why this PR works.) If the callbacks were changed to be invoked asynchronously, it would go out of scope. The solution there that avoids the leak would be to store a persistent handle containing the img.onload value when img.src = ... is invoked, and then reset the handle when onload is invoked.

Summary:

  • This PR is good to go. Even if we decide to make the callbacks async sometime, the old code was wrong. The only slightly gross thing about this PR is that the property is set directly on the object instead of through a setter that validates that the argument is a function, and instead I'm checking that it's a function right before invoking it. A JS setter would end up exposing some other property like img._onload; only way around this is a C++ setter AFAIK.

  • We should decide before 2.0 lands if img.src= should become async to align with browsers.

@LinusU
Copy link
Collaborator

LinusU commented Sep 19, 2017

I really think that it should be asynchronous, since in most cases it is performing blocking io as it is now. It will also be unacceptable to be synchronous when/if we support .src = 'http://...'

I think that the minimum we should to before releasing 2.0 is to just add process.nextTick or similar to delay the callbacks, that way it will at least behave asynchronous for the user; and we would be able to land proper async without breaking something...

I think this would break a lot of existing code based on how many github issues show code that assumes it's sync

This is a bit unfortunate but I think 2.x is a good time to break things. In the end, I hope it will lead to better practices...

There's also the performance aspect: the async turn adds about a millisecond for an idle node.js process, longer for active processes.

I feel that this comes down to io vs cpu. My opinion is that io work should be async and cpu work should be sync. That callbacks adds a bit of overhead is just how Node.js works, the same can be said for e.g. fs.readFile, fs.readFileSync is faster, but it's not concurrent which can yield more throughput when looking at the bigger picture...

I'd be a strong advocate for adding a non-standard sync API like [...]

Sounds good 👍

@LinusU LinusU merged commit 77db631 into Automattic:master Sep 19, 2017
@zbjornson zbjornson deleted the memleak2 branch September 19, 2017 16:13
@zbjornson
Copy link
Collaborator Author

Yeah, that's most logical, I agree. Will open another ticket for it.

@fuzhenn
Copy link

fuzhenn commented Nov 6, 2017

Thx for the excellent works!
We met this memleak problem in our production env
@LinusU not in a hurry, any plan to publish a new version to fix this recently?

@LinusU
Copy link
Collaborator

LinusU commented Nov 6, 2017

published as 2.0.0-alpha.6

@zbjornson zbjornson mentioned this pull request Jan 30, 2018
@zbjornson zbjornson mentioned this pull request Apr 9, 2018
@zbjornson zbjornson mentioned this pull request Feb 2, 2019
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak in Image Image Memory Leak
3 participants