Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zlib inflate() holds and reuses buffer until program exit, is never freed limiting memory for other uses. #45303

Open
nitram-work opened this issue Nov 3, 2022 · 13 comments
Labels
zlib Issues and PRs related to the zlib subsystem.

Comments

@nitram-work
Copy link

nitram-work commented Nov 3, 2022

Version

v16.17.1

Platform

Linux 5.15.0-52-generic #58~20.04.1-Ubuntu SMP Thu Oct 13 13:09:46 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

zlib

What steps will reproduce the bug?

Code snippet (run with --expose-gc)

const { exit } = require('process');
const { deflateSync, inflate } = require('zlib');

const MEGABYTES = 1024 * 1024;

if (!global.gc) {
  console.error('Please run with `node --expose-gc`');
  exit(1);
}

const main = async () => {
  process.on('exit', () => gcAndLogMem('at exit handler'));

  createAndFreeBuffer(2000 * MEGABYTES);

  const compressed = compressBuffer(1000 * MEGABYTES);

  // Decompressing or compressing-freeing several times reuses zlib reserved buf
  for (let i = 0; i < 3; i++) {
    gcAndLogMem('before decompress');
    await decompressBuffer(compressed);
    gcAndLogMem('after decompress');
  }

  createAndFreeBuffer(2000 * MEGABYTES);
  gcAndLogMem('before exit');
};

// Create a very compressible pattern buffer to make sure it is allocated.
const createFilledBuffer = size => {
  const buf = Buffer.alloc(size);
  for (let i = 0; i < size; i++) buf[i] = Math.floor(i / 1000) & 0xff;
  return buf;
};

const createAndFreeBuffer = size => {
  console.log('create buffer {');
  let buf = createFilledBuffer(size);
  gcAndLogMem('  created buffer');
  buf = null;
  gcAndLogMem('  freed buffer');
  console.log('} create buffer');
};

const compressBuffer = size => {
  console.log('compress {');
  let input = createFilledBuffer(size);
  gcAndLogMem('  fill input');
  const compressed = deflateSync(input);
  gcAndLogMem('  deflated');
  input = null;
  gcAndLogMem('  freed input');
  console.log('} compress');
  return compressed;
};

const decompressBuffer = async compressed => {
  console.log('decompress {');
  let output = null;
  await new Promise(resolve => {
    inflate(compressed, (err, decompressed) => {
      output = decompressed;
      gcAndLogMem('after inflate');
      resolve();
    });
  });
  gcAndLogMem('  after promise');
  output = null;
  gcAndLogMem('  free output');
  console.log('} decompress');
};

const gcAndLogMem = m => {
  global.gc();
  global.gc();
  console.log(
    `${m.padEnd(16, '.')} GC CALL/EXT MEM= ${(
      process.memoryUsage().external / MEGABYTES
    ).toFixed(1)}MB`,
  );
};

main();

How often does it reproduce? Is there a required condition?

Always

What is the expected behavior?

Memory should lower back to near zero after finishing each decompression routine/scope.

What do you see instead?

Memory never lowers back to "zero" until the exit handler, not allowing it's reuse for something else.
It's not a cumulative memory leak, but keeps it reserved forever.
In a system with 2.5 GB memory limit it would crash due to OOM, even if we could fit 2GB easily if that memory held by zlib was released. Try it: (ulimit -Sd 2500000; node --expose-gc ./script.js )

Script output unbounded
$ node --expose-gc script.js

create buffer {
  created buffer GC CALL/EXT MEM= 2000.3MB
  freed buffer.. GC CALL/EXT MEM= 0.3MB
} create buffer
compress {
  fill input.... GC CALL/EXT MEM= 1000.3MB
  deflated...... GC CALL/EXT MEM= 1003.3MB
  freed input... GC CALL/EXT MEM= 3.3MB
} compress
before decompress GC CALL/EXT MEM= 3.3MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB <--------- this internal buffer is never released
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB
before decompress GC CALL/EXT MEM= 1003.4MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB <--------- but it looks like it's reused by zlib
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB
before decompress GC CALL/EXT MEM= 1003.4MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB
create buffer {
  created buffer GC CALL/EXT MEM= 3003.4MB  <------ here crashes with limited memory 
  freed buffer.. GC CALL/EXT MEM= 1003.4MB
} create buffer
before exit..... GC CALL/EXT MEM= 1003.4MB
at exit handler. GC CALL/EXT MEM= 0.3MB

Script output boudned to 2500 MB (in Ubuntu, YMMV)
( ulimit -Sd 2500000 ; node --expose-gc script.js )

create buffer {
  created buffer GC CALL/EXT MEM= 2000.3MB
  freed buffer.. GC CALL/EXT MEM= 0.3MB
} create buffer
compress {
  fill input.... GC CALL/EXT MEM= 1000.3MB
  deflated...... GC CALL/EXT MEM= 1003.3MB
  freed input... GC CALL/EXT MEM= 3.3MB
} compress
before decompress GC CALL/EXT MEM= 3.3MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB
before decompress GC CALL/EXT MEM= 1003.4MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB
before decompress GC CALL/EXT MEM= 1003.4MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB
create buffer {
at exit handler. GC CALL/EXT MEM= 1003.4MB
node:internal/buffer:959
    super(bufferOrLength, byteOffset, length);
    ^

RangeError: Array buffer allocation failed
    at new ArrayBuffer (<anonymous>)
    at new Uint8Array (<anonymous>)
    at new FastBuffer (node:internal/buffer:959:5)
    at Function.alloc (node:buffer:371:10)
    at createFilledBuffer (/home/xxx/src/script.js:32:22)
    at createAndFreeBuffer (/home/xxx/src/script.js:39:13)
    at main (/home/xxx/src/script.js:26:3)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)

Additional information

This was a related issue I previously raised and closed myself because it wasn't a leak and I was misguided by understanding how zlib worked #44750
But even now understanding the peak memory use would be 2x the inflated buffer (the chunks must be unified in another buffer) it still doesn't explain why it is not released.

I thought to open a new issue because the original title would be misleading. It's not a memory leak per se as it doesn't accumulate, but it nevers releases those buffers that might be needed by other parts of the program.

@VoltrexKeyva VoltrexKeyva added the zlib Issues and PRs related to the zlib subsystem. label Nov 3, 2022
@bnoordhuis
Copy link
Member

Is your complaint that the memory from this line isn't released until exit?

const compressed = compressBuffer(1000 * MEGABYTES);

If so, I don't understand why you expect it to be released earlier.

@Neustradamus
Copy link

To follow

@RaisinTen
Copy link
Contributor

Hi @Neustradamus, did you consider clicking on the subscribe button instead? That would allow you to follow the conversation without having to write a comment that unfortunately sends unnecessary notifications to everyone who is subscribed.

@bnoordhuis
Copy link
Member

I'll close this out seeing there's been no response from OP.

@bnoordhuis bnoordhuis closed this as not planned Won't fix, can't repro, duplicate, stale Dec 5, 2022
@nitram-work
Copy link
Author

I was out on vacation and then EOY celebrations. Please re-open.

Is your complaint that the memory from this line isn't released until exit?

const compressed = compressBuffer(1000 * MEGABYTES);

If so, I don't understand why you expect it to be released earlier.

The decompression buffer is never released until process exit, a 1000MB buffer.
The compressed buffer is just 3 megabytes and never released, that's correct but not the issue.

@bnoordhuis
Copy link
Member

bnoordhuis commented Jan 3, 2023

Not sure I follow. That buffer is 1,000 MB, not 3 MB big - edit: oh wait, you're saying it's 3 MB after compression, right?

In the "expected behavior" section, you fill out what you expect but not why. Can you elaborate on that?

@nitram-work
Copy link
Author

nitram-work commented Jan 3, 2023

I agree the syntax is confusing for compressBuffer. (I'll elaborate on why below after some context.)

The compressBuffer function:

  • creates a 1000MB random fill buffer
  • then compresses it to around 3MB
  • then releases the 1000MB correctly (as seen in the mem usage log)
compress {
  fill input.... GC CALL/EXT MEM= 1000.3MB
  deflated...... GC CALL/EXT MEM= 1003.3MB
  freed input... GC CALL/EXT MEM= 3.3MB
} compress

You can see the only memory left in use is 3MB (from the compressed buffer, again, correct)
But then after decompressing from 3MB back to 1000MB, I can never free that buffer back, ever.

If the original buffer were to be bigger, then the decompressed buffer that can't be freed is bigger.
Luckily that buffer seems to be re-used by zlib on each call, so it's not a cumulative memory leak.

To answer the why: If I want to use that memory for something else after some processing (in my case big PNG inflate and decode) I no longer have that memory available, it's bound to zlib.

before decompress GC CALL/EXT MEM= 3.3MB
decompress {
after inflate... GC CALL/EXT MEM= 2003.4MB
  after promise. GC CALL/EXT MEM= 2003.4MB
  free output... GC CALL/EXT MEM= 1003.4MB <--------- this internal buffer is never released
} decompress
after decompress GC CALL/EXT MEM= 1003.4MB

I understand decompression takes 2000MB because it needs 1000MB for the chunks list, and then another contiguous 1000MB to unify the chunks from the list after it's done decompressing. But then one of those two is greedily kept by zlib (to be reused later, apparently) but can never be released for other uses.

@nitram-work
Copy link
Author

(Updated previous comment with log references for clarity)

@nitram-work
Copy link
Author

Not sure I follow. That buffer is 1,000 MB, not 3 MB big - edit: oh wait, you're saying it's 3 MB after compression, right?

Yes

In the "expected behavior" section, you fill out what you expect but not why. Can you elaborate on that?

To further elaborate with a concrete example:

I'm constrained by memory limited containers 8GB.
Sometimes I need to serially decompress BIG PNGs to, let's say, a 7.5GB uncompressed buffer each, to recompress them into a different format.
But even after disposing their buffers, I will not be able to do something different afterwards, unrelated to de/compression, on the remaining 0.5GB free limited memory and the container will crash due to OOM, even though those 7.5GB should be available, but are held hostage by zlib for unknown reasons.

Please don't forget to re-open! Happy new year!

@bnoordhuis
Copy link
Member

@addaleax I think this may be a misreporting bug introduced in (speculatively) ed715ef rather than a real memory leak.

AllocScope looks like it can miss updates when its destructor races with a zlib job thread and indeed reported external memory stays constant when I change inflate() to inflateSync().

@bnoordhuis bnoordhuis reopened this Jan 3, 2023
@nitram-work
Copy link
Author

nitram-work commented Jan 3, 2023

Thank you.

I don't think it's only a reporting problem. As stated by the second log of the original post, If I soft limit the memory (using ulimit) it runs OOM and is killed while manually creating a buffer, as expected when OOM (RangeError: Array buffer allocation failed). If it was just an accounting error it shouldn't.

@addaleax
Copy link
Member

addaleax commented Jan 5, 2023

@bnoordhuis I don’t think this is an accounting error either. By printf()ing out the values passed to Isolate::AdjustAmountOfExternalAllocatedMemory() from zlib it seems that they are balanced and sum up to zero as they should.

@bnoordhuis
Copy link
Member

Inserting a printf() probably perturbs timings enough that any race conditions go away. I won't claim there's definitely a race going on but the way AllocScope is used now doesn't look Obviously Correct to me.

(And since inflateSync makes the problem go away it's almost certainly some thread-related issue.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
zlib Issues and PRs related to the zlib subsystem.
Projects
None yet
Development

No branches or pull requests

6 participants