gzip.GzipFile creates reference cycle that requires a deep garbage collection cycle to cleanup. #129640

Closed
fowczarek opened this issue Feb 4, 2025 · 3 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

fowczarek commented Feb 4, 2025

Bug report

Bug description:

While debugging a memory buildup, I noticed that gzip.GzipFile holds a reference to itself (cycle: GzipFile._buffer -> BufferedWriter._raw -> _WriteBufferStream.gzip_file -> GzipFile). Because of this cycle, reference counting alone never frees the object, and the memory stays allocated until the garbage collector runs a full collection (generation 2).

Steps to reproduce

  1. Disable garbage collection temporarily to make sure we are the ones who catch it
  2. Set the garbage collector's debug level to DEBUG_LEAK
  3. Open GzipFile.
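  4. Force garbage collection and look at its output

import gc
import gzip
import io

# Start from a clean state, then stop automatic collection so that
# anything reported below comes from the GzipFile created here.
gc.collect()
gc.disable()
gc.set_debug(gc.DEBUG_LEAK)

with io.BytesIO() as buffer:
    with gzip.GzipFile(mode="wb", fileobj=buffer):
        pass

# With DEBUG_LEAK set, this collection reports the objects it finds
# in the unreachable cycle.
gc.collect()
gc.set_debug(0)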
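
The same check can be made without reading the DEBUG_LEAK output by using a weak reference as a probe. This is only a minimal sketch: the helper name `gzipfile_freed_by_refcount` is hypothetical, it relies on CPython's reference counting, and the result depends on whether the interpreter in use already contains a fix.

```python
import gc
import gzip
import io
import weakref


def gzipfile_freed_by_refcount():
    """Return True if a closed GzipFile is freed as soon as its last
    strong reference goes away, i.e. without help from the cyclic GC.

    Hypothetical helper for illustration; not part of the gzip module.
    """
    was_enabled = gc.isenabled()
    gc.disable()  # make sure only reference counting can free the object
    try:
        with io.BytesIO() as buffer:
            f = gzip.GzipFile(mode="wb", fileobj=buffer)
            f.close()
            probe = weakref.ref(f)  # does not keep the GzipFile alive
            del f                   # drop the only strong reference we hold
            # If a reference cycle exists, the object is still alive here.
            return probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind


if __name__ == "__main__":
    print("freed without GC:", gzipfile_freed_by_refcount())
```

A result of `False` would indicate that something still references the GzipFile after `del f`, which is what the cycle described above would cause.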

Potential solution

class _WriteBufferStream(io.RawIOBase):
    ...

    def __del__(self):
        del self.gzip_file

CPython versions tested on:

3.12

Operating systems tested on:

macOS

Linked PRs

@fowczarek fowczarek added the type-bug An unexpected behavior, bug, or error label Feb 4, 2025
@picnixz picnixz added the stdlib Python modules in the Lib dir label Feb 4, 2025
Mr-Sunglasses added a commit to Mr-Sunglasses/cpython that referenced this issue Mar 6, 2025
cmaloney commented Mar 7, 2025

I think this is equivalent to gh-129726 (I hadn't seen this issue until just now). If so, it has a fix which has been backported to 3.12. Similar concept, but the fix broke the reference loop by adding a weakref (PR: #130055).
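
For readers unfamiliar with the technique, here is a minimal sketch of breaking a reference loop with a weakref. It does not reproduce the code in #130055; the classes `Owner`, `StrongHelper`, and `WeakHelper` are hypothetical stand-ins for the GzipFile / _WriteBufferStream relationship, and the behavior shown assumes CPython's reference counting.

```python
import gc
import weakref


class StrongHelper:
    """Hypothetical helper that keeps a strong back-reference (forms a cycle)."""

    def __init__(self, owner):
        self.owner = owner


class WeakHelper:
    """Hypothetical helper that keeps only a weak back-reference (no cycle)."""

    def __init__(self, owner):
        self._owner_ref = weakref.ref(owner)

    @property
    def owner(self):
        # Returns None once the owner has been freed.
        return self._owner_ref()


class Owner:
    """Hypothetical stand-in for an object that owns a helper."""

    def __init__(self, helper_cls):
        self.helper = helper_cls(self)


def freed_without_gc(helper_cls):
    """Return True if an Owner is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        owner = Owner(helper_cls)
        probe = weakref.ref(owner)
        del owner
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # reclaim the cycle created by StrongHelper


if __name__ == "__main__":
    print("strong back-reference freed without GC:", freed_without_gc(StrongHelper))
    print("weak back-reference freed without GC:", freed_without_gc(WeakHelper))
```

The trade-off with the weak variant is that the helper has to handle the case where its owner is already gone, since `owner` can then return `None`.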

@Mr-Sunglasses

This issue is fixed in PR #130055 by @cmaloney. PR #130916 adds some tests for it.

@hauntsaninja

Additional test PRs are still welcome, but I'm going ahead and closing this since the issue is resolved by #130055 (and backported).
