Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-120754: Fix memory leak in FileIO.__init__() #124225

Merged
merged 1 commit into from
Sep 18, 2024

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Sep 18, 2024

Free 'self->stat_atopen' before assigning it, since io.FileIO.init() can be called multiple times manually (especially by test_io).

Free 'self->stat_atopen' before assigning it, since
io.FileIO.__init__() can be called multiple times manually
(especially by test_io).
@vstinner
Copy link
Member Author

cc @cmaloney

@cmaloney
Copy link
Contributor

👍 Definitely where the leak is happening / my own tracking finds same location. https://vstinner.github.io/debug-python-refleak.html really helpful in tracking down, thanks for writing up.

There's a number of cases where self->fd is changed but the cached information about the file (now stat_atopen, previously estimated_size and _blksize) aren't changed. I started tracking those down (https://github.com/python/cpython/compare/main...cmaloney:cpython:cmaloney/fix_closefd_norealloc?expand=1). Happy with a much more minimal fix to get bots green. Trying to find a refactor so all the closefd logic which feels fairly replicated across the code at the moment can hopefully get deduped a lot while fixing potential corner case issues.

@vstinner vstinner merged commit 43cd7aa into python:main Sep 18, 2024
37 checks passed
@vstinner vstinner deleted the fileio_init branch September 18, 2024 22:11
@vstinner
Copy link
Member Author

(https://github.com/python/cpython/compare/main...cmaloney:cpython:cmaloney/fix_closefd_norealloc?expand=1)

Only allocating (PyMem_New) stat_atopen if it's NULL sounds like a better fix than my fix. I merged my fix anyway since you approved it and I would like to repair buildbot as soon as possible.

You might include your better fix in a following PR if you want, since you planned more changes for io if I understood correctly.

Thanks for reviewing my fix!

@cmaloney
Copy link
Contributor

cmaloney commented Sep 18, 2024

Yep, will make more PRs stat_atopen got several patches moving again.

  1. isatty at open to replace GH-120754: Remove isatty call during regular open #121593 and fix Avoid calling isatty() for most open() calls #90102
  2. "clear out cached stat result anytime self->fd is changed" (followup from this PR)
  3. Potentially clear out stat_atopen on seek Unbounded reads by zipfile may cause a MemoryError. #113977. That is a potentially more comprehensive solution to code that does file.seek() then file.read(), but also in theory the "check position in file if file is large" code should make unnecessary, https://github.com/python/cpython/blob/main/Modules/_io/fileio.c#L769-L783. Still trying to wrap my head around why the "get position" didn't seem to work for that issue)

savannahostrowski pushed a commit to savannahostrowski/cpython that referenced this pull request Sep 22, 2024
Free 'self->stat_atopen' before assigning it, since
io.FileIO.__init__() can be called multiple times manually
(especially by test_io).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants