Page.to_image() leaks file descriptors #1089
Comments
In theory there is a parameter |
Let me shine some light on this:
Maybe we should just remove the parameter with the next major version to avoid confusion? WDYT?
|
Thanks for clarifying this - I feel silly for not remembering the bit about garbage collection, that's the reason why context managers exist after all, sorry about that! So yes, |
No problem.
I guess you're right we should provide context manager functionality. |
Yes, context manager functionality would be great! |
Wait, I just remembered there are 2-in-1 with-blocks, of course:
>>> import pypdfium2 as pdfium
>>>
>>> class PdfWithCtx(pdfium.PdfDocument):
...     def __enter__(self):
...         return self
...     def __exit__(self, *_):
...         self.close()
...
>>> with open(".../doc.pdf", "rb") as fh, PdfWithCtx(fh) as pdf:
...     print(pdf[0].get_textpage().get_text_bounded())
So I guess autoclose is more or less obsolete. I think I might remove the parameter with the next major version (if that ever happens 😅). |
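For what it's worth, the same 2-in-1 with-block should also work without subclassing, since the standard library's `contextlib.closing` wraps any object that has a `close()` method. The sketch below is self-contained and uses a hypothetical stand-in class (`FakeDocument`, not part of pypdfium2) rather than a real PDF, so it only illustrates the closing order, not the library's behaviour.

```python
import contextlib
import io


class FakeDocument:
    """Hypothetical stand-in for any object exposing close(), e.g. a PDF document."""

    def __init__(self, stream):
        self.stream = stream
        self.closed = False

    def close(self):
        self.closed = True


# Two context managers in one with-statement; they are exited in reverse order,
# so the document is closed before the underlying stream.
with io.BytesIO(b"%PDF-1.7") as fh, contextlib.closing(FakeDocument(fh)) as pdf:
    stream_open_inside = not fh.closed
    doc_open_inside = not pdf.closed

print(pdf.closed, fh.closed)
```

Assuming the real document class has a `close()` method, as the subclass above relies on, it could be passed to `contextlib.closing` in place of `FakeDocument`.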
Describe the bug
When extracting images from a PDF file with a large number of pages, we eventually run out of file descriptors on Mac. This probably leaks file descriptors on every platform, but Linux has less strict ulimits. It may also be related to #1072, as Windows has silly restrictions on opening the same file multiple times.
This is because pypdfium2 is holding onto a file descriptor somewhere. The exception thrown by the code below:
Code to reproduce the problem
PDF file
https://ville.sainte-adele.qc.ca/upload/documents/20231213-Codification-administrative-Rgl-1314-2021-Z.pdf (485 pages, enough to hit the ulimit on a Mac!)
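To check how close a process is to the descriptor limit, something like the following can be run before and after the suspect call. It is a rough sketch, not the reproduction code from this report: it assumes a Unix-like system where `/dev/fd` lists the process's open descriptors (macOS and Linux), and the helper name `count_open_fds` is made up for illustration.

```python
import os
import resource
import tempfile


def count_open_fds():
    # /dev/fd lists this process's open descriptors on macOS and Linux.
    # The listing itself briefly uses one descriptor, so compare counts
    # against each other rather than reading them as absolute values.
    return len(os.listdir("/dev/fd"))


soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft fd limit: {soft_limit}, hard fd limit: {hard_limit}")

before = count_open_fds()
with tempfile.TemporaryFile() as fh:
    during = count_open_fds()  # one extra descriptor while the file is open
after = count_open_fds()  # back to the baseline once it is closed
print(f"open fds: before={before} during={during} after={after}")
```

If the count keeps growing across pages instead of returning to the baseline, something is holding descriptors open.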
Environment
develop branch (commit 07d9997)