Memory leaks occur when saving each page of a PDF as an image #1430
Comments
I have tested this on Windows and Linux: |
Is there no way to release the cache? Or is there some other way to save the images? As it stands, processing several large PDF documents can easily take up all the memory.
Yes, there is: execute You can also try the MuPDF CLI tool like this: Caching suppression is not yet available in PyMuPDF, but I will make sure to include it in the next version. All this will of course have an adverse effect on performance. Other considerations to alleviate this problem include using Python multiprocessing as explained in the documentation. |
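For readers who want to try the multiprocessing route mentioned above, here is a minimal sketch. It is not taken from the PyMuPDF documentation: `split_pages`, `render_chunk` and `render_parallel` are hypothetical helper names, and the output file naming is an assumption. Each worker opens the PDF itself and renders one contiguous page range, so whatever memory the rendering holds is returned to the operating system when the worker process exits.

```python
from multiprocessing import Pool


def split_pages(page_count, workers):
    """Split range(page_count) into at most `workers` contiguous (start, stop) chunks."""
    size, extra = divmod(page_count, workers)
    chunks, start = [], 0
    for i in range(workers):
        # the first `extra` chunks get one additional page
        stop = start + size + (1 if i < extra else 0)
        if stop > start:  # skip empty chunks when there are fewer pages than workers
            chunks.append((start, stop))
        start = stop
    return chunks


def render_chunk(job):
    """Worker (hypothetical helper): render pages start..stop-1 of the PDF to PNG files."""
    import fitz  # PyMuPDF, imported inside the worker process

    pdf_path, start, stop = job
    doc = fitz.open(pdf_path)
    for page in doc.pages(start, stop, 1):
        pix = page.get_pixmap()
        pix.save(f"page-{page.number:04d}.png")  # assumed naming scheme
        pix = None  # drop the pixmap as soon as it is written
    doc.close()
    return stop - start  # number of pages handled by this worker


def render_parallel(pdf_path, page_count, workers=4):
    """Distribute the page ranges over a pool of worker processes."""
    jobs = [(pdf_path, a, b) for a, b in split_pages(page_count, workers)]
    with Pool(processes=workers) as pool:
        return sum(pool.map(render_chunk, jobs))
```

On Windows, the call to `render_parallel("adobe.pdf", page_count)` should sit under an `if __name__ == "__main__":` guard, because worker processes re-import the main module there.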
If you run |
In any case, after being done with a pixmap (i.e. after saving it), set it to |
I found the following logic best to keep intermediate memory under control while also delivering acceptable speed.

import fitz  # PyMuPDF

# process the file in segments / intervals
doc = fitz.open("adobe.pdf")
interval = 50
pc = doc.page_count
pno = 0
while pno < pc:
    limit = min(pc, pno + interval)
    for page in doc.pages(pno, limit, 1):
        pix = page.get_pixmap()
        pix = None  # <== important!
    if limit >= pc:
        break
    pno += interval
    doc.close()  # release file and its resources
    fitz.TOOLS.store_shrink(100)  # empty MuPDF cache
    doc = fitz.open(doc.name)  # recycle document
doc.close()
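The same segment idea can be wrapped in a function that actually writes the images, which is what the original report was doing. This is only a sketch under a few assumptions: `page_segments` and `save_pages_as_png` are hypothetical names, the PNG naming scheme is made up, and the document is reopened from its path for every segment rather than via `doc.name`.

```python
from pathlib import Path


def page_segments(page_count, interval):
    """Yield (start, stop) page ranges covering 0..page_count in steps of `interval`."""
    for start in range(0, page_count, interval):
        yield start, min(page_count, start + interval)


def save_pages_as_png(pdf_path, out_dir, interval=50):
    """Render every page to a PNG, closing and reopening the document between segments."""
    import fitz  # PyMuPDF; imported here so page_segments works without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # open once just to learn the page count
    doc = fitz.open(pdf_path)
    page_count = doc.page_count
    doc.close()

    for start, stop in page_segments(page_count, interval):
        doc = fitz.open(pdf_path)  # fresh document object for each segment
        for page in doc.pages(start, stop, 1):
            pix = page.get_pixmap()
            pix.save(str(out / f"page-{page.number:04d}.png"))  # assumed naming scheme
            pix = None  # drop the pixmap as soon as it is written
        doc.close()  # release file and its resources
        fitz.TOOLS.store_shrink(100)  # empty MuPDF cache
```

A call such as `save_pages_as_png("adobe.pdf", "out")` would then process the file 50 pages at a time. A smaller `interval` should lower peak memory at the cost of reopening the file more often.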
The solution you described has improved the memory problem, thank you very much.
Describe the bug (mandatory)
Memory leaks occur when saving each page of a PDF as an image. I wrote this operation as a function, and memory usage increases each time the loop runs. The memory does not seem to be released.
To Reproduce (mandatory)
If you comment out the code that saves the image, there is no memory problem.
Your configuration (mandatory)
3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)]
win32
PyMuPDF 1.19.1: Python bindings for the MuPDF 1.19.0 library.
Version date: 2021-10-23 00:00:10.
Built for Python 3.9 on win32 (64-bit).