Memory Retention with fitz.page.get_pixmap() #3625
Comments
Adding fitz.TOOLS.store_shrink(100) after pix = None actually helped a lot. Here is a link to an older issue which I missed at first |
Can you please provide printouts with numbers updated after the mentioned adjustments? In general, if a permanently low memory footprint is desired (for whatever reasons), shrinking the store usage should be used generously.
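To check what shrinking the store actually does, the store size can be printed before and after the call. This is a minimal sketch, assuming PyMuPDF is installed and that `fitz.TOOLS.store_size` and `fitz.TOOLS.store_maxsize` report byte counts; the helper name `shrink_store_and_report` is hypothetical and not part of PyMuPDF:

```python
def shrink_store_and_report(percent: int = 100) -> int:
    """Shrink the MuPDF store and print its size before and after.

    `percent` is the share of the current store to free (100 empties it).
    """
    import fitz  # PyMuPDF; imported lazily so the helper can be defined without it

    before = fitz.TOOLS.store_size
    fitz.TOOLS.store_shrink(percent)
    after = fitz.TOOLS.store_size
    print(f"store: {before} -> {after} bytes (max {fitz.TOOLS.store_maxsize})")
    return after
```

A smaller store does not necessarily translate into a lower process footprint, so it may help to compare these numbers with the process memory figures.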
|
Below you can see the memory profiling after the adjustments. Interestingly, while processing file f0, fitz.TOOLS.store_shrink(100) in line 47 seems to have made no difference, but memory usage increased by only about 7 MiB and did not shrink back to the initial number. While processing file f1, fitz.TOOLS.store_shrink(100) in line 47 reduced memory usage a lot, but not all of it: an additional 20.12 MB remained. After that it seems to plateau.
P.S. I have upgraded PyMuPDF to 1.24.7.
memory profiling after adjustments
processing file f0
Memory usage before function: 53.28 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 60.41 MB
processing file f1
Memory usage before function: 60.41 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 80.53 MB
processing file f2
Memory usage before function: 80.53 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 80.53 MB
processing file f3
Memory usage before function: 80.53 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 80.53 MB
|
I encountered the same issue! Memory leak!
try:
    with fitz.Document(stream=data, filetype="pdf") as doc:
        ...
except Exception as e:
    logging...
finally:
    fitz.TOOLS.store_shrink(100)
    gc.collect()
other code:
zoom_x = request.imgsz / page_width
zoom_y = request.imgsz / page_height
zoom = min(zoom_x, zoom_y)
mat = fitz.Matrix(zoom, zoom)
pix = page.get_pixmap(matrix=mat, colorspace="rgb", alpha=False)
|
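The zoom calculation in the comment above picks a single scale factor so the page fits inside a square of side `imgsz`. Here is a small sketch of that arithmetic in plain Python, plus a rough estimate of the resulting pixmap size. The function names are hypothetical, and the size estimate assumes 3 bytes per pixel (RGB, no alpha) and ignores any row padding:

```python
def fit_zoom(page_width: float, page_height: float, target: float) -> float:
    """Return one zoom factor that fits the page inside a target x target square."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    return min(target / page_width, target / page_height)


def estimated_rgb_bytes(page_width: float, page_height: float, zoom: float) -> int:
    """Approximate size of an RGB pixmap without alpha: width * height * 3 bytes."""
    return round(page_width * zoom) * round(page_height * zoom) * 3


# A US Letter page (612 x 792 pt) scaled so its longer side is 1584 px.
zoom = fit_zoom(612, 792, 1584)
print(zoom, estimated_rgb_bytes(612, 792, zoom))
```

The returned factor is what would be passed as `fitz.Matrix(zoom, zoom)`. Since the pixel count grows with the square of the zoom, large target sizes can dominate per-page memory use.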
Another issue: why does calling the page.get_image_rects function return a large number of images (over 40,000), when there are no visible images on that PDF page? I'm looking for this PDF. I'll share it once I find it. |
Please do not mix different things in the same report! |
It seems that your "issue" goes back to that |
Thank you very much, I will give it a try. |
Description of the bug
When processing larger PDF files the page.get_pixmap() method significantly increases memory usage and does not release it properly after completion. It results in a high memory footprint that persists until an even larger file is processed. This behavior can be observed from the memory profiling data provided below.
I implemented the operation as a function that is called in a loop, once per file. I set pix = None after each page, and call doc.close() and fitz.TOOLS.store_shrink(100) after each document, as suggested in the similar issue #1430.
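For reference, the per-file routine described above might look roughly like the following. This is a sketch under assumptions, not the exact code from the report: the function name `render_document` and the default zoom are made up for illustration, and the `gc.collect()` call is an extra step that the description does not mention.

```python
import gc


def render_document(path: str, zoom: float = 2.0) -> int:
    """Render every page of one PDF, release resources, return the page count."""
    import fitz  # PyMuPDF; imported lazily so this sketch can be defined without it

    matrix = fitz.Matrix(zoom, zoom)
    pages = 0
    doc = fitz.open(path)
    try:
        for page in doc:
            pix = page.get_pixmap(matrix=matrix, alpha=False)
            # ... use pix here (save it, convert it, etc.) ...
            pix = None  # drop the reference so the pixmap can be freed
            pages += 1
    finally:
        doc.close()
        fitz.TOOLS.store_shrink(100)  # empty the MuPDF store
        gc.collect()  # assumption: not part of the original description
    return pages
```

Putting the cleanup in `finally` means the document is closed and the store is shrunk even if rendering a page raises.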
One can see that a significant increase in memory usage occurred while processing file f1, and that the high memory footprint persisted while processing the later files.
If there is a method I could call to release the memory please let me know.
Relevant closed issue #1430.
processing file f0
Memory usage before function: 34.70 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 39.10 MB
Memory usage difference total: 4.41 MB
processing file f1
Memory usage before function: 39.10 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 301.36 MB
Memory usage difference total: 262.26 MB
processing file f2
Memory usage before function: 301.36 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 301.36 MB
Memory usage difference total: 0.00 MB
processing file f3
Memory usage before function: 301.36 MB
Line # Mem usage Increment Occurrences Line Contents
Memory usage after function: 301.36 MB
Memory usage difference total: 0.00 MB
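Before/after figures like the ones above can be collected with a small wrapper around the per-file call. This is a sketch using only the standard library, assuming Linux (it reads `/proc/self/statm`); the names `rss_mb` and `measure` are hypothetical, and it does not reproduce the line-by-line profiler output from the report:

```python
import os


def rss_mb() -> float:
    """Return the current resident set size of this process in MB (Linux only)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])  # second field: resident pages
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1024 ** 2


def measure(label, func, *args, **kwargs):
    """Call func and print memory usage before, after, and the difference."""
    before = rss_mb()
    result = func(*args, **kwargs)
    after = rss_mb()
    print(f"processing file {label}")
    print(f"Memory usage before function: {before:.2f} MB")
    print(f"Memory usage after function: {after:.2f} MB")
    print(f"Memory usage difference total: {after - before:.2f} MB")
    return result
```

For example, `measure("f0", some_function, "f0.pdf")` would print one block per file, where `some_function` stands for whatever routine renders the document.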
How to reproduce the bug
PyMuPDF version
1.23.x or earlier
Operating system
Linux
Python version
3.11