
Too Many Open Files when working with Tiffs (issue with files being closed) #6985

Closed
geoawd opened this issue Mar 3, 2023 · 5 comments · Fixed by #6986
Labels: Anaconda (Issues with Anaconda's Pillow), TIFF, Windows

Comments


geoawd commented Mar 3, 2023

What did you do?

I am using os.walk to pattern-match files in folders, reducing the size of images with Pillow, saving them to a temporary folder, and eventually merging them into a PDF. I have narrowed the issue down to the Pillow compression step: without it, the process completes for many hundreds of thousands of images, but with it the process stops consistently with [Errno 24] Too many open files on file 8187.

This happens in VS Code and in an Anaconda PowerShell, on different computers, in different Python environments, and with Pillow 9.2.0 and 9.4.0.

What did you expect to happen?

Images would be opened, saved and closed using the context manager where conditions are met. This works until the script stops with [Errno 24] Too many open files.

What actually happened?

The process stops at the same point every time (after successfully processing 8186 files) with [Errno 24] Too many open files.


I think this is similar to the issue shown in #5936, where the TIFF files are not being closed.

Unfortunately, I have many hundreds of thousands of files to process. I've tried all the suggestions on #5936. Could something have re-introduced this issue?
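
For reference, those suggestions revolve around closing images explicitly. A minimal sketch of that pattern (the paths and the convert step below are only illustrative, not my actual pipeline):

from PIL import Image

# Using the image itself as a context manager closes it deterministically,
# in addition to any outer file handle.
with Image.open("input/0500_SI_0001.tif") as img:
    img.convert("1").save("temp/0500_SI_0001.tif", compression="group4")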

What are your OS, Python and Pillow versions?

  • OS: Windows, tested on two separate computers (an i7 with 16 GB RAM and a Xeon with 128 GB RAM) in different environments:
  • Python: 3.8.12 and 3.9.12
  • Pillow: 9.2.0 and 9.4.0

I have provided a working example using only the compression function.

Working example

Code with an example 20kb image is included: Pil.zip

This throws the exception [Errno 24] Too many open files on file 8187 on both machines (the i7 with 16 GB RAM and the Xeon with 128 GB RAM).

import os
import gc
from PIL import Image, TiffImagePlugin

# Enable TIFF plugin debug output and lift the decompression-bomb limit.
TiffImagePlugin.DEBUG = True
Image.MAX_IMAGE_PIXELS = None

dir = os.path.dirname(__file__)

input_dir = os.path.join(dir, "input")
temp_dir = os.path.join(dir, "temp")
exceptionfile = './temp/exceptionlog.txt'


def compression():
    try:
        # The outer context manager closes the file object we opened;
        # the Image created from it is not closed explicitly.
        with open(os.path.join(input_dir, file), "rb") as f:
            img = Image.open(f)
            if img.mode == '1':
                # Bitonal images: save with Group 4 (CCITT T.6) compression.
                img.save(os.path.join(temp_dir, file), compression='group4')
            elif img.mode == 'P' or img.mode == 'L':
                # Palette/greyscale images: convert to bitonal, then Group 4.
                bitonal = img.convert('1')
                bitonal.save(os.path.join(temp_dir, file), compression='group4')
            elif img.mode == 'RGB':
                img.save(os.path.join(temp_dir, file + ".jpg"), compression='jpeg')
            else:
                # Any other mode is also saved as JPEG.
                img.save(os.path.join(temp_dir, file + ".jpg"), compression='jpeg')
    except Exception as e:
        print(e)
        with open(exceptionfile, 'a') as f:
            print(f"{e}", file=f)
    finally:
        gc.collect()


test_image = ["0500_SI_0001.tif"]

if __name__ == "__main__":
    count = 0
    while True:
        for file in test_image:
            compression()
            count += 1
            print(count)
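
As a way of watching the leak, here is a small diagnostic sketch (not part of the script above; it assumes the optional psutil package and the same input/temp layout, and num_handles() is Windows-only):

import os
import psutil
from PIL import Image

dir = os.path.dirname(__file__)
proc = psutil.Process()

for i in range(20):
    with open(os.path.join(dir, "input", "0500_SI_0001.tif"), "rb") as f:
        img = Image.open(f)
        img.convert("1").save(os.path.join(dir, "temp", f"{i}.tif"), compression="group4")
    # On an affected Pillow build the handle count keeps climbing here,
    # even though the outer file object has been closed.
    print(i, proc.num_handles())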
geoawd (Author) commented Mar 3, 2023

This appears to be the same issue that #6986 addresses, and that pull request seems to fix it (at least with the test files I have provided). I can test with a wider range of formats next week.

radarhere (Member) commented

Just to link, this is a new report of #6671

radarhere (Member) commented

It would be interesting if anyone knows why 8187 is the magic number here.

Yay295 (Contributor) commented Mar 4, 2023

https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmaxstdio

The _setmaxstdio function changes the maximum value for the number of files that may be open simultaneously at the stream I/O level. C run-time I/O now supports up to 8,192 files open simultaneously at the low I/O level.

They probably have five other files open somewhere. 8187 + 5 = 8192.
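
For anyone who wants to look at the stream-level limit on their own machine, a hypothetical ctypes sketch (assuming ucrtbase.dll is loadable; note that the 8,192 low I/O cap quoted above is fixed, and that is the limit this error is actually hitting):

import ctypes

# Windows-only: query and raise the C runtime's stream I/O (FILE*) limit.
ucrt = ctypes.CDLL("ucrtbase")
print("current stream limit:", ucrt._getmaxstdio())  # default is 512
ucrt._setmaxstdio(8192)                              # maximum permitted value
print("new stream limit:", ucrt._getmaxstdio())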

geoawd (Author) commented Mar 4, 2023

I’ve tested the pull request in #6986 with a couple of sets of files and this has resolved the Too many open files error.

So long as this hasn’t introduced any unexpected issues, I think this issue can be closed.

@aclark4life added the Anaconda (Issues with Anaconda's Pillow) label on May 19, 2023