
pyshp 2.3 closing bytesio objects with Writer #244

Closed
akrherz opened this issue May 11, 2022 · 5 comments

akrherz commented May 11, 2022

PyShp Version

2.3.0

Python Version

3.10

Your code

from io import BytesIO
import zipfile

import shapefile

shpio = BytesIO()
shxio = BytesIO()
dbfio = BytesIO()
with shapefile.Writer(shx=shxio, dbf=dbfio, shp=shpio) as shp:
    ...
zio = BytesIO()
with zipfile.ZipFile(
    zio, mode="w", compression=zipfile.ZIP_DEFLATED
) as zf:
    zf.writestr("fn.shp", shpio.getvalue())
    zf.writestr("fn.shx", shxio.getvalue())
    zf.writestr("fn.dbf", dbfio.getvalue())

Full stacktrace

zf.writestr("fn.shp", shpio.getvalue())
ValueError: I/O operation on closed file.

Other notes

af1dee9 appears to now close any file objects passed to the Writer. Perhaps my usage is not best practice for BytesIO objects, but even inside a context manager, the new pyshp code closes the BytesIO objects before their contents can be read back.
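Until this is resolved, one possible workaround (a hypothetical helper, not something pyshp provides) is to hand the Writer a BytesIO subclass whose close() is a no-op, so the buffer survives the Writer's cleanup:

```python
import io

class UnclosableBytesIO(io.BytesIO):
    """BytesIO that ignores close(), so a library closing the stream
    cannot discard the buffer. Call really_close() when truly done."""

    def close(self):
        pass  # deliberately ignore close requests from the library

    def really_close(self):
        super().close()

buf = UnclosableBytesIO()
buf.write(b"data")
buf.close()              # no-op: buffer is still readable
data = buf.getvalue()    # b"data"
buf.really_close()       # actually release the buffer
```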

@akrherz akrherz added the bug label May 11, 2022
@mcflugen

@akrherz I came here to file this exact issue. I'm also not sure whether it's best practice, but we use BytesIO objects in the same way you do.

It seems that if shapefile.Writer can write to binary streams, those streams should remain accessible after writing to them. Otherwise, why write to them at all?


mcflugen commented May 12, 2022

I'm not sure if this is the direction you would want to go, but something like the following fixes the issue for me.

My thought is that if the Writer opens a file, then it should be responsible for closing it; but if the Writer is given an already open stream, then it should be left to the caller to close that stream.

One way to accomplish this is to have __getFileObj keep track of all the files the Writer opens:

    def __getFileObj(self, f):
        """Safety handler to verify file-like objects"""
        if not f:
            raise ShapefileException("No file-like object available.")
        elif hasattr(f, "write"):
            return f
        else:
            pth = os.path.split(f)[0]
            if pth and not os.path.exists(pth):
                os.makedirs(pth)
            fp = open(f, "wb+")
            # _files_to_close would be initialized to [] in Writer.__init__
            self._files_to_close.append(fp)
            return fp

and then, instead of the for attribute in (self.shp, self.shx, self.dbf): loop, the close method would end with:

        while self._files_to_close:
            self._files_to_close.pop().close()

If there is ever an issue with streams not being flushed, perhaps the close method could call each file's flush method, if available:

        for attribute in (self.shp, self.shx, self.dbf):
            try:
                attribute.flush()
            except AttributeError:
                pass

Anyway, that's certainly not the most elegant solution, but maybe there's something there?
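To make the suggestion above concrete, here is a minimal standalone sketch of that ownership rule (illustrative only, not pyshp's actual Writer class): close only what the writer opened itself, but flush every target so final data reaches caller-supplied streams too.

```python
import io

class SketchWriter:
    """Illustration of the ownership rule: files the writer opened
    itself are closed on close(); caller-supplied streams are only
    flushed and left open for the caller to read back."""

    def __init__(self, *targets):
        self._files_to_close = []
        self.targets = []
        for t in targets:
            if hasattr(t, "write"):
                # caller-supplied stream: the caller owns it
                self.targets.append(t)
            else:
                # path string: we open it, so we must close it
                fp = open(t, "wb+")
                self._files_to_close.append(fp)
                self.targets.append(fp)

    def close(self):
        for t in self.targets:
            try:
                t.flush()
            except AttributeError:
                pass
        while self._files_to_close:
            self._files_to_close.pop().close()

buf = io.BytesIO()
w = SketchWriter(buf)
buf.write(b"header")
w.close()
assert not buf.closed              # user stream left open
assert buf.getvalue() == b"header" # contents still readable
```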

@karimbahgat
Collaborator

Thanks to both. I did think about this when I implemented the change, with the idea that calling close() should indeed close file objects, and that all cleanup related to a context manager should happen when the context closes. Nevertheless, I agree with your point that closing matters most for files that pyshp opens itself. I've also encountered the same problem and agree that writing to file objects is pointless if you can't read them back afterwards; the complication was that closing was required to write and flush the final header info.

I'm a bit short on time, so I just took your suggested code and implemented keeping track of opened files and closing only those, flushing all data regardless of source, and added several more tests for this.

Can any of you confirm that the latest code works for your use case?

Author

akrherz commented Jun 1, 2022

Thanks @karimbahgat , I reviewed your commit and the change made sense to me. I tested it locally and it seems to be working as expected.

@karimbahgat
Collaborator

Released now in v2.3.1.
