BUG: pyogrio doesn't like io.BytesIO? #3260
Comments
It actually looks like the error might be on the save, not on the read. The following fails.
import io
import os
from pathlib import Path
import pyogrio
import pyogrio.raw
from shapely import Polygon
import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal

# Force the pyogrio engine (with Arrow) for all geopandas I/O.
os.environ["PYOGRIO_USE_ARROW"] = "1"
gpd.options.io_engine = "pyogrio"
gpd.show_versions()

data = gpd.GeoDataFrame(
    [
        {"foo": 1, "bar": "a", "geometry": Polygon([(0, 0), (0, 1), (1, 1)])},
        {"foo": 2, "bar": "b", "geometry": Polygon([(0, 0), (0, 2), (2, 2)])},
        {"foo": 3, "bar": "c", "geometry": Polygon([(0, 0), (0, 3), (3, 3)])},
    ],
    geometry="geometry",
    crs="EPSG:4326",
)

# Reference: write to a real file on disk and read its bytes back.
outpath = Path("tmp.gpkg")
if outpath.exists():
    outpath.unlink()
data.to_file(outpath, layer="geometry", driver="GPKG")
assert outpath.exists()
bytestr_from_file = outpath.read_bytes()

# Same write, but into an in-memory stream.
with io.BytesIO() as stream:
    data.to_file(stream, layer="geometry", driver="GPKG")
    bytestr = stream.getvalue()
assert bytestr == bytestr_from_file, f"{len(bytestr)=} != {len(bytestr_from_file)=}"
|
Thanks @bretttully for the report. This is currently the case: BytesIO can't be written to with pyogrio, see geopandas/pyogrio#249 (and the discussion in #2875). We should note this as a difference between fiona and pyogrio that could break people's code in 1.0 |
Oh, thanks @m-richards -- that would be a fairly large regression for us... We could work around it by writing to a temp file and then reading the bytes back in, but that wouldn't be great. |
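For reference, the temp-file workaround mentioned above could look roughly like the sketch below. The helper name `to_bytes_via_tempfile` and its `write` callback are hypothetical (not part of geopandas or pyogrio); the helper only wraps whatever path-based writer you pass in.

```python
import tempfile
from pathlib import Path
from typing import Callable


def to_bytes_via_tempfile(write: Callable[[Path], object], suffix: str = ".gpkg") -> bytes:
    """Call write(path) on a temporary file path and return the file's bytes.

    Hypothetical helper: `write` is any callable that creates a file at the
    given path, for example a lambda wrapping GeoDataFrame.to_file.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        # The file does not exist yet, so the writer creates it from scratch.
        path = Path(tmpdir) / f"data{suffix}"
        write(path)
        # Read the bytes before the temporary directory is removed.
        return path.read_bytes()
```

With the `data` frame from the report, usage would presumably be `bytestr = to_bytes_via_tempfile(lambda p: data.to_file(p, layer="geometry", driver="GPKG"))`. This costs a round trip through disk, which is the downside noted above.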
Thanks @bretttully, this is good feedback to have! I suppose you're not the only one using BytesIO for intermediate files. @jorisvandenbossche @brendan-ward @theroggy what is the feasibility of getting this into pyogrio 0.8 before geopandas 1.0 lands? |
I've been looking into this based on how it is implemented in Fiona / rasterio and working toward a potential PR. Not sure about the timing because there are some complexities here to work out (GPKG append / add layers to memory stream). Will continue the discussion on the pyogrio side. |
@bretttully can you post the output of |
Code:
import io
from pathlib import Path
import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal
from shapely import Polygon

gpd.show_versions()

data = gpd.GeoDataFrame(
    [
        {"foo": 1, "bar": "a", "geometry": Polygon([(0, 0), (0, 1), (1, 1)])},
        {"foo": 2, "bar": "b", "geometry": Polygon([(0, 0), (0, 2), (2, 2)])},
        {"foo": 3, "bar": "c", "geometry": Polygon([(0, 0), (0, 3), (3, 3)])},
    ],
    geometry="geometry",
    crs="EPSG:4326",
)

# Reference: write to a real file on disk and read its bytes back.
outpath = Path("tmp.gpkg")
if outpath.exists():
    outpath.unlink()
data.to_file(outpath, layer="geometry", driver="GPKG")
assert outpath.exists()
bytestr_from_file = outpath.read_bytes()

# Same write into an in-memory stream; only the lengths are compared here.
with io.BytesIO() as stream:
    data.to_file(stream, layer="geometry", driver="GPKG")
    bytestr = stream.getvalue()
assert len(bytestr) == len(bytestr_from_file), f"{len(bytestr)=} != {len(bytestr_from_file)=}"

# Round trip: read the in-memory bytes back and compare with the original frame.
with io.BytesIO(bytestr) as stream:
    data2 = gpd.read_file(stream, driver="GPKG")
assert_geodataframe_equal(data, data2)
|
Note the change of |
Thanks @bretttully, I can indeed reproduce that: it works with fiona (both with released geopandas and with geopandas main), and as we know it does not yet work with pyogrio (geopandas/pyogrio#249). From a quick test, current fiona does not allow to append ( Fiona allows you to write for a multi-file driver like Shapefile, but then reading the resulting bytes doesn't work (at least not easily by just passing a stream):
|
This is now implemented in pyogrio 0.8.0; wheels are on PyPI / conda-forge. |
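If code has to run against both older and newer pyogrio releases, one option is to gate the in-memory path on the installed version. The sketch below is only an illustration: `parse_release` and `supports_bytesio_write` are hypothetical helpers, the 0.8.0 threshold is taken from the comment above, and the parsing is deliberately simplistic (it ignores pre-release suffixes such as `.dev1`).

```python
from itertools import takewhile

# Assumption from this thread: writing to BytesIO landed in pyogrio 0.8.0.
MIN_VERSION = (0, 8, 0)


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string, ignoring suffixes."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits of each component ("0rc1" -> 0).
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as "0.8" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_bytesio_write(version: str) -> bool:
    """Return True if the given pyogrio version is at least MIN_VERSION."""
    return parse_release(version) >= MIN_VERSION
```

A caller could pass `pyogrio.__version__` and fall back to writing through a temporary file when this returns False.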
Code Sample, a copy-pastable example
Problem description
Fails with the following error
Expected Output
Output of
geopandas.show_versions()