BUG: fix write of kml lon/lat transpose #421
Thanks for working on this @nicholas-ys-tan !
Given that Fiona is setting traditional order after instantiating the CRS object, it seems reasonable for us to do the same, though I'm not aware of what the negative side effects of doing so would be.
A few comments below.
Would you mind adding a test case for GeoJSON written with RFC7946 as described here
Something like this should work:
def test_write_geojson_rfc7946_coordinates(tmp_path, use_arrow):
    # use_arrow is expected to be supplied by pytest (fixture/parametrization)
    points = [Point(10, 20), Point(30, 40), Point(50, 60)]
    gdf = gp.GeoDataFrame(geometry=points, crs="EPSG:4326")
    output_path = tmp_path / "test.geojson"
    write_dataframe(
        gdf,
        output_path,
        layer="tmp_layer",
        driver="GeoJSON",
        RFC7946=True,
        use_arrow=use_arrow,
    )
    # Read the file back and check that the coordinates were not transposed
    gdf_in = read_dataframe(output_path, use_arrow=use_arrow)
    assert np.array_equal(gdf_in.geometry.values, points)
We should probably also test this for appending to an existing file (not entirely sure how that will work in that case, because then we don't create the layer object with a constructed CRS object?)
Co-authored-by: Brendan Ward <bcward@astutespruce.com>
Thanks @brendan-ward, I have added your changes and test; I will just need to do this one tomorrow:
Small detail: can you add an entry to CHANGES.md for your PR?
Co-authored-by: Pieter Roggemans <pieter.roggemans@gmail.com>
Sorry, I forgot to mention that the tests need @pytest.mark.requires_arrow_write_api, since they are writing using the Arrow API.
This should fix the test failures on CI.
I am currently investigating this:
There seems to be an issue unrelated to this ticket - I was testing the appending to a .kml file, and it introduces a Z-dimension. I am not sure if it is known behaviour but I couldn't find an existing ticket about it. Interestingly, it has nothing to do with
points = [Point(10, 20), Point(30, 40), Point(50, 60)]
gdf = gpd.GeoDataFrame(geometry=points, crs="EPSG:4326")
output_path = r'/home/nicholas/dev/data/ogr_test/temporary_kml_file.kml'
write_dataframe(
    gdf, output_path, layer="tmp_layer", driver="KML", use_arrow=True
)
gdf_read = read_dataframe(output_path, use_arrow=True)
print(gdf_read.geometry)
points_append = [Point(70, 80), Point(90, 100), Point(110, 120)]
gdf_append = gpd.GeoDataFrame(geometry=points_append, crs="EPSG:4326")
write_dataframe(
    gdf_append, output_path, layer='tmp_layer', driver="KML", use_arrow=True, append=True
)
gdf_read_appended = read_dataframe(output_path, use_arrow=True)
print(gdf_read_appended.geometry)
This doesn't seem to be related to the
points = [Point(10, 20), Point(30, 40), Point(50, 60)]
gdf = gpd.GeoDataFrame(geometry=points, crs="EPSG:4326")
output_path = r'/home/nicholas/dev/data/ogr_test/temporary_kml_file2.kml'
write_dataframe(
    gdf, output_path, layer="tmp_layer", driver="KML", use_arrow=True
)
# Write the same data a second time, over-writing the existing file
write_dataframe(
    gdf, output_path, layer="tmp_layer", driver="KML", use_arrow=True
)
# Read the over-written file back before printing its geometry
gdf_read = read_dataframe(output_path, use_arrow=True)
print(gdf_read.geometry)
On looking into the KML, I can confirm it gets restructured.
Before:
<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder><name>tmp_layer</name>
  <Placemark>
    <Point><coordinates>10,20</coordinates></Point>
  </Placemark>
  <Placemark>
    <Point><coordinates>30,40</coordinates></Point>
  </Placemark>
  <Placemark>
    <Point><coordinates>50,60</coordinates></Point>
  </Placemark>
</Folder>
</Document></kml>
After over-writing with same data:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document id="root_doc">
    <Document id="tmp_layer">
      <name>tmp_layer</name>
      <Placemark id="tmp_layer.1">
        <Point>
          <coordinates>
            10,20,0
          </coordinates>
        </Point>
      </Placemark>
      <Placemark id="tmp_layer.2">
        <Point>
          <coordinates>
            30,40,0
          </coordinates>
        </Point>
      </Placemark>
      <Placemark id="tmp_layer.3">
        <Point>
          <coordinates>
            50,60,0
          </coordinates>
        </Point>
      </Placemark>
    </Document>
  </Document>
</kml>
This happens in the main branch too, but with the coordinates reversed. I assume this is not something to be addressed within this ticket. I haven't yet confirmed whether this is an upstream bug, but I am currently assuming it is. As a work-around, I've modified the test for the appended KML file to only compare the xy-coordinates by reading with
If this isn't already a known issue, I'll do some digging as to why this is happening.
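To illustrate why comparing only the xy-coordinates sidesteps the added Z value, here is a minimal standard-library sketch. It is not the approach used in the PR tests (the reader option mentioned above is not shown in this thread); the helper name `read_xy` and the trimmed single-point KML strings are hypothetical, modelled on the two layouts quoted above.

```python
import xml.etree.ElementTree as ET

# KML 2.2 namespace, as declared in both files quoted above
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_xy(kml_text):
    """Return (x, y) tuples for every <coordinates> element, dropping any Z."""
    root = ET.fromstring(kml_text)
    xy = []
    for node in root.iterfind(".//kml:coordinates", KML_NS):
        # A Point holds one "x,y" or "x,y,z" tuple; strip() removes the line
        # breaks that surround the values in the rewritten layout.
        x, y = node.text.strip().split(",")[:2]
        xy.append((float(x), float(y)))
    return xy


# Trimmed-down versions of the "before" and "after" layouts (one point each)
before = """<kml xmlns="http://www.opengis.net/kml/2.2"><Document id="root_doc">
<Folder><name>tmp_layer</name>
<Placemark><Point><coordinates>10,20</coordinates></Point></Placemark>
</Folder></Document></kml>"""

after = """<kml xmlns="http://www.opengis.net/kml/2.2"><Document id="root_doc">
<Document id="tmp_layer"><name>tmp_layer</name>
<Placemark id="tmp_layer.1"><Point><coordinates>
10,20,0
</coordinates></Point></Placemark>
</Document></Document></kml>"""

# The XY parts agree even though the second layout carries a Z of 0
print(read_xy(before) == read_xy(after))
```

The comparison deliberately ignores the structural differences (Folder versus nested Document, Placemark ids) and looks only at the coordinate values.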
The ubuntu-small CI jobs don't seem to be able to open the produced KML files for appending - I will need to look into it more this weekend.
Not sure why this leads to appending not working... but FYI: apparently ubuntu-small doesn't contain the LIBKML driver, so according to the docs KML handling will fall back to the KML driver in that case.
Oh that is interesting, one thing I did note but neglected to mention in my above text about the Z-dimension is that when I run
This seems to suggest when there is an existing KML file, LIBKML is used to over-write the KML file, hence why the Z-dimension is introduced, and why it is failing in ubuntu-small.
points = [Point(10, 20), Point(30, 40), Point(50, 60)]
gdf = gpd.GeoDataFrame(geometry=points, crs="EPSG:4326")
output_path = r'/home/nicholas/dev/data/ogr_test/temporary_kml_file7.kml'
write_dataframe(
    gdf, output_path, layer="tmp_layer", driver="LIBKML", use_arrow=True
)
gdf_read = read_dataframe(output_path, use_arrow=True)
print(gdf_read.geometry)
and outputs the same KML format as the over-written file from my earlier comment:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document id="root_doc">
    <Document id="tmp_layer">
      <name>tmp_layer</name>
      <Placemark id="tmp_layer.1">
        <Point>
          <coordinates>
            10,20,0
          </coordinates>
        </Point>
      </Placemark>
      <Placemark id="tmp_layer.2">
        <Point>
          <coordinates>
            30,40,0
          </coordinates>
        </Point>
      </Placemark>
      <Placemark id="tmp_layer.3">
        <Point>
          <coordinates>
            50,60,0
          </coordinates>
        </Point>
      </Placemark>
    </Document>
  </Document>
</kml>
I'll investigate further to confirm tomorrow morning.
It looks like always adding the Z value when using
Can you log a separate issue that appending a KML file also adds Z value when using the
The KML driver might just not support appending? In that case you will have to skip the test (or the second part of the test) when the LIBKML driver is not available (you can use something like
BTW, the error message we give is a bit confusing. We are adding the suggestion "It might help to specify the correct driver explicitly by prefixing the file path with ':', e.g. 'CSV:path'", which is strange because when writing, the user already explicitly mentions the driver (which in practice is not used when appending, I think, but that's not really clear to the user).
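A minimal sketch of the kind of driver check suggested here, assuming the available drivers are given as a dict mapping driver name to a capability string such as "r" or "rw" (the shape returned by `pyogrio.list_drivers()`). The helper name `libkml_writable` and the sample dicts are hypothetical, not taken from the pyogrio test suite.

```python
def libkml_writable(drivers):
    """Return True if LIBKML is listed with write ("w") capability.

    `drivers` maps driver names to capability strings such as "r" or "rw".
    """
    return "w" in drivers.get("LIBKML", "")


# Hand-written driver listings for illustration (not real CI output)
full_build = {"KML": "rw", "LIBKML": "rw", "GeoJSON": "rw"}
small_build = {"KML": "rw", "GeoJSON": "rw"}

print(libkml_writable(full_build))   # LIBKML is available for writing
print(libkml_writable(small_build))  # LIBKML missing: the append step would be skipped
```

In a test this could guard the append step, for example by calling `pytest.skip(...)` when `libkml_writable(list_drivers())` is false, so that builds without LIBKML still run the first part of the test.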
Thanks @jorisvandenbossche, I think it's not just appending but also over-writing. It seems to use LIBKML if over-writing an existing file regardless of
Thanks for the pointer, I was looking for how pyogrio is able to check what drivers are available. Will push the fixes shortly.
points = [Point(10, 20), Point(30, 40), Point(50, 60)]
gdf = gpd.GeoDataFrame(geometry=points, crs="EPSG:4326")
output_path = r'/home/nicholas/dev/data/ogr_test/temporary_kml_file2.kml'
write_dataframe(
    gdf, output_path, layer="tmp_layer", driver="KML", use_arrow=True
)
# Second write over-writes the existing file
write_dataframe(
    gdf, output_path, layer="tmp_layer", driver="KML", use_arrow=True
)
# Read the over-written file back before printing its geometry
gdf_read = read_dataframe(output_path, use_arrow=True)
print(gdf_read.geometry)
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Yes, but in this test it's the append case that matters (so that's the reason I mentioned it explicitly).
Thanks Joris, that all makes sense - I'll document this information in a separate ticket.
Thanks @nicholas-ys-tan !
With this fix in, shall we do a 0.8.1 release? (This seems like an important fix to get out for when geopandas switches to using pyogrio by default.)
Closes #420
My cython is atrocious.
I followed what is done in Fiona, QGIS and the GDAL documentation and found that this is probably what needs to be done. However, I am probably applying it too broadly, and it may have unintended consequences for other file types.
Still need to write tests. Need to investigate how this may impact other file types.