shared_coords=True vs shared_coords=False #187
Comments
Regarding the section about coverages in geos you mentioned:
As you probably know, gaps and slivers are a big issue in GIS data in general. Unless all borders between polygons/features are explicitly matched to each other, with matching vertices everywhere the features touch, a perfect topology is impossible because of rounding errors. Even a conversion from one file format to another can create gaps, e.g. when doubles are converted to discrete numbers, so data that seemed properly matched/snapped turns into gappy/slivery data. So, even though the current implementation using intersections doesn't need all vertices to be there, once you get into more complex data there will be places where a point was snapped to the middle of a line without adding the snap vertex to the neighbour, and some of those cases won't be properly "topologized". The only structural way to get perfect results all the time and with all operations (this is not limited to creating a topology) is for the data to be perfectly matched.
The data I'm working with at the moment, and that I've been using to test, is "happy day scenario" data. It is the result of polygonizing raster data, so all intersections between features are perfectly matched: every segment is either perfectly horizontal or perfectly vertical, so there are no gaps or slivers in the data. Most data out there isn't like that though :-(
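To make the "snapped to the middle of a line" case concrete, here is a minimal pure-Python sketch (not topojson code; the rings and the `segments` helper are made up for illustration). Two rectangles share the border x = 1, but only one of them has a vertex at (1, 0.5), so comparing coordinates or segments does not find the full shared border:

```python
# Two rectangles that share the border x = 1 (rings are open: last point != first).
# B has an extra vertex at (1.0, 0.5) that was snapped onto A's edge,
# but that vertex was never added to A.
a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
b = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0), (1.0, 0.5)]


def segments(ring):
    """Return the undirected edges of a ring as a set of frozensets of two points."""
    closed = ring + ring[:1]
    return {frozenset(pair) for pair in zip(closed, closed[1:])}


# Vertices present in both rings: only the two corners of the shared border.
shared_vertices = set(a) & set(b)

# Segments present in both rings: none, because A has one long edge
# where B has two short ones.
shared_segments = segments(a) & segments(b)

print(sorted(shared_vertices))  # [(1.0, 0.0), (1.0, 1.0)]
print(shared_segments)          # set()

# Rounding has a similar effect: coordinates that are "the same" on paper
# stop comparing equal after floating-point arithmetic.
print((0.1 + 0.2, 0.0) == (0.3, 0.0))  # False
```

Geometrically the two rings touch along the whole edge, yet an exact comparison sees only two shared points, which is the kind of case that ends up not properly "topologized".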
So, as long as shared_coords=True is faster, an alternative approach could be to change the default from shared_coords=True to shared_coords=False, as this will give the best results for most datasets. A user who is sure the data is already 100% cleaned/prepared (~coverage-valid) can then use shared_coords=True to get the bit of extra performance. Mind: when I was adding tests, I first started by running them with both shared_coords=False and shared_coords=True. But in the first 2 cases where I did this, the result was what seemed at first sight to be a bug in the shared_coords=True path. I might be wrong, as I was focused on shared_coords=False, but at first sight there are still some bugs there that are best fixed if the option is kept alive.
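For what "sure the data is already cleaned" could mean in practice, below is a rough sketch of a pre-check. It is an assumption on my side, not something topojson provides: `on_segment_interior` and `vertices_are_matched` are hypothetical helpers, the collinearity test is exact (real data would need a tolerance), and passing the check is only a necessary condition for coverage-valid data, not a sufficient one.

```python
def on_segment_interior(v, p, q):
    """True if point v lies strictly between p and q on the segment pq."""
    if v == p or v == q:
        return False
    cross = (q[0] - p[0]) * (v[1] - p[1]) - (q[1] - p[1]) * (v[0] - p[0])
    if cross != 0:
        return False  # not collinear (exact test, no tolerance)
    dot = (v[0] - p[0]) * (q[0] - p[0]) + (v[1] - p[1]) * (q[1] - p[1])
    length_sq = (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
    return 0 < dot < length_sq


def vertices_are_matched(rings):
    """True if no vertex of one ring lies mid-segment on another ring."""
    for i, ring in enumerate(rings):
        for j, other in enumerate(rings):
            if i == j:
                continue
            closed = other + other[:1]
            for p, q in zip(closed, closed[1:]):
                if any(on_segment_interior(v, p, q) for v in ring):
                    return False
    return True


left = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Neighbour with an extra vertex snapped onto the shared border x = 1.
right_snapped = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0.5)]
# Same left ring, but with the snap vertex added as well.
left_fixed = [(0, 0), (1, 0), (1, 0.5), (1, 1), (0, 1)]

print(vertices_are_matched([left, right_snapped]))        # False
print(vertices_are_matched([left_fixed, right_snapped]))  # True
```

A possible rule of thumb would be to opt in to shared_coords=True only when a check like this passes, and to stay on shared_coords=False otherwise. The check is quadratic in the number of rings and vertices, so it is meant as an illustration rather than something to run on large datasets.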
I'm fine with changing the default from Did you play with the |
Also observed what seems like a bug with:
import geopandas
from topojson import Topology

nybb_path = geopandas.datasets.get_path("nybb")
data = geopandas.read_file(nybb_path)
topo = Topology(
    data=data, prequantize=200, shared_coords=True
)
topo.to_alt()
No, I haven't. I turned it off because the data I've used until now didn't need any cleaning, so I don't have an opinion on what a good value would be.
As already briefly touched here: #179 (comment)
I'm not sure what the best way forward is for this?