Multiple closest street network segments with get_network_id #124
Comments
Hi @kvnkrmr, how much faster is it? If you'd like to make a PR to tackle the warning and/or a faster implementation, I'd be more than happy. |
Hi @martinfleis,
as the roads and
as the buildings. For this whole dataset, |
Thanks! That is a huge dataset. I haven't tried it, as it shows 700 hours on my machine at the moment 🤣. I assume a difference of 1s is basically nothing with GeoDataFrames as large as these. What was the total time? Testing all three versions using built-in I am curious about the benchmarks on the larger data. One idea regarding the cause of this issue: as it is highly unlikely that two different street segments will be at exactly the same distance from a building centroid, I assume that the affected segments are, in fact, duplicated, which sometimes happens in OSM. Anyway, we should print a warning with the ids explaining what happened. Looking forward to the timings and the PR! Thanks! |
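To illustrate the duplicated-segments hypothesis, here is a minimal sketch of how such duplicates could be listed together with their ids. It uses plain Python with coordinate tuples standing in for the street geometries; `find_duplicate_segments` and the `streets` dictionary are hypothetical names for this example and are not part of momepy.

```python
from collections import defaultdict


def find_duplicate_segments(segments):
    """Group network ids whose coordinate sequences are identical.

    segments: dict mapping a network id to a tuple of (x, y) coordinates.
    A line and its reversed copy are treated as the same geometry.
    """
    groups = defaultdict(list)
    for nid, coords in segments.items():
        coords = tuple(coords)
        # Use the smaller of the two orientations as an
        # orientation-independent key.
        key = min(coords, coords[::-1])
        groups[key].append(nid)
    # Keep only geometries that occur under more than one id.
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical street segments: 2 is segment 1 digitised in reverse.
streets = {
    1: ((0, 0), (5, 0), (10, 0)),
    2: ((10, 0), (5, 0), (0, 0)),
    3: ((0, 4), (10, 4)),
}

for ids in find_duplicate_segments(streets):
    print(f"Network ids {ids} share the same geometry")
```

If the ties reported in this issue come from such duplicates, every tied pair of ids should show up in one of the returned groups.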
@kvnkrmr just checking the state of this. I want to release 0.1.1 with a few bugfixes, so I just want to understand if I should wait for this or leave for the next one. Thanks! |
@martinfleis Hi, I couldn't find the time for this in the last week, so leave it for the next one. Sorry! |
Just an update on this. GeoPandas will include |
I was looking at the `get_network_id` function and found an issue. During the nearest distance calculation, information is lost if two street segments are at exactly the same distance from a building. I think this should at least be mentioned if that happens.
I stumbled over this problem while trying to improve the speed of the for loop by using `pandas.DataFrame.apply` instead of `pandas.DataFrame.iterrows`. For the `apply` I implemented two different functions, and that is where I found the issue.

My solution, which seems to be a bit faster for bigger dataframes, looks like this:
I was worried that the function using dictionaries was not working, because its results differed from those of the old approach. But when looking at the `distances` dict, I saw that a building centroid is sometimes at exactly the same distance from more than one street. That makes sense, but the information should not be dropped.
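The tie described above can be reproduced with a small sketch. This is not the momepy implementation: it uses plain Python, straight two-point segments instead of real geometries, and the hypothetical names `point_segment_distance`, `nearest_segments`, and `streets`. It returns every network id at the minimum distance and warns when there is more than one, rather than silently keeping a single id.

```python
import math
import warnings


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: both ends coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_segments(point, segments, tol=1e-9):
    """Return the ids of all segments at the minimum distance from point.

    segments: dict mapping a network id to a (start, end) pair.
    """
    distances = {
        nid: point_segment_distance(point, a, b)
        for nid, (a, b) in segments.items()
    }
    best = min(distances.values())
    ties = [
        nid for nid, d in distances.items()
        if math.isclose(d, best, abs_tol=tol)
    ]
    if len(ties) > 1:
        warnings.warn(f"Point {point} is equidistant from segments {ties}")
    return ties


# Hypothetical data: the centroid lies halfway between streets 10 and 11.
streets = {
    10: ((0.0, 0.0), (10.0, 0.0)),
    11: ((0.0, 4.0), (10.0, 4.0)),
    12: ((0.0, 9.0), (10.0, 9.0)),
}
print(nearest_segments((5.0, 2.0), streets))
```

Returning a list leaves the decision to the caller: the first id can still be used as the network id, while the warning records which other ids were equally close.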