-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Future Improvements- Returning a DataFrame of all source-destination nodes within a certain distance (and including the distance) #56
Comments
FYI - this has existed in Pandana in C++ for a while, but did not make it into the version that was converted to use Pandas DataFrames. |
Thanks Stefan, |
Hi, Stefan I think you could go the other way by returning your set of coordinates ( I assume those are transit stops) within predefined distance to parcel centroids. Then do a reversed query to get your isochrones. We have done similar analysis for transit access and we used 200+ transit stations. And pandana returned a table with 400+ columns successfully! |
Ahh yes, that's a pretty great idea @semcogli very clever |
@semcogli-that is a great idea! Funny because we are doing that already in our accessibility script to get the distance from each parcel to nearest transit stop. Thanks also to @fscottfoti and @sablanchard for replying! |
I would like to see this feature added as well. Are there any updates on this? Is it likely to be added anytime soon? |
hey folks, is there still potential to expose this from the C++ side? I'm working on a new spatial weights constructor for libpysal that uses pandana to find the "neighbors" withn some network distance of each observation. It accomplishes the spirit of this issue, but does so really inefficiently by iterating over all o-d pairs of an input geodataframe and calling |
It looks like the get_nodes_in_range function is gone, but it's not really needed. Basically this returns a DistanceVec which is a node_id to distance map (so has what you want). Also once you call precomputeRangeQueries, the info you want is just sitting there in memory here. Must be tantalizing ;) It just needs to be added to the cython file (cyaccess.pyx) to get it back into python-land. Maybe you should just add it? |
sweet, thanks @fscottfoti! i was hoping there would be a relatively simple path like this (and i was pretty sure Tantalizing indeed! I'll read up on how these C/++ extensions work. Once complete, I'm guessing this would mean reducing to a workflow something like net = pdna.Network(...)
net.precompute(3000)
neighbor_dist = net.get_distance_vec() which would grab the data sitting in memory and return a scipy sparse matrix or a dataframe or something? |
Yeah sounds right - might just return a dict of dicts of weights? The first key could be the origin node_id, the second key would be the destination node_id and the second value would be the weight? |
perfect |
I agree that this would be super useful! Thanks @fscottfoti for pointing out the relevant code. I can take a stab at revealing this on the Python side -- @knaaptime let's coordinate. |
🎉🎉 |
(fwiw, when this is implemented it would also mean you could use all the same tooling and workflows, but estimate spatial models in place of OLS for the hedonics back upstream in UDST) |
Ok, a first pass of this is working in the range-queries branch! The initial Python function is Precomputing is optional -- if it's been done, we can return cached results, and otherwise they'll be computed when the function is called. In C++, the data is stored as a vector of NodeID–distance pairs. The direct translation into Python is a list of tuples, for example the following if you ask for nodes within 4 impedance units of node 1:
What output formats seem most useful, though? Probably a dataframe or sparse matrix? I'll try some preprocessing in C++ to optimize performance for generating those. cc @janowicz |
Nice! So cython knows to turn a vector of pairs into a list of tuples? That's pretty nice. A list of namedtuples could be nice for the single node. Maybe a sparse matrix for the whole graph. |
awesome!! thanks a ton for the quick work on this! I agree with @fscottfoti on the formats |
i know you guys are busy with other parts of the package at the moment, but just wanted to drop by and see if there's anything i can do to test/otherwise help move this along :) |
Hi @knaaptime, thanks for following up about this! Looking back at my work in progress, I think it's close to being mergeable. I'll find some time soon to finish it up, and some testing/input from your end would definitely be helpful too! You can check things out in the range-queries branch, which includes a runnable example of the new functionality in One to-do was that it returns results as a list of lists of node id–distance tuples, which is not a very convenient format if you have a lot of origin nodes. We could either leave this as-is for the initial release, or take a pass at conversion to something else -- probably multi-indexed DataFrame would be the easiest but I'm not sure. |
I'm excited for this feature, thanks @smmaurer! |
awesome @smmaurer, thank you! I'll play with the range-queries branch, but i agree this is pretty much complete for what i'm after. A dataframe would be manageable, but i think a scipy.sparse matrix might be the ideal output for my use case (generating a spatial weights matrix) |
I'd love to see this get merged! Like @knaaptime, I'd love to have simple range-queries on networks with Pandana. I've been blown away by its efficiency! I've tackled the return value thing plus docstrings in my fork, and I've sent a draft PR (#167). I opted there to work in the same manner as The main issue is that the branch currently returns distances greater than the limiting radius. I've implemented a fix in c++ (that's buggy) and a fix using pandas (that's wasteful). I hope this is useful, and I'm game to do the legwork to push this through. |
hey folks, just wanted to check in and see if anyone is game to merge this and cut a new release. Would love to use this new functionality, and would be more than happy to help tie any loose ends I can :) |
sorry to bother folks, but just wanted to check back in and see if there's any hope to get this merged. If there's any work left to do, I'm happy to do the lifting |
With an eye towards the release of the urbanaccess libary, I'd like to test using pandana to build accessibility isochrones. My idea is to associate parcels with nodes (as we are already doing in our Soundcast accessibility scripts: https://github.com/psrc/soundcast/blob/master/scripts/accessibility/accessibility.py) and then have pandana return all the nodes/parcels that are accessible within a certain distance/impedance to a passed in set of coordinates. I would then be able to build polygon isochrones using either the parcel point or polygon geometries from the returned ids. I don't think this functionality exists in pandana yet- there would be potentially too many nodes/parcels for the nearest pois method. It sounds like the Future Improvement listed in the title would do the trick and so I was wondering if there are plans to implement this feature? The end goal is to produce transit accessibility isochrones so if this functionality will exist in urbanaccess then I am happy to wait until its release. Thanks!
Stefan
The text was updated successfully, but these errors were encountered: