Add PageRank #788
Pull Request Test Coverage Report for Build 5086986067
💛 - Coveralls
Everything should be working now
This is great, thanks for writing this! I really like using sprs for this; it also avoids us having to figure out how to link against and leverage BLAS/LAPACK to compute the eigenvectors of the Google matrix from an ndarray. I need to refresh my memory on the algorithm before I do a detailed review of the algorithm code. I just have some quick high-level comments from a quick scan of the code.
The algorithm is approximating the eigenvector of the transition matrix. You might want to check NetworkX's Python code directly because they have some quirks in how they handle dangling nodes etc.
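For reference, the iteration in NetworkX's pure-Python PageRank can be written roughly as below. This is a sketch for orientation only: the symbols are introduced here and do not come from the PR. $x^{(k)}$ is the row vector of scores after $k$ steps, $P$ is the row-normalized adjacency matrix (rows of dangling nodes are all zero), $D$ is the set of dangling nodes, $p$ is the personalization distribution (uniform by default), and $d$ is the dangling distribution (equal to $p$ by default).

$$
x^{(k+1)} = \alpha \left( x^{(k)} P + \Big( \sum_{i \in D} x^{(k)}_i \Big)\, d \right) + (1 - \alpha)\, p
$$

The loop stops once $\lVert x^{(k+1)} - x^{(k)} \rVert_1 < N \cdot \mathrm{tol}$, where $N$ is the number of nodes, and reports a convergence failure if that has not happened after `max_iter` steps. The dangling term is the "quirk": the score held by nodes without out-edges is redistributed according to $d$ instead of being lost.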
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Sorry for the delay in review, this looks excellent to me. The code LGTM and nothing real stands out to me as being incorrect. Just a couple of small inline suggestions and questions, but other than that I think this is ready to merge.
///
/// :returns: a read-only dict-like object whose keys are the node indices and values are the
///     PageRank score for that node.
/// :rtype: CentralityMapping
Do you want to document that it will raise `FailedToConverge` if `max_iter` is reached? It's something people might want to catch.
I think I will add the notes for this, HITS, eigenvector centrality, and all the other centralities that can fail to converge in a separate PR.
LGTM, thanks for the quick update
gonna put a pin in it and double check everything and maybe make a new issue about it, but somehow i'm getting totally uniform results from this where networkx is (a lot) slower but produces some results that make more sense
Please open a new issue. If you could give the graph that triggers the difference that would be great as well. We use the Power Method and NetworkX uses SVD decomposition to approximate the eigenvector so there might be some discrepancies.
Related to #315
Adds an implementation of the PageRank algorithm using sparse matrices. It uses the `sprs` crate combined with `ndarray` to implement a Power Method approach of finding the PageRank.

Also, we test this implementation against NetworkX's implementation of PageRank. We accept all the arguments that NetworkX accepts: tolerance, max_iter, personalization, dangling, etc.
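To make the approach concrete for reviewers, here is a minimal pure-Python sketch of a power-method PageRank that takes the same kinds of arguments. It is not the code in this PR: `pagerank_sketch` and the local `FailedToConverge` class are hypothetical names for illustration, a dict of dicts stands in for the sparse matrix, and the defaults (uniform start vector, dangling weights falling back to the personalization vector, L1 stopping rule scaled by the node count) are assumptions modelled on NetworkX's behaviour.

```python
class FailedToConverge(Exception):
    """Local stand-in for a convergence error (hypothetical, not the library's class)."""


def pagerank_sketch(edges, alpha=0.85, personalization=None, dangling=None,
                    tol=1e-6, max_iter=100):
    """Power-method PageRank over a directed edge list of (source, target) pairs."""
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    if n == 0:
        return {}

    # Sparse adjacency: out_weight[u][v] counts the edges u -> v.
    out_weight = {u: {} for u in nodes}
    for u, v in edges:
        out_weight[u][v] = out_weight[u].get(v, 0.0) + 1.0
    out_total = {u: sum(out_weight[u].values()) for u in nodes}

    def normalized(vec):
        # Scale a {node: weight} mapping so its entries sum to 1 (missing nodes get 0).
        total = sum(vec.get(u, 0.0) for u in nodes)
        return {u: vec.get(u, 0.0) / total for u in nodes}

    uniform = {u: 1.0 / n for u in nodes}
    p = normalized(personalization) if personalization else uniform
    # Assumption: dangling weights default to the personalization vector.
    d = normalized(dangling) if dangling else p
    dangling_nodes = [u for u in nodes if out_total[u] == 0.0]

    x = dict(uniform)
    for _ in range(max_iter):
        x_last = x
        # Score held by nodes without out-edges is redistributed according to d.
        dangling_mass = sum(x_last[u] for u in dangling_nodes)
        x = {u: alpha * dangling_mass * d[u] + (1.0 - alpha) * p[u] for u in nodes}
        # One sparse vector-matrix product with the row-normalized adjacency.
        for u in nodes:
            for v, w in out_weight[u].items():
                x[v] += alpha * x_last[u] * w / out_total[u]
        # Stop when the L1 change drops below n * tol.
        if sum(abs(x[u] - x_last[u]) for u in nodes) < n * tol:
            return x
    raise FailedToConverge(f"no convergence within {max_iter} iterations")


# Usage: node 2 has no out-edges, so it is treated as dangling.
scores = pagerank_sketch([(0, 1), (0, 2), (1, 2)])
```

Callers who pass a small `max_iter` would wrap the call in `try`/`except FailedToConverge`, which is the case the convergence exception exists for.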
n.b: it's ready for review