Supplementary code for the paper Efficient algorithm to compute Markov transitional probabilities for a desired PageRank
This repo includes the source code for performing Reverse PageRank calculation as introduced in the EPJ Data Science publication.
In this setting the network topology and the importance of the nodes (i.e. the stationary distribution of a random walk) are assumed to be known in advance and algorithm looks for a weighting of the edges which ensures that the random walk visits each node proportional to the importance of the nodes provided in advance.
If you are not proficient in programming Java, you can still follow the below steps as well to use the algorithm:
- (optional) Install Java Runtime Environment if it is not available on your computer.
- Download the reversePageRank.jar file from this repository.
- Create an XML file in the GEXF format (in your favorite programming language) for the network the transition probabilities you would like to learn. Make sure that every node has an attribute called
weight
, which will be treated as the expected stationary distribution for the given node. Thelabel
attributes can be used for identifying the individual nodes. (You can also find a sample GEXF file in thedata/
folder of this repo that you can immediately try out.) - Assuming your network in the GEXF format is located in the file
my_network.gexf
, you can run the algorithm by invoking the commandjava -jar reversePageRank.jar my_network.gexf
. The learned transition probabilities to meet the desired stationary distribution will be written into a file namedoutput.log
, and the loss values of the initial solution and the one found by the algorithm as described in the article get printed to the standard output.
Assuming access to Maven, the quickest way to give the algorithm a try and also to see a visualization is to enter
mvn package
in the command line.
Note that running the above command creates a file named inversePR.txt
in the current working directory with details of the learning procedure printed to it.
Relying on the functionalities provided by Graphstream, it is possible to work with networks in multiple popular network formats (including the DOT, TLP and GEXF formats).
The hu.u_szeged.graph.reader.GraphReader
is an easily extendable class which (on default) operates on the ./data/sample.gexf
sample input file in the GEXF format.
Our source code relies on the following dependencies (also included in the pom.xml for Maven.)