A fast C++ implementation of graph kernels including:
- simple kernels between vertex and/or edge label histograms,
- random walk kernels (popular baselines), and
- the Weisfeiler-Lehman graph kernel (state-of-the-art).
Please see the following paper for more details:
- Sugiyama, M., Borgwardt, K. M.: Halting in Random Walk Kernels, Advances in Neural Information Processing Systems (NIPS 2015), 2015 [PDF]
Other implementations of graph kernels and graph benchmark datasets are available here.
You can compute the full kernel matrix by calling the function graphKernelMatrix
.
To use it, you just need to include the header file "graphKernels.h" in your program.
The Eigen and igraph libraries are needed.
The main function graphKernelMatrix
is defined as:
void graphKernelMatrix(vector<igraph_t>& g, vector<double>& par,
string& kernel_name, MatrixXd& K);
g
: a vector of input graphspar
: a vector of parameterskernel_name
: a string to specify a kernel (see the list below)K
: the full kernel matrix will be returned here
To try the code, we also provide a graph benchmark dataset "mutag" and a test code "graphKernels_test.cc", which includes input and output interface for graph files.
For example:
$ cd src/cc
$ make
$ ./gkernel -k kR -p 1,2,1 -i ../../data/mutag.list -g ../../data/mutag/ -o mutag_kR.kernel
> Reading files ... end
> Information:
Kernel: k-step random walk kernel
Parameter: k = 3, lambda = (1, 2, 1)
Number of graphs: 188
The average number of vertices: 17.9309
The maximum number of vertices: 28
The average number of edges: 19.7926
The maximum number of edges: 33
> Computing the kernel matrix ... end
Runtime for the kernel matrix computation: 2.9501 (sec.)
> Writing the kernel matrix to "mutag_kR.kernel" ... end
$ ./gkernel -k WL -p 5 -i ../../data/mutag.list -g ../../data/mutag/ -o mutag_WL.kernel
> Reading files ... end
> Information:
Kernel: Weisfeiler-Lehman kernel
Parameter: h = 5
Number of graphs: 188
The average number of vertices: 17.9309
The maximum number of vertices: 28
The average number of edges: 19.7926
The maximum number of edges: 33
> Computing the kernel matrix ... end
Runtime for the kernel matrix computation: 0.00567007 (sec.)
> Writing the kernel matrix to "mutag_WL.kernel" ... end
To compile the program, please edit paths in the "Makefile" according to the location of Eigen and igraph libraries in your environment.
-k <kernel_name>
: An abbreviated kernel name (see the list below)
-p <parameter>
: Parameter(s) in a kernel (if there are more than two, they should be comma-separated)
-i <input_file_list>
: A file describing the list of graph file names
-g <input_file_path>
: A path to the directory of graph files (the GraphML format is supported)
-o <output_file>
: An output file of the full kernel matrix
The following graph kernels are supported:
The second column (Abbrev.) is used for the third argument of graphKernelMatrix
.
Kernels | Abbrev. | Parameter |
---|---|---|
Linear kernel between vertex histograms | V | None |
Linear kernel between edge histograms | E | None |
Linear kernel between vertex-edge histograms | VE | None |
Linear kernel combination (V + λVE) | H | λ |
Gaussian RBF kernel between vertex histograms | VG | σ |
Gaussian RBF kernel between edge histograms | EG | σ |
Gaussian RBF kernel between vertex-edge histograms | VEG | σ |
Geometric random walk kernel | GR | λ |
Exponential random walk kernel | ER | β |
k-step random walk kernel | kR | λ0, λ1, ..., λk |
Weisfeiler-Lehman subtree kernel | WL | h |
The development of the graphkernels open access software package was financially supported by the Horizon 2020/CDS-QUAMRI/634541 project. This support is gratefully acknowledged.
Author: Mahito Sugiyama
Affiliation: National Institute of Informatics, Japan
E-mail: mahito@nii.ac.jp