
Installation Instructions

This file explains how to install the code used for the paper

Verifying message-passing neural networks via topology-based bounds tightening

by Christopher Hojny (*), Shiqiang Zhang (*), Juan S. Campos, and Ruth Misener (* co-first authors), published in: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, PMLR 235, 2024.

Here is what you have to do to get the code running:

  1. Download SCIP from https://scipopt.org/. We recommend using at least version 8.0.1; the code has not been tested with older versions.

  2. Install SCIP and compile it as described in the INSTALL file in SCIP's main directory, using your individual settings. Make sure to create the necessary softlinks in SCIP's lib directory.

    To replicate the results from the above paper, we recommend the compilation command "make LPS=spx OPT=opt", i.e., the following settings:

    (a) LPS=spx: Use SoPlex as the LP solver. For this, you must install SoPlex, which is also available at http://scipopt.org. If you have installed SCIP via the SCIP Optimization Suite, SoPlex is already included.

    (b) OPT=opt: The code is compiled in optimized mode and runs significantly faster.

    On some machines, you should use gmake instead of make.
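Assuming SCIP has been downloaded and unpacked to a hypothetical directory ~/scip, steps 1 and 2 might look as follows; all paths here are placeholders, not part of the original instructions:

```shell
# Hypothetical example — adjust the path to your SCIP download.
cd ~/scip                    # SCIP's main directory
make LPS=spx OPT=opt         # SoPlex as LP solver, optimized build
ls lib                       # the softlinks/library files should appear here
```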

  3. Clone this repository (it contains the project).

  4. There are two options to specify the path to the SCIP directory:

    • Set the environment variable SCIP_PATH to contain the path to SCIP's root directory.

    • In the Makefile in the directory scipgnn, edit the variable SCIPDIR if necessary. It should point to the directory that contains SCIP, i.e., $SCIPDIR/lib contains the SCIP library files.
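For example, with a hypothetical SCIP location /opt/scip, the first option amounts to:

```shell
# /opt/scip is a placeholder for SCIP's root directory;
# $SCIP_PATH/lib should then contain the SCIP library files.
export SCIP_PATH=/opt/scip
echo "$SCIP_PATH"
```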

  5. Compile the project: In the main directory scipgnn, enter exactly the same compilation command as used in Step 2.

  6. To run the program, enter bin/scipgnn.$(OSTYPE).$(ARCH).$(COMP).$(OPT).$(LPS) (e.g., "bin/scipgnn.linux.x86_64.gnu.opt.spx2"). The first two arguments are mandatory and specify (i) a file with the information about the trained GNN and (ii) a file with the information about the verification problem to be solved.

    Optional parameters are:

    -s <setting file>

    -t <time limit>

    -m <mem limit>

    -n <node limit>

    -d <display frequency>
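For illustration, a full call with a settings file and a one-hour time limit could look like the following sketch; the binary suffix depends on your system, and all file names are placeholders:

```shell
# Placeholder file names — substitute your own GNN and instance files.
bin/scipgnn.linux.x86_64.gnu.opt.spx2 model.gnn instance.gcinfo \
    -s mysettings.set -t 3600
```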

    To obtain files (i) and (ii), we provide scripts in the directory scripts_experiments.

    (i) The script scripts_experiments/generate_instance_from_json.py generates a file encoding a trained GNN from a JSON file generated by PyTorch. The script is called via

    `scripts_experiments/generate_instance_from_json.py <gnn.json> <path-to-store> <filename>`
    

    where

    - `gnn.json` is the JSON file encoding a trained GNN
    - `path-to-store` is the path to an existing directory in which the generated file
      is stored
    - `filename` is the name of the file to be generated
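A hedged example call, assuming a trained GNN exported to a file gnn.json (all names here are placeholders for illustration):

```shell
# Placeholder names — the generated file would be written to out/model.gnn.
scripts_experiments/generate_instance_from_json.py gnn.json out model.gnn
```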
    

    (ii) To create files modeling the graph or node classification problems, we provide two scripts. The script

    scripts_experiments/create_temporary_data_graphclassification.py

    creates the instance files for graph classification problems and the script

    scripts_experiments/create_temporary_data_nodeclassification.py

    creates the files for node classification problems.

    Both scripts receive the following input:

    • inputfile: a file containing information about the underlying graph
    • outputfile: the path to a file in which the output shall be stored
    • globalbudget: the global attack budget (percentage of edges for graph classification and total number of edges for node classification)
    • strength: local attack strength
    • labelshift: shift that determines the attack label (if the original label is l and the shift is s, the attack label is (l + s) mod #labels).
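As a quick sanity check of the label-shift rule, with made-up values l = 5, s = 1 and 6 labels:

```shell
# attack label = (l + s) mod #labels; the values are illustrative only
l=5; s=1; num_labels=6
attack_label=$(( (l + s) % num_labels ))
echo "$attack_label"    # wraps around to 0
```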

    The input files containing information about the underlying graphs are generated by the scripts

    scripts_experiments/generate_basic_instance_files_graphclassification.py

    or

    scripts_experiments/generate_basic_instance_files_nodeclassification.py

    Both receive the following arguments as input:

    • the path to the directory that contains the PyTorch files for the corresponding GNNs (in pt-format)
    • the path to the directory in which the generated file shall be stored
    • for graph classification either "MUTAG" or "ENZYMES", for node classification either "CiteSeer" or "Cora"
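For instance, a hypothetical call to regenerate the MUTAG graph files could look as follows (the directory names are placeholders; the last argument selects the dataset):

```shell
# Placeholder directories — point the first at your pt-format GNN files.
scripts_experiments/generate_basic_instance_files_graphclassification.py \
    pytorch_models/ output_instances/ MUTAG
```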

    For the sake of convenience, we provide the files containing information about the underlying graphs as well as the files containing the trained GNNs. The files

    data_experiments/graph_classification_instances/graph_{ENZYMES,MUTAG}_<num>.gcinfo

    contain information about the graph classification problems for ENZYMES and MUTAG, where <num> is the numerical identifier of the graph in the corresponding collection.

    The files

    data_experiments/node_classification_instances/graph_{Cora,CiteSeer}_<num>.gcinfo

    contain information about the node classification problems for Cora and CiteSeer, where <num> provides the index of the node of the corresponding graph that is attacked.

    Finally,

    data_experiments/gnn_instances/model_{CiteSeer,Cora,ENZYMES,MUTAG}.gnn

    encode the trained GNNs used in our experiments. That is, to run the code, only the script

    scripts_experiments/create_temporary_data_graphclassification.py

    or

    scripts_experiments/create_temporary_data_nodeclassification.py

    needs to be called to create an instance. An exemplary call is

    scripts_experiments/create_temporary_data_graphclassification.py data_experiments/graph_classification_instances/graph_MUTAG_1.gcinfo tmpfile 0.1 2 1

    This creates an instance for the graph with ID 1 from the MUTAG test set; the result is stored as tmpfile, with a global attack budget of 0.1 (i.e., 10% of all edges), a local attack strength of 2, and a label shift of 1.