Summary

This repo contains the implementation of HypE, the knowledge hypergraph embedding model proposed in Knowledge Hypergraphs: Extending Knowledge Graphs Beyond Binary Relations, as well as the code for most of the baselines in the paper. The code can also be used to train a HypE model on any input graph.

Note however that the code is designed to handle graphs with arity at most 6 (which is the case for JF17K).
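
Before training on a dataset other than JF17K, it may be worth checking that no tuple exceeds this limit. The sketch below is only an illustration: it assumes each line of a data file holds a relation name followed by its entities, separated by tabs, which is an assumption about the file format rather than something this README specifies. The names MAX_ARITY, tuple_arity, and check_max_arity are hypothetical and are not part of this repo.

```python
MAX_ARITY = 6  # limit stated in this README


def tuple_arity(line, sep="\t"):
    """Count the entities on one tuple line (all fields except the relation name).

    Assumes the first field is the relation; this layout is an assumption.
    """
    fields = [field for field in line.strip().split(sep) if field]
    return max(len(fields) - 1, 0)


def check_max_arity(lines, max_arity=MAX_ARITY, sep="\t"):
    """Return (line_number, arity) pairs for every tuple whose arity exceeds max_arity."""
    offenders = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        arity = tuple_arity(line, sep)
        if arity > max_arity:
            offenders.append((number, arity))
    return offenders


if __name__ == "__main__":
    # Made-up sample tuples, not taken from JF17K.
    sample = [
        "born_in\talice\tparis",
        "played_for\tbob\tteam_a\t2001\t2004\tforward\tcaptain\textra",
    ]
    print(check_max_arity(sample))  # [(2, 7)]
```

An empty result means every tuple fits within the supported arity.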

The software can also be used as a framework for implementing new knowledge hypergraph embedding models.

Dependencies

  • Python version 3.7
  • Numpy version 1.17
  • PyTorch version 1.4.0

Docker

We recommend running the code inside a Docker container. To do so, you will first need to have Docker installed. You can then build the image with:

docker build -t hype-image:latest .

and run it using (replace {HypE-code-path} with the path to your local repo):

docker run --rm -it -v {HypE-code-path}:/eai/project --user `id -u`:`id -g` hype-image /bin/bash

Usage

To train HypE or any of the baselines you should define the parameters relevant to the given model. The default values for most of these parameters are the ones that were used to obtain the results in the paper.

  • model: The name of the model. Valid options are HypE, HSimplE, MTransH, MDistMult, MCP.

  • dataset: The dataset you want to run this model on (JF17K is included in this repo).

  • batch_size: The training batch size.

  • num_iterations: The total number of training iterations.

  • lr: The learning rate.

  • nr: The number of negative examples per positive example for each arity.

  • out_channels: The number of output channels for the convolution filters in HypE.

  • filt_w: The width of the convolutional weight filters in HypE.

  • stride: The stride of the convolutional weight filters in HypE.

  • emb_dim: The embedding dimension.

  • input_drop: The dropout rate for the input layer of all models.

  • hidden_drop: The dropout rate for the hidden layer of all models.

  • no_test_by_arity: when set, test results will not be saved by arity, but as a whole. This generally makes testing faster.

  • test: when set, a trained model will be evaluated on the test dataset. If this option is present, you must specify the path to a trained model using the -pretrained argument.

  • pretrained: the path to a pretrained model file. If this path exists, then the code will load a pretrained model before starting the train or test process. The filename is expected to have the form model_*.chkpnt. The directory containing this file is expected to also contain the optimizer as opt_*.chkpnt, if training is to resume.

  • output_dir: the path to the output directory where the results and models will be saved. If left empty, then a directory will be automatically created.

  • restartable: when set, the training job will be restartable: it will load the model from the last saved checkpoint in output_dir, as well as the best_model, and resume training from that point on. If this option is set, you must also specify output_dir.
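
Two of the options above come with constraints: test requires pretrained, and restartable requires output_dir. The following is a minimal sketch of how such constraints could be checked with argparse. It is not the parser used in main.py; build_parser and validate are hypothetical names, and only the four flags involved in the constraints are declared.

```python
import argparse


def build_parser():
    """Build an illustrative parser with the flags that have constraints (hypothetical)."""
    parser = argparse.ArgumentParser(description="Illustrative argument checks")
    parser.add_argument("-pretrained", default=None, help="path to model_*.chkpnt")
    parser.add_argument("-output_dir", default=None, help="directory for results")
    parser.add_argument("-test", action="store_true", help="test a trained model")
    parser.add_argument("-restartable", action="store_true", help="resume from output_dir")
    return parser


def validate(args):
    """Return a list of messages, one for each violated constraint."""
    errors = []
    if args.test and args.pretrained is None:
        errors.append("-test requires -pretrained")
    if args.restartable and args.output_dir is None:
        errors.append("-restartable requires -output_dir")
    return errors


if __name__ == "__main__":
    args = build_parser().parse_args(["-test", "-restartable"])
    for message in validate(args):
        print(message)
```

Collecting all violations in a list, rather than stopping at the first one, lets a user fix every problem with the command line in one pass.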

Training HypE and HSimplE

You can train by running the following from within Docker (the values provided below are the ones used to obtain the results in the paper):

python main.py -model HypE -dataset JF17K -num_iterations 1000 -batch_size 128 -lr 0.1 -filt_w 1 -out_channels 6 -stride 2 -emb_dim 200 -nr 10
python main.py -model HSimplE -dataset JF17K -num_iterations 1000 -batch_size 128 -lr 0.01 -emb_dim 200 -nr 10
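
To get a feel for how emb_dim, filt_w, stride, and out_channels interact, the sketch below computes the output size of a one-dimensional convolution without padding, using the standard formula floor((emb_dim - filt_w) / stride) + 1 per channel. That HypE applies its filters to the embedding in exactly this way is an assumption here, not something this README states; conv_output_length and conv_feature_size are hypothetical helper names.

```python
def conv_output_length(emb_dim, filt_w, stride):
    """Output length of an unpadded 1D convolution over a vector of size emb_dim."""
    if filt_w > emb_dim:
        raise ValueError("filter width cannot exceed the embedding dimension")
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return (emb_dim - filt_w) // stride + 1


def conv_feature_size(emb_dim, filt_w, stride, out_channels):
    """Total number of features across all output channels."""
    return conv_output_length(emb_dim, filt_w, stride) * out_channels


if __name__ == "__main__":
    # Values from the HypE command above.
    print(conv_output_length(200, 1, 2))    # 100
    print(conv_feature_size(200, 1, 2, 6))  # 600
```

Under this assumption, the HypE settings above turn a 200-dimensional embedding into 100 values per channel, or 600 values across the 6 output channels.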

Testing a pretrained model

You can test a pretrained model by running the following:

python main.py -model HSimplE -dataset JF17K -pretrained output/my_pretrained_model.chkpnt -test
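
The pretrained option expects a file named model_*.chkpnt, with the optimizer stored alongside it as opt_*.chkpnt when training is to resume. A minimal sketch of deriving the optimizer path from the model path follows. It assumes the * part is identical in both filenames, which the option's description suggests but does not guarantee; optimizer_path_for is a hypothetical helper, not a function from this repo.

```python
from pathlib import Path


def optimizer_path_for(model_path):
    """Map .../model_<suffix>.chkpnt to .../opt_<suffix>.chkpnt in the same directory.

    Assumes both files share the same <suffix>; this is an assumption.
    """
    model_path = Path(model_path)
    name = model_path.name
    if not (name.startswith("model_") and name.endswith(".chkpnt")):
        raise ValueError(f"expected a file named model_*.chkpnt, got {name!r}")
    return model_path.with_name("opt_" + name[len("model_"):])


if __name__ == "__main__":
    # Hypothetical checkpoint name used only for illustration.
    print(optimizer_path_for("output/model_100.chkpnt"))
```

In this sketch, a filename that does not follow the model_*.chkpnt pattern raises a ValueError instead of guessing at an optimizer file.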

Baselines

The baselines implemented in this package are m-DistMult, m-CP, and m-TransH. You can train them by running the following:

python main.py -model MTransH -dataset JF17K -num_iterations 1000 -batch_size 128 -lr 0.06 -emb_dim 200 -nr 10
python main.py -model MCP -dataset JF17K -num_iterations 1000 -batch_size 128 -lr 0.02 -emb_dim 34 -nr 10
python main.py -model MDistMult -dataset JF17K -num_iterations 1000 -batch_size 128 -lr 0.02 -emb_dim 200 -nr 10

Cite HypE

If you use this package for published work, please cite the following paper:

@inproceedings{fatemi2019knowledge,
  title={Knowledge Hypergraphs: Prediction Beyond Binary Relations},
  author={Fatemi, Bahare and Taslakian, Perouz and Vazquez, David and Poole, David},
  booktitle={IJCAI},
  year={2020}
}

Contact

Bahare Fatemi

Computer Science Department

The University of British Columbia

201-2366 Main Mall, Vancouver, BC, Canada (V6T 1Z4)

bfatemi@cs.ubc.ca

License

Licensed under the GNU General Public License Version 3.0. https://www.gnu.org/licenses/gpl-3.0.en.html