Entity Alignment is the task of linking entities that share the same real-world identity across different knowledge graphs. EVA is a set of algorithms that leverage the images in knowledge graphs to facilitate Entity Alignment.
This repo holds the code for reproducing the models presented in our AAAI 2021 paper, Visual Pivoting for (Unsupervised) Entity Alignment [arxiv][aaai].
Download the data used (DBP15k and DWY15k, along with precomputed features) from dropbox or BaiduDisk (code: dhya); it is 1.3GB after unzipping. Place it under data/.
Original sources of DBP15k and DWY15k:
[optional] The raw images of the entities that appear in DBP15k and DWY15k can be downloaded from dropbox (108GB after unzipping). All images are stored in dictionaries as title-image pairs and can be accessed with the following code:
import pickle

# Load the title -> image dictionary for the Chinese side of DBP15k
with open("eva_image_resources/dbp15k/zh_dbp15k_link_img_dict_full.pkl", "rb") as f:
    zh_images = pickle.load(f)
print(zh_images["http://zh.dbpedia.org/resource/香港有線電視"].size)
We use the DWY15k dataset as an example (files not used in experiments are omitted).
data/DWY_data/
├── dwy15k_dense_sf_vec.npy: surface form vectors encoded by fastText (dense split)
├── dwy15k_norm_sf_vec.npy: surface form vectors encoded by fastText (normal split)
├── dbp_wd_15k_V1/: normal split
│ ├── mapping/
│ │ ├── 0_3/: the third split (used across all experiments)
│ │ │ ├── ent_ids_1: mapping between entity names and ids for graph 1
│ │ │ ├── ent_ids_2: mapping between entity names and ids for graph 2
│ │ │ ├── rel_ids_1: mapping between relation names and ids for graph 1
│ │ │ ├── rel_ids_2: mapping between relation names and ids for graph 2
│ │ │ ├── ill_ent_ids: inter-lingual links (specified by ids)
│ │ │ ├── triples_1: a list of tuples in the form of (head, relation, tail) for graph 1 (specified by ids)
│ │ │ ├── triples_2: a list of tuples in the form of (head, relation, tail) for graph 2 (specified by ids)
│ │ │ ├── ...
│ │ ├── ...
│ ├── ...
├── dbp_wd_15k_V2/: dense split
│ ├── ...
data/pkls/
├── dbpedia_wikidata_15k_norm_GA_id_img_feature_dict.pkl: mapping from entity names to image features for DWY15k (normal)
│ ├── ...
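The mapping files listed above can be read with a few lines of Python. The following is a minimal sketch, not the loader used in this repo: it assumes that each line of ent_ids_* / rel_ids_* has the form `<id><TAB><name>`, and that each line of triples_* and ill_ent_ids holds whitespace-separated integer ids. The function names (`parse_id_map`, `parse_triples`, `parse_links`) and the sample lines are hypothetical and only serve as illustration.

```python
from typing import Dict, Iterable, List, Tuple


def parse_id_map(lines: Iterable[str]) -> Dict[int, str]:
    """Parse lines assumed to look like '<id>\\t<name>' into {id: name}."""
    id_to_name: Dict[int, str] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        idx, name = line.split("\t", 1)
        id_to_name[int(idx)] = name
    return id_to_name


def parse_triples(lines: Iterable[str]) -> List[Tuple[int, int, int]]:
    """Parse lines assumed to hold 'head relation tail' as integer ids."""
    triples: List[Tuple[int, int, int]] = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        head, rel, tail = (int(p) for p in parts)
        triples.append((head, rel, tail))
    return triples


def parse_links(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse alignment links assumed to hold 'id_in_graph_1 id_in_graph_2'."""
    links: List[Tuple[int, int]] = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        left, right = (int(p) for p in parts)
        links.append((left, right))
    return links


# Made-up sample lines in the assumed format (not taken from the dataset).
sample_ent_ids = ["0\thttp://dbpedia.org/resource/A\n",
                  "1\thttp://dbpedia.org/resource/B\n"]
sample_triples = ["0\t5\t1\n"]
sample_links = ["0\t15000\n"]

ent_names = parse_id_map(sample_ent_ids)
triples = parse_triples(sample_triples)
links = parse_links(sample_links)
print(len(ent_names), triples, links)
```

With the real data, the same functions accept an open file handle, e.g. `with open("data/DWY_data/dbp_wd_15k_V1/mapping/0_3/ent_ids_1", encoding="utf-8") as f: ent_names = parse_id_map(f)`. If the files turn out to use a different delimiter or column order, only the `split` calls need to change.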
The code is tested with python 3.7 and torch 1.7.0.
Run the full model on DBP15k:
./run_dbp15k.sh 0 2020 fr_en
where 0 specifies the GPU device, 2020 is the random seed, and fr_en sets the language pair.
Similarly, you can run the full model on DWY15k:
./run_dwy15k.sh 0 2020 1
where the first two args are the same as before and the third specifies whether to use the normal (1) or dense (2) split.
To run without iterative learning:
./run_dbp15k_no_il.sh 0 2020 fr_en
./run_dwy15k_no_il.sh 0 2020 1
To run the unsupervised setting on DBP15k:
./run_dbp15k_unsup.sh 0 2020 fr_en
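To repeat any of the runs above with several seeds, the commands can be generated programmatically. The sketch below is an optional convenience and not part of this repo: `build_commands` is a hypothetical helper, and the second seed (2021) is an arbitrary choice for illustration. It only relies on the argument order documented above (GPU device, seed, then language pair or split).

```python
from itertools import product
from typing import List, Sequence


def build_commands(script: str, gpu: int, seeds: Sequence[int],
                   settings: Sequence[str]) -> List[List[str]]:
    """Return one argument list per (seed, setting) combination."""
    return [[script, str(gpu), str(seed), str(setting)]
            for seed, setting in product(seeds, settings)]


# Seeds beyond 2020 are arbitrary; "fr_en" is the pair used in the examples above.
commands = build_commands("./run_dbp15k.sh", 0, [2020, 2021], ["fr_en"])

for cmd in commands:
    print(" ".join(cmd))
    # To actually launch a run, uncomment the two lines below:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

The same helper works for the DWY15k script by passing "./run_dwy15k.sh" and the split ids "1" and "2" as settings.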
Our code is adapted from KECG. We thank the authors for open-sourcing KECG.
@inproceedings{liu2021visual,
title={Visual Pivoting for (Unsupervised) Entity Alignment},
author={Liu, Fangyu and Chen, Muhao and Roth, Dan and Collier, Nigel},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={35},
number={5},
pages={4257--4266},
year={2021}
}
EVA is MIT licensed. See the LICENSE file for details.