This open-source project contains the Python implementation of our approach TemporalFC (published at ISWC 2023). The project is designed to ease real-world applications of fact-checking over knowledge graphs and to produce better results. To this end, we rely on:
- PyTorch Lightning to perform training on multiple CPUs, GPUs, TPUs, or a computing cluster,
- Pre-trained-TKG-embeddings to obtain pre-trained temporal knowledge graph (TKG) embeddings for the knowledge-graph-based component,
- Elastic-search to load the text corpus (Wikipedia) into Elasticsearch for the text-based component, and
- Path-based-approach to calculate the output score for the path-based component.
This project performs 2 independent tasks:
- Fact-checking
- Time-point prediction
First clone the repository:
git clone https://github.com/dice-group/TemporalFC.git
cd TemporalFC
There are two options to reproduce the results:
- Use the pre-processed input dataset, or
- Regenerate the input dataset from scratch.
Select one of these two options.
Download and unzip the data and embedding files in the root folder of the project:
pip install gdown
wget https://files.dice-research.org/datasets/ISWC2023_TemporalFC/data_TP.zip
unzip data_TP.zip
Note: if you get a "permission denied" error, try running the commands with "sudo".
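If wget or unzip is not available on your machine, the same download step can be sketched with the Python standard library. This is only an illustration, not part of the project: the function name download_and_extract is hypothetical, and the URL is the one used in the wget command above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL copied from the wget command above.
DATA_URL = "https://files.dice-research.org/datasets/ISWC2023_TemporalFC/data_TP.zip"


def download_and_extract(url: str, dest_dir: str = ".") -> Path:
    """Download a zip archive into dest_dir and unpack it there (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]       # e.g. data_TP.zip
    urllib.request.urlretrieve(url, archive)       # plays the role of wget
    with zipfile.ZipFile(archive) as zf:           # plays the role of unzip
        zf.extractall(dest)
    return dest
```

Calling download_and_extract(DATA_URL) from the root folder of the project should leave the unpacked archive next to main.py, as the shell commands do.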
To regenerate the data from scratch, you need to re-train the embedding algorithm and put the generated embeddings in the data_TP/dataset_name/embeddings folder, and the dataset in the data_TP/dataset_name/train and data_TP/dataset_name/test folders.
Detailed instructions are in the overall_process folder.
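Before starting a training run on regenerated data, it can help to check that the folders described above are in place. The following is a minimal sketch: the helper missing_folders is hypothetical, and it assumes that dataset_name is the same string you pass to --eval_dataset (for example Dbpedia124k), which you should verify against your own folder names.

```python
from pathlib import Path
from typing import List

# Sub-folders named in the instructions above.
REQUIRED_SUBFOLDERS = ("embeddings", "train", "test")


def missing_folders(dataset_name: str, data_root: str = "data_TP") -> List[str]:
    """Return the expected folders that do not exist yet (hypothetical helper)."""
    base = Path(data_root) / dataset_name
    return [str(base / sub) for sub in REQUIRED_SUBFOLDERS
            if not (base / sub).is_dir()]


if __name__ == "__main__":
    # Assumption: the folder name matches the --eval_dataset value.
    missing = missing_folders("Dbpedia124k")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

The check only looks at folder names; it does not validate the content or format of the embeddings and dataset files.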
Install dependencies via conda:
#setting up environment
#creating and activating conda environment
conda env create -f environment.yml
conda activate tfc
#If conda command not found: download miniconda from (https://docs.conda.io/en/latest/miniconda.html#linux-installers) and set the path:
#export PATH=/path-to-conda/miniconda3/bin:$PATH
Start generating results:
# Start the training process with the required hyperparameters. Details about the other hyperparameters are in the main.py file.
python main.py --eval_dataset Dbpedia124k --model temporal-full-hybrid --max_num_epochs 500 --min_num_epochs 50 --batch_size 12000 --val_batch_size 1000 --negative_triple_generation corrupted-triple-based --task fact-checking --emb_type dihedron --embedding_dim 100 --num_workers 1
# computing evaluation files from saved model in "dataset/Hybrid_Stroage" directory
python evaluate_checkpoint_model_FC.py --checkpoint_dir_folder all --checkpoint_dataset_folder dataset/ --eval_dataset Dbpedia124k --model temporal-full-hybrid --max_num_epochs 500 --min_num_epochs 50 --batch_size 12000 --val_batch_size 1000 --negative_triple_generation corrupted-triple-based --task fact-checking --emb_type dihedron --embedding_dim 100 --num_workers 1
# Start the training process with the required hyperparameters. Details about the other hyperparameters are in the main.py file.
python main.py --eval_dataset Dbpedia124k --model temporal-prediction-model --max_num_epochs 500 --min_num_epochs 50 --batch_size 12000 --val_batch_size 1000 --negative_triple_generation False --task time-prediction --emb_type dihedron --embedding_dim 100 --num_workers 1
# computing evaluation files from saved model in "dataset/Hybrid_Stroage" directory
python evaluate_checkpoint_model_TP.py --checkpoint_dir_folder all --checkpoint_dataset_folder dataset/ --eval_dataset Dbpedia124k --model temporal-prediction-model --max_num_epochs 500 --min_num_epochs 50 --batch_size 12000 --val_batch_size 1000 --negative_triple_generation False --task time-prediction --emb_type dihedron --embedding_dim 100 --num_workers 1
- To reproduce the exact results, use the exact parameters listed above.
- For other datasets, change the value passed to --eval_dataset.
- Use parallel processing to speed up the runs. The default is set to 4 workers, which we used to generate the results.
Available embeddings types: dihedron
Available models: temporal-prediction-model, temporal-full-hybrid
Note: model names are case-sensitive, so please use the exact names.
The fact-checking task requires the negative triple generation parameter (--negative_triple_generation).
Available options are: (1) corrupted-triple-based and (2) corrupted-time-based.
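The training commands for the two tasks differ only in the model, the task, and the negative triple generation value. The sketch below assembles them from one place and rejects combinations that do not appear above. It is not part of the project: build_command and TASKS are hypothetical names, and the pairing of each task with a model and its allowed values is inferred from the example commands, not from main.py.

```python
from typing import List, Optional

# task -> (model, allowed --negative_triple_generation values).
# Values are copied from the commands above; the pairing is an assumption.
TASKS = {
    "fact-checking": ("temporal-full-hybrid",
                      ("corrupted-triple-based", "corrupted-time-based")),
    "time-prediction": ("temporal-prediction-model", ("False",)),
}


def build_command(task: str,
                  negative_triple_generation: Optional[str] = None,
                  eval_dataset: str = "Dbpedia124k",
                  script: str = "main.py") -> List[str]:
    """Build the argument list for one run (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASKS)}")
    model, allowed = TASKS[task]
    # Fall back to the first allowed value when none is given.
    negatives = negative_triple_generation or allowed[0]
    if negatives not in allowed:
        raise ValueError(f"{negatives!r} is not listed for task {task!r}")
    return ["python", script,
            "--eval_dataset", eval_dataset, "--model", model,
            "--max_num_epochs", "500", "--min_num_epochs", "50",
            "--batch_size", "12000", "--val_batch_size", "1000",
            "--negative_triple_generation", negatives, "--task", task,
            "--emb_type", "dihedron", "--embedding_dim", "100",
            "--num_workers", "1"]


if __name__ == "__main__":
    # Print both training commands without running them.
    for task in TASKS:
        print(" ".join(build_command(task)))
```

To launch a run instead of printing it, the returned list can be passed to subprocess.run(cmd, check=True) from the root folder of the project.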
As future work, we will exploit the modularity of TemporalFC by integrating time-period based fact checking.
This work has been supported by the EU H2020 Marie Skłodowska-Curie project KnowGraphs (no. 860801).
If you find our work useful in your research, please consider citing the respective paper:
#TemporalFC
@inproceedings{qudus2023TemporalFC,
author = {Qudus, Umair and R\"{o}der, Michael and Kirrane, Sabrina and Ngomo, Axel-Cyrille Ngonga},
title = {TemporalFC: A Temporal Fact Checking Approach over Knowledge Graphs},
year = {2023},
isbn = {978-3-031-47239-8},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
url = {https://doi.org/10.1007/978-3-031-47240-4_25},
doi = {10.1007/978-3-031-47240-4_25},
booktitle = {The Semantic Web – ISWC 2023: 22nd International Semantic Web Conference, Athens, Greece, November 6–10, 2023, Proceedings, Part I},
pages = {465–483},
numpages = {19},
keywords = {temporal fact checking, ensemble learning, transfer learning, time-point prediction, temporal knowledge graphs},
location = {Athens, Greece}
}
#HybridFC
@InProceedings{qudus2022HybridFC,
Author = {Qudus, Umair and Röder, Michael and Saleem, Muhammad and Ngomo, Axel-Cyrille Ngonga},
Editor ={Sattler, Ulrike and Hogan, Aidan and Keet, Maria and Presutti, Valentina and Almeida, Jo{\~a}o Paulo A. and Takeda, Hideaki and Monnin, Pierre and Pirr{\`o}, Giuseppe and d'Amato, Claudia},
Title = {HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs},
booktitle = {The Semantic Web -- ISWC 2022},
Year = {2022},
Doi = {10.1007/978-3-031-19433-7\_27},
isbn ={978-3-031-19433-7},
pages = {462--480},
address ={Cham},
publisher = {Springer International Publishing},
biburl = {https://www.bibsonomy.org/bibtex/2ec2f0b9ee7ca0c1c6ef1d8fbcd7262e4/dice-research},
keywords = {knowgraphs frockg raki 3dfed dice ngonga saleem roeder qudus},
url = {https://papers.dice-research.org/2022/ISWC_HybridFC/public.pdf},
}
If you have any questions, please contact umair.qudus@uni-paderborn.de or umair.qudus@hotmail.com.
- Umair Qudus (DICE, Paderborn University)
- Michael Röder (DICE, Paderborn University)
- Sabrina Kirrane (WU, WU Vienna)
- Axel-Cyrille Ngonga Ngomo (DICE, Paderborn University)