This repository contains the code and data for the paper Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models, LREC-COLING 2024.
The project consists of three steps: (1) generating data, (2) processing data, and (3) running KGC models.
You can directly download the data we generated from Google Drive. Put the data in the current directory, e.g., MPIKGC/LP_fb_wn_chatglm2/.
If you want to generate the data with LLMs yourself, you can query the LLMs with the commands below.
First, cd generating_MPIKG/, then run the command for your dataset.
For FB15k237:
python querying_llm_fb15k237.py --max_length 256 --temperature 0.2 --cuda 0 --batchsize 1 --LLMfold './../LP_fb_wn_llama2/' --LLMname ChatGLM2
For WN18RR:
python querying_llm_wn18rr.py --max_length 256 --temperature 0.2 --cuda 0 --batchsize 1 --LLMfold './../LP_fb_wn_llama2/' --LLMname ChatGLM2
For FB13 and WN11:
python querying_llm_TC_FB13_WN11.py --max_length 256 --temperature 0.2 --cuda 0 --batchsize 1 --LLMfold './../TC_fb_wn_llama2/' --LLMname ChatGLM2 --fb_or_wn FB13
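The three commands above share most of their flags, so they can be assembled from one place. The following is a minimal sketch, not part of the repository: `build_query_command` is a hypothetical helper, and it only builds the argument list using the script names and flags shown above.

```python
import shlex

# Flags shared by all three querying commands shown above.
COMMON_FLAGS = ["--max_length", "256", "--temperature", "0.2",
                "--cuda", "0", "--batchsize", "1"]


def build_query_command(script, llm_fold, llm_name="ChatGLM2", fb_or_wn=None):
    """Return the argv list for one querying script (hypothetical helper)."""
    cmd = ["python", script, *COMMON_FLAGS,
           "--LLMfold", llm_fold, "--LLMname", llm_name]
    # Only the FB13/WN11 script takes --fb_or_wn in the commands above.
    if fb_or_wn is not None:
        cmd += ["--fb_or_wn", fb_or_wn]
    return cmd


if __name__ == "__main__":
    jobs = [
        build_query_command("querying_llm_fb15k237.py", "./../LP_fb_wn_llama2/"),
        build_query_command("querying_llm_wn18rr.py", "./../LP_fb_wn_llama2/"),
        build_query_command("querying_llm_TC_FB13_WN11.py", "./../TC_fb_wn_llama2/",
                            fb_or_wn="FB13"),
    ]
    for cmd in jobs:
        # Print the commands; nothing is executed here.
        print(shlex.join(cmd))
```

To actually launch a job, each list could be passed to `subprocess.run(cmd, check=True, cwd="generating_MPIKG")`, assuming the working directory layout described above.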
After extracting the keywords, you can generate the top-K matching entities for Structure Extraction. Change the default parameters in the code to obtain the desired data:
python textmatch4structure.py
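To make the idea of top-K matching concrete, here is a minimal sketch that ranks entities by Jaccard overlap of their keyword sets. This is an illustration under assumptions and not the logic of textmatch4structure.py: the similarity measure, the input format (entity id mapped to a set of keywords), and the function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_matches(keywords, k=3):
    """Map each entity to its k most similar other entities.

    `keywords` maps entity id -> set of keyword strings (assumed format).
    Entities with zero overlap are never returned.
    """
    result = {}
    for ent, kws in keywords.items():
        scored = [(jaccard(kws, other_kws), other)
                  for other, other_kws in keywords.items() if other != ent]
        scored = [(score, other) for score, other in scored if score > 0]
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[ent] = [other for _, other in scored[:k]]
    return result


if __name__ == "__main__":
    # Toy keywords, invented for illustration only.
    toy = {
        "Paris": {"france", "capital", "city"},
        "Lyon": {"france", "city"},
        "Berlin": {"germany", "capital", "city"},
        "Rhine": {"river", "germany"},
    }
    print(top_k_matches(toy, k=2))
```

This pairwise comparison is quadratic in the number of entities, so it only suits small examples.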
Finally, the generated data is written to the corresponding path, e.g., LP_fb_wn_chatglm2/FB15k237/:
cotdes.txt for MPIKGC-E
rel2des.txt for MPIKGC-R Global
rel2sentence.txt for MPIKGC-R Local
rel2reverse.txt for MPIKGC-R Reverse
struc/* for MPIKGC-S
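Before moving on to the merge step, it can help to confirm that all of the files listed above were produced. The sketch below is a hypothetical check, not part of the repository; it assumes only the file and directory names listed above and that `struc` is a directory.

```python
from pathlib import Path

# Output names taken from the list above.
EXPECTED_FILES = ["cotdes.txt", "rel2des.txt", "rel2sentence.txt", "rel2reverse.txt"]
EXPECTED_DIRS = ["struc"]


def missing_outputs(data_dir):
    """Return the expected outputs that are absent from data_dir."""
    root = Path(data_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # Example path from the text above; adjust to your own output folder.
    print(missing_outputs("LP_fb_wn_chatglm2/FB15k237/"))
```

An empty list means every expected output is present.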
Merge the enhanced data with the original KG to generate a new KG in the regular format:
cd processing_MPIKG
python merge.py
Download the KGC models from their repositories: CSprom-KG, LMKE, SimKGC, and KG-BERT. Put these projects in the current directory, e.g., MPIKGC/CSprom-KG/.
Different KGC models expect slightly different data formats, so convert the data by running:
python data4kgc_models.py
This script converts the data into the format each KGC model requires and copies it to the corresponding model project folder.
Note that KG-BERT needs the same format as LMKE, so you can copy the LMKE data over directly.
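Copying the LMKE-format data into the KG-BERT project can be done by hand or with a few lines of Python. The following is a sketch only: `copy_dataset` is a hypothetical helper, and the source and destination folders depend on where data4kgc_models.py writes the data and on the KG-BERT project layout, neither of which is specified here.

```python
import shutil
from pathlib import Path


def copy_dataset(src_dir, dst_dir):
    """Copy src_dir into dst_dir and return the files now under dst_dir.

    Creates dst_dir if needed; existing files with the same name are
    overwritten (requires Python 3.8+ for dirs_exist_ok).
    """
    src, dst = Path(src_dir), Path(dst_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst).as_posix()
                  for p in dst.rglob("*") if p.is_file())
```

Call it with the LMKE data folder as `src_dir` and the matching data folder inside the KG-BERT project as `dst_dir`.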
Configure each model according to its own environment and processing requirements. Hyperparameters for the KGC models can be found in the appendices of our paper.