This repository contains the source code for the paper: Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning.
The original data comes from the public news dataset MIND. We build an item2item dataset on top of it using the method described in the paper.
Files in the data folder:

    ./data/
        kg/wikidata-graph/
            wikidata-graph.tsv            # knowledge graph triples from Wikidata
            entity2id.txt                 # entity label to index
            relation2id.txt               # relation label to index
            entity2vecd100.vec            # entity embeddings from TransE
            relation2vecd100.vec          # relation embeddings from TransE
        mind/
            behaviors.tsv                 # the impression logs and users' news click histories
            news.tsv                      # the detailed information of the news articles involved in behaviors.tsv
        item2item/
            all_news.tsv                  # all news used for training, validation, and testing
            doc_feature_embedding.tsv     # document embeddings from sentence-bert
            doc_feature_entity.tsv        # entities mentioned in documents
            pos_train.tsv                 # positive item pairs in train data
            pos_valid.tsv                 # positive item pairs in valid data
            pos_test.tsv                  # positive item pairs in test data
            random_neg_sample_train.tsv   # item2item train data
            random_neg_sample_valid.tsv   # item2item valid data
        kprn/
            train_data.json               # train data for KPRN
            valid_data.json               # valid data for KPRN
            predict_train.json            # warm-up train data for AnchorKG
            predict_valid.json            # warm-up valid data for AnchorKG
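As a quick sanity check after downloading, the knowledge-graph files can be read along the following lines. This is a minimal sketch that assumes `wikidata-graph.tsv` stores one tab-separated (head, relation, tail) triple per line, `entity2id.txt` stores tab-separated label/index pairs, and the `.vec` files store one whitespace-separated 100-dimensional vector per line; adjust if the actual formats differ.

```python
import numpy as np

# Triples: assumes one tab-separated (head, relation, tail) per line.
triples = []
with open("./data/kg/wikidata-graph/wikidata-graph.tsv", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            triples.append(tuple(parts))

# Entity label -> index mapping: assumes "label<TAB>index" per line.
entity2id = {}
with open("./data/kg/wikidata-graph/entity2id.txt", encoding="utf-8") as f:
    for line in f:
        label, idx = line.rstrip("\n").split("\t")[:2]
        entity2id[label] = int(idx)

# TransE entity embeddings: assumes one whitespace-separated vector per line.
entity_emb = np.loadtxt("./data/kg/wikidata-graph/entity2vecd100.vec")

print(len(triples), "triples,", len(entity2id), "entities, embedding shape", entity_emb.shape)
```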
Requirements:
python == 3.9.13
torch == 1.12.0
sklearn == 1.1.2
numpy == 1.23.4
hnswlib == 0.4.0
networkx == 2.8.7
nni == 2.8
sentence_transformers == 2.2.2
tqdm == 4.64.1
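The versions above can be installed with pip along these lines (note that sklearn is published on PyPI as scikit-learn, and sentence_transformers as sentence-transformers):

$ pip install torch==1.12.0 scikit-learn==1.1.2 numpy==1.23.4 hnswlib==0.4.0 networkx==2.8.7 nni==2.8 sentence-transformers==2.2.2 tqdm==4.64.1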
-
Dataset download and processing
$ python data_process.py
The config file is ./config/data_config.json
If the download speed is too slow, you can refer to the following links to download the datasets manually and put them under the corresponding folders before running the code (a small layout check is sketched after this list).
- MIND_large_train: ./data/mind/train/
- MIND_large_valid: ./data/mind/valid/
- MIND_small_train: ./data/mind/train/
- MIND_small_valid: ./data/mind/valid/
- Knowledge Graph: ./data/kg/
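If you place the archives manually, a small check like the one below can confirm that the layout matches what data_process.py expects. This is an illustrative sketch only; the exact keys inside data_config.json are repository-specific, so the snippet just loads the config and verifies that the folders above exist.

```python
import json
from pathlib import Path

# Load the data-processing config; its exact keys are repository-specific.
with open("./config/data_config.json", encoding="utf-8") as f:
    config = json.load(f)
print("data_config.json keys:", sorted(config))

# Verify that the manually downloaded datasets sit where the code expects them.
expected = ["./data/mind/train", "./data/mind/valid", "./data/kg"]
for folder in expected:
    status = "ok" if Path(folder).is_dir() else "MISSING"
    print(f"{folder}: {status}")
```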
-
KPRN training
$ python KPRN_train.py
The config file is ./config/KPRN_config.json
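KPRN scores entity-relation paths with a recurrent encoder. The snippet below is a conceptual PyTorch sketch of that idea only, not the code in KPRN_train.py: it embeds the entities and relations along a path, encodes the sequence with an LSTM, and scores the final state. All names and dimensions here are invented for illustration.

```python
import torch
import torch.nn as nn

class PathScorer(nn.Module):
    """Toy KPRN-style scorer: embed a path's entities and relations,
    encode the sequence with an LSTM, and map the last state to a score."""

    def __init__(self, n_entities, n_relations, dim=100, hidden=64):
        super().__init__()
        self.ent_emb = nn.Embedding(n_entities, dim)
        self.rel_emb = nn.Embedding(n_relations, dim)
        self.lstm = nn.LSTM(2 * dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, ent_ids, rel_ids):
        # ent_ids, rel_ids: (batch, path_len) index tensors
        x = torch.cat([self.ent_emb(ent_ids), self.rel_emb(rel_ids)], dim=-1)
        _, (h, _) = self.lstm(x)
        return self.out(h[-1]).squeeze(-1)  # one score per path

# Example: score a batch of two 3-step paths with random indices.
scorer = PathScorer(n_entities=1000, n_relations=50)
ents = torch.randint(0, 1000, (2, 3))
rels = torch.randint(0, 50, (2, 3))
print(scorer(ents, rels))
```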
-
Warmup training + AnchorKG training
$ python main.py
The config file is ./config/anchorkg_config.json
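For intuition only, the sketch below shows what an anchor subgraph around a news article's entities looks like, built here with a plain k-hop expansion in networkx (already in the requirements). The actual code in main.py learns which hops to take with reinforcement learning; the toy graph and names below are invented for illustration.

```python
import networkx as nx

# Toy knowledge graph; in the repository the edges come from wikidata-graph.tsv.
kg = nx.Graph()
kg.add_edges_from([
    ("news_entity_A", "B"), ("B", "C"), ("C", "D"),
    ("news_entity_A", "E"), ("E", "F"),
])

def anchor_subgraph(graph, anchors, hops=2):
    """Collect all nodes within `hops` of any anchor and return the induced subgraph."""
    nodes = set()
    for a in anchors:
        nodes |= set(nx.single_source_shortest_path_length(graph, a, cutoff=hops))
    return graph.subgraph(nodes)

sub = anchor_subgraph(kg, anchors=["news_entity_A"], hops=2)
print(sorted(sub.nodes))  # entities reachable within two hops of the anchor
```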
We integrate the NNI module to tune hyper-parameters automatically. You can tune the KPRN training stage, the warm-up training stage, and the AnchorKG training stage separately. For easy usage, you can run the following command:
$ nnictl create --config ./nni_config.yaml --port 9074
You can configure nni_config.yaml for your own usage. For more details about NNI, please refer to the NNI documentation.
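For reference, a local NNI 2.x experiment config typically looks like the following. The trial command and search-space file here are illustrative placeholders; check the repository's nni_config.yaml for the actual values.

```yaml
# Illustrative NNI 2.x config; adapt trialCommand and searchSpaceFile to this repo.
searchSpaceFile: search_space.json   # hypothetical file listing tunable hyper-parameters
trialCommand: python main.py
trialCodeDirectory: .
trialConcurrency: 1
maxTrialNumber: 50
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: local
```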