Datasets, scripts, and the fine-tuned model for the paper *LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking*.

- 📖 Paper: LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
- 🤗 Fine-tuned ReFinED model on HuggingFace: LLMAEL-REFINED-FT
We introduce LLM-Augmented Entity Linking (LLMAEL), a plug-and-play approach that enhances entity linking through LLM data augmentation. We leverage LLMs as knowledgeable context augmenters, generating mention-centered descriptions as additional input, while preserving traditional EL models for EL execution. Experiments on 6 standard datasets show that vanilla LLMAEL outperforms baseline EL models in most cases, while fine-tuned LLMAEL sets new state-of-the-art results across all 6 benchmarks.
This repository contains LLMAEL's testing and training datasets, LLM prompts, data fusion scripts, and the fine-tuned ReFinED model checkpoint.
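The core idea can be sketched in a few lines of Python. This is a minimal illustration, not the repository's actual implementation: the prompt wording, the `llm` callable, and the `el_model.link_entities` call are all hypothetical stand-ins.

```python
from typing import Callable, List

def llmael_link(mention: str, context: str,
                llm: Callable[[str], str], el_model) -> List:
    """Sketch of LLM-augmented entity linking.

    `llm` is any callable mapping a prompt to generated text; `el_model`
    is an off-the-shelf EL model -- both are hypothetical stand-ins here.
    """
    # 1. Ask the backbone LLM for a short, mention-centered description.
    #    (Illustrative prompt; the actual prompts ship with this repository.)
    prompt = (f"Context: {context}\n"
              f"Briefly describe the entity that the mention '{mention}' refers to.")
    description = llm(prompt)

    # 2. Fuse the description with the original context. Plain concatenation
    #    is just one of the paper's several context-joining strategies.
    augmented_context = f"{context} {description}"

    # 3. Run the unchanged, traditional EL model on the augmented input.
    return el_model.link_entities(augmented_context, mention)
```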
The `datasets` directory is organized as follows:

- The `original_el_benchmarks` sub-directory stores the original EL benchmarks of the 3 EL models BLINK, GENRE, and ReFinED, downloaded from their respective GitHub repositories.
- The `llm_raw_augmentations` sub-directory contains the raw context generated by backbone LLMs, specifically short entity descriptions based on the original mentions and contexts. These entries precisely match the content and order of their corresponding benchmarks in `original_el_benchmarks` (a quick alignment check is sketched after this list).
- The `llm_augmented_el_benchmarks` sub-directory stores the final LLM-augmented EL benchmarks, produced by fusing data from `llm_raw_augmentations` into `original_el_benchmarks`. (The data in this sub-directory must be generated by running `augment_all_datasets.sh`; see the section "To Generate Testing and Training Data".)
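Because the augmentation entries are required to line up one-to-one and in order with their benchmark counterparts, a sanity check is easy to write. A minimal sketch, assuming both files are JSON lists of records; the field name `mention` is a hypothetical placeholder for whatever the actual schema uses:

```python
import json

def check_alignment(benchmark_path: str, augmentation_path: str) -> None:
    """Verify that LLM augmentations line up with the original benchmark."""
    with open(benchmark_path) as f:
        benchmark = json.load(f)
    with open(augmentation_path) as f:
        augmentations = json.load(f)

    # Same number of entries, in the same order.
    assert len(benchmark) == len(augmentations), "entry counts differ"

    # Field name below is hypothetical; adapt it to the actual schema.
    for example, augmentation in zip(benchmark, augmentations):
        assert example["mention"] == augmentation["mention"], "mention mismatch"
```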
To get started, clone this repository:

```bash
git clone https://github.com/THU-KEG/LLMAEL.git
cd LLMAEL
```
To generate the testing and training datasets used in our paper's main experiment and ablation studies, run the following commands:
```bash
cd scripts
bash augment_all_datasets.sh
```
To generate datasets with other options, select and run one of the following Python scripts:

```bash
python augment_blink_datasets_with_llm.py
python augment_genre_datasets_with_llm.py
python augment_refined_datasets_with_llm.py
```

with your custom parameters `--llm_name`, `--join_strategy`, `--original_benchmarks_path`, `--llm_contexts_path`, and `--output_path`.
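For example, an invocation might look like the following. The argument values here are illustrative and unverified; check each script's `--help` output for the exact accepted names and paths.

```bash
# Illustrative values only: augment the ReFinED benchmarks with Llama-3-70B
# contexts under context-joining strategy 4.
python augment_refined_datasets_with_llm.py \
    --llm_name llama3-70b \
    --join_strategy 4 \
    --original_benchmarks_path ../datasets/original_el_benchmarks \
    --llm_contexts_path ../datasets/llm_raw_augmentations \
    --output_path ../datasets/llm_augmented_el_benchmarks
```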
To obtain the fine-tuned ReFinED model checkpoint, please download it from our 🤗 HuggingFace hub: LLMAEL-REFINED-FT
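One way to fetch the checkpoint programmatically is via the `huggingface_hub` library; the `repo_id` below is a placeholder, so substitute the ID shown on the model page:

```python
from huggingface_hub import snapshot_download

# Placeholder repo ID -- replace with the actual LLMAEL-REFINED-FT model page ID.
local_dir = snapshot_download(repo_id="<org>/LLMAEL-REFINED-FT")
print(f"Checkpoint downloaded to {local_dir}")
```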
- Clone the official GitHub repositories of our 3 selected EL models: BLINK, GENRE, and ReFinED
- Find the official test script of each model
- Change the test datasets to our augmented datasets from `llm_augmented_el_benchmarks`, and run the official test script

Our main experiments use the 6 augmented test sets synthesized under context-joining strategy 4. For vanilla LLMAEL, we used BLINK's full cross-encoder model, GENRE's AIDA model without the candidate set, and ReFinED's AIDA model. For fine-tuned LLMAEL, we fine-tuned ReFinED's Wikipedia model on the Llama-3-70B-augmented AIDA training set.
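As a concrete example, ReFinED exposes a simple inference API. A sketch of running it on an LLM-augmented input might look like the following; loading our fine-tuned checkpoint instead of a stock model may require ReFinED-specific arguments not shown here, so consult the ReFinED repository for custom-checkpoint options.

```python
from refined.inference.processor import Refined

# Load a stock pretrained ReFinED model (the fine-tuned LLMAEL checkpoint
# would be loaded analogously).
refined = Refined.from_pretrained(model_name="aida_model", entity_set="wikipedia")

# An augmented input: original context followed by an LLM-generated
# mention-centered description (example text is illustrative).
text = ("Jobs founded the company in 1976. "
        "Steve Jobs was an American entrepreneur who co-founded Apple Inc.")
spans = refined.process_text(text)
for span in spans:
    print(span.text, span.predicted_entity)
```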
Our results (accuracy) are as follows:
Method | AIDA | MSNBC | AQUA | ACE04 | CWEB | WIKI | AVG |
---|---|---|---|---|---|---|---|
LLMAEL x BLINK | 81.94 | 86.56 | 85.16 | 86.01 | 69.17 | 81.14 | 81.61 |
LLMAEL x GENRE | 88.27 | 85.67 | 85.14 | 85.21 | 70.67 | 82.95 | 82.99 |
LLMAEL x ReFinED | 92.38 | 86.94 | 88.09 | 88.14 | 73.16 | 85.90 | 85.76 |
LLMAEL x ReFinED (FT) | 92.34 | 88.79 | 89.06 | 88.14 | 75.07 | 86.62 | 86.67 |
If you find our work helpful, please cite our paper:
@article{xin2024llmael,
  title={LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking},
  author={Xin, Amy and Qi, Yunjia and Yao, Zijun and Zhu, Fangwei and Zeng, Kaisheng and Xu, Bin and Hou, Lei and Li, Juanzi},
  journal={arXiv preprint arXiv:2407.04020},
  year={2024}
}