
RADCoT

Official code for "RADCoT: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion", accepted at LREC-COLING 2024.

Requirements

  • PyTorch >= 1.13.1
  • numpy
  • faiss-cpu
  • transformers==4.33.2
  • tensorboard
  • pyserini

Process

  1. Environment setting
pip install -r ./pretrain/requirements.txt
  2. Chain-of-Thought generation (using an LLM; a hedged sketch of this step follows the list)
cd llm_infer
python infer.py
  3. Retrieval-augmented SLM training and inference
bash train.sh 4
bash infer.sh
  4. Evaluation (a BM25 retrieval sketch also follows the list)
python retrieve_bm25.py --eval_dataset trec-dl-19 --query_augment_file_path llm_infer/results/19_bm25.txt
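
To make step 2 concrete, here is a minimal sketch of zero-shot chain-of-thought generation for query expansion with the transformers library. It is not the repository's llm_infer/infer.py: the model name, prompt wording, and decoding settings are all assumptions.

# Hypothetical sketch of step 2 (CoT generation with an LLM).
# Not the repository's llm_infer/infer.py; model and prompt are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumption: any capable decoder-only LLM
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).to(device)

query = "why do bears hibernate"
# CoT-style prompt: ask for the rationale before the answer.
prompt = (
    "Answer the following query. Give the rationale before answering.\n"
    f"Query: {query}\nRationale:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Keep only the newly generated tokens (the chain of thought).
cot = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# For query expansion, the CoT is concatenated with the original query.
expanded_query = f"{query} {cot}"
print(expanded_query)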
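
Step 4 scores expanded queries with BM25. The sketch below shows what such a retrieval call looks like with pyserini's prebuilt MS MARCO passage index; it only illustrates the kind of search retrieve_bm25.py performs, and the format of llm_infer/results/19_bm25.txt is not reproduced here.

# Minimal BM25 retrieval sketch with pyserini (illustration only,
# not the repository's retrieve_bm25.py).
from pyserini.search.lucene import LuceneSearcher

# Prebuilt MS MARCO v1 passage index shipped by pyserini.
searcher = LuceneSearcher.from_prebuilt_index("msmarco-v1-passage")

expanded_query = "why do bears hibernate <generated chain of thought>"
hits = searcher.search(expanded_query, k=1000)

# Print the top-10 passage ids with their BM25 scores.
for rank, hit in enumerate(hits[:10], start=1):
    print(f"{rank:2d} {hit.docid} {hit.score:.4f}")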


Q&A

If you encounter any problems, please open an issue in the GitHub repository.

Citation

@inproceedings{lee-etal-2024-radcot-retrieval,
    title = "{RADC}o{T}: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion",
    author = "Lee, Sung-Min  and
      Park, Eunhwan  and
      Jeon, DongHyeon  and
      Kang, Inho  and
      Na, Seung-Hoon",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.1182",
    pages = "13514--13523",
    abstract = "Large language models (LLMs) have demonstrated superior performance to that of small language models (SLM) in information retrieval for various subtasks including dense retrieval, reranking, query expansion, and pseudo-document generation. However, the parameter sizes of LLMs are extremely large, making it expensive to operate LLMs stably for providing LLM-based retrieval services. Recently, retrieval-augmented language models have been widely employed to significantly reduce the parameter size by retrieving relevant knowledge from large-scale corpora and exploiting the resulting {``}in-context{''} knowledge as additional model input, thereby substantially reducing the burden of internalizing and retaining world knowledge in model parameters. Armed by the retrieval-augmented language models, we present a retrieval-augmented model specialization that distills the capability of LLMs to generate the chain-of-thoughts (CoT) for query expansion {--} that is, injects the LLM{'}s capability to generate CoT into a retrieval-augmented SLM {--} referred to as \textbf{RADCoT}. Experimental results on the MS-MARCO, TREC DL 19, 20 datasets show that RADCoT yields consistent improvements over distillation without retrieval, achieving comparable performance to that of the query expansion method using LLM-based CoTs. Our code is publicly available at \url{https://github.com/ZIZUN/RADCoT}.",
}
