
Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"


Great Memory, Shallow Reasoning: Limits of kNN-LMs

Paper: https://arxiv.org/abs/2408.11815


Quickstart

Step 1: Setup Environment

Clone this repository:

git clone https://github.com/GSYfate/knnlm-limits.git
cd knnlm-limits

Run:

  conda create --name faiss python=3.9
  conda activate faiss
  pip install -r requirements.txt

The project also relies on the faiss library. To install the GPU build of faiss, use the following command:

  conda install -c pytorch -c nvidia faiss-gpu=1.7.4 mkl=2021 blas=1.0=mkl

Step 2: Saving a Datastore

Models Used in the Experiment

Datasets Used to Build the Datastore

To save a datastore, run:

  bash script/save_dstore.sh {model} {datastore} {path of datastore}

For example:

  bash script/save_dstore.sh meta-llama/Llama-2-7b-hf wiki ds/wiki/Llama-2-7b-hf
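For orientation, the datastore a kNN-LM saves is just a table of (key, value) pairs: each key is the model's hidden representation of a context, and the value is the token that followed it. A minimal NumPy sketch of that layout (toy vectors and file names for illustration, not the repo's actual format):

```python
import numpy as np

# Toy stand-ins for an LM's hidden states, one vector per token position.
# In a real kNN-LM these come from a forward pass over the datastore corpus.
contexts = np.array([
    [0.1, 0.2, 0.3, 0.4],   # representation of some context A
    [0.9, 0.8, 0.7, 0.6],   # representation of some context B
], dtype=np.float32)
next_tokens = np.array([101, 202])  # token ids that followed each context

# The datastore is simply (keys, values): keys are searched by nearest
# neighbor at inference time; values turn neighbors into a distribution.
keys, values = contexts, next_tokens
np.save("dstore_keys.npy", keys)
np.save("dstore_vals.npy", values)
print(keys.shape, values.shape)  # (2, 4) (2,)
```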

Step 3: Building the FAISS index

To build the FAISS index yourself, run:

  bash script/build.sh {model} {datastore} {path of datastore}
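The index exists to answer one question quickly: which stored keys are nearest to a query vector? At toy scale the same lookup is an exact distance scan, which a FAISS index approximates efficiently at datastore scale. A brute-force NumPy sketch of that search (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
keys = rng.normal(size=(1000, 8)).astype(np.float32)  # datastore keys
query = keys[42] + 0.01  # a query constructed to sit next to key 42

# Exact L2 nearest-neighbor search; a FAISS index provides (approximately)
# the same top-k result without scanning every key.
dists = np.sum((keys - query) ** 2, axis=1)
k = 4
topk = np.argsort(dists)[:k]
print(topk[0])  # index of the closest stored key
```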

Download the built datastore

You can also directly access our built datastores through the links below.

Math datastore: https://huggingface.co/datasets/Reset23/math

Wiki datastore: https://huggingface.co/datasets/Reset23/wiki

How can I download these built datastores?

For example, to download the math datastore, run:

  git clone https://huggingface.co/datasets/Reset23/math
  cd math
  git lfs install
  git lfs pull

Step 4: Evaluating Perplexity

To evaluate perplexity on the validation set, run:

  bash script/eval_perplexity.sh {model} {dataset name} {datastore} {path of datastore}

for the kNN-LM, or

  bash script/eval_perplexity.sh {model} {dataset name} base

for the base model.
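As background, a kNN-LM interpolates the retrieval distribution with the base LM's distribution, p(y|x) = λ·p_kNN(y|x) + (1−λ)·p_LM(y|x), and perplexity is the exponentiated mean negative log-likelihood. A toy sketch with made-up per-token probabilities and an illustrative λ:

```python
import math

# Made-up per-token probabilities of the gold tokens in a 3-token
# sequence, under the base LM and under the retrieval distribution.
p_lm  = [0.20, 0.05, 0.50]
p_knn = [0.60, 0.40, 0.10]
lam = 0.25  # interpolation weight lambda (a tunable hyperparameter)

# kNN-LM: mix the two distributions token by token.
p_mix = [lam * pk + (1 - lam) * pl for pk, pl in zip(p_knn, p_lm)]

# Perplexity = exp(mean negative log-likelihood).
ppl = math.exp(-sum(math.log(p) for p in p_mix) / len(p_mix))
print(round(ppl, 3))
```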

Evaluating Models on Downstream Tasks

We evaluate both the base model and kNN-LMs on downstream tasks. For each task, we provide the evaluation scripts for both the base model and the kNN-LM.

Reasoning Tasks

For the base model, run:

  bash script/evaluate_downstream.sh {model} {task}

For the kNN-LM, run:

  bash script/evaluate_downstream_knn.sh {model} {task} {datastore} {path of datastore}

where {task} is one of obqa, mmlu, arc_challenge, arc_easy, hellaswag, drop, nq, hotpotqa, gsm8k, bbh, winogrande.

Memory-intensive Tasks

Datasets: The corresponding datasets are stored in the data folder.

Eval Program: eval_fuzzy.py

Metric: dcpmi (domain-conditional PMI)

Evaluation command:

base:

  bash script/evaluate_downstream.sh {model} {sst2,rt,rte,yahoo,mr,hyp,cr,cb,agn}

kNN-LM:

  bash script/evaluate_downstream_knn.sh {model} {sst2,rt,rte,yahoo,mr,hyp,cr,cb,agn} {datastore} {path of datastore}
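dcpmi denotes domain-conditional PMI scoring: a label is scored by log p(label | input) − log p(label | a domain-only prompt), which discounts labels that are likely regardless of the input. A toy sketch with made-up log-probabilities:

```python
# Made-up label log-probabilities for one sentiment example.
# logp_cond: log p(label | input text); logp_domain: log p(label | a
# domain-only prompt with no input), which captures label priors.
logp_cond   = {"positive": -1.2, "negative": -0.9}
logp_domain = {"positive": -2.0, "negative": -0.7}

# DC-PMI score: subtract the domain prior so labels that are frequent
# a priori don't win by default.
scores = {y: logp_cond[y] - logp_domain[y] for y in logp_cond}
pred_raw   = max(logp_cond, key=logp_cond.get)  # highest raw probability
pred_dcpmi = max(scores, key=scores.get)        # highest corrected score
print(pred_raw, pred_dcpmi)
```

Here the raw argmax picks "negative", but after removing the domain prior the corrected score prefers "positive".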

Acknowledgement

Citation

If you find our work helpful, please use the following citation.

@misc{geng2024greatmemoryshallowreasoning,
      title={Great Memory, Shallow Reasoning: Limits of $k$NN-LMs},
      author={Shangyi Geng and Wenting Zhao and Alexander M. Rush},
      year={2024},
      eprint={2408.11815},
      archivePrefix={arXiv}
}
