This repository implements test-time training (TTT) for few-shot learning, using vLLM for inference and Torchtune for LoRA finetuning.
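At its core, the TTT method builds a small finetuning set from each task's few-shot demonstrations at test time, trains a LoRA adapter on it with Torchtune, and then answers test queries with vLLM. The sketch below illustrates one plausible way to construct that data (a leave-one-out scheme); the helper name `build_ttt_examples` and the exact prompt format are assumptions for illustration, not the actual logic in `src/methods/ttt_sweep.py`:

```python
# Illustrative sketch: building "in-context fine-tuning" data from a task's
# few-shot demonstrations via a leave-one-out scheme. Hypothetical helper;
# the actual construction lives in src/methods/ttt_sweep.py.

demos = [  # two BBH boolean_expressions-style demonstrations
    {"input": "not ( True ) and ( True ) is", "output": "False"},
    {"input": "True and not not ( not False ) is", "output": "True"},
]

def build_ttt_examples(demos):
    """One training example per demo: the remaining demos serve as the
    in-context prompt, and the held-out demo's answer is the target."""
    examples = []
    for i, held_out in enumerate(demos):
        context = "\n\n".join(
            f"Q: {d['input']}\nA: {d['output']}"
            for j, d in enumerate(demos) if j != i
        )
        examples.append({
            "prompt": f"{context}\n\nQ: {held_out['input']}\nA:",
            "completion": f" {held_out['output']}",
        })
    return examples

for ex in build_ttt_examples(demos):
    print(repr(ex["prompt"]), "->", repr(ex["completion"]))
```

Each resulting prompt/completion pair would then be LoRA-finetuned on before running inference on the real test queries.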
The repository is organized as follows:

```
Fewshot-TTT/
├── external/
│   ├── BIG-Bench-Hard/   # Submodule for BIG-Bench Hard tasks
│   └── torchtune/        # Submodule for adamzweiger's fork of Torchtune
│
├── logs/                 # Logs
│   ├── archive/          # Archived logs
│   └── current/          # Current logs
│
├── scripts/              # SLURM scripts for running experiments or utilities
│
├── src/                  # Source code for the project
│   ├── tasks.py          # Defines BIG-Bench Hard tasks
│   ├── utils.py          # Common utilities (e.g., inference_vllm, compute_accuracy)
│   └── methods/
│       ├── baseline.py   # Zero-/few-shot baseline
│       ├── e2e.py        # Direct I/O finetuning without ICL
│       └── ttt_sweep.py  # Main TTT method (in-context fine-tuning)
│
├── README.md             # Project documentation
└── requirements.txt      # List of dependencies
```
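As a rough sketch of the two utilities named above, something like the following would fit; the function shapes are assumptions (the real code is in `src/utils.py`), though the vLLM calls themselves (`LLM`, `SamplingParams`, `generate`) are the library's actual API:

```python
# Hypothetical sketches of inference_vllm and compute_accuracy; the real
# implementations in src/utils.py may differ in signature and behavior.
from vllm import LLM, SamplingParams

def inference_vllm(model_dir: str, prompts: list[str], max_tokens: int = 64) -> list[str]:
    """Batch greedy decoding with vLLM."""
    llm = LLM(model=model_dir)  # in practice the engine would be reused across calls
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    return [out.outputs[0].text.strip() for out in outputs]

def compute_accuracy(predictions: list[str], targets: list[str]) -> float:
    """Exact-match accuracy over a task's test set."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)
```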
To ensure that the external submodules are included, clone the repository with the `--recurse-submodules` flag:

```bash
git clone --recurse-submodules https://github.com/adamzweiger/Fewshot-TTT.git
cd Fewshot-TTT
```

Alternatively, if you've already cloned the repository without submodules, initialize and update them manually:

```bash
git submodule update --init --recursive
```
Next, set up a Python environment. Using conda:

```bash
conda create -n tttenv python=3.12
conda activate tttenv
```

Using venv:

```bash
python3.12 -m venv tttenv
source tttenv/bin/activate
```

Then install the dependencies, which cover everything needed in the submodules as well:

```bash
pip install -r requirements.txt
```

Run the respective evaluation script to test a method on tasks in BIG-Bench Hard:
```bash
python src/methods/baseline.py --model_dir Llama-3.1-8B-Instruct --output_file results.json
```

Use the bash scripts in `scripts/` for a complete end-to-end evaluation workflow:

```bash
sbatch scripts/baseline.sh
```