🤗 [Model] • 📃 [Paper] • ⚓ [VaLProbing-32K]
This is the official repo for the paper Make Your LLM Fully Utilize the Context. It helps you reproduce the results of FILM-7B, a 32K-context LLM that overcomes the lost-in-the-middle problem. FILM-7B is trained from Mistral-7B-Instruct-v0.2 by applying Information-Intensive (In2) Training. FILM-7B achieves near-perfect performance on probing tasks and SOTA-level performance on real-world long-context tasks among ~7B-size LLMs, without compromising short-context performance.
Disclaimer: This repo is strictly for research purposes, and not an official product or service from Microsoft.
We recommend using Conda or the official PyTorch Docker image to set up the environment.
git clone https://github.com/microsoft/FILM.git
cd FILM
conda create -n FILM python=3.10.11
conda activate FILM
pip install torch==2.0.1 # cuda11.7 and cudnn8
pip install -r requirements.txt
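After installation, it can be useful to confirm that the active environment actually has the pinned PyTorch version. The snippet below is a minimal sketch that uses only the Python standard library; the helper names `installed_version` and `matches_pin` are hypothetical and not part of this repo.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """Check an installed package against a pinned version.

    Local build tags such as "+cu117" are ignored, so "2.0.1+cu117"
    is treated as matching the pin "2.0.1".
    """
    found = installed_version(package)
    return found is not None and found.split("+")[0] == pinned


if __name__ == "__main__":
    # 2.0.1 is the version pinned in the install commands above.
    print("torch:", installed_version("torch"))
    print("matches 2.0.1:", matches_pin("torch", "2.0.1"))
```

Run it inside the activated FILM environment; a `None` version means the package is not installed there.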
The system template for FILM-7B:
'''[INST] Below is a context and an instruction. Based on the information provided in the context, write a response for the instruction.
### Context:
{YOUR LONG CONTEXT}
### Instruction:
{YOUR QUESTION & INSTRUCTION} [/INST]
'''
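As an illustration, the template can be filled in programmatically. The following is a minimal sketch, not code from this repo: the names `FILM_TEMPLATE` and `build_film_prompt` are hypothetical, and the line breaks follow the template exactly as shown above, so check them against the evaluation scripts if exact whitespace matters.

```python
# Template text copied from the README; placeholder names are our own.
FILM_TEMPLATE = (
    "[INST] Below is a context and an instruction. Based on the information "
    "provided in the context, write a response for the instruction.\n"
    "### Context:\n"
    "{context}\n"
    "### Instruction:\n"
    "{instruction} [/INST]\n"
)


def build_film_prompt(context: str, instruction: str) -> str:
    """Insert a long context and a question/instruction into the FILM-7B template."""
    # Surrounding whitespace is stripped so the section markers stay on their own lines.
    return FILM_TEMPLATE.format(
        context=context.strip(),
        instruction=instruction.strip(),
    )


if __name__ == "__main__":
    prompt = build_film_prompt(
        context="FILM-7B is trained from Mistral-7B-Instruct-v0.2.",
        instruction="Which model is FILM-7B trained from?",
    )
    print(prompt)
```

The resulting string can then be passed to whichever inference stack you use to run the model.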
To reproduce the results on VaL Probing, see the guidance in VaLProbing.
To reproduce the results on real-world long-context tasks, see the guidance in real_world_long.
To reproduce the results on short-context tasks, see the guidance in short_tasks.
@misc{an2024make,
title={Make Your LLM Fully Utilize the Context},
author={Shengnan An and Zexiong Ma and Zeqi Lin and Nanning Zheng and Jian-Guang Lou},
year={2024},
eprint={2404.16811},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.