
Stanford-Alpaca

Fork of: Stanford Alpaca

Code License · Data License · Weight Diff License · Python 3.9+ · Code style: black

This is a fork of the Stanford Alpaca repo with adjustments and additions that enable generation of the in-distribution test dataset and the Sequential Instructions dataset used in Understanding the Effects of RLHF on LLM Generalisation and Diversity. The generated datasets can be found here:

To reproduce the generation of the Sequential Instructions dataset, follow the instructions in the Data Generation Process section below, but use python -m generate_instruction_sequential generate_instruction_following_data instead. This script can also automatically upload the generated dataset to Hugging Face via the --save_to_hf=<organisation>/<dataset_name> argument.
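If --save_to_hf is used, the uploaded dataset can be pulled back down with the Hugging Face datasets library. A minimal sketch follows; the repository path is a placeholder for whatever value was passed to --save_to_hf, and the split name is an assumption (check the dataset card for the actual splits):

```python
from datasets import load_dataset

# "<organisation>/<dataset_name>" stands in for the value passed to --save_to_hf.
# The "train" split is an assumption; adjust to match the uploaded dataset.
ds = load_dataset("<organisation>/<dataset_name>", split="train")

print(ds)      # dataset size and column names
print(ds[0])   # first generated example
```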

For the in-distribution test dataset, follow the instructions in the Data Generation Process section below as-is.

For everything else, we recommend referring to the original repository, which has detailed instructions for the rest of the code.

Data Generation Process

Running the code

  • Set the environment variable OPENAI_API_KEY to your OpenAI API key.
  • Install the dependencies with pip install -r requirements.txt.
  • Run python -m generate_instruction generate_instruction_following_data to generate the data; a quick way to inspect the output is sketched after this list.
  • Optionally pass --save_to_hf=<organisation>/<dataset_name> to automatically upload the generated dataset to Hugging Face.
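As a quick sanity check after generation, the output file can be inspected locally. A minimal sketch, assuming the script writes its results to regen.json in the working directory (the filename is an assumption; check the script's output arguments for the exact path) and that entries follow the Alpaca instruction/input/output format:

```python
import json

# Load the generated data; "regen.json" is an assumed default output filename.
with open("regen.json") as f:
    data = json.load(f)

print(f"{len(data)} generated examples")

# Each entry is expected to carry Alpaca-style fields such as
# "instruction", "input", and "output".
print(json.dumps(data[0], indent=2))
```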

Citation

Please cite the original repo if you use the data or code in this repo, as well as our paper:

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
@misc{kirkUnderstandingEffectsRLHF2023,
  title = {Understanding the {{Effects}} of {{RLHF}} on {{LLM Generalisation}} and {{Diversity}}},
  author = {Kirk, Robert and Mediratta, Ishita and Nalmpantis, Christoforos and Luketina, Jelena and Hambro, Eric and Grefenstette, Edward and Raileanu, Roberta},
  year = {2023},
  month = oct,
  number = {arXiv:2310.06452},
  eprint = {2310.06452},
  primaryclass = {cs},
  publisher = {{arXiv}},
  doi = {10.48550/arXiv.2310.06452},
  urldate = {2023-10-26},
  archiveprefix = {arxiv},
}

Naturally, you should also cite the original LLaMA paper and the Self-Instruct paper if you use the code or data from this repo.

Acknowledgements

We thank the original Alpaca authors for releasing their code.
