Measuring Coding Challenge Competence With APPS

This is the repository for Measuring Coding Challenge Competence With APPS by Dan Hendrycks*, Steven Basart*, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, and Jacob Steinhardt.

Download the APPS dataset here (~1.3 GB).

This repository contains both training and evaluation code.

Fine-tuned GPT-2 1.5B and GPT-Neo 2.7B weights are available here.

For other benchmarks of enormous Transformers, see a dataset which tests ability in competition math, a dataset which tests knowledge of ethics, and a dataset spanning 50+ academic subjects.

How to Use

Training instructions are in train/README, and evaluation instructions are in eval/README.

Hugging Face

The dataset is also available in Hugging Face datasets under apps.
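As a rough illustration of working with the Hugging Face copy, the sketch below loads one split and tallies problems by a metadata field. It assumes the `datasets` package is installed, that the dataset is reachable under the name `apps` given above (the current Hub identifier may differ), that a `test` split exists, and that each record carries a `difficulty` field. The helper names `load_apps_split` and `count_by_field` are hypothetical and not part of this repository.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_field(records: Iterable[Mapping], field: str = "difficulty") -> Counter:
    """Count records by the value of `field`.

    Records that lack the field are grouped under "unknown".
    The default field name "difficulty" is an assumption about the schema.
    """
    return Counter(str(record.get(field, "unknown")) for record in records)


def load_apps_split(name: str = "apps", split: str = "test"):
    """Load one split via the Hugging Face `datasets` library.

    Both `name` and `split` are assumptions; adjust them if the Hub
    identifier or split names differ. Requires network access.
    """
    # Imported lazily so count_by_field works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(name, split=split)


# Example usage (downloads data, so it is left commented out):
# problems = load_apps_split()
# print(count_by_field(problems))
```

`count_by_field` accepts any iterable of dict-like records, so it can be tried on a handful of hand-written records before downloading anything.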

Citation

If you find this useful in your research, please consider citing:

@article{hendrycksapps2021,
  title={Measuring Coding Challenge Competence With APPS},
  author={Dan Hendrycks and Steven Basart and Saurav Kadavath and Mantas Mazeika and Akul Arora and Ethan Guo and Collin Burns and Samir Puranik and Horace He and Dawn Song and Jacob Steinhardt},
  journal={NeurIPS},
  year={2021}
}
