This should be similar to the original MMLU scenario: see `mmlu_scenario.py` for the original MMLU and `air_bench_scenario.py` for how to use `load_dataset()` with Hugging Face datasets.
Edit: Also look at `simple_scenarios.py` and `test_simple_scenarios.py` for an example of multiple-choice question answering (MCQA).
Paper: https://arxiv.org/abs/2311.12022
It is easiest to use the Hugging Face version: https://huggingface.co/datasets/Idavidrein/gpqa
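
As a rough starting point, here is a minimal sketch of loading the Hugging Face version and turning each row into a shuffled four-way multiple-choice item. It is not HELM code: `MultipleChoiceItem`, `row_to_item`, and `load_gpqa_items` are hypothetical names, and the subset name `gpqa_main` and the column names (`Question`, `Correct Answer`, `Incorrect Answer 1` to `Incorrect Answer 3`) are assumptions taken from the dataset card that should be checked before use. The dataset also appears to be gated, so `load_dataset()` will likely need an authenticated Hugging Face session.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MultipleChoiceItem:
    """Hypothetical container for one question; not a HELM class."""

    question: str
    choices: List[str]
    correct_index: int


def row_to_item(row: Dict[str, str], rng: random.Random) -> MultipleChoiceItem:
    """Convert one dataset row into a shuffled multiple-choice item."""
    # Column names are assumed from the dataset card; verify them first.
    answers = [row["Correct Answer"].strip()]
    answers += [row[f"Incorrect Answer {i}"].strip() for i in (1, 2, 3)]
    # Shuffle positions rather than strings so the correct answer can be
    # tracked even if two answer strings happen to be identical.
    order = list(range(len(answers)))
    rng.shuffle(order)
    return MultipleChoiceItem(
        question=row["Question"].strip(),
        choices=[answers[i] for i in order],
        correct_index=order.index(0),  # position 0 held the correct answer
    )


def load_gpqa_items(subset: str = "gpqa_main", seed: int = 0) -> List[MultipleChoiceItem]:
    """Load one subset and convert every row (needs network and dataset access)."""
    # Imported lazily so row_to_item can be tested without `datasets` installed.
    from datasets import load_dataset

    # The subset name and the "train" split are assumptions, see the dataset card.
    dataset = load_dataset("Idavidrein/gpqa", subset, split="train")
    rng = random.Random(seed)  # fixed seed keeps the answer order reproducible
    return [row_to_item(row, rng) for row in dataset]
```

In an actual scenario class, each item would presumably be mapped onto HELM's own instance and reference types in the same way `mmlu_scenario.py` does; the seeded shuffle is there only so that the position of the correct answer is reproducible across runs.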