Folders and files

data_utils
fast_agent
prompts
slow_agent
.gitignore
README.md
eval-gpt-3.5.sh
eval-gpt-4.sh
eval-tgi.sh
eval.py
eval_utils.py
metrics.py
requirements.txt

README.md

ScienceWorld

This evaluation code is adapted from SwiftSage.

Installation

conda create -n sciworld python=3.8 pip
conda activate sciworld
pip3 install scienceworld==1.1.3
pip3 install -r requirements.txt
pip3 install torch --extra-index-url https://download.pytorch.org/whl/cu116
conda install -c "nvidia/label/cuda-11.6.0" cuda-toolkit
conda install -c conda-forge openjdk # if needed
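
After running the commands above, a quick sanity check can confirm that the pinned package and a Java runtime are visible from the active environment. The following is a minimal sketch and not part of this repository: the function name `check_environment` is hypothetical, and it only inspects installed package metadata and the `PATH`, without starting ScienceWorld itself.

```python
import shutil
import sys
from importlib import metadata


def check_environment(package="scienceworld", expected_version="1.1.3"):
    """Report whether the pinned package and a Java runtime are visible.

    Hypothetical helper for illustration; it does not launch the simulator.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None  # package is not installed in this environment
    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "package_version": installed,
        "version_matches": installed == expected_version,
        "java_found": shutil.which("java") is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `java_found` is reported as False, the optional `openjdk` install step above is probably the one that is needed.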

Evaluation

GPT

Modify eval-gpt-3.5.sh or eval-gpt-4.sh to provide your OpenAI API key and, optionally, specify a model. Then run the script you edited, for example:

bash eval-gpt-3.5.sh

HuggingFace TGI (Text Generation Inference)

Modify eval-tgi.sh to provide your TGI controller addresses as a comma-separated list. Then run:

bash eval-tgi.sh
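
To illustrate what a comma-separated list of controller addresses looks like once parsed, here is a small sketch. It is an assumption-laden example, not the logic in eval.py: the function `parse_controller_addresses`, the example addresses, and the round-robin assignment are all hypothetical, and the actual scripts may distribute requests differently.

```python
from itertools import cycle


def parse_controller_addresses(raw):
    """Split a comma-separated string into a list of non-empty addresses."""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Hypothetical addresses, for illustration only.
addresses = parse_controller_addresses(
    "http://127.0.0.1:8080, http://127.0.0.1:8081"
)

# Assumed strategy: hand out controllers in round-robin order.
next_address = cycle(addresses)

for task_id in range(3):
    print(task_id, next(next_address))
```

Stripping whitespace and dropping empty entries keeps a trailing comma or stray space in the list from producing an invalid address.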

After evaluation is done, you can run python metrics.py to get the results.
