Install the required packages:
pip install -r requirements.txt
Tasks and datasets are organized as follows:
├── MATH
│ ├── algebra
│ ├── counting_and_probability
│ ├── geometry
│ ├── intermediate_algebra
│ ├── number_theory
│ ├── prealgebra
│ └── precalculus
├── TableQA
│ ├── TabMWP
│ ├── WTQ
│ └── HiTab
└── VQA
  └── GQA
To run TroVE on a task:
python run_trove.py --task_name "math/algebra"
- For MATH tasks, specify the task name as math/${dataset_name}, e.g., math/algebra.
- For TableQA and VQA tasks, use the dataset name directly: [tabmwp, wtq, hitab, gqa].
Note that the specified --task_name argument should be lowercase.
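The naming rules above can be sketched as a small helper. This is only an illustration: to_task_name and all_task_names are hypothetical names that are not part of this repository, and the dataset lists are copied from the directory tree above.

```python
# Hypothetical helper (not part of the repository): builds --task_name values
# from the dataset directory names, following the rules described above.
MATH_DATASETS = [
    "algebra", "counting_and_probability", "geometry",
    "intermediate_algebra", "number_theory", "prealgebra", "precalculus",
]
OTHER_DATASETS = ["TabMWP", "WTQ", "HiTab", "GQA"]


def to_task_name(dataset):
    """Return the lowercased --task_name value for a dataset directory name."""
    name = dataset.lower()
    if name in MATH_DATASETS:
        # MATH datasets are addressed as math/${dataset_name}.
        return f"math/{name}"
    # TableQA and VQA datasets use the bare lowercased dataset name.
    return name


def all_task_names():
    """Return the task names for every dataset listed in the tree above."""
    return [to_task_name(d) for d in MATH_DATASETS + OTHER_DATASETS]


if __name__ == "__main__":
    for task in all_task_names():
        # Print each command instead of running it.
        print(f'python run_trove.py --task_name "{task}"')
```

Printing the commands rather than executing them keeps the sketch free of side effects; the output can be pasted into a shell or adapted to a job scheduler.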
To run the baseline methods:
python baseline.py --task_name "math/algebra" --suffix "primitive" # or "instance"
Note that for the GQA dataset, the locate_objects and visual_qa functions are implemented as FastAPI endpoints, so you need to launch the server first (as below) and then run the TroVE or baseline experiments.
uvicorn server.gqa:app
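Since the GQA experiments need the server to be reachable, a launcher script may want to wait for it before starting a run. Below is a minimal sketch of such a check. It assumes the server listens on uvicorn's default address, 127.0.0.1:8000, because the command above passes no --host or --port. wait_for_port is a hypothetical helper and is not part of this repository.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0):
    """Poll until a TCP connection to (host, port) succeeds or timeout elapses.

    Returns True if the port accepted a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not reachable yet; retry after a short pause.
            time.sleep(0.5)
    return False
```

For example, calling wait_for_port("127.0.0.1", 8000) before launching run_trove.py or baseline.py with --task_name "gqa" confirms only that the port is open. It does not confirm that the models behind locate_objects and visual_qa have finished loading.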
To evaluate the results:
python -m utils.eval --results_path ${RESULTS_PATH}