This repository contains example code for evaluating models on our benchmark NaviTrace, including model inference via API and score calculation. The benchmark consists of a validation split and a test split with hidden ground truths. To see how your model scores on the test set, or to submit your model to the leaderboard, check out this Hugging Face Space.
- Clone this repository:
  ```shell
  git clone https://github.com/leggedrobotics/navitrace_evaluation.git
  ```
- Create and activate a Python 3.10 environment with your preferred tools, then install the dependencies:
  ```shell
  pip install -r ./requirements.txt
  ```
- Prepare an API key and base URL for the model that you want to evaluate
- Run the notebook `src/run_evaluation.ipynb`, e.g. with
  ```shell
  jupyter lab src/run_evaluation.ipynb
  ```
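
The notebook uses the API key and base URL you prepared to query the model. As a rough sketch of what such a call looks like, assuming an OpenAI-compatible chat endpoint (an assumption, not something this repository specifies), a request could be assembled with the standard library; the endpoint path, payload fields, and placeholder values below are purely illustrative:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"                        # placeholder, not a real key
BASE_URL = "https://your-provider.example/v1"   # hypothetical base URL

# Hypothetical chat-completion payload; the real notebook may differ.
payload = {
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Describe a safe path."}],
}

# Build the request object; the Authorization header carries the API key.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(req) would send the request; it is not executed here.
```

The point is only the shape of the call: base URL plus endpoint path, bearer-token auth, and a JSON body naming the model. Refer to the notebook for the exact client your evaluation actually uses.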