Flexible simulation and evaluation framework for generative IR. [demo] [docs]
Tested with Node.js 21.7.2 and npm 10.5.0.
If you are not within the Webis network, you first need to replace all occurrences of the URL https://llm.srv.webis.de/api/chat and the model (default) in the configuration with the values for a server you have access to (OpenAI-compatible API). Then run:
npm install
node bin/genirsim static/configurations/discussion.json > eval.json
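The URL replacement described above can be scripted rather than done by hand. The following is a minimal sketch, not part of GenIRSim: the function name `replace_llm_url`, the target URL `https://llm.example.org/api/chat`, and the output file name `my-configuration.json` are placeholders chosen for illustration. It only rewrites the URL; the model name still has to be adjusted in the resulting file to match your server.

```shell
# URL used in the shipped configurations (dots escaped so sed matches them literally).
OLD_URL='https://llm\.srv\.webis\.de/api/chat'
# Hypothetical replacement endpoint; substitute the URL of a server you can access.
NEW_URL='https://llm.example.org/api/chat'

# Print a copy of the given configuration file with every occurrence
# of the old URL replaced by the new one. The original file is not modified.
replace_llm_url() {
  sed "s#${OLD_URL}#${NEW_URL}#g" "$1"
}

# Example usage (run from the repository root):
#   replace_llm_url static/configurations/discussion.json > my-configuration.json
#   node bin/genirsim my-configuration.json > eval.json
```

Writing to a separate file keeps the shipped configuration untouched, so it can still be used as-is from within the Webis network. The sketch assumes the replacement URL contains neither `#` nor `&`, which have special meaning in the `sed` expression.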
To run the web server:
npm install
node bin/genirsim-server
or, using Docker (requires jq to read the version from package.json):
GENIRSIM_VERSION=$(jq -r '.version' package.json)
docker run --rm -it -p 8000:8000 ghcr.io/webis-de/genirsim:${GENIRSIM_VERSION}
If you use GenIRSim in your research, please cite the following publication:
Johannes Kiesel, Marcel Gohsen, Nailia Mirzakhmedova, Matthias Hagen, and Benno Stein.
Who Will Evaluate the Evaluators? Exploring the Gen-IR User Simulation Space.
In Lorraine Goeuriot et al., editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction.
15th International Conference of the CLEF Association (CLEF 2024),
volume 14958 of Lecture Notes in Computer Science, pages 166–171, September 2024. Springer.
See the paper's entry on the Webis publication page for the BibTeX entry, additional links, and the paper itself.