
[COLING Demos 2025] an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs


4real3000/EasyJudge


🕵️EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs

  • Lightweight Usage Model: EasyJudge is built to minimize dependency requirements, offering a simple installation process and clear documentation. Users can launch the evaluation interface with only a few basic commands.

  • Comprehensive Evaluation Tool: EasyJudge offers a highly customizable interface, allowing users to select evaluation scenarios and flexibly combine evaluation criteria based on their needs. The visualization interface has been carefully designed to provide users with an intuitive view of various aspects of the evaluation results.

  • Efficient Inference Engine: EasyJudge employs model quantization, memory management optimization, and hardware acceleration support to enable efficient inference. As a result, EasyJudge can run seamlessly on consumer-grade GPUs and even CPUs.

System Overview

[Image: EasyJudge system overview]

Model

EasyJudge is now available on the Hugging Face Hub: 🤗 4real/EasyJudge_gguf

Quick Start

(Example of Deploying on AutoDL Cloud Server)

Deploy ollama

1. Install ollama on AutoDL
export OLLAMA_MODELS=/root/autodl-tmp/models
curl -fsSL https://ollama.com/install.sh | sh
2. Start the service
ollama serve
3. Import EasyJudge models

In each Modelfile, change the path after FROM to the local path where you downloaded the model from Hugging Face.

ollama create PAIRWISE -f /root/autodl-tmp/EasyJudge/Modelfile/PAIRWISE.Modelfile
ollama create POINTWISE -f /root/autodl-tmp/EasyJudge/Modelfile/POINTWISE.Modelfile
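
Editing the FROM line can also be scripted before running the ollama create commands above. The following is a minimal sketch, not part of EasyJudge itself: set_modelfile_from is a hypothetical helper name, and the .gguf path in the example call is a placeholder for wherever the downloaded model file actually lives. It assumes each Modelfile has a single FROM line and that the path contains neither "|" nor "&".

```shell
# Hypothetical helper: point the FROM line of a Modelfile at a local GGUF file.
# Usage: set_modelfile_from <modelfile> <path-to-gguf>
set_modelfile_from() {
  # Rewrite the FROM line (matched case-insensitively) and leave every
  # other line untouched. A temporary file is used instead of "sed -i"
  # so the helper behaves the same with GNU and BSD sed.
  sed -E "s|^[Ff][Rr][Oo][Mm][[:space:]].*|FROM $2|" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Example call (the .gguf file name is a placeholder, not the real one):
# set_modelfile_from /root/autodl-tmp/EasyJudge/Modelfile/PAIRWISE.Modelfile \
#   /root/autodl-tmp/models/PAIRWISE.gguf
```

After both models are created, running ollama list should show PAIRWISE and POINTWISE among the registered models.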

Environment Configuration

(EasyJudge was developed with PyTorch 2.3.0, Python 3.12 (Ubuntu 22.04), and CUDA 12.1.)

1. Create conda environment
conda create -n EasyJudge python=3.12
conda init
conda activate EasyJudge
2. Install the required Python packages
pip install -r requirements.txt

Run the Program

To start the application, use the following command to run main.py with specific server configurations:

streamlit run main.py --server.address=127.0.0.1 --server.port=6006 --server.enableXsrfProtection=false
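
To confirm that the interface has come up, the address can be polled from a second terminal. This is a minimal sketch under stated assumptions: wait_for_url is a hypothetical helper, curl is assumed to be installed, and the /_stcore/health path is the health endpoint exposed by recent Streamlit versions (older versions may not provide it).

```shell
# Hypothetical helper: poll a URL until it answers with an HTTP success code.
# Usage: wait_for_url <url> [attempts] [delay-seconds]
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    # Sleep only if another attempt follows.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example call, using the address and port from the command above:
# wait_for_url http://127.0.0.1:6006/_stcore/health && echo "EasyJudge UI is up"
```

The helper returns 0 as soon as the server responds and 1 if every attempt fails, so it can gate follow-up commands in a deployment script.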

Citation

Please cite the repo or the paper if the model, code, resources, or conclusions in this repo are helpful to you.

@article{li2024easyjudge,
  title={EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs},
  author={Li, Yijie and Sun, Yuan},
  journal={arXiv preprint arXiv:2410.09775},
  year={2024}
}

Acknowledgements ❤️

We thank the authors of these projects for making their code publicly available: LLaMA-Factory, llama.cpp, ollama, auto-j, JudgeLM.
