Description
[x] I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
I updated to the new Ragas version (0.3.3) this morning, and during my evaluation run I got an error in the answer_relevancy metric.
When I downgrade Ragas to 0.3.2, everything works fine again.
Besides ResponseRelevancy I also use [LLMContextRecall, Faithfulness, ContextRelevance, AnswerAccuracy, FactualCorrectness(mode="recall"), FactualCorrectness(mode="precision")]; they all work as expected.
Ragas version: 0.3.3
Python version: 3.13.7
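For completeness, the installed version can be confirmed at runtime (assuming ragas exposes __version__, which recent releases do):

```python
import ragas

# Prints 0.3.3 in the failing environment and 0.3.2 after the downgrade.
print(ragas.__version__)
```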
Code to Reproduce
```python
from ragas import RunConfig, evaluate
from ragas.metrics import (
    AnswerAccuracy,
    ContextRelevance,
    FactualCorrectness,
    Faithfulness,
    LLMContextRecall,
    ResponseRelevancy,
)

result_t = evaluate(
    dataset=evaluation_dataset,
    metrics=[
        LLMContextRecall(),
        Faithfulness(),
        ContextRelevance(),
        ResponseRelevancy(),
        AnswerAccuracy(),
        FactualCorrectness(mode="recall"),
        FactualCorrectness(mode="precision"),
    ],
    llm=evaluator_llm,
    embeddings=embedding_model,
    run_config=RunConfig(timeout=1000),
    raise_exceptions=True,
)
```
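To isolate the problem, a run with only ResponseRelevancy should be enough to trigger it on 0.3.3; here is a minimal sketch under that assumption, reusing the same evaluation_dataset, evaluator_llm, and embedding_model as above:

```python
from ragas import RunConfig, evaluate
from ragas.metrics import ResponseRelevancy

# Assumption: ResponseRelevancy alone reproduces the crash, since every other
# metric in the full run above completes as expected.
result_min = evaluate(
    dataset=evaluation_dataset,
    metrics=[ResponseRelevancy()],
    llm=evaluator_llm,
    embeddings=embedding_model,
    run_config=RunConfig(timeout=1000),
    raise_exceptions=True,
)
```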
Error trace
File "/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/asyncio/tasks.py", line 507, in wait_for
return await fut
^^^^^^^^^
File "/src/testing_service/.venv/lib/python3.13/site-packages/ragas/metrics/_answer_relevance.py", line 135, in _single_turn_ascore
return await self._ascore(row, callbacks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/src/testing_service/.venv/lib/python3.13/site-packages/ragas/metrics/_answer_relevance.py", line 142, in _ascore
responses = await self.question_generation.generate_multiple(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
data=prompt_input, llm=self.llm, callbacks=callbacks, n=self.strictness
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "src/testing_service/.venv/lib/python3.13/site-packages/ragas/prompt/pydantic_prompt.py", line 231, in generate_multiple
output_string = resp.generations[0][i].text
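The trace ends inside generate_multiple, where one generation is indexed per requested completion (n=self.strictness). My guess, and it is only a guess since the exception type is cut off above, is that the LLM response contains fewer generations than strictness asked for, so the indexing goes out of range. A hypothetical, self-contained illustration of that shape mismatch (not ragas code):

```python
from dataclasses import dataclass


@dataclass
class Generation:
    text: str


@dataclass
class LLMResult:
    # One inner list per prompt, one Generation per returned completion.
    generations: list


# Suppose the model returned only a single completion for the prompt...
resp = LLMResult(generations=[[Generation(text="only one completion")]])

# ...while strictness asks for three (the trace shows n=self.strictness).
n = 3
for i in range(n):
    # Mirrors the failing line in pydantic_prompt.py; blows up once i
    # exceeds the number of completions actually returned.
    output_string = resp.generations[0][i].text
```

If that guess is right, ResponseRelevancy(strictness=1) might be an untested workaround until the regression is fixed.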
P.S.: the website has also been somewhat broken since the update to 0.3.3.