doc: Fix links to API docs and pip packages in RAG eval harness notebook (#38)
shadeMe authored Jul 17, 2024
1 parent 1bd290b commit 6c62db7
Showing 1 changed file with 5 additions and 5 deletions.
examples/rag_eval_harness.ipynb (10 changes: 5 additions & 5 deletions)
@@ -69,8 +69,8 @@
 "source": [
 "%%bash\n",
 "\n",
-"pip install -U git+https://github.com/deepset-ai/haystack-experimental.git@main\n",
-"pip install git+https://github.com/deepset-ai/haystack@main\n",
+"pip install -U haystack-ai\n",
+"pip install -U haystack-experimental\n",
 "pip install datasets\n",
 "pip install sentence-transformers"
 ]
@@ -931,7 +931,7 @@
 "\n",
 "You will evaluate your RAG pipeline using the `EvaluationHarness`. The `EvaluationHarness` executes a pipeline with a given set of inputs and evaluates its outputs with an evaluation pipeline using Haystack's built-in [Evaluators](https://docs.haystack.deepset.ai/docs/evaluators). This means you don't need to create a separate evaluation pipeline.\n",
 "\n",
-"The [`RAGEvaluationHarness`](https://docs.haystack.deepset.ai/v2.3-unstable/reference/evaluation-harness#ragevaluationharness) class, derived from the Evaluation Harness, simplifies the evaluation process specifically for RAG pipelines. It comes with a predefined set of evaluation metrics, detailed in the [`RAGEvaluationMetric`](https://docs.haystack.deepset.ai/v2.3-unstable/reference/evaluation-harness#ragevaluationmetric) enum, and basic RAG architecture examples, listed in the [`DefaultRAGArchitecture`](https://docs.haystack.deepset.ai/v2.3-unstable/reference/evaluation-harness#defaultragarchitecture) enum.\n",
+"The [`RAGEvaluationHarness`](https://docs.haystack.deepset.ai/reference/evaluation-harness#ragevaluationharness) class, derived from the Evaluation Harness, simplifies the evaluation process specifically for RAG pipelines. It comes with a predefined set of evaluation metrics, detailed in the [`RAGEvaluationMetric`](https://docs.haystack.deepset.ai/reference/evaluation-harness#ragevaluationmetric) enum, and basic RAG architecture examples, listed in the [`DefaultRAGArchitecture`](https://docs.haystack.deepset.ai/reference/evaluation-harness#defaultragarchitecture) enum.\n",
 "\n",
 "Now, create a harness to evaluate the embedding-based RAG pipeline. For evaluating the RAG pipeline mentioned above, use the `DefaultRAGArchitecture.GENERATION_WITH_EMBEDDING_RETRIEVAL` architecture. You will evaluate the pipeline using the [DocumentMAPEvaluator](https://docs.haystack.deepset.ai/docs/documentmapevaluator), [DocumentRecallEvaluator](https://docs.haystack.deepset.ai/docs/documentrecallevaluator), and [FaithfulnessEvaluator](https://docs.haystack.deepset.ai/docs/faithfulnessevaluator).\n"
 ]
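For context, the harness this cell describes is created roughly as in the sketch below. This is not part of the diff: the variable name `rag_pipeline`, the exact metric set, and the import path are assumptions based on the prose above and on the haystack-experimental package the notebook installs.

```python
# Sketch only: assumes `rag_pipeline` is the embedding-based RAG pipeline
# built earlier in the notebook, and that these enum members correspond to
# the three evaluators named above (MAP, recall, faithfulness).
from haystack_experimental.evaluation.harness.rag import (
    DefaultRAGArchitecture,
    RAGEvaluationHarness,
    RAGEvaluationMetric,
)

eval_harness = RAGEvaluationHarness(
    rag_pipeline,
    rag_components=DefaultRAGArchitecture.GENERATION_WITH_EMBEDDING_RETRIEVAL,
    metrics={
        RAGEvaluationMetric.DOCUMENT_MAP,
        RAGEvaluationMetric.DOCUMENT_RECALL_SINGLE_HIT,
        RAGEvaluationMetric.ANSWER_FAITHFULNESS,
    },
)
```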
@@ -1904,7 +1904,7 @@
 "source": [
 "## Evaluating and Comparing Different Pipelines\n",
 "\n",
-"To evaluate alternative approaches, you can initiate another evaluation run using the same inputs but with different overrides, leveraging [`RAGEvaluationOverrides`](https://docs.haystack.deepset.ai/v2.3-unstable/reference/evaluation-harness#ragevaluationoverrides).\n",
+"To evaluate alternative approaches, you can initiate another evaluation run using the same inputs but with different overrides, leveraging [`RAGEvaluationOverrides`](https://docs.haystack.deepset.ai/reference/evaluation-harness#ragevaluationoverrides).\n",
 "\n",
 "Now, update the model used with `OpenAIGenerator` in the RAG pipeline and execute the same EvaluationHarness instance:"
 ]
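A sketch of what "update the model and execute the same harness" might look like with `RAGEvaluationOverrides`. The component name `"generator"`, the model string, and the input variable are illustrative assumptions, not part of the commit:

```python
from haystack_experimental.evaluation.harness.rag import RAGEvaluationOverrides

# Assumption: the RAG pipeline's OpenAIGenerator component is named "generator".
overrides = RAGEvaluationOverrides(
    rag_pipeline={
        "generator": {"model": "gpt-4-turbo"},
    }
)

# Re-run the same harness instance on the same inputs, overriding only
# the generator model. `eval_input` stands for the evaluation input
# object used in the first run.
eval_run_gpt4 = eval_harness.run(
    inputs=eval_input,
    overrides=overrides,
    run_name="emb_eval_run_gpt4",
)
```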
@@ -1941,7 +1941,7 @@
 "id": "Q1YPapD2wYyA"
 },
 "source": [
-"Compare the results of the two evaluation runs with [`comparative_individual_scores_report()`](https://docs.haystack.deepset.ai/v2.3-unstable/reference/evaluation-api#baseevaluationrunresultcomparative_individual_scores_report). The results for the new pipeline will have the `emb_eval_run_gpt4_*` name."
+"Compare the results of the two evaluation runs with [`comparative_individual_scores_report()`](https://docs.haystack.deepset.ai/reference/evaluation-api#baseevaluationrunresultcomparative_individual_scores_report). The results for the new pipeline will have the `emb_eval_run_gpt4_*` name."
 ]
 },
 {
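A sketch of the comparison step this cell points at. It assumes `eval_run` holds the first run's output and that each run output exposes the linked report method via a `results` attribute; both names are assumptions, not confirmed by the diff:

```python
# Produces a tabular report (a pandas DataFrame) in which the second run's
# score columns carry the "emb_eval_run_gpt4_" prefix, per the note above.
comparison_df = eval_run.results.comparative_individual_scores_report(
    eval_run_gpt4.results
)
print(comparison_df)
```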
