feat: Implement RAGEvaluationHarness and related classes #10

Merged
merged 5 commits into from
Jun 3, 2024

Conversation

shadeMe
Contributor

@shadeMe shadeMe commented May 29, 2024

Proposed Changes:

This PR implements the RAGEvaluationHarness and its related classes, building on #4 and #5.

Part of deepset-ai/haystack#7526.

How did you test it?

Unit tests

Checklist

@shadeMe shadeMe force-pushed the feat/eval-harness-rag branch 4 times, most recently from 7cd05d2 to 84c0fd2, on May 29, 2024 15:39
@shadeMe shadeMe force-pushed the feat/eval-harness-rag branch from 84c0fd2 to cb80fca on May 29, 2024 15:42
@shadeMe shadeMe marked this pull request as ready for review May 29, 2024 15:46
@shadeMe shadeMe requested a review from a team as a code owner May 29, 2024 15:46
@shadeMe shadeMe requested review from masci, davidsbatista and silvanocerza and removed request for a team and masci May 29, 2024 15:46
@shadeMe
Contributor Author

shadeMe commented May 29, 2024

The test failure is due to a seemingly missing dependency, but the logs show that it's being installed 🤔 I'll look into this later, but the PR is reviewable. Tests pass locally.

Contributor

@davidsbatista davidsbatista left a comment

Good work Madeesh, I left some minor comments, otherwise looks good.

test/test_requirements.txt (outdated, resolved)
haystack_experimental/evaluation/harness/rag/harness.py (outdated, resolved)
return cls(rag_pipeline, rag_components, deepcopy(metrics))

@classmethod
def default_with_keyword_retriever(
Contributor

This is identical to the function above except for the input_mapping={"query": "query"} on RAGExpectedComponent.QUERY_PROCESSOR. Could we collapse the two into a single function, for example by making the mapping an argument?

RAGExpectedComponent.QUERY_PROCESSOR: RAGExpectedComponentMetadata(
    name="retriever", input_mapping={"query": "query"}
)

Contributor Author

The motivation behind having separate "default" functions is to make it as simple as possible for the user to initialize a pipeline without having to understand the mapping of the expected components. If we were to merge the two into a single function, we'd have to expose some kind of enumeration to select the type of query processor component (as it can't be a simple boolean flag given the semantics of the parameter), which introduces more code than it removes.
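
The trade-off described in this reply can be made concrete with a minimal sketch. Everything in it is a stand-in written for illustration and is not the harness's actual code: `ExpectedComponent` and `ComponentMetadata` mimic `RAGExpectedComponent` and `RAGExpectedComponentMetadata`, the functions return plain dictionaries rather than harness instances, and `QueryProcessorKind`, `default_components`, the name `default_with_embedding_retriever`, and the embedder's name and mapping are all assumptions. Only the keyword mapping (`name="retriever"`, `input_mapping={"query": "query"}`) comes from the snippet quoted in this thread.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class ExpectedComponent(Enum):
    """Stand-in for RAGExpectedComponent (only the member discussed here)."""
    QUERY_PROCESSOR = "query_processor"


@dataclass
class ComponentMetadata:
    """Stand-in for RAGExpectedComponentMetadata."""
    name: str
    input_mapping: Dict[str, str] = field(default_factory=dict)


# Option A: one named default per pipeline flavour.
# The caller never has to look at the component mapping.
def default_with_embedding_retriever() -> Dict[ExpectedComponent, ComponentMetadata]:
    # Assumed values: the thread does not show the embedding variant.
    return {
        ExpectedComponent.QUERY_PROCESSOR: ComponentMetadata(
            name="query_embedder", input_mapping={"query": "text"}
        )
    }


def default_with_keyword_retriever() -> Dict[ExpectedComponent, ComponentMetadata]:
    # Values taken from the snippet quoted in the review comment.
    return {
        ExpectedComponent.QUERY_PROCESSOR: ComponentMetadata(
            name="retriever", input_mapping={"query": "query"}
        )
    }


# Option B: a single merged function. A boolean flag cannot express
# "which kind of query processor", so a new public enum is required.
class QueryProcessorKind(Enum):
    """Hypothetical enum that the merged function would have to expose."""
    EMBEDDING = "embedding"
    KEYWORD = "keyword"


def default_components(kind: QueryProcessorKind) -> Dict[ExpectedComponent, ComponentMetadata]:
    # Build the metadata per call so callers never share mutable mappings.
    if kind is QueryProcessorKind.EMBEDDING:
        metadata = ComponentMetadata(name="query_embedder", input_mapping={"query": "text"})
    elif kind is QueryProcessorKind.KEYWORD:
        metadata = ComponentMetadata(name="retriever", input_mapping={"query": "query"})
    else:
        raise ValueError(f"Unsupported query processor kind: {kind!r}")
    return {ExpectedComponent.QUERY_PROCESSOR: metadata}
```

Under these assumptions both options produce the same mapping, but Option B adds an enum that the caller must import and understand, which is the extra surface area the reply argues against.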

haystack_experimental/evaluation/harness/rag/parameters.py (outdated, resolved)
haystack_experimental/evaluation/harness/rag/parameters.py (outdated, resolved)
test/evaluation/harness/rag/test_harness.py (outdated, resolved)
test/evaluation/harness/rag/test_harness.py (outdated, resolved)
shadeMe and others added 4 commits June 3, 2024 12:36
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
@coveralls

coveralls commented Jun 3, 2024

Pull Request Test Coverage Report for Build 9349272889

Details

  • 324 of 336 (96.43%) changed or added relevant lines in 8 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+4.1%) to 97.764%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| test/evaluation/harness/rag/test_harness.py | 151 | 153 | 98.69% |
| haystack_experimental/evaluation/harness/rag/harness.py | 113 | 123 | 91.87% |

Totals Coverage Status
Change from base Build 9193174995: 4.1%
Covered Lines: 612
Relevant Lines: 626

💛 - Coveralls

Contributor

@silvanocerza silvanocerza left a comment

Looks good. 👍

@shadeMe shadeMe merged commit 90eb38a into deepset-ai:main Jun 3, 2024
5 checks passed
@shadeMe shadeMe deleted the feat/eval-harness-rag branch June 3, 2024 15:05
4 participants