feat: Evals with explanations #1699
Merged
Changes from 20 commits
33 commits
7956822 Add explanation template (anticorrelator)
763c481 Spike out explanations (anticorrelator)
38d2be9 Ruff 🐶 (anticorrelator)
c314c74 Merge remote-tracking branch 'origin' into dustin/evals-with-explanat… (anticorrelator)
a66d6eb Use tailored explanation prompt (anticorrelator)
85092ca Add explanation templates for all evals (anticorrelator)
2b0f29b Wire up prompt template objects (anticorrelator)
f0aa75f Update models to use new template object (anticorrelator)
6c6140a Ruff 🐶 (anticorrelator)
a11a4ba Resolve type and linter issues (anticorrelator)
9c5af4e Fix more typing issues (anticorrelator)
a551d60 Address first round of feedback (anticorrelator)
67e9e13 Extract `ClassificationTemplate` ABC (anticorrelator)
75a027c Label extraction belongs to the "template" object (anticorrelator)
6fc6fc6 Add logging for unparseable labels (anticorrelator)
eb11ebb Merge remote-tracking branch 'origin' into dustin/evals-with-explanat… (anticorrelator)
59d9ded Merge remote-tracking branch 'origin' into dustin/evals-with-explanat… (anticorrelator)
a2509c9 Patch in openai key environment variable for tests (anticorrelator)
eaff46d Refactor to address feedback (anticorrelator)
b8e13e3 Evaluators should use PromptTemplates (anticorrelator)
d0f1d8b Pair with Mikyo (anticorrelator)
888f223 Fix for CI (anticorrelator)
cebda8c `PROMPT_TEMPLATE_STR` -> `PROMPT_TEMPLATE` (anticorrelator)
093e59c Print prompt if verbose (anticorrelator)
17025ef Add __repr__ to `PromptTemplate` (anticorrelator)
29ff6b4 fix relevance notebook (mikeldking)
cc8e7e2 docs: update evals (mikeldking)
e564db0 Normalize prompt templates in llm_classify (anticorrelator)
6cdbecb Ruff 🐶 (anticorrelator)
ad1ef59 Merge remote-tracking branch 'origin/dustin/evals-with-explanations' … (anticorrelator)
2b257d2 feat(evals): add an output_parser to llm_generate (#1736) (mikeldking)
00d9cb4 docs(evals): document llm_generate with output parser (#1741) (mikeldking)
8ac5201 Merge branch 'main' into dustin/evals-with-explanations (mikeldking)
@@ -1,26 +1,38 @@
 from .default_templates import (
-    CODE_READABILITY_PROMPT_RAILS_MAP,
-    CODE_READABILITY_PROMPT_TEMPLATE_STR,
-    HALLUCINATION_PROMPT_RAILS_MAP,
-    HALLUCINATION_PROMPT_TEMPLATE_STR,
-    RAG_RELEVANCY_PROMPT_RAILS_MAP,
-    RAG_RELEVANCY_PROMPT_TEMPLATE_STR,
-    TOXICITY_PROMPT_RAILS_MAP,
-    TOXICITY_PROMPT_TEMPLATE_STR,
+    CODE_READABILITY_PROMPT_RAILS,

Review comment: We need to preserve the binary True/False of the labels in the case of binary classification so we should not touch these mappings.

+    CODE_READABILITY_PROMPT_TEMPLATE,
+    HALLUCINATION_PROMPT_RAILS,
+    HALLUCINATION_PROMPT_TEMPLATE,
+    RAG_RELEVANCY_PROMPT_RAILS,
+    RAG_RELEVANCY_PROMPT_TEMPLATE,
+    TOXICITY_PROMPT_RAILS,
+    TOXICITY_PROMPT_TEMPLATE,
+)
+from .template import (
+    NOT_PARSABLE,
+    ClassificationTemplate,
+    PromptOptions,
+    PromptTemplate,
+    map_template,
+    normalize_classification_template,
+    normalize_prompt_template,
 )
-from .template import NOT_PARSABLE, PromptTemplate, map_template, normalize_template

 __all__ = [
+    "UserTemplate",
+    "PromptOptions",
     "PromptTemplate",
-    "normalize_template",
+    "ClassificationTemplate",
+    "normalize_classification_template",
+    "normalize_prompt_template",
     "map_template",
     "NOT_PARSABLE",
-    "RAG_RELEVANCY_PROMPT_RAILS_MAP",
-    "RAG_RELEVANCY_PROMPT_TEMPLATE_STR",
-    "HALLUCINATION_PROMPT_RAILS_MAP",
-    "HALLUCINATION_PROMPT_TEMPLATE_STR",
-    "CODE_READABILITY_PROMPT_RAILS_MAP",
-    "CODE_READABILITY_PROMPT_TEMPLATE_STR",
-    "TOXICITY_PROMPT_RAILS_MAP",
-    "TOXICITY_PROMPT_TEMPLATE_STR",
+    "CODE_READABILITY_PROMPT_RAILS",
+    "CODE_READABILITY_PROMPT_TEMPLATE",
+    "HALLUCINATION_PROMPT_RAILS",
+    "HALLUCINATION_PROMPT_TEMPLATE",
+    "RAG_RELEVANCY_PROMPT_RAILS",
+    "RAG_RELEVANCY_PROMPT_TEMPLATE",
+    "TOXICITY_PROMPT_RAILS",
+    "TOXICITY_PROMPT_TEMPLATE",
 ]
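
The review comment on the rails mappings in this diff can be illustrated with a minimal sketch. It assumes a rails map keyed by booleans; the names `RELEVANCY_RAILS_MAP`, `RELEVANCY_RAILS`, and `label_to_bool`, and the label strings themselves, are hypothetical and not taken from the PR. It only shows why a flat list of rails loses information that a boolean-keyed mapping keeps.

```python
from collections import OrderedDict
from typing import Optional

# Hypothetical rails map: boolean keys record which label is the "positive" class.
RELEVANCY_RAILS_MAP = OrderedDict({True: "relevant", False: "irrelevant"})

# A flat list keeps the label strings but drops the True/False association.
RELEVANCY_RAILS = list(RELEVANCY_RAILS_MAP.values())


def label_to_bool(label: str) -> Optional[bool]:
    """Return the boolean a label maps to, or None if the label is not a rail."""
    for flag, rail in RELEVANCY_RAILS_MAP.items():
        if rail == label:
            return flag
    return None
```

With only `RELEVANCY_RAILS`, a caller would have to rely on list position to recover which label means True, which is the kind of implicit convention the comment appears to warn against.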
documentation: I would add a bit more color here so the user understands which part of the execution is failing, maybe including the response.
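
One way to act on this suggestion is sketched below, under stated assumptions: `extract_label` is a hypothetical helper, not the PR's implementation, and the value of `NOT_PARSABLE` here is a stand-in for the constant exported in the diff. The point is only that the warning names the failing step (label extraction) and includes the raw response.

```python
import logging
import re
from typing import List

logger = logging.getLogger(__name__)

# Stand-in sentinel; the value of the real NOT_PARSABLE constant is an assumption.
NOT_PARSABLE = "NOT_PARSABLE"


def extract_label(response: str, rails: List[str]) -> str:
    """Return the single rail found in the response, else NOT_PARSABLE."""
    # Whole-word matching so that "relevant" does not match inside "irrelevant".
    matches = [
        rail
        for rail in rails
        if re.search(rf"\b{re.escape(rail)}\b", response, flags=re.IGNORECASE)
    ]
    if len(matches) == 1:
        return matches[0]
    # Say which step failed and show the response so the user can debug it.
    logger.warning(
        "Label extraction failed: expected exactly one of %s, found %s in response %r",
        rails,
        matches,
        response,
    )
    return NOT_PARSABLE
```

For example, `extract_label("The document is relevant.", ["relevant", "irrelevant"])` returns `"relevant"`, while a response that mentions neither or both labels logs a warning and returns the sentinel.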