
feat(evals): add an output_parser to llm_generate #1736

Merged
2 commits merged into main on Nov 14, 2023

Conversation

mikeldking
Contributor

@mikeldking mikeldking commented Nov 13, 2023

resolves #1713

In more custom cases, there's a need to parse the output of llm_generate, e.g. extracting two labels at once. This also supports providing explanations as part of the JSON output if needed. It also pairs well with GPT-4 Turbo's JSON mode.

NB: this is a breaking change, but since llm_generate is under the experimental namespace, it is being treated as a feature.
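As a sketch of the new hook (the schema and key names below are hypothetical; only the output_parser signature in this PR is taken from the source), a parser that pulls two labels plus an explanation out of a JSON response might look like:

```python
import json
from typing import Any, Dict


def output_parser(response: str) -> Dict[str, Any]:
    """Parse a model response expected to be JSON with two labels and an
    explanation (hypothetical schema, for illustration only)."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError as exc:
        # Surface the parse failure as a row value instead of raising,
        # so one malformed response does not abort the whole run.
        return {"__error__": str(exc)}
    return {
        "toxicity_label": data.get("toxicity_label"),
        "sentiment_label": data.get("sentiment_label"),
        "explanation": data.get("explanation"),
    }
```

It would then be passed as `llm_generate(..., output_parser=output_parser)`, with each returned dict contributing one row of parsed output.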

def llm_generate(
    dataframe: pd.DataFrame,
    template: Union[PromptTemplate, str],
    model: BaseEvalModel,
    system_instruction: Optional[str] = None,
    verbose: bool = False,
    output_parser: Optional[Callable[[str], Dict[str, Any]]] = None,
) -> List[str]:
Contributor

Suggested change:
-    output_parser: Optional[Callable[[str], Dict[str, Any]]] = None,
+    output_parser: Optional[Callable[[str], Any]] = None,

Seems like the user might want to write all sorts of parsers.

Contributor Author
@axiomofjoy But we need to know which column to map the output to; this at least forces the user to declare the column names. Or are you thinking of something different?
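The column-mapping point can be sketched in plain Python (hypothetical labels; the real mapping lives inside llm_generate): each dict returned by the parser declares its columns by key, so per-row parser outputs pivot naturally into named columns.

```python
# Simulated per-row parser outputs (hypothetical label values).
parsed_rows = [
    {"label": "relevant", "explanation": "matches the query"},
    {"label": "irrelevant", "explanation": "off-topic"},
]

# Pivot the list of dicts into named columns: each key becomes a
# column name, each value list becomes that column's cells.
columns = {key: [row.get(key) for row in parsed_rows] for key in parsed_rows[0]}
```

With the looser `Callable[[str], Any]` signature, a parser returning a bare string or list would give the caller nothing to name the column by, which is the trade-off being debated here.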

@mikeldking mikeldking merged commit 6408dda into main Nov 14, 2023
9 checks passed
@mikeldking mikeldking deleted the 1713-structured-output-parsing branch November 14, 2023 00:17
mikeldking added a commit that referenced this pull request Nov 14, 2023
* feat(evals): add an output_parser param for structured data extraction

* remove brittle test
mikeldking added a commit that referenced this pull request Nov 14, 2023
* Add explanation template

* Spike out explanations

* Ruff 🐶

* Use tailored explanation prompt

* Add explanation templates for all evals

* Wire up prompt template objects

* Update models to use new template object

* Ruff 🐶

* Resolve type and linter issues

* Fix more typing issues

* Address first round of feedback

* Extract `ClassificationTemplate` ABC

* Label extraction belongs to the "template" object

* Add logging for unparseable labels

* Patch in openai key environment variable for tests

* Refactor to address feedback

* Evaluators should use PromptTemplates

* Pair with Mikyo

* Fix for CI

* `PROMPT_TEMPLATE_STR` -> `PROMPT_TEMPLATE`

* Print prompt if verbose

* Add __repr__ to `PromptTemplate`

* fix relevance notebook

* docs: update evals

* Normalize prompt templates in llm_classify

* Ruff 🐶

* feat(evals): add an output_parser to llm_generate (#1736)

* feat(evals): add an output_parser param for structured data extraction

* remove brittle test

* docs(evals): document llm_generate with output parser (#1741)

---------

Co-authored-by: Mikyo King <mikyo@arize.com>
mikeldking added a commit that referenced this pull request Nov 15, 2023
Successfully merging this pull request may close these issues.

[ENHANCEMENT] llm primitive to output structured JSON or structured data