Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added new mixin to modify partial parsing behaviour #1152

Merged
merged 9 commits into from
Nov 14, 2024

Conversation

ivanleomk
Copy link
Collaborator

@ivanleomk ivanleomk commented Nov 8, 2024

This introduces a new mixin which allows us to toggle streaming behaviour.

By using a LiteralPartialMixin, it doesn't allow incomplete strings.

setting=None story=None
setting=None story=None
setting=None story=None
setting=None story=None
setting=None story=None
setting=None story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story='whispers of the leaves'
---------
setting=None story=None
setting=None story=None
setting='' story=None
setting='A' story=None
setting='A tranquil' story=None
setting='A tranquil forest' story=None
setting='A tranquil forest' story=None
setting='A tranquil forest' story=None
setting='A tranquil forest' story=''
setting='A tranquil forest' story='Wh'
setting='A tranquil forest' story='Whispers'
setting='A tranquil forest' story='Whispers of'
setting='A tranquil forest' story='Whispers of the'
setting='A tranquil forest' story='Whispers of the leaves'
setting='A tranquil forest' story='Whispers of the leaves.'
setting='A tranquil forest' story='Whispers of the leaves.'

Important

Introduces PartialLiteralMixin to handle incomplete literal values in streaming, updating parsing logic and adding tests.

  • Behavior:
    • Introduces PartialLiteralMixin in partial.py to modify partial parsing behavior for incomplete literal values.
    • Updates model_from_chunks and model_from_chunks_async to use partial_mode based on PartialLiteralMixin.
  • Documentation:
    • Updates partial.md to include usage instructions for PartialLiteralMixin.
  • Tests:
    • Adds test_literal_partial_mixin and test_literal_partial_mixin_async in test_stream.py to verify behavior with PartialLiteralMixin.
    • Ensures field updates are correctly handled with and without the mixin.

This description was created by Ellipsis for 371afcc. It will automatically update as commits are pushed.

@ivanleomk
Copy link
Collaborator Author

from instructor.dsl.partial import LiteralPartialMixin
from pydantic import BaseModel
from openai import OpenAI
import instructor

client = OpenAI()
client = instructor.from_openai(client)


class Story(BaseModel, LiteralPartialMixin):
    setting: str
    story: str


resp = client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=Story,
    messages=[
        {
            "role": "user",
            "content": "Give me a 1 sentence haiku",
        }
    ],
)

for r in resp:
    print(r)

print("---------")


class Story(BaseModel):
    setting: str
    story: str


resp = client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=Story,
    messages=[
        {
            "role": "user",
            "content": "Give me a 1 sentence haiku",
        }
    ],
)

for r in resp:
    print(r)

setting=None story=None
setting=None story=None
setting=None story=None
setting=None story=None
setting=None story=None
setting=None story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story=None
setting='a quiet forest' story='whispers of the leaves'
---------
setting=None story=None
setting=None story=None
setting='' story=None
setting='A' story=None
setting='A tranquil' story=None
setting='A tranquil forest' story=None
setting='A tranquil forest' story=None
setting='A tranquil forest' story=None
setting='A tranquil forest' story=''
setting='A tranquil forest' story='Wh'
setting='A tranquil forest' story='Whispers'
setting='A tranquil forest' story='Whispers of'
setting='A tranquil forest' story='Whispers of the'
setting='A tranquil forest' story='Whispers of the leaves'
setting='A tranquil forest' story='Whispers of the leaves.'
setting='A tranquil forest' story='Whispers of the leaves.'

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to a353547 in 29 seconds

More details
  • Looked at 196 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. instructor/dsl/partial.py:37
  • Draft comment:
    The LiteralPartialMixin class is currently empty. If it's intended to be a marker interface, consider adding a docstring to clarify its purpose. If it should have functionality, ensure it's implemented.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The LiteralPartialMixin class is currently empty, which might be intentional for a marker interface, but it should be documented or implemented if it is expected to have functionality.
2. instructor/dsl/partial.py:134
  • Draft comment:
    The logic for setting partial_mode is duplicated in model_from_chunks and model_from_chunks_async. Consider refactoring to avoid repetition.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The partial_mode variable is set based on whether the class is a subclass of LiteralPartialMixin. This logic is repeated in both model_from_chunks and model_from_chunks_async. Consider refactoring to avoid duplication.
3. tests/llm/test_openai/test_stream.py:146
  • Draft comment:
    The test_literal_partial_mixin_async function is defined inside test_literal_partial_mixin. Consider moving it outside to avoid potential issues with test discovery and execution.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The test function test_literal_partial_mixin_async is defined inside another test function test_literal_partial_mixin. This is not a standard practice and can lead to unexpected behavior.
4. tests/llm/test_openai/test_stream.py:81
  • Draft comment:
    Assertions should include error messages for clarity and debugging purposes. This applies to all assertions in this file.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The assertion statements in the test functions lack error messages, which are important for debugging when the assertion fails.

Workflow ID: wflow_oj41lz4IIsdG1Ytl


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

cloudflare-workers-and-pages bot commented Nov 8, 2024

Deploying instructor-py with  Cloudflare Pages  Cloudflare Pages

Latest commit: 651c1f6
Status: ✅  Deploy successful!
Preview URL: https://27317ffe.instructor-py.pages.dev
Branch Preview URL: https://add-partial-mixin.instructor-py.pages.dev

View logs

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 89d62f2 in 14 seconds

More details
  • Looked at 30 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. tests/dsl/test_partial.py:119
  • Draft comment:
    Consider adding tests specifically for LiteralPartialMixin to ensure its behavior is as expected. This will help verify that incomplete strings are not allowed when using this mixin.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The PR introduces a new mixin LiteralPartialMixin to modify partial parsing behavior. The tests are updated to include this mixin in the Summary class. However, the PR does not include any tests for the behavior of LiteralPartialMixin itself, which is crucial to ensure its functionality.
2. tests/dsl/test_partial.py:116
  • Draft comment:
    Assertions should include error messages for clarity and debugging purposes. This applies to the assertions on lines 141 and 168.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The assertions in the test functions lack error messages, which is against the rule that assertions should always have an error message.

Workflow ID: wflow_6lHJ8zldDXBEpcNa


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on b52ec6a in 25 seconds

More details
  • Looked at 125 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. docs/concepts/partial.md:10
  • Draft comment:
    The class name LiteralPartialMixin should be updated to PartialStringHandlingMixin to reflect the changes made in the code.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. docs/concepts/partial.md:10
  • Draft comment:
    Update the mixin name to PartialStringHandlingMixin to match the code changes.
    If the data structure you're using has literal values, you need to make sure to import the `PartialStringHandlingMixin` mixin.
  • Reason this comment was not posted:
    Marked as duplicate.
3. tests/llm/test_openai/test_stream.py:87
  • Draft comment:
    Update the comment to reflect the new mixin name PartialStringHandlingMixin.
    # Test with PartialStringHandlingMixin
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_fCTIiHCBAcP6UeRo


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ivanleomk ivanleomk requested a review from jxnl November 8, 2024 09:48
Copy link
Collaborator

@jxnl jxnl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it not called PartialLiteralMixin

@ivanleomk
Copy link
Collaborator Author

why is it not called PartialLiteralMixin

Renamed to PartialLiteralMixin. My original intention was to give users the control over what the parsing mode was for their specific data structure but I realised the modes are unlikely to change so calling it Literal makes more sense

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 53dfdb0 in 1 minute and 5 seconds

More details
  • Looked at 229 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 7 drafted comments based on config settings.
1. tests/llm/test_openai/test_stream.py:86
  • Draft comment:
    The test description refers to PartialStringHandlingMixin, but the class name has been changed to PartialLiteralMixin. Update the description for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR changes the class name from PartialStringHandlingMixin to PartialLiteralMixin but the test description still refers to PartialStringHandlingMixin. This should be updated for consistency and clarity.
2. tests/llm/test_openai/test_stream.py:146
  • Draft comment:
    The test description refers to PartialStringHandlingMixin, but the class name has been changed to PartialLiteralMixin. Update the description for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR changes the class name from PartialStringHandlingMixin to PartialLiteralMixin but the test description still refers to PartialStringHandlingMixin. This should be updated for consistency and clarity.
3. tests/dsl/test_partial.py:118
  • Draft comment:
    The test description refers to PartialStringHandlingMixin, but the class name has been changed to PartialLiteralMixin. Update the description for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR changes the class name from PartialStringHandlingMixin to PartialLiteralMixin but the test description still refers to PartialStringHandlingMixin. This should be updated for consistency and clarity.
4. tests/dsl/test_partial.py:144
  • Draft comment:
    The test description refers to PartialStringHandlingMixin, but the class name has been changed to PartialLiteralMixin. Update the description for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR changes the class name from PartialStringHandlingMixin to PartialLiteralMixin but the test description still refers to PartialStringHandlingMixin. This should be updated for consistency and clarity.
5. tests/llm/test_openai/test_stream.py:176
  • Draft comment:
    Assertions should include an error message for clarity and debugging purposes. This applies to all assertions in this file and other test files.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The assertion statements in the test files lack error messages, which is against the rule that assertions should always have an error message.
6. tests/llm/test_openai/test_stream.py:148
  • Draft comment:
    Function names should follow a consistent naming pattern. Consider renaming test_literal_partial_mixin_async to match the pattern used in other async test functions.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The function names in the test files should follow a consistent naming pattern. There is inconsistency in naming between test_partial_model_async and test_literal_partial_mixin_async.
7. tests/dsl/test_partial.py:145
  • Draft comment:
    Function names should follow a consistent naming pattern. Consider renaming test_summary_extraction_async to match the pattern used in other async test functions.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The function names in the test files should follow a consistent naming pattern. There is inconsistency in naming between test_partial_model_async and test_literal_partial_mixin_async.

Workflow ID: wflow_uHUbE6DsqM0SUSXP


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 371afcc in 21 seconds

More details
  • Looked at 19 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. tests/llm/test_openai/test_stream.py:153
  • Draft comment:
    Use aclient instead of client for consistency.
    aclient = instructor.patch(aclient, mode=mode)
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The test function test_literal_partial_mixin_async uses client instead of aclient after patching, which is inconsistent and could lead to confusion or errors.
2. tests/llm/test_openai/test_stream.py:148
  • Draft comment:
    Ensure function names and parameter names follow consistent patterns. Consider renaming aclient to client for consistency.
async def test_literal_partial_mixin_async(model, mode, client):
  • Reason this comment was not posted:
    Confidence changes required: 70%
    The function name test_literal_partial_mixin_async is inconsistent with the naming pattern used in the file. The parameter aclient should be used consistently with client in other functions.

Workflow ID: wflow_aUFopz2wefsH8yhR


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ivanleomk ivanleomk merged commit 78a1926 into main Nov 14, 2024
14 of 15 checks passed
@ivanleomk ivanleomk deleted the add-partial-mixin branch November 14, 2024 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants