Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/support enumeration type hints #37

Merged
merged 20 commits into from
Feb 17, 2024

Conversation

lorenzocelli
Copy link
Contributor

Issue #10

Subclasses of enum.Enum can now be be used as type hints
Copy link
Member

@siliconlad siliconlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I've realized is the following.

from enum import Enum


class IntEnum(Enum):
    A = 1
    B = 2

IntEnum.A == 1  # False

Which means that when the LLM calls a function which accepts a type of IntEnum, it will fail because IntEnum.A is not the same as 1, for example.

To me there are two possible options:

  1. We could warn the user in the documentation that they need to deal with this. Not ideal but the fix would just be to add IntEnum(1).
  2. We could wrap the provided function to do the conversion ourselves. This would obviously be more convenient, but wonder if that introduces other problems.

What do you think @lorenzocelli

@lorenzocelli lorenzocelli linked an issue Feb 9, 2024 that may be closed by this pull request
@lorenzocelli
Copy link
Contributor Author

One thing I've realized is the following.

from enum import Enum


class IntEnum(Enum):
    A = 1
    B = 2

IntEnum.A == 1  # False

Which means that when the LLM calls a function which accepts a type of IntEnum, it will fail because IntEnum.A is not the same as 1, for example.

To me there are two possible options:

  1. We could warn the user in the documentation that they need to deal with this. Not ideal but the fix would just be to add IntEnum(1).
  2. We could wrap the provided function to do the conversion ourselves. This would obviously be more convenient, but wonder if that introduces other problems.

What do you think @lorenzocelli

Right. I agree with you, the first option is not ideal.

I think we could implement the second more easily if we stored the list of ParameterSchema objects in FunctionSchema. Then we could do something like the following inside _GPTEnabled.__call__ (ignoring args here):

def __call__(self, *args, **kwargs):

    p_schemas: list[ParameterSchema] = self.schema.parameters

    for schema in p_schemas:
        if isinstance(schema, ListParameterSchema):
            # replace value with corresponding enum type
            kwargs[schema.parameter.name] = IntEnum(kwargs[schema.parameter.name])

    return self.func(*args, **kwargs)

We could also delegate to ParameterSchema the task of converting a value back and remove the call to isinstance.

@siliconlad
Copy link
Member

Yeah I think that sounds good!

@lorenzocelli
Copy link
Contributor Author

We're still missing the value to enum conversion on invocation

@siliconlad siliconlad marked this pull request as draft February 10, 2024 18:16
@siliconlad
Copy link
Member

@lorenzocelli I've marked this as a draft. Let me know when it's ready for me to review 👍

@lorenzocelli
Copy link
Contributor Author

@siliconlad when we convert the enum to a set of values to send to the api, do you think we should keep the values or the names? Names are usually more informative than values, such as in this example from the python docs:

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

@lorenzocelli
Copy link
Contributor Author

lorenzocelli commented Feb 11, 2024

In order to store the parameter schemas inside the function schema, I found it convenient to remove the static methods in favour of a more instance-oriented approach. For example FunctionSchema.schema(...) -> dict becomes mySchema._populate_schema(...) called in the constructor to populate the schema dictionary. In this way it is easier to store other information like parameter_schemas in the process.

Another way could be to keep the static methods and provide them with a parameter_schemas dictionary to fill.

@lorenzocelli lorenzocelli marked this pull request as ready for review February 11, 2024 11:11
@siliconlad
Copy link
Member

@siliconlad when we convert the enum to a set of values to send to the api, do you think we should keep the values or the names? Names are usually more informative than values, such as in this example from the python docs:

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

The LLM might perform better with the names I think.

@lorenzocelli
Copy link
Contributor Author

The LLM might perform better with the names I think.

I suspected, I will mark this as a draft again then

@lorenzocelli lorenzocelli marked this pull request as draft February 11, 2024 14:55
@lorenzocelli lorenzocelli marked this pull request as ready for review February 11, 2024 16:18
Copy link
Member

@siliconlad siliconlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments :)

(some conflicts to resolve too)

tool2schema/schema.py Outdated Show resolved Hide resolved
tool2schema/schema.py Outdated Show resolved Hide resolved
Store parameter_schemas on initialisation but do not create the schema dictionary until to_json is called
@lorenzocelli lorenzocelli marked this pull request as draft February 12, 2024 22:28
Change the internal logic of ParameterSchema to go from _add_...(dict) methods to _get_...(), for similarity with FunctionSchema
@lorenzocelli
Copy link
Contributor Author

lorenzocelli commented Feb 15, 2024

Refactored ParameterSchema implementation to be closer to FunctionSchema implementation (using get_...() methods)

GPT-enabled functions can now be invoked with json-converted values as positional arguments
Add support for keyword arguments of type enum.Enum with a default value. Rename ParameterSchema.parse_value to decode_value, and add encode_value method to encode the default value when necessary.
@lorenzocelli
Copy link
Contributor Author

To support Enum parameters with default values, I added a ParameterSchema.encode_value method to convert an instance to its json representation (in the case of Enum, the name). Contextually I renamed parse_value (which converts the json representation to an instance) to decode_value.

@lorenzocelli lorenzocelli marked this pull request as ready for review February 15, 2024 16:47
Copy link
Member

@siliconlad siliconlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small comments. Otherwise looks good 🚀

tests/test_schema.py Outdated Show resolved Hide resolved
tool2schema/parameter_schema.py Outdated Show resolved Hide resolved
tool2schema/parameter_schema.py Show resolved Hide resolved
tool2schema/parameter_schema.py Outdated Show resolved Hide resolved
Use dictionary comprehension in ParameterSchema.to_json and check the value type in EnumTypeParameterSchema.decode_value
@lorenzocelli
Copy link
Contributor Author

I updated the 'enumerations' section of the README

lorenzocelli and others added 2 commits February 17, 2024 16:30
Co-authored-by: Angus Stewart <siliconlad@protonmail.com>
Copy link
Member

@siliconlad siliconlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@siliconlad siliconlad merged commit 2ac97fe into main Feb 17, 2024
5 checks passed
@siliconlad siliconlad deleted the feature/support-enumeration-type-hints branch February 17, 2024 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support enumeration type hints
2 participants