chore: split embedders in individual files #315
from pydantic import BaseModel
from typing_extensions import override

from ..embeddings import (
The class is unmodified, except for these required imports.
from pydantic import BaseModel
from typing_extensions import override

from ..embeddings import (
The class is unmodified, except for these required imports.
from pydantic import BaseModel
from typing_extensions import TypedDict, override

from ..embeddings import (
The class is unmodified, except for these required imports.
LGTM, but let's also have @JamesGuthrie review.
from .openai import OpenAI
from .voyageai import VoyageAI

__all__ = ["Ollama", "OpenAI", "VoyageAI"]
Do we need this __all__? According to this:
if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered
This seems like somewhat esoteric functionality, which we're not actively using, and I don't see the need to support. By removing it we can remove one more instruction in the readme above.
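To make the quoted behaviour concrete, here is a minimal sketch that builds a throwaway package on disk and compares an explicit import with a star import. The package name demo_embedders and its empty classes are hypothetical stand-ins, not the real pgai modules, and __all__ deliberately lists only one of the two names to make the difference visible.

```python
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical name: demo_embedders).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_embedders"
pkg.mkdir()
(pkg / "ollama.py").write_text("class Ollama:\n    pass\n")
(pkg / "openai.py").write_text("class OpenAI:\n    pass\n")
# __all__ deliberately lists only one of the two imported names.
(pkg / "__init__.py").write_text(
    "from .ollama import Ollama\n"
    "from .openai import OpenAI\n"
    "__all__ = ['Ollama']\n"
)
sys.path.insert(0, str(root))

# Explicit imports ignore __all__: both names resolve.
from demo_embedders import Ollama, OpenAI

# A star import binds only the names listed in __all__.
star_namespace = {}
exec("from demo_embedders import *", star_namespace)
star_names = sorted(n for n in star_namespace if not n.startswith("__"))
print(star_names)  # ['Ollama']
```

In this sketch the explicit import of OpenAI succeeds even though it is missing from __all__, which suggests the list only matters to code that relies on star imports.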
I made this in order to be able to import all providers through the same module: from .embedders import Ollama, OpenAI, VoyageAI
The alternative (unless you know another one) would be to use individual imports such as:
from .embedders.ollama import Ollama
from .embedders.openai import OpenAI
from .embedders.voyageai import VoyageAI
I'm fine with this option and with removing the contents of __init__.py. I agree that it would be one less step to consider when adding a new provider.
EDIT: A middle-ground alternative would be to remove only the __all__ but keep the imports in __init__.py:
# __init__.py file
from .ollama import Ollama
from .openai import OpenAI
from .voyageai import VoyageAI
That way we can still write from .embedders import Ollama, OpenAI, VoyageAI. (When I coded this, I believed the __all__ was necessary, but it is not.)
WDYT @JamesGuthrie ?
Thanks for the explanation! I don't have a strong opinion, but the second suggestion (imports in __init__.py) is "less magical", so I would prefer it over __all__.
Awesome, thanks for the fast answer. Code (and docs) updated with the second option.
from .voyageai import VoyageAI

__all__ = ["Ollama", "OpenAI", "VoyageAI"]
from .ollama import Ollama as Ollama
The redundant alias is needed; otherwise the linter complains, and the only alternative it offers is adding the name to the __all__ variable.
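As a rough illustration of why the alias is harmless, the sketch below checks that from .ollama import Ollama as Ollama behaves at runtime exactly like a plain import. The thread does not say which linter is in use; the assumption here is that it follows the common convention (used by checkers such as pyright and mypy for explicit re-exports) of treating import X as X as an intentional re-export. The package name demo_reexport is hypothetical.

```python
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical name: demo_reexport).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_reexport"
pkg.mkdir()
(pkg / "ollama.py").write_text("class Ollama:\n    pass\n")
# The redundant alias only signals intent to static tools;
# the interpreter treats it like "from .ollama import Ollama".
(pkg / "__init__.py").write_text("from .ollama import Ollama as Ollama\n")
sys.path.insert(0, str(root))

import demo_reexport
from demo_reexport.ollama import Ollama as DirectOllama

# The re-exported name is the very same class object.
same_object = demo_reexport.Ollama is DirectOllama
# No __all__ is defined, yet the package-level import still works.
has_all = hasattr(demo_reexport, "__all__")
print(same_object, has_all)  # True False
```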
This PR refactors the existing projects/pgai/pgai/embeddings.py file by splitting it into smaller files, each containing a single class for a specific embedding provider (currently voyageai, openai, and ollama). The primary goal is to improve maintainability and keep the file from growing indefinitely, which would make it harder to manage and navigate.