-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: #112 model singleton #136
Conversation
app.state.primary = args.primary | ||
app.state.reference = args.reference | ||
model = Model(primary_dataset_name=args.primary, reference_dataset_name=args.reference) | ||
app = create_app(model, graphiql=config.graphiql) | ||
|
||
uvicorn.run(app, port=args.port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured out how to run with automatic reloads using uvicorn.run("main:app", reload=True, port=args.port)
, but this does not work now that we are creating app
on the fly rather than importing a static app
object. Haven't figured out how to get automatic server reloads working with the current setup.
src/phoenix/core/model.py
Outdated
def _infer_dimension_data_type(self, dimension_name: str) -> DimensionDataType: | ||
# TODO: verify corresponding dimension of reference dataset has same type | ||
dataframe_dtype = self.primary_dataset.dataframe[dimension_name].dtype | ||
if is_numeric_dtype(dataframe_dtype): # type: ignore | ||
return DimensionDataType.NUMERIC | ||
elif is_object_dtype(dataframe_dtype): # type: ignore | ||
return DimensionDataType.CATEGORICAL | ||
raise ValueError("Unrecognized dimension type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First crack at inferring dimension data type from pandas
data type. The types on these methods are rough, ignoring for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
app/schema.graphql
Outdated
actual | ||
feature | ||
tag | ||
embedding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might want to nix this. Not important for now
src/phoenix/core/model.py
Outdated
for name in primary_actual_column_names: | ||
if name is not None: | ||
dimensions.append( | ||
Dimension( | ||
name=name, | ||
data_type=self._infer_dimension_data_type(name), | ||
type=DimensionType.ACTUAL, | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this bit of code is repeated quite a bit, there's gonna be a way for you to merge the schemas, and to infer the types in one go. Not urgent but I think this code can read a bit cleaner. We can pair on this as a follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a crack at condensing this, let me know if this looks better.
src/phoenix/core/model.py
Outdated
primary_schema = primary_dataset.schema | ||
|
||
# add actual dimensions | ||
primary_actual_column_names = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might consider deferring on this, though nice to have as a strawman
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to remove actuals? Don't we want to compute cardinality on actual columns?
from phoenix.core.model import Model | ||
from phoenix.metrics.cardinality import cardinality | ||
|
||
T = TypeVar("T") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to comments below, when using Generics, let's just nix using single variables. There's no need make the type signature more complicated. T, like foo/bar, is commonly used in generics but hurts code readability. Here you can make this actually humanly understandable - it is the entity you are loading - e.g. DataItem
or other. Naming along these lines is important as this code must be understandable by many.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified this section and no longer needed generics.
return Loader(cardinality=cardinality_dataloader) | ||
|
||
|
||
def _get_metric_dataloader( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is really worthy of an abstraction quite yet - the way it uses partial application of the function requires a distinct function signature. Not a big deal but I would caution against building abstractions when there is no need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments
|
||
from phoenix.core.model import Model | ||
|
||
from .loader import Loader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@axiomofjoy can we use plurals here?
resolves #112
phoenix.core.model
singleton to strawberry info.contextphoenix.metrics
model.dimensions
pandas
data types