Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: #112 model singleton #136

Merged
merged 27 commits into from
Jan 5, 2023
Merged

feat: #112 model singleton #136

merged 27 commits into from
Jan 5, 2023

Conversation

mikeldking
Copy link
Contributor

@mikeldking mikeldking commented Dec 29, 2022

resolves #112

  • Figure out how to add phoenix.core.model singleton to strawberry info.context
  • Wire together cardinality via phoenix.metrics
  • Make the dimensions connection real via model.dimensions
  • First stab at inferring numerical and categorical types from pandas data types

@axiomofjoy axiomofjoy self-assigned this Dec 30, 2022
app.state.primary = args.primary
app.state.reference = args.reference
model = Model(primary_dataset_name=args.primary, reference_dataset_name=args.reference)
app = create_app(model, graphiql=config.graphiql)

uvicorn.run(app, port=args.port)
Copy link
Contributor

@axiomofjoy axiomofjoy Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured out how to run with automatic reloads using uvicorn.run("main:app", reload=True, port=args.port), but this does not work now that we are creating app on the fly rather than importing a static app object. Haven't figured out how to get automatic server reloads working with the current setup.

Comment on lines 106 to 113
def _infer_dimension_data_type(self, dimension_name: str) -> DimensionDataType:
# TODO: verify corresponding dimension of reference dataset has same type
dataframe_dtype = self.primary_dataset.dataframe[dimension_name].dtype
if is_numeric_dtype(dataframe_dtype): # type: ignore
return DimensionDataType.NUMERIC
elif is_object_dtype(dataframe_dtype): # type: ignore
return DimensionDataType.CATEGORICAL
raise ValueError("Unrecognized dimension type")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First crack at inferring dimension data type from pandas data type. The types on these methods are rough, ignoring for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@axiomofjoy axiomofjoy marked this pull request as ready for review January 5, 2023 00:11
actual
feature
tag
embedding
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to nix this. Not important for now

src/phoenix/config.py Show resolved Hide resolved
src/phoenix/core/dimension.py Show resolved Hide resolved
Comment on lines 49 to 57
for name in primary_actual_column_names:
if name is not None:
dimensions.append(
Dimension(
name=name,
data_type=self._infer_dimension_data_type(name),
type=DimensionType.ACTUAL,
)
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this bit of code is repeated quite a bit, there's gonna be a way for you to merge the schemas, and to infer the types in one go. Not urgent but I think this code can read a bit cleaner. We can pair on this as a follow-up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a crack at condensing this, let me know if this looks better.

primary_schema = primary_dataset.schema

# add actual dimensions
primary_actual_column_names = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might consider deferring on this, though nice to have as a strawman

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to remove actuals? Don't we want to compute cardinality on actual columns?

src/phoenix/core/model.py Show resolved Hide resolved
from phoenix.core.model import Model
from phoenix.metrics.cardinality import cardinality

T = TypeVar("T")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to comments below, when using Generics, let's just nix using single variables. There's no need make the type signature more complicated. T, like foo/bar, is commonly used in generics but hurts code readability. Here you can make this actually humanly understandable - it is the entity you are loading - e.g. DataItem or other. Naming along these lines is important as this code must be understandable by many.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified this section and no longer needed generics.

return Loader(cardinality=cardinality_dataloader)


def _get_metric_dataloader(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is really worthy of an abstraction quite yet - the way it uses partial application of the function requires a distinct function signature. Not a big deal but I would caution against building abstractions when there is no need.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified this code.

src/phoenix/server/api/types/loader.py Outdated Show resolved Hide resolved
src/phoenix/server/app.py Outdated Show resolved Hide resolved
src/phoenix/server/app.py Show resolved Hide resolved
Copy link
Contributor Author

@mikeldking mikeldking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments


from phoenix.core.model import Model

from .loader import Loader
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@axiomofjoy can we use plurals here?

@mikeldking mikeldking merged commit 02bc015 into main Jan 5, 2023
@mikeldking mikeldking deleted the 112-model-singleton branch January 5, 2023 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[server] Model singleton
2 participants