feat: percent empty #118
src/phoenix/metrics/percent_empty.py
Outdated
def percent_empty(df: pd.DataFrame) -> "pd.Series[float]":
    return df.isnull().sum() / df.shape[0]
Need to pass in the columns to run the computation over.
done
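For illustration, a minimal sketch of what passing in the columns could look like. The `column_names` parameter name and the sample data are assumptions made for this example, not the code that was merged.

```python
from typing import List

import pandas as pd


def percent_empty(df: pd.DataFrame, column_names: List[str]) -> "pd.Series[float]":
    # Count nulls only in the requested columns, then divide by the row count.
    return df[column_names].isnull().sum() / df.shape[0]


# Hypothetical sample data: column "c" is present but not requested.
df = pd.DataFrame(
    {
        "a": [1.0, None, 3.0, None],
        "b": ["x", "y", None, "z"],
        "c": [1, 2, 3, 4],
    }
)
result = percent_empty(df, ["a", "b"])
```

Here `result` is indexed by the requested column names only, so `c` is never scanned.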
src/phoenix/metrics/percent_empty.py
Outdated
import pandas as pd


def percent_empty(df: pd.DataFrame) -> "pd.Series[float]":
Return a dict mapping column_name to percent empty.
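A rough sketch of the dict-returning shape, assuming `Series.to_dict()` is an acceptable way to do the conversion. The function name `percent_empty_by_column` is hypothetical and chosen only for this example.

```python
from typing import Dict

import pandas as pd


def percent_empty_by_column(df: pd.DataFrame) -> Dict[str, float]:
    # isnull().sum() yields one null count per column; to_dict() keys it by column name.
    return (df.isnull().sum() / df.shape[0]).to_dict()


# Hypothetical sample data.
df = pd.DataFrame({"a": [1.0, None], "b": ["x", "y"]})
mapping = percent_empty_by_column(df)
```

A plain dict lets callers look up a column by name without depending on pandas types.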
src/phoenix/metrics/percent_empty.py
Outdated
import pandas as pd


def percent_empty(df: pd.DataFrame) -> "pd.Series[float]":
Should return None if dataframe has no rows.
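To show why the guard matters: with zero rows, the unguarded expression divides 0 by 0 for every column, and pandas yields NaN rather than raising. The sketch below is illustrative only. The helper name `percent_empty_or_none` and the empty frame are assumptions for this example.

```python
import math
from typing import Dict, Optional

import pandas as pd

# A DataFrame with one column and no rows.
empty = pd.DataFrame({"a": pd.Series([], dtype="float64")})

# Unguarded: the null count is 0 and the row count is 0, so the result is NaN.
unguarded = empty.isnull().sum() / empty.shape[0]


def percent_empty_or_none(df: pd.DataFrame) -> Dict[str, Optional[float]]:
    num_records = df.shape[0]
    if num_records == 0:
        # No rows means the percentage is undefined; report None per column.
        return {col: None for col in df.columns}
    return (df.isnull().sum() / num_records).to_dict()


guarded = percent_empty_or_none(empty)
```

Returning `None` makes the "undefined" case explicit instead of leaking NaN to callers.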
src/phoenix/metrics/percent_empty.py
Outdated
    num_records = dataframe.shape[0]
    if num_records == 0:
        return {col: None for col in column_names}
    return dict(dataframe[column_names].isnull().sum() / dataframe.shape[0])
nit: use num_records here instead of dataframe.shape[0]?
addressed in 65c0d0e
def _get_percent_empty_dataloader(model: Model) -> DataLoader[str, Optional[float]]:
    async def _percent_empty_load_function(column_names: List[str]) -> List[Optional[float]]:
        column_name_to_percent_empty = percent_empty(
            dataframe=model.primary_dataset.dataframe, column_names=column_names
        )
        return [column_name_to_percent_empty[col] for col in column_names]

    return DataLoader(load_fn=_percent_empty_load_function)
might not be worth a dataloader tbh
will address in a separate pr
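For reference, a sketch of what a version without a DataLoader might look like: a plain function that computes the value for one column on demand. The `Model` dataclass here is a hypothetical stand-in for this example, not the phoenix `Model`, and `get_percent_empty` is an assumed name.

```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd


@dataclass
class Model:
    # Hypothetical stand-in holding only the dataframe needed for this sketch.
    dataframe: pd.DataFrame


def get_percent_empty(model: Model, column_name: str) -> Optional[float]:
    dataframe = model.dataframe
    num_records = dataframe.shape[0]
    if num_records == 0:
        # Undefined for an empty dataframe.
        return None
    # Compute directly for the single requested column; no batching layer.
    return float(dataframe[column_name].isnull().sum() / num_records)


model = Model(dataframe=pd.DataFrame({"a": [1.0, None], "b": ["x", "y"]}))
value = get_percent_empty(model, "a")
```

The trade-off is that each call scans one column on its own, whereas the DataLoader batches the requested columns into a single `percent_empty` call.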
No description provided.