Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(core): simplified evaluate & report methods #29

Merged
merged 7 commits into from
Sep 22, 2021

Conversation

UrbanoFonseca
Copy link
Contributor

On top of #27

@UrbanoFonseca UrbanoFonseca self-assigned this Sep 21, 2021
Copy link
Contributor

@jfsantos-ds jfsantos-ds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 🚀

Copy link
Contributor

@jfsantos-ds jfsantos-ds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

report in evaluate, pretty 🚀 ➡️ 🌞

UrbanoFonseca added 4 commits September 22, 2021 12:30
@UrbanoFonseca UrbanoFonseca marked this pull request as ready for review September 22, 2021 12:15
@@ -85,6 +86,8 @@ def evaluate(self, df: pd.DataFrame, dtypes: Optional[dict] = None, label: str=N
if label:
results['Feature Importance'] = self._feature_importance(corr_mat, p_corr_mat, label, corr_th)
results['High Collinearity'] = self._high_collinearity_detection(df, self.dtypes, label, vif_th, p_th=p_th)
if summary:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clean_warnings here too, now I am thinking why dont we clean warnings just at the report methods?
I also realize it seems we dont yet have a good solution for the single executed tests that can stack duplicate warnings and never get cleaned if the warning list is consumed straight from the property

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warnings can also be consumed with get_warnings() and those too should be clean

@@ -98,11 +98,11 @@ def flatlines(self, th: int=5, skip: list=[]):
self.store_warning(
QualityWarning(
test='Flatlines', category='Erroneous Data', priority=2, data=flatlines,
description=f"Found {total_flatlines} flatline events with a minimun length of {th} among the columns {set(flatlines.keys())}."
description=f"Found {total_flatlines} flatline events with a minimun length of {th:.0f} among the columns {set(flatlines.keys())}."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Float formatting seems wrong here, this threshold is an integer, shouldn't it be just {th} ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just {th} was printing a float with a lot of decimals (e.g. 5.00000) . the expression :0.f is equivalent to an int

))
return flatlines
else:
self._logger.info("No flatline events with a minimum length of %f were found.", th)
self._logger.info(f"No flatline events with a minimum length of {th:.0f} were found.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah this string formatting probably misled you, should be %d

Copy link
Contributor

@jfsantos-ds jfsantos-ds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small remarks, check before merging, otherwise here is my approval and 👍🏼

@UrbanoFonseca
Copy link
Contributor Author

Closes #36

@UrbanoFonseca UrbanoFonseca merged commit 50a6ca2 into master Sep 22, 2021
@UrbanoFonseca UrbanoFonseca deleted the fix/simplify-evaluate branch September 22, 2021 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants