fix(core): simplified evaluate & report methods #29

UrbanoFonseca · 2021-09-21T21:45:31Z

On top of #27

src/ydata_quality/core/engine.py

jfsantos-ds

Looks good 🚀

jfsantos-ds

report in evaluate, pretty 🚀 ➡️ 🌞

fix(readme): updated quickstart to new flow docs(readme): added get_warnings example fix(docs): fixed docstring on engine evaluate

src/ydata_quality/core/engine.py

src/ydata_quality/data_expectations/engine.py

jfsantos-ds · 2021-09-22T12:40:26Z

src/ydata_quality/data_relations/engine.py

@@ -85,6 +86,8 @@ def evaluate(self, df: pd.DataFrame, dtypes: Optional[dict] = None, label: str=N
        if label:
            results['Feature Importance'] = self._feature_importance(corr_mat, p_corr_mat, label, corr_th)
        results['High Collinearity'] = self._high_collinearity_detection(df, self.dtypes, label, vif_th, p_th=p_th)
+        if summary:


clean_warnings here too. Now I am wondering: why don't we clean warnings only in the report methods?
I also realize we don't yet have a good solution for individually executed tests, which can stack duplicate warnings that never get cleaned if the warning list is consumed straight from the property.

Warnings can also be consumed with get_warnings(), and those should be clean too.
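To make the concern above concrete, here is a minimal sketch of cleaning on read, so that both the property and get_warnings() return a deduplicated list. Everything in it is an assumption for illustration: SketchWarning, SketchEngine and _clean_warnings are hypothetical stand-ins, not the actual ydata_quality classes, and the warning descriptions are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)  # frozen makes instances hashable, so equal warnings dedupe
class SketchWarning:
    """Hypothetical stand-in for a quality warning (not the library class)."""
    test: str
    category: str
    priority: int
    description: str


class SketchEngine:
    """Hypothetical engine that cleans its warnings whenever they are read."""

    def __init__(self) -> None:
        self._warnings: List[SketchWarning] = []

    def store_warning(self, warning: SketchWarning) -> None:
        # Raw storage: re-running a single test appends the same warning again.
        self._warnings.append(warning)

    def _clean_warnings(self) -> None:
        # Drop duplicates (first occurrence wins), then sort by priority.
        unique = list(dict.fromkeys(self._warnings))
        self._warnings = sorted(unique, key=lambda w: w.priority)

    @property
    def warnings(self) -> List[SketchWarning]:
        # Cleaning here covers consumers that read the property directly.
        self._clean_warnings()
        return list(self._warnings)

    def get_warnings(self, category: Optional[str] = None,
                     priority: Optional[int] = None) -> List[SketchWarning]:
        # Filtering goes through the property, so the result is clean too.
        return [w for w in self.warnings
                if (category is None or w.category == category)
                and (priority is None or w.priority == priority)]


engine = SketchEngine()
for _ in range(3):  # simulate running the same single test three times
    engine.store_warning(
        SketchWarning("Flatlines", "Erroneous Data", 2, "illustrative message"))
engine.store_warning(
    SketchWarning("High Collinearity", "Data Relations", 1, "illustrative message"))

print(len(engine.warnings))  # 2
```

With this shape the report methods would not need their own cleaning step, at the cost of a small dedupe pass on every read.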

jfsantos-ds · 2021-09-22T12:44:05Z

src/ydata_quality/erroneous_data/engine.py

@@ -98,11 +98,11 @@ def flatlines(self, th: int=5, skip: list=[]):
            self.store_warning(
                QualityWarning(
                    test='Flatlines', category='Erroneous Data', priority=2, data=flatlines,
-                    description=f"Found {total_flatlines} flatline events with a minimun length of {th} among the columns {set(flatlines.keys())}."
+                    description=f"Found {total_flatlines} flatline events with a minimun length of {th:.0f} among the columns {set(flatlines.keys())}."


Float formatting seems wrong here. This threshold is an integer, so shouldn't it be just {th}?

Just {th} was printing a float with a lot of decimals (e.g. 5.00000). The format spec :.0f prints it with no decimal places, so it reads like an int.
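For reference, a short sketch of how the formatting options under discussion behave when the threshold arrives as a float. The value 5.0 is an assumed example, not taken from the library; note that :.0f rounds while %d truncates, so the two only agree for whole-number values.

```python
th = 5.0  # assumed example: an integer-valued threshold stored as a float

as_default = f"{th}"        # '5.0'
as_fixed = f"{th:.0f}"      # '5'
as_percent_f = "%f" % th    # '5.000000'
as_percent_d = "%d" % th    # '5'

# For non-integer values the two "integer-looking" forms diverge:
rounded = f"{5.7:.0f}"      # '6' (rounds)
truncated = "%d" % 5.7      # '5' (truncates toward zero)

print(as_default, as_fixed, as_percent_f, as_percent_d, rounded, truncated)
```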

jfsantos-ds · 2021-09-22T12:45:26Z

src/ydata_quality/erroneous_data/engine.py

            ))
            return flatlines
        else:
-            self._logger.info("No flatline events with a minimum length of %f were found.", th)
+            self._logger.info(f"No flatline events with a minimum length of {th:.0f} were found.")


Ah, this string formatting probably misled you. The placeholder should be %d.
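A small sketch of the %d suggestion, keeping the logger's lazy %-style arguments instead of switching to an f-string. The logger name and the ListHandler helper are hypothetical and exist only to capture the formatted output for inspection.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper that collects formatted messages in a list."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        # getMessage() applies the %-style arguments to the message template.
        self.messages.append(record.getMessage())


logger = logging.getLogger("flatlines_sketch")  # assumed name, illustration only
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the sketch's output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

th = 5.0  # assumed example threshold, stored as a float

# %f renders six decimals; %d renders the same value as an integer.
logger.info("No flatline events with a minimum length of %f were found.", th)
logger.info("No flatline events with a minimum length of %d were found.", th)

for message in handler.messages:
    print(message)
```

Passing th as an argument rather than interpolating it up front means the message is only formatted if a handler actually processes the record.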

jfsantos-ds

Small remarks, check before merging, otherwise here is my approval and 👍🏼

UrbanoFonseca · 2021-09-22T13:35:23Z

Closes #36

UrbanoFonseca self-assigned this Sep 21, 2021

jfsantos-ds reviewed Sep 21, 2021


src/ydata_quality/core/engine.py (outdated)

jfsantos-ds approved these changes Sep 21, 2021


UrbanoFonseca added 4 commits September 22, 2021 12:30

fix(core): integrated report into evaluate method

a5ba713

fix(readme): updated quickstart to new flow docs(readme): added get_warnings example fix(docs): fixed docstring on engine evaluate

fix(core): default plot in DataQuality to False

7661452

feat(core): updated evaluate ux on tutorials

2e2d679

fix(engines): mute deprecation warning, flatlines format

49ee925

UrbanoFonseca force-pushed the fix/simplify-evaluate branch from 21ec488 to 49ee925 Compare September 22, 2021 12:11

fix(data-relations): dtypes definition bug fix

4697d3b

UrbanoFonseca marked this pull request as ready for review September 22, 2021 12:15

docs(readme): updated warnings summary example

51352f9

jfsantos-ds reviewed Sep 22, 2021


src/ydata_quality/core/engine.py (resolved)

jfsantos-ds reviewed Sep 22, 2021


src/ydata_quality/data_expectations/engine.py (resolved)

jfsantos-ds reviewed Sep 22, 2021


jfsantos-ds approved these changes Sep 22, 2021


fix(engines): added clean_warnings on DQ and new engines

e5f3cc3

portellaa approved these changes Sep 22, 2021


UrbanoFonseca merged commit 50a6ca2 into master Sep 22, 2021

UrbanoFonseca deleted the fix/simplify-evaluate branch September 22, 2021 13:40
