Data Labeler Compiler Diff #336

grant-eden · 2021-07-14T22:39:32Z

Pretty straightforward.
It might look strange to have a for loop over a single profile, but that may change in the future, and it is consistent with the profile property.
Took a page out of Andrew's book for mocking.
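
A minimal sketch of the loop described above, assuming the compiler keeps its profiles in a dict keyed by profile name and delegates to each profile's own diff. `SketchLabelProfile`, `SketchCompiler`, and the per-label subtraction are hypothetical stand-ins for illustration, not the DataProfiler classes or the actual diff logic in this PR.

```python
class SketchLabelProfile:
    """Hypothetical stand-in for a data labeler column profile."""

    def __init__(self, avg_predictions):
        self.avg_predictions = avg_predictions

    def diff(self, other):
        # Assumed behavior: subtract average predictions for labels on both sides.
        shared = self.avg_predictions.keys() & other.avg_predictions.keys()
        return {
            label: self.avg_predictions[label] - other.avg_predictions[label]
            for label in sorted(shared)
        }


class SketchCompiler:
    """Hypothetical compiler holding one or more named profiles."""

    def __init__(self, profiles):
        self._profiles = dict(profiles)

    def diff(self, other):
        result = {}
        # Looping keeps the method valid if more profiles are added later,
        # even though only one profile exists today.
        for name, profile in self._profiles.items():
            if name in other._profiles:
                result[name] = profile.diff(other._profiles[name])
        return result


compiler1 = SketchCompiler({"data_labeler": SketchLabelProfile({"a": 0.5, "b": 0.25})})
compiler2 = SketchCompiler({"data_labeler": SketchLabelProfile({"a": 0.25, "b": 0.25})})
print(compiler1.diff(compiler2))  # {'data_labeler': {'a': 0.25, 'b': 0.0}}
```

The loop costs nothing with a single profile and avoids special-casing the compiler if a second profile type is introduced.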

dataprofiler/profilers/column_profile_compilers.py

JGSweets · 2021-07-15T15:30:34Z

dataprofiler/tests/profilers/test_column_profile_compilers.py

+                        'c': 0.84
+                    }
+                },
+                'data_label': ['a', 'b']


Why is this data_label different from the one in statistics? Also, in profile, do we have data_label at both levels? Should this instead pop it from the statistics level and reinsert it at the top level?

Good catch. The diff in the data labeler profile is not supposed to return data_label.

grant-eden · 2021-07-15T17:13:06Z

dataprofiler/profilers/data_labeler_column_profile.py

@@ -233,6 +233,7 @@ def profile(self):
        Property for profile. Returns the profile of the column.
        """
        profile = {
+            "data_label": self.data_label,


@JGSweets @AnhTruong I've made a small refactor to put the data_label in the profile instead of having it extracted from the compiler level. Was there a specific reason the data_label was left out of the profile? Is this refactor inappropriate?

I think this is appropriate.

The update above: profile["data_label"] = col_profile.pop("data_label") completes the refactor.
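
A minimal sketch of the pop-and-reinsert step quoted above, assuming the compiler-level profile has a top-level "data_label" key and a "statistics" dict that receives the remaining column-level entries. The function name `lift_data_label` and the exact output layout are hypothetical and chosen only for illustration.

```python
def lift_data_label(col_profile):
    """Move data_label from the column-level profile to the top level.

    Hypothetical helper; the layout of the returned dict is an assumption.
    """
    # Copy first so the caller's dict is not mutated by pop().
    col_profile = dict(col_profile)
    profile = {"data_label": None, "statistics": {}}
    # The line quoted in the review: pop from the column profile, set at the top.
    profile["data_label"] = col_profile.pop("data_label")
    # Whatever is left belongs under statistics.
    profile["statistics"].update(col_profile)
    return profile


column_level = {"data_label": "a|b", "avg_predictions": {"a": 0.5}}
print(lift_data_label(column_level))
```

Because the key is popped rather than copied, data_label ends up at exactly one level, which is the concern raised earlier in the thread.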

AnhTruong

some comments

dataprofiler/profilers/column_profile_compilers.py

AnhTruong · 2021-07-15T18:35:24Z

dataprofiler/tests/profilers/test_column_profile_compilers.py

+        # Test disabling both datalabeler profiles for compiler diff
+        compiler2 = col_pro_compilers.ColumnDataLabelerCompiler(data, options)
+        expected_diff = {}
+        self.assertDictEqual(expected_diff, compiler1.diff(compiler2))


Can we add a test where one profile is empty? The result should then be the other profile.

The result won't be the other profile; it would be empty, since you can't have a difference unless you have two profiles to compare.
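
The behavior described in this reply could be checked with a test along these lines. This is a sketch under stated assumptions: `sketch_compiler_diff` is a hypothetical helper that only compares profiles present on both sides, and the plain dicts stand in for real compiler objects.

```python
import unittest


def sketch_compiler_diff(profiles_a, profiles_b):
    """Hypothetical helper: diff only the profiles enabled on both sides."""
    result = {}
    for name in profiles_a:
        if name in profiles_b:
            # Placeholder comparison: pair up the two values being compared.
            result[name] = [profiles_a[name], profiles_b[name]]
    return result


class TestSketchCompilerDiff(unittest.TestCase):
    def test_one_side_empty(self):
        enabled = {"data_labeler": {"data_label": "a"}}
        disabled = {}
        # With nothing to compare against, the diff is empty in both directions.
        self.assertDictEqual({}, sketch_compiler_diff(enabled, disabled))
        self.assertDictEqual({}, sketch_compiler_diff(disabled, enabled))
```

Saved to a file, this can be run with `python -m unittest`.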

* implementing ability for null option
* Woah. NICE. (#336)
* revised unalikeability functionality
* added test cases for revised unalikeability functionality
* update to 0.6.1
* added ability for null values to be set by user choice
* fixed test cases for test_profiler_options.py and test_profiler_class_options.py relating to null_values
* fixed functionality for null_values
* fixed functionality and documentation for null_values; added test case to cover more types
* fixed syntax for null_values
* fixed syntax for profiler_options.py
* fixed syntax for profiler_options.py
* fixed syntax for test cases
* fixed syntax for test cases
* fixed syntax for test cases
* fixed syntax
* fixed syntax

Co-authored-by: Grant <56846128+gme5078@users.noreply.github.com>

Woah. NICE.

eefa567

grant-eden requested review from AnhTruong, ChrisWallace2020, JGSweets and lettergram as code owners July 14, 2021 22:39

JGSweets enabled auto-merge (squash) July 15, 2021 15:24

JGSweets reviewed Jul 15, 2021

dataprofiler/profilers/column_profile_compilers.py (outdated)

JGSweets reviewed Jul 15, 2021

Small refactor for the good of the universe

b093627

grant-eden commented Jul 15, 2021

Merge branch 'main' into datalabelercompilerdiff

1bf6c8c

JGSweets previously approved these changes Jul 15, 2021

AnhTruong reviewed Jul 15, 2021

grant-eden added 3 commits July 15, 2021 13:26

Here we goooooo

93d8eaf

Merge branch 'main' of https://github.com/capitalone/DataProfiler int…

776bac6

…o datalabelercompilerdiff

Merge branch 'datalabelercompilerdiff' of https://github.com/gme5078/…

6d503f3

…data-profiler into datalabelercompilerdiff

grant-eden dismissed JGSweets’s stale review via 6d503f3 July 15, 2021 20:28

JGSweets approved these changes Jul 15, 2021

AnhTruong approved these changes Jul 16, 2021

JGSweets merged commit 3aee648 into capitalone:main Jul 16, 2021

az85252 pushed a commit to az85252/DataProfiler that referenced this pull request Jul 19, 2021

Woah. NICE. (capitalone#336)

8b55e4b

stevensecreti pushed a commit to stevensecreti/DataProfiler that referenced this pull request Jun 15, 2022

Woah. NICE. (capitalone#336)

3208845
