Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix boolean logic in expect_compound_columns_to_be_unique ignore_row_if #202

Merged
merged 5 commits into from
Oct 5, 2022

Conversation

clausherther
Copy link
Contributor

Fixes #200

This fixes the implementation of the ignore_row_if parameter in expect_compound_columns_to_be_unique highlighted by @mcannamela in #200 and adds a test to data_test.

Copy link

@mcannamela mcannamela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will fix the bug. I think a couple tweaks could make it nicer but no complaint from me if it shipped like this, I'm also happy to PR those myself in a follow up.

I also had the thought, is it possible that this logic shows up in other expectations? I didn't investigate this at all.

{% endfor %}
)
{%- endif -%}
{%- set op = "and" if ignore_row_if == "all_values_are_missing" else "or" -%}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct now, but closes off the "third behavior" of "never ignore" and somewhat obfuscates what behavior to expect if I pass a value other than "all_values_are_missing".

Did you mean to eliminate the (heretofore undocumented) third behavior?

What would you think of adding a guard prior to this that ensures the value of ignore_row_if is an expected one? Even if it's not documented, anybody reading the code would then know how to call it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the built-in uniqueness check always checks for nulls and excludes null values from the uniqueness check.
The default for the multi-column case (which is the generalized version of expect_column_values_to_be_unique) is to exclude rows where both columns are null from the uniqueness test, unless overridden.

I think I makes sense to check that the value of ignore_row_if is from a known set and error out otherwise.

Copy link
Contributor Author

@clausherther clausherther Oct 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a check for the 2 expected values. If we want to explicitly add a 3rd case, let's open an issue and separate PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, i agree with this default behavior

@clausherther
Copy link
Contributor Author

I also had the thought, is it possible that this logic shows up in other expectations? I didn't investigate this at all.

I'll take a look, good call out. If so, I'll open a separate issue/PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

expect_compound_columns_to_be_unique has backward logic for ignore_row_if
2 participants