Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'strtobool' instead of comparing with a string. #988

Merged
merged 4 commits into from
Aug 2, 2024

Conversation

ndrluis
Copy link
Collaborator

@ndrluis ndrluis commented Jul 31, 2024

No description provided.

@sungwy
Copy link
Collaborator

sungwy commented Jul 31, 2024

Thank you for cleaning this up @ndrluis

I think it would also make sense to move PropertyUtil out of pyiceberg/table/__init__.py to a new file, like pyiceberg/utils/properties.py. I think that will also help untangle some of our dependencies between the modules. WDYT?

@ndrluis
Copy link
Collaborator Author

ndrluis commented Jul 31, 2024

Yes, I think that makes sense. What do you think about extracting the methods from the PropertyUtil class into functions in the pyiceberg/utils/properties.py module?

@sungwy
Copy link
Collaborator

sungwy commented Jul 31, 2024

Yes that sounds like a good idea - the name does sound a bit redundant leaving it as it is:

from pyiceberg.utils.properties import PropertyUtil - it's quite the tongue twister

Copy link
Collaborator

@sungwy sungwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ndrluis - I left one very nit picky suggestion, but otherwise all of the changes and tests look great! Thank you for adding this PR

@@ -588,7 +589,7 @@ def _(self, type_var: DecimalType) -> Literal[Decimal]:
def _(self, type_var: BooleanType) -> Literal[bool]:
value_upper = self.value.upper()
if value_upper in ["TRUE", "FALSE"]:
return BooleanLiteral(value_upper == "TRUE")
return BooleanLiteral(strtobool(value_upper))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel as though this is an over-optimization, since we check whether value_upper is in "TRUE" and "FALSE". WDYT?

An alternative is to instead just use strtobool for the whole function, but that would mean that we support a string expression like:

"col_A == 'off'" where col_A is a boolean column and 'off' gets parsed as False, and that feels unnecessarily over permissive. Should we leave this block of code as it is?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in this case, it makes sense to be strict.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted and added more tests to clarify that it needs to be more strict.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! It looks like there's still one breaking test with the new changes:

FAILED tests/expressions/test_literals.py::test_string_to_boolean_literal - AssertionError: assert BooleanLiteral(False) == BooleanLiteral(True)
 +  where BooleanLiteral(False) = <function StringLiteral.to at 0x7eff9029e4d0>(BooleanType())
 +    where <function StringLiteral.to at 0x7eff9029e4d0> = literal('FALSE').to
 +      where literal('FALSE') = literal('FALSE')
 +    and   BooleanType() = BooleanType()
 +  and   BooleanLiteral(True) = literal(True)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my bad. I copied and pasted it but forgot to rerun the test.

@ndrluis ndrluis requested a review from sungwy August 1, 2024 13:48
default: bool,
) -> bool:
if value := properties.get(property_name):
return strtobool(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted in dd82e20 and I added some missing sad path tests.

@ndrluis ndrluis requested a review from kevinjqliu August 1, 2024 21:27
Copy link
Collaborator

@sungwy sungwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @ndrluis thank you for cleaning this up and adopting all the nitty feedback. Looking forward to using these cleaned up functions in my PRs as well.

@kevinjqliu thank you for your thorough review!

@kevinjqliu kevinjqliu merged commit 6c0d307 into apache:main Aug 2, 2024
7 checks passed
@kevinjqliu
Copy link
Contributor

Thanks for the contribution @ndrluis!

sungwy pushed a commit to sungwy/iceberg-python that referenced this pull request Dec 7, 2024
* Use 'strtobool' instead of comparing with a string.

* Move the PropertyUtil methods to the properties module as functions

* fixup! Use 'strtobool' instead of comparing with a string.

* fixup! Use 'strtobool' instead of comparing with a string.
sungwy pushed a commit to sungwy/iceberg-python that referenced this pull request Dec 7, 2024
* Use 'strtobool' instead of comparing with a string.

* Move the PropertyUtil methods to the properties module as functions

* fixup! Use 'strtobool' instead of comparing with a string.

* fixup! Use 'strtobool' instead of comparing with a string.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants