-
Notifications
You must be signed in to change notification settings - Fork 601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(datatypes): infer the correct shape for many existing ops #9334
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, I think it's a good idea!
Just a question about where we should handle the None
check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I'll rebase and merge when CI passes.
Usually when using eg duckdb's REGEXP_REPLACE() function, the pattern and replacement are always scalar constants. But actually it can accept columnar for all three of it's arguments. These sort of assumptions appeared to be all over the place in the code base. I found these fixs by grepping for "shape_like" and manually looking at all the instances. I didn't add any test cases, that felt like a monumental thing to do I wasn't willing to put the time in for. But IDK, I'm not sure I would even want to put in all the tests, it would be SO much boilerplate. To deal with Nodes sometimes having optional args, I modified shape_like() to be more flexible. IDK, that was sort of a lazy approach, we could be much more verbose and have each op do this filtering individually, but I didn't that was worth it.
@cpcloud hopefully that was OK for me to hit the merge button, the CI passed! Thank you! |
This is pinning down the expected behavior for cases before tackling the case() to cases() switch in ibis-project#9096 so that PR can be simpler I move the validation for comparable-ness down into the operation so that the logic is consolidated to one place. in ibis-project#9096 there might be multiple places that construct an ops.SimpleCase, and we don't want to have to implement the validation in all calling locations. We could consider relaxing the limitation for non-empty cases later, but for now lets be strict. I already fixed the shape of ops.SearchedCase in ibis-project#9334, but it looks like in that PR I forgot to also fix ops.SimpleCase, so I do that fix here.
This is pinning down the expected behavior for cases before tackling the case() to cases() switch in ibis-project#9096 so that PR can be simpler I move the validation for comparable-ness down into the operation so that the logic is consolidated to one place. in ibis-project#9096 there might be multiple places that construct an ops.SimpleCase, and we don't want to have to implement the validation in all calling locations. We could consider relaxing the limitation for non-empty cases later, but for now lets be strict. I already fixed the shape of ops.SearchedCase in ibis-project#9334, but it looks like in that PR I forgot to also fix ops.SimpleCase, so I do that fix here.
This is pinning down the expected behavior for cases before tackling the case() to cases() switch in ibis-project#9096 so that PR can be simpler I move the validation for comparable-ness down into the operation so that the logic is consolidated to one place. in ibis-project#9096 there might be multiple places that construct an ops.SimpleCase, and we don't want to have to implement the validation in all calling locations. We could consider relaxing the limitation for non-empty cases later, but for now lets be strict. I already fixed the shape of ops.SearchedCase in ibis-project#9334, but it looks like in that PR I forgot to also fix ops.SimpleCase, so I do that fix here.
Found this when looking into #9333
Usually when using eg duckdb's REGEXP_REPLACE() function, the pattern and replacement are always scalar constants. But actually it can accept columnar for all three of it's arguments.
These sort of assumptions appeared to be all over the place in the code base. I found these fixs by grepping for "shape_like" and manually looking at all the instances.
I didn't add any test cases, that felt like a monumental thing to do I wasn't willing to put the time in for. But IDK, I'm not sure I would even want to put in all the tests, it would be SO much boilerplate.
To deal with Nodes sometimes having optional args, I modified shape_like() to be more flexible.
IDK, that was sort of a lazy approach, we could be much more verbose and have each op do this filtering individually, but I didn't that was worth it.