Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): move from .case() to .cases() #9039

Closed
wants to merge 2 commits into from
Closed

Conversation

NickCrews
Copy link
Contributor

@NickCrews NickCrews commented Apr 22, 2024

WIP. Redo of #7914 on top of the modern code. Changes from that PR:

  • Instead of removing the old .case() APIs, I deprecate them here. There are a few tests for them to try to ensure we didn't break anyone.
  • keep both SimpleCase and SearchedCase. In previous PRs I removed SimpleCase. But we should support SimpleCase for more idiomatic/performant SQL. Eg now we still have CASE x WHEN 5 THEN ...
  • Before, x.cases((x==5, 3)) was allowed. Now it is disallowed, it's too clever. If you want to do that, it's an easy transition to use the top-level ibis.cases()
  • Added many more test cases, including for dtypes, dshapes, and expected errors on bad construction
  • added test that ibis.null().cases((None, "fill"), else_="not hit") always results in "not hit". Maybe not the best ergonomics, but at least it is consistent and written down. Perhaps we can revisit later.
  • Added xfail test where it is possible to construct ibis.cases(("not a bool", "someval")). I think we should fix that upstream in the ops.Literal constructor.
  • fixed bug where datashape was only getting determined from base or cases. Really it needs to depend on ALL inputs.
  • added some tests and implementation for dealing with empty branches: ibis.cases() (results in NULL) and ibis.cases(else_=5) (results in 5). I considered disallowing these, but I don't think there is anything semantiically wrong with supporting this.

TODOs that I found that should come in followups:

  • NULL replacement isn't super consistent yet. For example, val.substitute({None: 4}) currently does a fillna(). But if you do val.cases((None, 4), else_=val), then this ALWAYS hits the else_ case, because x = NULL never evals to True. This also isn't even consistent in the sense that it only special cases for python None. If you do ibis.null(), or something only known at runtime like ibis.literal(5).nullif(5), then this will always hit the else_ case. Due to these limitations, I vote for making matching against NULLs out of scope for .cases and .substitute. If a user wants to do this, then they better do a .fillna() before.

@NickCrews NickCrews force-pushed the case-to-cases branch 3 times, most recently from 9eedc15 to d8be23f Compare April 22, 2024 23:21
@NickCrews NickCrews marked this pull request as draft April 22, 2024 23:22
@NickCrews NickCrews force-pushed the case-to-cases branch 2 times, most recently from 6b3dc20 to 7f4c570 Compare April 23, 2024 00:48
@NickCrews NickCrews force-pushed the case-to-cases branch 2 times, most recently from 371c2bc to d8ea163 Compare April 23, 2024 01:18
@NickCrews NickCrews marked this pull request as ready for review April 23, 2024 01:18
@NickCrews
Copy link
Contributor Author

OK I think this is ready for review! I can break this into smaller commits if you want, I think I now have all the semantics figured out how I like them.

@NickCrews NickCrews force-pushed the case-to-cases branch 3 times, most recently from 6ca584a to 2bc9121 Compare April 23, 2024 19:02
@kszucs kszucs changed the title feat: move from .case() to .cases() feat(api): move from .case() to .cases() Apr 24, 2024
@kszucs
Copy link
Member

kszucs commented Apr 24, 2024

Can you keep around a test case the deprecated builder API?

@NickCrews
Copy link
Contributor Author

NickCrews commented Apr 24, 2024

@kszucs I re-added some simple tests how does that look? I can add the tests for all the edge cases, but I would prefer not to. More of the actual functionality (eg datashape and datatype logic) now happens inside the Operation, which both APIs share, so both APIs should be fairly aligned

@NickCrews NickCrews enabled auto-merge (rebase) April 24, 2024 19:08
@NickCrews
Copy link
Contributor Author

pyspark is issuing some ResourceWarning for some unrelated reason, is this some separate bug we should address?

@NickCrews NickCrews disabled auto-merge April 24, 2024 20:19
@NickCrews
Copy link
Contributor Author

NickCrews commented Apr 24, 2024

oops, just discovered this is breaking for existing users of Value.cases(). Need to add some compat code in there. ugggh it's gonna be annoying.

NickCrews added a commit that referenced this pull request Apr 24, 2024
This is setup for #9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecationwarning
@NickCrews
Copy link
Contributor Author

OK, probably the most complex thing I do here is the arg, kwarg normalization to maintain compatibility with the old API. I feel pretty good about the way I did it, but I would love some red teaming to try to figure out if there are flaws with it.

This is setup for #9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecationwarning
@NickCrews
Copy link
Contributor Author

Is it a bug that impala is returning a nan instead of None? It sure is annoying to have to do

    if exp is None:
        assert pd.isna(result)
    else:
        assert result == exp

just for impala

@NickCrews NickCrews enabled auto-merge (rebase) April 25, 2024 00:59
@NickCrews
Copy link
Contributor Author

@kszucs @cpcloud no way we can get this in for 9.0 is there ? The failing CI is unrelated, I think this thing is ready to go after a rebase

@cpcloud
Copy link
Member

cpcloud commented Apr 30, 2024

It looks like you've branched into ibis-project. This needs to be recreated as a PR coming from your fork.

@cpcloud
Copy link
Member

cpcloud commented Apr 30, 2024

I think this is too close to the release to get it in for 9.0. We'll probably need to do another non-major release a week or two afterward, so let's get it in first thing after we release 9.

NickCrews added a commit to NickCrews/ibis that referenced this pull request May 1, 2024
This is setup for ibis-project#9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecationwarning
@NickCrews
Copy link
Contributor Author

Closing in in favor of #9096 (which is correctly coming from my fork)

@NickCrews NickCrews closed this May 1, 2024
auto-merge was automatically disabled May 1, 2024 17:02

Pull request was closed

NickCrews added a commit to NickCrews/ibis that referenced this pull request May 7, 2024
This is setup for ibis-project#9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecationwarning
NickCrews added a commit to NickCrews/ibis that referenced this pull request May 8, 2024
This is setup for ibis-project#9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecationwarning
NickCrews added a commit to NickCrews/ibis that referenced this pull request Jul 12, 2024
This is setup for ibis-project#9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecationwarning
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants