Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): move from .case() to .cases() #9096

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

NickCrews
Copy link
Contributor

@NickCrews NickCrews commented May 1, 2024

Redo of #7914 (with substantive changes) and #9039 (merely switching the base repo to the correct one, my fork)

I will try to keep the summary below up to date with what this PR does:

Breaking change to Value.cases()

Before the API was

def cases(
        self,
        case_result_pairs: Iterable[tuple[ir.BooleanValue, Value]],
        default: Value | None = None,
    ) -> Value:

Now it is

def cases(self, *branches: tuple[Value, Value], else_=None) -> Value:

In other words, it is now variadic, and the else_ is kwarg. The else_ naming convention is now consistent with Value.substitute()

This will be a hard breaking change for users when they upgrade to 10.0.

Alternatives considered (I am currently going for the simplest but harshest path):

  • Some gnarly compact code that detects and translates old-style use of the API, and emits a warning. This was deemed too complex.
  • Some slightly less gnarly code that merely detects the old-style API, and errors.

Soft-deprecates ibis.case() and Value.case()

Existing users will be able to continue to use these APIs with no problems. There will be no warnings or anything else. However, we will remove these functions from the docs, and make the docstring advise to use the new API.
I want there to be ONE endorsed way of doing these conditionals, so I think only advertising one API is very important to me.

Alternatives considered:

  • Hard-deprecation: Add @util.deprecated(instead="use ibis.cases() instead", as_of="10.0") to these functions.
  • Hard-removal: remove them entirely in 10.0
  • Still possible to do: hide .case() even more from users and tab completion, eg behind __getattr__()

See this comment thread for more discussion.

@NickCrews NickCrews force-pushed the case-to-cases branch 5 times, most recently from 8ac2d67 to 6885a2c Compare May 1, 2024 17:51
@NickCrews
Copy link
Contributor Author

NickCrews commented May 1, 2024

EDIT: duh, it's because they don't guarantee row order. Updated the assertions to be order-independent.

Any idea as to why the datafusion, exasol, and risingwave column tests are failing? I still have trouble getting those backends running on my M1 so I can't debug locally very well.

@NickCrews NickCrews requested a review from cpcloud May 1, 2024 18:05
@NickCrews NickCrews force-pushed the case-to-cases branch 7 times, most recently from 513bb94 to ae70dc6 Compare May 8, 2024 22:11
@NickCrews
Copy link
Contributor Author

@cpcloud gentle nudge for a review here :)

@NickCrews
Copy link
Contributor Author

@cpcloud anything I can do to help move this forward/easier to review?

@ncclementi
Copy link
Contributor

@NickCrews would you mind fixing the conflicts and CI, then I'll nudge the team to get you a review soon.

@NickCrews
Copy link
Contributor Author

Thanks for the poke. I have a lot of other PRs in flight at the moment with ibis, many of which I think are a lot closer to the finish line than this, I'd love to get those merged first before I add another ball to juggle.

@NickCrews
Copy link
Contributor Author

One thing that should get merged first is #9334, where I fix a datashape bug that is exposed in this PR. Once that is in, then this PR will be simpler.

Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing through!

I still have many concerns about the way things are being done in this PR. There also seem to be a few scope increases (e.g., the cast insertion) that should really be avoided.

ibis/backends/pandas/executor.py Outdated Show resolved Hide resolved
ibis/backends/sql/compiler.py Outdated Show resolved Hide resolved
ibis/backends/tests/test_generic.py Outdated Show resolved Hide resolved
ibis/backends/tests/test_generic.py Outdated Show resolved Hide resolved
ibis/backends/tests/test_generic.py Outdated Show resolved Hide resolved
ibis/expr/operations/generic.py Outdated Show resolved Hide resolved
ibis/expr/types/generic.py Outdated Show resolved Hide resolved
"""
import ibis.expr.builders as bl
@staticmethod
def _norm_cases_args(*args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very opposed to this function. Instead of massaging this to be something close to maintainable, why not just wait until 10.0? Doesn't seem worth the trouble to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would love to drop this. I was just trying to provide an upgrade path. Ideally:

  • a user upgrades to 9.2. They now get a deprecation warning, but their code continues to work. There is another API available they can switch to without the deprecation warning.
  • They upgrade to 10.0. The old API disappears

Without this compat gnarliness, then

  • a user upgrades to 9.2 They see no change
  • they upgrade to 10.0. Their code breaks, and won't work until they switch over to the new API.

As a user I definitely prefer the first experience, but if we provide nice actionable error messages then the second experience is definitely bearable. Should we go for the 2nd experience?

Similarly, should we keep the old CaseBuilder stuff around, or should we also hard-delete it all at once and provide an actionable error message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's good to separate out the possible breaking changes. This is what I propose (which I think you agree with IIUC):

  • ibis.case(): we add the deprecation in 10.0, add a test that the old API raises a deprecation but still gives the right result. This isn't too hard to continue support. We fully remove in 11.0
  • Value.case(): Same as above
  • Value.cases(): remove this _norm_cases_args() gnarliness. This will just be hard breaking change for users when they upgrade 9.x to 10.0. No warning or upgrade path, sorry folks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct.

I will look into making these changes in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit I just pushed, I think we end up with the behavior above:

  • removed the _norm_cases_args() gnarliness
  • kept the deprecation warnings for ibis.case() and Value.case(). I also am fine with removing the deprecation, letting past users keep using them with no problem, but just not surfacing them in the docs or otherwise advertising them.

ibis/expr/types/generic.py Outdated Show resolved Hide resolved
ibis/expr/types/numeric.py Show resolved Hide resolved
@NickCrews NickCrews force-pushed the case-to-cases branch 2 times, most recently from c33860b to e0fa647 Compare July 12, 2024 17:09
NickCrews added a commit to NickCrews/ibis that referenced this pull request Jul 12, 2024
This is pinning down the expected behavior for cases before tackling
the case() to cases() switch in
ibis-project#9096
so that PR can be simpler
NickCrews added a commit to NickCrews/ibis that referenced this pull request Jul 12, 2024
Prepping for ibis-project#9096

- moves several .ifelse() and .substitute() tests from ibis/backens/test/test_generic.py into a new test_conditionals.py
- moves some .case() tests from the pandas and dask backends into there, so the other backends are now tested as well, and removed duplication.
- adds a test case that `ibis.literal(5).nullif(5).case().when(None, "oops").else_("expected")` results in "expected". Not saying this is actually the best behavior,
but it is currently what we claim to be expected. Perhaps we want to revisit.
@github-actions github-actions bot added docs Documentation related issues or PRs tests Issues or PRs related to tests impala The Apache Impala backend clickhouse The ClickHouse backend snowflake The Snowflake backend labels Oct 7, 2024
@NickCrews NickCrews force-pushed the case-to-cases branch 3 times, most recently from 7e28c8a to 8956150 Compare October 8, 2024 18:56
@NickCrews
Copy link
Contributor Author

NickCrews commented Oct 8, 2024

@cpcloud looks like the docs CI is failing just for permissions issues, so I think this is good to merge! (If you confirm then in the future I will be confident to just merge if I see only this CI failure)

@cpcloud
Copy link
Member

cpcloud commented Oct 9, 2024

Please do not merge anything that has CI failures.

This job in particular is testing whether the documentation can be built, and that definitely is a hard requirement for merging a PR.

ibis/expr/types/generic.py Outdated Show resolved Hide resolved
@cpcloud cpcloud force-pushed the case-to-cases branch 2 times, most recently from 32d20da to aad95b8 Compare October 9, 2024 09:24
@cpcloud cpcloud added this to the 10.0 milestone Oct 9, 2024
@cpcloud cpcloud force-pushed the case-to-cases branch 2 times, most recently from 99ecc43 to be79239 Compare October 9, 2024 09:57
@cpcloud cpcloud added the deprecation Issues or PRs related to deprecating APIs label Oct 9, 2024
@cpcloud
Copy link
Member

cpcloud commented Oct 9, 2024

I put up #10290 to address the docs build failure.

.else_("NA")
.end()
),
improvements=_.raw_improvements.cases(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the files in this directory have pre-computed outputs, and if you change them, they'll rerun.

In this case, this causes a failure, because it's trying to authenticate to BigQuery (I'm guessing browser-based auth is first in the credential chain, and thus the message about a missing browser).

Even if the job did have permission to render this quarto document, it would still fail, because we verify that for documents that are "frozen" that they aren't being re-rendered unintentionally.

When stuff like this fails, it's not something that should really ever just be merged, hence the required property of the docs build check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so the lesson for me is, if I change something in docs/posts/, I should re-run it locally before pushing. for this bigquery notebook, I won't be able to do this since I don't have creds, but for the rest I should be good to go

@cpcloud cpcloud merged commit 54889db into ibis-project:main Oct 9, 2024
90 checks passed
@NickCrews NickCrews added the breaking change Changes that introduce an API break at any level label Oct 9, 2024
@NickCrews
Copy link
Contributor Author

@cpcloud I added the breaking change label to this PR, since it does hard-break Value.cases(), as described in the OP at the top.

@cpcloud
Copy link
Member

cpcloud commented Oct 9, 2024

Ah oops. Well, we'll have to add it manually after the release as it's already merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes that introduce an API break at any level clickhouse The ClickHouse backend deprecation Issues or PRs related to deprecating APIs docs Documentation related issues or PRs impala The Apache Impala backend snowflake The Snowflake backend tests Issues or PRs related to tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants