@ueshin ueshin commented Jul 24, 2024

What changes were proposed in this pull request?

Allows bare literals as operands of the __and__ and __or__ operators of the Column API in Spark Classic.

Why are the changes needed?

Currently, the __and__ and __or__ operators of the Column API in Spark Classic do not accept bare literals; the operand has to be wrapped with the lit() function. Bare literals should be accepted here, as they already are for other similar operators.

>>> from pyspark.sql.functions import *
>>> c = col("c")
>>> c & True
Traceback (most recent call last):
...
py4j.Py4JException: Method and([class java.lang.Boolean]) does not exist

>>> c & lit(True)
Column<'and(c, true)'>

whereas other operators accept bare literals:

>>> c + 1
Column<'`+`(c, 1)'>
>>> c + lit(1)
Column<'`+`(c, 1)'>

Spark Connect already allows bare literals for these operators:

>>> c & True
Column<'and(c, True)'>
>>> c & lit(True)
Column<'and(c, True)'>
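
For illustration, the general idea of accepting a bare literal in an operator overload can be sketched in plain Python. This is a toy model and not the PySpark implementation: the `Column`, `lit`, and `col` definitions below are hypothetical stand-ins that only build expression strings, and the `_binary` helper is an assumed name.

```python
class Column:
    """Toy stand-in for a column expression (hypothetical, not pyspark.sql.Column)."""

    def __init__(self, expr: str) -> None:
        self.expr = expr

    def _binary(self, op: str, other: object) -> "Column":
        # Wrap bare Python literals so that `c & True` behaves like `c & lit(True)`.
        rhs = other if isinstance(other, Column) else lit(other)
        return Column(f"{op}({self.expr}, {rhs.expr})")

    def __and__(self, other: object) -> "Column":
        return self._binary("and", other)

    def __or__(self, other: object) -> "Column":
        return self._binary("or", other)

    def __repr__(self) -> str:
        return f"Column<'{self.expr}'>"


def lit(value: object) -> Column:
    """Turn a bare Python value into a literal Column (assumed helper)."""
    if isinstance(value, Column):
        return value
    if isinstance(value, bool):
        # Render booleans in lowercase, as in the Spark Classic output shown above.
        return Column("true" if value else "false")
    return Column(repr(value))


def col(name: str) -> Column:
    """Reference a column by name (assumed helper)."""
    return Column(name)


c = col("c")
print(c & True)       # bare literal on the right-hand side
print(c & lit(True))  # explicitly wrapped literal
print(c | False)
```

In this sketch the coercion happens on the Python side before the expression is built, so the bare and wrapped forms produce the same result.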

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Added the related tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@ueshin ueshin requested a review from HyukjinKwon July 24, 2024 23:03
@ueshin ueshin changed the title from "[SPARK-48996][PYTHON] Allow bare literals for __and__ and __or__ of Column" to "[SPARK-48996][SQL][PYTHON] Allow bare literals for __and__ and __or__ of Column" on Jul 24, 2024
ueshin commented Jul 25, 2024

The failure seems unrelated to this PR.

ueshin commented Jul 25, 2024

Thanks! Merging to master.

@ueshin ueshin closed this in 78b83fa Jul 25, 2024
ilicmarkodb pushed a commit to ilicmarkodb/spark that referenced this pull request Jul 29, 2024
… of Column

Closes apache#47474 from ueshin/issues/SPARK-48996/literal_and_or.

Authored-by: Takuya Ueshin <ueshin@databricks.com>
Signed-off-by: Takuya Ueshin <ueshin@databricks.com>