BUG: `pd.isnull` treats `list` and `tuple` input differently #52283

jrbourbeau · 2023-03-29T19:47:13Z

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
print(f"{pd.isnull([1, 2, 3]) = }")
print(f"{pd.isnull((1, 2, 3)) = }")

Issue Description

It looks like pd.isnull treats a list as array-like and a tuple as a scalar.

Expected Behavior

I'd expect pd.isnull to treat lists and tuples the same way, as other parts of the API such as pd.Series do.

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 2e218d10984e9919f0296931d92ea851c6a6faf5
python           : 3.9.15.final.0
python-bits      : 64
OS               : Darwin
OS-release       : 22.3.0
Version          : Darwin Kernel Version 22.3.0: Mon Jan 30 20:42:11 PST 2023; root:xnu-8792.81.3~2/RELEASE_X86_64
machine          : x86_64
processor        : i386
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.5.3
numpy            : 1.24.0
pytz             : 2022.6
dateutil         : 2.8.2
setuptools       : 59.8.0
pip              : 22.3.1
Cython           : None
pytest           : 7.2.0
hypothesis       : None
sphinx           : 4.5.0
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : 1.1
pymysql          : None
psycopg2         : None
jinja2           : 3.1.2
IPython          : 8.7.0
pandas_datareader: None
bs4              : 4.11.1
bottleneck       : None
brotli           :
fastparquet      : 2023.2.0
fsspec           : 2022.11.0
gcsfs            : None
matplotlib       : 3.6.2
numba            : None
numexpr          : 2.8.3
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 11.0.0
pyreadstat       : None
pyxlsb           : None
s3fs             : 2022.11.0
scipy            : 1.9.3
snappy           :
sqlalchemy       : 1.4.46
tables           : 3.7.0
tabulate         : None
xarray           : 2022.9.0
xlrd             : None
xlwt             : None
zstandard        : None
tzdata           : None


DeaMariaLeon · 2023-03-30T14:55:30Z

This function returns a boolean or array-like of bool. I'll keep the "bug" label just in case, but I don't think it is one.

https://pandas.pydata.org/docs/reference/api/pandas.isnull.html
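
To make the two documented return shapes concrete, here is a minimal sketch. The variable names are illustrative and not taken from the thread; it only shows a scalar input yielding a single boolean and an array input yielding an element-wise boolean array.

```python
import numpy as np
import pandas as pd

# Scalar input: a single boolean comes back.
scalar_result = pd.isnull(np.nan)

# Array input: an element-wise boolean ndarray comes back.
array_result = pd.isnull(np.array([1.0, np.nan, 3.0]))

print(scalar_result)
print(array_result)
```

The open question in this issue is which of these two paths a tuple should take.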

jrbourbeau · 2023-03-30T15:00:54Z

Thanks @DeaMariaLeon. The thing that seems off to me is that pd.isnull treats lists as array-like (returning an array-like of bools) and tuples as scalars (returning a bool):

In [1]: import pandas as pd

In [2]: pd.isnull([1, pd.NA, 3])
Out[2]: array([False,  True, False])

In [3]: pd.isnull((1, pd.NA, 3))
Out[3]: False

My expectation is that both lists and tuples should be treated as array-like, though feel free to let me know if that expectation is incorrect.
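
Until the behaviour is settled, one possible workaround is to convert tuples to lists before the call. The sketch below assumes that is acceptable for the caller; `isnull_elementwise` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd


def isnull_elementwise(obj):
    """Hypothetical helper: treat a tuple like a list before calling pd.isnull."""
    if isinstance(obj, tuple):
        obj = list(obj)
    return pd.isnull(obj)


from_list = isnull_elementwise([1, pd.NA, 3])
from_tuple = isnull_elementwise((1, pd.NA, 3))

# Both calls now go through the list path and return element-wise results.
print(from_list)
print(from_tuple)
```

This only normalizes a top-level tuple; it makes no attempt to decide whether a tuple was meant as a single label.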

DeaMariaLeon · 2023-03-30T15:39:41Z

Oh, I see! Thank you for opening an issue. :)

phofl · 2023-03-30T17:11:51Z

This is an edge case I think.

You can end up with tuples from a MultiIndex for example. In this scenario we want to treat the tuple as a single element, e.g.

df.drop(columns=(1, 2))

treats the tuple as a single element. I think this is similar here although it does not really look intuitive to me either.
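
A small self-contained sketch of the drop case may help here. The frame and its values are made up for illustration; the point is only that, with MultiIndex columns, the tuple names one column rather than a collection of labels.

```python
import pandas as pd

# Illustrative frame with two-level columns: (1, 1), (1, 2), (2, 1), (2, 2).
columns = pd.MultiIndex.from_product([[1, 2], [1, 2]])
df = pd.DataFrame([[10, 20, 30, 40]], columns=columns)

# The tuple is read as one label, i.e. the single column (1, 2).
remaining = df.drop(columns=(1, 2))
print(list(remaining.columns))
```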

rhshadrach · 2023-03-31T04:35:51Z

Interestingly, tuples nested inside a list are treated as array-like:

import numpy as np
import pandas as pd

obj = [(1.0, 2.0), (1.0, np.nan), (np.nan, 2.0), (np.nan, np.nan)]
print(pd.isnull(obj))
# [[False False]
#  [False  True]
#  [ True False]
#  [ True  True]]
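
One plausible explanation, which is an assumption about pandas internals and not something stated in the thread, is that the list is converted to a NumPy array before the null check. NumPy itself unpacks equal-length tuples into a second dimension, which the sketch below demonstrates; `as_array` and `mask` are illustrative names.

```python
import numpy as np
import pandas as pd

obj = [(1.0, 2.0), (1.0, np.nan), (np.nan, 2.0), (np.nan, np.nan)]

# NumPy turns a list of equal-length tuples into a 2-D array,
# so the tuples no longer exist as individual objects.
as_array = np.asarray(obj, dtype=object)
print(as_array.shape)

# The mask returned by pd.isnull has the same 2-D shape.
mask = pd.isnull(obj)
print(mask.shape)
```

If that reading is right, the nested tuples never reach the scalar check at all, which would explain why they behave differently from a top-level tuple.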

jorisvandenbossche · 2023-03-31T11:30:03Z

> treating lists as array-like (returning an array-like of bools) and tuples as scalar (returning a bool)

As far as I remember, in the past we made this distinction (in certain places) because tuples can be labels, as Patrick mentioned.

But it's indeed a tricky situation, with easy confusion and corner cases (a quick search for "tuple list label" turns up quite a few related issues). See #43978 for the drop case, for example.

Another example in indexing where the two are distinguished and have different behaviour:

>>> s = pd.Series(range(6), index=pd.MultiIndex.from_product([[1, 2, 3], [1, 2]]))
>>> s.loc[(1, 2)]  # tuple is a single label
1
>>> s.loc[[1, 2]]  # list is an indexer (in this case for the first level of the MultiIndex)
1  1    0
   2    1
2  1    2
   2    3
dtype: int64

jrbourbeau added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels on Mar 29, 2023

DeaMariaLeon added the Usage Question, Closing Candidate (May be closeable, needs more eyeballs), and Bug labels and removed the Bug and Needs Triage labels on Mar 30, 2023

DeaMariaLeon added the Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) label and removed the Usage Question and Closing Candidate labels on Mar 30, 2023
