Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the [main branch](https://pandas.pydata.org/docs/dev/getting_started/install.html#installing-the-development-version-of-pandas) of pandas.
Reproducible Example
Issue Description
Using a subclass setup that accesses the columns of the DataFrame inside `__init__`, we are finding that `dropna` on pandas 1.4.x and 1.5.x is much slower than on pandas 1.3.5 or 2.0.0.dev. This looks to be resolved in the most recent nightlies, so, if not already tested, could some additional testing be done to ensure no further regressions in this area?
Timings by pandas version:
1.3.5 - 2ms
1.4.4 - 1.84s
1.5.2 - 2.2s
2.0.0 (nightly build) - 1.6ms
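The original reproducible example is not preserved in this text, so the following is only a minimal sketch of the kind of setup described above, not the reporter's code. The class name `ColumnAwareFrame`, the `init_calls` counter, and the data shape are all hypothetical; the only parts taken from the report are that a `DataFrame` subclass reads `self.columns` inside `__init__` and that `dropna` is the call being timed.

```python
import time

import numpy as np
import pandas as pd


class ColumnAwareFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that reads its columns in __init__."""

    # Class-level counter (an illustration aid, not part of the report):
    # shows how often pandas re-enters the subclass constructor.
    init_calls = 0

    @property
    def _constructor(self):
        # Make pandas operations return this subclass instead of DataFrame.
        return ColumnAwareFrame

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        type(self).init_calls += 1
        # Touch the columns, as the report describes; the result is unused.
        _ = [str(name) for name in self.columns]


# Deterministic data: 1000 rows, with NaN in column "a" on every 10th row.
values = np.arange(5000, dtype=float).reshape(1000, 5)
values[::10, 0] = np.nan
frame = ColumnAwareFrame(values, columns=list("abcde"))

ColumnAwareFrame.init_calls = 0  # count only constructor calls made by dropna
start = time.perf_counter()
result = frame.dropna()
elapsed = time.perf_counter() - start

print(f"pandas {pd.__version__}: dropna took {elapsed * 1e3:.2f} ms, "
      f"{ColumnAwareFrame.init_calls} subclass __init__ call(s)")
```

Running the same script under each pandas version in a separate environment would give timings comparable in spirit to the list above, although the absolute numbers depend on the data and on what the real `__init__` does.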
Expected Behavior
`dropna` runs at the same speed as, or faster than, it does on 1.3.5/2.0.0.dev.
Installed Versions
INSTALLED VERSIONS
commit : 66e3805
python : 3.10.8.final.0
python-bits : 64
OS : Darwin
OS-release : 22.2.0
Version : Darwin Kernel Version 22.2.0: Fri Nov 11 02:03:51 PST 2022; root:xnu-8792.61.2~4/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_CA.UTF-8
LOCALE : en_CA.UTF-8
pandas : 1.3.5
numpy : 1.23.5
pytz : 2022.7
dateutil : 2.8.2
pip : 22.3.1
setuptools : 59.8.0
Cython : 0.29.33
pytest : 6.2.5
hypothesis : 6.62.0
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.2
html5lib : None
pymysql : None
psycopg2 : 2.9.3 (dt dec pq3 ext lo64)
jinja2 : 3.1.2
IPython : 8.8.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
fsspec : 2022.11.0
fastparquet : None
gcsfs : None
matplotlib : 3.5.3
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 10.0.1
pyxlsb : None
s3fs : None
scipy : 1.10.0
sqlalchemy : 1.4.46
tables : 3.7.0
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
numba : 0.56.4