Commit

FEAT-#4989: Switch pandas version to 1.5 (#5037)
Co-authored-by: Mahesh Vashishtha <mvashishtha@users.noreply.github.com>
Co-authored-by: Anatoly Myachev <anatoliimyachev@mail.com>
Co-authored-by: Jonathan Shi <jhshi07@gmail.com>
Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru>
Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
5 people authored Oct 5, 2022
1 parent 02f1927 commit 7871c7b
Showing 53 changed files with 1,538 additions and 436 deletions.
5 changes: 2 additions & 3 deletions docs/development/contributing.rst
@@ -192,9 +192,8 @@ To build the documentation, please follow the steps below from the project root:

.. code-block:: bash
cd docs
pip install -r requirements-doc.txt
sphinx-build -b html . build
pip install -r docs/requirements-doc.txt
sphinx-build -b html docs docs/build
To visualize the documentation locally, run the following from `build` folder:

1 change: 1 addition & 0 deletions docs/release_notes/release_notes-0.16.0.rst
@@ -174,6 +174,7 @@ Key Features and Updates
* FEAT-#4733: Support fastparquet as engine for `read_parquet` (#4807)
* FEAT-#4766: Support fsspec URLs in `read_csv` and `read_csv_glob` (#4898)
* FEAT-#4827: Implement `infer_types` dataframe algebra operator (#4871)
* FEAT-#4989: Switch pandas version to 1.5 (#5037)

Contributors
------------
4 changes: 3 additions & 1 deletion docs/requirements-doc.txt
@@ -1,3 +1,6 @@
# install current modin checkout to bring all required dependencies
.[all]
# now install some more optional dependencies
colorama
click
flatbuffers
@@ -10,7 +13,6 @@ recommonmark
sphinx
sphinx-click
ray[default]>=1.4.0
git+https://github.com/modin-project/modin.git@master#egg=modin[all]
# Override to latest version of modin-spreadsheet
git+https://github.com/modin-project/modin-spreadsheet.git@49ffd89f683f54c311867d602c55443fb11bf2a5
sphinxcontrib_plantuml
25 changes: 17 additions & 8 deletions docs/supported_apis/dataframe_supported.rst
@@ -84,18 +84,19 @@ default to pandas.
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``copy`` | `copy`_ | Y | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``corr`` | `corr`_ | Y | Correlation floating point precision may slightly |
| ``corr`` | `corr`_ | P | Correlation floating point precision may slightly |
| | | | differ from pandas. For now pearson method is |
| | | | available only. |
| | | | For other methods defaults to pandas |
| | | | available only. For other methods and for |
| | | | ``numeric_only`` defaults to pandas. |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``corrwith`` | `corrwith`_ | D | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``count`` | `count`_ | Y | **Hdk**: ``P``, only default params supported, |
| | | | otherwise ``D`` |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``cov`` | `cov`_ | Y | Covariance floating point precision may slightly |
| | | | differ from pandas |
| ``cov`` | `cov`_ | P | Covariance floating point precision may slightly |
| | | | differ from pandas. For ``numeric_only`` |
| | | | defaults to pandas. |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``cummax`` | `cummax`_ | Y | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
@@ -193,6 +194,8 @@ default to pandas.
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``interpolate`` | `interpolate`_ | D | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``isetitem`` | `isetitem`_ | D | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``isin`` | `isin`_ | Y | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``isna`` | `isna`_ | Y | |
@@ -207,8 +210,8 @@ default to pandas.
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``itertuples`` | `itertuples`_ | P | Modin does not parallelize iteration in Python |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``join`` | `join`_ | P | When ``on`` is set to ``right`` or ``outer`` |
| | | | it defaults to pandas |
| ``join`` | `join`_ | P | When ``on`` is set to ``right`` or ``outer`` or |
| | | | when ``validate`` is given defaults to pandas |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``keys`` | `keys`_ | Y | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
@@ -328,7 +331,9 @@ default to pandas.
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``resample`` | `resample`_ | Y | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``reset_index`` | `reset_index`_ | Y | **Hdk**: ``P``. ``D`` for ``level`` parameter |
| ``reset_index`` | `reset_index`_ | P | **Hdk**: ``P``. ``D`` for ``level`` parameter |
| | | | **Ray** and **Dask**: ``D`` when ``names`` or |
| | | | ``allow_duplicates`` is non-default |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``rfloordiv`` | `rfloordiv`_ | Y | See ``add``; **Hdk**: ``D`` |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
@@ -418,6 +423,8 @@ default to pandas.
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``to_latex`` | `to_latex`_ | D | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``to_orc`` | `to_orc`_ | D | |
+----------------------------+---------------------------+------------------------+----------------------------------------------------+
| ``to_parquet`` | `to_parquet`_ | P | **Dask**: Defaults to Pandas implementation and |
| | | | writes a single output file. |
| | | | **Ray**: Parallel implementation only if path |
@@ -556,6 +563,7 @@ default to pandas.
.. _`insert`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.insert.html#pandas.DataFrame.insert
.. _`interpolate`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html#pandas.DataFrame.interpolate
.. _`is_copy`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.is_copy.html#pandas.DataFrame.is_copy
.. _`isetitem`: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.isetitem.html?#pandas-dataframe-isetitem
.. _`isin`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html#pandas.DataFrame.isin
.. _`isna`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isna.html#pandas.DataFrame.isna
.. _`isnull`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isnull.html#pandas.DataFrame.isnull
@@ -659,6 +667,7 @@ default to pandas.
.. _`to_html`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_html.html#pandas.DataFrame.to_html
.. _`to_json`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html#pandas.DataFrame.to_json
.. _`to_latex`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_latex.html#pandas.DataFrame.to_latex
.. _`to_orc`: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_orc.html#pandas.DataFrame.to_orc
.. _`to_parquet`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_parquet.html#pandas.DataFrame.to_parquet
.. _`to_period`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_period.html#pandas.DataFrame.to_period
.. _`to_pickle`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html#pandas.DataFrame.to_pickle
4 changes: 3 additions & 1 deletion docs/supported_apis/series_supported.rst
@@ -357,7 +357,9 @@ the related section on :doc:`Defaulting to pandas </supported_apis/index>`.
+-----------------------------+---------------------------------+----------------------------------------------------+
| ``resample`` | Y | |
+-----------------------------+---------------------------------+----------------------------------------------------+
| ``reset_index`` | Y | **Hdk**: ``P``. ``D`` for ``level`` parameter |
| ``reset_index`` | P | **Hdk**: ``P``. ``D`` for ``level`` parameter |
| | | **Ray** and **Dask**: ``D`` when ``names`` or |
| | | ``allow_duplicates`` is non-default |
+-----------------------------+---------------------------------+----------------------------------------------------+
| ``rfloordiv`` | Y | See ``add``; **Hdk**: ``D`` |
+-----------------------------+---------------------------------+----------------------------------------------------+
3 changes: 3 additions & 0 deletions docs/supported_apis/utilities_supported.rst
@@ -36,6 +36,8 @@ default to pandas.
+---------------------------+---------------------------------+----------------------------------------------------+
| `pd.factorize`_ | D | |
+---------------------------+---------------------------------+----------------------------------------------------+
| `pd.from_dummies`_ | D | |
+---------------------------+---------------------------------+----------------------------------------------------+
| `pd.qcut`_ | D | |
+---------------------------+---------------------------------+----------------------------------------------------+
| ``pd.match`` | D | |
@@ -112,6 +114,7 @@ contributing a distributed version of any of these objects, feel free to open a
.. _`pd.cut`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.cut.html#pandas.cut
.. _`pd.to_numeric`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_numeric.html#pandas.to_numeric
.. _`pd.factorize`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.factorize.html#pandas.factorize
.. _`pd.from_dummies`: https://pandas.pydata.org/docs/reference/api/pandas.from_dummies.html#pandas-from-dummies
.. _`pd.qcut`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.qcut.html#pandas.qcut
.. _`pd.to_datetime`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html#pandas.to_datetime
.. _`pd.get_dummies`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html#pandas.get_dummies
2 changes: 1 addition & 1 deletion environment-dev.yml
@@ -2,7 +2,7 @@ name: modin
channels:
- conda-forge
dependencies:
- pandas==1.4.4
- pandas==1.5.0
- numpy>=1.18.5
- pyarrow>=4.0.1
- dask[complete]>=2.22.0
2 changes: 1 addition & 1 deletion modin/_compat/__init__.py
@@ -25,7 +25,7 @@ class PandasCompatVersion:

if version.parse("1.1.0") <= pandas_version <= version.parse("1.1.5"):
CURRENT = PY36
elif version.parse("1.4.0") <= pandas_version <= version.parse("1.4.99"):
elif version.parse("1.5.0") <= pandas_version < version.parse("1.6"):
CURRENT = LATEST
else:
raise ImportError(f"Unsupported pandas version: {pandas.__version__}")
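
The version gate in this hunk can be sketched without any third-party dependency. This is only an illustration under assumptions: `parse_release` and `select_compat` are hypothetical names, and integer tuples stand in for `version.parse`, so pre-release strings such as `1.5.0rc0` are not handled. It shows the effect of the new half-open bound `< 1.6`, which accepts every 1.5.x release while rejecting 1.4.x and 1.6.0.

```python
def parse_release(version_string):
    """Turn '1.5.0' into (1, 5, 0).

    Simplified, hypothetical stand-in for version.parse: it only
    understands plain dotted integers, not pre-release suffixes.
    """
    return tuple(int(part) for part in version_string.split(".")[:3])


def select_compat(version_string):
    """Return the compatibility flavour name for a pandas version string."""
    release = parse_release(version_string)
    if (1, 1, 0) <= release <= (1, 1, 5):
        return "PY36"
    # Half-open upper bound: any 1.5.x passes, 1.6.0 does not.
    if (1, 5, 0) <= release < (1, 6):
        return "LATEST"
    raise ImportError(f"Unsupported pandas version: {version_string}")


print(select_compat("1.5.0"))  # LATEST
print(select_compat("1.1.3"))  # PY36
```

With this gate, an environment still pinned to `pandas==1.4.4` raises `ImportError` at import time, which is why `environment-dev.yml` is bumped in the same commit.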
@@ -15,16 +15,49 @@

from pandas.io.common import get_handle
from pandas.core.apply import reconstruct_func
from pandas import DataFrame as pandas_DataFrame
from pandas.errors import DataError, SpecificationError


def pd_pivot_table(df, **kwargs): # noqa: PR01, RT01
def pandas_pivot_table(df, **kwargs): # noqa: PR01, RT01
"""Perform pandas pivot_table against a dataframe."""
return df.pivot_table(**kwargs)


def pd_convert_dtypes(df, **kwargs): # noqa: PR01, RT01
def pandas_convert_dtypes(df, **kwargs): # noqa: PR01, RT01
"""Perform pandas convert_dtypes against a dataframe or series."""
return df.convert_dtypes(**kwargs)


__all__ = ["get_handle", "pd_pivot_table", "pd_convert_dtypes", "reconstruct_func"]
def pandas_compare(df, **kwargs): # noqa: PR01, RT01
"""Perform pandas compare against a dataframe or series."""
return df.compare(**kwargs)


def pandas_dataframe_join(df, other, **kwargs): # noqa: PR01, RT01
"""Perform pandas DataFrame.join against a dataframe or series."""
return pandas_DataFrame.join(df, other, **kwargs)


def pandas_reset_index(df, **kwargs): # noqa: PR01, RT01
"""Perform pandas reset_index against a dataframe or series."""
return pandas_DataFrame.reset_index(df, **kwargs)


def pandas_to_csv(df, **kwargs): # noqa: PR01, RT01
"""Perform pandas to_csv against a dataframe or series."""
return df.to_csv(**kwargs)


__all__ = [
"get_handle",
"pandas_pivot_table",
"pandas_convert_dtypes",
"pandas_compare",
"pandas_dataframe_join",
"reconstruct_func",
"pandas_reset_index",
"pandas_to_csv",
"DataError",
"SpecificationError",
]
@@ -16,20 +16,43 @@
from modin._compat import PandasCompatVersion

if PandasCompatVersion.CURRENT == PandasCompatVersion.PY36:
from .py36.pd_common import (
from .py36.pandas_common import (
get_handle,
pd_pivot_table,
pd_convert_dtypes,
pandas_pivot_table,
pandas_convert_dtypes,
pandas_compare,
pandas_dataframe_join,
reconstruct_func,
pandas_reset_index,
pandas_to_csv,
DataError,
SpecificationError,
)


elif PandasCompatVersion.CURRENT == PandasCompatVersion.LATEST:
from .latest.pd_common import (
from .latest.pandas_common import (
get_handle,
pd_pivot_table,
pd_convert_dtypes,
pandas_pivot_table,
pandas_convert_dtypes,
pandas_compare,
pandas_dataframe_join,
reconstruct_func,
pandas_reset_index,
pandas_to_csv,
DataError,
SpecificationError,
)

__all__ = ["get_handle", "pd_pivot_table", "pd_convert_dtypes", "reconstruct_func"]
__all__ = [
"get_handle",
"pandas_pivot_table",
"pandas_convert_dtypes",
"pandas_compare",
"pandas_dataframe_join",
"reconstruct_func",
"pandas_reset_index",
"pandas_to_csv",
"DataError",
"SpecificationError",
]
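
The `__all__` lists in this diff are maintained by hand and must match the names each module actually defines or imports. A small consistency check along the following lines could catch a missed rename such as `pd_pivot_table` to `pandas_pivot_table`. This is a sketch, not part of the commit: `missing_exports` is a hypothetical helper, and the module below is a fake stand-in built only for the demonstration.

```python
import types


def missing_exports(module):
    """Return names listed in module.__all__ that the module does not define."""
    return [
        name
        for name in getattr(module, "__all__", [])
        if not hasattr(module, name)
    ]


# Hypothetical stand-in for a compat module; it is not the real modin module.
fake = types.ModuleType("fake_pandas_common")
fake.get_handle = lambda *args, **kwargs: None
fake.pandas_to_csv = lambda df, **kwargs: df.to_csv(**kwargs)
# "pandas_reset_index" is exported but deliberately left undefined.
fake.__all__ = ["get_handle", "pandas_to_csv", "pandas_reset_index"]

print(missing_exports(fake))  # ['pandas_reset_index']
```

Running such a check against both the `py36` and `latest` variants would confirm that the dispatching module can import the same set of names from either one.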