Description
After submitting my changes to my current PR (#62502), I noticed that one of the code checks (Pyodide) failed.
The pickle tests (test_pickle) try to read a temporary pickle file containing a Categorical array (written earlier in the same test), but fail with NotImplementedError.
After some investigation, I found that this error is raised in a Cython function (NDArrayBacked.__setstate__ in pandas/_libs/arrays.pyx, per the traceback below) that is passed a state tuple with two entries but expects a tuple with either one or three.
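To make the failure mode concrete, here is a minimal pure-Python sketch of a `__setstate__` that accepts a state tuple of length one or three and rejects anything else. The class `StrictState`, its attribute names, and the `restore` helper are hypothetical and only mimic the behaviour described above; this is not the actual Cython code in pandas. When unpickling, `pickle.load` passes the stored state to `__setstate__`, so a two-entry state would take the same rejecting branch.

```python
class StrictState:
    """Hypothetical stand-in for an array wrapper; NOT the pandas implementation."""

    def __setstate__(self, state):
        if isinstance(state, dict):
            # Dict state: copy the attributes over directly.
            self.__dict__.update(state)
        elif isinstance(state, tuple):
            if len(state) == 1 and isinstance(state[0], dict):
                # One-entry tuple wrapping a dict: unwrap and retry.
                self.__setstate__(state[0])
            elif len(state) == 3:
                # Three-entry tuple: (data, dtype, extra attributes).
                self.data, self.dtype, self.attrs = state
            else:
                # Any other length (e.g. two entries) is rejected.
                raise NotImplementedError(state)
        else:
            raise TypeError(f"unsupported state type: {type(state).__name__}")


def restore(state):
    """Build an instance the way unpickling does: no __init__, then __setstate__."""
    obj = StrictState.__new__(StrictState)
    obj.__setstate__(state)
    return obj


if __name__ == "__main__":
    restore(({"data": [0, 1, 2]},))       # accepted: one-entry tuple
    restore(([0, 1, 2], "int64", {}))     # accepted: three-entry tuple
    try:
        restore(([0, 1, 2], "int64"))     # two-entry tuple, as in the failing check
    except NotImplementedError as exc:
        print("NotImplementedError:", exc)
```

If the real function behaves like this sketch, the open question is which side is wrong in the Pyodide environment: the code that produces the two-entry state, or the set of lengths that `__setstate__` accepts.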
I also found another PR where this check fails for the same reason.
Here is an example of a failing test output:
__________________________________________ test_round_trip_current[cat-expected39-python_pickler-pandas_proto_5] ___________________________________________
typ = 'cat'
expected = [0, 1, 2, 3, 4, ..., 9995, 9996, 9997, 9998, 9999]
Length: 10000
Categories (10000, int64): [0, 1, 2, 3, ..., 9996, 9997, 9998, 9999]
pickle_writer = functools.partial(<function to_pickle at 0x7f92e2f2a520>, protocol=5), writer = <function python_pickler at 0x7f92db703ba0>
temp_file = PosixPath('/tmp/pytest-of-aija/pytest-0/test_round_trip_current_cat_ex29/7281f1e5-0c7b-4f7b-858a-12178d7dc0be')
    @pytest.mark.parametrize(
        "pickle_writer",
        [
            pytest.param(python_pickler, id="python"),
            pytest.param(pd.to_pickle, id="pandas_proto_default"),
            pytest.param(
                functools.partial(pd.to_pickle, protocol=pickle.HIGHEST_PROTOCOL),
                id="pandas_proto_highest",
            ),
            pytest.param(functools.partial(pd.to_pickle, protocol=4), id="pandas_proto_4"),
            pytest.param(
                functools.partial(pd.to_pickle, protocol=5),
                id="pandas_proto_5",
            ),
        ],
    )
    @pytest.mark.parametrize("writer", [pd.to_pickle, python_pickler])
    @pytest.mark.parametrize("typ, expected", flatten(create_pickle_data()))
    def test_round_trip_current(typ, expected, pickle_writer, writer, temp_file):
        path = temp_file
        # test writing with each pickler
        pickle_writer(expected, path)
        # test reading with each unpickler
>       result = pd.read_pickle(path)
                 ^^^^^^^^^^^^^^^^^^^^
pandas/tests/io/test_pickle.py:174:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/io/pickle.py:208: in read_pickle
    return pickle.load(handles.handle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
pandas/core/arrays/categorical.py:1775: in __setstate__
    return super().__setstate__(state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
pandas/_libs/arrays.pyx:85: in pandas._libs.arrays.NDArrayBacked.__setstate__
    cpdef __setstate__(self, state):