"TypeError: unorderable types" in Python3 when index values are dict keys of tuples or tuples with non-homogeneous dtypes #22077

toobaz · 2018-07-27T06:34:06Z

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd
In [2]: from collections import OrderedDict
In [3]: param_index = OrderedDict([((('a', 'b'), ('c', 'd')), 1),
   ...:                            ((('a', None), ('c', 'd')), 2),
   ...:                           ])
   ...: 

In [4]: pd.Series([1, 2], index=param_index.keys())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/nobackup/repo/pandas/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    634         try:
--> 635             order = uniques.argsort()
    636             order2 = order.argsort()

TypeError: unorderable types: NoneType() < str()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~/nobackup/repo/pandas/pandas/core/sorting.py in safe_sort(values, labels, na_sentinel, assume_unique)
    450         try:
--> 451             sorter = values.argsort()
    452             ordered = values.take(sorter)

TypeError: unorderable types: NoneType() < str()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    397             try:
--> 398                 codes, categories = factorize(values, sort=True)
    399             except TypeError:

~/nobackup/repo/pandas/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    177                     kwargs[new_arg_name] = new_arg_value
--> 178             return func(*args, **kwargs)
    179         return wrapper

~/nobackup/repo/pandas/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    642                                         na_sentinel=na_sentinel,
--> 643                                         assume_unique=True)
    644 

~/nobackup/repo/pandas/pandas/core/sorting.py in safe_sort(values, labels, na_sentinel, assume_unique)
    454             # try this anyway
--> 455             ordered = sort_mixed(values)
    456 

~/nobackup/repo/pandas/pandas/core/sorting.py in sort_mixed(values)
    440                            dtype=bool)
--> 441         nums = np.sort(values[~str_pos])
    442         strs = np.sort(values[str_pos])

~/.local/lib/python3.5/site-packages/numpy/core/fromnumeric.py in sort(a, axis, kind, order)
    846         a = asanyarray(a).copy(order="K")
--> 847     a.sort(axis=axis, kind=kind, order=order)
    848     return a

TypeError: unorderable types: NoneType() < str()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-4-2fff0a9c0f74> in <module>()
----> 1 pd.Series([1, 2], index=param_index.keys())

~/nobackup/repo/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    191 
    192             if index is not None:
--> 193                 index = ensure_index(index)
    194 
    195             if data is None:

~/nobackup/repo/pandas/pandas/core/indexes/base.py in ensure_index(index_like, copy)
   5006             index_like = copy(index_like)
   5007 
-> 5008     return Index(index_like)
   5009 
   5010 

~/nobackup/repo/pandas/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    448                     from .multi import MultiIndex
    449                     return MultiIndex.from_tuples(
--> 450                         data, names=name or kwargs.get('names'))
    451             # other iterable of some kind
    452             subarr = com.asarray_tuplesafe(data, dtype=object)

~/nobackup/repo/pandas/pandas/core/indexes/multi.py in from_tuples(cls, tuples, sortorder, names)
   1333             arrays = lzip(*tuples)
   1334 
-> 1335         return MultiIndex.from_arrays(arrays, sortorder=sortorder, names=names)
   1336 
   1337     @classmethod

~/nobackup/repo/pandas/pandas/core/indexes/multi.py in from_arrays(cls, arrays, sortorder, names)
   1277         from pandas.core.arrays.categorical import _factorize_from_iterables
   1278 
-> 1279         labels, levels = _factorize_from_iterables(arrays)
   1280         if names is None:
   1281             names = [getattr(arr, "name", None) for arr in arrays]

~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in _factorize_from_iterables(iterables)
   2549         # For consistency, it should return a list of 2 lists.
   2550         return [[], []]
-> 2551     return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))

~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in <listcomp>(.0)
   2549         # For consistency, it should return a list of 2 lists.
   2550         return [[], []]
-> 2551     return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))

~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in _factorize_from_iterable(values)
   2521         codes = values.codes
   2522     else:
-> 2523         cat = Categorical(values, ordered=True)
   2524         categories = cat.categories
   2525         codes = cat.codes

~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    402                     # raise, as we don't have a sortable data structure and so
    403                     # the user should give us one by specifying categories
--> 404                     raise TypeError("'values' is not ordered, please "
    405                                     "explicitly specify the categories order "
    406                                     "by passing in a categories argument.")

TypeError: 'values' is not ordered, please explicitly specify the categories order by passing in a categories argument.

Problem description

The above is a simplified version of the example in this comment - and both of them used to work (I tested in 0.19.0+git14-ga40e185, @jolespin tested in 0.22). Creating this separate issue because #15457 itself is not a regression.

Notice the error changes if you replace param_index.keys() with list(param_index.keys()) (but stays the same if you just replace the OrderedDict with an ordinary dict).
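
For context, the innermost `TypeError` in the traceback is ordinary Python 3 behavior rather than anything pandas-specific: tuples compare element by element, and Python 3 refuses to order `None` against a `str`. A minimal sketch using only the standard library (the helper name `is_sortable` is made up for illustration):

```python
# Values of the first index level from the example above.
level_values = [('a', 'b'), ('a', None)]


def is_sortable(values):
    """Return True if Python can order the values, False if comparing raises TypeError."""
    try:
        sorted(values)
    except TypeError:
        return False
    return True


# The first elements tie ('a' == 'a'), so the comparison falls through
# to 'b' versus None, which Python 3 cannot order.
print(is_sortable(level_values))              # False
print(is_sortable([('a', 'b'), ('a', 'c')]))  # True
```

This is presumably why each sort attempt in the traceback fails on the first level, while the second level (`('c', 'd')` twice) poses no ordering problem.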

Expected Output

In 0.19.0+git14-ga40e185:

In [4]: pd.Series([1, 2], index=param_index.keys())
Out[4]: 
((a, b), (c, d))       1
((a, None), (c, d))    2
dtype: int64
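
If a flat index of tuples like the one above is what you need, one possible workaround is to build the `Index` explicitly with `tupleize_cols=False`, which asks pandas not to turn a list of tuples into a `MultiIndex`. This is a sketch rather than a fix for the regression itself, and it assumes a reasonably recent pandas; the variable names `keys` and `flat_index` are only for illustration.

```python
from collections import OrderedDict

import pandas as pd

param_index = OrderedDict([((('a', 'b'), ('c', 'd')), 1),
                           ((('a', None), ('c', 'd')), 2)])

# Materialize the keys as a list and disable tuple-to-MultiIndex conversion,
# so each key stays a single (nested) tuple label.
keys = list(param_index.keys())
flat_index = pd.Index(keys, tupleize_cols=False)

s = pd.Series([1, 2], index=flat_index)
print(s.index.nlevels)  # 1, i.e. a flat Index rather than a MultiIndex
```

Because no levels are built, nothing has to be factorized or sorted, so the `None` inside the tuple is never compared against a string.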

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+360.g24fd90f66
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.28.4
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1634.dev0+ge8120cf6d
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
gcsfs: None


evfro · 2018-08-07T09:23:49Z

having the same problem

mroeschke · 2021-06-21T01:24:26Z

The latest result on master looks okay to me (coercing to a MultiIndex instead of having a flat Index). Could use a test.

In [13]: In [2]: from collections import OrderedDict
    ...: In [3]: param_index = OrderedDict([((('a', 'b'), ('c', 'd')), 1),
    ...:    ...:                            ((('a', None), ('c', 'd')), 2),
    ...:    ...:                           ])
    ...:    ...:
    ...:
    ...: In [4]: pd.Series([1, 2], index=param_index.keys())
Out[13]:
(a, b)     (c, d)    1
(a, None)  (c, d)    2
dtype: int64
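
A test along these lines could pin down the behavior shown above. This is only a sketch: the function name `test_series_index_from_dict_keys_of_tuples` is hypothetical, and it assumes a pandas version where the construction succeeds and yields a `MultiIndex`, as in the output above.

```python
from collections import OrderedDict

import pandas as pd
import pandas.testing as tm


def test_series_index_from_dict_keys_of_tuples():
    # GH 22077: dict keys holding tuples with a None inside used to raise.
    param_index = OrderedDict([((('a', 'b'), ('c', 'd')), 1),
                               ((('a', None), ('c', 'd')), 2)])

    result = pd.Series([1, 2], index=param_index.keys())

    # Build the expected two-level index explicitly from the same tuples.
    expected_index = pd.MultiIndex.from_tuples(list(param_index.keys()))
    expected = pd.Series([1, 2], index=expected_index)

    tm.assert_series_equal(result, expected)


test_series_index_from_dict_keys_of_tuples()
```

Comparing against an explicitly built `MultiIndex` keeps the test independent of how the index is printed.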

devdattakhoche · 2022-01-07T09:39:50Z

@mroeschke, @toobaz can we close this issue? I think it's fixed, and an informative error message is shown when a multi-dimensional index is passed to Series.

ValueError: Index data must be 1-dimensional

#15457

jreback · 2022-01-07T09:46:39Z

would take a PR with a test like the OP

devdattakhoche · 2022-01-07T10:04:50Z

> would take a PR with a test like the OP

@jreback Can you elaborate? I didn't follow. What do we need a test for here? And what does 'OP' mean here?

devdattakhoche · 2022-01-07T15:00:45Z

I am willing to contribute here. What is required?

toobaz added the Regression (functionality that used to work in a prior pandas version), 2/3 Compat, and MultiIndex labels Jul 27, 2018

toobaz mentioned this issue Jul 27, 2018

"TypeError: unorderable types" in Python3 when column for MultiIndex contains tuple and int #15457

Closed

jbrockmendel added the Constructors (Series/DataFrame/Index/pd.array constructors) label Jul 23, 2019

mroeschke added Bug and removed 2/3 Compat labels Apr 4, 2020

jbrockmendel added the Nested Data (data where the values are collections: lists, sets, dicts, objects, etc.) label Sep 22, 2020

phofl mentioned this issue Apr 9, 2023

Add regression tests noatamir/pyladies-workshop#7

Closed

26 tasks

sortofamudkip mentioned this issue Apr 18, 2023

TST: Test for index of dict keys as tuples #52758

Merged

5 tasks

mroeschke closed this as completed in #52758 Apr 20, 2023
