Possible usage of itemgetter(key) #26

Open
filak opened this issue Sep 13, 2023 · 1 comment


filak commented Sep 13, 2023

Thank you for a great library!

I would like to sort a list of dicts by multiple keys, where some of the values may contain non-ASCII Unicode characters. Sample program:

from pyuca import Collator
from operator import itemgetter
coll = Collator()

def multisort_list_of_dicts(xs, specs):
    for key, reverse, col in reversed(specs):   
        if col:
            xs.sort(key=coll.sort_key(itemgetter(key)), reverse=reverse)
        else:    
            xs.sort(key=itemgetter(key), reverse=reverse)
    return xs

data = [{'k1': 'd', 'k2': 10},{'k1': 'č', 'k2': 10},{'k1': 'a', 'k2': 20},{'k1': 'a', 'k2': 10}]

Standard sorting:

sort_spec = (('k2', False, False), ('k1', False, False))

for item in multisort_list_of_dicts(data, sort_spec):
    print(item)
  • works as expected, with accented characters sorted after unaccented ones:
{'k1': 'a', 'k2': 10}
{'k1': 'd', 'k2': 10}
{'k1': 'č', 'k2': 10}
{'k1': 'a', 'k2': 20}

Unicode sorting for k1:

sort_spec = (('k2', False, False), ('k1', False, True))

for item in multisort_list_of_dicts(data, sort_spec):
    print(item)
  • this cannot work, because sort_key() expects a string but receives the itemgetter object itself:

    def sort_key(self, string):

  File "C:\Programs\Python\Python311\Lib\site-packages\pyuca\collator.py", line 119, in sort_key
    normalized_string = unicodedata.normalize("NFD", string)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: normalize() argument 2 must be str, not operator.itemgetter
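
The traceback comes down to passing a callable where a string is expected. A minimal sketch of that distinction, using only the standard library (the `row` dict is a made-up example, and `unicodedata.normalize` is called directly here because it is the call that fails inside `sort_key()` in the traceback above):

```python
import unicodedata
from operator import itemgetter

getter = itemgetter("k1")            # a callable that extracts row["k1"], not a string
row = {"k1": "č", "k2": 10}          # hypothetical sample row

# Passing the getter itself reproduces the kind of failure shown in the traceback.
try:
    unicodedata.normalize("NFD", getter)
    error = None
except TypeError as exc:
    error = str(exc)

# Calling the getter on a row first yields the string that normalize() needs.
normalized = unicodedata.normalize("NFD", getter(row))

print(error)
print(len(normalized))               # "č" decomposes into "c" plus a combining caron
```

So the getter has to be applied to each row before its result reaches `sort_key()`.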

How can I get the desired output with pyuca?

{'k1': 'a', 'k2': 10}
{'k1': 'č', 'k2': 10}
{'k1': 'd', 'k2': 10}
{'k1': 'a', 'k2': 20}

Could pyuca handle itemgetter input in some way?


filak commented Sep 13, 2023

There is a workaround, of course: pre-populate the data with sort_key() values. But this might not always be optimal or feasible.

from operator import itemgetter
from pyuca import Collator
coll = Collator()

def multisort_list_of_dicts(xs, specs):
    for key, reverse in reversed(specs):
        xs.sort(key=itemgetter(key), reverse=reverse)
    return xs

data = [{'k1': 'd', 'k2': 10},{'k1': 'č', 'k2': 10},{'k1': 'a', 'k2': 20},{'k1': 'a', 'k2': 10}]

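data_sortable = []

# Store the collation key next to the original value so itemgetter('ks') can pick it up.
# Note that this adds the 'ks' entry to the dicts in data itself.
for d in data:
    d['ks'] = coll.sort_key(d['k1'])
    data_sortable.append(d)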

sort_spec = (('k2', False), ('ks', False))

for item in multisort_list_of_dicts(data_sortable, sort_spec):
    print(item)

Output:

{'k1': 'a', 'k2': 10, 'ks': (7239, 0, 32, 0, 2, 0)}
{'k1': 'č', 'k2': 10, 'ks': (7290, 0, 32, 40, 0, 2, 2, 0)}
{'k1': 'd', 'k2': 10, 'ks': (7311, 0, 32, 0, 2, 0)}
{'k1': 'a', 'k2': 20, 'ks': (7239, 0, 32, 0, 2, 0)}
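
An alternative that avoids storing the keys in the data is to compose the two callables inside the sort key, so that `sort_key()` receives the extracted string rather than the getter. This is a sketch under assumptions, not a pyuca feature: the `collate` name and the `use_collation` flag are chosen here for illustration, and the `ImportError` fallback exists only so the snippet runs without pyuca installed (NFD normalization is not real collation, it merely happens to order this sample data the same way).

```python
from operator import itemgetter

try:
    from pyuca import Collator
    collate = Collator().sort_key          # real Unicode Collation Algorithm keys
except ImportError:
    import unicodedata

    def collate(text):
        # Stand-in only: NOT a substitute for pyuca on real data.
        return unicodedata.normalize("NFD", text)


def multisort_list_of_dicts(xs, specs):
    """Sort xs in place by several keys, relying on the stability of list.sort()."""
    for key, reverse, use_collation in reversed(specs):
        getter = itemgetter(key)
        if use_collation:
            # Extract the value first, then turn it into a collation key.
            xs.sort(key=lambda row, g=getter: collate(g(row)), reverse=reverse)
        else:
            xs.sort(key=getter, reverse=reverse)
    return xs


data = [
    {"k1": "d", "k2": 10},
    {"k1": "č", "k2": 10},
    {"k1": "a", "k2": 20},
    {"k1": "a", "k2": 10},
]
sort_spec = (("k2", False, False), ("k1", False, True))

result = multisort_list_of_dicts(data, sort_spec)
for item in result:
    print(item)
# {'k1': 'a', 'k2': 10}
# {'k1': 'č', 'k2': 10}
# {'k1': 'd', 'k2': 10}
# {'k1': 'a', 'k2': 20}
```

The collation key is recomputed on each sort pass, but `list.sort()` calls the key function only once per element per pass, so the overhead is one `sort_key()` call per row for each collated column.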
