Multidicts does not preserve order on update #68

lysyloren · 2017-04-29T15:40:56Z

When updating existing key in multidict, it is moved to the end (original order is not preserved):

>>> d = CIMultiDict()
>>> d['1'] = '2'
>>> d['3'] = '4'
>>> d
<CIMultiDict('1': '2', '3': '4')>
>>> d['1'] = '10'
>>> d
<CIMultiDict('3': '4', '1': '10')>

In above example, guarantee of preserving insertion order should leave existing key '1' on its original first position.

The text was updated successfully, but these errors were encountered:

asvetlov · 2017-04-29T23:00:11Z

I don't think so.
Updating for key '1' means deleting existing key-value pair from multidict and adding a new pair '1': '10'.

Thus the behavior is totally correct.

lysyloren · 2017-04-30T19:29:17Z

I disagree. Updating for key '1' means updating existing key-value pair (not deleting it and adding new one). Please note, that this is not implementation detail point of view, but it is how end user see and use it.

Please also compare multidict ordering with: collections.OrderedDict and new Python 3.6 "compact" dict implementation. Both preserve insertion key order on update.

Now multidict behavior is last update preserving order not insertion preserving order and if you decide to leave this behavior, this difference should be clearly stated documentation. Now it is confusing for user familiar with OrderedDict or "compact" dict, because multidict ordering behavior is different from ordering behavior in these objects from standard library.

asvetlov · 2017-05-11T12:17:00Z

Th problem is that multidict allows multiple values for the same key.
For d = MultiDict([(1, 'a'), (2, 'b'), (1, 'c')]) the order is (1, 'a'), (2, 'b'), (1, 'c').
Please note that key 1 is appearing twice.

What order should be after d[1] = 'd' operation?

I guess (2, 'b'), (1, 'd). The ordering is clear if we use delete-and-insert operation but is not clean for updating.

Update first occurrence and delete all others? I have no intuitive clean solution.

asvetlov · 2017-05-21T08:16:18Z

After talking with people at US PyCon I've changed my mind.
Yes, the order should be preserved and behavior must be close to OrderedDict as much as possible.
It means we should replace the last key-value pair and delete all others.
I believe the change will not break any existing code except some unit tests which are explicitly depends on the order in their assertions.

@lysyloren would you make a Pull Request for it?

lysyloren · 2017-05-24T16:38:08Z

@lysyloren would you make a Pull Request for it?

Sorry, I have couple of urgent projects and no time for making this Pull Request for now.

Beside of this, the implementation of multidict is now based on list with O(n) complexity. I think, that faster and more reliable implementation may now be based on "compact" dict on Python 3.6 (and OrderedDict on Python < 3.6 for backward compatibility). I realize, that it will require rewrite most of the code, but new code should be simpler, probably Cython version will not be required (because of speed of build in dict), and the most important: complexicity will be O(1).

asvetlov · 2017-05-25T03:52:27Z

Hmm, it's complicated question.

At first we should support Python 3.4 and 3.5. We'll drop 3.4 this Autumn or Winter most likely but 3.5 remains the showstopper.
Sure, I can just grab an implementation from 3.6 but you know -- supporting C code is hard and potentially error-prone. For me much easier to keep on Cython level.

At second we use multidicts mostly for relative small dictionaries: HTTP headers and data provided by GET and POST requests.
For this cases O(N) is not worse than O(1) assuming the dict size is less than 30 elements usually.
But I've learned from Raymond's awesome talk (http://pyvideo.org/pycon-us-2017/modern-python-dictionaries-a-confluence-of-a-dozen-great-ideas.html) that keeping hash values in the list explicitly could improve performance very well.
Also we could get rid of using list internally but provide a custom implementation with delaying list shrinking until an amount of deleted items doesn't hit a limit.

All these approaches make sense to be applied but we need to test them and measure performance carefully.

asvetlov · 2017-06-08T15:12:48Z

Fixed by #87

asvetlov closed this as completed Apr 29, 2017

asvetlov reopened this May 21, 2017

asvetlov mentioned this issue Jun 7, 2017

Fix update order #87

Merged

asvetlov closed this as completed Jun 8, 2017

akhomchenko mentioned this issue Oct 6, 2017

ItemsView behavior #159

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multidicts does not preserve order on update #68

Multidicts does not preserve order on update #68

lysyloren commented Apr 29, 2017

asvetlov commented Apr 29, 2017

lysyloren commented Apr 30, 2017

asvetlov commented May 11, 2017

asvetlov commented May 21, 2017

lysyloren commented May 24, 2017

asvetlov commented May 25, 2017

asvetlov commented Jun 8, 2017

Multidicts does not preserve order on update #68

Multidicts does not preserve order on update #68

Comments

lysyloren commented Apr 29, 2017

asvetlov commented Apr 29, 2017

lysyloren commented Apr 30, 2017

asvetlov commented May 11, 2017

asvetlov commented May 21, 2017

lysyloren commented May 24, 2017

asvetlov commented May 25, 2017

asvetlov commented Jun 8, 2017