Replace OrderedDict with plain dict #33508

Merged
merged 2 commits into from
Aug 20, 2023

Conversation

eumiro
Contributor

@eumiro eumiro commented Aug 18, 2023

Python dicts are ordered, so we do not need to use collections.OrderedDict.
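For illustration, a minimal sketch of the behaviour this relies on (not code from this PR; the keys and values are made up): a plain dict and an OrderedDict built from the same pairs iterate in the same insertion order, assuming Python 3.7 or later.

```python
from collections import OrderedDict

# Hypothetical data, chosen only to make the ordering visible.
pairs = [("queued", 1), ("running", 2), ("done", 3)]

plain = dict(pairs)
ordered = OrderedDict(pairs)

# New keys are appended at the end in both cases.
plain["failed"] = 4
ordered["failed"] = 4

plain_keys = list(plain)
ordered_keys = list(ordered)
print(plain_keys == ordered_keys)
```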

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:providers area:Scheduler including HA (high availability) scheduler provider:amazon-aws AWS/Amazon - related issues provider:apache-hive provider:apache-sqoop provider:google Google (including GCP) related issues provider:http labels Aug 18, 2023
Member

@hussein-awala hussein-awala left a comment

Indeed, Python dicts have been ordered since 3.7.

@potiuk
Member

potiuk commented Aug 19, 2023

Needs to resolve conflicts with main.

@uranusjr
Member

Personally I still like to use OrderedDict, especially for things that explicitly depend on the ordering, such as the executor queue. But I guess this is subjective.

@potiuk
Member

potiuk commented Aug 20, 2023

> Personally I still like to use OrderedDict, especially for things that explicitly depend on the ordering, such as the executor queue. But I guess this is subjective.

I used to think the same, but now that we have moved to 3.7 and then 3.8, we are well past the Python minor versions where insertion ordering first merely worked (3.6) and then became official (3.7). I think it's better to use OrderedDict only when it really matters, especially since there are cases where we might specifically want OrderedDict (for features that are highly unlikely to be incorporated into the regular dict).

Since reversed() support was added to the regular dict in 3.8 (which means we can use it now), there are still two differences between regular dicts and OrderedDicts that really matter:

  • The ability to re-arrange the order with move_to_end(), which can move an item to the end (or, surprisingly, to the beginning) efficiently without reordering the whole dict. This makes it a better fit for any kind of LRU caching.

  • Equality comparison: OrderedDict takes order into account, which is useful when we want to check whether two dicts have the same order when comparing them.

>>> from collections import OrderedDict
>>> OrderedDict([(1,1), (2,2)]) == OrderedDict([(2,2), (1,1)])
False
>>> dict([(1,1), (2,2)]) == dict([(2,2), (1,1)])
True

So I guess it's better to use OrderedDict when we want to make use of one of those properties.
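To make the move_to_end() point concrete, here is a minimal sketch of an LRU cache built on OrderedDict. The LRUCache class and its method names are hypothetical and written only for illustration; nothing like this is part of the PR.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache sketch (hypothetical helper, not Airflow code)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used without rebuilding the dict.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            # last=False pops from the front, i.e. the least recently used key.
            self._data.popitem(last=False)

    def keys(self):
        # Oldest to newest.
        return list(self._data)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used key
cache.put("c", 3)  # exceeds capacity, so the oldest key ("b") is evicted
print(cache.keys())
```

A plain dict could emulate this by deleting and re-inserting the key, but move_to_end() expresses the intent directly, and move_to_end(key, last=False) covers moving an item to the beginning.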

I looked through the changes, and it does not seem that either order-sensitive equality or re-arranging is used or intended to be used :).

@potiuk potiuk merged commit 63e6eab into apache:main Aug 20, 2023
42 checks passed
@eumiro eumiro deleted the ordereddict branch August 20, 2023 15:49
@ephraimbuddy ephraimbuddy added the type:misc/internal Changelog: Misc changes that should appear in change log label Aug 27, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.7.1 milestone Aug 27, 2023
ephraimbuddy pushed a commit that referenced this pull request Aug 28, 2023