-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Title: Add Optional Comparator Support to heapq for Enhanced Flexibility #127328
Comments
I think we rejected this idea in #106570 (I didn't go through the dpo post though so maybe we can revisit this now). cc @rhettinger |
Oh, okay. I read his argument for rejection, and I have a few questions.
It would be an optional argument and used only by those who are ready to accept the performance penalty.
Why do we care? It's an optional parameter, and people are worried about the invalidation of the heap invariant on Discourse as well. However, this is not a language construct used by everyone where we need to worry about code flow invalidation. It’s an optional thing, and users should know what it will do to their code.
What does "too low-level and fine-grained" mean here? These are public APIs of heapq. |
This kind of mindset could work for third party libs but I'm not sure the standard library aims to be flexible here. Note that heaps in other languages such as C++ do allow you to specify your own comparison operator so I personally would like this feature in Python as well.
Keep in mind that the code to modify is not only the pure Python code but also the C code which then requires additional handling. We could say "ok here's the very flexible API and deal with it" but it appears that we don't want to be too flexible.
Either it's because it's meant to rely on C accelerators or it's because it's to say that the operation is a low-level one in the heapq algorithm (think of it as an "basic operation" like an addition or a multiplication when you're doing complex arithmetic). I'll let Raymond confirm my understandings. |
fair enough. got you.
can I see one example? is it an extension-level thing or an interpreter/runtime? |
Yes this is what I meant. Maintenance of the C code is something we also consider when adding new features. Usually the pure Python code is only there as a fallback and the C part is usually the one that dictates whether there will be a maintenance burden or not. |
Feature or enhancement
Proposal:
Currently, the heapq module in CPython uses a fixed implementation based on the default comparison behavior of Python objects. However, this design restricts the usability of heapq in scenarios where users need a custom ordering for their data.
Proposal
Introduce an optional comparator parameter to the heapq module to allow greater flexibility in heap operations. This would eliminate the need for users to create additional wrapper objects or manage unnecessary cognitive overhead for simple use cases (e.g., custom ordering for coordinates, tuples, or complex data types).
Motivation
1 Reduce Cognitive Load: Users currently must wrap data in objects with custom dunder methods (lt, etc.) or use workarounds like storing tuples with artificially ordered keys. This is an unnecessary complexity for scenarios that require only custom sorting logic.
2 Consistency: Other priority queue implementations in programming languages (e.g., Java's PriorityQueue or C++'s std::priority_queue) support custom comparators directly.
3 Maybe Flexibility
4 Instead of explicitly including extra information like distance in the tuple (as in the Dijkstra example), we could pass the raw data along with a comparator. The comparator would determine which element to pop and where to place it in the heap based on the desired priority.
Add an optional key or cmp argument to the following functions in the heapq module:
1 heapify
2 heappush
3 heappop
4 heappushpop
5 heapreplace
6 merge
The comparator could function similarly to the key argument in sorted(), where users provide a callable to define custom ordering logic.
Example Usage: Coordinates
Example Usage: Dijkstra’s Algorithm
Dijkstra’s algorithm for finding the shortest path in a graph often requires a priority queue with custom ordering based on the distance from the source node. The lack of a built-in comparator in heapq makes the implementation less intuitive.
In this example:
The heap uses a tuple of (distance, node) to prioritize nodes based on their distance from the source.
Adding an optional comparator would eliminate the need for tuples and make the code cleaner and more intuitive.
The discussion on Discourse is quite messy, so I created this issue here to gather more focused feedback from developers involved in CPython development, especially the professors(I guess they will agree to it :)). If there’s agreement on this approach, I’m happy to work on implementing it, as I’ve already tested it by modifying the .py file.
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
Discussion happened here
https://discuss.python.org/t/create-new-package-similar-to-heapq-but-be-able-to-pass-custom-comparator-through-a-constructor/72136
The text was updated successfully, but these errors were encountered: