Using a key for the heapq functions #106570

amrear · 2023-07-09T13:15:23Z

Feature or enhancement

The ability to use a key in order to heappop, heappush, heapify, etc. can come in handy in situations where the objects that are being pushed to the heap are not comparable. This functionality is similar to the key argument of the sorted function. It is up to the developer to keep the key argument consistent where more than one heapq function is called on a single heap. This functionality can be propagated to the PriorityQueue of the queue library which uses heapq under the hood.

rhettinger · 2023-07-09T15:43:03Z

Internally, the heapq functions do many comparisons per call and we would end-up calling the key function many times for the same key even in a sing heapify call. This isn't efficient and is at odds with the expectation (from sorted, min, max, nsmallest, nlargest, merge, and groupby) that a key function will be called at most once per key.

Instead of a key function, the docs recommend, "the solution to the problem of non-comparable tasks is to create a wrapper class that ignores the task item and only compares the priority field". The docs then provides this simple recipe:

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any=field(compare=False)

That allows the priorities to be made explicit, for the priority to be computed only once, for the priority to be made visible via the __repr__, and for the basic heap algorithms to be used as-is.

If we went down the path of adding a key-function, it creates an unexpected efficiency trap and creates a need to add caching to the key-function which has its own challenges.

Also, by having the user create heaps with prioritized inputs, it is easy to have multiple heaps, each with a different key (heapify by date or heapify by cost). In contrast, a key-function would be error-prone, creating the risk of heapifying with one key-function and popping/pushing with another.

Note that we did add key functions for nsmallest, nlargest, and merge because it was possible to avoid the above issues by pre-computing the key function exactly once per key. We intentionally excluded the heappush and heappop functions because they are too fine-grained and low level to incomporate compute-once logic.

amrear added the type-feature A feature request or enhancement label Jul 9, 2023

rhettinger closed this as not planned Won't fix, can't repro, duplicate, stale Jul 9, 2023

picnixz mentioned this issue Nov 27, 2024

Title: Add Optional Comparator Support to heapq for Enhanced Flexibility #127328

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Using a key for the heapq functions #106570

Using a key for the heapq functions #106570

amrear commented Jul 9, 2023

rhettinger commented Jul 9, 2023 •

edited

Loading

Using a key for the heapq functions #106570

Using a key for the heapq functions #106570

Comments

amrear commented Jul 9, 2023

Feature or enhancement

rhettinger commented Jul 9, 2023 • edited Loading

rhettinger commented Jul 9, 2023 •

edited

Loading