You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The ability to use a key in order to heappop, heappush, heapify, etc. can come in handy in situations where the objects that are being pushed to the heap are not comparable. This functionality is similar to the key argument of the sorted function. It is up to the developer to keep the key argument consistent where more than one heapq function is called on a single heap. This functionality can be propagated to the PriorityQueue of the queue library which uses heapq under the hood.
The text was updated successfully, but these errors were encountered:
Internally, the heapq functions do many comparisons per call and we would end-up calling the key function many times for the same key even in a sing heapify call. This isn't efficient and is at odds with the expectation (from sorted, min, max, nsmallest, nlargest, merge, and groupby) that a key function will be called at most once per key.
Instead of a key function, the docs recommend, "the solution to the problem of non-comparable tasks is to create a wrapper class that ignores the task item and only compares the priority field". The docs then provides this simple recipe:
@dataclass(order=True)
class PrioritizedItem:
priority: int
item: Any=field(compare=False)
That allows the priorities to be made explicit, for the priority to be computed only once, for the priority to be made visible via the __repr__, and for the basic heap algorithms to be used as-is.
If we went down the path of adding a key-function, it creates an unexpected efficiency trap and creates a need to add caching to the key-function which has its own challenges.
Also, by having the user create heaps with prioritized inputs, it is easy to have multiple heaps, each with a different key (heapify by date or heapify by cost). In contrast, a key-function would be error-prone, creating the risk of heapifying with one key-function and popping/pushing with another.
Note that we did add key functions for nsmallest, nlargest, and merge because it was possible to avoid the above issues by pre-computing the key function exactly once per key. We intentionally excluded the heappush and heappop functions because they are too fine-grained and low level to incomporate compute-once logic.
Feature or enhancement
The ability to use a key in order to heappop, heappush, heapify, etc. can come in handy in situations where the objects that are being pushed to the heap are not comparable. This functionality is similar to the key argument of the sorted function. It is up to the developer to keep the key argument consistent where more than one heapq function is called on a single heap. This functionality can be propagated to the PriorityQueue of the queue library which uses heapq under the hood.
The text was updated successfully, but these errors were encountered: