Reallow non-callable key_fn in .cluster_objects()
Resolves #691

Thanks to @jfuruness for flagging.
jsvine committed Oct 1, 2022
1 parent 6d71c2e commit 1e97656
Showing 3 changed files with 10 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ All notable changes to this project will be documented in this file. The format
## Fixed

- Fixed issue where `py.typed` file was not included in PyPi distribution. ([#698](https://github.com/jsvine/pdfplumber/issues/698) + [#703](https://github.com/jsvine/pdfplumber/pull/703)) [h/t @jhonatan-lopes]
+- Reinstated the ability to call `utils.cluster_objects(...)` with any hashable value (`str`, `int`, `tuple`, etc.) as the `key_fn` parameter, reverting breaking change in [58b1ab1](https://github.com/jsvine/pdfplumber/commit/58b1ab1). ([#691](https://github.com/jsvine/pdfplumber/issues/691)) [h/t @jfuruness]

### Development Changes

7 changes: 5 additions & 2 deletions pdfplumber/utils.py
@@ -1,7 +1,7 @@
 import itertools
 import re
 import string
-from collections.abc import Sequence
+from collections.abc import Hashable, Sequence
 from operator import itemgetter
 from typing import (
     TYPE_CHECKING,
@@ -68,9 +68,12 @@ def make_cluster_dict(values: Iterable[T_num], tolerance: T_num) -> Dict[T_num,


 def cluster_objects(
-    xs: List[R], key_fn: Callable[[R], T_num], tolerance: T_num
+    xs: List[R], key_fn: Union[Hashable, Callable[[R], T_num]], tolerance: T_num
 ) -> List[List[R]]:

+    if not callable(key_fn):
+        key_fn = itemgetter(key_fn)
+
     values = map(key_fn, xs)
     cluster_dict = make_cluster_dict(values, tolerance)
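
The hunk above converts a non-callable `key_fn` into a lookup function with `operator.itemgetter`. A minimal standalone sketch of that normalization step follows. It does not use pdfplumber: `normalize_key_fn` and `group_by_exact_value` are hypothetical names introduced here for illustration, and the grouping helper is only a simplified stand-in for the `tolerance=0` case, not the library's clustering logic.

```python
from operator import itemgetter
from typing import Any, Callable, Dict, Hashable, List, Union


def normalize_key_fn(
    key_fn: Union[Hashable, Callable[[Any], Any]]
) -> Callable[[Any], Any]:
    """Return key_fn unchanged if it is callable; otherwise treat it
    as a lookup key and wrap it with itemgetter (hypothetical helper)."""
    if not callable(key_fn):
        return itemgetter(key_fn)
    return key_fn


def group_by_exact_value(
    xs: List[Any], key_fn: Union[Hashable, Callable[[Any], Any]]
) -> List[List[Any]]:
    """Simplified stand-in for clustering with tolerance=0 (assumption):
    group items whose key values are equal, ordered by key value."""
    fn = normalize_key_fn(key_fn)
    groups: Dict[Any, List[Any]] = {}
    for x in xs:
        groups.setdefault(fn(x), []).append(x)
    return [groups[k] for k in sorted(groups)]


rows = [{"x": 1, 7: "a"}, {"x": 1, 7: "b"}, {"x": 2, 7: "b"}]

by_x = group_by_exact_value(rows, "x")     # string key, wrapped by itemgetter
by_7 = group_by_exact_value(rows, 7)       # int key, wrapped by itemgetter
by_len = group_by_exact_value(["a", "ab", "b"], len)  # callable, used as-is
```

One consequence of the `callable()` check is that a key which is itself callable is always applied as a function and never used as a lookup key.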

4 changes: 4 additions & 0 deletions tests/test_utils.py
@@ -43,6 +43,10 @@ def test_cluster_objects(self):
         a = ["a", "ab", "abc", "b"]
         assert utils.cluster_objects(a, len, 0) == [["a", "b"], ["ab"], ["abc"]]

+        b = [{"x": 1, 7: "a"}, {"x": 1, 7: "b"}, {"x": 2, 7: "b"}, {"x": 2, 7: "b"}]
+        assert utils.cluster_objects(b, "x", 0) == [[b[0], b[1]], [b[2], b[3]]]
+        assert utils.cluster_objects(b, 7, 0) == [[b[0]], [b[1], b[2], b[3]]]
+
     def test_resolve(self):
         annot = self.pdf.annots[0]
         annot_ad0 = utils.resolve(annot["data"]["A"]["D"][0])