Simplify stdlib code by using `itertools.batched()` #126317

lgeiger · 2024-11-01T23:40:02Z

#98364 introduced itertools.batched() but since it's pretty new it's still rarely used inside the standard library.

pickle.py contains several hand-written versions of this batching logic, all of which could be replaced by itertools.batched() for improved readability.
This probably has a negligible performance impact (at the very least it won't be slower) but should make the code easier to understand.

Let me know if you're interested in a contribution for this or whether you'd rather not have the extra churn for a minor change like this.

Linked PRs

gh-126317: Simplify stdlib code by using itertools.batched() #126323

picnixz · 2024-11-02T00:40:39Z

I'm not sure we want to refactor pickle.py, especially not the pure Python implementation. But I'll let this decision fall to @serhiy-storchaka.

serhiy-storchaka · 2024-11-02T07:52:52Z

There is no great need for such a change, but it makes the code slightly smaller, so I have no objections. Please show the results of microbenchmarks for pickling a large list, dict and set. Even a small speedup would be an additional argument for this change. A slowdown would be a sign that there is something wrong with itertools.batched().

dongwooklee96 · 2024-11-02T08:56:47Z

bench.py

import pickle
import pyperf

large_list = list(range(10**6))  # 1,000,000 list
large_dict = {str(i): i for i in range(10**6)}  # 1,000,000 dict
large_set = set(range(10**6))  # 1,000,000 set

def pickle_save(data, filename):
    with open(filename, "wb") as f:
        pickle.dump(data, f)

def pickle_load(filename):
    with open(filename, "rb") as f:
        return pickle.load(f)

def run_benchmarks():
    runner = pyperf.Runner()

    # Pickle dump test
    runner.bench_func("pickle_save_large_list", pickle_save, large_list, "large_list.pkl")
    runner.bench_func("pickle_save_large_dict", pickle_save, large_dict, "large_dict.pkl")
    runner.bench_func("pickle_save_large_set", pickle_save, large_set, "large_set.pkl")

    # Pickle load test
    runner.bench_func("pickle_load_large_list", pickle_load, "large_list.pkl")
    runner.bench_func("pickle_load_large_dict", pickle_load, "large_dict.pkl")
    runner.bench_func("pickle_load_large_set", pickle_load, "large_set.pkl")

if __name__ == "__main__":
    run_benchmarks()

result

Benchmark hidden because not significant (6): pickle_save_large_list, pickle_save_large_dict, pickle_save_large_set, pickle_load_large_list, pickle_load_large_dict, pickle_load_large_set

The result of running the benchmark using pyperf.

I'm sorry, these results are a little different from the ones I got before, so I'm guessing there's a difference in the configure options. If you get a different result, I'd appreciate it if you could let me know.
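
One possible reason for the "not significant" result is that `pickle.dump()` uses the C accelerator when it is available, so the benchmark above may mostly measure C code and file I/O rather than the Python code in pickle.py. The sketch below is one way to time the pure-Python pickler in memory. It assumes the private helper `pickle._dumps` (the pure-Python counterpart of `pickle.dumps` in CPython's Lib/pickle.py, not a public API) and uses `timeit` instead of pyperf to stay self-contained; `SIZE` and `time_dumps` are hypothetical names for this example.

```python
import pickle
import timeit

# Assumption: pickle._dumps is the private pure-Python counterpart of
# pickle.dumps in CPython's Lib/pickle.py. Fall back to pickle.dumps
# if the attribute is missing.
pure_python_dumps = getattr(pickle, "_dumps", pickle.dumps)

SIZE = 10**5  # smaller than the 10**6 used above to keep the run short

samples = {
    "list": list(range(SIZE)),
    "dict": {str(i): i for i in range(SIZE)},
    "set": set(range(SIZE)),
}


def time_dumps(dumps, obj, repeat=3):
    """Return the best wall-clock time of `repeat` single calls to dumps(obj)."""
    return min(timeit.repeat(lambda: dumps(obj), number=1, repeat=repeat))


if __name__ == "__main__":
    for name, obj in samples.items():
        default = time_dumps(pickle.dumps, obj)
        pure = time_dumps(pure_python_dumps, obj)
        print(f"{name}: default {default:.4f}s, pure Python {pure:.4f}s")
```

Running this once on the base branch and once on the patched branch would give a rough comparison of the pure-Python path; pyperf remains the better tool for statistically sound numbers.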

serhiy-storchaka · 2024-11-02T14:09:13Z

Thank you for your contribution @dongwooklee96.

picnixz added type-feature A feature request or enhancement stdlib Python modules in the Lib dir labels Nov 2, 2024

picnixz added this to Pickle and copy issues 🥒 Nov 2, 2024

dongwooklee96 added a commit to dongwooklee96/cpython that referenced this issue Nov 2, 2024

pythongh-126317: Simplify stdlib code by using itertools.batched()

11eb1ee

bedevere-app bot mentioned this issue Nov 2, 2024

gh-126317: Simplify stdlib code by using itertools.batched() #126323

Merged

dongwooklee96 added a commit to dongwooklee96/cpython that referenced this issue Nov 2, 2024

pythongh-126317: Simplify stdlib code by using itertools.batched()

4fa459b

dongwooklee96 added a commit to dongwooklee96/cpython that referenced this issue Nov 2, 2024

pythongh-126317: Fix to avoid calling the len function twice

21d7b01

serhiy-storchaka pushed a commit that referenced this issue Nov 2, 2024

gh-126317: Simplify pickle code by using itertools.batched() (GH-126323)

bd4be5e

serhiy-storchaka closed this as completed Nov 2, 2024

github-project-automation bot moved this to Done in Pickle and copy issues 🥒 Nov 2, 2024

github-actions bot mentioned this issue Dec 1, 2024

Monthly issue metrics report hugovk/test#88

Closed

picnixz pushed a commit to picnixz/cpython that referenced this issue Dec 8, 2024

pythongh-126317: Simplify pickle code by using itertools.batched() (p…

6c5227d

…ythonGH-126323)

ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025

pythongh-126317: Simplify pickle code by using itertools.batched() (p…

96db8d0

…ythonGH-126323)
