Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-120496: Make enum_iter thread safe #120591

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Jun 16, 2024

We make enum_iter thread-safe using a critical section. We use the same approach as in #119438 to allow for exits in the middle of the critical section. The method enum_iter_long is guarded by the critical section from enum_long.

Without making enum_iter thread safe problems can occur:

  • For enumerate(range(10)) pairs (n, m) with n unqual to m can be generated
  • When iterating over sys.maxsize, e.g. enumerate(range(sys.maxsize - 10, sys.maxsize + 10)) an overflow can occur.
  • As an optimization enum_iter keeps track of the returned tuple. In the free-threaded build multiple threads can operate on the same tuple.

To determine the overhead of making enumerate thread-safe here are some benchmarking results for a single-threaded application. Performance when actually using multiple-threads to readout the enumerate is considered less important.

Benchmark script:

import pyperf
runner = pyperf.Runner()

setup = """

def weighted_sum(k):
    value = 0
    for n, m in enumerate(k):
        value += n * m 
    return value

range10 = range(10)
range1000 = range(1000)
x = list(range(10))
"""

runner.timeit(name="enumerate(range10)", stmt="enumerate(range10)", setup=setup)
runner.timeit(name="list(enumerate(range10))", stmt="list(enumerate(range10))", setup=setup)
runner.timeit(name="list(enumerate(range1000))", stmt="list(enumerate(range1000))", setup=setup)
runner.timeit(name="weighted_sum", stmt="weighted_sum(x)", setup=setup)

Results of main versus free-threading:

enumerate(range10): Mean +- std dev: [enumerate_main] 73.2 ns +- 3.5 ns -> [enumerate_ft] 94.1 ns +- 3.5 ns: 1.29x slower
list(enumerate(range10)): Mean +- std dev: [enumerate_main] 292 ns +- 10 ns -> [enumerate_ft] 351 ns +- 19 ns: 1.20x slower
list(enumerate(range1000)): Mean +- std dev: [enumerate_main] 31.1 us +- 1.6 us -> [enumerate_ft] 32.5 us +- 1.4 us: 1.04x slower
weighted_sum: Mean +- std dev: [enumerate_main] 433 ns +- 17 ns -> [enumerate_ft] 938 ns +- 49 ns: 2.16x slower

Geometric mean: 1.37x slower

Results of free-threading vs. free-threading with this PR:

list(enumerate(range10)): Mean +- std dev: [enumerate_ft] 351 ns +- 19 ns -> [enumerate_ft_pr] 410 ns +- 18 ns: 1.17x slower
list(enumerate(range1000)): Mean +- std dev: [enumerate_ft] 32.5 us +- 1.4 us -> [enumerate_ft_pr] 40.9 us +- 1.7 us: 1.26x slower

Benchmark hidden because not significant (2): enumerate(range10), weighted_sum

Geometric mean: 1.10x slower

The case enumerate(range10) is not affected by the PR (used to check the benchmarking is stable). The cases list(enumerate(range(x))) slow down a bit. More representative is perhaps the weighted_sum benchmark where the enumerate is used a for loop with a minimal amount of work. There the overhead of the locking not significant.

@eendebakpt eendebakpt changed the title gh-120496: make enum_iter thread safe gh-120496: Make enum_iter thread safe Jun 16, 2024
}
Py_END_CRITICAL_SECTION();

if (reuse_result) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we hold the only references to result this can be outside the critical section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant