[WIP] bpo-39465: _PyUnicode_FromId() now uses an hash table #20048
Result of bench.py microbenchmark of https://bugs.python.org/issue39465:
Reading a hash table entry is more than 6x slower than just reading an object attribute: 15 ns vs 2 ns.
Context for these numbers:
_PyUnicode_FromId("abc") with this change (15.1 ns) remains 2.4x faster than PyUnicode_FromString (35.8 ns) and 5.9x faster than PyUnicode_InternFromString (89.8 ns).
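To illustrate where the extra nanoseconds can come from, here is a minimal sketch contrasting a pointer-keyed hash table lookup with a plain struct field read. All names (`table`, `entry`, `hash_ptr`, `cached_id`) are hypothetical and this is not CPython's `_Py_hashtable` code; the 256-bucket size is simply the figure quoted later in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 256  /* power of two, so a mask replaces a modulo */

/* Hypothetical chained hash table keyed by pointer (not _Py_hashtable). */
typedef struct entry {
    const void *key;
    void *value;
    struct entry *next;
} entry;

typedef struct {
    entry *buckets[NBUCKETS];
} table;

static size_t
hash_ptr(const void *p)
{
    uintptr_t x = (uintptr_t)p;
    /* Mix away the low alignment bits, then mask to a bucket index. */
    return (size_t)((x >> 4) ^ (x >> 12)) & (NBUCKETS - 1);
}

/* Link a caller-owned entry at the head of its bucket chain. */
void
table_set(table *t, entry *e)
{
    assert(t != NULL && e != NULL);
    size_t i = hash_ptr(e->key);
    e->next = t->buckets[i];
    t->buckets[i] = e;
}

/* Lookup: hash the key, index the bucket, walk the chain comparing keys. */
void *
table_get(const table *t, const void *key)
{
    for (const entry *e = t->buckets[hash_ptr(key)]; e != NULL; e = e->next) {
        if (e->key == key) {
            return e->value;
        }
    }
    return NULL;
}

/* The cheap case: the object is cached directly in the identifier. */
typedef struct {
    const char *string;
    void *object;
} cached_id;

void *
cached_get(const cached_id *id)
{
    return id->object;  /* a single load */
}
```

The hash table path does a hash computation, a bucket load, and at least one key comparison, while the cached path is one memory load. That difference is consistent with the direction of the numbers above, though the sketch says nothing about their exact size.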
With PR #20051 and one extra optimization, I optimized this PR to:
I optimized the code. Now the overhead of this PR is way smaller. Microbenchmark using LTO compilation:
It adds 2.33 nanoseconds per _PyUnicode_FromId() call. It's hard to bet
cc @ericsnowcurrently @pablogsal @serhiy-storchaka @methane: Do you think that such a slowdown is acceptable? The intent of this PR is to prepare The next step is to have a per-interpreter hash table. I'm targeting Python 3.10, not Python 3.9.
_Py_hashtable_get() in _PyUnicode_FromId() calls
Machine code on x86-64 on Fedora 32 with gcc (GCC) 10.0.1 20200430 (Red Hat 10.0.1-0.14), compilation using LTO. It contains many NOPs, likely for better code placement :-)
Could it be done with an array of identifiers instead of a hash table?
I would not use hash tables, but contiguous arrays. At first access assign an index for It will still be much slower than the current code. It would be nice to add a compile-time option to disable subinterpreters.
+1 on avoiding a hash table. What is the set of strings used with Regardless, the same effect could be made (in the same way) using a table/struct holding all these known identifiers instead of having them spread all over as statics (and we could even get rid of
It might be interesting to look at how MicroPython interns strings. There's a preprocessing step before C compilation, and new ones can also be added dynamically.
The number of constants is limited, but it is hard to determine it at compile time. And code that uses a compile-time registry of constants would be unreadable. Also, it would be inefficient to initialize all constants at interpreter start if most of them will not be used. It is better to build the list of constants at runtime.
It is better to build the objects on demand, but would it be worth it to allocate space for them at the beginning, and use build-time-constant indexes into the array?
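A minimal sketch of that suggestion, assuming a fixed list of identifiers known at build time: an enum provides the constant indexes, the per-interpreter array is sized up front, and each slot is still filled lazily. The enum members, `interp_cache`, and `known_id_get` are hypothetical names for illustration, and plain C strings stand in for Python objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time registry: each identifier gets a constant index. */
enum known_id {
    ID_DUNDER_NAME,
    ID_DUNDER_DOC,
    ID_APPEND,
    ID_COUNT   /* number of identifiers, fixes the array size */
};

static const char *const known_id_strings[ID_COUNT] = {
    [ID_DUNDER_NAME] = "__name__",
    [ID_DUNDER_DOC]  = "__doc__",
    [ID_APPEND]      = "append",
};

/* Per-interpreter cache: space is reserved up front, slots start empty. */
typedef struct {
    const char *objects[ID_COUNT];  /* stand-in for PyObject* slots */
    int builds;                     /* how many slots were built on demand */
} interp_cache;

const char *
known_id_get(interp_cache *cache, enum known_id id)
{
    assert(cache != NULL && (int)id >= 0 && (int)id < ID_COUNT);
    if (cache->objects[id] == NULL) {
        /* Built on first use only; a real version would create a str object. */
        cache->objects[id] = known_id_strings[id];
        cache->builds++;
    }
    return cache->objects[id];
}
```

The lookup is a bounds-checked array index with no hashing. The cost is the one raised above: every identifier has to be listed in one central place at build time.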
+1 This is more what I was saying. I didn't mean to initialize all the constants at start (just like we do not currently with
+1
I like the idea of an array and assigning a globally unique id when an identifier is initialized.
The problem of a generic API (I am not only talking about Py_IDENTIFIER here) which uses an array for per-interpreter variables is the memory usage if only a small part of the variables are used. The advantage is that reading a variable is more efficient than using a hash table. For Py_IDENTIFIER, in the benchmark, the hash table only has 138 entries (256 buckets), so an array is reasonable.
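The array idea discussed above (assign a globally unique index when an identifier is first used, then index a per-interpreter array with it) can be sketched as follows. This is an illustration under stated assumptions, not code taken from any PR: `identifier`, `interp_ids`, `identifier_get`, and `next_index` are hypothetical names, heap-allocated C strings stand in for Python str objects, and the index counter is unsynchronized, which a real implementation would have to address.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical static identifier: shared by all interpreters. */
typedef struct {
    const char *string;  /* the C string the identifier names */
    long index;          /* process-wide slot number, -1 until first use */
} identifier;

#define IDENTIFIER_INIT(s) { (s), -1 }

/* Hypothetical per-interpreter cache, indexed by identifier.index. */
typedef struct {
    char **items;   /* stand-in for an array of PyObject* */
    size_t size;
} interp_ids;

static long next_index = 0;  /* assumption: no locking in this sketch */

const char *
identifier_get(interp_ids *interp, identifier *id)
{
    assert(interp != NULL && id != NULL);
    if (id->index < 0) {
        id->index = next_index++;  /* assigned once, on first access */
    }
    size_t i = (size_t)id->index;
    if (i >= interp->size) {
        /* Grow this interpreter's array so that slot i exists. */
        size_t new_size = (i < 16) ? 16 : i * 2;
        char **items = realloc(interp->items, new_size * sizeof(char *));
        if (items == NULL) {
            return NULL;
        }
        for (size_t k = interp->size; k < new_size; k++) {
            items[k] = NULL;
        }
        interp->items = items;
        interp->size = new_size;
    }
    if (interp->items[i] == NULL) {
        /* Build the per-interpreter object on demand. */
        size_t len = strlen(id->string) + 1;
        char *copy = malloc(len);
        if (copy == NULL) {
            return NULL;
        }
        memcpy(copy, id->string, len);
        interp->items[i] = copy;
    }
    return interp->items[i];
}
```

After the first call, a lookup is a comparison plus one array load, and each interpreter owns its own objects. The memory trade-off mentioned above also shows up here: the array is sized by the highest index in use, not by how many identifiers a given interpreter actually touched.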
PR #20058 implements the array idea. The performance overhead is way lower, and it makes _PyUnicode_FromId() compatible with subinterpreters:
Rewrote the _Py_Identifier structure and the _PyUnicode_FromId() function to store Python objects in a hash table rather than a singly linked list. Add _PyUnicode_PreInit() to create the hash table: it must be called before the first PyType_Ready() call.
I failed to make the hash table as fast as an array, so I am closing this PR in favor of PR #20058, which uses an array.
https://bugs.python.org/issue39465