Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-116738: Make _codecs module thread-safe #117530

Merged
merged 7 commits into from
May 2, 2024

Conversation

swtaarrs
Copy link
Member

@swtaarrs swtaarrs commented Apr 4, 2024

The module itself is a thin wrapper around calls to functions in Python/codecs.c, so that's where the meaningful changes happened:

  • Move codecs-related state that lives on PyInterpreterState to a struct declared in pycore_codecs.h.

  • In free-threaded builds, add two mutexes to codecs_state, one to synchronize operations on search_path, and one to synchronize initialization.

  • _PyCodecRegistry_Init() is now _PyCodecRegistry_EnsureInit(), and is called unconditionally to ensure initialization, rather than only when initialization is needed. When it returns 0 (as opposed to -1 to indicate an error), the PyObject * members of codecs_state can be read without further synchronization.

  • Operations that mutate codecs.search_path must be performed while holding codecs.search_path_mutex. This allows PyCodec_Unregister() to search for and delete a specific item from the list. Because search_path_mutex is used as a normal mutex and not a critical section, we must be extremely careful with operations called while holding it.

  • Issue: Audit all built-in modules for thread safety #116738

@swtaarrs swtaarrs marked this pull request as ready for review April 4, 2024 17:49
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good. Two comments below

Python/codecs.c Outdated Show resolved Hide resolved
Python/codecs.c Outdated Show resolved Hide resolved
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. A few comments below.

Python/codecs.c Show resolved Hide resolved
Python/pystate.c Outdated Show resolved Hide resolved
Objects/object.c Outdated Show resolved Hide resolved
@swtaarrs
Copy link
Member Author

This should be ready to go as of my last commit - let me know if any other changes are needed.

@colesbury colesbury merged commit f8290df into python:main May 2, 2024
36 checks passed
SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024
The module itself is a thin wrapper around calls to functions in
`Python/codecs.c`, so that's where the meaningful changes happened:

- Move codecs-related state that lives on `PyInterpreterState` to a
  struct declared in `pycore_codecs.h`.

- In free-threaded builds, add a mutex to `codecs_state` to synchronize
  operations on `search_path`. Because `search_path_mutex` is used as a
  normal mutex and not a critical section, we must be extremely careful
  with operations called while holding it.

- The codec registry is explicitly initialized as part of
  `_PyUnicode_InitEncodings` to simplify thread-safety.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) skip news topic-free-threading
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants