
Fix TSS key exhaustion in implicitly_convertible() #6020

Merged: rwgk merged 1 commit into pybind:master from rwgk:implicitly_convertible_back_to_thread_local on Mar 29, 2026
@rwgk rwgk commented Mar 28, 2026

Description

Fixes #5975

PR #5777 (included in v3.0.1) replaced the static bool / thread_local bool reentrancy guard in implicitly_convertible() with static thread_specific_storage<int> to make it sub-interpreter and free-threading safe. Because implicitly_convertible is a template function parametrized on <InputType, OutputType>, every unique type pair instantiation creates its own static thread_specific_storage, each of which allocates a TSS key via PyThread_tss_create().
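
The per-instantiation behavior described above can be illustrated with a minimal sketch. The names `fake_tss_key`, `register_conversion`, and `keys_after_three_pairs` are hypothetical stand-ins, not pybind11 API; the counter merely mimics one `PyThread_tss_create()` call per constructed object.

```cpp
#include <string>

// Hypothetical stand-in for a TSS key: constructing one bumps a global
// counter, mimicking one PyThread_tss_create() call per object.
struct fake_tss_key {
    static int &live_keys() {
        static int count = 0;
        return count;
    }
    fake_tss_key() { ++live_keys(); }
};

// Hypothetical stand-in for implicitly_convertible<InputType, OutputType>():
// the function-local static is a distinct object for every type pair.
template <typename InputType, typename OutputType>
void register_conversion() {
    static fake_tss_key key; // constructed once per template instantiation
    (void) key;
}

// Registers three distinct type pairs (one of them twice) and returns the
// number of keys that were created as a result.
inline int keys_after_three_pairs() {
    register_conversion<int, double>();
    register_conversion<int, double>(); // same pair: the existing static is reused
    register_conversion<int, std::string>();
    register_conversion<double, std::string>();
    return fake_tss_key::live_keys();
}
```

Calling a given instantiation repeatedly does not create more keys; only new `<InputType, OutputType>` pairs do, which is why the key count grows with the number of distinct implicit conversions.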

Projects with hundreds of modules and many implicit conversions (the reporters have 590+ and 1000+ PYBIND11_MODULEs) exhaust the OS TSS key limit (PTHREAD_KEYS_MAX: 1024 on Linux, 512 on macOS). The problem manifests on Python 3.12+ because CPython itself consumes more TSS keys for subinterpreter support, reducing the budget available to user code.
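
To see how small the key budget is on a given machine, a rough probe along these lines may help. It assumes a POSIX system where `PyThread_tss_create()` is backed by `pthread_key_create()`; the helper name `count_available_pthread_keys` is hypothetical, and older toolchains may need `-pthread` when linking.

```cpp
#include <pthread.h>
#include <vector>

// Creates pthread keys until the OS refuses, then releases all of them.
// Returns how many keys were still available to this process when called.
inline long count_available_pthread_keys() {
    std::vector<pthread_key_t> keys;
    pthread_key_t key;
    while (pthread_key_create(&key, nullptr) == 0) {
        keys.push_back(key);
    }
    // Give the keys back so the probe does not itself exhaust the budget.
    for (pthread_key_t k : keys) {
        pthread_key_delete(k);
    }
    return static_cast<long>(keys.size());
}
```

The returned value is the remaining budget, not the platform maximum: keys already taken by the runtime or by loaded libraries are not counted.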

The fix replaces static thread_specific_storage<int> with thread_local bool:

  • Thread-safe: each thread gets its own copy, so it handles free-threading correctly.
  • No TSS key allocation: thread_local uses compiler/OS TLS storage (e.g., the ELF TLS segment on Linux, the same mechanism behind __thread), which has effectively unlimited capacity.
  • Subinterpreter sharing is benign: the guard only prevents recursive implicit conversions on the same thread — the correct behavior regardless of which interpreter is active.
  • Trivially destructible: bool does not require __cxa_thread_atexit runtime support, so it works on all C++11 platforms including older macOS targets. (This was the concern that motivated using TSS in #5777, "fix: make implicitly_convertable sub-interpreter and free-threading safe", but it only applies to types with non-trivial destructors.)
  • Precedent: the v3.0.0 code already used thread_local bool under #ifdef Py_GIL_DISABLED.
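
The per-thread and trivially-destructible properties listed above can be checked with a small sketch. The function names `currently_used` and `callers_flag_after_worker_sets_own` are hypothetical and only stand in for the guard variable inside `implicitly_convertible()`.

```cpp
#include <thread>
#include <type_traits>

// bool has a trivial destructor, so a thread_local bool needs no
// thread-exit destructor registration.
static_assert(std::is_trivially_destructible<bool>::value,
              "bool must be trivially destructible");

// Hypothetical accessor for a per-thread flag.
inline bool &currently_used() {
    thread_local bool flag = false; // compiler/OS TLS: no TSS key is allocated
    return flag;
}

// Sets the flag on a worker thread, then returns the calling thread's copy.
inline bool callers_flag_after_worker_sets_own() {
    std::thread worker([] { currently_used() = true; }); // worker's copy only
    worker.join();
    return currently_used(); // the caller's copy was never written
}
```

The calling thread's flag stays `false` because the worker wrote only to its own copy.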

The non-copyable/non-movable set_flag RAII guard (added in #5777) is retained but simplified back to wrapping bool&.
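
A simplified sketch of such a guard wrapping `bool&` is shown below. It is modeled on the description above rather than copied from the pybind11 source, and `try_convert` is a hypothetical stand-in for the implicit caster that has no Python dependency.

```cpp
// RAII helper: sets the flag on construction, clears it on destruction,
// and cannot be copied or moved.
struct set_flag {
    bool &flag;
    explicit set_flag(bool &flag_) : flag(flag_) { flag = true; }
    ~set_flag() { flag = false; }
    set_flag(const set_flag &) = delete;
    set_flag(set_flag &&) = delete;
    set_flag &operator=(const set_flag &) = delete;
    set_flag &operator=(set_flag &&) = delete;
};

// Hypothetical stand-in for the implicit caster. Returns how many nested
// conversion attempts actually ran on this thread.
template <typename InputType, typename OutputType>
int try_convert(bool recurse) {
    thread_local bool currently_used = false; // one flag per type pair and thread
    if (currently_used) {
        return 0; // reentrant call on this thread: refuse
    }
    set_flag guard(currently_used);
    int ran = 1;
    if (recurse) {
        ran += try_convert<InputType, OutputType>(true); // sees the flag, returns 0
    }
    return ran; // guard's destructor clears the flag here
}
```

Because the destructor resets the flag, a later top-level call on the same thread runs normally again; only the nested attempt is rejected.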

Suggested changelog entry:

  • Fixed TSS key exhaustion when using many implicit conversions across hundreds of modules, caused by per-template-instantiation thread_specific_storage allocation in implicitly_convertible().

Replace `static thread_specific_storage<int>` with `thread_local bool`
in the implicit conversion reentrancy guard. Since implicitly_convertible
is a template function, each unique <InputType, OutputType> pair created
its own TSS key via PyThread_tss_create(). Projects with hundreds of
modules and many implicit conversions could exhaust PTHREAD_KEYS_MAX
(1024 on Linux, 512 on macOS), especially on Python 3.12+ where CPython
itself consumes more TSS keys for subinterpreter support.

thread_local bool is safe here because:
- bool is trivially destructible, so it works on all C++11 platforms
  including older macOS (the concern that motivated the TSS approach in
  PR pybind#5777 applied only to types with non-trivial destructors needing
  __cxa_thread_atexit runtime support)
- Each thread gets its own copy, so it is thread-safe for free-threading
- Subinterpreter sharing is benign: the guard prevents recursive implicit
  conversions on the same thread regardless of which interpreter is active
- The v3.0.0 code already used thread_local bool under Py_GIL_DISABLED

This effectively reverts the core change from PR pybind#5777 while keeping
the non-copyable/non-movable set_flag guard.

Made-with: Cursor
@rwgk rwgk requested a review from b-pass March 29, 2026 05:29

@b-pass b-pass left a comment

Looks good to me.


rwgk commented Mar 29, 2026

Thanks @b-pass

@rwgk rwgk merged commit 70b6fd3 into pybind:master Mar 29, 2026
89 checks passed
@github-actions github-actions bot added the "needs changelog" label (Possibly needs a changelog entry) Mar 29, 2026
@rwgk rwgk deleted the implicitly_convertible_back_to_thread_local branch March 29, 2026 15:03
rwgk added a commit to rwgk/pybind11 that referenced this pull request Mar 30, 2026
…ybind#6020)

@rwgk rwgk removed the "needs changelog" label (Possibly needs a changelog entry) Mar 31, 2026

Development

Successfully merging this pull request may close these issues.

[BUG]: Updating to 3.0.1 causes TSS initialization error on Import
