Fix TSS key exhaustion in `implicitly_convertible()` by rwgk · Pull Request #6020 · pybind/pybind11

rwgk · 2026-03-28T20:31:31Z

Description

PR #5777 (included in v3.0.1) replaced the static bool / thread_local bool reentrancy guard in implicitly_convertible() with static thread_specific_storage<int> to make it sub-interpreter and free-threading safe. Because implicitly_convertible is a template function parametrized on <InputType, OutputType>, every unique type pair instantiation creates its own static thread_specific_storage, each of which allocates a TSS key via PyThread_tss_create().
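To illustrate why the key count scales with the number of type pairs, here is a minimal standalone sketch. `counted_storage` and `register_conversion` are hypothetical stand-ins (not pybind11 names): the first counts constructions where the real `thread_specific_storage` would allocate a TSS key, the second only mirrors the template shape of `implicitly_convertible`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for thread_specific_storage: it counts constructions,
// where the real class would allocate one TSS key per object.
struct counted_storage {
    static std::size_t &live_keys() {
        static std::size_t n = 0;
        return n;
    }
    counted_storage() { ++live_keys(); }
};

// Mirrors the shape of implicitly_convertible<InputType, OutputType>():
// the function-local static is a distinct object for every instantiation.
template <typename InputType, typename OutputType>
void register_conversion() {
    static counted_storage guard;
    (void) guard;
}

// Registers three distinct type pairs (one of them twice) and returns
// how many storage objects were constructed in total.
inline std::size_t demo_key_count() {
    register_conversion<int, double>();
    register_conversion<int, double>(); // same instantiation, no new storage
    register_conversion<double, int>();
    register_conversion<int, long>();
    return counted_storage::live_keys();
}
```

`demo_key_count()` returns 3: one storage object per distinct `<InputType, OutputType>` pair, independent of how often each pair is registered.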

Projects with hundreds of modules and many implicit conversions (the reporters have 590+ and 1000+ PYBIND11_MODULEs) exhaust the OS TSS key limit (PTHREAD_KEYS_MAX: 1024 on Linux, 512 on macOS). The problem manifests on Python 3.12+ because CPython itself consumes more TSS keys for subinterpreter support, reducing the budget available to user code.
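For anyone who wants to check the remaining key budget on their own platform, the following POSIX-only sketch probes it directly. It assumes that `PyThread_tss_create()` draws from the same pool as `pthread_key_create()` on POSIX systems, which is my understanding of CPython's pthread backend; `count_available_tss_keys` and `kSafetyCap` are illustrative names, not part of pybind11 or CPython.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstddef>
#include <vector>

// Creates keys until pthread_key_create() fails (or an arbitrary safety cap
// is reached), then deletes them all again. Returns how many keys the
// calling process could still allocate at the time of the call.
inline std::size_t count_available_tss_keys() {
    const std::size_t kSafetyCap = 1u << 16; // arbitrary upper bound for the probe
    std::vector<pthread_key_t> keys;
    pthread_key_t key;
    while (keys.size() < kSafetyCap && pthread_key_create(&key, nullptr) == 0) {
        keys.push_back(key);
    }
    for (pthread_key_t k : keys) {
        pthread_key_delete(k); // release the probe keys
    }
    return keys.size();
}
```

Running this inside a process that has already imported the affected modules should show how close the process is to the limit; the exact number depends on the libc and on how many keys the interpreter and other libraries already hold.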

The fix replaces static thread_specific_storage<int> with thread_local bool:

- Thread-safe: each thread gets its own copy, so it handles free-threading correctly.
- No TSS key allocation: thread_local uses compiler/OS TLS storage (e.g., __thread segment on Linux), which has effectively unlimited capacity.
- Subinterpreter sharing is benign: the guard only prevents recursive implicit conversions on the same thread, which is the correct behavior regardless of which interpreter is active.
- Trivially destructible: bool does not require __cxa_thread_atexit runtime support, so it works on all C++11 platforms including older macOS targets. (This was the concern that motivated using TSS in #5777, "fix: make implicitly_convertable sub-interpreter and free-threading safe", but it only applies to types with non-trivial destructors.)
- Precedent: the v3.0.0 code already used thread_local bool under #ifdef Py_GIL_DISABLED.

The non-copyable/non-movable set_flag RAII guard (added in #5777) is retained but simplified back to wrapping bool&.
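A simplified, pybind11-free sketch of the resulting guard pattern is shown below. `try_convert` and its parameters are hypothetical and exist only to exercise the guard; the actual caster lives inside `implicitly_convertible()` and operates on Python objects.

```cpp
#include <cassert>

// RAII helper: sets the flag on construction and clears it on scope exit.
// Copy and move are deleted so the flag cannot be cleared twice.
struct set_flag {
    bool &flag;
    explicit set_flag(bool &flag_) : flag(flag_) { flag = true; }
    ~set_flag() { flag = false; }
    set_flag(const set_flag &) = delete;
    set_flag(set_flag &&) = delete;
    set_flag &operator=(const set_flag &) = delete;
    set_flag &operator=(set_flag &&) = delete;
};

// Hypothetical stand-in for the implicit caster. Returns false when it is
// re-entered on the same thread, true otherwise. If attempt_nested is set,
// it calls itself once and reports that nested result through nested_ok.
template <typename InputType, typename OutputType>
bool try_convert(bool attempt_nested, bool *nested_ok = nullptr) {
    static thread_local bool currently_used = false; // one flag per thread, no TSS key
    if (currently_used) {
        return false; // recursive conversion on this thread: refuse
    }
    set_flag guard(currently_used);
    if (attempt_nested && nested_ok != nullptr) {
        *nested_ok = try_convert<InputType, OutputType>(false);
    }
    return true;
}
```

With this sketch, an outer `try_convert<int, double>(true, &nested)` succeeds while the nested call is refused, and a later top-level call succeeds again because the guard's destructor has reset the flag.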

Suggested changelog entry:

Fixed TSS key exhaustion when using many implicit conversions across hundreds of modules, caused by per-template-instantiation thread_specific_storage allocation in implicitly_convertible().

Replace `static thread_specific_storage<int>` with `thread_local bool` in the implicit conversion reentrancy guard.

Since implicitly_convertible is a template function, each unique <InputType, OutputType> pair created its own TSS key via PyThread_tss_create(). Projects with hundreds of modules and many implicit conversions could exhaust PTHREAD_KEYS_MAX (1024 on Linux, 512 on macOS), especially on Python 3.12+ where CPython itself consumes more TSS keys for subinterpreter support.

thread_local bool is safe here because:

- bool is trivially destructible, so it works on all C++11 platforms including older macOS (the concern that motivated the TSS approach in PR pybind#5777 applied only to types with non-trivial destructors needing __cxa_thread_atexit runtime support)
- Each thread gets its own copy, so it is thread-safe for free-threading
- Subinterpreter sharing is benign: the guard prevents recursive implicit conversions on the same thread regardless of which interpreter is active
- The v3.0.0 code already used thread_local bool under Py_GIL_DISABLED

This effectively reverts the core change from PR pybind#5777 while keeping the non-copyable/non-movable set_flag guard.

Made-with: Cursor

b-pass

Looks good to me.

rwgk · 2026-03-29T15:03:26Z

Thanks @b-pass

rwgk requested a review from b-pass March 29, 2026 05:29

b-pass approved these changes Mar 29, 2026

rwgk merged commit 70b6fd3 into pybind:master Mar 29, 2026
89 checks passed

github-actions bot added the "needs changelog" (Possibly needs a changelog entry) label Mar 29, 2026

rwgk deleted the implicitly_convertible_back_to_thread_local branch March 29, 2026 15:03

BrewTestBot mentioned this pull request Mar 31, 2026

pybind11 3.0.3 Homebrew/homebrew-core#275433

Merged

rwgk removed the "needs changelog" (Possibly needs a changelog entry) label Mar 31, 2026
