Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505

Skylion007 · 2023-02-08T16:11:11Z

Description

Fix #4459 without introducing the segfault discovered in #4500. This is a nasty hotfix, but it should support both use cases without crashing.

This segfault is due to some nasty behavior with the side effects of get_local_internals which currently relies on UB behavior. Full writeup can be found here: #4500 (comment)

Suggested changelog entry:

* Fix regression that prevented using atexit callbacks with cpp registered types in an embedded interpreter.

Skylion007 · 2023-02-08T16:57:33Z

Hmm, looks like some compilers are being a bit too clever and delaying the initialization of the static variable or otherwise reordering it when optimizations are enabled. This does unfortunately complicate the fix a bit. Thoughts @rwgk?

rwgk · 2023-02-09T01:58:42Z

The failing test is Pass classes and data between modules defined in C++ and Python. Maybe one module still points to data in the other?

I spent 10ish minutes looking around but still feel pretty clueless.

Wild idea, potentially silly: What if we clear only before we initialize a new interpreter, e.g. near the top of initialize_interpreter()?

Skylion007 · 2023-02-09T15:16:04Z

The failing test is Pass classes and data between modules defined in C++ and Python. Maybe one module still points to data in the other?

I spent 10ish minutes looking around but still feel pretty clueless.

Wild idea, potentially silly: What if we clear only before we initialize a new interpreter, e.g. near the top of initialize_interpreter()?

Same issue, we can't call get_local_internals without calling get_internals which requires a GIL lock and thread local storage. The GIL can only be locked if the Python interpreter is initialized.

Skylion007 · 2023-02-09T15:42:52Z

The failing test is Pass classes and data between modules defined in C++ and Python. Maybe one module still points to data in the other?

I spent 10ish minutes looking around but still feel pretty clueless.

Wild idea, potentially silly: What if we clear only before we initialize a new interpreter, e.g. near the top of initialize_interpreter()?

@rwgk I don't think the error is necessarily specific to this test, I think it's just the first test which access local_internals and therefore initializes it.

Skylion007 · 2023-02-09T15:48:49Z

include/pybind11/embed.h

+    // avoid undefined behaviors when initializing another interpreter
+    local_internals.registered_types_cpp.clear();
+    local_internals.registered_exception_translators.clear();
+
    if (internals_ptr_ptr) {
        delete *internals_ptr_ptr;


@rwgk It just occurred to me, that the reason deleting the internals_ptr_ptr here works is that it effectively leaks the containers in internals since it just deallocates them without calling the dtors. This shouldn't be working here because if the dtors did run, then the capsule dtor for builtins object would be called which requires the CPython API which would segfault. What this is actually doing is just leaking the objects. If we also changed the behavior to improperly swap out pointers to the container, that would work, but it's not a longterm fix.

effectively leaks the containers in internals since it just deallocates them without calling the dtors

Yes, I was aware of that. Just to confirm.

Some background that may or may not matter here, sharing JIC:

Until Python 3.9 or 3.10 (not sure anymore) Python itself was leaking (https://github.com/rwgk/stuff/blob/f6549ecadb46169feb5f3c2456dcd13852382f7a/noddy/embedded_noddy_main_run.c), but with one of those versions that was cleaned up.

Before the cleanup in Python I was thinking: well, there is no point in being clean in pybind11. Therefore I wasn't concerned about any internals leaks.

That thinking has changed now. But who has the free energy to cleanup pybind11? And is it even doable without an ABI break? I don't know.

To come back to the problem here: I wouldn't worry about leaks, still. We "just" need to satisfy the situations of #3744 and #4500 simultaneously, with or without leaks.

Then worry about the leaks later.

Concretely: could we simply replicate the ptr_ptr tricked used by (non-local) internals? I.e. leak the local internals, too?

I think we will need an ABI break at some point.

Concretely: could we simply replicate the ptr_ptr tricked used by (non-local) internals? I.e. leak the local internals, too?

Probably? wouldn't that be an ABI break in itself though?

Probably? wouldn't that be an ABI break in itself though?

I don't think so.

@lalaland for a 2nd opinion.

I think we can change local internals without an ABI break?

I think the bigger issue is that it might be an API break, since adding leaking behavior is explicitly changing the behavior of existing code. And in memory tight situations, that can be quite bad.

I think the bigger issue is that it might be an API break, since adding leaking behavior is explicitly changing the behavior of existing code. And in memory tight situations, that can be quite bad.

Absolutely, it's a rock-and-a-hard place situation.

Ideally someone rolls up their sleeves and wrestles it down properly with whatever it takes.

I cannot do that: embedding has no critical use at Google. The only reason I'm looking is general pybind11 health.

So this here boils down to: what do we want to do given very limited resources?

I would vote for leaking now to fix the segfaults as a quick patch, but actually try to think of a way to eliminate the leak as a more long term project.

There has got to be some way of fixing this in the long term.

Skylion007 · 2023-02-13T20:20:21Z

@Kalakaarboyz1 Sadly, this doesn't work on all compilers currently, so we need to fix that.

rwgk · 2023-03-16T16:47:57Z

Slightly random thoughts & at a random time (from the back of my mind):

We have a conflict between clearing module-local data before (fix: clear local internals after finalizing interpreter #2101 #3744) or after (Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505) the Py_Finalize(); call.
It so happened that the change under fix: clear local internals after finalizing interpreter #2101 #3744 came first. It got in because we don't have any testing involving atexit. IMO in retrospect it was a mistake to merge fix: clear local internals after finalizing interpreter #2101 #3744. But it was merged already more than a year ago, now it's too late to undo it without creating more harm than good.

My ideas for the community of users having a vested interest in multiple interpreter initialize/finalize cycles in a given process:

0a. Just don't do it. Initialize the interpreter only once in a given process, and only finalize it once, when the process terminates.

0b. If you really must have multiple interpreter initialize/finalize cycles, write a full rationale for why it's worth putting up with the trouble. Add to the pybind11 documentation.

Add a test for the atexit situation before doing anything else.
Commit to whatever it takes to make the new test and all existing tests pass.

Skylion007 · 2023-03-16T16:52:15Z

I could be in favor of 0a, atexit seems like a more common use case than multiple interperters. Honestly, it's our slopping handle of get_internals_state, and the fact we can't control when to try to regenerate it. Ideally, we should have seperated the Python Internals State from the C++ one in some way.

Fix pybind#4459 (cpp atexit callbacks) without segfault

7d63557

Skylion007 requested a review from rwgk February 8, 2023 16:11

Skylion007 mentioned this pull request Feb 8, 2023

Show segfault in embed scoped_interpreter test #4500

Closed

Make local_internals non-copyable. Simplify embed.h code.

0684269

Skylion007 requested a review from henryiii February 8, 2023 16:32

Include default ctor for local_internals

86acb12

Skylion007 commented Feb 9, 2023

View reviewed changes

rwgk mentioned this pull request Feb 11, 2023

FWD pybind11 google/pybind11k#4505

Closed

Try hack as optimization fence

91cbc66

Skylion007 mentioned this pull request Mar 16, 2023

Introduce get_python_state_dict() for Python 3.12 compatibility. #4570

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505

Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505

Skylion007 commented Feb 8, 2023 •

edited

Skylion007 commented Feb 8, 2023

rwgk commented Feb 9, 2023 •

edited

Skylion007 commented Feb 9, 2023

Skylion007 commented Feb 9, 2023

Skylion007 Feb 9, 2023

rwgk Feb 9, 2023

Skylion007 Feb 9, 2023 •

edited

rwgk Feb 9, 2023

EthanSteinberg Feb 9, 2023

rwgk Feb 9, 2023

EthanSteinberg Feb 9, 2023

Skylion007 commented Feb 13, 2023

rwgk commented Mar 16, 2023

Skylion007 commented Mar 16, 2023

Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505

Are you sure you want to change the base?

Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505

Conversation

Skylion007 commented Feb 8, 2023 • edited

Description

Suggested changelog entry:

Skylion007 commented Feb 8, 2023

rwgk commented Feb 9, 2023 • edited

Skylion007 commented Feb 9, 2023

Skylion007 commented Feb 9, 2023

Skylion007 Feb 9, 2023

Choose a reason for hiding this comment

rwgk Feb 9, 2023

Choose a reason for hiding this comment

Skylion007 Feb 9, 2023 • edited

Choose a reason for hiding this comment

rwgk Feb 9, 2023

Choose a reason for hiding this comment

EthanSteinberg Feb 9, 2023

Choose a reason for hiding this comment

rwgk Feb 9, 2023

Choose a reason for hiding this comment

EthanSteinberg Feb 9, 2023

Choose a reason for hiding this comment

Skylion007 commented Feb 13, 2023

rwgk commented Mar 16, 2023

Skylion007 commented Mar 16, 2023

Skylion007 commented Feb 8, 2023 •

edited

rwgk commented Feb 9, 2023 •

edited

Skylion007 Feb 9, 2023 •

edited