Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demo: return_value_policy::copy ignored #4591

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Mar 24, 2023

Description

Demo: return_value_policy::copy ignored after a Python object was created already with return_value_policy::reference_internal. Back-ported from pywrapcc test_class_sh_property_stl.cpp,py

Suggested changelog entry:

…:reference_internal` already created a Python object. Back-ported from pywrapcc test_class_sh_property_stl.cpp,py
@rwgk
Copy link
Collaborator Author

rwgk commented Mar 24, 2023

Hi @wjakob, what do you think about the existing behavior? It's not very likely to be a problem in real life, but if it bites, it could be very surprising and extremely difficult to debug, especially in a large system.

See test_cpy_after_ref in test_property_stl.py. To see how that can bite in real life, swap these lines in test_persistent_holder:

    c0 = h.vec_fld_hld_cpy[0]  # Must be first. See test_cpy_after_ref().
    r0 = h.vec_fld_hld_ref[0]  # Must be second.

With those lines swapped the test fails:

Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests":
=========================================================== test session starts ============================================================
platform linux -- Python 3.10.9, pytest-7.1.2, pluggy-1.0.0
C++ Info: Debian Clang 14.0.6 C++17 __pybind11_internals_v4_clang_libstdcpp_cxxabi1002__ PYBIND11_SIMPLE_GIL_MANAGEMENT=False
rootdir: /usr/local/google/home/rwgk/forked/pybind11/tests, configfile: pytest.ini
plugins: xdist-3.0.2, flakefinder-1.1.0
collected 3 items

test_property_stl.py .F.                                                                                                             [100%]

================================================================= FAILURES =================================================================
__________________________________________________________ test_persistent_holder __________________________________________________________

    def test_persistent_holder():
        h = m.VectorFieldHolder()
        r0 = h.vec_fld_hld_ref[0]  # Must be second.
        c0 = h.vec_fld_hld_cpy[0]  # Must be first. See test_cpy_after_ref().
        assert c0.fld.wrapped_int == 300
        assert r0.fld.wrapped_int == 300
        h.reset_at(0, 400)
>       assert c0.fld.wrapped_int == 300
E       assert 400 == 300
E        +  where 400 = <pybind11_tests.property_stl.Field object at 0x7f0e8f7ccaf0>.wrapped_int
E        +    where <pybind11_tests.property_stl.Field object at 0x7f0e8f7ccaf0> = <pybind11_tests.property_stl.FieldHolder object at 0x7f0dca754ff0>.fld

c0         = <pybind11_tests.property_stl.FieldHolder object at 0x7f0dca754ff0>
h          = <pybind11_tests.property_stl.VectorFieldHolder object at 0x7f0dca754f70>
r0         = <pybind11_tests.property_stl.FieldHolder object at 0x7f0dca754ff0>

test_property_stl.py:25: AssertionError
========================================================= short test summary info ==========================================================
FAILED test_property_stl.py::test_persistent_holder - assert 400 == 300
======================================================= 1 failed, 2 passed in 0.12s ========================================================

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 24, 2023

I think (but have not verified) that the existing behavior goes back to this

if (handle registered_inst = find_registered_python_instance(src, tinfo)) {
return registered_inst;
}

pre-empting this

case return_value_policy::copy:
if (copy_constructor) {
valueptr = copy_constructor(src);
} else {
#if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
std::string type_name(tinfo->cpptype->name());
detail::clean_type_id(type_name);
throw cast_error("return_value_policy = copy, but type " + type_name
+ " is non-copyable!");
#else
throw cast_error("return_value_policy = copy, but type is "
"non-copyable! (#define PYBIND11_DETAILED_ERROR_MESSAGES or "
"compile in debug mode for details)");
#endif
}
wrapper->owned = true;
break;

@wjakob
Copy link
Member

wjakob commented Oct 27, 2023

Hi @rwgk — I agree, this is super-dangerous and something I did not think of when coding up the original logic a long time ago. I introduced a change here in nanobind by always prioritizing return_value_policy::copy and return_value_policy::move when the instance is already registered with the binding tool. They return a new object in that case. Can't see any major reasons against pybind11 doing the same thing besides existing software potentially having come to rely on the current bug-prone behavior.

This goes deeper by the way: what about the other return value policies, when returning an object that is already registered with the binding tool? All of them are currently ignored.

  • return_value_policy::reference: ignored, and that seems just fine.

  • return_value_policy::take_ownership: no ownership will be taken. But should it? Perhaps the previous version of the object was returned via return_value_policy::reference.

  • return_value_policy::reference_internal: This entangles the lifetime of the parent and child. If the instance already exists, then that doesn't happen. Might be problematic.

I don't do anything special for these other cases in nanobind, because I wasn't 100% sure what to do about them. It would be be good to decide on a way of resolving all of these situations, and then implementing & documenting them consistently in both binding tools.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 27, 2023

Thanks Wenzel! That's very useful background information.

I don't do anything special for these other cases in nanobind, because I wasn't 100% sure what to do about them.

What I usually do in this situation is experimenting: run global testing with different ideas. Reviewing the results usually gives me a much more certainty than going with just thinking and guessing.

It might take me a while to get back here. The problem didn't come up repeatedly, and we're down to a very tiny fraction of failing tests, therefore I came to look at it as a fringe case. But it would definitely be good to get rid of this trap.

If you or anyone wants to try things out (changes in pybind11), please let me know. I can offer running global tests for experimentation. It's not a lot of work for me (only for the machines).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants