Issue with forked process, spawn works fine #113

Open
lhjnilsson opened this issue Dec 18, 2022 · 1 comment

@lhjnilsson

Hey,

I have an issue where pynng crashes with a large core dump if I initialize a socket after a process has been started.
This works on Mac but fails on Ubuntu Linux for me (GitHub Actions and a Docker container on a Mac M1).

My guess is that this comes down to fork vs. spawn: when starting a new process in Python, Mac uses spawn by default and Linux uses fork. See https://docs.python.org/3/library/multiprocessing.html (Contexts and start methods).
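To check this guess on a given machine, the start method can be inspected directly. The following is a minimal sketch using only the standard library; the helper names `start_method_report` and `spawn_context` are made up for illustration and are not part of pynng or the linked project.

```python
import multiprocessing as mp


def start_method_report():
    """Summarise which multiprocessing start methods this platform offers."""
    return {
        # The method a bare multiprocessing.Process() uses on this platform.
        "default": mp.get_start_method(),
        # Every method supported here, e.g. fork/spawn/forkserver on Linux.
        "available": mp.get_all_start_methods(),
    }


def spawn_context():
    """Return a context whose Process class always uses 'spawn'.

    Unlike mp.set_start_method(), this does not change the global default.
    """
    return mp.get_context("spawn")


if __name__ == "__main__":
    print(start_method_report())
    print(spawn_context().get_start_method())
```

Running this on both the Mac and the Linux container should show whether the two environments really differ in their default.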

My scenario is "Workers" running as processes, each with 3 sockets: one Respondent0 to respond to configuration changes, one Pub0 to publish state (e.g. ready), and one Rep0 through which it fetches its workload from a remote target; in my case the workload is OHLC (Open High Low Close) stock data.
The "WorkerPool" holds the Surveyor0, which sends configurations, and listens on a Sub0 for state changes.

Code: https://github.com/quantfamily/python/blob/fddc8c4b64c0696c02467ece0c14c1a7679bbae4/src/client/foreverbull/worker/worker.py
Test:
https://github.com/quantfamily/python/blob/fddc8c4b64c0696c02467ece0c14c1a7679bbae4/src/client/tests/worker/test_worker.py

Outcome:
../../tests/test_simply.py::test_new_pool panic: nng is not fork-reentrant safe
This message is indicative of a BUG.
Report this at https://github.com/nanomsg/nng/issues
panic: nng is not fork-reentrant safe
This message is indicative of a BUG.
Report this at https://github.com/nanomsg/nng/issues
/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(nni_panic+0xec) [0x400435040c]
/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(nni_panic+0xec) [0x400435040c]/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(nni_plat_init+0x12e) [0x400435b4be]

/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(nni_proto_open+0x1a) [0x4004350d9a]
/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(+0x430ff) [0x400430e0ff]
/usr/local/lib/libpython3.10.so.1.0(+0x15507a) [0x400199207a]
/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(nni_plat_init+0x12e) [0x400435b4be]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]

/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(nni_proto_open+0x1a) [0x4004350d9a]/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]

/usr/local/lib/python3.10/site-packages/pynng/_nng.abi3.so(+0x430ff) [0x400430e0ff]/usr/local/lib/libpython3.10.so.1.0(_PyObject_FastCallDictTstate+0x16d) [0x400198abcd]

/usr/local/lib/libpython3.10.so.1.0(+0x15507a) [0x400199207a]/usr/local/lib/libpython3.10.so.1.0(+0x15ef35) [0x400199bf35]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x273) [0x400198ba23]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_FastCallDictTstate+0x16d) [0x400198abcd]
/usr/local/lib/libpython3.10.so.1.0(+0x15ef35) [0x400199bf35]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x273) [0x400198ba23]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x582d) [0x4001986b9d]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x582d) [0x4001986b9d]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]

/usr/local/lib/libpython3.10.so.1.0(+0x161fe2) [0x400199efe2]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x1300) [0x4001982670]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_FastCallDictTstate+0xc8) [0x400198ab28]
/usr/local/lib/libpython3.10.so.1.0(+0x15eb42) [0x400199bb42]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x273) [0x400198ba23]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x50db) [0x400198644b]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(+0x161fe2) [0x400199efe2]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x1300) [0x4001982670]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(_PyObject_FastCallDictTstate+0xc8) [0x400198ab28]

/usr/local/lib/libpython3.10.so.1.0(PyObject_Call+0xb1) [0x400199fc71]/usr/local/lib/libpython3.10.so.1.0(+0x15ef35) [0x400199bf35]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x29f6) [0x4001983d66]/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x273) [0x400198ba23]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x50db) [0x400198644b]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x29f6) [0x4001983d66]/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]
/usr/local/lib/libpython3.10.so.1.0(PyObject_Call+0xb1) [0x400199fc71]

/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x29f6) [0x4001983d66]/usr/local/lib/libpython3.10.so.1.0(+0x161fe2) [0x400199efe2]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x29f6) [0x4001983d66]/usr/local/lib/libpython3.10.so.1.0(_PyObject_FastCallDictTstate+0x16d) [0x400198abcd]

/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]
/usr/local/lib/libpython3.10.so.1.0(+0x161fe2) [0x400199efe2]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4bab) [0x4001985f1b]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_FastCallDictTstate+0x16d) [0x400198abcd]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_Call_Prepend+0x4c) [0x400199c83c]
/usr/local/lib/libpython3.10.so.1.0(+0x2372ec) [0x4001a742ec]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x363) [0x400198bb13]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x582d) [0x4001986b9d]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x29f6) [0x4001983d66]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_Call_Prepend+0x4c) [0x400199c83c]
/usr/local/lib/libpython3.10.so.1.0(+0x2372ec) [0x4001a742ec]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x363) [0x400198bb13]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x582d) [0x4001986b9d]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6df) [0x4001981a4f]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
/usr/local/lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x29f6) [0x4001983d66]
/usr/local/lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x78) [0x4001992aa8]
Fatal Python error: Aborted

Current thread 0x0000004001f3a4c0 (most recent call first):
File "/usr/local/lib/python3.10/site-packages/pynng/nng.py", line 315 in __init__
File "/build/client/foreverbull/worker/worker.py", line 50 in run
File "/usr/local/lib/python3.10/multiprocessing/process.py", line 314 in _bootstrap
File "/usr/local/lib/python3.10/multiprocessing/popen_fork.py", line 71 in _launch
File "/usr/local/lib/python3.10/multiprocessing/popen_fork.py", line 19 in __init__
File "/usr/local/lib/python3.10/multiprocessing/context.py", line 281 in _Popen
File "/usr/local/lib/python3.10/multiprocessing/context.py", line 224 in _Popen
File "/usr/local/lib/python3.10/multiprocessing/process.py", line 121 in start
File "/build/client/foreverbull/worker/worker.py", line 145 in setup
File "/tests/test_simply.py", line 85 in test_new_pool
File "/usr/local/lib/python3.10/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/python.py", line 1641 in runtest
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 255 in <lambda>
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 311 in from_call
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 215 in call_and_report
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 126 in runtestprotocol
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 323 in _main
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 269 in wrap_session
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 162 in main
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 185 in console_main
File "/usr/local/lib/python3.10/site-packages/pytest/__main__.py", line 5 in <module>
File "/usr/local/lib/python3.10/runpy.py", line 86 in _run_code
File "/usr/local/lib/python3.10/runpy.py", line 196 in _run_module_as_main
Fatal Python error: Aborted

Current thread 0x0000004001f3a4c0 (most recent call first):
File "/usr/local/lib/python3.10/site-packages/pynng/nng.py", line 315 in __init__
File "/build/client/foreverbull/worker/worker.py", line 50 in run
File "/usr/local/lib/python3.10/multiprocessing/process.py", line 314 in _bootstrap
File "/usr/local/lib/python3.10/multiprocessing/popen_fork.py", line 71 in _launch
File "/usr/local/lib/python3.10/multiprocessing/popen_fork.py", line 19 in __init__
File "/usr/local/lib/python3.10/multiprocessing/context.py", line 281 in _Popen
File "/usr/local/lib/python3.10/multiprocessing/context.py", line 224 in _Popen
File "/usr/local/lib/python3.10/multiprocessing/process.py", line 121 in start
File "/build/client/foreverbull/worker/worker.py", line 145 in setup
File "/tests/test_simply.py", line 85 in test_new_pool
File "/usr/local/lib/python3.10/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/python.py", line 1641 in runtest
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 255 in <lambda>
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 311 in from_call
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 215 in call_and_report
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 126 in runtestprotocol
File "/usr/local/lib/python3.10/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 323 in _main
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 269 in wrap_session
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in call
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 162 in main
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 185 in console_main
File "/usr/local/lib/python3.10/site-packages/pytest/__main__.py", line 5 in <module>
File "/usr/local/lib/python3.10/runpy.py", line 86 in _run_code
File "/usr/local/lib/python3.10/runpy.py", line 196 in _run_module_as_main

@gdamore

gdamore commented Jan 18, 2024

So, not too sure what Python or pynng is doing under the hood -- maybe it is trying to be clever -- but nng is not fork-reentrant safe, because it's multithreaded, and it's almost impossible to write fork-reentrant safe code once you start using mutexes. (The POSIX group really screwed up here.)

If you use posix_spawn(), or if you do your forking before you initialize any NNG sockets, it should be fine. Fork is also fine, if you do so using whatever flags are needed to ensure that threads are not duplicated, and follow that up with an exec().
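On the Python side, one way to follow this advice might be to launch workers from a "spawn" context, so each child starts in a fresh interpreter and never inherits NNG state from the parent. This is a rough sketch, not the project's actual worker: `worker_main`, `build_worker`, the process name and the single request/reply exchange are all hypothetical, and it assumes pynng's `Rep0(listen=...)` constructor with `recv()`/`send()`.

```python
import multiprocessing as mp


def worker_main(address):
    """Entry point executed inside the child process.

    The socket is opened here, in the child. Under "spawn" the child is a
    fresh interpreter, so nothing NNG-related comes from the parent.
    The address and the one-shot exchange are placeholders.
    """
    import pynng  # imported in the child; third-party dependency

    with pynng.Rep0(listen=address) as sock:
        request = sock.recv()
        sock.send(b"ack:" + request)


def build_worker(address, method="spawn"):
    """Create (but do not start) a worker that uses the given start method."""
    ctx = mp.get_context(method)
    return ctx.Process(target=worker_main, args=(address,), name="nng-worker")
```

A caller would then run something like `build_worker("ipc:///tmp/worker.ipc").start()` (the address is made up) from under an `if __name__ == "__main__":` guard, which "spawn" needs because the child re-imports the main module. Note that merely opening sockets inside the child is not enough under "fork" if the parent has already opened an NNG socket; that is the situation in the traceback above.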

You simply cannot allow the instance of NNG started in a parent to be used in a child. It won't work. The panic here is a safety to let users know that they are trying to do something that won't work reliably, if ever, and I don't want to debug someone's messed up ("forked up") application if they try to do this.
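An application can apply the same kind of safety check to its own objects. The sketch below is an application-level illustration only, not how NNG implements its panic: the hypothetical `ProcessBoundResource` wrapper records the pid that created a resource and refuses to hand it out in any other process.

```python
import os


class ProcessBoundResource:
    """Wrap a resource so it refuses to be used from a forked child."""

    def __init__(self, resource):
        self._resource = resource
        self._owner_pid = os.getpid()  # pid of the process that created it

    def get(self):
        """Return the resource, or raise if we are not the creating process."""
        if os.getpid() != self._owner_pid:
            raise RuntimeError(
                f"resource created in pid {self._owner_pid} "
                f"used in pid {os.getpid()}; create a new one in this process"
            )
        return self._resource
```

Wrapping a socket this way turns accidental reuse in a forked child into an ordinary Python exception. It would not stop NNG's own panic when a forked child opens a new socket after the parent has already initialized NNG; for that, the start method itself has to change.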
