Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPU pegged at 100% and sched_yield() called in infinite loop when OpenMP enabled #666

Open
caliguian opened this issue Mar 22, 2024 · 3 comments

Comments

@caliguian
Copy link

caliguian commented Mar 22, 2024

We have been using OpenMP with Imagick for as long as I can remember (many years), but after our system was recently updated to PHP 8.1+ and ImageMagick 7.1.1-26, we started experiencing random instances of processes getting locked at 100% CPU usage with truss revealing that sched_yield(); is being called in an infinite loop on the locked processes. This appears to happen, on seemingly random occasions, when the Imagick and OpenMP modules initially load. Backtrace details for an example process that was stuck in a sched_yield() infinite loop can be seen below:

(gdb) thread apply all bt

Thread 1 (LWP 100499 of process 33248):
#0  sched_yield () at sched_yield.S:4
#1  0x00000008495aa491 in __kmp_wait_4_ptr (spinner=0x84960b51c <__kmp_initz_lock+28>, checker=1, pred=0x8495c4b50 <__kmp_bakery_check(void*, unsigned int)>, obj=0x84960b500 <__kmp_initz_lock>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_dispatch.cpp:2695
#2  0x00000008495c37c8 in __kmp_acquire_ticket_lock_timed_template (lck=0x0, gtid=<optimized out>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_lock.cpp:640
#3  __kmp_acquire_ticket_lock (lck=0x0, gtid=<optimized out>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_lock.cpp:645
#4  0x00000008495c9564 in __kmp_acquire_bootstrap_lock (lck=0x84960b51c <__kmp_initz_lock+28>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_lock.h:524
#5  __kmp_get_global_thread_id_reg () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:253
#6  0x000000084958991e in __kmp_entry_thread () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp.h:3401
#7  kmpc_malloc (size=35590812956) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_alloc.cpp:1098
#8  0x00000008496041b7 in __kmp_is_address_mapped (addr=0x847bad5c8) at /usr/src/contrib/llvm-project/openmp/runtime/src/z_Linux_util.cpp:2067
#9  0x00000008495d3925 in __kmp_register_library_startup () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6774
#10 0x00000008495c95c4 in __kmp_do_serial_initialize () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6966
#11 0x00000008495c9578 in __kmp_get_global_thread_id_reg () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:255
#12 0x00000008495b40b2 in __kmp_api_omp_init_lock (user_lock=0x84960b51c <__kmp_initz_lock+28>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_ftn_entry.h:1083
#13 0x000000084b24773c in MagickCoreGenesis () from /usr/local/lib/libMagickCore-7.so.10
#14 0x000000084a26bd8e in zm_startup_imagick () from /usr/local/lib/php/20230831-debug/imagick.so
#15 0x000000000083b010 in zend_startup_module_ex ()
#16 0x000000000083bb10 in ?? ()
#17 0x0000000000853131 in zend_hash_apply ()
#18 0x000000000083b90c in zend_startup_modules ()
#19 0x0000000000766b0f in php_module_startup ()
#20 0x00000000009f51e9 in ?? ()
#21 0x00000000009f37a4 in ?? ()
#22 0x0000000000423720 in _start ()
(gdb)

Trace info for a second process that also locked at the same time (pid 33247 vs 33248) can be seen below as well:

(gdb) thread apply all bt

Thread 1 (LWP 101575 of process 33247):
#0  sched_yield () at sched_yield.S:4
#1  0x00000008456d0491 in __kmp_wait_4_ptr (spinner=0x84573151c <__kmp_initz_lock+28>, checker=1, pred=0x8456eab50 <__kmp_bakery_check(void*, unsigned int)>, obj=0x845731500 <__kmp_initz_lock>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_dispatch.cpp:2695
#2  0x00000008456e97c8 in __kmp_acquire_ticket_lock_timed_template (lck=0x0, gtid=<optimized out>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_lock.cpp:640
#3  __kmp_acquire_ticket_lock (lck=0x0, gtid=<optimized out>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_lock.cpp:645
#4  0x00000008456ef564 in __kmp_acquire_bootstrap_lock (lck=0x84573151c <__kmp_initz_lock+28>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_lock.h:524
#5  __kmp_get_global_thread_id_reg () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:253
#6  0x00000008456af91e in __kmp_entry_thread () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp.h:3401
#7  kmpc_malloc (size=35524908316) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_alloc.cpp:1098
#8  0x000000084572a1b7 in __kmp_is_address_mapped (addr=0x847bad5c8) at /usr/src/contrib/llvm-project/openmp/runtime/src/z_Linux_util.cpp:2067
#9  0x00000008456f9925 in __kmp_register_library_startup () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6774
#10 0x00000008456ef5c4 in __kmp_do_serial_initialize () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6966
#11 0x00000008456ef578 in __kmp_get_global_thread_id_reg () at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_runtime.cpp:255
#12 0x00000008456da0b2 in __kmp_api_omp_init_lock (user_lock=0x84573151c <__kmp_initz_lock+28>) at /usr/src/contrib/llvm-project/openmp/runtime/src/kmp_ftn_entry.h:1083
#13 0x0000000845bb773c in MagickCoreGenesis () from /usr/local/lib/libMagickCore-7.so.10
#14 0x00000008451f4d8e in zm_startup_imagick () from /usr/local/lib/php/20230831-debug/imagick.so
#15 0x000000000083b010 in zend_startup_module_ex ()
#16 0x000000000083bb10 in ?? ()
#17 0x0000000000853131 in zend_hash_apply ()
#18 0x000000000083b90c in zend_startup_modules ()
#19 0x0000000000766b0f in php_module_startup ()
#20 0x00000000009f51e9 in ?? ()
#21 0x00000000009f37a4 in ?? ()
#22 0x0000000000423720 in _start ()
(gdb)

We have tried updating from PHP 8.1 to 8.3, and we have tried rebuilding ImageMagick with a few different options disabled, but the problem persists. Unfortunately, we cannot simply disable OpenMP support, since doing so results in a significant slowdown of the convert processes.

Additional details: Our system is currently running FreeBSD 13.2-RELEASE-p9.

@Danack
Copy link
Collaborator

Danack commented Mar 22, 2024

This is likely to be one of those bugs that has a simple cause, but only shows itself in a limited set of circumstances.

I'd suggest opening an issue at https://github.com/llvm/llvm-project and asking for advice there, as I have no real idea of how OpenMP even works, and they might have an idea of where to even start poking.

Can you say which sapi of PHP you're using? e.g. PHP-fpm, apache module etc.

Can you say which version of OpenMP you were using previously where it worked more reliably?

It's probably worth trying to add a hack to see if that makes a difference. In the code for PHP_MINIT_FUNCTION(imagick) just before the MagickWandGenesis() call, add something like:

for (i = 0; i < 100; i += 1) {
	usleep(1000);
}

to see if that removes a race condition that 'fixes' the problem.

@Danack
Copy link
Collaborator

Danack commented Mar 22, 2024

Also, the fact that issue #666 is related to OpenMP seems a little bit like a "as the prophecy foretold" situation.

@caliguian
Copy link
Author

Someone else has looked into this a little deeper than I had, and they have created a bug report for it in the LLVM github project: llvm/llvm-project#86684

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants