ganesha crash @lock_entry_dec_ref() #1124

skmprabhu252 · 2024-05-04T19:35:42Z

(gdb) bt
#0  0x00007fa67d71ab8f in raise () from /lib64/libpthread.so.0
#1  0x00007fa67f620e3d in crash_handler (signo=6, info=0x7fa6477ec3f0, ctx=0x7fa6477ec2c0) at /usr/src/debug/gpfs.nfs-ganesha-5.7-ibm018.00.el8.x86_64/MainNFSD/nfs_init.c:256
#2  <signal handler called>
#3  0x00007fa67cf6facf in raise () from /lib64/libc.so.6
#4  0x00007fa67cf42ea5 in abort () from /lib64/libc.so.6
#5  0x00007fa67cf42d79 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
#6  0x00007fa67cf68426 in __assert_fail () from /lib64/libc.so.6
#7  0x00007fa67f65d834 in lock_entry_dec_ref (lock_entry=0x7fa66800d950) at /usr/src/debug/gpfs.nfs-ganesha-5.7-ibm018.00.el8.x86_64/SAL/state_lock.c:650
#8  0x00007fa67f660bd3 in process_blocked_lock_upcall (block_data=0x7fa66800ab20) at /usr/src/debug/gpfs.nfs-ganesha-5.7-ibm018.00.el8.x86_64/SAL/state_lock.c:1822
#9  0x00007fa67f65b010 in state_blocked_lock_caller (ctx=0x7fa658000e30) at /usr/src/debug/gpfs.nfs-ganesha-5.7-ibm018.00.el8.x86_64/SAL/state_async.c:82
#10 0x00007fa67f6a824c in fridgethr_start_routine (arg=0x7fa658000e30) at /usr/src/debug/gpfs.nfs-ganesha-5.7-ibm018.00.el8.x86_64/support/fridgethr.c:486
#11 0x00007fa67d7101ca in start_thread () from /lib64/libpthread.so.0
#12 0x00007fa67cf5ae73 in clone () from /lib64/libc.so.6
(gdb) p *lock_entry
$1 = {sle_list = {next = 0x7fa66800d, prev = 0xc2897b211783a425}, sle_owner_locks = {next = 0x0, prev = 0x0}, sle_client_locks = {next = 0x0, prev = 0x0}, sle_state_locks = {next = 0x0, prev = 0x0},
  sle_export_locks = {next = 0x0, prev = 0x0}, sle_export = 0x1923e80, sle_obj = 0x7fa640003298, sle_block_data = 0x7fa66800ab20, sle_owner = 0x0, sle_state = 0x7fa66800d790, sle_blocked = STATE_CANCELED,
  sle_ref_count = -1, sle_lock = {lock_sle_type = FSAL_POSIX_LOCK, lock_type = FSAL_LOCK_W, lock_start = 0, lock_length = 0, lock_reclaim = false}, sle_mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
      __nusers = 0, __kind = -1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>, __align = 0}}
(gdb)

The problem is that the refcount is negative (sle_ref_count = -1), which trips the assertion in lock_entry_dec_ref() (frame #7, state_lock.c:650).

I suspect one of the scenarios below is causing this issue:

In state_release_grant(), we call free_cookie() with unblock=true even when do_lock_op() does not return success.

In process_blocked_lock_upcall(), we decrement the refcount even when try_to_grant_lock() does not grant the lock. Specifically, when call_back() inside try_to_grant_lock() leaves the lock blocked, try_to_grant_lock() returns early, but process_blocked_lock_upcall() still decrements the refcount.

         status = call_back(lock_entry->sle_obj,
                            lock_entry);

         if (status == STATE_LOCK_BLOCKED) {
                 /* The lock is still blocked, restore it's type and
                  * leave it in the list.
                  */
                 lock_entry->sle_blocked = blocked;
                 lock_entry->sle_block_data->sbd_grant_type =
                                                 STATE_GRANT_NONE;
                 LogEntry("Granting callback left lock still blocked",
                          lock_entry);
                 return;
         }
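To make the second suspected scenario concrete, here is a toy model in C. It is not nfs-ganesha code: `toy_lock_entry`, `toy_dec_ref()` and `toy_upcall()` are hypothetical names, and the model assumes the upcall path holds no reference of its own, which has not been confirmed against the real source. It only shows how an unconditional decrement on a still-blocked entry can leave a later, legitimate release at -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a lock entry; NOT the real state_lock_entry_t. */
struct toy_lock_entry {
	int32_t ref_count;
};

enum toy_status { TOY_GRANTED, TOY_STILL_BLOCKED };

/* Drop one reference and return the new count (no assert, so the
 * negative value can be observed instead of aborting). */
static int32_t toy_dec_ref(struct toy_lock_entry *e)
{
	e->ref_count--;
	return e->ref_count;
}

/* Hypothetical upcall. With release_only_on_success == false it mirrors
 * the suspected behaviour: the reference is dropped even though the
 * grant attempt left the lock blocked. */
static int32_t toy_upcall(struct toy_lock_entry *e, enum toy_status status,
			  bool release_only_on_success)
{
	if (release_only_on_success && status != TOY_GRANTED)
		return e->ref_count;	/* entry stays queued, keep its ref */

	return toy_dec_ref(e);
}
```

Starting from a count of 1 (the reference held by the blocked list, by assumption), an unconditional `toy_upcall()` with `TOY_STILL_BLOCKED` drops the count to 0, and the later removal from the list takes it to -1. With the guard enabled, the same sequence ends at 0.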

The crash is intermittent and hard to reproduce. I encountered it while running the following test case:

Mount the same NFS share twice on the client machine using NFSv3, then run the following processes on both mount points:

Process-1 -> create and delete a file in a loop
Process-2 -> try to acquire a blocking write lock (5 threads)
Process-3 -> try to acquire a blocking read lock (5 threads)
Process-4 -> try to acquire an overlapping byte-range write lock (5 threads)
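The test programs themselves are not included in this report. As a rough sketch of what each locking process might do, the helper below takes a blocking POSIX record lock with `fcntl(F_SETLKW)`; on an NFSv3 mount such requests are carried to the server over NLM. The name `lock_range()` and its parameters are assumptions for illustration. A `start` and `len` of 0 cover the whole file, matching `lock_start = 0, lock_length = 0` in the dump above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: take or release a blocking byte-range lock.
 * type is F_WRLCK, F_RDLCK or F_UNLCK; len == 0 means "to end of file".
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
static int lock_range(int fd, short type, off_t start, off_t len)
{
	struct flock fl = {0};

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = len;

	/* F_SETLKW waits until a conflicting lock is released. */
	return fcntl(fd, F_SETLKW, &fl);
}
```

For example, `lock_range(fd, F_WRLCK, 0, 0)` requests a blocking whole-file write lock and `lock_range(fd, F_WRLCK, 100, 200)` a byte-range one. Note that POSIX record locks are owned per process, so threads inside one process do not block each other; the contention here would come from the separate processes and the two mount points.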


ffilz · 2024-05-06T17:51:24Z

Do you want to try a fix for those issues and see if it makes any difference?

ffilz added the "bug" and "Need Info" (need more information from the reporter) labels on May 6, 2024