Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V5.6: heap-use-after-free: freed during RELEASE_LOCKOWNER, used in subsequent PUTFH (on MacOS), in NFSv4 #1119

Open
drieber opened this issue Apr 16, 2024 · 7 comments

Comments

@drieber
Copy link
Contributor

drieber commented Apr 16, 2024

I am getting an easily reproducible heap-use-after-free with my custom FSAL on MacOS.

Client and server are both on MacOS, the client is the MacOS kernel
I am running with ganesha V5.6
My server only supports NFSv4
My fsal is configured with lock_support=false and lock_support_async_block=false
My fsal does not implement lock_op2
Please notice MacOS kernel sends compound request: PUTFH, RELEASE_LOCKOWNER (I'm not sure if it is really necessary to send PUTFH, but that's what darwin does)

It strongly feels like refcounts gone wrong

==82347==ERROR: AddressSanitizer: heap-use-after-free on address 0x000120e392b0 at pc 0x000102ebd094 bp 0x0003020446c0 sp 0x0003020446b8
READ of size 8 at 0x000120e392b0 thread T51
    #0 0x102ebd090 in urcu_ref_get_unless_zero+0x4c (srcfsd_darwin:arm64+0x102c35090)
    #1 0x102eb5018 in gsh_refstr_get+0x14 (srcfsd_darwin:arm64+0x102c2d018)
    #2 0x102eb7a5c in tmp_get_exp_paths+0x21c (srcfsd_darwin:arm64+0x102c2fa5c)
    #3 0x102eb3540 in _get_gsh_export_ref+0x238 (srcfsd_darwin:arm64+0x102c2b540)
    #4 0x102eb41c8 in get_gsh_export+0xa1c (srcfsd_darwin:arm64+0x102c2c1c8)
    #5 0x102cefb58 in nfs4_mds_putfh+0x994 (srcfsd_darwin:arm64+0x102a67b58)
    #6 0x102cedb48 in nfs4_op_putfh+0x460 (srcfsd_darwin:arm64+0x102a65b48)
    #7 0x102c522ec in process_one_op+0x2460 (srcfsd_darwin:arm64+0x1029ca2ec)
    #8 0x102c57368 in nfs4_Compound+0x2928 (srcfsd_darwin:arm64+0x1029cf368)
    #9 0x102befbe4 in nfs_rpc_process_request+0x7918 (srcfsd_darwin:arm64+0x102967be4)
    #10 0x102bf0214 in nfs_rpc_valid_NFS+0x3cc (srcfsd_darwin:arm64+0x102968214)
    #11 0x103143f10 in svc_vc_decode+0x408 (srcfsd_darwin:arm64+0x102ebbf10)
    #12 0x10313819c in svc_request+0x18c (srcfsd_darwin:arm64+0x102eb019c)
    #13 0x103143998 in svc_vc_recv+0x4a7c (srcfsd_darwin:arm64+0x102ebb998)
    #14 0x103137fe4 in svc_rqst_xprt_task_recv+0x1f4 (srcfsd_darwin:arm64+0x102eaffe4)
    #15 0x10312f784 in svc_rqst_epoll_loop+0xbd4 (srcfsd_darwin:arm64+0x102ea7784)
    #16 0x10314e200 in work_pool_thread+0x8b8 (srcfsd_darwin:arm64+0x102ec6200)
    #17 0x185bc6f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #18 0x185bc1d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

0x000120e392b0 is located 0 bytes inside of 10-byte region [0x000120e392b0,0x000120e392ba)
freed by thread T53 here:
    #0 0x11ea3ece0 in wrap_free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x52ce0)
    #1 0x102f20eb8 in gsh_free+0x14 (srcfsd_darwin:arm64+0x102c98eb8)
    #2 0x102f20e94 in gsh_refstr_release+0x13c (srcfsd_darwin:arm64+0x102c98e94)
    #3 0x102f79be8 in urcu_ref_put+0x3e4 (srcfsd_darwin:arm64+0x102cf1be8)
    #4 0x102f74de0 in gsh_refstr_put+0x1c (srcfsd_darwin:arm64+0x102cecde0)
    #5 0x102f7676c in clear_op_context_export_impl+0x378 (srcfsd_darwin:arm64+0x102cee76c)
    #6 0x102f762b4 in clear_op_context_export+0x20 (srcfsd_darwin:arm64+0x102cee2b4)
    #7 0x102c57514 in nfs4_Compound+0x2ad4 (srcfsd_darwin:arm64+0x1029cf514)
    #8 0x102befbe4 in nfs_rpc_process_request+0x7918 (srcfsd_darwin:arm64+0x102967be4)
    #9 0x102bf0214 in nfs_rpc_valid_NFS+0x3cc (srcfsd_darwin:arm64+0x102968214)
    #10 0x103143f10 in svc_vc_decode+0x408 (srcfsd_darwin:arm64+0x102ebbf10)
    #11 0x10313819c in svc_request+0x18c (srcfsd_darwin:arm64+0x102eb019c)
    #12 0x103143998 in svc_vc_recv+0x4a7c (srcfsd_darwin:arm64+0x102ebb998)
    #13 0x103137fe4 in svc_rqst_xprt_task_recv+0x1f4 (srcfsd_darwin:arm64+0x102eaffe4)
    #14 0x10312f784 in svc_rqst_epoll_loop+0xbd4 (srcfsd_darwin:arm64+0x102ea7784)
    #15 0x10314e200 in work_pool_thread+0x8b8 (srcfsd_darwin:arm64+0x102ec6200)
    #16 0x185bc6f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #17 0x185bc1d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

previously allocated by thread T0 here:
    #0 0x11ea3eba4 in wrap_malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x52ba4)
    #1 0x102f20ce4 in gsh_refstr_alloc+0x18 (srcfsd_darwin:arm64+0x102c98ce4)
    #2 0x102ed99a4 in gsh_refstr_dup+0x24 (srcfsd_darwin:arm64+0x102c519a4)
    #3 0x102ede61c in export_commit_common+0x4844 (srcfsd_darwin:arm64+0x102c5661c)
    #4 0x102ee2630 in export_commit+0x30 (srcfsd_darwin:arm64+0x102c5a630)
    #5 0x10307c534 in proc_block+0x10b4 (srcfsd_darwin:arm64+0x102df4534)
    #6 0x10307d24c in load_config_from_parse+0x79c (srcfsd_darwin:arm64+0x102df524c)
    #7 0x102ec7a2c in ReadExports+0xaec (srcfsd_darwin:arm64+0x102c3fa2c)
    #8 0x102bbc934 in nfs_libmain2+0x164c (srcfsd_darwin:arm64+0x102934934)
    #9 0x10262c00c in devtools_srcfs::SrcfsnMain(devtools_vfs::NfsMainOptions&)+0x388 (srcfsd_darwin:arm64+0x1023a400c)
    #10 0x100a3302c in devtools_srcfs::(anonymous namespace)::Main()+0x1050 (srcfsd_darwin:arm64+0x1007ab02c)
    #11 0x100a2fcc4 in main+0xd64 (srcfsd_darwin:arm64+0x1007a7cc4)
    #12 0x18583e0dc  (<unknown module>)

Thread T51 created by T40 here:
    #0 0x11ea37b10 in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb10)
    #1 0x10314c91c in work_pool_spawn+0xfc (srcfsd_darwin:arm64+0x102ec491c)
    #2 0x10314e00c in work_pool_thread+0x6c4 (srcfsd_darwin:arm64+0x102ec600c)
    #3 0x185bc6f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #4 0x185bc1d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

Thread T40 created by T39 here:
    #0 0x11ea37b10 in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb10)
    #1 0x10314c91c in work_pool_spawn+0xfc (srcfsd_darwin:arm64+0x102ec491c)
    #2 0x10314e00c in work_pool_thread+0x6c4 (srcfsd_darwin:arm64+0x102ec600c)
    #3 0x185bc6f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #4 0x185bc1d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

Thread T39 created by T38 here:
    #0 0x11ea37b10 in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb10)
    #1 0x10314c91c in work_pool_spawn+0xfc (srcfsd_darwin:arm64+0x102ec491c)
    #2 0x10314e00c in work_pool_thread+0x6c4 (srcfsd_darwin:arm64+0x102ec600c)
    #3 0x185bc6f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #4 0x185bc1d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)
Thread T38 created by T0 here:
    #0 0x11ea37b10 in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb10)
    #1 0x10314c91c in work_pool_spawn+0xfc (srcfsd_darwin:arm64+0x102ec491c)
    #2 0x10314c608 in work_pool_init+0xc2c (srcfsd_darwin:arm64+0x102ec4608)
    #3 0x103117484 in svc_init+0xc6c (srcfsd_darwin:arm64+0x102e8f484)
    #4 0x102bdf91c in nfs_Init_svc+0xc70 (srcfsd_darwin:arm64+0x10295791c)
    #5 0x102bb58f8 in nfs_Init+0xe58 (srcfsd_darwin:arm64+0x10292d8f8)
    #6 0x102bb4594 in nfs_start+0x260 (srcfsd_darwin:arm64+0x10292c594)
    #7 0x102bbcb80 in nfs_libmain2+0x1898 (srcfsd_darwin:arm64+0x102934b80)
    #8 0x10262c00c in devtools_srcfs::SrcfsnMain(devtools_vfs::NfsMainOptions&)+0x388 (srcfsd_darwin:arm64+0x1023a400c)
    #9 0x100a3302c in devtools_srcfs::(anonymous namespace)::Main()+0x1050 (srcfsd_darwin:arm64+0x1007ab02c)
    #10 0x100a2fcc4 in main+0xd64 (srcfsd_darwin:arm64+0x1007a7cc4)
    #11 0x18583e0dc  (<unknown module>)

Thread T53 created by T39 here:
    #0 0x11ea37b10 in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb10)
    #1 0x10314c91c in work_pool_spawn+0xfc (srcfsd_darwin:arm64+0x102ec491c)
    #2 0x10314e00c in work_pool_thread+0x6c4 (srcfsd_darwin:arm64+0x102ec600c)
    #3 0x185bc6f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #4 0x185bc1d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

SUMMARY: AddressSanitizer: heap-use-after-free (srcfsd_darwin:arm64+0x102c35090) in urcu_ref_get_unless_zero+0x4c
Shadow bytes around the buggy address:
  0x000120e39000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x000120e39080: fa fa fa fa fa fa 07 fa fa fa 07 fa fa fa 07 fa
  0x000120e39100: fa fa 00 00 fa fa fa fa fa fa fa fa fa fa 00 00
  0x000120e39180: fa fa fa fa fa fa fa fa fa fa 03 fa fa fa fa fa
  0x000120e39200: fa fa 00 02 fa fa fa fa fa fa fa fa fa fa fa fa
=>0x000120e39280: fa fa fd fd fa fa[fd]fd fa fa fa fa fa fa fa fa
  0x000120e39300: fa fa 00 03 fa fa fa fa fa fa fa fa fa fa fa fa
  0x000120e39380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa 02 fa
  0x000120e39400: fa fa 02 fa fa fa 00 02 fa fa 00 02 fa fa 02 fa
  0x000120e39480: fa fa 02 fa fa fa 00 04 fa fa 00 00 fa fa 00 00
  0x000120e39500: fa fa 00 00 fa fa 00 03 fa fa 00 04 fa fa 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==82347==ABORTING

I think what's used-after-free is either op_ctx->ctx_fullpath or op_ctx->ctx_pseudopath (or maybe fullpath / pseudopath?).

I suspect the issue is somewhere in release_lock_owner:
https://github.com/nfs-ganesha/nfs-ganesha/blob/V5.6/src/SAL/nfs4_state.c#L708-L745

@drieber
Copy link
Contributor Author

drieber commented Apr 16, 2024

Here is a snippet from the server logs.
log-snippet.log

@ffilz
Copy link
Member

ffilz commented Apr 16, 2024

I think this will fix the issue: https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/1193187?usp=search

@ffilz
Copy link
Member

ffilz commented Apr 17, 2024

Patch updated. All handling of the gsh_refstr has been clarified. Documentation (comments and function documentation) has now been provided. I also cleaned up so code is used more in common and more clearly.

@drieber
Copy link
Contributor Author

drieber commented Apr 17, 2024

I tested patchset 2 locally and it does fix my issue.

I need to deliver a fix to my users quickly. Even if you submit this fix to V6-dev* today, I am still using V5.6. If you submitted the fix to V5.*, we would still have to carefully import it into our build. With that in mind, I am thinking maybe I submit a local fix on my side while I give our developers more time to review your more extensive fix. So I have this question for you: does this fix look acceptable to you:

--- a/src/FSAL/commonlib.c
+++ b/src/FSAL/commonlib.c
@@ -3043,8 +3043,8 @@ static void set_op_context_export_fsal_n
 						  bool discard_refstr)
 {
 	if (discard_refstr) {
-		gsh_refstr_put(op_ctx->ctx_fullpath);
-		gsh_refstr_put(op_ctx->ctx_pseudopath);
+		if (op_ctx->ctx_fullpath != NULL) gsh_refstr_put(op_ctx->ctx_fullpath);
+		if (op_ctx->ctx_pseudopath != NULL) gsh_refstr_put(op_ctx->ctx_pseudopath);
 	}
 
 	op_ctx->ctx_export = exp;
@@ -3165,6 +3165,8 @@ void save_op_context_export_and_clear(st
 	op_ctx->ctx_export = NULL;
 	op_ctx->fsal_export = NULL;
 	op_ctx->ctx_pnfs_ds = NULL;
+	op_ctx->ctx_fullpath = NULL;
+	op_ctx->ctx_pseudopath = NULL;
 }
 
 void restore_op_context_export(struct saved_export_context *saved)

@ffilz
Copy link
Member

ffilz commented Apr 17, 2024

As a downstream only fix, that should fix the immediate issue. I do encourage changing to the more complete fix ASAP and perhaps only release this temporary fix in a fork, but all of that is up to you how to handle.

@ffilz
Copy link
Member

ffilz commented Apr 17, 2024

Changing to verified based on comment above.

@ffilz
Copy link
Member

ffilz commented Apr 17, 2024

Also, please feel free to add a Verified +1 to the Gerrithub review, and code review +1 if you are comfortable doing so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants