Extra data copy during write #15

Open
asomers opened this issue Sep 6, 2021 · 2 comments

@asomers
Contributor

asomers commented Sep 6, 2021

During a write, fuse3 first copies data from the kernel into userland in Session::dispatch. Then it passes a slice of that buffer to handle_write, which ends up copying the data again into a new Vec. It then passes that data as a slice to Filesystem::write, where it might well be copied again. The same thing happens in setxattr.

Instead, Session::dispatch should read from /dev/fuse using readv into a header-sized buffer and a large data buffer. Then it should pass the data buffer by value to Filesystem::write using a Vec. That would eliminate one data copy, and possibly two, depending on how the file system implements write.
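A minimal sketch of the split read described above, using only the standard library. The names `read_request` and `HEADER_LEN` are hypothetical and not part of fuse3, and the 40-byte header size is an assumption for illustration. On Unix, `File::read_vectored` maps to `readv`, so the same call shape would apply to a `/dev/fuse` handle.

```rust
use std::io::{self, IoSliceMut, Read};

/// Hypothetical fixed header size, assumed here for illustration only.
const HEADER_LEN: usize = 40;

/// Reads one request with a single vectored read into two buffers:
/// a fixed-size header and a separately owned payload `Vec`.
/// The payload can then be passed on by value without another copy.
///
/// Note: this relies on the reader having a real vectored implementation
/// (as `File` and `&[u8]` do). The default `Read::read_vectored` fills only
/// the first non-empty buffer, which would leave the payload empty.
fn read_request<R: Read>(
    reader: &mut R,
    max_payload: usize,
) -> io::Result<([u8; HEADER_LEN], Vec<u8>)> {
    let mut header = [0u8; HEADER_LEN];
    let mut payload = vec![0u8; max_payload];

    // Scope the mutable borrows so both buffers can be used afterwards.
    let n = {
        let mut bufs = [IoSliceMut::new(&mut header), IoSliceMut::new(&mut payload)];
        reader.read_vectored(&mut bufs)?
    };

    if n < HEADER_LEN {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short request"));
    }

    // Keep only the bytes that were actually read into the payload buffer.
    payload.truncate(n - HEADER_LEN);
    Ok((header, payload))
}
```

Calling `read_request(&mut src, 4096)` on a byte slice `src` holding a 40-byte header followed by `b"hello"` yields the header array and a 5-byte payload `Vec` that the caller owns.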

@Sherlock-Holo
Owner

Using writev for replies should avoid a memory copy, since we own both the header buffer and the user data (for example, Filesystem::read returns Bytes).
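As a rough illustration of the reply side, the sketch below sends a header and a data buffer with one vectored write instead of first concatenating them. `send_reply` is a hypothetical helper, not a fuse3 function. On Unix, `File::write_vectored` maps to `writev`.

```rust
use std::io::{self, IoSlice, Write};

/// Writes `header` and `data` with a single vectored write, so the two
/// buffers never have to be copied into one contiguous allocation.
///
/// Note: the default `Write::write_vectored` writes only the first
/// non-empty buffer, so a writer without a real vectored implementation
/// would hit the short-write error below.
fn send_reply<W: Write>(writer: &mut W, header: &[u8], data: &[u8]) -> io::Result<()> {
    let bufs = [IoSlice::new(header), IoSlice::new(data)];
    let written = writer.write_vectored(&bufs)?;

    // A reply is assumed to need delivery in one call, so a partial
    // write is treated as an error rather than retried.
    if written != header.len() + data.len() {
        return Err(io::Error::new(io::ErrorKind::WriteZero, "short vectored write"));
    }
    Ok(())
}
```

With a `Vec<u8>` as the writer, `send_reply(&mut out, b"HDR", b"payload")` leaves `out` holding both parts back to back.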

When reading a FUSE request, we can allocate two buffers: one for the header and the other for the FUSE data. When we receive a write opcode, consider this documented limit:

The max size of write requests from the kernel. The absolute minimum is 4k, FUSE recommends at least 128k, max 16M. The FUSE default is 16M on macOS and 128k on other systems.

The data may be large or small:

  • Small, such as 4K of data: if we pass the data buffer to Filesystem::write, we must allocate a new data buffer (16M in size) for the next request.
  • Large, such as 15M of data: we pass the data buffer to Filesystem::write and then allocate a new data buffer, but this is no different from the status quo.
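One possible way to handle both cases, sketched here as an assumption rather than something proposed in this thread, is to copy small payloads into a right-sized `Vec` and hand over the whole buffer only for large ones. `take_payload` and `COPY_THRESHOLD` are hypothetical names, and the threshold value is arbitrary.

```rust
/// Hypothetical cutoff below which copying is assumed to be cheaper
/// than allocating a fresh full-size buffer.
const COPY_THRESHOLD: usize = 64 * 1024;

/// Extracts the first `len` bytes of `buf` as an owned `Vec`.
/// `len` must not exceed `buf.len()`.
///
/// Small payloads are copied, so the large buffer is kept for reuse.
/// Large payloads are handed over without a copy, and `buf` is replaced
/// with a new zeroed buffer of `capacity` bytes.
fn take_payload(buf: &mut Vec<u8>, len: usize, capacity: usize) -> Vec<u8> {
    if len <= COPY_THRESHOLD {
        // Small case: one small copy, no large reallocation.
        buf[..len].to_vec()
    } else {
        // Large case: move the allocation out and install a replacement.
        let mut owned = std::mem::replace(buf, vec![0u8; capacity]);
        owned.truncate(len);
        owned
    }
}
```

The trade-off is that the small case still pays for one copy, while the large case pays for one allocation per request, which matches the status quo described above.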

In any case, we can first replace read/write with readv/writev, and then find a way to improve the write opcode.

@asomers
Contributor Author

asomers commented Mar 5, 2024

BTW, the maximum size of write that a filesystem will receive is given by the max_write field during FUSE_INIT. So it could be much less than 16M.
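As a small sketch of that point, the data buffer could be sized from the negotiated max_write rather than from the 16M upper bound. `data_buffer_len` is a hypothetical helper, and the clamp bounds are taken from the 4k minimum and 16M maximum quoted earlier in the thread. Whether fuse3 clamps this way is an assumption.

```rust
/// Lower bound quoted in the thread for write request size (4k).
const MIN_MAX_WRITE: usize = 4 * 1024;
/// Upper bound quoted in the thread for write request size (16M).
const MAX_MAX_WRITE: usize = 16 * 1024 * 1024;

/// Returns the data buffer length to allocate for a given `max_write`
/// value from FUSE_INIT, clamped to the quoted bounds (an assumption).
fn data_buffer_len(max_write: u32) -> usize {
    (max_write as usize).clamp(MIN_MAX_WRITE, MAX_MAX_WRITE)
}
```

With a max_write of 128k, for example, each data buffer would be 131072 bytes instead of 16M, which makes reallocating after a handoff much cheaper.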
