Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor Buffer (and possibly bigint?) performance #1973

Open
vekexasia opened this issue Feb 23, 2024 · 5 comments
Open

Poor Buffer (and possibly bigint?) performance #1973

vekexasia opened this issue Feb 23, 2024 · 5 comments

Comments

@vekexasia
Copy link

It looks like passing buffer using napi-rs introduces a lot of computing overhead.

I discovered this while working on a new library hoping to replace the current native C binding. I wanted to benchmark the new code and to my surprise the rust version was a lot slower. The benchmark is basically a for loop to 40000000 in nodejs calling every time the associated fn.

The library I was working on should deal with both BigInts and Buffers so i created a simple .rs lib containing several empty functions whose signature can be seen in the results below:

pub fn bigint_u8refarr(n: BigInt, out: &[u8]) -> (): 129ns/op
pub fn bigint_buffer(n: BigInt, out: Buffer) -> (): 134ns/op
pub fn bigint_only(n: BigInt) -> (): 46ns/op
pub fn buffer_only(out: Buffer) -> (): 98ns/op
pub fn u8refarr_only(out: &[u8]) -> (): 97ns/op
pub fn nothing() -> (): 14ns/op

I then created a .c file, compiled with node-gyp and here the results:

c.noop: 12ns/op
bufferOnly_c: 26ns/op
bufferOnlyWithRef_c: 56ns/op

noop is self-explanatory and is on-par with nothing() in rust.

bufferOnly_c and bufferOnlyWithRef_c are basically the same with the exception that "WithRef" also handles napi_create_reference and napi_delete_reference which i saw is being used in napi-rs in the FromNapiValue trait implementation for Buffer.

anyway. here the code:

napi_value bufferOnly (napi_env env, napi_callback_info info) {
    napi_value argv[1];
    napi_status status;
    size_t argc = 1;
    status = napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
    assert(status == napi_ok);

    size_t byte_width;
    uint8_t* raw_buffer;
    status = napi_get_buffer_info(env, argv[0], (void**) &raw_buffer, &byte_width);
    assert(status == napi_ok);
    napi_value result;
    napi_get_undefined(env, &result);
    return result;
}

napi_value bufferOnlyWithRef (napi_env env, napi_callback_info info) {
    napi_value argv[1];
    napi_status status;
    size_t argc = 1;
    status = napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
    assert(status == napi_ok);
    napi_ref ref;
    status = napi_create_reference(env, argv[0], 1, &ref);
    assert(status == napi_ok);

    size_t byte_width;
    uint8_t* raw_buffer;
    status = napi_get_buffer_info(env, argv[0], (void**) &raw_buffer, &byte_width);
    assert(status == napi_ok);

    // delete ref
    status = napi_delete_reference(env, ref);
    assert(status == napi_ok);

//    return argv[0];
    napi_value result;
    napi_get_undefined(env, &result);
    return result;
}

There is obviously some other overhead in rust implementation that might justify this gap:

bufferOnlyWithRef_c: 56ns/op
pub fn buffer_only(out: Buffer) -> (): 98ns/op

I'm out of ideas on how to optimize this and to be honest i'm not even sure why we want to use napi_create/delete_reference.

Any other insight is more than welcome. If needed i can provide a github repo with some messy code i was playing with.

@vekexasia
Copy link
Author

Also creating/maintaining the Arc seems to be adding a bit of overhead.

@Brooooooklyn
Copy link
Sponsor Member

@vekexasia can you try the code on branch #1975

@vekexasia
Copy link
Author

hey @Brooooooklyn thanks for the reply. I see there is no branch associated to that pr anymore. Any easy way for me to try what's been merged? or do i have to git clone , build and specify the path in cargo.toml

sorry for the dumb question. Quite a bit new to rust.

@Brooooooklyn
Copy link
Sponsor Member

@vekexasia
Copy link
Author

vekexasia commented Feb 24, 2024

Ok i think i managed to test.

here results:

with:

napi = { git = "https://github.com/napi-rs/napi-rs",package = "napi", rev="8ca1967bd896ede46c6afdf1487a198692a07d8b", default-features = false, features = ["napi9"] }
napi-derive = {  git = "https://github.com/napi-rs/napi-rs",package = "napi-derive", rev="8ca1967bd896ede46c6afdf1487a198692a07d8b"}
[c]toBufferBE: 85ns/op
[rs]myToBufferBe: 179ns/op
[rs]myToBufferBe-preallocated: 115ns/op


with

napi = { version = "2.15.4", default-features = false, features = ["napi9"] }
napi-derive = "2.15.3"
[c]toBufferBE: 85ns/op
[rs]myToBufferBe: 183ns/op
[rs]myToBufferBe-preallocated: 113ns/op


custom implementation of buffer

[c]toBufferBE: 85ns/op
[rs]myToBufferBe: 91ns/op
[rs]myToBufferBe-preallocated: 46ns/op

here is my custom impl of Buffer called Buffer2

impl napi::bindgen_prelude::FromNapiValue for Buffer2 {
  unsafe fn from_napi_value(
    env: napi::sys::napi_env,
    napi_val: napi::sys::napi_value,
  ) -> napi::Result<Self> {
    let mut buf = ptr::null_mut();
    let mut len = 0;
    //     napi::check_status!(
    //   unsafe { sys::napi_create_reference(env, napi_val, 1, &mut ref_) },
    //   "Failed to create reference from Buffer"
    // )?;
    napi::check_status!(
      unsafe { napi::sys::napi_get_buffer_info(env, napi_val, &mut buf, &mut len as *mut usize) },
      "Failed to get Buffer pointer and length"
    )?;

    // From the docs of `napi_get_buffer_info`:
    // > [out] data: The underlying data buffer of the node::Buffer. If length is 0, this may be
    // > NULL or any other pointer value.
    //
    // In order to guarantee that `slice::from_raw_parts` is sound, the pointer must be non-null, so
    // let's make sure it always is, even in the case of `napi_get_buffer_info` returning a null
    // ptr.
    let buf = NonNull::new(buf as *mut u8);
    let inner = match buf {
      Some(buf) if len != 0 => buf,
      _ => NonNull::dangling(),
    };

    Ok(Self {
      inner,
      len,
      capacity: len,
    })
  }
}

impl AsRef<[u8]> for Buffer2 {
  fn as_ref(&self) -> &[u8] {
    // SAFETY: the pointer is guaranteed to be non-null, and guaranteed to be valid if `len` is not 0.
    unsafe { std::slice::from_raw_parts(self.inner.as_ptr(), self.len) }
  }
}

impl AsMut<[u8]> for Buffer2 {
  fn as_mut(&mut self) -> &mut [u8] {
    // SAFETY: This is literally undefined behavior. `Buffer::clone` allows you to create shared
    // access to the underlying data, but `as_mut` and `deref_mut` allow unsynchronized mutation of
    // that data (not to speak of the JS side having write access as well, at the same time).
    unsafe { std::slice::from_raw_parts_mut(self.inner.as_ptr(), self.len) }
  }
}

impl std::ops::Deref for Buffer2 {
  type Target = [u8];

  fn deref(&self) -> &Self::Target {
    self.as_ref()
  }
}

impl std::ops::DerefMut for Buffer2 {
  fn deref_mut(&mut self) -> &mut Self::Target {
    self.as_mut()
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants