Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: add warnings about transferring Buffers and ArrayBuffer #33252

Closed
wants to merge 1 commit into from

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented May 5, 2020

See discussion here: #33240

/cc @addaleax @mafintosh @ronag @mcollina

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added doc Issues and PRs related to the documentations. worker Issues and PRs related to Worker support. labels May 5, 2020
Copy link
Member

@ronag ronag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer an approach where the user cannot get it wrong, i.e. even if they put a buffer in the transfer list, we still copy it. I'm worried that not all users would actually read this warning.

Maybe an option or flag where the user has to opt-in to this unsafe behavior, where the description in this PR applies.

I'm not all that familiar with this API. Does similar considerations apply in the web as well?

@mafintosh
Copy link
Member

@ronag that's what @addaleax fixed here #32759

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

The only suggestion I’d have is that it would seem to make more sense to me to move this to the Buffer doc and link to that from the MessagePort documentation, because neither the problem of multiple views over the same memory range nor the possibility an ArrayBuffer being detached are specific to MessagePorts or Workers.

@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

@mafintosh:

@ronag that's what @addaleax fixed here #32759

From what I can see, #32759 only partially addresses the issue in that it only helps with Buffer instances sliced off the pre-allocated pool. Buffers sliced from Buffer.alloc() are still unsafe and internally (C++) created Buffers are likely still unsafe to transfer (I haven't tested those completely yet).

@ronag ... it is not possible to completely protect users against getting it wrong here as the same fundamental problem exists with TypedArray instances in general although those are less likely to crash the process if misused. As @mafintosh points out in the other thread, the issue is the fact that ArrayBuffer instances may be shared and that's independent of Buffer.

@addaleax
Copy link
Member

addaleax commented May 5, 2020

@jasnell Not sure what you mean by “safe” here, but Buffers created on the C++ side generally have their own ArrayBuffer, like Buffer.alloc() does.

@mafintosh
Copy link
Member

@jasnell agreed.

I think I would almost go as far as stating it this way: "Unless you allocated the ArrayBuffer, safe Buffer or the TypedArray, you probably should not transfer it"

@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

@addaleax:

The only suggestion I’d have is that it would seem to make more sense to me to move this to the Buffer doc and link to that from the MessagePort documentation, because neither the problem of multiple views over the same memory range nor the possibility an ArrayBuffer being detached are specific to MessagePorts or Workers.

I actually want to create similar warnings in a couple different places with different angles. In the stream docs, for instance, I'd like to add a warning about the possibility of the underlying ArrayBuffer being modified before data can be used/flushed. I'd just rather keep the warnings close to the APIs where the uses are most likely to be problematic.

@addaleax
Copy link
Member

addaleax commented May 5, 2020

@jasnell I’m okay with that approach, but in that case, I’d reduce the text here to re-scope it to what is actually relevant to transfers using MessagePort: That transferring a Buffer invalidates all of its views, and that more data may be transferred than necessary.

doc/api/worker_threads.md Outdated Show resolved Hide resolved
doc/api/worker_threads.md Outdated Show resolved Hide resolved
doc/api/worker_threads.md Outdated Show resolved Hide resolved
@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

@addaleax:

@jasnell Not sure what you mean by “safe” here, but Buffers created on the C++ side generally have their own ArrayBuffer, like Buffer.alloc() does.

Yes, but we also have a tendency to grab and use pointers to the raw data at the C++ level. What I mean by safe here is that buffer instances created at the C++ level may end up being used in ways that are completely invisible to the user land code. The Abort in the original post is one such example.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

It would be good to have a way to check if a buffer is pooled and if so clone manually.

@addaleax
Copy link
Member

addaleax commented May 5, 2020

It would be good to have a way to check if a buffer is pooled and if so clone manually.

@mcollina You mean other than buf.byteLength !== buf.buffer.byteLength?

@addaleax:

@jasnell Not sure what you mean by “safe” here, but Buffers created on the C++ side generally have their own ArrayBuffer, like Buffer.alloc() does.

Yes, but we also have a tendency to grab and use pointers to the raw data at the C++ level. What I mean by safe here is that buffer instances created at the C++ level may end up being used in ways that are completely invisible to the user land code. The Abort in the original post is one such example.

@jasnell Okay, that makes sense 👍 But that’s true for all Buffers, whether created in C++ or from JS land. I was just confused because the comment and the text here was specifically referring to Buffers created in C++.

@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

I was just confused because the comment and the text here was specifically referring to Buffers created in C++.

Sorry, I meant this in the context that I've gone through a bunch of tests already with the Buffers created at the JavaScript level and know pretty much what the issues are there. I'm still going through the various ways we create buffers at the C++ level to see if there are any other possible gotchas.

For instance, the following are handled differently with regards to transferability:

Here, for instance, the returned Buffer instance is transferable and the original Buffer will no longer be usable...

  static char data[10] = {};
  memset(data, 1, 10);
  v8::MaybeLocal<v8::Object> buf =
    Buffer::New(env->isolate(), data, 10);
  args.GetReturnValue().Set(buf.ToLocalChecked());

While here, it is not (the buffer will be cloned), and the original Buffer continues to be usable.

  char* data = node::Malloc<char>(10);
  memset(data, 0, 10);
  v8::MaybeLocal<v8::Object> buf =
    Buffer::New(env->isolate(), data, 10, [](char* data, void* hint) {
      free(data);
    }, nullptr);
  args.GetReturnValue().Set(buf.ToLocalChecked());

@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

@mcollina You mean other than buf.byteLength !== buf.buffer.byteLength?

This check is not really a reliable way of determining if a Buffer is pooled:

const ab = new ArrayBuffer(100);
const buf = Buffer.from(ab, 0, 10);  // Not pooled, just over allocated.
console.log(buf.byteLength !== buf.buffer.byteLength);

@addaleax
Copy link
Member

addaleax commented May 5, 2020

@jasnell Right, but that doesn’t seem like a typical situation to me – if the ArrayBuffer is larger than its “primary” view, then there’s usually a reason for that, and that reason is that that extra memory is intended to be used for something, right?

@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

...and that reason is that that extra memory is intended to be used for something, right?

Or it's not and it's just a silly bug that a user does. The point is not really about how likely the case is but about how reliable of a check it is to determine if the Buffer is pooled. It fails in the following case also:

const a = Buffer.from('test');
const b = Buffer.from(a.buffer);

console.log(b.byteLength === b.buffer.byteLength);

In which case the byteLength's are identical but b is still pooled. This is by far a much easier mistake for users to make as we have seen instances in the wild where people forget to account for offset and length.

@addaleax
Copy link
Member

addaleax commented May 5, 2020

@jasnell Right, that second example is a good point, because I’d be more worried about false positives here than false negatives. But I’m also not sure that there would be any reasonable way to check whether b in that example shares its AB with another view.

@jasnell
Copy link
Member Author

jasnell commented May 5, 2020

There absolutely isn't a way which is why the recommendation here should be: never transfer a Buffer unless you know exactly where it came from and how it's being used.

doc/api/worker_threads.md Outdated Show resolved Hide resolved
@RReverser
Copy link
Member

I don't know about Buffer specifics in this case, but at least on the Web upon transfer the original underlying ArrayBuffer becomes of size 0.

Couldn't the same check be used here? (buf.buffer.byteLength === 0)

In the worst case, it will match empty buffers too but these can be filtered out when Buffer is first received.

@jasnell
Copy link
Member Author

jasnell commented May 6, 2020

@RReverser ... that happens here also when an ArrayBuffer is transferred. The issue, however, is that not all ArrayBuffers used in Node.js are transferable. Some (like the one used for the pre-allocated Buffer pool have to be cloned. The other key difference with Node.js vs. the browser is that on the C++ side, we frequently grab pointers to the underlying memory storage for Buffers (or create Buffer instances around external memory) and not all of the code in Node.js that does so is prepared adequately to deal with detached ArrayBuffer instances and we end up with crashes.

Signed-off-by: James M Snell <jasnell@gmail.com>
@addaleax
Copy link
Member

addaleax commented May 7, 2020

Landed in baba42c

@addaleax addaleax closed this May 7, 2020
addaleax pushed a commit that referenced this pull request May 7, 2020
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #33252
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Juan José Arboleda <soyjuanarbol@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
codebytere pushed a commit that referenced this pull request May 11, 2020
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #33252
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Juan José Arboleda <soyjuanarbol@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@codebytere codebytere mentioned this pull request May 18, 2020
codebytere pushed a commit that referenced this pull request Jun 7, 2020
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #33252
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Juan José Arboleda <soyjuanarbol@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@codebytere codebytere mentioned this pull request Jun 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations. worker Issues and PRs related to Worker support.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet