Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: add base64url encoding option #36952

Merged
merged 1 commit into from Jan 18, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
16 changes: 13 additions & 3 deletions doc/api/buffer.md
Expand Up @@ -50,6 +50,9 @@ const buf7 = Buffer.from('tést', 'latin1');
## Buffers and character encodings
<!-- YAML
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/36952
description: Introduced `base64url` encoding.
- version: v6.4.0
pr-url: https://github.com/nodejs/node/pull/7111
description: Introduced `latin1` as an alias for `binary`.
Expand Down Expand Up @@ -106,6 +109,11 @@ string into a `Buffer` as decoding.
specified in [RFC 4648, Section 5][]. Whitespace characters such as spaces,
tabs, and new lines contained within the base64-encoded string are ignored.

* `'base64url'`: [base64url][] encoding as specified in
panva marked this conversation as resolved.
Show resolved Hide resolved
[RFC 4648, Section 5][]. When creating a `Buffer` from a string, this
encoding will also correctly accept regular base64-encoded strings. When
panva marked this conversation as resolved.
Show resolved Hide resolved
encoding a `Buffer` to a string, this encoding will omit padding.

* `'hex'`: Encode each byte as two hexadecimal characters. Data truncation
may occur when decoding strings that do exclusively contain valid hexadecimal
characters. See below for an example.
Expand Down Expand Up @@ -482,9 +490,10 @@ Returns the byte length of a string when encoded using `encoding`.
This is not the same as [`String.prototype.length`][], which does not account
for the encoding that is used to convert the string into bytes.

For `'base64'` and `'hex'`, this function assumes valid input. For strings that
contain non-base64/hex-encoded data (e.g. whitespace), the return value might be
greater than the length of a `Buffer` created from the string.
For `'base64'`, `'base64url'`, and `'hex'`, this function assumes valid input.
For strings that contain non-base64/hex-encoded data (e.g. whitespace), the
return value might be greater than the length of a `Buffer` created from the
string.

```js
const str = '\u00bd + \u00bc = \u00be';
Expand Down Expand Up @@ -3418,6 +3427,7 @@ introducing security vulnerabilities into an application.
[`buffer.constants.MAX_STRING_LENGTH`]: #buffer_buffer_constants_max_string_length
[`buffer.kMaxLength`]: #buffer_buffer_kmaxlength
[`util.inspect()`]: util.md#util_util_inspect_object_options
[base64url]: https://tools.ietf.org/html/rfc4648#section-5
[binary strings]: https://developer.mozilla.org/en-US/docs/Web/API/DOMString/Binary
[endianness]: https://en.wikipedia.org/wiki/Endianness
[iterator]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols
19 changes: 19 additions & 0 deletions lib/buffer.js
Expand Up @@ -646,6 +646,20 @@ const encodingOps = {
encodingsMap.base64,
dir)
},
base64url: {
encoding: 'base64url',
encodingVal: encodingsMap.base64url,
byteLength: (string) => base64ByteLength(string, string.length),
write: (buf, string, offset, len) =>
buf.base64urlWrite(string, offset, len),
slice: (buf, start, end) => buf.base64urlSlice(start, end),
indexOf: (buf, val, byteOffset, dir) =>
indexOfBuffer(buf,
fromStringFast(val, encodingOps.base64url),
byteOffset,
encodingsMap.base64url,
dir)
},
hex: {
encoding: 'hex',
encodingVal: encodingsMap.hex,
Expand Down Expand Up @@ -702,6 +716,11 @@ function getEncodingOps(encoding) {
if (encoding === 'hex' || StringPrototypeToLowerCase(encoding) === 'hex')
return encodingOps.hex;
break;
case 9:
if (encoding === 'base64url' ||
StringPrototypeToLowerCase(encoding) === 'base64url')
return encodingOps.base64url;
break;
}
}

Expand Down
4 changes: 4 additions & 0 deletions lib/internal/buffer.js
Expand Up @@ -18,12 +18,14 @@ const { validateNumber } = require('internal/validators');
const {
asciiSlice,
base64Slice,
base64urlSlice,
latin1Slice,
hexSlice,
ucs2Slice,
utf8Slice,
asciiWrite,
base64Write,
base64urlWrite,
latin1Write,
hexWrite,
ucs2Write,
Expand Down Expand Up @@ -1027,12 +1029,14 @@ function addBufferPrototypeMethods(proto) {

proto.asciiSlice = asciiSlice;
proto.base64Slice = base64Slice;
proto.base64urlSlice = base64urlSlice;
proto.latin1Slice = latin1Slice;
proto.hexSlice = hexSlice;
proto.ucs2Slice = ucs2Slice;
proto.utf8Slice = utf8Slice;
proto.asciiWrite = asciiWrite;
proto.base64Write = base64Write;
proto.base64urlWrite = base64urlWrite;
proto.latin1Write = latin1Write;
proto.hexWrite = hexWrite;
proto.ucs2Write = ucs2Write;
Expand Down
5 changes: 5 additions & 0 deletions lib/internal/util.js
Expand Up @@ -180,6 +180,11 @@ function slowCases(enc) {
StringPrototypeToLowerCase(`${enc}`) === 'utf-16le')
return 'utf16le';
break;
case 9:
if (enc === 'base64url' || enc === 'BASE64URL' ||
StringPrototypeToLowerCase(`${enc}`) === 'base64url')
return 'base64url';
break;
default:
if (enc === '') return 'utf8';
}
Expand Down
4 changes: 4 additions & 0 deletions src/api/encoding.cc
Expand Up @@ -68,13 +68,17 @@ enum encoding ParseEncoding(const char* encoding,
} else if (encoding[1] == 'a') {
if (strncmp(encoding + 2, "se64", 5) == 0)
return BASE64;
if (strncmp(encoding + 2, "se64url", 8) == 0)
return BASE64URL;
}
if (StringEqualNoCase(encoding, "binary"))
return LATIN1; // BINARY is a deprecated alias of LATIN1.
if (StringEqualNoCase(encoding, "buffer"))
return BUFFER;
if (StringEqualNoCase(encoding, "base64"))
return BASE64;
if (StringEqualNoCase(encoding, "base64url"))
return BASE64URL;
break;

case 'a':
Expand Down
4 changes: 4 additions & 0 deletions src/node_buffer.cc
Expand Up @@ -1184,13 +1184,15 @@ void Initialize(Local<Object> target,

env->SetMethodNoSideEffect(target, "asciiSlice", StringSlice<ASCII>);
env->SetMethodNoSideEffect(target, "base64Slice", StringSlice<BASE64>);
env->SetMethodNoSideEffect(target, "base64urlSlice", StringSlice<BASE64URL>);
env->SetMethodNoSideEffect(target, "latin1Slice", StringSlice<LATIN1>);
env->SetMethodNoSideEffect(target, "hexSlice", StringSlice<HEX>);
env->SetMethodNoSideEffect(target, "ucs2Slice", StringSlice<UCS2>);
env->SetMethodNoSideEffect(target, "utf8Slice", StringSlice<UTF8>);

env->SetMethod(target, "asciiWrite", StringWrite<ASCII>);
env->SetMethod(target, "base64Write", StringWrite<BASE64>);
env->SetMethod(target, "base64urlWrite", StringWrite<BASE64URL>);
env->SetMethod(target, "latin1Write", StringWrite<LATIN1>);
env->SetMethod(target, "hexWrite", StringWrite<HEX>);
env->SetMethod(target, "ucs2Write", StringWrite<UCS2>);
Expand Down Expand Up @@ -1223,13 +1225,15 @@ void RegisterExternalReferences(ExternalReferenceRegistry* registry) {

registry->Register(StringSlice<ASCII>);
registry->Register(StringSlice<BASE64>);
registry->Register(StringSlice<BASE64URL>);
registry->Register(StringSlice<LATIN1>);
registry->Register(StringSlice<HEX>);
registry->Register(StringSlice<UCS2>);
registry->Register(StringSlice<UTF8>);

registry->Register(StringWrite<ASCII>);
registry->Register(StringWrite<BASE64>);
registry->Register(StringWrite<BASE64URL>);
registry->Register(StringWrite<LATIN1>);
registry->Register(StringWrite<HEX>);
registry->Register(StringWrite<UCS2>);
Expand Down
8 changes: 6 additions & 2 deletions src/string_decoder.cc
Expand Up @@ -70,7 +70,10 @@ MaybeLocal<String> StringDecoder::DecodeData(Isolate* isolate,

size_t nread = *nread_ptr;

if (Encoding() == UTF8 || Encoding() == UCS2 || Encoding() == BASE64) {
if (Encoding() == UTF8 ||
Encoding() == UCS2 ||
Encoding() == BASE64 ||
Encoding() == BASE64URL) {
// See if we want bytes to finish a character from the previous
// chunk; if so, copy the new bytes to the missing bytes buffer
// and create a small string from it that is to be prepended to the
Expand Down Expand Up @@ -198,7 +201,7 @@ MaybeLocal<String> StringDecoder::DecodeData(Isolate* isolate,
state_[kBufferedBytes] = 2;
state_[kMissingBytes] = 2;
}
} else if (Encoding() == BASE64) {
} else if (Encoding() == BASE64 || Encoding() == BASE64URL) {
state_[kBufferedBytes] = nread % 3;
if (state_[kBufferedBytes] > 0)
state_[kMissingBytes] = 3 - BufferedBytes();
Expand Down Expand Up @@ -311,6 +314,7 @@ void InitializeStringDecoder(Local<Object> target,
ADD_TO_ENCODINGS_ARRAY(ASCII, "ascii");
ADD_TO_ENCODINGS_ARRAY(UTF8, "utf8");
ADD_TO_ENCODINGS_ARRAY(BASE64, "base64");
ADD_TO_ENCODINGS_ARRAY(BASE64URL, "base64url");
ADD_TO_ENCODINGS_ARRAY(UCS2, "utf16le");
ADD_TO_ENCODINGS_ARRAY(HEX, "hex");
ADD_TO_ENCODINGS_ARRAY(BUFFER, "buffer");
Expand Down
1 change: 1 addition & 0 deletions test/addons/parse-encoding/binding.cc
Expand Up @@ -6,6 +6,7 @@ namespace {
#define ENCODING_MAP(V) \
V(ASCII) \
V(BASE64) \
V(BASE64URL) \
V(BUFFER) \
V(HEX) \
V(LATIN1) \
Expand Down
1 change: 1 addition & 0 deletions test/addons/parse-encoding/test.js
Expand Up @@ -8,6 +8,7 @@ assert.strictEqual(parseEncoding(''), 'UNKNOWN');

assert.strictEqual(parseEncoding('ascii'), 'ASCII');
assert.strictEqual(parseEncoding('base64'), 'BASE64');
assert.strictEqual(parseEncoding('base64url'), 'BASE64URL');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you also need to add this to ENCODING_MAP in the .cc file here :)

Copy link
Member Author

@panva panva Jan 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did now, locally. But i need to rebuild them somehow :) before i waste ci time i'd like to get local pass

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

../../../deps/npm/node_modules/node-gyp/bin/node-gyp.js --nodedir=../../.. rebuild, from the directory of the test (yes, I have that saved as an alias 😋)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See, i did that and it still fails. In CI too :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly src/api/encoding.cc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@panva Was just about to say that :) Yes, that parser also needs to be updated here, it seems

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm certain about the second part of the diff, not so much about the first one - do you see a better way to optimize the distinction between se64 and se64url?

assert.strictEqual(parseEncoding('binary'), 'LATIN1');
assert.strictEqual(parseEncoding('buffer'), 'BUFFER');
assert.strictEqual(parseEncoding('hex'), 'HEX');
Expand Down