Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: Don't think atob and btoa should be marked as legacy #40754

Closed
jimmywarting opened this issue Nov 8, 2021 · 22 comments
Closed

doc: Don't think atob and btoa should be marked as legacy #40754

jimmywarting opened this issue Nov 8, 2021 · 22 comments
Labels
doc Issues and PRs related to the documentations.

Comments

@jimmywarting
Copy link

btoa(data)

Stability: 3 - Legacy. Use buf.toString('base64') instead.

I advocate more spec'ed and cross compatible code on multiple platforms. Therefore i normally never use or suggest node's only Buffer class whenever possible.

@jimmywarting jimmywarting added the doc Issues and PRs related to the documentations. label Nov 8, 2021
@aduh95
Copy link
Contributor

aduh95 commented Nov 8, 2021

I think the reasoning for putting a legacy status on this API was: folks who are looking into using btoa are probably looking for an API that can convert a string into a base64 representation, which btoa is not capable of doing for non-ASCII strings, according to https://developer.mozilla.org/en-US/docs/Web/API/btoa#unicode_strings.
If the web was providing an API that does what buf.toString('base64') is doing, we would suggest this instead; in any case, we should keep the legacy status for atob and btoa IMHO.

@jimmywarting
Copy link
Author

here is how you can deal with non-ASCII strings (taken from mdn)

function utf8_to_b64( str ) {
  return btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
  return decodeURIComponent(escape(atob( str )));
}

// Usage:
utf8_to_b64('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64_to_utf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"

this works grate in across different env and don't tie you to using a nodes buffer class

@Trott
Copy link
Member

Trott commented Feb 15, 2022

This is tricky. We added "Legacy" as a status in our docs to conform with what the specs were doing. But then we applied it to atob() and btoa() even though the specs don't call those Legacy. The reason is that there was (and probably still is) a strong desire in Node.js core to discourage people from using atob() and btoa().

Here's one Node.js collaborator explaining: https://twitter.com/addaleax/status/1384553868726607872

And another: https://twitter.com/jasnell/status/1372626002480754689

@jimmywarting
Copy link
Author

jimmywarting commented Feb 15, 2022

I agree with jasnell that you should try to use typedarrays whenever possible and even avoid using base64 in the first place...
such as URL.createObjectURL instead of base64 dataurl and using canvas.toBlob instead of canvas.toDataURL and using FormData to send files instead of using json for a rest api. and using TextEncoder/TextDecoder and uint8arrays

But what i don't like to see is node:buffer being preferred over native web api's if somebody is recommending one to use buffer and it ends up in your browser bundle, then you have to download buffer, base64-js and ieee754 of a total of 16.8 kB (unminified - just tried import('https://jspm.dev/buffer') in the browser dev tool) that isn't a good cross platform solution for either Deno or Browser.

I'm somewhat okey by calling it legacy and say something in terms of "warning beware of non-utf8 strings, use typed arrays over base64 strings whenever possible, and have a look at mdn for possible solutions/workaround"
but what i don't like to see is TypeScript marking legacy api's as deprecated and that it recommends you to use node:buffer instead. it isn't deprecated, it won't go away and yes it has it old historical perks that i think many developers already knows about. and it exist in all environments.
I think that it should rather make a big awareness of how it handles encoding than calling it legacy. i think it's still useful for the small footprint it has and it simplicity for cross compatibility across different environments.

quite many use btoa just for the simple sake of doing a basic Authentication header cuz you have no other choice than to pass it in as a base64 string in the request headers.
but i don't recommend this approach either... cookie or any bearer token could be preferred over basic authentication... but you don't always have a choice in that either when it comes to 3th party apis

We added "Legacy" as a status in our docs to conform with what the specs were doing

Where in the spec those it say that it is marked as legacy?

@Trott
Copy link
Member

Trott commented Feb 16, 2022

Where in the spec those it say that it is marked as legacy?

I specifically said atob() and btoa() were not Legacy in the specs:

But then we applied it to atob() and btoa() even though the specs don't call those Legacy.

When I wrote "We added 'Legacy'", I mean we added "Legacy" as a status that we use in our docs because the specs were adding Legacy as a status they use for things. Sorry that what I wrote wasn't as clear as it could have been. (EDIT: It probably still isn't....)

@jacobbogers
Copy link

which btoa is not capable of doing for non-ASCII strings,

Base64 standard is not allowed to return anything outside the range of ASCII to begin with, so this is not a valid argument.

a strong desire in Node.js core to discourage people from using atob() and btoa().

That is not a valid argument, these functions are standard (and will remain standard) on web clients,

@tniessen
Copy link
Member

which btoa is not capable of doing for non-ASCII strings,

Base64 standard is not allowed to return anything outside the range of ASCII to begin with, so this is not a valid argument.

@jacobbogers I believe that @aduh95 was referring to the input of btoa(), not to its base64 output. Of course its output is ASCII, but it does not handle non-ASCII strings well in its input.

a strong desire in Node.js core to discourage people from using atob() and btoa().

That is not a valid argument, these functions are standard (and will remain standard) on web clients,

Just because they are standard (and will remain standard) on web clients does not mean that we have to encourage their use within Node.js.

@tniessen
Copy link
Member

The main argument against the legacy status that I have seen here is that TypeScript marks legacy APIs as deprecated. Well, we didn't deprecate them, TypeScript did.

@jimmywarting
Copy link
Author

jimmywarting commented May 12, 2023

The main argument against the legacy status that I have seen here is that TypeScript marks legacy APIs as deprecated. Well, we didn't deprecate them, TypeScript did.

TypeScript have literally taken the exact same wording as NodeJS have in their docs:
@deprecated — Use buf.toString('base64') instead.

Don't they just scan everything that is prefixed with: Stability: 3 - xxx and use that?

a strong desire in Node.js core to discourage people from using atob() and btoa().

That is not a valid argument, these functions are standard (and will remain standard) on web clients,

Just because they are standard (and will remain standard) on web clients does not mean that we have to encourage their use within Node.js.

Well, then don't encourage ppl to use node:buffer and switch to using typed arrays instead instead of dealing with strings when it's the mater of handling binary

- Use buf.toString('base64')
+ Use TypedArrays such as Uint8Array or heck even Blob & append it to `FormData()` to send it or something...

i would even go as far as to taking away the legacy flag and still keep the message below or something.

Here is my full proposal:

  ## `btoa(data)`
  
  <!-- YAML
  added: v16.0.0
  -->
  
-  > Stability: 3 - Legacy. Use `buf.toString('base64')` instead.
-  
-  Global alias for [`buffer.btoa()`][].
  
+  **Beware:** Calling `btoa` on a Unicode string can cause a `Character Out Of Range` 
+  exception if a character exceeds the range of a 8-bit ASCII-encoded character. 
+
+  It's instead better to just deal with data using [TypedArrays](https://mdn-link) instead of increasing
+  the data with base64 that can take up to 33% more memory and requires
+  unnecessary decode/encode performance
+ 
+  Also see this [alternative solutions][https://developer.mozilla.org/en-US/docs/Glossary/Base64#the_unicode_problem]
  
  ## `clearImmediate(immediateObject)`

@jimmywarting
Copy link
Author

jimmywarting commented May 12, 2023

heck even using cbor, Bson, Protobuf or any alternative solution that dose not depend on base64 strings would be a better transport

if you have some api endpoint that for the most part only deals with json then you could combine it with FormData somehow:

const blob = fs.openAsBlob('screen recording.png')
const json = { width: 1024, height: 720, geolocation: '32.2313, 22.2328' }
const fd = new FormData()
fd.set('video', blob)
fd.set('metadata', JSON.stringify(json))
new Response(fd).pipeTo(destination)

See, No external dependency needed, support both json and binary data with no 33% increased base64 string.
To decode it you could just do: fetch(api).then(res => res.formData()).then(...)


Another way of doing it would be to send one simple large blob that concat json byteLength + json + binary blob

const blob = fs.openAsBlob('screen recording.png')
const json = { width: 1024, height: 720, geolocation: '32.2313, 22.2328' }
const meta = new TextEncoder().encode(json)
new Blob([
  new Uint16Array([meta.byteLength]),
  meta,
  binary
]).stream().pipeTo(dest)

Just need to be a little creative. how you then encode / decode it and if you need any additional binary files you may need to include any additional blob size in the json and include something like { fileSizes [24, 54, 22] } so you know where to slice the blob at

even the alternative solution that i mention above that comes from MDN is a good solution if you still insist on using only json.

i still think atob/btoa is still a legit reason to still use them for small simple things where you know for a fact that it's only dealing with ascii characters

@tniessen
Copy link
Member

TypeScript have literally taken the exact same wording as NodeJS have in their docs:

How so? The Node.js docs say legacy (Stability: 3). They explicitly do not say deprecated (Stability: 0).

@jimmywarting
Copy link
Author

TypeScript have literally taken the exact same wording as NodeJS have in their docs:

How so? The Node.js docs say legacy (Stability: 3). They explicitly do not say deprecated (Stability: 0).

It's a automated script that generates @types/node for each NodeJS version.

don't know exactly how it works, but i found this:

https://github.com/DefinitelyTyped/DefinitelyTyped/blob/05663f015e3d72c8ded01ac424f69310adae1572/types/node/scripts/generate-docs/ast-processing.ts#L187-L189
https://github.com/DefinitelyTyped/DefinitelyTyped/blob/05663f015e3d72c8ded01ac424f69310adae1572/types/node/scripts/generate-docs/ast-processing.ts#LL223C5-L223C11

        switch (stability) {
            case Stability.Legacy:
                tags.push(this.createDeprecatedTag(stabilityText?.replace('Legacy. ', '') ?? 'Legacy API'));
                break;
        }

That's why this is so annoying and why i want to see the legacy flag being removed cuz i don't want to see btoa('xyz') for using it's functionality.

it dose not say this when you don't include @types/node and instead using the DOM lib.
but vscode dose not always know what you are using/targeting when it's plain vanilla js project.

@tniessen
Copy link
Member

I understand what the TypeScript tooling is doing. However, you claim that "TypeScript have literally taken the exact same wording as NodeJS have in their docs," which is untrue. We have not marked these functions as deprecated. Legacy APIs are not deprecated. See Stability index.

If you are targeting Node.js environments, then using @types/node and avoiding atob()/btoa() makes sense. If you are not only targeting Node.js environments, there might be better type definitions out there.

@jimmywarting
Copy link
Author

jimmywarting commented May 12, 2023

I understand what the TypeScript tooling is doing. However, you claim that "TypeScript have literally taken the exact same wording as NodeJS have in their docs,

What are you talking about? they parse the docs and import everything NodeJS have written about it.
and yes it's marked as legacy but TS creates a @deprecated jsdoc
TS treads legacy as deprecated which is a bit Meh!

it comes up even where i do not have @types/node included or even a tsconfig.json file.
As soon as something slips in that i'm using some sort of NodeJS function or any isomorphic code that runs on multiple env or have optional dependencies on NodeJS. it then decides to automatically include @types/node for me.

I created a empty folder/project with just one index.js files that included:

globalThis.TextDecoder ??= await import('node:util').then((mod => mod.TextDecoder)
globalThis.URL ??= await import('node:util').then((mod => mod.URL)

atob('few')

and it then decide that: Nope! atob should not be used... presumably b/c you are using NodeJS. which i don't always dose since it's optional and isomorphic for cross compatible reasons.

image

even including something like webpack build process will make the entire project seems as if it's using a NodeJS env through of all files in my src code where i do not even have NodeJS specific code

@tniessen
Copy link
Member

What are you talking about?

I don't really want to repeat the same argument over and over again. I don't know how to make it any clearer than in my last three comments: Node.js did not mark these APIs as deprecated. Whatever TypeScript is doing should not affect whether we can consider these functions as legacy APIs. We don't maintain @types/node. Legacy does not mean deprecated.

@jimmywarting

This comment was marked as off-topic.

@panva
Copy link
Member

panva commented May 14, 2023

what i meant by this was that Typescript takes the same description but also treats legacy as deprecated

That's a problem to be raised with DefinitelyTyped, specifically the script used.

@jimmywarting
Copy link
Author

DefinitelyTyped issue or not... i still think the issue remains: atob/btoa should not be flagged as legacy. it's a web spec'ed api.
And i don't feel like any web spec'ed api should ever be flagged as deprecated or legacy.

@tniessen
Copy link
Member

atob/btoa should not be flagged as legacy. it's a web spec'ed api.

If you use these functions, you are relying on the HTML Standard, which Node.js does not officially implement. These functions are not part of the ECMAScript standard, which is what Node.js implements.

They only exist in the HTML Standard because Brendan Eich "reflected them into JS in a big hurry in 1995". They were added to Netscape almost 30 years ago and never made it into the ECMAScript standard. It is well-known (see, for example, this discussion) that the web platform needs better APIs than atob() and btoa(), but somehow, neither ECMA-262 nor WhatWG nor W3C have standardized any.

Node.js was doing fine without these functions for more than a decade (from 2009 until 2021). JavaScript runtimes have zero obligation to implement these functions because, again, they are not part of the ECMAScript specification. Then, before they were even added, there were suggestions to deprecate them. That did not happen, but they were marked as legacy following Jordan's comment, who explained legacy as "things that are icky that we wish we could get rid of." And that's exactly what these functions are.

@panva
Copy link
Member

panva commented May 14, 2023

Great summary @tniessen 👍

neither ECMA-262 nor WhatWG nor W3C have standardized any.

... not yet ;) hopefully this moves forward https://github.com/tc39/proposal-arraybuffer-base64

If I'm looking at Node.js docs then it's entirely correct to suggest the use of Buffer instead and the legacy status does not preclude its use.

That the status from Node.js gets transformed into deprecated in @types/node and that some IDEs auto pull it in is unfortunate, but also not something the Node.js project is responsible for fixing.

@jimmywarting
Copy link
Author

jimmywarting commented May 14, 2023

@tniessen
Copy link
Member

Sure, when TC39 or WhatWG or W3C finally decide on better APIs for base64/hex encoding and decoding after almost 30 years, then that will be yet another reason to mark atob()/btoa() as legacy. Regardless, I think there is plenty justification for the legacy status already, and any TypeScript issues are out of our hands. Is there anything left to do or discuss here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations.
Projects
None yet
Development

No branches or pull requests

6 participants