Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zlib: expose zlib.crc32() #52692

Merged
merged 6 commits into from May 2, 2024
Merged

zlib: expose zlib.crc32() #52692

merged 6 commits into from May 2, 2024

Conversation

joyeecheung
Copy link
Member

@joyeecheung joyeecheung commented Apr 25, 2024

This patch exposes the crc32() function from zlib to user-land.

It computes a 32-bit Cyclic Redundancy Check checksum of data. If
value is specified, it is used as the starting value of the checksum,
otherwise, 0 is used as the starting value.

The CRC algorithm is designed to compute checksums and to detect error
in data transmission. It's not suitable for cryptographic authentication.

To be consistent with other APIs, if the data is a string, it will
be encoded with UTF-8 before being used for computation. If users only
use Node.js to compute and match the checksums, this works well with
other APIs that uses the UTF-8 encoding by default.

Some third-party JavaScript libraries compute the checksum on a
string based on str.charCodeAt() so that it can be run in browsers.
If users want to match the checksum computed with this kind of library
in the browser, it's better to use the same library in Node.js
if it also runs in Node.js. If users have to use zlib.crc32() to
match the checksum produced by such a third-party library:

  1. If the library accepts Uint8Array as input, use TextEncoder
    in the browser to encode the string into a Uint8Array with UTF-8
    encoding, and compute the checksum based on the UTF-8 encoded string
    in the browser.
  2. If the library only takes a string and compute the data based on
    str.charCodeAt(), on the Node.js side, convert the string into
    a buffer using Buffer.from(str, 'utf16le').
const zlib = require('node:zlib');
const { Buffer } = require('node:buffer');

let crc = zlib.crc32('hello');  // 907060870
crc = zlib.crc32('world', crc);  // 4192936109

crc = zlib.crc32(Buffer.from('hello', 'utf16le'));  // 1427272415
crc = zlib.crc32(Buffer.from('world', 'utf16le'), crc);  // 4150509955

NOTE: it's exposed in zlib because that's also what Python and Ruby do.

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. zlib Issues and PRs related to the zlib subsystem. labels Apr 25, 2024
This patch exposes the crc32() function from zlib to user-land.

It computes a 32-bit Cyclic Redundancy Check checksum of `data`. If
`value` is specified, it is used as the starting value of the checksum,
otherwise, 0 is used as the starting value.

```js
const zlib = require('node:zlib');
const { Buffer } = require('node:buffer');

let crc = zlib.crc32('hello');  // 907060870
crc = zlib.crc32('world', crc);  // 4192936109

crc = zlib.crc32(Buffer.from('hello'));  // 907060870
crc = zlib.crc32(Buffer.from('world'), crc);  // 4192936109
```
@lpinca
Copy link
Member

lpinca commented Apr 25, 2024

Wouldn't it be better to expose it under crypto?

Edit: On second thought, maybe not.

@joyeecheung
Copy link
Member Author

joyeecheung commented Apr 25, 2024

Wouldn't it be better to expose it under crypto?

It's not a cryptographic hash, but a checksum. I was thinking about copying the python documentation (python also puts it on zlib):

The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.

(Not sure if it's okay to copy other documentation, but if we have to write one from scratch it wouldn't be too different anyway?)

@panva
Copy link
Member

panva commented Apr 26, 2024

Wouldn't it be better to expose it under crypto?

Edit: On second thought, maybe not.

I had the exact same thought process.

The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.

I agree this documentation bit or something to its effect is necessary to circumvent the same brain exercises myself and @lpinca went through :)

@joyeecheung
Copy link
Member Author

joyeecheung commented Apr 26, 2024

The doc currently puts it as

The CRC algorithm is designed to compute checksums and to detect error
in data transmission. It's not suitable for cryptographic authentication.

Also, note that both Python and Ruby puts crc32 under a zlib module

@joyeecheung joyeecheung added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 26, 2024
@github-actions github-actions bot added request-ci-failed An error occurred while starting CI via request-ci label, and manual interventon is needed. and removed request-ci Add this label to start a Jenkins CI on a PR. labels Apr 26, 2024
Copy link
Contributor

Failed to start CI
   ⚠  Something was pushed to the Pull Request branch since the last approving review.
   ✘  Refusing to run CI on potentially unsafe PR
https://github.com/nodejs/node/actions/runs/8848132665

@joyeecheung
Copy link
Member Author

huh, this is...amusing. I'll start from Jenkins.

@nodejs-github-bot
Copy link
Collaborator

@lpinca lpinca added the semver-minor PRs that contain new features and should be released in the next minor version. label Apr 26, 2024
src/node_zlib.cc Outdated Show resolved Hide resolved
Co-authored-by: Anna Henningsen <github@addaleax.net>
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

lib/zlib.js Show resolved Hide resolved
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@joyeecheung
Copy link
Member Author

Updated to run clang-format on the code. Can I get a re-approval to land it? @benjamingr @jasnell @addaleax @lpinca @anonrig @VoltrexKeyva

@panva panva added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. and removed request-ci-failed An error occurred while starting CI via request-ci label, and manual interventon is needed. labels May 2, 2024
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label May 2, 2024
@nodejs-github-bot nodejs-github-bot merged commit 2cfabb1 into nodejs:main May 2, 2024
57 checks passed
@nodejs-github-bot
Copy link
Collaborator

Landed in 2cfabb1

Ch3nYuY pushed a commit to Ch3nYuY/node that referenced this pull request May 8, 2024
This patch exposes the crc32() function from zlib to user-land.

It computes a 32-bit Cyclic Redundancy Check checksum of `data`. If
`value` is specified, it is used as the starting value of the checksum,
otherwise, 0 is used as the starting value.

```js
const zlib = require('node:zlib');
const { Buffer } = require('node:buffer');

let crc = zlib.crc32('hello');  // 907060870
crc = zlib.crc32('world', crc);  // 4192936109

crc = zlib.crc32(Buffer.from('hello'));  // 907060870
crc = zlib.crc32(Buffer.from('world'), crc);  // 4192936109
```

PR-URL: nodejs#52692
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Mohammed Keyvanzadeh <mohammadkeyvanzade94@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this pull request May 8, 2024
This patch exposes the crc32() function from zlib to user-land.

It computes a 32-bit Cyclic Redundancy Check checksum of `data`. If
`value` is specified, it is used as the starting value of the checksum,
otherwise, 0 is used as the starting value.

```js
const zlib = require('node:zlib');
const { Buffer } = require('node:buffer');

let crc = zlib.crc32('hello');  // 907060870
crc = zlib.crc32('world', crc);  // 4192936109

crc = zlib.crc32(Buffer.from('hello'));  // 907060870
crc = zlib.crc32(Buffer.from('world'), crc);  // 4192936109
```

PR-URL: #52692
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Mohammed Keyvanzadeh <mohammadkeyvanzade94@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. needs-ci PRs that need a full CI run. semver-minor PRs that contain new features and should be released in the next minor version. zlib Issues and PRs related to the zlib subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants