
Fix encoding/decoding of base-256 numbers #215

Merged — 2 commits merged on Jun 1, 2019

Conversation

justfalter
Contributor

@justfalter justfalter commented May 30, 2019

This PR fixes issues I've identified with the handling of base-256 encoded numbers within node-tar (see #188). The issues would generally present themselves when attempting to extract a gnu-formatted tar entry for a file greater than 8 GB in size.

  • The last byte of the buffer was incorrectly being ignored when encoding/decoding.
  • JavaScript can only represent integers precisely between -9007199254740991 and 9007199254740991 (Number.MIN_SAFE_INTEGER / Number.MAX_SAFE_INTEGER). Numbers outside these bounds have their lowest-order bits rounded off; for example, 9007199254749999 is rounded to 9007199254750000.
  • I've modified the large-numbers.js parse and encode functions to throw a TypeError if they encounter a number that cannot be precisely represented as a JavaScript integer, or if the buffer being decoded does not appear to be a base-256 encoded number (it must start with 0x80 or 0xff).
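The rounding described in the second bullet is easy to reproduce in any JavaScript runtime (a quick illustration, not part of the patch itself):

```javascript
// IEEE-754 doubles represent integers exactly only up to 2^53 - 1.
console.log(Number.MAX_SAFE_INTEGER);                // 9007199254740991
console.log(Number.isSafeInteger(9007199254749999)); // false

// Above that bound, odd integers cannot be represented and are
// rounded to a neighboring even value.
console.log(9007199254749999 === 9007199254750000);  // true
```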

- Encoding/decoding of base-256 numbers failed to handle the last byte
in the buffer.
- Take JavaScript's MAX_SAFE_INTEGER / MIN_SAFE_INTEGER into account
when encoding/decoding: if a number cannot be accurately represented
with integer precision, a TypeError will be thrown.
- Throw a TypeError if the parser is passed a buffer that does not
appear to be base-256 encoded (it must start with 0x80 or 0xff).
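A minimal sketch of a parser following those rules (my own illustration under the constraints described above; `parse256` is a hypothetical name, not node-tar's actual implementation or exported API):

```javascript
// Base-256 parser sketch for tar numeric fields: the leading byte must
// be 0x80 (positive) or 0xff (negative), every byte including the last
// participates in the value, and unsafe integers are rejected.
function parse256(buf) {
  const first = buf[0];
  if (first !== 0x80 && first !== 0xff) {
    throw new TypeError('invalid base-256 encoding');
  }
  let value = 0;
  if (first === 0x80) {
    // Positive: remaining bytes are a big-endian unsigned integer.
    for (let i = 1; i < buf.length; i++) {
      value = value * 256 + buf[i];
    }
  } else {
    // Negative: two's complement. Accumulate the bitwise-inverted
    // bytes big-endian, then apply -(x + 1) to recover the value.
    for (let i = 1; i < buf.length; i++) {
      value = value * 256 + (0xff - buf[i]);
    }
    value = -(value + 1);
  }
  if (!Number.isSafeInteger(value)) {
    throw new TypeError('parsed number outside safe integer range');
  }
  return value;
}

// A 12-byte size field for a 10 GB (10737418240-byte) file:
const field = new Uint8Array([0x80, 0, 0, 0, 0, 0, 0, 0x02, 0x80, 0, 0, 0]);
console.log(parse256(field)); // 10737418240
```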
Review comment on lib/large-numbers.js (outdated, resolved)
@isaacs isaacs merged commit 9a44de7 into isaacs:master Jun 1, 2019
@isaacs
Owner

isaacs commented Jun 1, 2019

I see what happened here.

To encode files over 8 GB, bsdtar drops the trailing 0x20 and uses that byte as part of the octal-in-ASCII number. This is an ambiguous part of the spec (such as it is), which gnutar interprets differently. Thankfully, bsdtar also prepends a PAX extended file attributes entry, which gnutar interprets properly. So, for example, a 10 GB file would fill those 12 bytes with '120000000000', no terminator. Gnutar misinterprets this as 0o12000000000 (a mere 1.25 GB), but that's overridden by the PAX header.
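The disagreement can be shown in a few lines (my own illustration of the two readings, not either tool's actual code):

```javascript
// A 12-byte size field as bsdtar writes it for a 10 GB file: twelve
// octal digits, no NUL or space terminator.
const field = '120000000000';

// bsdtar-style reading: all 12 bytes are octal digits.
const bsd = parseInt(field, 8);              // 10737418240 (10 GB)

// gnutar-style reading: the last byte is reserved for a terminator,
// so only the first 11 digits are used.
const gnu = parseInt(field.slice(0, 11), 8); // 1342177280 (1.25 GB)

console.log(bsd, gnu);
```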

I'm not sure what bsdtar would write in that block if the file size couldn't fit in 12 octal digits, but I'm also not eager to create a 64 GB file on my laptop to find out.

@isaacs
Owner

isaacs commented Jun 1, 2019

Actually, I did get curious and checked. Bsdtar does the same thing as gnutar, but only when the file is 64 GB or greater.

In short, this patch is good, it's landed, and published to latest. Thank you for digging in and fixing this. The reason it escaped my notice for so long is frankly that pax headers are so much more straightforward and make this type of bug irrelevant in so many cases.

The checks for Number.MAX_SAFE_INTEGER made me chuckle. I thought for a second that throwing a new error should be semver major bump, but if people are using this library to pack and unpack tarballs with 8 petabyte files in them, then the world is definitely in trouble.

@justfalter
Contributor Author

Thanks for taking the time to go over this, @isaacs. I’ve got something upstream from my project that is explicitly producing gnu-formatted tarballs, for some reason.

Honestly, the only reason I even thought to do the min/max int checks was because you had tests that would have violated those checks. 8-petabyte files are pretty unlikely, but I wondered whether other numbers that get encoded (uid, gid, etc.) might someday exceed the max safe integer. I dug a bit into libarchive, and they have similar checks there as well.
