Unable to upload a file larger than 64KB #1713

Closed
yutent opened this issue Feb 12, 2023 · 1 comment

yutent commented Feb 12, 2023

import fetch from 'node-fetch'
import { Blob } from 'buffer'

// FormData and Buffer are available as globals in Node 18
let body = new FormData()

// a payload just over 64 KiB triggers the error
body.append('file', new Blob([Buffer.from('a'.repeat(64 * 1024 + 1))]))

fetch('https://localhost/upload', {
  method: 'POST',
  body
})

It throws the following error:

TypeError [ERR_INVALID_STATE]: Invalid state: chunk ArrayBuffer is zero-length or detached
    at new NodeError (node:internal/errors:393:5)
    at ReadableByteStreamController.enqueue (node:internal/webstreams/readablestream:1122:13)
    at Object.pull (file:///code/node_modules/fetch-blob/index.js:161:42) {
  code: 'ERR_INVALID_STATE'
}

OS: debian 11 amd64
node: 18.10.0
node-fetch: 3.3.0

yutent added the bug label Feb 12, 2023
jimmywarting (Collaborator) commented

I was about to start writing about something completely different (about how it's better to just use blobs backed by the filesystem), but then I thought 64KB was nothing. It's so small... so it would not matter.

Then I investigated it further and tried to reproduce it (preferably without fetch, if possible), b/c it's most likely not a problem with node-fetch itself; maybe it was just something with our toIterator-to-web-stream converter.

So I managed to reproduce it with just:

import { Blob as fetchBlob } from 'node-fetch'
import { Blob } from 'buffer'

const buf = Buffer.from('a'.repeat(64 * 1024 + 2))
const blob = new Blob([buf])
const file = new fetchBlob([blob])

for await (const c of file.stream()) {
  console.log(c)
}

(Therefore I'm going to close this, as the issue does not belong here - but feel free to keep commenting.)

This is more or less what's going on under the hood: the whole FormData is converted into a Blob and then sent to the destination, kind of like formData2blob(fd).stream().pipeTo(dest). Anyhow, that's beside the point (a bit off topic).


Readable web byte streams have this kind of step where they detach the underlying ArrayBuffer when using:

new ReadableStream({
  type: 'bytes',
  async pull (ctrl) {
    ctrl.enqueue(uint8array)
  }
})

And once the uint8array has been enqueued, its buffer is detached and its byte length becomes zero, so you can't use that uint8array again for anything else.

That's the root cause of this error: chunk ArrayBuffer is zero-length or detached - it has been detached.
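As a minimal sketch of that transfer step (added here for illustration; it is not part of the original reproduction), enqueuing a view into a byte stream and reading it once is enough to detach the view's buffer:

// runs on Node 18+, where ReadableStream is a global
const chunk = new Uint8Array(8).fill(97)

const rs = new ReadableStream({
  type: 'bytes',
  pull (ctrl) {
    ctrl.enqueue(chunk) // the controller transfers (detaches) chunk's buffer here
    ctrl.close()
  }
})

await rs.getReader().read() // triggers pull()
console.log(chunk.byteLength) // 0 - the buffer behind `chunk` is now detached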
Then I tried something else:

import { Blob } from 'buffer'

const buf = Buffer.from('a'.repeat(64 * 1024 + 2))
const blob = new Blob([buf])

for await (const c of blob.stream()) {
  console.log(c)
}
Uint8Array(65536) [
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
  97, 97, 97, 97,
  ... 65436 more items
]
Uint8Array(2) [ 97, 97 ]

All looks good ✅
Then I had a look at the underlying ArrayBuffer and tried something else:

for await (const c of blob.stream()) {
  console.log(c.buffer) // log the arrayBuffer
}
ArrayBuffer {
  [Uint8Contents]: <61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 ... 65438 more bytes>,
  byteLength: 65538
}
ArrayBuffer {
  [Uint8Contents]: <61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 ... 65438 more bytes>,
  byteLength: 65538
}

That does not look right... ❌
It should have been:

ArrayBuffer {
  [Uint8Contents]: <61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 ... 65438 more bytes>,
  byteLength: 65538
}
ArrayBuffer {
  [Uint8Contents]: <61 61>,
  byteLength: 2
}

I took an extra look at the NodeJS source code (console.log(Blob.toString())), cuz I knew that the .stream() impl in NodeJS isn't perfect.

stream() {
    if (!isBlob(this))
      throw new ERR_INVALID_THIS('Blob');

    const self = this;
    return new lazyReadableStream({
      async start() {
        this[kState] = await self.arrayBuffer();
        this[kIndex] = 0;
      },

      pull(controller) {
        if (this[kState].byteLength - this[kIndex] <= kMaxChunkSize) {
          controller.enqueue(new Uint8Array(this[kState], this[kIndex]));
          controller.close();
          this[kState] = undefined;
        } else {
          controller.enqueue(new Uint8Array(this[kState], this[kIndex], kMaxChunkSize));
          this[kIndex] += kMaxChunkSize;
        }
      }
    });
  }

The issue here is that it's using:

controller.enqueue(new Uint8Array(this[kState], this[kIndex], kMaxChunkSize));

instead of doing things such as:

controller.enqueue(new Uint8Array(this[kState].slice(start, end)));

Reusing the same underlying ArrayBuffer (and handing out a Uint8Array with a byte offset & size instead of a copy) is unexpected from a web spec perspective.
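To make that concrete, here is a small self-contained sketch of the slice-based idea (added for illustration; the bufferToStream name is made up and not the NodeJS source, and 65536 just mirrors the chunk size seen above): every enqueued chunk is a slice() copy, so a downstream byte stream can detach the chunk's buffer without zeroing out the source ArrayBuffer.

const kMaxChunkSize = 65536 // same chunk size as in the output above

// hypothetical helper, not part of NodeJS or fetch-blob
function bufferToStream (arrayBuffer) {
  let index = 0
  return new ReadableStream({
    pull (controller) {
      const end = Math.min(index + kMaxChunkSize, arrayBuffer.byteLength)
      // slice() copies the bytes, so detaching the enqueued chunk later
      // cannot affect `arrayBuffer` itself
      controller.enqueue(new Uint8Array(arrayBuffer.slice(index, end)))
      index = end
      if (index >= arrayBuffer.byteLength) controller.close()
    }
  })
}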

So really this isn't an issue with fetch-blob either, but with NodeJS's own implementation of their Blob class. It should not even be doing:

    return new lazyReadableStream({
      async start() {
        this[kState] = await self.arrayBuffer();

as allocating the whole arrayBuffer like that is a bad memory hog, instead of creating some more stream-friendly variant (a rough sketch of what that could look like is just below).
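For illustration only (the blobToStream helper is hypothetical, not an existing API or a proposed Node patch), a more stream-friendly userland variant could pull one slice of the Blob at a time instead of calling arrayBuffer() on the whole thing up front:

// hypothetical helper for illustration
function blobToStream (blob, chunkSize = 65536) {
  let offset = 0
  return new ReadableStream({
    async pull (controller) {
      const end = Math.min(offset + chunkSize, blob.size)
      // read just this slice with arrayBuffer(), not the whole blob at once;
      // each chunk also gets its own freshly allocated buffer, so detaching
      // it downstream is harmless
      const chunk = await blob.slice(offset, end).arrayBuffer()
      controller.enqueue(new Uint8Array(chunk))
      offset = end
      if (offset >= blob.size) controller.close()
    }
  })
}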
I have already reported to NodeJS that it's bad: nodejs/node#42264
I even reported this same exact problem in one of the comments: nodejs/node#42264 (comment)

But I had totally forgotten that I even posted those to NodeJS earlier.

I have always been reluctant to use NodeJS's own Blob implementation and to do things such as export default globalThis.Blob || polyfill, cuz, well... sorry for saying this: the NodeJS Blob impl. kind of sucks...

  • it isn't stream friendly,
  • casting the blob.type to all lowercase is bad,
  • and there is no way to create a blob backed by the filesystem, so they are completely useless - you might just as well use an ArrayBuffer or Typed Arrays then, b/c it's held in memory anyway...

node-fetch locked and limited conversation to collaborators Feb 12, 2023
jimmywarting converted this issue into discussion #1714 Feb 12, 2023
