Implement createObjectURL/Blob from File API #16167

bmeck · 2017-10-12T18:46:10Z

Tracking Issue to allow Loaders to create in-memory URLs that can be imported for things like code coverage:


bmeck · 2017-10-12T18:46:27Z

TimothyGu · 2017-10-12T19:35:01Z

For reference: https://w3c.github.io/FileAPI/

bcoe · 2017-10-12T20:28:58Z

@bmeck @TimothyGu I'd be interested in pitching in on this work, along with being one of the early consumers with Istanbul ... designing the Blob and BlobStore bit sounds interesting. Do you picture we'd be exposing existing structures in V8?

bmeck · 2017-10-12T21:39:29Z

@bcoe great! Unfortunately V8 does not expose Blobs in the File API sense; the blobs in v8.h refer to snapshot blobs, which are a very different beast. The File API is quite thorough in what should be done. We should avoid File for now though, since I can't think of a clear use case.

The important bit about the BlobStore is that it works across workers. If a worker creates a URL using URL.createObjectURL, it should be available in all threads.

If you need any help I can assist when I have a bit more free time or if you schedule something in advance I will make time.

refack · 2017-10-12T21:51:09Z

For reference - What is BlobStore?

bmeck · 2017-10-12T21:59:54Z

@refack it is the place that url string => Blob mapping is stored by the environment. See spec.

It is used such that it can share URLs across workers so you can do multi-threaded processing: https://jsfiddle.net/ctyvm1tr/1/
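As a rough illustration of the "url string => Blob mapping" idea, here is a minimal single-threaded sketch. The `BlobStore` class and its `register`/`resolve`/`revoke` methods are hypothetical names invented for this example; they are not an existing Node.js API, and the cross-worker sharing discussed above is not modelled.

```javascript
// Hypothetical sketch: a BlobStore modelled as a plain Map from URL string to Blob.
const { Blob } = require('buffer');
const { randomUUID } = require('crypto');

class BlobStore {
  constructor() {
    this.entries = new Map();
  }

  // Register a Blob and return a freshly minted blob: URL for it.
  register(blob) {
    const url = `blob:${randomUUID()}`;
    this.entries.set(url, blob);
    return url;
  }

  // Look up the Blob for a URL; returns undefined if unknown or revoked.
  resolve(url) {
    return this.entries.get(url);
  }

  // Drop the mapping so the store no longer keeps the Blob alive.
  revoke(url) {
    return this.entries.delete(url);
  }
}

const store = new BlobStore();
const sourceBlob = new Blob(['export default 42'], { type: 'text/javascript' });
const blobUrl = store.register(sourceBlob);
```

A store shared between threads would need the map to live outside any single JS heap, which is presumably why a native implementation is being discussed.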

bcoe · 2017-10-12T22:07:56Z

@bmeck I intend to make some time this weekend to read through the spec and play with the existing APIs in the browser. Once I know more than basically nothing, I would definitely be interested in arranging a quick screen share.

Is there any prior art in the codebase that shares state across workers that we could build on?

bmeck · 2017-10-12T22:41:27Z

@bcoe nothing in this realm that is sane to read that I know of. I know game engines use it, but that isn't helpful since I don't know their internals.

Fishrock123 · 2017-10-12T23:29:48Z

I'm not really certain what the point of this is given we have an existing file system api and various types of buffers. Could this please be elaborated on before implementation? Thanks.

jasnell · 2017-10-13T00:16:54Z

So I've been working on this but I've been behind due to other pressing matters. It's very much something that I would like to see. To be specific: I already have an implementation underway, I just haven't had the time to finish it. My goal is to have an initial implementation by mid to late November.

In terms of what the implementation would provide:

A node::blob::Blob native class that represents an immutable chunk of data. This could represent a file on disk, it could represent an allocated chunk of memory, etc. There would be a corresponding JS object but the key point of node::blob::Blob is that the data is held at the native layer without ever crossing into JS unless a FileReader is used.
A node::blob::BlobStore native class that is essentially an addressable store for node::blob::Blob objects. This is essentially a relatively straightforward map-like object.
JavaScript level Blob, File and FileReader classes implemented per the spec. These would be backed by the node::blob::Blob.
An implementation of URL.createObjectURL(). There would be both C++ and JS implementations of this method, allowing a URL to be generated for a node::blob::Blob within a node::blob::BlobStore.

While this all may seem complicated, the interfaces here are rather simple. A File Blob, for instance, is a thin wrapper on top of libuv's existing file system operations for reading a file. This would essentially just end up being a FileReader based alternative to fs.createReadStream(). It's really quite lightweight in the details. The key issue with File, however, is the requirement to support mime types, which we currently do not handle within Core. That will take some thinking to figure out.

For Blob in general, it is really nothing more than a persistent allocated chunk of memory. It would be possible to create a Blob from one or more TypedArray objects. I'm sketching out additional APIs for the http and http2 modules that would allow a response to draw data from a Blob rather than through the Streams API. There is already something analogous in the http2 implementation in the form of the respondWithFile() and respondWithFD() APIs. Basically, the idea would be to prepare chunks of allocated memory at the native layer, with data that never passes into the JS layer (unless absolutely necessary to do so), then use those to source the data for responses. In early benchmarking this yields a massive boost in throughput without the usual backpressure control issues.
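For illustration, creating a Blob from several TypedArray objects might look like the sketch below. It assumes the `Blob` class that later shipped in Node's `buffer` module, which is not necessarily the native design described in this comment; the variable names are made up for the example.

```javascript
const { Blob } = require('buffer');

// Two typed arrays of different element widths.
const header = new Uint8Array([0x48, 0x54, 0x54, 0x50]); // 4 bytes
const body = new Uint16Array([1, 2, 3]); // 3 elements * 2 bytes = 6 bytes

// The Blob concatenates the underlying bytes of each part, in order.
const payload = new Blob([header, body], { type: 'application/octet-stream' });
const totalBytes = payload.size;
```

The resulting Blob is immutable, so later changes to `header` or `body` are not expected to affect `payload`.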

There is certainly a cost, and there are aspects of the implementation that are non-trivial, but the benefits are quite real.

FWIW, I'm not entirely sold on the idea of implementing the File and FileReader portions of this model yet, so I haven't worked on those pieces and could easily be talked out of doing so.

bcoe · 2017-10-13T06:32:25Z

@jasnell my personal interest in this API surface is a follow on from:

#15445

The goal being to facilitate test-coverage and other transpilation steps in .mjs files.

I'm picturing that one could instrument code for coverage using pseudo code that looks something like this:

import fs from 'fs'
import path from 'path'
import { URL } from 'url'
// `istanbul.instrument` stands in for whichever instrumenter is used.

export async function resolve(specifier, parentModuleURL, defaultResolver) {
  const resolved = new URL(specifier, parentModuleURL)
  const ext = path.extname(resolved.pathname)
  if (ext === '.mjs') { // path.extname() includes the leading dot
    const source = fs.readFileSync(resolved.pathname, 'utf8')
    const instrumented = istanbul.instrument(source)
    const blob = new Blob([instrumented], {type : 'application/mjs'})
    return {
      url: URL.createObjectURL(blob),
      format: 'esm'
    }
  } else {
    return defaultResolver(specifier, parentModuleURL)
  }
}

Does it seem like I'm on the same page as to how this API could potentially be used?

...an aside:

I keep coming back to the argument that @guybedford's work on #15445 should be exposed through an API hook rather than just a flag. In the world of developer tools, it's often the case that a few transformations need to be performed in sequence, e.g.,

a TypeScript transpilation step takes place to translate TypeScript typing into valid ES2015
a Babel transpilation parsing bleeding edge features, e.g., class decorators.
Istanbul runs, adding line counters to each line of (now ES2015) code.

I don't hate the idea of using createObjectURL() to facilitate the transpilation step ... but now that I sit down and hammer out some pseudo code, I'm not immediately seeing how one could compose the multi-step transformations (described above) using the --loader flag.

In the land of require.extensions one is able to create a stack of the prior transformations being applied, and a multistep transpilation can be applied without each actor knowing about the other (this is important, given the fractal nature of developer toolchains).
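To make the composition concern concrete, here is a small sketch of chaining source-to-source transforms so that no step needs to know about the others. `composeTransforms`, `stripTypes`, and `addCounter` are hypothetical stand-ins invented for this example; they are not real TypeScript, Babel, or Istanbul APIs, and this is not how require.extensions or --loader work internally.

```javascript
// Hypothetical helper: run each transform on the output of the previous one.
function composeTransforms(transforms) {
  return (source, filename) =>
    transforms.reduce((code, transform) => transform(code, filename), source);
}

// Toy stand-ins for real tools; each takes source text and returns source text.
const stripTypes = (code) => code.replace(/: number/g, '');
const addCounter = (code, filename) => `/* covered: ${filename} */\n${code}`;

const pipeline = composeTransforms([stripTypes, addCounter]);
const output = pipeline('const n: number = 1', 'a.mjs');
```

With a hook shaped like this, the final string could be wrapped in a Blob once, after every step has run.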

CC: @demurgos, @iarna

devsnek · 2018-06-19T22:52:00Z

with the new worker api i'd like to get this all working primarily to support new Worker('blob:uuid')

@bmeck that should be enough reasoning to land mimes yea?

guybedford · 2018-06-24T15:22:44Z

This work would be great to see.

@bcoe it's best not to try and see this as the final picture on the matter I think, but rather allow it to inform the discussions. The use case you describe is one very much understood by the modules group, that will be polished in due course.

Would also be interested to hear your thoughts on #18914 as it is a goal of mine to get that going again, just not sure how much to prioritise it right now.

jimmywarting · 2020-02-28T12:54:08Z

Don't really need the FileReader now that there are new reading methods on blobs

blob.text() (promise)
blob.arrayBuffer() (promise)
blob.stream() whatwg readable stream
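A minimal sketch of the promise-based readers, assuming the `Blob` exported from Node's `buffer` module; `readBoth` is a hypothetical helper name used only here.

```javascript
const { Blob } = require('buffer');

// Read the same Blob twice: once as text, once as raw bytes.
async function readBoth(blob) {
  const text = await blob.text();
  const bytes = new Uint8Array(await blob.arrayBuffer());
  return { text, byteLength: bytes.byteLength };
}

const greeting = new Blob(['hello']);
const pending = readBoth(greeting); // a Promise; no FileReader involved
```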

jimmywarting · 2021-02-12T23:58:52Z

jasnell · 2021-02-13T03:05:18Z

So folks know, I've already started work on the async blobs piece. And that is a prereq for the filesystem blobs. Expect a pr soonish.

jasnell · 2021-08-12T23:27:49Z

URL.createObjectURL() and URL.revokeObjectURL() have landed.
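A short sketch of how these might be used together. It assumes a Node.js version that includes both methods, and it uses `resolveObjectURL` from the `buffer` module, which is documented as experimental and may change.

```javascript
const { Blob, resolveObjectURL } = require('buffer');
const { URL } = require('url');

const moduleSource = new Blob(['hello'], { type: 'text/plain' });

// Register the Blob and get back a blob: URL string for it.
const objectUrl = URL.createObjectURL(moduleSource);

// Look the Blob up again by its URL (experimental API).
const resolved = resolveObjectURL(objectUrl);
const resolvedSize = resolved.size;

// Release the registration once the URL is no longer needed.
URL.revokeObjectURL(objectUrl);
```

The registered data stays in memory until it is revoked, so pairing every createObjectURL() call with a revokeObjectURL() call is a reasonable habit.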

jimmywarting · 2021-08-14T23:02:56Z

URL.createObjectURL() and URL.revokeObjectURL() have landed.

...And blob#streams and some other minor stuff! Cool! what's next? The File class?

I suppose a async blob source is supported now (from #39693) ...or?

I'm not entirely sure what the underlying data structure looks like anymore (how it's handled in the backend)... whether it still behaves like a large ArrayBuffer bucket like before, or like a blobParts array that holds all chunks with an offset+size. What happens under the hood if you slice a large blob? Does it take up more memory or is it just a reference now? What would, e.g., happen if I did:

const twoGiB = 2 * 1024 ** 3
const blob = new Blob([new Uint8Array(twoGiB)])
const concat = new Blob([blob, blob]) // 4 GiB
concat.slice(0, twoGiB)
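A scaled-down version of the same experiment, assuming the `Blob` from Node's `buffer` module, shows only the observable sizes. It says nothing about whether the implementation copies or references the underlying memory, which is the open question here.

```javascript
const { Blob } = require('buffer');

const chunkSize = 1024; // small stand-in for the 2 GiB case
const blob = new Blob([new Uint8Array(chunkSize)]);
const concat = new Blob([blob, blob]); // twice the chunk size
const firstHalf = concat.slice(0, chunkSize);

const sizes = [blob.size, concat.size, firstHalf.size];
```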

jimmywarting · 2022-03-09T12:03:36Z

Reading a blob's text when it is larger than 500 MiB is a problem.

github-actions · 2022-09-06T01:30:10Z

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

jasnell · 2022-09-06T01:31:20Z

This was done

bmeck self-assigned this Oct 12, 2017

mscdex added the feature request Issues that request new features to be added to Node.js. label Oct 12, 2017

jimmywarting mentioned this issue Jul 10, 2019

What rules do the community tend to override? airbnb/javascript#1089

Open

b1f6c1c4 mentioned this issue Jan 19, 2020

node js compatibility tinode/tinode-js#28

Closed

fabiospampinato mentioned this issue Feb 19, 2020

webContents.printToPDF regression in v5 electron/electron#18093

Closed

3 tasks

jimmywarting mentioned this issue May 25, 2020

Question: Who and why uses Blob on Node.JS node-fetch/node-fetch#835

Closed

jimmywarting mentioned this issue Dec 9, 2020

Getting file metadata WICG/file-system-access#101

Closed

jasnell mentioned this issue Jan 6, 2021

buffer: introduce Blob #36811

Closed

github-actions bot added the stale label Sep 6, 2022

jasnell closed this as completed Sep 6, 2022

jimmywarting mentioned this issue May 31, 2023

Pursuing this as an active proposal to the real spec Aschen/webblob-experiment#1

Open
