[Discussion] Out-of-band signal for requesting Binary AST #27

Open
RReverser opened this issue Mar 10, 2018 · 26 comments

@RReverser
Contributor

RReverser commented Mar 10, 2018

I recently asked on the chat about a planned way to request Binary AST from the server and got the following answer:

@Yoric: We plan to have a mechanism, but we haven't attempted to design it yet. The vague consensus for the moment was to use something like <script src="..." binsrc="...">, which seems like the cheap way to keep it backwards-compatible.

While this is a relatively simple solution, I have a concern about limitations it imposes.

In particular, in an ideal world I think it would be reasonable to support a use case where, e.g., a shared CDN with lots of JavaScript libraries could simply create Binary AST variants of all its assets and return them instead of regular JavaScript when it knows that 1) the browser supports it and 2) the change would be mostly invisible to the consumer (that is, the JS was requested via <script>, import(...), or other means purely for execution, and not with XMLHttpRequest or fetch).

To support use cases like that, the signal for Binary AST support should come not from the HTML level (as it's much harder to get HTML updated on all the websites where the script is inserted), but rather from the network level.

One way to do this would be to add binast or a similar marker to the Accept-Encoding list for script requests in supporting browsers, which would tell the server that the Binary AST version can safely be returned with Content-Encoding: binast in the response.

Using encoding headers for this goal feels quite natural, as it's mostly an encoding format for JavaScript, although one might argue that because it's not lossless in terms of debugging information, it doesn't belong in the Accept-Encoding/Content-Encoding headers. In that case, I'm open to any proposals.
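For illustration, here is a minimal sketch of how a CDN might act on such a signal. It assumes the `binast` coding name floated above (hypothetical, not a registered content coding), and `response_headers` and `has_binast_variant` are names made up for this example. It only shows the negotiation step, not how the variant is produced.

```python
from __future__ import annotations

BINAST_CODING = "binast"  # hypothetical coding name from this issue; not registered


def accepted_codings(accept_encoding: str) -> dict[str, float]:
    """Map each coding in an Accept-Encoding value to its q-value (default 1.0)."""
    codings: dict[str, float] = {}
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        if not name:
            continue
        q = 1.0
        key, _, value = params.strip().partition("=")
        if key.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        codings[name.strip().lower()] = q
    return codings


def response_headers(accept_encoding: str, has_binast_variant: bool) -> dict[str, str]:
    """Pick headers for a script response; fall back to plain JavaScript."""
    headers = {"Vary": "Accept-Encoding"}  # caches must key on the request header
    if has_binast_variant and accepted_codings(accept_encoding).get(BINAST_CODING, 0.0) > 0:
        headers["Content-Encoding"] = BINAST_CODING
    return headers
```

A browser that never lists `binast` keeps getting regular JavaScript, which is what makes the switch invisible to existing pages.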

@RReverser
Contributor Author

RReverser commented Mar 11, 2018

Actually, another option could be to use the Accept/Content-Type pair, but 1) it might look weird to have a different content type loaded for <script type="text/javascript" src="...">, though that's probably not a problem, since the server can already return any supported JS MIME type regardless of what is specified on the tag, and 2) the script's Accept header currently just sends */*, so this would need to work like WebP support, where the new variant is explicitly requested first.

@Yoric
Collaborator

Yoric commented Mar 12, 2018

Thanks for starting this conversation. It would indeed be great to specify this in such a way that proxies and CDNs can speed up webpages transparently.

So far, our experiments with putting compression inside BinAST don't look useful, so I suspect that we're going to end up using an out-of-the-box compression mechanism. So what about a

- Accept-Encoding: binjs+gz (or + anything else); or
- Content-Type: application/binjs and Accept-Encoding: gz.

@RReverser
Contributor Author

Having thought about it a bit more, I'm starting to think that the MIME type approach is indeed a bit easier to implement and makes more sense than the encoding one I originally proposed.

So, to be more precise about:

- Content-Type: application/binjs and Accept-Encoding: gz.

The client side will have to prepend another MIME type to Accept as well, so the request will look like:

...
Accept: application/binjs, */*
Accept-Encoding: gzip, deflate, br
...

and the server will respond with something like:

...
Content-Type: application/binjs
Content-Encoding: br
...

Does that look right?
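To make the exchange concrete, here is a rough server-side sketch of that negotiation. `application/binjs` is only the working name used in this thread, and `negotiate_script` and the coding preference order are assumptions of the example. The one non-obvious point is that `*/*` alone must not select the binary variant; it has to be listed explicitly, as with WebP.

```python
from __future__ import annotations

BINJS_TYPE = "application/binjs"  # working name from this thread; not a registered MIME type
PREFERRED_CODINGS = ("br", "gzip", "deflate")  # assumed server preference order


def _tokens(header: str) -> dict[str, float]:
    """Parse a comma-separated header value into {token: q-value} (default 1.0)."""
    result: dict[str, float] = {}
    for item in header.split(","):
        name, *params = (part.strip() for part in item.split(";"))
        if not name:
            continue
        q = 1.0
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # malformed weight: treat as not acceptable
        result[name.lower()] = q
    return result


def negotiate_script(accept: str, accept_encoding: str) -> dict[str, str]:
    """Choose response headers for a script request."""
    types = _tokens(accept)
    codings = _tokens(accept_encoding)
    # Only an explicit listing counts; "*/*" on its own keeps plain JavaScript.
    wants_binjs = types.get(BINJS_TYPE, 0.0) > 0
    headers = {
        "Content-Type": BINJS_TYPE if wants_binjs else "text/javascript",
        "Vary": "Accept, Accept-Encoding",  # both request headers affect the response
    }
    for coding in PREFERRED_CODINGS:
        if codings.get(coding, 0.0) > 0:
            headers["Content-Encoding"] = coding
            break
    return headers
```

With the request headers shown above, this returns `Content-Type: application/binjs` together with `Content-Encoding: br`; a request that sends only `*/*` gets `text/javascript`.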

@Yoric
Collaborator

Yoric commented Mar 12, 2018

That looks fine to me.

@RReverser
Contributor Author

Cool, looks good to me too.

@RReverser
Contributor Author

@Yoric Do we need to commit to this in the proposal text somehow?

@Yoric
Collaborator

Yoric commented Mar 12, 2018

I believe that TC39 doesn't care about mime types or content encoding (@syg, can you confirm?), so this should probably go to some other proposal.

For the sake of experimenting, I have filed a Firefox bug on the topic. I'll try and find someone to work on it (possibly me) once we have a working multipart tokenizer in Firefox.

@RReverser
Contributor Author

@Yoric Thanks!

@RReverser
Contributor Author

To clarify what I meant: even if TC39 doesn't care about these details, it would still be nice to have all the information about Binary AST (including delivery) in the same place, so that it's easy to find and refer to.

@Yoric
Collaborator

Yoric commented Mar 12, 2018

@RReverser Maybe in an Examples section? You'll have to discuss this with @syg, he's the Master of the Spec.

@annevk
Member

annevk commented Mar 26, 2018

FWIW, if this ends up being a thing this will need to be defined in the HTML Standard. Having a monkey patch of sorts of the algorithms there would be good so tests and such can be written against that. You might also want to file an upstream issue at https://github.com/whatwg/html/issues/new to make it clear you're extending the algorithms defined there.

@annevk
Member

annevk commented Mar 26, 2018

I also think we might want to restrict these to be CORS-loaded, as with module scripts. That's another thing that'll need to be defined here.

@RReverser
Contributor Author

@annevk

> to make it clear you're extending the algorithms defined there

Given that this most likely won't be allowed inline in script tags, what algorithms do you think we'll need to change in the HTML spec? As far as I can see, this proposal could get away with no changes to the normative sections, since it should behave exactly like any other external script element with text/javascript and such, just as gzip and brotli don't get any special handling in the HTML spec (AFAIK).

> I also think we might want to restrict these to be CORS-loaded, as with module scripts.

Won't this break existing pages relying on scripts in the "transparent optimisation" scenario described above?

@annevk
Member

annevk commented Mar 26, 2018

@RReverser for classic scripts HTML requires all responses, regardless of their Content-Type header, to be parsed per the JavaScript specification. This would change that, no?

As for requiring CORS, we sorta decided to do that for new types of resources. Given what we now know about attacks on opaque responses that seems like a good thing. It seems bad to me to allow new types of resources to be loaded without CORS, thereby continuing to support known bad patterns.

@RReverser
Contributor Author

@annevk

> This would change that, no?

I suppose that's true, yeah, although we would first need to have the ECMAScript spec changes landed so there is something to link to.

> As for requiring CORS, we sorta decided to do that for new types of resources.

I agree in general, but what concerns me in this case is that it's not really a new type of resource (in the usual sense of "resource type"), but rather a special encoding of existing ones that should work transparently for actual websites.

So I'm not saying we shouldn't do it per se, but I do wonder if there are scenarios where it would break websites that rely on third-party scripts from a host that decides to opt in to Binary AST. Or, if it's mostly third-party, this shouldn't cause any new issues?

@annevk
Member

annevk commented Mar 27, 2018

> first we would need to have ECMAScript spec changes landed

The HTML Standard links directly to some ECMAScript proposals, such as BigInt and import(), that are expected to make it.

If the third-party makes the choice that would be problematic, yes. However, we could make it so that the Accept header is not modified for such requests (that might be a good idea regardless, as modifying the Accept header is itself a slight same-origin policy violation imo).
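One possible reading of that suggestion, sketched from the client's point of view: the Binary AST type is only advertised for requests that are CORS-checked, and the Accept header is left untouched otherwise. The mode names are the Fetch request modes (a classic `<script>` without a `crossorigin` attribute is fetched with "no-cors"). `script_accept_header` is a hypothetical helper and `application/binjs` is still just the working name.

```python
BINJS_TYPE = "application/binjs"  # working name from this thread; not a registered MIME type


def script_accept_header(request_mode: str) -> str:
    """Return the Accept value a browser might send for a script fetch.

    request_mode is a Fetch request mode: "cors", "same-origin" or "no-cors".
    """
    if request_mode in ("cors", "same-origin"):
        # The response is CORS-checked or same-origin, so advertise the binary variant.
        return f"{BINJS_TYPE}, */*"
    # no-cors: leave the header exactly as it is today.
    return "*/*"
```

Under this reading, a third-party host that opts in to Binary AST would not affect pages that load its scripts without CORS, because those requests never ask for the binary variant.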

@Yoric
Collaborator

Yoric commented Apr 18, 2018

Minimal change: I'd like to rename the MIME type to application/javascript+binast. I believe that this is clearer.

@RReverser
Contributor Author

I don't have any preferences regarding the mime type, so sounds good to me.

@annevk
Member

annevk commented Apr 18, 2018

I'd recommend using - instead of +, since binast is specific to JavaScript, if I'm not mistaken, and not a generally applicable suffix.

@Yoric
Collaborator

Yoric commented Apr 18, 2018

Well, I have the not-so-well-hidden idea of applying this to json soon, and then to experiment with css and html, so I believe that the + makes sense. What do you think, @annevk?

@annevk
Member

annevk commented Apr 18, 2018

Yeah, then it certainly would.

@hsivonen

> Well, I have the not-so-well-hidden idea of applying this to json soon

How does the result differ from CBOR?

@Yoric
Collaborator

Yoric commented Apr 20, 2018

I wasn't aware of CBOR. This seems to be pretty much equivalent.

@adamroach

On the question of + versus - in the MIME type: the existence of application/foo+bar implies an underlying standalone MIME type of application/bar. So, if you think it makes sense to have an application/binast, then the plus might make sense.

I suspect that's probably not what you want. But once you make a decision, I'm happy to help you figure out how to get this all registered. The MIME type registration would be an IETF item.

Tagging @linuxwolf for his awareness.

@j-f1

j-f1 commented Jun 15, 2018

@adamroach Would application/bar+foo also imply a MIME type of just application/bar?

@annevk
Member

annevk commented Jun 16, 2018

@j-f1 no, e.g., there's image/svg+xml, but not image/svg (though note there's also no image/xml so @adamroach's rule doesn't work entirely either).
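For anyone following the + versus - question, a small sketch of how the two spellings decompose. RFC 6838 describes structured syntax suffixes; taking everything after the last "+" as the suffix is an assumption of this sketch, and `split_mime_type` is a hypothetical helper, not a library function.

```python
from __future__ import annotations


def split_mime_type(mime: str) -> tuple[str, str, str | None]:
    """Split "type/subtype+suffix" into (type, subtype, suffix or None)."""
    top, _, rest = mime.strip().lower().partition("/")
    # Assumption: the suffix is whatever follows the last "+" in the subtype.
    subtype, plus, suffix = rest.rpartition("+")
    if not plus:
        return top, rest, None  # no "+" at all, so there is no suffix
    return top, subtype, suffix
```

With this, `image/svg+xml` splits into subtype `svg` and suffix `xml`, and `application/javascript+binast` into subtype `javascript` and suffix `binast`, whereas `application/javascript-binast` has no suffix and is a single opaque subtype.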
