Semantics of unrecognized types #27

Open
littledan opened this issue Nov 9, 2019 · 37 comments

@littledan
Member

In several threads, @ljharb has suggested that unrecognized type values be treated as a missing type. Let's centralize discussion of that question here.

@ljharb
Member

ljharb commented Nov 9, 2019

The same reason that unrecognized attributes are ignored - that if a “magic” attribute is added in 2021, 2020 browsers shouldn’t fail on it, or else anyone trying to support 2020 browsers can’t use “magic” - applies to a type value.

If a new type is added in 2021 (not so much a brand-new module format, but something like “modules-stricter” or “wasm-extra”, or something else we can’t predict), then code that wants to support 2020 engines can’t use it if unrecognized types cause a failure; it can if they’re ignored.

@littledan
Member Author

My worry, on the other hand, is that if we permit unknown type values, then it could quickly become Web-incompatible to add a different interpretation to that type in the future. This is one reason the Web was able to go with <script type=module> -- script elements with an unrecognized type were inert, so the new type had no existing behavior to conflict with!

@ljharb
Member

ljharb commented Nov 9, 2019

The difference there is that because the script tag is out of band, there’s no need to coordinate with other platforms, since only the web has an html script tag.

Putting things in-band, in code, has the additional wrinkle of interacting with other non-web platforms, unless you want to require a build process.

Any out of band mechanism needs no such compatibility concern because it can be tailored to the target platforms.

@littledan
Member Author

I'm a bit confused. Even though the script tags are out of band, there was a compatibility issue in this case. I'd expect we'd have to worry about compatibility errors in general whether something is in-band or out-of-band, even if those compatibility issues only exist on the Web and not in other JS environments. (Our slogan is "Don't break the Web", not "Don't break the intersection of all JS environments".)

@ljharb
Member

ljharb commented Nov 9, 2019

Yes, but that’s not our only slogan - the web isn’t the sole constituent for the JS language. You’re right that there’s always a compatibility concern, but when it’s within the web, the concerns can be limited to web concerns.

Things that go inside code are always cross-platform concerns.

@littledan
Member Author

littledan commented Nov 9, 2019

Yes, I agree we should worry both about web-compatibility concerns and concerns across the JS ecosystem. Wherever type: is used, I think there's a very strong forward-compatibility concern if we ignore unknown types. If we decide to move this out of JS and into an out-of-band file, that would affect all environments where it's used--if you're trying to affect how that works, forward-compatibility concerns apply there as well.

@ljharb
Member

ljharb commented Nov 9, 2019

The web would be the only consumer of such out of band information, as far as i can tell. Erroring on unknown types in that scenario seems fine to me.

@littledan
Member Author

littledan commented Nov 9, 2019

I disagree that there is a difference between in-band and out-of-band with respect to this sort of question. An environment may choose to either entirely ignore the inline type: or not have an out-of-band file. They are logically equivalent.

@ljharb
Member

ljharb commented Nov 9, 2019

I think ignoring the inline type entirely wouldn’t make any sense at all; if something is going to be added to the language, its semantics should apply to every implementation.

@gibson042

I agree with @ljharb that we need to have an upgrade path for new type values, but disagree that the strategy should be "load modules with unknown type value as if the attribute were absent".

For constraining preexisting types, new attributes would be better than new type values (e.g., type: "esm", pure: "true" rather than type: "esm-pure").
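
For concreteness, a sketch of the two shapes being contrasted here (the esm-pure type and pure attribute are hypothetical values from the example above, and the with { ... } form follows the syntax used later in this thread):

// Option A: bake the constraint into the type value itself.
// An engine that only recognizes "esm" cannot recognize this type.
import utilsA from "./utils.js" with { type: "esm-pure" };

// Option B: keep the known type and express the constraint as a separate attribute.
// An engine that ignores unknown attributes can still load this as plain "esm".
import utilsB from "./utils.js" with { type: "esm", pure: "true" };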

For extending preexisting types in a way that does not break loading (such as use of new ECMAScript standard library functionality, e.g. String.prototype.matchAll), no changes at all are necessary for importing, just guards inside the module.

However, for extending preexisting types at such a fundamental level that it does break loading (such as use of new ECMAScript syntax or a new WASM version/section id/etc.), a new type value is warranted—and no amount of import signals would help an old client load those modules. What I would hope for there is a way to express that an import is conditional, ideally with sufficient power to also define fallback imports (although that may be out of scope for this proposal).

@ljharb
Member

ljharb commented Nov 10, 2019

The security concerns motivating this proposal seem to be about ensuring that you can safely import something you know to be non-executing, and that it won’t change to execute on you later. If that’s accurate, then instead of specifying a type, this really seems like a flag “I expect this not to execute in the JS environment” - it’d be set for json and css, and not for wasm or js, for example; and new module types would fall into one bucket or the other.
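
As a rough illustration of that idea (the attribute name below is invented for this sketch, not proposed syntax):

// Hypothetical "must not execute" flag instead of naming a concrete type:
import config from "./config.json" with { noExecute: "true" }; // satisfied by JSON, CSS, ...
import lib from "./lib.wasm"; // executable formats simply omit the flag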

@littledan
Member Author

@gibson042 The fallback idea is an interesting one. I was imagining import-maps would provide that, but that functionality has now been removed (and it's unclear how import map fallbacks would tie into the module attributes). Do you have any concrete ideas for how these fallbacks could work? Do you think they are needed for our MVP here?

@xtuc
Member

xtuc commented Nov 11, 2019

I think that import-maps would be a great fit for this proposal on the web. They could allow falling back when a certain type isn't known to the current environment by providing an alternative implementation (the security guarantee would not be preserved, however).

Migration to WebAssembly with a fallback to JS is a good example.
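
As a sketch of the shape being referred to, earlier import-maps drafts allowed an array of URLs as a fallback chain (that feature was later removed, as noted below, and it carries no module attributes):

{
  "imports": {
    "image-lib": [
      "./image-lib.wasm",
      "./image-lib-fallback.js"
    ]
  }
}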

nodejs/node#29978 might allow it on Node.js as well, though the use-cases there are different. Edit: #25 might be a better place to discuss Node.js implementation details.

@littledan
Member Author

I do like the idea of falling back between module types somehow. Note, fallbacks were removed from import-maps. Even if they were still present, we'd have to think about how to represent a URL together with module attributes in the import map, where currently, import maps only map to a URL.

@gibson042

gibson042 commented Nov 13, 2019

I don't know that fallback semantics are necessary for MVP, but the pain from their absence is already real—await, regular expression named captures and lookbehind assertions, and object rest/spread were all added after import, making them syntactic hazards for modules (a problem which will only grow over time, as already happened with script source for browsers).

As for concrete ideas, I would look to CSS @supports for inspiration because it serves an analogous function. So we might end up with something like

import { foo as fooPreferred } from "./newHotness.js" with {
	condition: "es.version >= 2021"
};
import { foo as fooFallback } from "./oldAndBusted.js" with {
	condition: "!(es.version >= 2021)"
};
const foo = fooPreferred ?? fooFallback;

import { patterns as patternsPreferred } from "./unicode-modern.js" with {
	condition: "es.unicodeSequenceProperties"
};
import { patterns as patternsFallback } from "./unicode-legacy.js" with {
	condition: "!es.unicodeSequenceProperties"
};
const patterns = patternsPreferred ?? patternsFallback;

// Per EDIT below...
import { utils as utilsPreferred } from "./util-modern.js" with {
	condition: "(class { #private })"
};
import { utils as utilsFallback } from "./util-legacy.js" with {
	condition: "try { !eval('(class { #private })') }catch{ true }"
};
const utils = utilsPreferred ?? utilsFallback;

or a less ImportCall-friendly equivalent that promotes out the microsyntax.

Semantically, conditions would be subject to static evaluation... a syntactically invalid condition would abort loading the module graph, and a false condition would assign undefined (or perhaps a sentinel symbol) to every identifier from the import rather than try processing it.

EDIT: Even better (IMO) than static evaluation would be dynamic evaluation in an isolated realm that lacks top-level await and other asynchrony, allowing for more intuitive feature detection (e.g., condition: "RegExp('\\p{RGI_Emoji}', 'u')" rather than condition: "es.unicodeSequenceProperties").

@littledan
Member Author

I don't want to jump down your throat over this example, but: Oof, querying on ES version is a whole other can of worms. The thing is, implementations don't really tend to ship a whole ES version in lockstep, but rather go feature by feature, so I am not sure how meaningful the query would be if it refers to spec versions. Similar attempts in the web platform found implementations lying, rendering the feature useless.

@devsnek
Member

devsnek commented Nov 13, 2019

Even querying on a specific feature can end up hitting implementation bugs or missing the granularity required for the checks you want to perform.

@gibson042

I had a suspicion that bikeshedding this here might be a bad idea... ☹️

But as stated above, the primary need for this is syntax... ECMAScript source modules can dynamically test for things like Array.prototype.flat or Object.fromEntries, but they won't load at all on old hosts if they use e.g. import.meta or private fields.
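
A small illustration of that asymmetry, assuming a module that wants to degrade gracefully:

// Runtime capability: can be feature-detected after the module is parsed and loaded.
const fromEntries = Object.fromEntries
  ? Object.fromEntries
  : (pairs) => { const o = {}; for (const [k, v] of pairs) o[k] = v; return o; };

// Syntax: an old parser rejects the entire module before any guard can run.
class Counter { #count = 0; } // private fields
const here = import.meta.url; // import.meta (only valid inside a module)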

@devsnek
Member

devsnek commented Nov 13, 2019

I don't think that idea really works here though. The syntax a module uses is encapsulated to that module, not the consumer.

@gibson042

It becomes relevant for the consumer as soon as they try importing it. Cooperation between the importing and exporting sides would obviously be required, but at least it becomes possible to use new formats and new syntax rather than being bound by the lowest common denominator.

@devsnek
Member

devsnek commented Nov 14, 2019

@gibson042 If I can choose between three scripts with identical behaviour, and the only difference is the features they use internally, why wouldn't I just use whatever has the broadest support instead of setting up a matrix of loading conditions?

@gibson042

It's possible that you personally still write in or transpile to ES5, but authors all over the Internet are constantly clamoring for new syntactic sugar and the ability to use it, with varying claims of the resulting performance and/or convenience. I'll refer you to whatwg/html#4432 for a particularly relevant example.

@littledan
Member Author

I'd suggest that we continue the discussion about the possible use cases for non-type attributes in #8 . Let's limit this thread to discussing handling of unrecognized attributes, and just take it for granted that this sort of situation will come up if we ever add another attribute.

@gibson042

To distill my earlier comments to just that topic, I'm advocating for unrecognized types to abort loading the module graph. When an import comes with an explicit type signal, the host must either honor that by processing it correctly, or fail altogether.

@jfparadis

I like the idea that "unrecognized types abort loading"; however, if a new type comes along, it gets confusing:

  • on a JS engine without module-attributes: load proceeds with an unspecified result
  • on a JS engine with module-attributes, without new type: load is aborted
  • on a JS engine with module-attributes, with new type: load succeeds

It's likely that this will be creatively used by developers to achieve what @gibson042 mentioned above, although with a performance penalty:

import { foo as fooPreferred } from "./newHotness.js" with type: 'fooType';
import { foo as fooFallback } from "./oldAndBusted.js"; // no type
const foo = fooPreferred ?? fooFallback;

@littledan
Member Author

@jfparadis Hmm, I would be surprised if we gave the semantics that, if the type is unrecognized, you get a module out that somehow has lots of named exports, all of which are undefined or null. Generally, I'd imagine module graph loading to fail if one of the loads is unsuccessful. (As you point out, this is not a great loading strategy because you always load the fallback.) Top-level await may be our friend here!
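
For illustration, the kind of pattern top-level await enables with dynamic import, so the fallback is only fetched when the preferred load fails (the "fooType" value is hypothetical, and the options bag follows the assertion form used later in this thread):

let foo;
try {
  // Rejects on hosts that refuse the unrecognized type, without touching the fallback.
  ({ foo } = await import("./newHotness.js", { assert: { type: "fooType" } }));
} catch {
  ({ foo } = await import("./oldAndBusted.js"));
}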

@dandclark
Collaborator

The web is going to reject unknown type values (see the import assertion integration PR). On the web this seems clearly necessary because otherwise browsers without support for a new module type will be vulnerable to the MIME type security issue, where an unexpected JS MIME type could cause unexpected script execution:

// Consider the case where third-party-site responds with a JS payload with MIME type application/javascript.
// If unknown 'type' values were ignored, this would then trigger unexpected script execution on browsers that
// don't yet support JSON modules.
import json from "https://third-party-site/info.json" assert { type: "json" };

@MylesBorins, do you know what Node is doing for unsupported type values? If there is some convergence among hosts here then I think we should consider enforcing this behavior with something like #111. It would be nice to have interoperability on this.

@dandclark
Collaborator

During the April TC39 meeting (notes) we discussed #111, an attempt at specifying interoperable behavior that hosts must follow when unrecognized types are asserted. There were concerns about how this would affect type virtualization; that it would cause problems with custom loaders, user-specified types, and transforms between module types.

Last week we discussed this further during an SES call (recording here). My takeaway was that enforcing a particular behavior for unrecognized type assertions is probably not something we should pursue further.

The first reason for this is that it's hard to see how something like this could be specified without limiting the capabilities of some hosts.

On the one hand, a restriction like this seems natural for the web, since the web has a static list of module types that it supports, and it doesn't allow any author hooks that change loader behavior or add support for more types. So any type assertion not recognized in the HTML spec can always be rejected.

On the other hand, this restriction is problematic for hosts like Node. They want an extensible set of type assertions so that bundlers and virtualized realms and compartments can accept new types of modules. So Node doesn't have a static list of "valid" types that can be allow-listed. The set of known types and how those types are handled might be changed arbitrarily by author JavaScript, and there might be author-defined transforms between module types. I don't see a good way to say what an "unrecognized" type is in such an environment without limiting these capabilities.

A second point that came up during the meeting is that there is some doubt about the motivation for restricting host behavior for unrecognized types. The goal is to maximize portability between environments. But code that is importing non-universally supported types is not likely to be very portable anyway. If I have code running in some virtualization environment that depends on custom module types defined in that environment, that code is likely to fall over when I try to run it in some other environment.

The remaining case I can try to make for a restriction of behavior here is that it would be good if things like the following would be guaranteed to fail across all hosts, rather than succeed on some but fail on others:

import foo from "./bar" assert { type: "TODO figure out the right thing to assert here" };
import config from "./config.json" assert { type: "jso" }; // oops, typo in the assertion value

But given the difficulties noted above I'm skeptical that it's worth pursuing.

Due to these considerations, during the July meeting I plan to recommend that we not pursue #111 or other approaches to limit host behavior for unrecognized type assertions.

CC some folks who participated in that discussion: @bmeck, @kriskowal, @caridy, also CC @annevk who I know was interested in having something like #111.

@annevk
Member

annevk commented Jun 25, 2021

If certain host environments make the type essentially a free-for-all field, not only will that code not be portable, it might end up with a different meaning across hosts. E.g., some bundler claims "html" for something and later the HTML Standard uses "html" for something with different semantics.

That seems rather problematic for a language feature.

@bmeck
Member

bmeck commented Jun 25, 2021

@annevk the kind of "leave it to the host" behavior is similar in nature to how string specifiers are not given a mandatory resolution and loading algorithm. E.g., import "react" can mean different resolution and loading across hosts as well. The language only specifies the hook for a host to attach behavior to the string "react" in that case, and I think for assertions the result will likely be similar unless a mandatory upstreaming of all pertinent types is created. For that matter, I don't think CSS/HTML modules are appropriate in the JS specification given our current division of concerns.

@annevk
Member

annevk commented Jun 25, 2021

I'm not saying all hosts have to support the same types, but the unbounded behavior suggested above for some hosts will make it harder to share types and code.

@bmeck
Member

bmeck commented Jun 25, 2021

@annevk I'm not sure what you are suggesting be done about it. The unbounded behavior is intentional, to allow programs to provide their own specialized types. I don't think namespacing to mark user-provided values, like x- headers or CSS prefixes, worked out great, and I don't think that is being suggested right now. We could ponder something to discern whether a type came from a specific specification (web vs node vs ...), but it seems like it would have the same issues.

@codehag

codehag commented Jul 7, 2021

For types which are restricted to an environment, such as Node, would it make sense to declare that environment as another key on the assertion, for example env: "node", and enforce rejecting assertions that are not part of that host? This would also communicate to users why their code works in one environment and not in another.

I think the direction this is going is toward possibly a block list rather than an allow list (based on the slides). That is quite a bit weaker in terms of guarantee and I am wondering if we can strengthen it a bit while also helping users understand why something might not work across hosts.

@guybedford

Personally, for Node.js I would want to encourage using conditional exports over an environment-specific assertion. Code that can work is preferable to code that does not work! And environment assertions would be naturally incompatible with conditional exports / conditional internal imports (https://nodejs.org/dist/latest-v16.x/docs/api/packages.html#packages_conditional_exports). Even for things like import { readFile } from 'fs', we can use conditional exports to shim this layer between environments, so import { readFile } from 'fs' assert { env: 'node' } would actually inhibit better virtualization.
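
For reference, a minimal sketch of the conditional exports shape being referred to, in a package's package.json (the paths are illustrative):

{
  "exports": {
    ".": {
      "node": "./index-node.js",
      "default": "./index-browser.js"
    }
  }
}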

@bmeck
Member

bmeck commented Jul 7, 2021

@codehag Unfortunately, "types that are limited to an environment" would add boilerplate to web- and Node-specific types. Declaring the environment that provides the implementation would prevent cross-environment implementation compatibility. Even with a more usable directive that just says the type is provided by the host rather than being a custom type, it would prevent loaders from polyfilling types, which is also detrimental. Also, to be clear, the types provided by Node are not a fixed set, since they can be expanded upon; users might load implementations for types like YAML, CoffeeScript, etc., which Node does not intend to prevent. It would be good to know what kind of problem we are still seeking to solve. On the call we had a few weeks ago I questioned the premise that this constraint solves a problem in practice.

@Jack-Works
Member

I think we should have a syntactic way to distinguish between safe-to-ignore types and unsafe-to-ignore types.

e.g.

import {} from './x' asserts { type: "yaml" }
import {} from './x' enhances { immutable: true } // safe-to-ignore

@ljharb
Member

ljharb commented Jul 11, 2021

type yaml would be safe to ignore too, in an implementation that didn't require that assertion to load yaml modules.
