Support for emitting modules into a subdirectory #4742

shadaj · 2022-10-17T05:57:26Z

Was: Linker crashes when emitting modules into a subdirectory

If exporting a method/value under a module name that contains a /, the linker crashes because it attempts to pass the entire name directly to createTempFile. Having a slash in the module name is useful when integrating with frameworks such as Next.js, which use directories to map components to application routes.

For example, running fastLinkJS with a project containing the following export:

object Next {
  @JSExportTopLevel(name = "component", moduleID = "foo/bar")
  def component(): js.Any = ...
}

results in the following crash

[error] java.lang.IllegalArgumentException: Invalid prefix or suffix
[error]         at java.base/java.nio.file.TempFileHelper.generatePath(TempFileHelper.java:60)
[error]         at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:125)
[error]         at java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:159)
[error]         at java.base/java.nio.file.Files.createTempFile(Files.java:878)
[error]         at org.scalajs.linker.PathOutputDirectory$Impl.$anonfun$writeAtomic$2(PathOutputDirectory.scala:71)
[error]         ...

The text was updated successfully, but these errors were encountered:

sjrd · 2022-10-17T14:47:21Z

Uh oh, that was never meant to be allowed. The infrastructure of OutputDirectory is only meant to represent a single, flat directory. We should disallow this at compile time instead of crashing at link time.

Having a slash in the module name is useful when integrating with frameworks such as Next.js, which use directories to map components to application routes.

I sympathize, but IMO this falls under the category of "frameworks that want a specific shape for the output code rather than semantics". We don't guarantee that we will support every possible framework's idea of how the target .js files should "look like".

Have you considered defining the structure separately, as a hierarchy of .js files that import the relevant files of the Scala.js output?

shadaj · 2022-10-17T18:15:31Z

Yeah, so for now I've just been maintaining a bunch of glue JS files that just import and re-export the Scala.js modules. Because in Next.js' case they are so boilerplate-y though (https://github.com/shadaj/slinky/blob/main/docs/pages/docs/%5Bid%5D.js), was thinking it might not be too hard to support this use case in Scala.js since we don't have any restrictions on how the code inside the module is laid out, just where it is placed.

But also see your point that this does introduce some more responsibilities that Scala.js may not want to handle.

gzm0 · 2022-10-19T09:35:29Z

My 2cts:

My original idea was to not restrict module IDs in the IR specification, very similar to how ESModules do not define what the module string means.
Per consequence, how modules are mapped to files is specific to the linker implementation.
In the current linker implementation the mapping from module ID to filename is defined via OutputPatterns (and arguably very crude).

I could see the value of extending OutputPatterns to something more general. Challenges:

OutputPatterns needs to be serializable, so it's the usual flexibility / complexity tradeoff (we can't just make it a function).
The full output universe needs to be understood, otherwise the incremental linker cannot remove unnecessary files anymore.

keynmol · 2022-10-19T13:29:40Z

@armanbilge asked me to reference my own usecase for this - a while back I attempted to do the same for Cloudflare Pages. Quoting my Discord message:

Cloudflare pages can work it's magic on files like /admin/api/get_[[[id]].js but iirc I get errors trying to use / in the name of the module when using ES modules

Current thinking is to output filenames in admin__api__get_[[id]] format and externally rewrite to paths - perhaps there's a better way?

gzm0 · 2022-10-20T06:04:41Z

There's one more challenge I forgot: How to synthesize internal module names. If they need to go into special directories, things might get more complicated (OTOH, a simple prefix config might be enough).

gzm0 · 2022-10-23T19:07:50Z

@shadaj @keynmol in your use cases, how would you expect internal modules (modules with shared code between the modules you declare) to be named / structured?

shadaj · 2022-10-23T19:42:32Z

For Next.js, we would probably want to dump all internal modules into a components folder, but don't have any naming restrictions on the actual modules.

keynmol · 2022-10-28T18:29:20Z

For cloudflare, I believe only modules that explicitly export a function with a certain name (e.g. onRequest or onRequestGet) are considered by CF for routing.

Therefore other files and their names (intermediate modules) are likely to be ignored (*), so I would personally not expect any structure from them - they can all be dumped top level for all I care.

If I have time, I will test the (*) assumption to make sure it works as I described.

gzm0 · 2022-12-17T14:12:57Z

For cloudflare, I believe only modules that explicitly export a function with a certain name (e.g. onRequest or onRequestGet) are considered by CF for routing.

Therefore other files and their names (intermediate modules) are likely to be ignored (*), so I would personally not expect any structure from them - they can all be dumped top level for all I care.

FWIW, this is a problem: We cannot guarantee that an internal module does not export a function with a name that is special to cloudflare.

armanbilge · 2023-04-21T17:22:53Z

Just stumbled into this again. I understand that ESM-style npm packages are typically organized into subdirectories e.g.

import '@shoelace-style/shoelace/dist/components/button/button.js';

So it would be nice to be able to make such things with Scala.js as well.

gzm0 · 2023-04-22T14:39:42Z

Maybe, but do we have examples of where this is actually useful? For shoelace:

import '@shoelace-style/shoelace/dist/components/button/button.js';
        --------------- -------- --------------- ------ ------
        scope           package  build artifacts module redundant

Even what the doc lists as utilities (e.g. include) is under the components "namespace".

micheal-hill · 2023-07-01T06:51:45Z

I'd like to add another voice for supporting structured directories. It's pretty tedious to manually setup the file structure for Next.js or Remix by hand.

Is this something where a PR would be considered? Alternatively, is there a mechanism that would allow extending the plugin to have this functionality within build.sbt?

For cloudflare, I believe only modules that explicitly export a function with a certain name (e.g. onRequest or onRequestGet) are considered by CF for routing.
Therefore other files and their names (intermediate modules) are likely to be ignored (*), so I would personally not expect any structure from them - they can all be dumped top level for all I care.

FWIW, this is a problem: We cannot guarantee that an internal module does not export a function with a name that is special to cloudflare.

I don't think it's important to make this guarantee for CloudFlare or any other tooling. Assuming that generated modules can have some user-defined directory structure + internal modules live somewhere else, I would expect it to be easy to post-process this into whatever structure the user needs/wants.

gzm0 · 2023-07-02T06:47:25Z

Is this something where a PR would be considered?

IMO, yes, for sure. However, I'd urge you to first propose and discuss the exact changes you'd like to do (in this issue). Especially for internal modules, it is not clear to me how the interface could look like.

Alternatively, is there a mechanism that would allow extending the plugin to have this functionality within build.sbt?

You could move the modules around after they are built by Scala.js. But that would also involve adjusting the import paths (since they are relative). So, yes, a mechanism could be built, but to the best of my knowledge, there isn't any at the moment.

gzm0 · 2023-07-03T07:02:52Z

A couple of thoughts I had:

The following might get us quite far:

Interpret / in module IDs as paths
Add a setting internalModuleIDPrefix
(optional) specify that generated internal module IDs never contain a slash (IIUC already the case)

Unclear with this approach is how to deal with module ID equality: Right now, the following are considered different module IDs: a/b, a//b, a/./b, a/c/../b`.

sjrd · 2023-07-03T09:05:52Z

Unclear with this approach is how to deal with module ID equality: Right now, the following are considered different module IDs: a/b, a//b, a/./b, a/c/../b`.

Given that any slash today results in a crash, and that a single . or .. likely does as well, I think we can forbid leading, trailing and double slashes, as well as path elements that are . or ...

micheal-hill · 2023-07-13T01:07:47Z

Unclear with this approach is how to deal with module ID equality: Right now, the following are considered different module IDs: a/b, a//b, a/./b, a/c/../b`.

Given that any slash today results in a crash, and that a single . or .. likely does as well, I think we can forbid leading, trailing and double slashes, as well as path elements that are . or ...

Wouldn't it be simpler to just use something like java.nio.Path.normalize instead of trying to be clever about it?

sjrd · 2023-07-13T04:17:39Z

Our linker cross-compiles to JS. In Scala.js we don't have access to Path, so we cannot do that.

gzm0 · 2023-07-15T11:45:57Z

Also FWIW: Paths are platform specific (at least if we're using the default FileSystem). So we need to be careful: We do not want the linker's output to depend on the platform it is running on.

micheal-hill · 2023-07-17T21:51:02Z

Apologies that I don't have a better understanding of how the plugin works internally - I've not had a lot of time to do some digging just yet. I've also never done anything with cross compilation, so maybe my understanding is all garbage!

Our linker cross-compiles to JS. In Scala.js we don't have access to Path, so we cannot do that.

At some point it needs to execute platform-specific code though, no? ie. there is some write(path: String, contents: ???) function somewhere along the way, which has a different implementation in JS to JVM? Given this assumption:

Could modify this function to create intermediate directories if required
Create a new normalize function using the same convention; ie. dispatching to separate implementations based on the platform being executed.

sjrd · 2023-07-17T22:47:41Z

There are two notions of "platform" at play here. Usually we mean JVM v JS, indeed. But here we're talking about the OS and its filesystem. We want the linker to produce identical outputs regardless of the OS and of the filesystem(s) being used. Therefore, we cannot rely on normalize (whether on the JVM or JS).

micheal-hill · 2023-07-25T01:19:19Z

I did some experimenting here; the linker already has runtime-specific (ie. Node, JVM) implementations and appears to require little change to (see the sketch here; note that this is pretty scrappy code that I'll clean up and test if people are happy with the approach).

There are two notions of "platform" at play here. Usually we mean JVM v JS, indeed. But here we're talking about the OS and its filesystem. We want the linker to produce identical outputs regardless of the OS and of the filesystem(s) being used. Therefore, we cannot rely on normalize (whether on the JVM or JS).

I'm not sure that there's anything that we need to do to support various OS here. The underlying runtime can deal with this for us.

gzm0 · 2023-07-27T07:17:24Z

I'm not sure that there's anything that we need to do to support various OS here. The underlying runtime can deal with this for us.

As far as actually writing the modules to the filesystem is concerned, this is correct. However, we also need to consider the usage sites of the modules (i.e. the imports):

When the linker splits off some code into an internal module, it will need to import that code. Currently, it will emit:

import * as <internal name> from "./<moduleID>.<ext>"

Where the specifics are determined by the OutputPatterns.

So far, the assumption was that moduleID does not contain any special characters for the underlying filesystem (i.e. no / or \). If we start supporting subdirectories, we must change/break this assumption.

At this point, we need to take care, that:

The code we generate is the same on all platforms/operating systems (e.g. it is not OK if you get back slashes in imports when you build on windows but you get forward slashes when you build on *nix).
The code we generate works on all JS platforms. Since (at least last time I checked), what a module identifier means is not specified in the ESM spec, this is necessarily trade-offy. Given that we already assume that ./ denotes a relative import, it is IMO OK to assume further slashes mean subdirectories.

You might correctly observe at this point, that (apart from the changes you're already working on in your branch), there is nothing to do for the success path. However, we need to make sure that we are not generating broken code (e.g. `import ... from './a/../b\foo.js'). This is what @sjrd and I have been discussing about.

HTH

micheal-hill · 2023-07-28T00:32:33Z

When the linker splits off some code into an internal module, it will need to import that code. Currently, it will emit:

This makes sense. It seems that I didn't do enough testing; I can replicate this once I start writing non-trivial code. Are you able to point to where this code is emitted from the linker?

it is not OK if you get back slashes in imports when you build on windows but you get forward slashes when you build on *nix

I don't understand why this is important, but a simple solution here is to use only / as the path separator. If a path contains only /, Windows will understand it correctly (I did an experiment to test this, some sample code here).

But more importantly I think upholding this assumption will end up with broken code if any form of directory nesting is allowed, regardless of the separator used. For example:

Assuming that / is used as the path separator character and have some modules emitted to <build root>/util/foo.

// If I'm working on a *nix system, I'm likely to have an export that looks like
@JSExportTopLevel("baz", moduleID = "app/bar")
// This would import modules from `foo` using `import <names> from "../util/foo"`; which would work correctly

// If I'm working on a Windows system, I'm likely to have an export that looks like
@JSExportTopLevel("baz", moduleID = "app\\bar")
// This would import modules from `foo` using `import <names> from "../util\\foo"`; which I believe _would break_ because of the mixed characters - Though I have not tested this assumption.

A mirror of this situation occurs if \ is used as the path separator: code generated on Windows will be fine, but anything generated on *nix will die.

Note that in both of these cases, the break occurs because the user followed the conventions of their OS but scala-js uses only a single character for the separator. One might argue that the user could be educated to follow an alternative pattern (eg. always use *nix paths, even on Windows), but that is a situation that I expect would be ripe for confusion.

So if this is a hard requirement; I don't think that there's any simple way to both support any OS and support emitting into subdirectories.

sjrd · 2023-07-28T06:25:59Z

We definitely have to standardize on / on all systems, even Windows. We will need to forbid any other special characters of file systems, notably \ and :. The rationale for Windows users is that those are URI relative paths. Not file system ones.

micheal-hill · 2023-07-29T00:49:24Z

We definitely have to standardize on / on all systems, even Windows. We will need to forbid any other special characters of file systems, notably \ and :. The rationale for Windows users is that those are URI relative paths. Not file system ones.

Just to be clear; you're saying that the user must always use something that looks like @JSExportTopLevel("baz", moduleID = "app/bar"), right?
This seems reasonable to me. I'm worried that it's a little bit awkward for Windows users, but it looks like it's the easiest to implement. Perhaps also improves consistency for anyone collaborating between Windows/*nix 🤷

Assuming you're happy with the direction I've started looking at earlier (the link here), I can start figuring out how to get the linker to emit import statements with the correct relative paths.
I believe that org.scalajs.linker.standard.ModuleSet.Module is at the center of this, but I haven't figured out how the paths are being populated yet - it would be much appreciated if anyone has any pointers

gzm0 · 2023-07-30T09:15:08Z

Just to be clear; you're saying that the user must always use something that looks like @JSExportTopLevel("baz", moduleID = "app/bar"), right?

Yes.

Perhaps also improves consistency for anyone collaborating between Windows/*nix

Yes. In fact, this is a requirement: The same input with the same options must produce the same output on all systems/OSs.

Assuming you're happy with the direction I've started looking at earlier (the link here), I can start figuring out how to get the linker to emit import statements with the correct relative paths.

That looks like a good start to me.

I believe that org.scalajs.linker.standard.ModuleSet.Module is at the center of this, but I haven't figured out how the paths are being populated yet - it would be much appreciated if anyone has any pointers

The linker frontend does not deal with paths, but with ModuleIDs.: an abstract representation of the identity of the module
- For external modules, this is what the user provides in JSImport.
- For public modules, this is what the user provides in JSExportTopLevel
- For internal modules, this is generated automatically
- Definitions of types of modules
The linker backend (aka Emitter) then translates these into filenames to put them after from.
Default patterns

HTH

gzm0 · 2023-10-12T08:10:55Z

Nikita's comment on Discord just made me realize that when building this, we need to be extremely careful to keep relative path imports meaningful.

micheal-hill · 2023-10-20T00:48:12Z

When I have some time I'll aim to come back to this; however for now I've had some life/work stuff get in the way

shadaj mentioned this issue Oct 17, 2022

Add support for Next.JS shadaj/slinky#456

Open

sjrd added the enhancement Feature request (that does not concern language semantics, see "language") label Dec 11, 2022

sjrd changed the title ~~Linker crashes when emitting modules into a subdirectory~~ Support for emitting modules into a subdirectory Jan 23, 2023

armanbilge mentioned this issue Apr 21, 2023

Setup published npm package gemini-hlsw/lucuma-core#708

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for emitting modules into a subdirectory #4742

Support for emitting modules into a subdirectory #4742

shadaj commented Oct 17, 2022 •

edited by sjrd

sjrd commented Oct 17, 2022

shadaj commented Oct 17, 2022

gzm0 commented Oct 19, 2022

keynmol commented Oct 19, 2022

gzm0 commented Oct 20, 2022

gzm0 commented Oct 23, 2022

shadaj commented Oct 23, 2022

keynmol commented Oct 28, 2022

gzm0 commented Dec 17, 2022

armanbilge commented Apr 21, 2023

gzm0 commented Apr 22, 2023

micheal-hill commented Jul 1, 2023

gzm0 commented Jul 2, 2023

gzm0 commented Jul 3, 2023

sjrd commented Jul 3, 2023

micheal-hill commented Jul 13, 2023

sjrd commented Jul 13, 2023

gzm0 commented Jul 15, 2023

micheal-hill commented Jul 17, 2023 •

edited

sjrd commented Jul 17, 2023

micheal-hill commented Jul 25, 2023 •

edited

gzm0 commented Jul 27, 2023

micheal-hill commented Jul 28, 2023

sjrd commented Jul 28, 2023

micheal-hill commented Jul 29, 2023

gzm0 commented Jul 30, 2023

gzm0 commented Oct 12, 2023

micheal-hill commented Oct 20, 2023

Support for emitting modules into a subdirectory #4742

Support for emitting modules into a subdirectory #4742

Comments

shadaj commented Oct 17, 2022 • edited by sjrd

sjrd commented Oct 17, 2022

shadaj commented Oct 17, 2022

gzm0 commented Oct 19, 2022

keynmol commented Oct 19, 2022

gzm0 commented Oct 20, 2022

gzm0 commented Oct 23, 2022

shadaj commented Oct 23, 2022

keynmol commented Oct 28, 2022

gzm0 commented Dec 17, 2022

armanbilge commented Apr 21, 2023

gzm0 commented Apr 22, 2023

micheal-hill commented Jul 1, 2023

gzm0 commented Jul 2, 2023

gzm0 commented Jul 3, 2023

sjrd commented Jul 3, 2023

micheal-hill commented Jul 13, 2023

sjrd commented Jul 13, 2023

gzm0 commented Jul 15, 2023

micheal-hill commented Jul 17, 2023 • edited

sjrd commented Jul 17, 2023

micheal-hill commented Jul 25, 2023 • edited

gzm0 commented Jul 27, 2023

micheal-hill commented Jul 28, 2023

sjrd commented Jul 28, 2023

micheal-hill commented Jul 29, 2023

gzm0 commented Jul 30, 2023

gzm0 commented Oct 12, 2023

micheal-hill commented Oct 20, 2023

shadaj commented Oct 17, 2022 •

edited by sjrd

micheal-hill commented Jul 17, 2023 •

edited

micheal-hill commented Jul 25, 2023 •

edited