Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi module compilation for elm #8076

Merged

Conversation

ChristophP
Copy link
Contributor

@ChristophP ChristophP commented May 9, 2022

Fixes #2508
Fixes #8263

↪️ Pull Request

Add multi module compilation for Elm](58174b5)

The Elm compiler can compile multiple entrypoints into one bundle, which
helps keep the asset size small since they all share the same runtime.

The approach used here has been inspired by the elm plugin for vite: https://github.com/hmsk/vite-plugin-elm#combine-multiple-main-files-experimental-from-270-beta1

💻 Examples

import { Elm } from
'./Main.elm?with=./AnotherModule.elm,./YetAnotherModule.elm'

Elm.Main.init();
Elm.AnotherModule.init();
Elm.YetAnotherModule.init();

🚨 Test instructions

Use a js file like this:

// src/index.js
import { Elm } from "../../../Main.elm?with=./ModuleB.elm";

document.body.innerHTML = '<div id="elm-1"></div><div id="elm-2"></div>'

Elm.Main.init({ node: document.getElementById('elm-1') });
Elm.ModuleB.init({ node: document.getElementById('elm-2') });
module Main exposing (main)

import Browser
import Html exposing (Html, button, div, text)
import Html.Events exposing (onClick)


type Msg
    = Increment
    | Decrement


update msg model =
    case msg of
        Increment ->
           ({ model | count = model.count + 1 }, Cmd.none)

        Decrement ->
           ({ model | count = model.count - 1 }, Cmd.none)


view model =
    div []
        [ button [ onClick Increment ] [ text "+1" ]
        , div [] [ text <| String.fromInt model.count ]
        , button [ onClick Decrement ] [ text "-1" ]
        ]


main =
    Browser.sandbox
        { init = \_ -> ({ count = 0 }, Cmd.none)
        , view = view
        , update = update
        , subscriptions = always Sub.none
        }

and another Elm module

module ModuleB exposing (main)

import Browser
import Html exposing (Html, button, div, text)
import Html.Events exposing (onClick)


type Msg
    = Increment
    | Decrement


update msg model =
    case msg of
        Increment ->
           ({ model | count = model.count + 1 }, Cmd.none)

        Decrement ->
           ({ model | count = model.count - 1 }, Cmd.none)


view model =
    div []
        [ button [ onClick Increment ] [ text "+1" ]
        , div [] [ text <| String.fromInt model.count ]
        , button [ onClick Decrement ] [ text "-1" ]
        ]


main =
    Browser.sandbox
        { init = \_ -> ({ count = 0 }, Cmd.none)
        , view = view
        , update = update
        , subscriptions = always Sub.none
        }

ChristophP added 4 commits May 8, 2022 21:53
The Elm compiler can compile multiple entrypoints into one bundle, which
helps keep the asset size small since they all share the same runtime.

Use like this:
import { Elm } from
'./Main.elm?with=./AnotherModule.elm,./YetAnotherModule.elm'

Elm.Main.init();
Elm.AnotherModule.init();
Elm.YetAnotherModule.init();
@devongovett
Copy link
Member

I'm not very familiar with Elm, but what's the difference between doing this vs making a new entry file and importing things there instead? The syntax here of importing files with a query string seems kinda strange to me.

@ChristophP
Copy link
Contributor Author

@devongovett Let me give you a bit more background on this. The difference is basically creating smaller bundles.

Multiple compile steps vs a single compile step

import { Elm } from "src/Main.elm";
import { Elm as Elm2 } from "src/Main2.elm";

Elm.Main.init();
Elm2.Main2.init();

and

import { Elm } from "src/Main.elm?with=src/Main2.elm";

Elm.Main.init();
Elm.Main2.init();

is that

  • the former creates two JS assets in two compilation steps, effectively duplicating the Elm runtime + any Elm code that may be shared between the entrypoints and their dependencies.
  • the latter creates one JS asset in a single compilation step, combining multiple Elm entrypoints into the same output file. This way, the Elm runtime + any shared code gets shared in the compiled output between all entrypoints.

When using the elm compiler on the command line those cases be equivalent to this

elm make src/Main.elm --output=elm.js
elm make src/Main2.elm --output=elm2.js
# vs
elm make src/Main.elm src/Main2.elm --output=elm.js

How common is this?

I wouldn't say this is super common, but if anyone wants to do this they couldn't use Elm with parcel right now.

Query string syntax

I agree that it seems a bit odd. I was wondering how to do this and found that this is how it's done with Vite. https://github.com/hmsk/vite-plugin-elm#combine-multiple-main-files-experimental-from-270-beta1
If there's a better way of specifying that I'd be happy to change it.

@ChristophP
Copy link
Contributor Author

For reference, webpack handles it like this.
https://github.com/elm-community/elm-webpack-loader#files-default---path-to-required-file
It provides some extra config, but one of the things I love about parcel is that you specify most things where you import it.

@devongovett
Copy link
Member

Sure. I guess what I was asking is if you could make a new Elm file that imports all of your entry points and re-exports them somehow, and then import that from JS instead?

@teppix
Copy link

teppix commented May 17, 2022

if you could make a new Elm file that imports all of your entry points and re-exports them somehow

Elm-applications are imported by their module names, and for a module to be considered an application, it has to contain a function named main. That means that you can't really define more than one application in the same module.

@ChristophP
Copy link
Contributor Author

Yes, what @teppix said. The Elm compiler doesn't allow more than one main per entry point.

@devongovett
Copy link
Member

So you can't do something like this?

module Main exposing (ModuleA, ModuleB)

import ModuleA
import ModuleB

and then from JS:

import { Elm } from './Main.elm';

Elm.ModuleA.init();
Elm.ModuleB.init();

I have no idea if this is valid syntax, but that's what I meant. If possible, that seems nicer than doing it in the query string.

@ChristophP
Copy link
Contributor Author

ChristophP commented May 17, 2022

Unfortunately not. In order to have an .init() function on the JS side you need to define a main function on the Elm side. Currently Elm only allows one main function per entry point module and it has to be in the topmost module (not imported by any other module).

@devongovett
Copy link
Member

Got it. Maybe that would be a good issue to open in Elm then. In the meantime, I guess this is fine. Would you mind adding a test? We'll need to add this to the documentation as well.

@ChristophP
Copy link
Contributor Author

@devongovett

Tests

sure, I'll add a test for this.

Docs

I can also prepare some documentation if you'd like. Would that be another PR in this repo over here https://github.com/parcel-bundler/website/blob/v2/src/languages/elm.md ?

Syntax / API design

I've been thinking a bit more on the not-so-perfect syntactical way of using this.

import { Elm } from './Main.elm?with=./AnotherModule.elm,./YetAnotherModule.elm'

When there are more than one extra modules I separate them with a , withing the value of the query param with. This is kind of noisy and hard to read.

I thought maybe this could be improved a little by specifying the with param once for each module, like this.

import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm'

and then get those values with asset.query.getAll('with').
The upside is it seem a bit more readable and more natural with how URL params are used normally, the downside is that instead of one extra character per module (,) its 6 (&with=).

Do you have any thoughts on this? Improvement or just as bad?

@devongovett
Copy link
Member

Better I think!

@ChristophP ChristophP force-pushed the add-multi-module-compilation-for-elm branch from 3c9eb60 to 4e8eec7 Compare May 20, 2022 18:40
@ChristophP
Copy link
Contributor Author

I made the changes regarding multiple with params as well as adding a test. I was still planning on improving error handling when one of the extra sources isn't there but haven't gotten around to it yet. Will hopefully do it this week.

@ChristophP
Copy link
Contributor Author

ChristophP commented May 26, 2022

I have a question that I just thought about:

What would happen if someone imported this twice?
Assume Elm code is imported with parcel in two files like this.

// in file1.js
import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm';
// in file2
import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm';

I would assume the asset is compiled once and imported in both files, right?
Would the same be true if someone did the same but changed the order of the query params?

// in file1.js
import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm';
// in file2
import { Elm } from './Main.elm?with=./YetAnotherModule.elm&with=./AnotherModule.elm';

How does parcel treat the specifiers where the only difference is the order of the query params, i.e. when the name is the same, when the queries are the same, when the exact string is the same? Would parcel run the transformer twice and include the compiled JS twice in the resulting bundle?

@mischnic
Copy link
Member

Here's where the query comes from in the resolver:

query: url.search ? new URLSearchParams(url.search) : undefined,

and it's then turned into a string to store it in the cache.

query: result.query?.toString(),

It looks like this won't normalize the order. So there will be two assets.

And theoretically you could also write a transformer where this order matters, so there's not much we can do here.

@ChristophP
Copy link
Contributor Author

ChristophP commented May 27, 2022

@mischnic thanks for the clarification.

I see how that could be difficult to change, but also surprising at the same time. (i.e. when importing ./image.jpeg?width=100&height=200 vs ./image.jpeg?height=200&width=100 would be different assets).

I wonder if the current solution with the query string is the way to go then, because I could see that leading to unintentionally increasing the bundle size significantly without warning or specifying the path slightly diffently (i.e. no leading ./).

So I'd like to propose a different solution for mutli-module compilation for Elm: In the package.json, one could add something like this:

  "@parcel/transformer-elm": {
    // this means: whenever Main.elm is imported, also compile AnotherModule.elm and YetAnotherModule.elm with it
    "extraSources": {
      "./src/Main.elm": ["./src/AnotherModule.elm", "./src/YetAnotherModule.elm"]
    }
  }

This is more of a static config but it has a few benefits IMO

  1. No pitfalls: It avoids the unintentional asset duplication problem
  2. Cleaner imports: When importing the compiled Elm code multiple times you only have to import { Elm } from "./src/Main.elm"; instead of import { Elm } from './Main.elm?with=./AnotherModule.elm&with=./YetAnotherModule.elm'; every time. Users have a single place to look at which Elm assets contain which modules.
  3. It scales better: Having 10+ modules in an array in the package.json works, while having 10 extra files in the query string becomes unreadable.

Any thoughts on this new approach? If you're in favor of this different approach, would you prefer me closing this PR and opening a new one or add commits to this one?

@ChristophP
Copy link
Contributor Author

ChristophP commented May 29, 2022

I went ahead and implemented the approach described above and it works. nicely e716224
Please let me know if you're okay with the approach, then I can add tests and documentation.

@mischnic
Copy link
Member

The biggest difference I see between query param and the central config: is there a usecase for importing A.elm?with=B.elm and in some other file importing A.elm?with=C.elm (which wouldn't be possible with the config the with parameter is essentially fixed for all imports)?

Apart from the other advantages you've listed, if the only concern is deduplicating two different but equivalent with lists, then this would work:

transform({asset}) {
    if(isQueryNormalized(asset.query)) {
        return whatTheElmTransformerDoesAtTheMoment(...);
    } else {
        let specifier = "./" + path.filename(asset.filePath) + "?" +
                        // Sort, normalize leading "./", etc.
                        normalizeQuery(asset.query).toString();
        asset.setCode(`export * from "${specifier}";`);
        asset.type = "js";
        return [asset];
    }
}

@ChristophP
Copy link
Contributor Author

ChristophP commented May 29, 2022

Hi @mischnic ,

is there a usecase for importing A.elm?with=B.elm and in some other file importing A.elm?with=C.elm (which wouldn't be possible with the config the with parameter is essentially fixed for all imports)?

I don't see one to be honest. The whole point of this multi-module compilation is saving on bundle size. If one needs A.elm and B.elm somewhere and A.elm and C.elm somewhere else, it would be better to have the three of them in one bundle together, because otherwise A.elm would be duplicated, which kinda defeats the purpose of doing mulit-module compilation in the first place.

Apart from the other advantages you've listed, if the only concern is deduplicating two different but equivalent with lists, then this would work:

This is very interesting. Nice trick to sort the query string and turn it into a JS asset with a normalized query. Maybe that would also be good for the image transformer.

Despite of there being a solution to the duplication problem, I think advantages of the config approach outweigh the ones of the query string approach (which is mostly not having to have a config). Especially the fact that the query approach gets really tedious with long file paths and/or many extra sources. So I'm proposing to go with the config solution.

@ChristophP
Copy link
Contributor Author

ChristophP commented Jun 17, 2022

So I ended up removing the query string deduplication logic for two reasons

  • I couldn't get it to work as I wanted it in the test. The Main.elm was always shown multiple times in the integration test.
  • The deduplication would have prevented unintentional asset duplication when things are imported more than once with different order of query params. If people use this feature sparingly or with the special resolver like this Add multi module compilation for elm #8076 (comment), unintentional asset duplication won't be a huge deal anyway.

Other than adding documentation for this feature and publishing the resolver package I think the work here is done. Does this look like something we could merge?

@mischnic
Copy link
Member

If they're duplicated, then there will be two assets with the filepath Main.elm:

The Main.elm was always shown multiple times in the integration test.

(Sorry, I might have confused you there. There will be multiple assets called Main.elm, but only one of them will be the actual Elm bundle, and all other ones will just contain export * from "...";)

@ChristophP
Copy link
Contributor Author

@mischnic

(Sorry, I might have confused you there. There will be multiple assets called Main.elm, but only one of them will be the actual Elm bundle, and all other ones will just contain export * from "...";)

Ah ok, yeah I got confused a bit. Would there be a good way to test that the other ones are just export * from ... then?

If so I could do that and bring back the deduplication. But overall, I guess the deduplication logic is not that important when the recommended way to use this feature is the resolver you described.

@mischnic
Copy link
Member

I can't think of a better way than asset.getCode().length < ???.

But yeah, I don't think that the deduplication is essential

@ChristophP
Copy link
Contributor Author

Ok, if you're fine not having the deduplication logic then I guess this PR is ready.

Would you like me to create another PR for the documentation?

Copy link
Member

@devongovett devongovett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Yes updating the documentation would be really helpful. 😄

@ChristophP
Copy link
Contributor Author

Added some docs here parcel-bundler/website#1038

Will still finish up the extra third party resolver package @mischnic was proposing. here #8076 (comment)

@ChristophP
Copy link
Contributor Author

ChristophP commented Jul 31, 2022

Thanks @mischnic will look at some of the suggestions

@ChristophP
Copy link
Contributor Author

ChristophP commented Jul 31, 2022

I also bumped terser to the latest version to fix some issues around production builds that hang forever
see terser/terser#1212 and passiomatic/elm-designer#58

@mischnic
Copy link
Member

You'll need to run yarn to update the lockfile after the Terser bump. And please also bump the version in https://github.com/parcel-bundler/parcel/blob/v2/packages/optimizers/terser/package.json

@devongovett
Copy link
Member

Handled terser bump separately

@devongovett devongovett merged commit fca5c8c into parcel-bundler:v2 Jul 31, 2022
@ChristophP ChristophP deleted the add-multi-module-compilation-for-elm branch July 31, 2022 19:52
gorakong pushed a commit that referenced this pull request Nov 3, 2022
* upstream/v2: (22 commits)
  Cross compile toolchains are built into docker image already
  Also fix release build
  Update centos node version
  v2.7.0
  Changelog for v2.7.0
  Use placeholder expression when replacing unused symbols (#8358)
  Lint (#8359)
  Add support for errorRecovery option in @parcel/transformer-css (#8352)
  VS Code Extension for Parcel (#8139)
  Add multi module compilation for elm (#8076)
  Bump terser from 5.7.2 to 5.14.2 (#8322)
  Bump node-forge from 1.2.1 to 1.3.0 (#8271)
  allow cjs config files on type module projects (#8253)
  inject script for hmr when there is only normal script in html (#8330)
  feat: support react refresh for @emotion/react (#8205)
  Update index.d.ts (#8293)
  remove charset from jsloader script set (#8346)
  Log resolved targets in verbose log level (#8254)
  Fix missing module in large app from Experimental Bundler (#8303)
  [Symbol Propagation] Non-deterministic bundle hashes (#8212)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Elm code with Debug.log fails with Javascript error Elm: Bulk compile assets to avoid duplication
4 participants