Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Replace data structures with tagged arrays #4020

Open
gorbak25 opened this issue Mar 5, 2021 · 53 comments
Open

Proposal: Replace data structures with tagged arrays #4020

gorbak25 opened this issue Mar 5, 2021 · 53 comments

Comments

@gorbak25
Copy link

gorbak25 commented Mar 5, 2021

Summary

Might be a necrobump of #2577
Currently pattern matching on union types discriminates the types in the union using instanceOf. This incurs a significant performance penalty for larger union types and makes it infeasible to dispatch the code based on the type(switch on a tag instead of an if(... instanceof ...) ladder. Additionally construction of large data structures is slow due to the need to constantly allocate new heap objects(instead of being JS literal). JIT is affected as accessing a property/walking the prototype chain is not cheap.
Consider replacing data structures with an array ["TAG", value0, value1, ..., valueN]

Motivation

After running sed -i 's/instanceof/.constructor ===/g' dist/bundle.js on https://github.com/erlscripten/erlps-aesophia I've decreased the runtime by 6%. This makes sense as erlscripten is based on the following union type - https://github.com/erlscripten/erlps-core/blob/c881e39901a2d6c29972b8a13c3c1ee760754560/src/Erlang/Type.purs#L42
I've started benchmarking various alternatives to instanceof and came to the conclusion that by just keeping everything in an array we generate fast JS which also can be easily optimized by the JIT compiler.
For

data Tree = Node Tree Tree | Leaf Int

foo :: Tree -> Int -> Int
foo (Node l r) acc = foo r (foo l acc)
foo (Leaf x) acc = acc + x

main :: Int
main = foo (Node (Node (Node (Leaf 1) (Leaf 2)) (Node (Leaf 3) (Leaf 4))) (Node (Node (Leaf 5) (Leaf 6)) (Node (Leaf 7) (Leaf 8)))) 0

Keeping everything in an array yields an 95%+ performance improvement https://jsbench.me/beklw3j28m/2
I've also noticed that construction of large terms in erlscripten is suspiciously slow(due to the usage of "new")

Proposal

Replace new Foo(a1, a2, a3) with ['Foo', a1, a2, a3]
Replace v instanceof Foo with v[0] === 'Foo'
Replace v.valueN with v[N+1]

Examples

Take a look at the benchmark: https://jsbench.me/beklw3j28m/2

Small question

In the Erlscripten project we plan to maintain and ship a custom superoptimizing Purescript compiler.
Right now it lives here: https://github.com/erlscripten/purescript/commits/erlscripten-lambdaelim It includes some unmerged fixes in this repository and it features a completelly reworked codegen which never generates additional lamdas(it uses let statements instead of lambdas to save on stack space, this also allows for TCO to be applied everywhere). We will implement this proposal on our fork next week. Would you be interested in a PR? Is this a feature which can be upstreamed?

@gorbak25 gorbak25 changed the title Proposal: Construction and pattern matching of large union types is very slow Proposal: Replace data structures with tagged arrays Mar 5, 2021
@hdgarrood
Copy link
Contributor

I’m open to changing this but I would like to see much more comprehensive benchmarks first, and preferably compare the results on other engines too, not just V8.

@natefaubion
Copy link
Contributor

I did benchmarking recently, and at least for the test I put together, bare records with int tags were the fastest. Is there a reason to prefer arrays over records?

@garyb
Copy link
Member

garyb commented Mar 5, 2021

Just for the record, the current prototype-based things was chosen as originally it far outperformed using literals, thanks to the "hidden class" optimisation.

Nate and I briefly talked about the object-literal-with-int-tags idea recently since he discovered that now performs better than the prototype thing - I'd also prefer that approach over array literals unless there is definitely a noteworthy performance benefit.

@rhendric
Copy link
Member

rhendric commented Mar 5, 2021

I believe object literals outperform arrays on both V8 and Firefox, but only if the field names are unique. If you try a benchmark where every object looks like { tag, value0, value1, ... }, I think you'll see arrays do better. But if data Tree = Node Tree Tree | Leaf Int corresponds to /* Node */ { value0, value1, tag: 0 }; /* Leaf */ { value2, tag: 1 };, I expect that to (narrowly?) outperform arrays. (I'm not sure if putting the common tag field at the end makes a difference; I do it out of superstition based on past experience but it might not be relevant in modern JS engines.)

@natefaubion
Copy link
Contributor

I'll note that I strongly agree that we should change the constructor representation. The current representation did help us at one point, I don't think it's necessary anymore, and it contributes to the very large bundle sizes, as even unit constructors involve generating classes and performing instanceof checks. I would just like to see numbers around existing hot code, such as Data.Map. I agree that we should not optimize for just v8, and we should strike a balance as best we can. FWIW, I believe Elm uses bare records with a tag field, but that might have changed.

@paluh
Copy link

paluh commented Mar 6, 2021

@hdgarrood I think that the proposed benchmark is not V8 specific and can be run on any browser (I've tried it on Fiirefox and got similar performance gain).
@rhendric There is also an encoding which lies somewhere in between { tag: 1, values: [...] } (two allocations) which can outperform plain non monomorphic records on case matching I think. On the other hand proposed array based representation (with Int tags instead of String) should result in the smallest bundle size.

My two cents:
From a full stack PS user perspective I see any of the above value representation change as a possible huge benefit - it can simplify and optimize data exchange flow between PS client and PS server (on JS backend). I know I should not rely on this implementation detail but I will for sure :-P.
Having any of the above representations makes serialization into the JSON much simpler with zero allocation overhead.

@natefaubion
Copy link
Contributor

natefaubion commented Mar 6, 2021

But if data Tree = Node Tree Tree | Leaf Int corresponds to /* Node / { value0, value1, tag: 0 }; / Leaf */ { value2, tag: 1 };, I expect that to (narrowly?) outperform arrays. (I'm not sure if putting the common tag field at the end makes a difference; I do it out of superstition based on past experience but it might not be relevant in modern JS engines.)

JS engines use an inline cache based on the structural hash of the record, which includes fields and field order. For optimal performance with a sum type, where you have to lookup the tag field, I think you'd want to pad each constructor, so they all have the same fields (initialized to undefined when unused). If you mix up the structural hash for each constructor, dispatch will degrade to a polymorphic inline cache, and then to megamorphic dispatch depending on the variety.

@gorbak25
Copy link
Author

gorbak25 commented Mar 6, 2021

Personally I think arrays are preferable due to 2 assumptions which(probably) are true:

  1. v[x] is trivially optimized by a JIT compiler - Should compile down to some simple pointer arithmetic, 1-2 instructions max - when Turbofan in V8 engages it should make a large difference.
  2. The bundle size is minimal - no need to specify names of the object literals

I agree that before making the decision on what the new format shall look like we need to have more data. I will create more appropriate benchmarks - there are 4 implementations to consider:

  1. Arrays with string tag
  2. Arrays with int tag
  3. Object literals with string tag
  4. Object literals with int tag

Under 3 runtimes - Firefox, Chrome, NodeJS

My current benchmark is not representative of the entire situation, it's a good start but it skips over the performance of matching large union types and the performance of creating large terms. For instance right now I'm kind of disappointed of the time it takes to construct some large erlang terms(https://github.com/erlscripten/erlps-aesophia/blob/58b62b3f0d0f579a461f19f98d326d67d730b0dc/src/AesoAstInferTypes.purs#L964). And the time it takes to match a large union type(for N clauses you are calling instanceof N times :( - https://github.com/aeternity/aesophia/blob/c6e7db23816052da8851ba68503abed9f023ae3c/src/aeso_ast_infer_types.erl#L2023 or https://github.com/erlscripten/erlps-core/blob/c881e39901a2d6c29972b8a13c3c1ee760754560/src/Erlang/Type.purs#L193 )

I will create two more benchmarks in addition to the one I've already made:

  1. Evaluate a small language with lot of types in the AST(for instance... data Expr = Literal Number | Add Expr Expr | Sub Expr Expr | Mul Expr Expr | Div Expr Expr | Srqt Expr | ... - the more types the better ;) )
  2. Construct a bintree with 1 mln elements(and do nothing with it - I'm only interested in the allocation overhead)

@gorbak25
Copy link
Author

gorbak25 commented Mar 7, 2021

Here are more benchmarks: https://github.com/erlscripten/purs-union-representation
It looks like using objects with int tags is the way forward.

@natefaubion
Copy link
Contributor

The disadvantage of integer tags however is that it's extremely difficult to debug if you're tracing values.

@thomashoneyman
Copy link
Member

Judging from those benchmarks it looks like objects with string tags are only narrowly behind

@garyb
Copy link
Member

garyb commented Mar 7, 2021

Thanks for putting these benchmarks together!

I was going to come and retract what I said about arrays, as I realised it's not like the objects have usefully named fields anyway - but looks like objects is the way to go anyway, from these.

I think string tags are probably worth it for the slight performance hit, given they preserve some semblance of human readability in the generated code, and more importantly mean that debug-tracing is still reasonably ergonomic.

@gorbak25
Copy link
Author

gorbak25 commented Mar 7, 2021

I've found a small typo in bench3.js - I've fixed it and redone the benchmarks.
TL DR: Objects with int/string tags are the fastest to allocate both on Firefox/NodeJS. In Bench1 and Bench2 on NodeJS Objects win - on Firefox that's not the case as Bench2 favors heavily arrays.

@gorbak25
Copy link
Author

gorbak25 commented Mar 8, 2021

For the record. Elm generates different tags depending on the environment. Perhaps we shall a compiler flag enabling debug mode(Emit string tags instead of ints in debug mode)?

@jy14898
Copy link
Contributor

jy14898 commented Mar 10, 2021

I think it's worth considering using JavaScript's Symbol primitive for tagging - I imagine the performance is very similar to strings once the Symbol is constructed, but with the benefit of being globally unique even when given the same "debug" name:
Symbol("Just") === Symbol("Just") // false

The compiler wouldn't make use of this fact at this moment, but I've been experimenting adding unions to purescript where types can exist in the same union, if they are discernible from each other at runtime - using integer tags or strings instead of implicit class constructors or symbols would result in very few instances being possible by default.

@hdgarrood
Copy link
Contributor

I'm not keen on adding a compiler flag - that doubles the amount of paths we have to test whenever we're doing anything with codegen, and complicates the incremental build system since it would then have to check that the flags which affect codegen haven't changed if the output is otherwise up to date.

Using Symbol would mean dropping support for a few browsers which are still fairly widely used, so I think for that to be an option we'd have to do something like var just = typeof Symbol === 'undefined' ? "Just" : Symbol("Just").

@garyb
Copy link
Member

garyb commented Mar 10, 2021

Another fallback option for that, that isn't string-dependent, would be to define the tags as anonymous objects, like var Just = { sym: "Just" } - it'd still be debug-readable to a reasonable degree ({ tag: { sym: "Just" }, value0: true }), and also acts like a symbol in that it is only equal with itself.

I don't know how object reference comparisons equality compares with int/string equality though, so that would need benchmarking too. I suppose the same thing is true of Symbol equality - we've not seen how that performs either.

@gorbak25
Copy link
Author

Personally I think it would be great to aim at the smallest bundle sizes(int tags).
The compiler can always generate comments annotating tags. For instance:

switch(v[0]) {
 case 0: //Nothing
 ...
 break;
 case 1: //Just
 ...
 break;
}

@garyb
Copy link
Member

garyb commented Mar 10, 2021

Gzip means most of this kind of code barely affects the "actual" bundle size. Our app's on-disk bundle size is ~10mb, but served as 1mb.

@gorbak25
Copy link
Author

As promised here is the compiler which emits arrays with string tags for union types:
https://github.com/erlscripten/purescript/tree/erlscripten-arr-record
It shaved off 0.3s from https://github.com/erlscripten/erlps-aesophia compared to the exact same compiler without the patch.

I think it will be good to implement the other alternatives and make comparisons on existing projects.

@hdgarrood
Copy link
Contributor

@natefaubion are the benchmarks you mentioned published anywhere?

@natefaubion
Copy link
Contributor

@hdgarrood The benchmark I constructed was not anymore interesting than what has been referenced so far.

@gorbak25 Another benchmark I would like to see is around sum-type variety. Consider:

  • Define a sum type with 7 constructors, where each constructor has one additional field.
  • Construct a large array that cycles through the variants.
  • Loop through the array, pattern matching on each value.

I would like to see if there is a significant difference between the different approaches and something like:

function Test(tag, value0, value1, value2, value3, value4, value5, value6) {
  return { tag, value0, value1, value2, value3, value4, value5, value6 };
}

[Test("Test1", 0), Test("Test2", 0, 1), Test("Test3", 0, 1, 2), ... etc]

To determine what affect uniform undefined padding has on it.

@natefaubion
Copy link
Contributor

The benchmark I wrote was for an Array slice: http://jsbench.github.io/#6baa45e72a0c75ee532dba584e9b73f4

@natefaubion
Copy link
Contributor

natefaubion commented Mar 10, 2021

I would prefer to not use any sort of tag that relies on referential identity, or otherwise ties something to a specific realm. This is one issue that I find frustrating about the current instanceof approach. It would be really nice if our data types were just simple primitives.

@megamaddu
Copy link

megamaddu commented Mar 11, 2021

Tags would need the complete module and type name, right? That's one advantage Symbols, anonymous tagged objects, and classes have over strings. It's probably a good idea to include string tags with long, redundant prefixes in the tests, as it might take longer to compare "Company.Components.Feature.SomeModule.SomeType" to "Company.Components.Feature.SomeModule.SomeOtherType" than it does to compare "X" with "Y".


Gzip means most of this kind of code barely affects the "actual" bundle size. Our app's on-disk bundle size is ~10mb, but served as 1mb.

Unzipping and parsing large bundles can still be an issue on underpowered devices, so it might still be worth reducing where the compiler can easily do so (easily as in a bundler or compressor doesn't know those values can be optimized out).

@natefaubion
Copy link
Contributor

Tags would need the complete module and type name, right?

It's not clear to me why you need the whole module name. Do you have a specific issue in mind? Are you are trying to guard against bad FFI mixups?

@megamaddu
Copy link

But if ints end up being the choice for all around performance it'd be pretty important to include some means of looking up types from that int tag, at least as a debug option.

@megamaddu
Copy link

Tags would need the complete module and type name, right?

It's not clear to me why you need the whole module name. Do you have a specific issue in mind? Are you are trying to guard against bad FFI mixups?

FFI, maybe coercible representations, I'm not sure.

@megamaddu
Copy link

megamaddu commented Mar 11, 2021

One more thought, not for any particular side --

It may be more valuable to optimize bundle size and parsing over runtime performance, especially since there's no built in means of bundle splitting. A slight performance difference in an event handler is rarely noticed, but a page that's consistently slow to load in the first place can cause abandonment. Runtime performance edge cases can also be handled with FFI more easily than bundle size issues. It's harder to benchmark that without real apps and devices though.

Edit: This is anecdotal at this point, but at a previous company I did a bunch of bundle optimization testing and reducing the initial bundle size from 700kb to 200kb reduced the time to a responsive page by about 2 seconds on our most common mobile device. It may end up making no difference in this case though. Strings might be extremely easy to parse. Large embedded JSON objects are slow to parse, and they're sometimes left as strings within a page's source and read after page load using JSON.parse instead, as there are far fewer syntax possibilities when reading something as JSON.

@megamaddu
Copy link

I did a quick test of bench1.js with an "Object + Symbol tag" on node.js:

> node bench[1].js  
Baseline Purescript x 127,112 ops/sec ±1.15% (87 runs sampled)
Erlscripten Purescript x 125,985 ops/sec ±2.07% (85 runs sampled)
Baseline Purescript + Array with String tag x 876,524 ops/sec ±0.82% (92 runs sampled)
Erlscripten Purescript + Array with String tag x 973,233 ops/sec ±0.78% (88 runs sampled)
Baseline Purescript + Array with Int tag x 878,747 ops/sec ±0.91% (90 runs sampled)
Erlscripten Purescript + Array with Int tag x 970,162 ops/sec ±1.11% (90 runs sampled)
Baseline Purescript + Object with String tag x 903,641 ops/sec ±1.08% (88 runs sampled)
Erlscripten Purescript + Object with String tag x 1,006,011 ops/sec ±0.94% (91 runs sampled)
Baseline Purescript + Object with Int tag x 918,739 ops/sec ±0.81% (92 runs sampled)
Erlscripten Purescript + Object with Int tag x 1,003,943 ops/sec ±1.16% (89 runs sampled)
Baseline Purescript + Object with Symbol tag x 2,061,186 ops/sec ±1.18% (88 runs sampled)
Erlscripten Purescript + Object with Symbol tag x 2,853,434 ops/sec ±0.89% (88 runs sampled)
Fastest is Erlscripten Purescript + Object with Symbol tag

Looks like it was maybe designed for this. 😅

Here's the added code:

.add('Baseline Purescript + Object with Symbol tag', function() {
var Node_ = /* typeof Symbol === undefined ? { type: "Node" } : */ Symbol("Node")
var Leaf_ = /* typeof Symbol === undefined ? { type: "Leaf" } : */ Symbol("Leaf")
var foo = function (v) {
    return function (acc) {
        if (v.tag == Node_) {
            return foo(v.value1)(foo(v.value0)(acc));
        };
        if (v.tag == Leaf_) {
            return acc + v.value0 | 0;
        };
        throw new Error("Failed pattern match at A (line 7, column 1 - line 7, column 26): " + [ v.constructor.name, acc.constructor.name ]);
    };
};
// {tag: "Node", value0: _, value1: _}
// {tag: "Leaf", value0: _}
var main = foo({tag: Node_, value0: {tag: Node_, value0: {tag: Node_, value0: {tag: Leaf_, value0: 1}, value1: {tag: Leaf_, value0: 2}}, value1: {tag: Node_, value0: {tag: Leaf_, value0: 3}, value1: {tag: Leaf_, value0: 4}}}, value1: {tag: Node_, value0: {tag: Node_, value0: {tag: Leaf_, value0: 5}, value1: {tag: Leaf_, value0: 6}}, value1: {tag: Node_, value0: {tag: Leaf_, value0: 7}, value1: {tag: Leaf_, value0: 8}}}})(0);
})

.add('Erlscripten Purescript + Object with Symbol tag', function() {
var Node_ = /* typeof Symbol === undefined ? { type: "Node" } : */ Symbol("Node")
var Leaf_ = /* typeof Symbol === undefined ? { type: "Leaf" } : */ Symbol("Leaf")
let foo = function ($copy_v) {
    return function ($copy_acc) {
        var $tco_var_v = $copy_v;
        $tco_loop: while (true) {
            let v = $tco_var_v;
            let acc = $copy_acc;
            if (v.tag === Node_) {
                $tco_var_v = v.value1;
                $copy_acc = foo(v.value0)(acc);
                continue $tco_loop;
            };
            if (v.tag === Leaf_) {
                return acc + v.value0 | 0;
            };
            throw new Error("Failed pattern match at A (line 7, column 1 - line 7, column 26): " + [ v.constructor.name, acc.constructor.name ]);
        };
    };
};
// {tag: "Node", value0: _, value1: _}
// {tag: "Leaf", value0: _}
var main = foo({tag: Node_, value0: {tag: Node_, value0: {tag: Node_, value0: {tag: Leaf_, value0: 1}, value1: {tag: Leaf_, value0: 2}}, value1: {tag: Node_, value0: {tag: Leaf_, value0: 3}, value1: {tag: Leaf_, value0: 4}}}, value1: {tag: Node_, value0: {tag: Node_, value0: {tag: Leaf_, value0: 5}, value1: {tag: Leaf_, value0: 6}}, value1: {tag: Node_, value0: {tag: Leaf_, value0: 7}, value1: {tag: Leaf_, value0: 8}}}})(0);
})

@natefaubion
Copy link
Contributor

That's a very significant improvement! Hard to pass that up.

@megamaddu
Copy link

Is that really 22x faster than baseline 0.14 purescript?? 😱 (micro benchmark but still..)

@megamaddu
Copy link

megamaddu commented Mar 11, 2021

Safari prefers literals but with a much less extreme bias:

[Log] Baseline Purescript x 562,969 ops/sec ±0.88% (62 runs sampled) (bench1.js, line 293)
[Log] Erlscripten Purescript x 599,189 ops/sec ±0.82% (63 runs sampled) (bench1.js, line 293)
[Log] Baseline Purescript + Array with String tag x 1,322,561 ops/sec ±0.84% (63 runs sampled) (bench1.js, line 293)
[Log] Erlscripten Purescript + Array with String tag x 1,722,923 ops/sec ±0.81% (63 runs sampled) (bench1.js, line 293)
[Log] Baseline Purescript + Array with Int tag x 1,022,494 ops/sec ±0.84% (60 runs sampled) (bench1.js, line 293)
[Log] Erlscripten Purescript + Array with Int tag x 1,678,704 ops/sec ±1.02% (62 runs sampled) (bench1.js, line 293)
[Log] Baseline Purescript + Object with String tag x 1,251,847 ops/sec ±0.65% (63 runs sampled) (bench1.js, line 293)
[Log] Erlscripten Purescript + Object with String tag x 1,631,020 ops/sec ±1.03% (62 runs sampled) (bench1.js, line 293)
[Log] Baseline Purescript + Object with Int tag x 1,290,772 ops/sec ±0.92% (62 runs sampled) (bench1.js, line 293)
[Log] Erlscripten Purescript + Object with Int tag x 1,631,225 ops/sec ±0.81% (62 runs sampled) (bench1.js, line 293)
[Log] Baseline Purescript + Object with Symbol tag x 918,017 ops/sec ±1.46% (59 runs sampled) (bench1.js, line 293)
[Log] Erlscripten Purescript + Object with Symbol tag x 997,032 ops/sec ±2.11% (57 runs sampled) (bench1.js, line 293)
[Log] Fastest is Erlscripten Purescript + Array with String tag (bench1.js, line 296)

@jy14898
Copy link
Contributor

jy14898 commented Mar 11, 2021

I did a quick test of bench1.js with an "Object + Symbol tag" on node.js:

I think something weird might be going on by just having the tag lifted out into a variable, did you try the same for strings/ints? I just did a quick test on the jsbench in the first post, and records with lifted strings far outperforms inline strings/lifted symbols. Nonetheless, I don't think these microbenchmarks are that accurate, it'll be interesting to see on a bigger codebase

@natefaubion
Copy link
Contributor

natefaubion commented Mar 11, 2021

If the compiler itself is patched, you can test against the purescript-language-cst-parser integration test, which spits out parse times for the package set. It is extremely pattern match heavy. This won't be scientific at all, but can maybe give you a sense of a relative speed up for a real-world project. There is also a script for benching a single file.

@megamaddu
Copy link

I did a quick test of bench1.js with an "Object + Symbol tag" on node.js:

I think something weird might be going on by just having the tag lifted out into a variable, did you try the same for strings/ints? I just did a quick test on the jsbench in the first post, and records with lifted strings far outperforms inline strings/lifted symbols. Nonetheless, I don't think these microbenchmarks are that accurate, it'll be interesting to see on a bigger codebase

Yeah, I think you're right. It's odd that ints aren't always tied with the fastest option.

@jy14898
Copy link
Contributor

jy14898 commented Mar 14, 2021

I went ahead and created a quick mock of the changes for the sake of profiling

Object + String: https://github.com/purescript/purescript/compare/v0.13.8...jy14898:data-repr?expand=1

Object + Int: https://github.com/purescript/purescript/compare/v0.13.8...jy14898:data-repr-int?expand=1

Note that they don't inline any of the constructors, it's still very similar to how the implicit class code worked. Also, the int version is just a hash of the string tags, which obviously is only suitable for testing here

Results using node v14.16.0

[nix-shell:~/purescript-language-cst-parser]$ tail v0.13.8_stock.out -n 2
Mean Parse: 7.123ms
Successfully printed 1969 of 1969 successully parsed modules.

[nix-shell:~/purescript-language-cst-parser]$ tail v0.13.8_string_tagged_object.out -n 2
Mean Parse: 5.208ms
Successfully printed 1969 of 1969 successully parsed modules.

[nix-shell:~/purescript-language-cst-parser]$ tail v0.13.8_int_tagged_object.out -n 2
Mean Parse: 4.986ms
Successfully printed 1969 of 1969 successully parsed modules.

@natefaubion
Copy link
Contributor

Do you mind showing the full stat output?

@jy14898
Copy link
Contributor

jy14898 commented Mar 14, 2021

Do you mind showing the full stat output?

https://gist.github.com/jy14898/bae68599e7794ff53c3df4a1c922e249

@natefaubion
Copy link
Contributor

30% improvement for the slower cases is a nice boost!

@natefaubion
Copy link
Contributor

Note that these times may be strongly affected by how hot the JIT was when it parsed the file. That's why the bench-file script exists for if you want to warm up the JIT on a specific file.

@natefaubion
Copy link
Contributor

natefaubion commented Sep 13, 2021

Circling back around to my suggestion #4020 (comment)

I implemented https://jsbench.me/1dktj6yaxk/1 with unminified src here https://gist.github.com/natefaubion/7fea2c2d8981ea4da5edddf6664548c7. This shows that padding values has a significant effect on performance. In Chrome I saw a speedup in the string tag (with if/else) go from 25k ops/s to 136k ops/s when padded. To me, this confirms my hunch that having lots of variations in the number of constructor arguments for sum-types will end up invalidating inline-caches. I would strongly recommend that whatever we do, we pad variants.

@garyb
Copy link
Member

garyb commented Sep 13, 2021

The padding results in much better performing cases on Firefox on Windows too. 25k ops/s vs 86k ops/s for string tag if/else.

edit: decimal place 😳

@natefaubion
Copy link
Contributor

I put together master...natefaubion:padded-ctors which has padded constructors with string tags.

@natefaubion
Copy link
Contributor

In general I'm seeing 30-35% reduction in run time.

@aranchelk
Copy link

One more aspect to consider when evaluating solutions to this issue is compatibility with structured cloning: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm

Structured cloning may be useful for web workers, indexdb, and directly via the structuredClone() function.

An example of a proposed encoding that's not compatible is tagging with a Symbol, per the documentation linked to above, it's the only primitive type that's not supported.

@shybovycha
Copy link

Just following the discussion from #4540. I understand the approach considered is all about adding the "tag" to the object being constructed for sum type.

I guess this might work well in certain scenarios when sharing the values between the realms of JS/PS. But in my opinion the approach is inherently unsafe for many cases - two different objects (from different packages, let's say) with the same tag value would match regardless of the underlying PureScript type (if the tag values are same - the objects would be considered same).

On the other hand, symbols can be constructed from a same string value, see Symbol.for():

Symbol('moo') === Symbol('moo') // false
Symbol.for('foo') === Symbol.for('foo') // true

I guess this could be utilized together with the tag approach to make comparisons more strict and allow for safe value reuse between JS/PS and packages. Albeit as @megamaddu pointed out, it would require full type specificator in the tag value - so that (arbitrary example) Node from package Data would not have same tag value as Node from package BST or from some random 3rd party JS library.

More important question though: should this behavior (of adding the logic specific to tagging objects to be shared with JS) be preferred over generally reducing the bundle size? What is the more common use case here? Could some of it be switched by a feature flag? E.g. add the tag and comparison logic if the feature flag is enabled and prefer smaller bundle size otherwise (assuming the type is only used in the same bundle / module and never leaks to the outside JS).

@aranchelk
Copy link

the approach is inherently unsafe for many cases - two different objects (from different packages, let's say) with the same tag value would match regardless of the underlying PureScript type

That's not correct, the value constructors (i.e. tags) may have identical runtime representations, but the type checker would not allow PS code to compare them because they'd still have different types.

Symbols superficially look like a good fit but they are a JS type that can't be used for structured cloning -- in real-world use in larger applications it would be very helpful to support structured cloning to facilitate use of web workers. The obvious response is to do your own serialization to move data to and from workers, but recursive JSON encoding is in my experience profiling one of the slower operations.

@shybovycha
Copy link

shybovycha commented Mar 20, 2024

the type checker would not allow PS code to compare them

That's why I specifically mentioned using the types between JS and PS - true, PS compiler will shout at ya, but once you leave its boundaries and try to use the same values with TS/JS - you get into the exact situation I am referring to.

The point I was trying to make with web workers and structured cloning is that this is not the case for all uses of PS, and that's why I proposed the use of compiler flag to control this behavior. Maybe the best approach would be to target general case rather than specific one? E.g. if the user wants to use the code in web workers or the structured cloning in isolation, they could get the whatever approach is more suited for that by adding the appropriate compiler flag to their project config. Assuming by default people won't use PureScript types outside of PureScript code.

Also, there are few points re. using symbols I reckon might solve the serialization issues:


F# / Fable already does something similar to the proposed approach with tags, but in my opinion it is inferior even to the existing PureScript implementation, as it loses any sort of type checks at runtime outside of F# code itself:

type Msg =
        | Increment
        | Decrement

let Msg2Str = function
        | Increment -> "increment"
        | Decrement -> "decrement"

generates:

export class Msg extends Union {
    constructor(tag, fields) {
        super();
        this.tag = tag;
        this.fields = fields;
    }
    cases() {
        return ["Increment", "Decrement"];
    }
}

export function Msg_$reflection() {
    return union_type("Test.Main.Msg", [], Msg, () => [[], []]);
}

export function Msg__get_IsIncrement(this$, unitArg) {
    if (this$.tag === 0) {
        return true;
    }
    else {
        return false;
    }
}

export function Msg__get_IsDecrement(this$, unitArg) {
    if (this$.tag === 1) {
        return true;
    }
    else {
        return false;
    }
}

export function Msg2Str(_arg) {
    if (_arg.tag === 1) {
        return "decrement";
    }
    else {
        return "increment";
    }
}

But then trying to use it with a function accepting any other sum type in JS:

module Program =

    type Cmd =
        | Select
        | Insert
        | Delete

    let Cmd2Str = function
        | Select -> "select"
        | Insert -> "insert"
        | Delete -> "delete"

    let cmd: Cmd = Select
    let msg: Msg = Increment

and in JS

console.log(Cmd2Str(msg)) // yields "select", despite Cmd2Str expecting Cmd argument and msg having the value Increment

console.log(Msg2Str(cmd)) // yields "increment", despite Msg2Str expecting Msg argument and cmd having the value Select

Meanwhile the current approach in PureScript:

module Main where

data Msg = Increment | Decrement

data Cmd = Select | Insert | Delete

msg2str :: Msg -> String
msg2str Increment = "inc"
msg2str Decrement = "dec"

cmd2str :: Cmd -> String
cmd2str Select = "select"
cmd2str Insert = "insert"
cmd2str Delete = "delete"

cmd = Select
msg = Increment

generates

var Increment = /* #__PURE__ */ (function () {
    function Increment() {

    };
    Increment.value = new Increment();
    return Increment;
})();
var Decrement = ...;
var Select = ...;
var Insert = ...;
var Delete = ...;
var msg2str = function (v) {
    if (v instanceof Increment) {
        return "inc";
    };
    if (v instanceof Decrement) {
        return "dec";
    };
    throw new Error("Failed pattern match at Main (line 7, column 1 - line 7, column 25): " + [ v.constructor.name ]);
};
var msg = /* #__PURE__ */ (function () {
    return Increment.value;
})();
var cmd2str = function (v) {
    if (v instanceof Select) {
        return "select";
    };
    if (v instanceof Insert) {
        return "insert";
    };
    if (v instanceof Delete) {
        return "delete";
    };
    throw new Error("Failed pattern match at Main (line 11, column 1 - line 11, column 25): " + [ v.constructor.name ]);
};
var cmd = /* #__PURE__ */ (function () {
    return Select.value;
})();

Then, trying to call function with the wrong argument in JS will fail at runtime:

cmd2str(msg) // fails with Uncaught Error: Failed pattern match at Main (line 11, column 1 - line 11, column 25): Increment

(a bit of a side track) I recently ran into the exact same issue in OpenAPI-generated TypeScript code as what the F# approach has - deserializing sum type responses from backend and then trying to match the response type on the frontend, which ended up always matching any two values even though their type was different.

I bet the same would happen if PureScript had the tag approach with integers or unbound strings - e.g. using same values across different types - like the mapping Cmd.Select => 1 and Msg.Increment => 1 would always result in Increment == Select despite the values residing in two different types.

Same would happen with two unbound strings (as I call them; referring to the values not being tied to the module/package): mappings Cmd.Select => "Select" and OtherType.Select => "Select" would always result in Cmd == OtherType if comparing by the tag value ("Select" === "Select").

Changing that to even Cmd.Select => "MyModule1.Cmd.Select" and OtherType.Select => "ThirdPartyModule.OtherType.Select" would already solve this to an extent, with the caveat of choosing the unique type prefix (when two types would have same prefixes, e.x. AST.Function from PureScript module and AST.Function from JavaScript module).

That is why I advocate for the use of Symbol - even with Symbol.from(String) for those web worker use cases. If hidden in IIFE or as private/static members of the sum type value objects:

var Increment = /* #__PURE__ */ (function () {
    function Increment() {
    };
    Increment.value = new Increment();
    Increment.prototype.tag = Symbol.for('Main.Increment');
    return Increment;
})();

@aranchelk
Copy link

Passing PS types into PS functions from JS and relying on failed pattern matches to do runtime type checking seems pretty far outside any best practice or intended use cases I've seen.

What would that mean for newtypes, which IIRC are supposed to be indistinguishable from the original type at run time? Newtypes are implicitly not supposed to be passed to a function accepting the original type, but they won't have any tag to fail a pattern match.

@shybovycha
Copy link

Passing PS types into PS functions from JS ... seems pretty far outside any best practice or intended use cases I've seen.

As far as I understand, that is what the whole fuss is about - using index db, web workers and structural copying - they all require interop between JS and PS.

What would that mean for newtypes, which IIRC are supposed to be indistinguishable from the original type at run time?

Sorry, not sure I follow - what is "that" and what difference should it make for newtypes?

@aranchelk
Copy link

AFAIK, interop usually takes one of two forms:

  1. Making libraries by wrapping JS in PS and then consuming those libraries from a PS application. JS types are imported as foreign data types and are opaque to PS, you can convert them to PS types in a safe way with the module called Foreign.

  2. Writing libraries in PS that will be consumed by JS, but those PS libraries will be written to take a JS objects and do appropriate validation/conversion (again using the Foreign module).

Instances where you're creating PS data from PS functions passing it around in JS then passed back into PS (without serialization or validation), and relying on pattern matching to catch exceptions -- I think that's far enough from best practice that it's not going to justify creation of a feature. I'm not an authority on those matters, but it runs contrary to everything I've read in articles, official docs, and chats on Discord.

What would be the implication for Newtypes in your proposed style of development?Newtypes are used among other things to segregate an original type from another distinct usage (the newtype) -- authors generally don't want to you to pass the original type into functions that are written to accept the newtype (and vice versa). If you're asking to use symbols to I get runtime type checking, it'll provide a benefit for sums, but it's not at all comprehensive. I brought up newtype because it's a fairly widely used feature that definitionally has the same runtime representation as the underlying type, so it's not likely to get the benefit of any runtime type checking.

@shybovycha
Copy link

Hmm, it does make sense to not have PS types leak into and between JS and back to PS logic, but from the perspective of partially implementing application in PS (like step-by-step code migration from JS to PS, for instance) I can see a lot of cases when this would be desirable.

With newtypes I still don't see how they would (should) behave any differently - aren't they same as smart constructors? As I see it a newtype would have its own function MyNewType() {} with its own tag or symbol.

I am not suggesting moving everything to use symbols. Simply having them for sum types would have benefits already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests