Larger compilation results in TeaVM 10.0.0-SNAPSHOT #870

Open
McMurat opened this issue Nov 27, 2023 · 12 comments

@McMurat

McMurat commented Nov 27, 2023

I noticed that the compilation results are larger in TeaVM 10.0.0-SNAPSHOT, compared to 0.9.0 and earlier versions.

To investigate, I disabled obfuscation and compared multiple functions. One thing I noticed is that TeaVM 10.0.0-SNAPSHOT generates a lot of code to simulate multithreading that was not present in earlier versions. For example (see the $rt_resuming() and $rt_suspending() blocks):

let jur_PossessiveCompositeQuantifierSet_matches = ($this, $stringIndex, $testString, $matchResult) => {
    let var$4, $min, $max, $i, $shift, $ptr, $tmp;
    $ptr = 0;
    if ($rt_resuming()) {
        let $thread = $rt_nativeThread();
        $ptr = $thread.pop();$shift = $thread.pop();$i = $thread.pop();$max = $thread.pop();$min = $thread.pop();var$4 = $thread.pop();$matchResult = $thread.pop();$testString = $thread.pop();$stringIndex = $thread.pop();$this = $thread.pop();
    }
    main: while (true) { switch ($ptr) {
    case 0:
        var$4 = $this.$quantifier;
        $min = var$4.$min5;
        $max = var$4.$max5;
        $i = 0;
        while (true) {
            if ($i >= $min) {
                a: {
                    while (true) {
                        if ($i >= $max)
                            break a;
                        if (($stringIndex + $this.$leaf.$charCount() | 0) > $matchResult.$rightBound0)
                            break a;
                        $shift = $this.$leaf.$accepts($stringIndex, $testString);
                        if ($shift < 1)
                            break;
                        $stringIndex = $stringIndex + $shift | 0;
                        $i = $i + 1 | 0;
                    }
                }
                var$4 = $this.$next8;
                $ptr = 1;
                continue main;
            }
            if (($stringIndex + $this.$leaf.$charCount() | 0) > $matchResult.$rightBound0) {
                $matchResult.$hitEnd = 1;
                return (-1);
            }
            $shift = $this.$leaf.$accepts($stringIndex, $testString);
            if ($shift < 1)
                break;
            $stringIndex = $stringIndex + $shift | 0;
            $i = $i + 1 | 0;
        }
        return (-1);
    case 1:
        $tmp = var$4.$matches0($stringIndex, $testString, $matchResult);
        if ($rt_suspending()) {
            break main;
        }
        $stringIndex = $tmp;
        return $stringIndex;
    default: $rt_invalidPointer();
    }}
    $rt_nativeThread().push($this, $stringIndex, $testString, $matchResult, var$4, $min, $max, $i, $shift, $ptr);
};

There are cases in my program (especially switch statements over Strings) which become much longer and harder to follow after this transformation. I am not using any multithreading at all. Is there any way to disable this kind of code generation?

On top of that, TeaVM creates large reflection tables (including real class names and method name tables, even for lambdas). This was already the case in earlier versions and is not a new problem, but I wonder why it is done. It should be possible to disable this functionality (or restrict it to runtime classes). I know that the class name is required for toString() of the native java/lang/Object. But apart from exceptions, there is no real need for these names in release mode in any codebase that does not rely on reflection.

@konsoletyper
Owner

Is there any possibility to disable this kind of code generation

No, there's no way to do that. But you need to figure out what's causing this behaviour. It's not necessarily actual multi-threaded code. Thread.sleep called within a single thread can also cause such a transformation.
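
To illustrate the point about Thread.sleep, here is a minimal Java sketch. The class and method names are hypothetical, and which methods TeaVM actually transforms depends on its own analysis; the sketch only shows the kind of call chain that can pull otherwise ordinary methods into the suspend/resume form.

```java
// Hypothetical example; names are illustrative and not taken from the issue.
final class AsyncInfectionSketch {
    // Calls Thread.sleep, which is a suspension point even in a single thread.
    static int pause(int value) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            // Restore the interrupt flag instead of swallowing the interruption.
            Thread.currentThread().interrupt();
        }
        return value + 1;
    }

    // Calls pause(), so a compiler that models suspension may have to treat
    // this method as suspendable too.
    static int caller(int value) {
        return pause(value) * 2;
    }

    // Computes the same result with no path to a suspending call.
    static int pure(int value) {
        return (value + 1) * 2;
    }
}
```

Both caller() and pure() return the same value; the difference is only that caller() reaches Thread.sleep, so it is a candidate for the state-machine transformation while pure() is not.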

It should be possible to disable this functionality

No, it's not possible. TeaVM tries to guess the set of classes that need reflection and generates meta-information only for those. So, in order to reduce the meta-information tables, you need to get rid of code that uses class names.

@McMurat
Author

McMurat commented Nov 27, 2023

But why does it generate reflection data for lambdas then? And what type of code is considered to use class names? Everything which uses class references like "MyClass.class"? Because I am not using any Class.forName("...") or related methods and the data is created for most (if not all) classes...

@konsoletyper
Owner

But why does it generate reflection data for lambdas then

Because TeaVM decided that some of your code takes the names of lambda classes. This may not necessarily be true, but an AOT compiler can only act in a conservative way. For example, you can have code like this in one method:

var list = new ArrayList<Object>();
list.add((Runnable) () -> {});

and in some other method something like this

list.get(i).getClass().getName()

then TeaVM MAY think that the lambda class in the first snippet requires metadata, even if at run time that is never the case. It depends.
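
The two fragments can be combined into one self-contained Java sketch. The class and method names here are hypothetical; the point is only that nameOf() receives an Object, so a static analysis cannot easily rule out that the lambda stored by produce() reaches getName().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example; names are illustrative and not taken from the issue.
final class LambdaNameSketch {
    // Stores a lambda and a String in the same list of Objects.
    static List<Object> produce() {
        List<Object> list = new ArrayList<>();
        list.add((Runnable) () -> {});
        list.add("text");
        return list;
    }

    // Takes the class name of whatever element is at index i. Statically,
    // that element could be the lambda, even if callers never pass its index.
    static String nameOf(List<Object> list, int i) {
        return list.get(i).getClass().getName();
    }
}
```

Even if the program only ever calls nameOf(list, 1), a conservative compiler may keep name metadata for every class that can flow into the list.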

And what type of code is considered to use class names?

For example, o.getClass().getName() requires the class name to be encoded somewhere.

I don't want to introduce a switch to control this behaviour. In your case you perhaps don't need this, but someone else would need this feature. Then someone else notices that there's another sort of code that can be thrown away. And TeaVM ends up with 1000 flags and switches, which are quite hard to document properly.

@McMurat
Author

McMurat commented Nov 27, 2023

Okay, thank you for the explanation.

But maybe at least the obfuscated method name arrays, like ["yJ",E(Gf),"yL",D(YU),"xH",D(BM),"xv",D(Dp),"xE",D(BDi),"xG",D(BWR)], could be omitted. Anyone relying on calling these methods by name would not be able to do so anyway once they are obfuscated, right?

@konsoletyper
Owner

But maybe at least the obfuscated method name

These aren't names for reflection, but for virtual tables. They can't be omitted.

@McMurat
Author

McMurat commented Nov 27, 2023

All right, thank you!

@McMurat
Author

McMurat commented Nov 27, 2023

Okay, I found a reference to Thread. This indeed caused the issue. Maybe this should be documented. Can be closed. Thank you again!

@konsoletyper
Owner

Maybe it should be, but I never have time, sorry. Also, I think it's possible to make TeaVM help you find the code pieces that cause async infection, sometime in the future.

@McMurat
Author

McMurat commented Nov 27, 2023

One last question: I do not use any getClass() calls except in equals() implementations, where I often write:

        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        // ...

Will this already cause reflection data to be generated?

@konsoletyper
Owner

Not sure about this. I think you need to call getName to cause full metadata generation, but I don't remember the actual implementation details.
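
Whether the getClass() comparison affects TeaVM's output at all is not established in this thread. If it did turn out to matter, one way to write equals() without any getClass() call is an instanceof check on a final class, sketched below with a hypothetical PointSketch class. For non-final classes the two forms are not equivalent, because instanceof also accepts subclasses.

```java
// Hypothetical example; the class is illustrative and not taken from the issue.
final class PointSketch {
    private final int x;
    private final int y;

    PointSketch(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // instanceof is false for null, so no separate null check is needed.
        if (!(o instanceof PointSketch)) return false;
        PointSketch other = (PointSketch) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}
```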

@konsoletyper
Owner

You can try to attach a debugger to the TeaVM compiler process, set a breakpoint at the ClassMetadataRequirements constructor and watch/evaluate the following: dependencyInfo.getCallGraph().getNode(GET_NAME_METHOD). Then, by traversing the call graph backwards, you can find out what causes the inclusion of metadata.
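
The idea of walking a call graph backwards from a target method can be sketched generically in Java. This does not use TeaVM's call graph API; the map of direct callers, the class name, and the method names are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Generic sketch, not TeaVM code: the graph is a plain map from a method
// name to the names of the methods that call it directly.
final class CallerTraversalSketch {
    // Returns every method that can reach the target, found breadth-first.
    static Set<String> transitiveCallers(Map<String, List<String>> callersOf, String target) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(target);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String caller : callersOf.getOrDefault(current, List.of())) {
                // add() returns false for already visited methods, which
                // keeps the loop finite on recursive call graphs.
                if (seen.add(caller)) {
                    queue.add(caller);
                }
            }
        }
        return seen;
    }
}
```

Applied to a real call graph, the result set would list the application methods from which the name lookup is reachable, which is where to look for the code that forces metadata to be kept.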

@konsoletyper
Owner

But I want to warn you that, according to my measurements, class names never contribute a significant amount to the size of the resulting JS.
