Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile classpath configuration is not deterministic #13555

Closed
cesar1000 opened this issue Jun 23, 2020 · 11 comments · Fixed by #13580
Closed

Compile classpath configuration is not deterministic #13555

cesar1000 opened this issue Jun 23, 2020 · 11 comments · Fixed by #13580

Comments

@cesar1000
Copy link

Our main build is hitting an issue where a compile classpath ends up with a different ordering when running the build repeatedly without intervening changes. Here's what I do:

gr :lib:core:database:compiler:test --no-build-cache --rerun-tasks
gr :lib:core:database:compiler:test --no-build-cache

The second run frequently (but not always) recompiles :lib:core:database:compiler even if nothing changed between both builds. The comparison of the build scans reveals that io.reactivex.rxjava2:rxkotlin is added to the classpath in different locations. Gradle Enterprise shows that the resolutions are slightly different. In the first run:

image

And in the second run:

image

This second resolution looks wrong, since org.jetbrains.kotlin:kotlin-stdlib:1.3.40 is a runtime dependency of io.reactivex.rxjava2:rxkotlin. More interestingly, the resolution of :lib:core:util:java:common in the same build doesn't list kotlin-stdlib as a dependency of rxkotlin:

image

The issue shows up in different modules each time, but it affects most of our builds. I repro'ed in the current latest nightly, 6.6-20200622220024+0000.

I collected some extra information and put up the files in a shared drive:

  • The build scans for both builds (from which I took the previous screenshots).
  • The command line arguments to the compilation task in both cases, which show the different classpaths.
  • The full debug outputs from both builds.
@ljacomet
Copy link
Member

Hey @cesar1000, any chance that build has problematic inputs like in #13105 or a more generic situation as hinted at in #13106?

@cesar1000
Copy link
Author

I don't think the issue is in the build configuration – comparing configurations in Gradle Enterprise, the problem seems to be that a runtime only dependency is occasionally getting added to the compilation classpath. The resulting resolved classpath gets reordered whenever this dependency is selected. The fact that a single project resolves differently when resolved independently and when resolved as a dependency of other project also looks suspect.

@ljacomet
Copy link
Member

Looking into this a bit deeper, I believe it shows there is a bug in the fix done for #9415.
Any chance you could try with Gradle 6.4.1 to check if you can reproduce or not in with that version?

Could you try the following and share the output?

Set<String> errors = []
allprojects {
    configurations.all { cnf ->
        cnf.incoming.afterResolve {
            long now = System.currentTimeMillis()
            resolutionResult.allComponents {
                if (variants.any { it.displayName == 'default' }) {
                    def javaApplied = plugins.hasPlugin('java-base')
                    errors << "$now - On project ${project.name} configuration ${cnf.name} resolved component with default variant when Java plugin is ${javaApplied?'applied':'not applied!'}"
                }
            }
        }
    }
}
gradle.buildFinished {
    errors.each { println it }
}

If you get some output, could you attempt adding the java-base plugin to these projects, making sure it is applied before configurations are resolved.

@melix FYI this looks like another instance of a problem with the metadata cache.

@melix
Copy link
Contributor

melix commented Jun 24, 2020

Yes it seems to be the same issue, I can reproduce with large enough dependency graphs and --parallel.

@ljacomet
Copy link
Member

The good news is that the issue is identified and a fix under development. Will keep you updated about testing a version with the fix.

@cesar1000
Copy link
Author

Ah, awesome, thanks 😁 . I added the snippet above to my root build file and ran ./gradlew tasks but got no output. Is that expected? I can try to repro in 6.4.1 later if still useful.

@melix
Copy link
Contributor

melix commented Jun 24, 2020

Yes it's expected since this will only output things for configurations which are actually resolved during the build. Do you have any project which does not apply the java plugin (or java-library)?

@cesar1000
Copy link
Author

Ah of course. Let me run in a build. All our projects are either Java or Android projects so, as far as a I know, they all apply the plugin. Let me test for that anyway.

@cesar1000
Copy link
Author

cesar1000 commented Jun 24, 2020

Oh there you go, a couple of them :)

1593010763942 - On project common-plugin configuration default resolved component with default variant when Java plugin is applied
1593010763944 - On project uiarch-plugin configuration default resolved component with default variant when Java plugin is applied
1593010802858 - On project twitter-android configuration checkstyleRules resolved component with default variant when Java plugin is not applied!
1593010803624 - On project rules configuration _classStructurekaptKotlin resolved component with default variant when Java plugin is applied
1593010807339 - On project twitter-android configuration ktlintRules resolved component with default variant when Java plugin is not applied!

melix added a commit that referenced this issue Jun 24, 2020
The fix for #12951 introduced by #12957 worked but made a number
of other bugs much more likely to happen: by using the right derivation
strategy in each project, we were properly fetching the component
metadata we wanted to process from the cache.

However, this component metadata was mutable, because of laziness.
In particular, derivation of variants require to set the derivation
strategy on "deemed immutable" metadata. The problem is that when
several configurations are resolved concurrently, we ended up mutating
the same component from different threads, with different derivation
strategies.

The ideal fix would be to have real immutable component metadata from
the cache. However, because of laziness (required for performance), we
can't do that.

The fix in this PR is therefore to create a copy of the module metadata
whenever a different derivation strategy needs to be used.

It also highlighted another bug, which is that when we use cached
component metadata rules, the derivation strategy wasn't part of the
cache key, meaning that we would get whatever was first stored in the
binary cache.

We now properly separate the entries in the cache by amending the cache
key with the name of the strategy. This isn't ideal as we could potentially
have _stateful_ strategies, but for now there are no such things in the
wild. Again in reality we'd like to get rid of the "magic snapshotting"
of rules and do something similar to what transforms do, with proper
declared input types.

Fixes #13555
@cesar1000
Copy link
Author

I ran the same build multiple times in 6.4.1 and didn't hit the bug :). I'll try in a nightly as soon as it's available.

@cesar1000
Copy link
Author

Oh, also the project listed above that doesn't apply a java plugin is the root.

melix added a commit that referenced this issue Jun 25, 2020
The fix for #12951 introduced by #12957 worked but made a number
of other bugs much more likely to happen: by using the right derivation
strategy in each project, we were properly fetching the component
metadata we wanted to process from the cache.

However, this component metadata was mutable, because of laziness.
In particular, derivation of variants require to set the derivation
strategy on "deemed immutable" metadata. The problem is that when
several configurations are resolved concurrently, we ended up mutating
the same component from different threads, with different derivation
strategies.

The ideal fix would be to have real immutable component metadata from
the cache. However, because of laziness (required for performance), we
can't do that.

The fix in this PR is therefore to create a copy of the module metadata
whenever a different derivation strategy needs to be used.

It also highlighted another bug, which is that when we use cached
component metadata rules, the derivation strategy wasn't part of the
cache key, meaning that we would get whatever was first stored in the
binary cache.

We now properly separate the entries in the cache by amending the cache
key with the name of the strategy. This isn't ideal as we could potentially
have _stateful_ strategies, but for now there are no such things in the
wild. Again in reality we'd like to get rid of the "magic snapshotting"
of rules and do something similar to what transforms do, with proper
declared input types.

Fixes #13555
melix added a commit that referenced this issue Jun 25, 2020
The fix for #12951 introduced by #12957 worked but made a number
of other bugs much more likely to happen: by using the right derivation
strategy in each project, we were properly fetching the component
metadata we wanted to process from the cache.

However, this component metadata was mutable, because of laziness.
In particular, derivation of variants require to set the derivation
strategy on "deemed immutable" metadata. The problem is that when
several configurations are resolved concurrently, we ended up mutating
the same component from different threads, with different derivation
strategies.

The ideal fix would be to have real immutable component metadata from
the cache. However, because of laziness (required for performance), we
can't do that.

The fix in this PR is therefore to create a copy of the module metadata
whenever a different derivation strategy needs to be used.

It also highlighted another bug, which is that when we use cached
component metadata rules, the derivation strategy wasn't part of the
cache key, meaning that we would get whatever was first stored in the
binary cache.

We now properly separate the entries in the cache by amending the cache
key with the name of the strategy. This isn't ideal as we could potentially
have _stateful_ strategies, but for now there are no such things in the
wild. Again in reality we'd like to get rid of the "magic snapshotting"
of rules and do something similar to what transforms do, with proper
declared input types.

Fixes #13555
ljacomet pushed a commit that referenced this issue Jun 29, 2020
The fix for #12951 introduced by #12957 worked but made a number
of other bugs much more likely to happen: by using the right derivation
strategy in each project, we were properly fetching the component
metadata we wanted to process from the cache.

However, this component metadata was mutable, because of laziness.
In particular, derivation of variants require to set the derivation
strategy on "deemed immutable" metadata. The problem is that when
several configurations are resolved concurrently, we ended up mutating
the same component from different threads, with different derivation
strategies.

The ideal fix would be to have real immutable component metadata from
the cache. However, because of laziness (required for performance), we
can't do that.

The fix in this PR is therefore to create a copy of the module metadata
whenever a different derivation strategy needs to be used.

It also highlighted another bug, which is that when we use cached
component metadata rules, the derivation strategy wasn't part of the
cache key, meaning that we would get whatever was first stored in the
binary cache.

We now properly separate the entries in the cache by amending the cache
key with the name of the strategy. This isn't ideal as we could potentially
have _stateful_ strategies, but for now there are no such things in the
wild. Again in reality we'd like to get rid of the "magic snapshotting"
of rules and do something similar to what transforms do, with proper
declared input types.

Fixes #13555
@ljacomet ljacomet added this to the 6.5.1 milestone Jun 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants