Tests get run multiple times when using maxParallelForks #3973
Comments
@Kantis could you provide some pointers on how to fix this issue? I'm happy to contribute if it's not too outside of my abilities. |
I wonder if using Gradle's Would using Kotest mechanisms exclusively work for you? If not, what's your use case and how would you expect the different parallelism configurations in Gradle and Kotest to work together? |
We tried using Kotest parallelism but we depend on some third party libraries and also native C/Rust libraries that seem to use global state that we can't get around. It's particularly an issue in our integration tests where we instantiate our entire application. Also, we do build for the iOS target and it doesn't look like
I would expect So far Kotest behaves as expected with |
For clarity, I should add: the bug isn't that all the tests in the project are run once for every fork. The bug is that all tests in the project are run once for every |
Let's examine your observations and assumptions point by point:
That class file is generated by the Kotlin compiler.
I don't know in which cases the Kotlin compiler generates such a class, but yes, this could interfere with Gradle's way of invoking test engines (see below). The Gradle docs on
In our case, this means that Gradle can start multiple Kotest engines, each in its own process. Depending on the timing, it could be one or more. Let's assume it is 2. Gradle would then probably (the documentation does not tell) supply some test classes (Kotest specs) to engine 1, and others to engine 2.
This seems to be in line with what Gradle does.
In general, yes. To clarify on "available threads": By default, Kotest runs each test on a single thread. On the JVM, it can create more, if so configured. Finally, your test code can create more threads. So what you're seeing is determined by factors not under Kotest's control. If, for some reason, additional classes are generated and Gradle picks them up, Gradle will produce additional invocations.
Yes, if you're invoking code which is not thread-safe, single-threaded tests are mandatory. For integration and end-to-end tests, it is usually desirable to disable testing framework parallelism (
It is not relevant there: Kotest provides parallel test scheduling on the JVM only. To summarize: What you're observing seems to be an unfortunate interaction between Gradle and the Kotlin compiler. I don't see how Kotest could prevent this from happening. So one option would be to run parallel tests using the Kotest mechanisms for thread-safe parts of your code base, and use If you still need to run several parallel test processes, maybe you could configure Gradle to exclude certain class patterns like so (untested)?
tasks.withType<Test>().configureEach {
    useJUnitPlatform()
    filter {
        // Escape the dollar sign so Kotlin does not treat $WhenMappings as a string template.
        excludeTestsMatching("*\$WhenMappings")
    }
} |
I think there's still something missing here. If the hypothesis is that because Gradle is passing
What is actually happening is that all test classes in the project get run an extra time test log
Something doesn't add up still. Here's my understanding based on my read of the Gradle code, but I could be way off:
I think there's something Kotest-specific here happening and I'd love to understand more how Kotest receives test classes from Gradle / JUnit Platform and then figure out which tests are run. I suspect something is getting lost going from JUnit => Kotest. This all seems to work when When
Unfortunately, we are exactly trying to parallelize those integration tests because they are slow. We have no need to parallelize unit tests. Using Other leads
|
Hmm mightyguava/gradle@a29b234 does indeed seem to fix the problem, but I still don't understand why the problem manifests in this way. |
Shortly after my post above, I did some additional research and found this note in the Gradle docs:
So seemingly they are saying "we are doing something different for JUnitPlatform, but we won't tell what exactly". Digging into the JUnit Platform Launcher API reveals a huge API surface with lots of indirection (don't call us, we'll call you). That makes it hard to determine causes and effects. But we could be lucky in that Kotest has some say in filtering test classes and I could try to find out more as soon as time permits (probably not this month).
That seems a bit of a stretch to expect Kotest to filter out test classes explicitly passed to it by the surrounding framework. Kotest also should not depend on Kotlin compiler internals such as names of synthetic classes. While Kotest could be made to filter out anything that contains a
I'm not so sure. If it is indeed Kotest selecting the test classes, it should have some way of determining what constitutes a legitimate test class. If not, the surrounding framework should bear responsibility for passing correct test classes only. We need to find out what's the case here before acting on this. If you'd like to speed things up, you could explore Kotest's JUnitPlatform test runner which resides under Also, how did you verify how Gradle actually distributes your tests across processes? I have used this code (but in my tests, Kotlin did not create the extra
import io.kotest.assertions.fail
import io.kotest.core.spec.style.FunSpec

// Hello is assumed to be an enum with a constant A, defined elsewhere in the test sources.
class BarTests : FunSpec({
println("BarTests: pid=${ProcessHandle.current().pid()}, thread=${Thread.currentThread().id}")
test("bad when") {
println("BarTests.bad when: pid=${ProcessHandle.current().pid()}, thread=${Thread.currentThread().id}")
val num = Hello.A
when (num) {
Hello.A -> {}
else -> fail("oops")
}
}
test("bad when 2") {
println("BarTests.bad when2: pid=${ProcessHandle.current().pid()}, thread=${Thread.currentThread().id}")
val num = Hello.A
when (num) {
Hello.A -> {}
else -> fail("oops")
}
}
}) |
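To check the println output above for repeated executions, a small helper along these lines could group the log lines by label and list the PIDs each one ran on. This is only a sketch: the function names and the sample PIDs are made up for illustration, and it assumes the exact line format printed by the spec above.

```kotlin
// Matches lines such as "BarTests.bad when: pid=4242, thread=1".
private val logLine = Regex("""^(.+): pid=(\d+), thread=(\d+)$""")

/** Groups the PIDs seen for each label, in the order the lines appear. */
fun pidsPerLabel(lines: List<String>): Map<String, List<Long>> =
    lines.mapNotNull { logLine.matchEntire(it.trim()) }
        .groupBy({ it.groupValues[1] }, { it.groupValues[2].toLong() })

/** Labels that were printed more than once, i.e. executed repeatedly. */
fun duplicated(lines: List<String>): Map<String, List<Long>> =
    pidsPerLabel(lines).filterValues { it.size > 1 }

fun main() {
    // Invented sample data, not taken from a real run.
    val sample = listOf(
        "BarTests.bad when: pid=101, thread=1",
        "BarTests.bad when: pid=102, thread=1",
        "BarTests.bad when2: pid=101, thread=1",
    )
    duplicated(sample).forEach { (label, pids) ->
        println("$label ran ${pids.size} times on PIDs $pids")
    }
}
```

Lines that do not match the pattern (Gradle's own output, for instance) are simply ignored, so the raw test log can be fed in as-is.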
Thank you for digging into the issue!
I agree. I've also done some digging and the indirection makes things quite difficult to trace. The behavior seems strange for either hypothesis so this is extra strange to me. If Kotest is receiving explicit test classes from Gradle to run, and it gets one named If Kotest is not receiving explicit test classes from Gradle to run, then how does it know which classes to run in which process? If Kotest is determining which tests to run though, then it should be filtering out the WhenMappings classes.
Pushed mightyguava/gradle@a29b234 to my exemplar. Here's what it prints with logs
Adding a second test case like you did makes each test run 3 times on 3 different PIDs. logs
If I remove the logs
Are you using my example repo? I always get a |
So far, I've tried to investigate the case with a little test project, which I use to create reproducers. I always try to keep such stuff minimal. I could use your example project with a local Kotest build and investigate further, but doing so requires more time than I can make available this month. I think your use case is legitimate and should be supported. If we can figure out the root cause, we can either fix it in Kotest or create an issue in whichever project is responsible. In the latter case, we might be able to create a workaround in Kotest. In the meantime, I understand that you can work around the problem locally, which is a good thing. If you have more information, feel free to post here or even create a PR. |
One additional factor to explore: How do Gradle or JUnitPlatform distribute test classes to parallel processes? Assume we try to parallelize 100 test classes across 4 worker processes, with 97 light-load classes requiring 20% of total CPU time and 3 heavy-load classes requiring the remaining 80% (not unusual for integration tests). Using a static (pre-determined) distribution, we could end up with the 3 heavy-load classes being distributed to the same test worker (process). In that case, we'd parallelize 20% of our total load, leaving 80% on a single process. For a generally useful parallelization, we would need some scheduler managing a (test class) job queue and schedule each test class to a worker engine (actor) once that becomes idle. In other words: A static distribution of test cases across processes, which is determined in advance, would serve only a fraction of use cases compared to a dynamic scheduler/actor-based distribution. Inside Kotest, with coroutines, we of course have that kind of scheduling for thread-based parallelization. It remains to be seen for what use cases the upstream frameworks offer additional value with process-based parallelization. |
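The 100-class scenario above can be put into numbers with a toy simulation. This is a sketch under stated assumptions: costs are in percent of total CPU time, the "static" case is the worst case described above (all three heavy classes on one worker), and the "dynamic" case hands each queued class to whichever worker becomes idle first. Neither function reflects how Gradle or JUnit Platform actually distribute classes.

```kotlin
/** Finish time of the busiest worker for a fixed class-to-worker assignment. */
fun staticMakespan(costs: List<Double>, workers: Int, workerOf: (Int) -> Int): Double {
    val load = DoubleArray(workers)
    costs.forEachIndexed { i, c -> load[workerOf(i)] += c }
    return load.maxOrNull() ?: 0.0
}

/** Finish time when each class, in queue order, goes to the worker that is idle first. */
fun dynamicMakespan(costs: List<Double>, workers: Int): Double {
    val load = DoubleArray(workers)
    for (c in costs) {
        val idle = load.indices.minByOrNull { load[it] } ?: 0
        load[idle] += c
    }
    return load.maxOrNull() ?: 0.0
}

fun main() {
    // 3 heavy classes share 80% of the load, 97 light classes share 20%.
    val costs = List(3) { 80.0 / 3 } + List(97) { 20.0 / 97 }
    // Worst-case static split: every heavy class lands on worker 0.
    val static = staticMakespan(costs, 4) { if (it < 3) 0 else it % 4 }
    val dynamic = dynamicMakespan(costs, 4)
    println("static worst case: $static, dynamic queue: $dynamic")
}
```

With these inputs the static worst case cannot finish before 80% of the total work is done on one worker, while the queue-based variant ends at about 27%, close to the ideal of 25% for four workers.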
@mightyguava I've tried your repro, at 2fa1069070d7a0cb9282c02b3f075816e0a41f04. This is what I got:
So the problem did not reproduce in my environment (OpenJDK Runtime Environment (build 17.0.10+7-Ubuntu-120.04.1)). |
Hmm that's strange, I would not have expected this to be an architecture-specific problem. My environment is an Apple Silicon MacBook running
I tried this in docker with to get as close as I can to your env on my mac, on the same commit... Ubuntu Jammy with Eclipse Build of OpenJDK
and still reproduce the issue
You wouldn't happen to have a dual-core CPU would you? 😅 Wondering if the CPU count is the difference. |
The above tests ran on a machine with an Intel Xeon 4-Core E3-1225 v5. The interesting difference is that I still can find no BTW: What I found is that JUnitPlatform seems to call |
Some updates:
|
Here are the "kotest-PID.log" files. I'm not totally sure how to interpret these logs but I do see the same tests being assigned to multiple PIDs. |
What happened is this:
So the Gradle test task does the initial class discovery. It just picks up classes from the classpath. There is some hardwired code for some older test frameworks (and even pattern matching), but in the JUnitPlatform case, Gradle does not know anything about what constitutes a test class.
Now in our parallel execution case, Gradle intends to invoke the JUnitPlatform launcher API repeatedly, each with a LauncherDiscoveryRequest specifying a list of ClassSelectors (basically, a list of classes). And here's the bug: The Gradle test task invokes the launcher API with an empty list of class selectors. The JUnitPlatform launcher API does not specify what to do in this case. Kotest's behavior is to do a full test discovery by scanning the classpath. Your observation that a
Furthermore, the entire API seems completely unsuitable for parallelization. With a new commit for better logging, I have observed this:
[Discovery] Collected specs via 29 class discovery selectors in 720ms, found 3 specs
So if the classpath contains 29 classes, of which only 3 (just 10%) are actual test specs, the chances of having the actual tests distributed correctly across multiple runners are slim. So if we changed Kotest to assume an intentional no-op invocation with JUnitPlatform, this would not solve the problem.
For proper parallelization, a better approach would be to ignore the Gradle test task, use a sharding algorithm tailored to your needs, and directly invoke
Lines 229 to 246 in 6b718f1
|
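The effect described above can be illustrated with a toy model. This is not Kotest's discovery code: the function, the spec list, and the way classes are split across the two forks are all made up for illustration. It only shows why an empty selector list combined with a classpath-scan fallback repeats every spec, and why disabling the fallback does not.

```kotlin
/** Hypothetical stand-in for the specs an engine would find by scanning the classpath. */
val specsOnClasspath = listOf("FooTests", "BarTests", "WhenTest")

/**
 * Toy discovery: honor explicit class selectors; with no selectors,
 * fall back to a classpath scan unless scanning is disabled.
 */
fun discover(selectors: List<String>, scanningEnabled: Boolean): List<String> = when {
    selectors.isNotEmpty() -> selectors.filter { it in specsOnClasspath }
    scanningEnabled -> specsOnClasspath
    else -> emptyList()
}

fun main() {
    // Assumed split: one fork gets every class, the other gets an empty selector list.
    val forks = listOf(
        listOf("FooTests", "BarTests", "WhenTest", "WhenTest\$WhenMappings"),
        emptyList(),
    )
    for (scanning in listOf(true, false)) {
        val runs = forks.flatMap { discover(it, scanning) }
        println("scanning=$scanning -> ${runs.size} spec executions: $runs")
    }
}
```

In this model the second fork silently re-runs all three specs when the scan fallback is active and runs nothing when it is off.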
Related: gradle/gradle#2669 |
Thanks for the thorough investigation, the bug makes much more sense now!
It sounds like if we change Kotest to assume an intentional no-op invocation with JUnitPlatform, we
If it doesn't break Kotest, making this change would still be valuable because
What do you think? |
I agree mostly with your conclusions, though loss of performance seems possible as well, depending on the scenario. Factors to consider are:
So we'd have to make this configurable and not the default behavior. The launcher API is not my area of expertise and I may not be aware of all the implications, so we need to get some consensus from the team before merging such changes. I could try to push some commit to my branch, like introducing an engine property Does that make sense? |
Yes, those considerations make sense to me. I agree that adding a property instead of enabling it as a default would be a safer option. As well as getting consensus from the team.
What is the behavior with |
It always passes all potential classes on the classpath (it does not know what a test is) via class path selectors, one per class. And if there are none, nothing bad will happen as then the Kotest classpath scan will yield the same result (0 classes). |
Could you try a local build of https://github.com/kotest/kotest/tree/oo2/spec-discovery, along with the following?
tasks.named<Test>("jvmTest") {
    // ...
    systemProperty("KOTEST_DEBUG", "TRUE")
    systemProperty("kotest.framework.discovery.classpath.scanning.enabled", "false")
} |
Yes, that seems to have solved the problem of tests running twice! |
Also works for me. Could reproduce the doubling of test executions via
tasks.named<Test>("jvmTest") {
    // ...
    forkEvery = 1
}
With that, one process will always get an empty class list. |
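Putting the pieces from this thread together, a build script for the original use case (parallel forks without repeated test runs) might look roughly like the following. Treat it as an untested sketch: the fork count of 2 is arbitrary, and the system property is the one tried on the spec-discovery branch above, so it may not be available in a released Kotest version.

```kotlin
// build.gradle.kts (sketch)
tasks.withType<Test>().configureEach {
    useJUnitPlatform()
    // Let Gradle run up to two test JVMs in parallel (arbitrary value).
    maxParallelForks = 2
    // Property name taken from the branch discussed in this thread; it stops Kotest
    // from scanning the classpath when a fork receives no class selectors.
    systemProperty("kotest.framework.discovery.classpath.scanning.enabled", "false")
}
```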
Kotest version: 5.8.1
This is one of the weirder bugs I've seen.
Repro here, containing a single commit on top of the kotest-examples-multiplatform repo main.
If you run ./gradlew cleanJvmTest jvmTest, you'll see that every test within the module is run twice. This is also visible from build/reports/tests/jvmTest/index.html. If maxParallelForks is set to 1, this issue does not occur.
If you copy the test case in WhenTest, each new copy of the test case will cause all tests to be run an additional time.
Based on trial and error, I narrowed this down to when clauses being added to the test that operate on an enum. Looking at the compiled classes, I suspect it has something to do with this $WhenMappings class that is generated, e.g.
Each test case that uses a when on an enum seems to generate an extra inner class... and my hypothesis is that Kotest's runner (or JUnit framework? or Gradle?) isn't deduping this with the outer class properly.