Consider lowering runTest default timeout #3270

Closed
JakeWharton opened this issue Apr 30, 2022 · 16 comments

@JakeWharton
Contributor

60 seconds is a really long time for nothing to happen. It's like an eternity for a CPU, especially one which is skipping delays. I would personally prefer something low like 2s (as is the default for JS's Mocha runner), but something like 10s seems a reasonable default that covers 99.9% of real-world usage without being a whole minute.

@dkhalanskyjb self-assigned this May 11, 2022
@dkhalanskyjb
Collaborator

The reason for adding a timeout at all was to make sure that the CI doesn't hang when it's obvious that the test won't be passing. If someone runs a test manually and observes it hanging, they can just interrupt it.

Note also that this is not a timeout for the whole test, it's only a timeout for nothing happening in dispatchers that can be observed by the test framework. We won't throw, for example, on something like this:

runTest {
    while (true) {
        // Each delay is visible to the test scheduler, so this counts as
        // observable activity and the timeout never fires, even though
        // the test never completes.
        delay(1000)
    }
}

So, this timeout is not an indication of the test taking too long; it's an indication of no observable activity, possibly indicating a deadlock. You're right that a whole minute is like an eternity, and that was the intention: if nothing observable happens a whole eternity later, the CI should fail.

10 seconds is also like an eternity, but only if we're CPU-bound. However, the things forked off to Dispatchers.IO (and thus hidden from the view of the framework) could be operations that rely on the real world, like connecting to a database and performing some operations there. Not the type of tests we endorse, but the type of tests that actually exist.
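A minimal sketch of that scenario, with Thread.sleep standing in for real-world I/O (the function name and the five-second figure are arbitrary): work running on Dispatchers.IO produces no activity the test scheduler can observe, so the inactivity timeout keeps ticking.

import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.withContext

fun databaseTest() = runTest {
    val rows = withContext(Dispatchers.IO) {
        // Simulates slow real-world I/O. Unlike delay(), Thread.sleep is
        // invisible to the virtual-time scheduler, so from the test
        // framework's point of view nothing is happening here.
        Thread.sleep(5_000)
        listOf("row")
    }
    check(rows.isNotEmpty())
}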

@JakeWharton
Contributor Author

I'm sympathetic to the idea that there exists someone writing a unit test which does >10 seconds of wall-clock work, but isn't that why the parameter exists? That outlier can specify a longer timeout since they know they're doing a huge batch DB operation.
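For illustration, opting into a longer limit for one slow test might look like the sketch below (written against the 1.6.x test API, where the parameter is dispatchTimeoutMs; later releases take a timeout Duration instead; the test name is hypothetical):

import kotlinx.coroutines.test.runTest

fun hugeBatchDbTest() = runTest(dispatchTimeoutMs = 120_000) {
    // a batch DB operation that legitimately needs more wall-clock time
    // would go here; the outlier opts into the longer limit explicitly
}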

@dkhalanskyjb
Collaborator

The worry was that this is an occasional outlier, which will fit in 10 seconds easily 99% of the time, but then will flake due to some connection being surprisingly slow. We expect 60 seconds to be enough for everyone.

This cutoff is necessarily arbitrary, so this is not a hard stance, of course. A question though: who would benefit from tests failing in 10 seconds instead of 60?

  • CI: ± a minute shouldn't be a big deal there, right?
  • Manually running a single test: (can only base this on my personal experience, as, again, this is arbitrary) if I see a test taking more than, say, 4-5 seconds, I will already be prompted to interrupt it.
  • Manually running a test batch: here, I would probably prefer the hanging tests to take a minute and not ten seconds. This way, even with the dispersed attention, I will without a doubt notice which test hangs.

@dkhalanskyjb
Collaborator

Now that I think about it, I was wrong about the "manually running a test batch" scenario: the test will fail and crash, which will be easy to notice. Is this the scenario that you think suffers due to a large timeout?

@hrach
Contributor

hrach commented May 11, 2022

Manual testing is not only about running the test but also about compilation, which may take a few tens of seconds (especially on a Gradle [configuration|build] cache miss). So I usually switch context and wait for the IDE notification with the test result. This has happened to me multiple times.

@JakeWharton
Contributor Author

JakeWharton commented May 13, 2022

I generally prefer a lower timeout for our CI. When I inadvertently break something in a way that omits an emission, each test will hang until the timeout. Turbine used to have a 1-second timeout, but we removed it in the latest version in favor of the one in runTest. However, that means we've 60x'd the time it takes for CI to fail and report the test results should a hang be introduced. We have classes with anywhere from 5-30 functions testing the same subject. That means our potential wall-clock time for a CI timeout has gone from 5-30 seconds to 5-30 minutes. Locally I'll manually stop the test suite (if I'm running it manually) when a single test hangs for more than a few seconds, but on CI we often need to see the test report, which is only generated if the tests complete normally (i.e., are not canceled).

@dkhalanskyjb
Collaborator

These are all good points: if we treat the timeout in runTest as an indicator that the test must have hung, lowering the timeout sounds like a good idea: a test that takes 10 seconds to run has probably hung, and if there's an option to raise this limit for the atypical case, everyone wins.

However, our timeout is for a very specific case when there are no emissions for a long time. For example, it will not catch this typical bug:

runTest {
    var x = 0
    launch {
        delay(1000)
        x = 0 // the bug: this was presumably meant to be x = 1
    }
    // The loop below never exits, but each delay(100) counts as
    // observable activity, so the inactivity timeout never fires.
    while (x == 0) {
        delay(100)
    }
}

Since the test always does something, runTest will not fail it. A more general and reliable way to catch hanging tests is to use something like https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-debug/kotlinx.coroutines.debug.junit4/-coroutines-timeout/index.html, designed specifically for this purpose, or at least to wrap the test body in a withTimeout, which, while crude, is still more reliable.
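As a sketch of the linked rule (a JUnit4 rule from kotlinx-coroutines-debug; the class name and the 5-second limit are arbitrary choices here): because it measures wall-clock time rather than scheduler activity, it does catch the activity-producing hang above.

import kotlinx.coroutines.debug.junit4.CoroutinesTimeout
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runTest
import org.junit.Rule
import org.junit.Test

class HangingTestExample {
    // Fails any test in this class that exceeds 5 wall-clock seconds
    // and dumps the coroutines that were live at that moment.
    @get:Rule
    val timeout = CoroutinesTimeout.seconds(5)

    @Test
    fun activityProducingHang() = runTest {
        var x = 0
        launch {
            delay(1000)
            x = 0 // same bug as above: x never becomes non-zero
        }
        while (x == 0) {
            delay(100) // observable activity, so runTest's own timeout never fires
        }
    }
}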

What worries me is that, if we make the timeout low, people may get the impression that runTest provides some proper guarantees about timing out, whereas, in fact, its purpose is mostly that of a last-resort mechanism, which happens to cover a lot of typical cases of hanging tests.

@JakeWharton
Contributor Author

Yeah that makes sense.

Given how runTest is used, however, have you considered making it a proper wall-clock timeout? It won't cover 100% of tests, since some may be synchronous, but it will cover a lot of them, especially the ones that are prone to hanging.

@BraisGabin

I would be happy if I could configure it from a property or something similar. I'm moving from runBlockingTest to runTest and I'm getting really odd issues. I want a report from JUnit about which tests failed. If 100 tests hang, that means I need 100 minutes to get that report!!

And once I have that report I need to go to each one, lower their timeout, and start working on them so that, once they are fixed, I can remove the timeout parameter, because it just adds verbosity to my code.

I can understand that 60 seconds is a really safe default value. But I don't want that default value, because it slows my development flow a lot. And I don't want to add the timeout value to each runTest, because it adds a lot of verbosity. I could create my own runTest (see the sketch below) and forbid the usage of kotlinx.coroutines.test.runTest with something like detekt, but that adds complexity.
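A minimal sketch of that wrapper idea against the 1.6.x API, with the default driven by a system property (the property name and the 10-second fallback are hypothetical):

import kotlinx.coroutines.test.TestScope
import kotlinx.coroutines.test.runTest as coroutinesRunTest

// Hypothetical property name; falls back to 10 seconds if unset.
private val defaultTimeoutMs: Long =
    System.getProperty("my.project.test.timeout.ms")?.toLongOrNull() ?: 10_000L

// Project-wide replacement for kotlinx.coroutines.test.runTest, so
// individual tests don't need to repeat the timeout parameter.
fun runTest(
    dispatchTimeoutMs: Long = defaultTimeoutMs,
    testBody: suspend TestScope.() -> Unit
) = coroutinesRunTest(dispatchTimeoutMs = dispatchTimeoutMs, testBody = testBody)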

@dkhalanskyjb
Collaborator

@BraisGabin, maybe https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-debug/kotlinx.coroutines.debug.junit4/-coroutines-timeout/index.html would help? runBlockingTest, too, didn't prevent tests from hanging.

@amal

amal commented Jan 18, 2023

@dkhalanskyjb, @BraisGabin note that withTimeout right now will not work as expected inside the runTest (see #3588 for details).

@qwwdfsad
Member

Will be fixed with 3b22c27

@qwwdfsad
Member

The fix is available in 1.7.0-Beta

@mikekulasinski-kulsoft

mikekulasinski-kulsoft commented May 25, 2023

Hello, probably quite late to the conversation, but 10 seconds is an extremely low value. We have a mid-size project consisting of 20+ Gradle modules. When Gradle parallelism is turned on, the CPU is 100% busy compiling and running (CPU threads × number of tasks).


Our setup uses Koin and MockK. When we run tests on just a single module, bootstrapping (loading all the MockK classes) takes around 2 seconds for the first test. We have many modules, with roughly 0-20 tests in each of them.

When we run tests on multiple modules, the bootstrap time extends to even 10-25 seconds! To preempt the obvious question: the rest of the tests take under 500 ms. Note that the test report HTMLs are not in the order the tests ran, so the second test in the report is in reality the first test to run, meaning the 12 seconds includes the bootstrapping (setting up the testing env) under super-heavy CPU load. We observe the same problem on our local dev machines (Intel i9 12th gen, or Mac M2) as well as on the Azure CI env.


We recently updated to:
kotlin = "1.8.21"
kotlinx-coroutines = "1.7.1"

and had to create a custom runTest method that overrides the default 10-second timeout back to 60 seconds. Any chance we could revert the timeout to 60 seconds? I consider it a breaking change, since the timeout was 60 seconds for a very long time.

I am interested in your view.

Examples:

Simple test, with MockK and without: [screenshot]

Result with no CPU load: [screenshot]

Result with CPU load at 100% (using a stress test from s-tui): [screenshot]

@sampengilly

I've had the same issue pop up since this change occurred, but more flakily (i.e., it wouldn't happen on local dev machines, where the first test would only take maybe 5 seconds, but it would occasionally take longer than 10 seconds on CI).

There seems to be a discussion active about the slowness of ByteBuddy installation here: mockk/mockk#13

I'm electing to perform my mock setup outside of the call to runTest {} so that the first mock initialisation doesn't count towards the timeout (see the sketch below). It's a little disappointing not being able to use the expression body syntax safely when I'm using mocks, but it seems a bit better than changing the timeout for all calls to runTest.
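A minimal sketch of that arrangement, assuming MockK and JUnit4 (the UserRepository interface and the test names are hypothetical):

import io.mockk.every
import io.mockk.mockk
import kotlinx.coroutines.test.runTest
import org.junit.Test

class UserServiceTest {
    // Hypothetical collaborator, for illustration only.
    interface UserRepository { fun name(id: Long): String }

    // Created outside runTest: the one-off MockK/ByteBuddy bootstrap
    // happens here and does not count against runTest's timeout.
    private val repository = mockk<UserRepository> {
        every { name(1L) } returns "Jake"
    }

    @Test
    fun returnsUserName() = runTest {
        check(repository.name(1L) == "Jake")
    }
}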

@PaulWoitaschek
Contributor

I created a follow up ticket to solve the issues this brings here: #3800
