Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run detekt's Gradle tasks with Gradle's Worker API #4128

Merged
merged 6 commits into from Feb 26, 2023

Conversation

3flex
Copy link
Member

@3flex 3flex commented Sep 25, 2021

Run Detekt task, DetektGenerateConfigTask and DetektCreateBaselineTask with Gradle's Worker API. This uses process isolation mode which will allow using the toolchains feature in future (#4120). Performance in general seems acceptable though I haven't done proper benchmarks.

Two issues:

  1. When used on this project with an included build I see very poor performance on the first run. I don't see this when including the build in another project. Looking forward to others trying this on their projects to see the performance impact.
  2. On subsequent runs when used on this project with an included build I often see a CNFE which I think is due to a Gradle bug (I don't believe Gradle is restarting the Worker daemons when classpath file contents change). It's intermittent and can be resolved either by running the task a few times and having it complete successfully, or restarting the Gradle daemon.

Neither of these issues will impact users, but contributors will face this. I see issue 2 as being more annoying. It's similar to #2957 but can happen when changes are made in any subproject, not just core modules.

Edit: I've raised issue 2 at gradle/gradle#13678 (comment). I'll open a new issue in a few days if there's no reply.

@@ -70,19 +113,18 @@ internal class DefaultCliInvoker(
msg != null && "Build failed with" in msg && "issues" in msg
}

private class DryRunInvoker(private val logger: Logger) : DetektInvoker {
private class DryRunInvoker : DetektInvoker {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logger can't be passed to the Worker as everything in DetektWorkParameters must be serializable. Using println instead means the tests still pass as anything passed to stdout is logged at QUIET level by Gradle. There's no impact on users as the dry run invoker should only be used internally.

@codecov
Copy link

codecov bot commented Sep 25, 2021

Codecov Report

Merging #4128 (daad0f0) into main (2290ec8) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #4128      +/-   ##
============================================
- Coverage     84.61%   84.60%   -0.01%     
- Complexity     3787     3790       +3     
============================================
  Files           546      546              
  Lines         12918    12918              
  Branches       2268     2268              
============================================
- Hits          10930    10929       -1     
+ Misses          862      861       -1     
- Partials       1126     1128       +2     
Impacted Files Coverage Δ
.../detekt/rules/naming/ConstructorParameterNaming.kt 88.88% <0.00%> (-5.56%) ⬇️
...turbosch/detekt/rules/style/ForbiddenAnnotation.kt 90.24% <0.00%> (+2.43%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@3flex
Copy link
Member Author

3flex commented Sep 25, 2021

Java 8 on Windows kept failing as the build runner didn't have enough memory.

This is somewhat expected - Worker API starts more daemons than before, but moving the detekt classloader out of the Gradle daemon reduces the memory requirements for the Gradle daemon itself.

I've therefore removed the JVM args config that increased Gradle and TestKit memory to compensate which should keep the build within the runner's memory constraints (7GB total).

@3flex 3flex changed the title Run Detekt task with Gradle's Worker API Run detekt's Gradle tasks with Gradle's Worker API Sep 26, 2021
@chao2zhang
Copy link
Member

With the current implementation in a multi-module project, we will have worker processes up to org.gradle.workers.max.

// Before
➜  kotlin-android-template git:(main) ✗ jps
86912 
89369 Jps
86365 
// After
➜  kotlin-android-template git:(main) ✗ jps
86912 
89472 KotlinCompileDaemon
89428 KotlinCompileDaemon
89415 GradleDaemon
89497 GradleWorkerMain
89499 GradleWorkerMain
89498 GradleWorkerMain
89533 Jps
86365 

It appears to me that these additional number of gradle workers processes will impact developers using detekt.

@3flex
Copy link
Member Author

3flex commented Sep 28, 2021

Is the concern about memory usage?

From https://docs.gradle.org/current/userguide/worker_api.html#creating_a_worker_daemon:

These worker daemon processes will persist across builds and can be reused during subsequent builds. If system resources get low, however, Gradle will stop any unused worker daemons.

So we rely on Gradle managing its resources appropriately.

@3flex 3flex added gradle-plugin notable changes Marker for notable changes in the changelog labels Oct 4, 2021
@BraisGabin
Copy link
Member

I runned this scenarios using gradle-profiler gradle-profiler --benchmark --scenario-file ../detekt.scenarios:

detekt-parallel {
  tasks = ["detektMain", "detektTest"]
  gradle-args = ["--no-build-cache", "--parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

detekt-no-parallel {
  tasks = ["detektMain", "detektTest"]
  gradle-args = ["--no-build-cache", "--no-parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

plain-detekt-parallel {
  tasks = ["detekt"]
  gradle-args = ["--no-build-cache", "--parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

plain-detekt-no-parallel {
  tasks = ["detekt"]
  gradle-args = ["--no-build-cache", "--no-parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

I needed to add this task to invalidate the output of all the detekt tasks and avoid the UP-TO-DATE.

task<Exec>("cleanDetekt") {
    commandLine("bash", "-c", "rm -r **/build/reports/detekt || true")
}

The raw data is here:

  • main.csv
  • workers.csv (This is not exactly this branch but this branch rebased on main to be sure that I was comparing only this changes)

A summary:

detekt-no-parallel detekt-parallel plain-detekt-no-parallel plain-detekt-parallel
main 69161.3 37724.3 24227.2 12441.4
worker 74114.2 43063.4 24426 13669.8
comparation greater (7%) greater (14%) same greater (10%)

No idea why plain-detekt-no-parallel maintains the same but it seems that this change makes detekt slower.

This doesn't mean "no-merge". I just wanted to know the impact of this change and show it.

@3flex
Copy link
Member Author

3flex commented Oct 14, 2021

Thanks for doing some benchmarking!

it seems that this change makes detekt slower

As expected. The question is, is the impact acceptable, given we can now use toolchains, as well as reduce memory pressure in the Gradle daemon since work is now done in separate processes? It's a tradeoff. In my opinion yes, it's still worth it, but let's wait for further feedback before merging.

Copy link
Member

@cortinico cortinico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in general in favor of merging this. Thansk @BraisGabin for running some benchmarks 🙏

Given the small size of this diff, could it be possible to ship this as an opt-in (or an opt-out)? I believe our community could help us test this and identify potential early on problems.

Also @davidburstromspotify has a benchmarking suite that stress tests detekt. Would be interesting to try Worker API on that suite.

@3flex
Copy link
Member Author

3flex commented Oct 20, 2021

Given the small size of this diff, could it be possible to ship this as an opt-in (or an opt-out)?

That's reasonable. What do you think about a Gradle property called detekt.use.worker.api? This follows the convention for the same setting for kapt (kapt.use.worker.api).

It would be an opt in first, then can upgrade to opt out later, then potentially have it as the only execution mode in future.

@cortinico
Copy link
Member

It would be an opt in first, then can upgrade to opt out later, then potentially have it as the only execution mode in future.

Agree on this approach 👍

@eygraber
Copy link
Contributor

eygraber commented Nov 9, 2021

I've had...not the most fun experience with the worker API for things like this (as a consumer). Specifically Android Lint started using it, and each worker was allowed to grow to 6gb (which was the Xmx I specified for the daemon). So Gradle might clean them when memory is needed, but until then all of my RAM was allocated (which also slowed down the machine so that in a lot of cases Gradle couldn't run fast enough to actually kill the processes to free the memory).

I think it's a good idea to do this in theory, but in the real world there are a lot of implications.

@3flex
Copy link
Member Author

3flex commented Nov 9, 2021

Just wondering if you tried tweaking org.gradle.workers.max at all?

There are also options for tweaking max memory only for the detekt tasks run using worker API which should help mitigate.

A permanent option for opt in/out might be better than removing the existing setup wholesale.

@eygraber
Copy link
Contributor

eygraber commented Nov 9, 2021

I did, but that affected parallelism overall.

Lint had options to limit the worker memory, which worked, but was trial and error to get right on a per project basis. They eventually switched to run in process by default.

@github-actions
Copy link

github-actions bot commented Feb 8, 2022

This PR is stale because it has been open 90 days with no activity. Please comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Feb 8, 2022
@3flex
Copy link
Member Author

3flex commented Feb 8, 2022

I still plan to complete this.

@cortinico cortinico added never gets stale Mark Issues/PRs that should not be closed as they won't get stale and removed stale labels Feb 8, 2022
@3flex 3flex marked this pull request as draft June 12, 2022 04:34
@github-actions github-actions bot added the build label Jun 12, 2022
@3flex 3flex mentioned this pull request Sep 20, 2022
@arturbosch
Copy link
Member

arturbosch commented Sep 20, 2022

Hi, I was triggered by Gradle closing two linked issues regarding the worker api which I linked in my worker api experiment from 2020 #2896.
Looking over the changes in this PR, I do not quite understand how the classloader caching problem is solved?
Wouldn't the code create a new classloader on each detekt invocation - meaning 1-2 seconds overhead for each gradle module/worker thread?

@3flex
Copy link
Member Author

3flex commented Sep 21, 2022

I do not quite understand how the classloader caching problem is solved?

It's not.

Wouldn't the code create a new classloader on each detekt invocation

It should only do so if there's no available Gradle worker, and number of current workers < org.gradle.workers.max.

Workers API performance should fall somewhere between the current (fastest) setup, and the JavaExec (slowest) version. Workers can be reused, so yes, each worker suffers the initial performance hit as it's spun up, but that only happens for new instances and not ones that are reused.

The big advantage of workers is increased parallelism in the build, as tasks using workers can run in parallel within a Gradle project (so e.g. detektMain and detektTest can run in parallel within the same project). Tasks that don't use workers can only run serially within a project. This can speed up the build if the machine has enough resources available.

The other important note is that detekt's use of Worker API is opt in and behind a flag so it can be treated as an experiment for now so this will not change the existing behaviour or use of the existing classloader cache.

@BraisGabin BraisGabin modified the milestones: 1.22.0, 1.23.0 Oct 25, 2022
@BraisGabin
Copy link
Member

I moved this PR to 1.23.0. It's a big change to introduce it after two RC

@vercel
Copy link

vercel bot commented Jan 6, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
detekt 🔄 Building (Inspect) Jan 6, 2023 at 8:52AM (UTC)

@3flex
Copy link
Member Author

3flex commented Feb 23, 2023

LGTM?

Copy link
Member

@BraisGabin BraisGabin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

detekt-gradle-plugin/build.gradle.kts Outdated Show resolved Hide resolved
Co-authored-by: Brais Gabín <braisgabin@gmail.com>
Copy link
Member

@chao2zhang chao2zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The only concern is #4128 (comment) but given that this PR is already open for over a year and it is behind a feature flag, this should be merged sooner rather than later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build gradle-plugin notable changes Marker for notable changes in the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants