Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--diff appears to be slow / not actually using git diff #1714

Open
james-s-w-clark opened this issue Dec 15, 2022 · 2 comments
Open

--diff appears to be slow / not actually using git diff #1714

james-s-w-clark opened this issue Dec 15, 2022 · 2 comments

Comments

@james-s-w-clark
Copy link

I am trying to speed up our CI, which runs stuff like scalafix and scalafmt. scalafmt (with our Mill task, at least) is able to use caching to prevent unnecessary work (inital checks may take a few minutes; further checks with no/small diffs take ~seconds).

Most PRs in CI only change a small percentage of files, so there shouldn't be much to format or fix.

I noted in the documentation there are the --diff and --diff-base branch/tag/commit options for scalafix. I expected that:

  1. files not in git diff would not be checked
  2. possibly even better, only changed content in added/modified/renames would need to be checked
  3. the above would presumably make scalafix much faster with the --diff options, when the percentage of files changed is small. This assumes git is fast, and scalafix has some intense work to do.

I have read your CliGitDiffSuite tests, and it looks like functionality for 1 & 2 is there.

However, at least through the mill-scalafix plugin, --diff actually seems to slow things down.

Here is a link to an issue I opened in mill-scalafix. From a few runs, it looks like --diff is about 50% slower.

In this scalafix repo I tried sbt "scalafixAll --check" and sbt "scalafixAll --check --diff-base main, and didn't see much difference in time taken (even with no diff, on a branch off of main).

@bjaglin
Copy link
Collaborator

bjaglin commented Dec 15, 2022

scalafmt (with our Mill task, at least) is able to use caching to prevent unnecessary work (inital checks may take a few minutes; further checks with no/small diffs take ~seconds)

I assume that's a feature of the mill-scalafmt plugin? If you were using sbt-scalafix, you would also benefit from incremental runs out of the box. I don't think mill-scalafix has anything like that at the moment.

However, at least through the mill-scalafix plugin, --diff actually seems to slow things down.

Could you at least confirm that there were less files targeted (via --verbose)? I am not familiar with mill, so I don't know what kind of side effects we could have (a significant time might be spent outside scalafix, recompiling sources with semanticdb for example). Could you run the same experiment with the CLI?

In this scalafix repo I tried sbt "scalafixAll --check" and sbt "scalafixAll --check --diff-base main, and didn't see much difference in time taken (even with no diff, on a branch off of main).

As sbt-scalafix is incremental, this test is probably biased. Same suggestion as above: you should try to use the CLI directly if you want to assess the impact of --diff.

@james-s-w-clark
Copy link
Author

james-s-w-clark commented Dec 16, 2022

Good call on trying with cs's scalafix to remove bias.

I already gave it a shot yesterday:

  • in our private repo, even with no DisableSyntaxs and only 1 rule (didn't matter which) in .scalafix.conf it was unhappy with errors like } expected but identifier found... java.util.List as JList for example. I didn't expect all the rules (1 at a time) to pick up on that.

  • in this repo I got error: Unknown rule 'OrganizeImports'. If I comment that rule out, there's an error for each file about SemanticDB not found.

I'll try a bit more to get cs's scalafix it working on our source code. A full scalafix there takes a lot longer than for this repo.


The mill-scalafix repo runs with coursier's scalafix off of main (or, a branch off that with no changes).

  • scalafix --verbose --check logs info: Processing (1/34) etc., some [DisableSyntax....], and some example fixes (e.g. def procedure() {} -> def procedure(): Unit = {})
  • scalafix --verbose --check --diff-base main also logs the same
    • so, it looks like raw scalafix isn't using the diffing
    • This is on a branch off main, with no code changes.
  • scalafix --verbose --diff-base main also logs the same (and edits 3 files)

The codebase is fairly small here so I can't compare time/performance yet, but it looks like functionally the diff is having no effect.

@james-s-w-clark james-s-w-clark changed the title --diff appears to be slow --diff appears to be slow / not actually using git diff Dec 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants