Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize GitHub Action jobs in CI #5458

Merged
merged 1 commit into from
May 6, 2024

Conversation

torrocus
Copy link
Contributor

@torrocus torrocus commented Apr 28, 2024

Pull Request Summary

CI tests take a long time because they run different cases one after the other. If something fails, no further tests will be run. CI tests have not been passed for some time now. Fast tests didn't pass on CI for MacOS. This change is intended to help us solve this problem more easily.

Changes proposed in this pull request:

  • split a long-running tests job into smaller jobs that can be parallelized

Feedback

The total CI execution time is the same, but the waiting time for the test results is shorter.

Before

Screenshot before

After

Screenshot after

@torrocus
Copy link
Contributor Author

torrocus commented May 3, 2024

@pkuczynski what do you think about this change?

@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch from 8f388da to fe1e25c Compare May 3, 2024 10:49
@pkuczynski pkuczynski added this to the rvm-1.29.13 milestone May 3, 2024
@pkuczynski pkuczynski added the CI label May 3, 2024
@pkuczynski
Copy link
Member

It make sense to run things in parallel if it speeds up the tests, that's for sure. Just the code duplication is aweful. Can we somehow avoid this?

@eregon wdyt?

@eregon
Copy link
Contributor

eregon commented May 3, 2024

What about using a 2-dimensional matrix for that? That should address the duplication and keep things concise.

@eregon
Copy link
Contributor

eregon commented May 3, 2024

CI tests have not been passed for some time now.

Only on macOS which is a PITA. They work fine on Linux AFAIK.

.github/workflows/specs.yml Outdated Show resolved Hide resolved
@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch from fe1e25c to 55be111 Compare May 3, 2024 14:37
@torrocus
Copy link
Contributor Author

torrocus commented May 3, 2024

@pkuczynski said:
It make sense to run things in parallel if it speeds up the tests, that's for sure. Just the code duplication is aweful. Can we somehow avoid this?

I don't like these duplications either. Unfortunately, every job must perform similar activities:

  • git checkout repo
  • install ruby via rvm
  • install tf gem
  • run selected tests

At the moment I have no idea how to do it better, although I would love to share these parts of the job.

@eregon said:
What about using a 2-dimensional matrix for that? That should address the duplication and keep things concise.

If you mean strategies (matrix for OS), it won't change anything. To introduce parallelism, we need separate jobs. Anyway, where we run jobs on different OSs, I used matrix. I only removed where it was conditional.

@eregon
Copy link
Contributor

eregon commented May 3, 2024

Some general feedback: I think 9min vs 14min doesn't make a huge difference, in either case no one will wait for that actively. And then of course it will redo some work multiple times which is rather wasteful (e.g. compile ruby 2.7.7).
But having isolation between these groups of tests is valuable, especially given the existing issues between them, so this seems worth it.

@eregon
Copy link
Contributor

eregon commented May 3, 2024

If you mean strategies (matrix for OS), it won't change anything. To introduce parallelism, we need separate jobs. Anyway, where we run jobs on different OSs, I used matrix. I only removed where it was conditional.

Use a matrix e.g. like here: https://github.com/ruby/spec/blob/1afa2eab1cab5df6c5706fc5b7e92756b1db4f07/.github/workflows/ci.yml#L10-L12 (the keys would be os and test instead)
Those jobs all run in parallel.
And that solves the duplication, at the cost of some existing if conditionals for tests which don't work on macOS yet (but should eventually).

You can even avoid those if: with an explicit include list like https://github.com/ruby/spec/blob/1afa2eab1cab5df6c5706fc5b7e92756b1db4f07/.github/workflows/ci.yml#L14-L17

@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch from 9a9427d to 55be111 Compare May 3, 2024 15:12
@torrocus
Copy link
Contributor Author

torrocus commented May 3, 2024

Some general feedback: I think 9min vs 14min doesn't make a huge difference, in either case no one will wait for that actively. And then of course it will redo some work multiple times which is rather wasteful (e.g. compile ruby 2.7.7). But having isolation between these groups of tests is valuable, especially given the existing issues between them, so this seems worth it.

Yes, separation is a big advantage. If we add caching in the future, we should gain more time.

@torrocus
Copy link
Contributor Author

torrocus commented May 3, 2024

If you mean strategies (matrix for OS), it won't change anything. To introduce parallelism, we need separate jobs. Anyway, where we run jobs on different OSs, I used matrix. I only removed where it was conditional.

Use a matrix e.g. like here: https://github.com/ruby/spec/blob/1afa2eab1cab5df6c5706fc5b7e92756b1db4f07/.github/workflows/ci.yml#L10-L12 (the keys would be os and test instead) Those jobs all run in parallel. And that solves the duplication, at the cost of some existing if conditionals for tests which don't work on macOS yet (but should eventually).

You can even avoid those if: with an explicit include list like https://github.com/ruby/spec/blob/1afa2eab1cab5df6c5706fc5b7e92756b1db4f07/.github/workflows/ci.yml#L14-L17

Interesting idea. I'll see what I can do with it.

@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch 2 times, most recently from a535be9 to 286e714 Compare May 3, 2024 15:54
@torrocus
Copy link
Contributor Author

torrocus commented May 3, 2024

@pkuczynski @eregon I made changes and it looks better. What do you think?

@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch from 286e714 to c1b8a2b Compare May 3, 2024 16:21
Copy link
Contributor

@eregon eregon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great now, 2 more comments

.github/workflows/specs.yml Outdated Show resolved Hide resolved
.github/workflows/specs.yml Outdated Show resolved Hide resolved
eregon
eregon previously approved these changes May 3, 2024
@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch 3 times, most recently from 9adcabd to 430c00b Compare May 3, 2024 20:53
@torrocus torrocus requested a review from pkuczynski May 3, 2024 20:55
eregon
eregon previously approved these changes May 3, 2024
.github/workflows/specs.yml Show resolved Hide resolved
CI tests take a long time because they run different cases
one after the other. If something fails, no further tests will be run.

In brief it parallelized GitHub Action jobs in CI
and use 2-dimensional matrix for strategy,
becase we don't want duplication.
@torrocus torrocus force-pushed the split-ci-process-into-smaller-ones branch from 430c00b to bf3b37c Compare May 3, 2024 21:24
@torrocus torrocus requested a review from eregon May 3, 2024 21:46
@eregon
Copy link
Contributor

eregon commented May 4, 2024

@pkuczynski Could you merge? (I cannot)
And also adapt the Required status checks to the new job names?

@torrocus
Copy link
Contributor Author

torrocus commented May 6, 2024

@pkuczynski Since we have changed the CI process, we also need to change the settings in -> Settings -> Branches -> Branch protection rules. Then select Edit for the "master" branch. And then in the "Require status checks to pass before merging" option, remove the status checks "tests" and add "CI/tests".
Otherwise, only people with force push permission can merge ("tests" check status is waiting for report over and over again).

@pkuczynski
Copy link
Member

Sure I can :) Give me few mins.

@pkuczynski pkuczynski merged commit f116268 into rvm:master May 6, 2024
8 checks passed
@pkuczynski
Copy link
Member

MacOS tests are taking ridicilous long. Maybe we could check why...

@pkuczynski
Copy link
Member

Thank you @torrocus for your contribution

@eregon
Copy link
Contributor

eregon commented May 6, 2024

MacOS tests are taking ridicilous long. Maybe we could check why...

Looking at https://github.com/rvm/rvm/actions/runs/8945080363/usage and specific logs for test/fast/*, it looks like on Linux it's mostly (only?) prebuilt Ruby binaries being used, while on macOS every ruby installed seems compiled from source, which takes a while.

One thing to improve might be to always use the same Ruby version for all tests and avoid uninstalling it.
And obviously if there could be prebuilt Rubies on macOS that'd be even better.

@pkuczynski
Copy link
Member

That make sense. We had issues with building movable ruby in the past for macOS :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants