CI with bors is slow, still flaky, and provides no way to manually override #11449
Comments
Oh also it's littering our release page with tens of draft releases that aren't meaningful.
Some community PRs are currently in a weird state where bors won't merge them.
Given #13501, do we still need to keep this open?
Probably not, especially if GitHub merge queues allow partial retries of failed jobs, which was the main complaint for this issue.
You can jump the queue but I don't think it will let you skip tests (since that's the whole point of the queue). Instead, for P0 situations you could disable the merge queue requirement from the UI -- maybe not ideal, but very easy in a pinch.
I think this is addressed by the only merge non-failing pull requests option, but I haven't used it personally.
Pretty much the title.
We're finding that bors has not really improved our issues with CI; at best it has moved them around, and more likely it has made them worse.
For reasons unrelated to bors, our CI actions are flaky and often time out or crash the GitHub runners. With bors, that requires a complete retry of the bors action, while previously we could use GitHub's "retry failing jobs" feature to retry just the flaky parts.
Bors itself seems to be slow: I've seen PRs sit for hours after a "bors merge" comment with no action taken by bors.
Bors also obstructs the option to manually push things through. For example, if we have a P0 bug in a release with a trivial fix, we might want to push that fix and release faster, accepting the risk of not running all the tests in exchange for getting a P0 bug fix into customers' hands sooner.
Comments like the above have come up repeatedly over the last few months since switching to bors, and we were generally happy to accept that there would be some teething pain with the switch. However, no concrete proposals have been put forward to fix these issues.
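To make the retry cost concrete, here is a rough back-of-the-envelope sketch. It assumes every job fails independently with the same fixed pass rate and that a full retry reruns every job; both are simplifications, and the job count and pass rate below are hypothetical, not measurements from our CI.

```python
def expected_runs_full_retry(n_jobs: int, pass_rate: float) -> float:
    """Expected job executions when any single failure reruns all jobs.

    An attempt succeeds only if all n_jobs pass, with probability
    pass_rate ** n_jobs, so the expected number of attempts is the
    reciprocal of that, and each attempt executes n_jobs jobs.
    """
    p_all = pass_rate ** n_jobs
    return n_jobs / p_all


def expected_runs_partial_retry(n_jobs: int, pass_rate: float) -> float:
    """Expected job executions when only the failed jobs are rerun.

    Each job is retried on its own until it passes, which takes
    1 / pass_rate executions on average.
    """
    return n_jobs / pass_rate


if __name__ == "__main__":
    # Hypothetical numbers: 10 jobs, each passing 95% of the time.
    n_jobs, pass_rate = 10, 0.95
    full = expected_runs_full_retry(n_jobs, pass_rate)
    partial = expected_runs_partial_retry(n_jobs, pass_rate)
    print(f"full retry:    {full:.1f} job executions on average")
    print(f"partial retry: {partial:.1f} job executions on average")
```

Under these assumptions the gap widens quickly as the number of jobs grows or the pass rate drops, which matches what we are seeing in practice.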
As such this issue is a commitment to either get concrete solutions put in place for the above, that is:
Failing fixes for the above, we should move away from using bors and return to the old system of PR tests and master tests with retries. We will still want to investigate flakiness at some point in that system, but it's a lot less pressing and doesn't have the other negatives of bors. It also makes our KPI of green master meaningful again, giving us back a metric we've lost for tracking how stability is going.