ci: Change testing farm runs to not occupy a runner #496

Open
cgwalters opened this issue Apr 26, 2024 · 5 comments
Labels
area/ci Issues related to our own CI

Comments

@cgwalters
Collaborator

cgwalters commented Apr 26, 2024

The github.com/containers organization is pretty large and active, but only has the default 20 GitHub-hosted runners available right now.

We have this repo hooked up to Testing Farm. The problem here is that each of the 4 distinct TF runs we do occupies a whole GitHub-hosted Actions runner virtual machine just to poll a remote HTTP server, which is quite wasteful.
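To make the waste concrete, here is a minimal sketch of what such a polling job amounts to. It is an illustration only, not the actual TF action code: `fetch_state`, the state names, and the 30-second interval are all assumptions.

```python
import time
from typing import Callable, Tuple

# Assumed terminal state names; the real service may use different ones.
TERMINAL_STATES = {"complete", "error"}


def wait_for_result(
    fetch_state: Callable[[], str],
    interval_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, int]:
    """Poll until a terminal state is seen; return (state, number_of_polls).

    fetch_state is a hypothetical callable that would issue one HTTP GET
    against the remote test service and return its current state string.
    """
    polls = 0
    while True:
        state = fetch_state()
        polls += 1
        if state in TERMINAL_STATES:
            return state, polls
        # Nothing useful happens here, yet the runner VM stays allocated.
        sleep(interval_seconds)
```

The runner is held for the full duration of the remote test run even though almost all of that time is spent inside `sleep`.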

It looks to me like the TF action itself supports being configured to report status back to the PR, without holding a runner? There's an update_pull_request_status flag...

Hmm, are we doing things this way because we need to clone the git repository here, since that's where the tests live?

cc @henrywang

@henrywang
Contributor

henrywang commented Apr 27, 2024

Yes, https://github.com/virt-s1/bootc-workflow-test is using update_pull_request_status. But that still needs a GitHub Actions runner. For example, https://github.com/virt-s1/bootc-workflow-test/actions/runs/8845171999/job/24288556567.

Or we can use self hosted runner (container). The https://github.com/virt-s1/bootc-workflow-report repo already uses a self-hosted GitHub Actions runner (container). For example, https://github.com/virt-s1/bootc-workflow-report/blob/c723ccca5482495d0860b6f187d971d0058d7560/.github/workflows/trigger-rhel-9-4.yml#L13.
This is the deployment script for the self-hosted runner (container): https://github.com/virt-s1/kite-action/blob/main/tools/deploy_container.yaml

The self-hosted GitHub Actions runner does not support autoscaling. That means the runner has to exist before a job can use it. But I have a solution that supports autoscaling GitHub Actions runners: https://github.com/virt-s1/kite-action/tree/main. RHEL for Edge QE CI and the osbuild-composer repo CI (RHEL for Edge part) have been using it for 2 years.

@cgwalters
Collaborator Author

> Or we can use self hosted runner (container).

Oh yes, that makes a lot of sense. Hmm. Actually...yeah, we should wire up something like this officially for the whole github.com/containers organization. Just thinking through how it works, there's also https://github.com/redhat-actions/openshift-actions-runners and https://github.com/actions/actions-runner-controller, which looks even better (I think, although it's not clear offhand whether it supports using e.g. container: to configure the pod image as distinct from the runner host; if it doesn't, that weakens things a lot).

@henrywang
Contributor

Yeah, this is a really good solution for self-hosted runners. Three reasons I didn't use it:

  1. This solution needs a public k8s cluster, and I can't find a free one.
  2. RHEL for Edge needs a VM or bare metal server, not a container. (RHEL for Edge does not use Testing Farm yet.)
  3. OpenShift is not supported.

I can spend some time on this solution and try something. That should be interesting.

@thrix

thrix commented May 6, 2024

Circling back to using Packit + Testing Farm instead: afaik people use the Packit + Testing Farm combo to overcome the GitHub runner limitations, since with that in place no GitHub runners are needed.

Just noting that Packit does not need to do any building, it can just trigger Testing Farm.

Also Packit and Testing Farm sync using "webhooks", so Packit does not need to actively wait on Testing Farm, it gets notified from Testing Farm when the state changes. So it scales a lot better.
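For contrast with active waiting, a webhook-style receiver can be sketched roughly as below. This is a generic illustration, not how Packit or Testing Farm actually implement it: the JSON payload shape (`request_id`, `state`), the in-memory store, and the helper names are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict

# In-memory map of request id -> last reported state (illustration only).
latest_state: Dict[str, str] = {}


def record_notification(body: bytes) -> str:
    """Parse a hypothetical JSON notification and store the new state."""
    event = json.loads(body)
    latest_state[event["request_id"]] = event["state"]
    return event["state"]


class NotificationHandler(BaseHTTPRequestHandler):
    """Accepts POSTed state-change notifications from the test service."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        record_notification(self.rfile.read(length))
        self.send_response(204)  # acknowledged, no response body
        self.end_headers()


def make_server(port: int = 0) -> HTTPServer:
    """Build the receiver; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), NotificationHandler)
```

Calling `make_server().serve_forever()` would block on the listening socket until a notification arrives, so one small receiver can track many outstanding runs instead of dedicating a waiting worker to each.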

I am not sure if webhooks could be used with GitHub runners; it would be great ... actually, even Fedora CI was able to avoid active waiting by using the webhook step plugin: https://plugins.jenkins.io/webhook-step/

@cgwalters
Collaborator Author

OK yep, just got bit by this again - all 20 runner slots for the containers/ org were taken up by testing farm polling tasks.
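The arithmetic behind this is simple enough to sketch, using the two numbers from this thread (20 org-wide runners, 4 TF runs per PR). It assumes every PR triggers all 4 runs and that they are all waiting at the same time; the function name is made up for illustration.

```python
def prs_to_fill_all_runners(total_runners: int, runs_per_pr: int) -> int:
    """Smallest number of simultaneously waiting PRs that occupies every runner."""
    # Ceiling division using integers only.
    return (total_runners + runs_per_pr - 1) // runs_per_pr


print(prs_to_fill_all_runners(20, 4))  # prints 5
```

Under those assumptions, five PRs in flight from this one repo are enough to leave no runner for any other repository in the organization.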

3 participants