Migrate the CI workflows to the new runners on EKS cluster #33371

hussein-awala · 2023-08-13T21:23:22Z

This PR updates the labels used in runs-on to migrate the workflows to the new GHA runners running on Airflow EKS cluster, and managed by an Actions Runner Controller (ARC).

In the new cluster, we can choose between 3 different sizes for the self hosted runners (the instance type could be change):

small -> t4g-medium (2 cpus, 4Gb of ram) -> runs-on: ${{ fromJson(needs.build-info.outputs.runs-on-small) }}
medium -> t4g-xlarge (4 cpus, 16Gb of ram) -> runs-on: ${{ fromJson(needs.build-info.outputs.runs-on-medium) }}
large -> t4g-2xlarge (8 cpus, 32 Gb of ram) -> runs-on: ${{ fromJson(needs.build-info.outputs.runs-on-large) }}

And we can add new runner sets with different characteristics in the future, for ex create a specific docker image which contains some pre-installed packages and use it in specific jobs, or maybe create a new runner image from Airflow CI image and update it via the CI after updating the breeze or Airflow dependencies. (to be discussed later).

Also, this PR removes all the steps which setup python using actions/setup-python@v4, because of a bug in the action (actions/setup-python#705). As a workaround, I pre-installed python in the runner docker image, we can revert this change after fixing the bug, or keep it based on the performance of the runner.

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

potiuk · 2023-08-14T09:12:22Z

.github/workflows/ci.yml

@@ -115,6 +115,15 @@ jobs:
      helm-test-packages: ${{ steps.selective-checks.outputs.helm-test-packages }}
      debug-resources: ${{ steps.selective-checks.outputs.debug-resources }}
      runs-on: ${{ steps.selective-checks.outputs.runs-on }}
+      runs-on-small: >-
+        ${{ steps.selective-checks.outputs.runs-on == 'self-hosted'


I have bad experiences with conditions embedded using the expreession language. It's prone to errors. Would not it be better to generate runs-on-small/medium/large as direct output of selective checks?

That has the potential that we can also base it on labels later when we connect ARC - then we could for example apply an "ARC" label to PR to test it on the ARC setup.

This has the nice property that selective checks are also very well covered by unit tests so we can easily test any logic we will apply there.

Suggestion: Maybe the output should be named "selective-checks.outputs.small-runner", "selective-checks.outputs.medium-runner" etc.

That will be a nice "abstraction layer" (selective checks based on labels will choose what the "medium" runner really is.

That will also read better:

runs-on: "${{ fromJson(needs.build-info.outputs.medium-runner) }}"

potiuk · 2023-08-14T09:19:04Z

.github/actions/breeze/action.yml

@@ -25,12 +25,6 @@ outputs:
 runs:
  using: "composite"
  steps:
-    - name: "Setup python"


I am afraid this will not work if we want to merge it and would like to freely switch between the "public", "AWS" and "ARC" runners.

I would prefer to have the case where we have a way to easily switch to choose which runners we are using on a PR-by PR basis (i.e. for example apply label to make specific PR to run on ARC - similalry as we do with "use public runners") - at least for a few week transition period.

I think what can be done is that this action can be conditional based on (yes, you guessed it) selective checks output. I think when you are calling a composite action, you are able to pass it a parameter from the calling action, so we could pass "use-arc" or simlar selective check output and based on that we could either run the setup python action or not.

github-actions · 2023-09-29T00:10:55Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

potiuk · 2023-10-08T19:05:16Z

We might want to come back to this change. It was stale for a while but probably worth keeping it until we get to changing our CI env

hussein-awala · 2023-10-08T19:08:47Z

We might want to come back to this change. It was stale for a while but probably worth keeping it until we get to changing our CI env

Recently I did a similar migration in a personal project, I will come back to this one in two weeks with a much better version.

github-actions · 2023-12-28T00:11:38Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

boring-cyborg bot added the area:dev-tools label Aug 13, 2023

potiuk reviewed Aug 14, 2023

View reviewed changes

github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Sep 29, 2023

github-actions bot closed this Oct 4, 2023

potiuk reopened this Oct 8, 2023

github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 9, 2023

Test GHA ARC

6ae6cb4

hussein-awala closed this Nov 12, 2023

hussein-awala reopened this Nov 12, 2023

hussein-awala force-pushed the migrate_ci_to_eks_runners branch from fa13272 to 011e6c7 Compare November 12, 2023 20:16

Merge branch 'main' into test_arc_1

18288ca

hussein-awala force-pushed the migrate_ci_to_eks_runners branch from 011e6c7 to 18288ca Compare November 12, 2023 20:18

github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Dec 28, 2023

github-actions bot closed this Jan 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migrate the CI workflows to the new runners on EKS cluster #33371

Migrate the CI workflows to the new runners on EKS cluster #33371

hussein-awala commented Aug 13, 2023

potiuk Aug 14, 2023

potiuk Aug 14, 2023 •

edited

potiuk Aug 14, 2023

github-actions bot commented Sep 29, 2023

potiuk commented Oct 8, 2023

hussein-awala commented Oct 8, 2023

github-actions bot commented Dec 28, 2023

Migrate the CI workflows to the new runners on EKS cluster #33371

Migrate the CI workflows to the new runners on EKS cluster #33371

Conversation

hussein-awala commented Aug 13, 2023

potiuk Aug 14, 2023

Choose a reason for hiding this comment

potiuk Aug 14, 2023 • edited

Choose a reason for hiding this comment

potiuk Aug 14, 2023

Choose a reason for hiding this comment

github-actions bot commented Sep 29, 2023

potiuk commented Oct 8, 2023

hussein-awala commented Oct 8, 2023

github-actions bot commented Dec 28, 2023

potiuk Aug 14, 2023 •

edited