[test] kubeflow-pipeline-e2e-test and kubeflow-pipeline-upgrade-test broken and blocks presubmit #10779

Open
chensun opened this issue May 3, 2024 · 0 comments
Labels
area/testing help wanted The community is welcome to contribute.

chensun commented May 3, 2024

This is caused by how we build the test images: they are built through docker-in-docker, which is broken in the latest available GKE versions.

More context:

# TODO(#9706): Switch back to regular channel once we stop building test images via dind.
# Temporarily use cos as image type until docker dependencies gets removed.
# reference: https://github.com/kubeflow/pipelines/issues/6696
# Hard-coded GKE to 1.25.10-gke.1200 (the latest 1.25 in STABLE channel). Reference:
# https://github.com/kubeflow/pipelines/issues/9704#issuecomment-1622310358
# 08/09/2023 update: 1.25.10-gke.1200 no longer supported, use 1.25.10-gke.2100 instead. Reference:
# https://cloud.google.com/kubernetes-engine/docs/release-notes-nochannel#2023-r17_version_updates
gcloud container clusters create ${TEST_CLUSTER} --image-type cos_containerd --release-channel stable --cluster-version 1.25 ${SCOPE_ARG} ${NODE_POOL_CONFIG_ARG} ${WI_ARG}

However, 1.25 is no longer available on GKE, so cluster creation fails during deployment. For example:

++ gcloud container clusters create e2e-f243fff-2539 --image-type cos_containerd --release-channel stable --cluster-version 1.25 --num-nodes=2 --machine-type=e2-standard-8 --enable-autoscaling --max-nodes=8 --min-nodes=2 --workload-pool=ml-pipeline-test.svc.id.goog
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting with version 1.18, clusters will have shielded GKE nodes by default.
WARNING: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s). 
ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=No valid versions with the prefix "1.25" found.
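
The error above comes from pinning a minor version that GKE no longer serves. One way to surface this before cluster creation is to check the pinned prefix against a list of valid versions. The following is a minimal sketch under stated assumptions, not the project's fix: `pick_cluster_version` is a hypothetical helper, the sample version strings are made up for illustration, and the script assumes a `sort` that supports `-V` (GNU coreutils). The `gcloud container get-server-config` invocation in the trailing comment is an assumed way to obtain the list, and its region and output separator should be verified before use.

```shell
#!/usr/bin/env bash
# Sketch only: pick the newest version that matches a pinned prefix.
# pick_cluster_version is a hypothetical helper, not part of the test scripts.

# Reads one version per line on stdin and prints the newest entry that equals
# the prefix or starts with "<prefix>." or "<prefix>-". Prints nothing when no
# entry matches, which is the situation that produced the 400 error.
pick_cluster_version() {
  local prefix="$1"
  awk -v p="$prefix" \
    'index($0, p ".") == 1 || index($0, p "-") == 1 || $0 == p' \
    | sort -V | tail -n 1
}

# Illustrative input only; these are not a real GKE version listing.
sample_versions='1.27.8-gke.1067004
1.27.11-gke.1062000
1.28.7-gke.1026000'

printf '%s\n' "$sample_versions" | pick_cluster_version 1.27
printf '%s\n' "$sample_versions" | pick_cluster_version 1.25  # no match, prints nothing

# Assumed way to feed real data (region and ';' separator are assumptions):
#   gcloud container get-server-config --region us-central1 \
#     --flatten="channels" --filter="channels.channel=STABLE" \
#     --format="value(channels.validVersions)" \
#     | tr ';' '\n' | pick_cluster_version 1.25
```

An empty result for the pinned prefix means the `--cluster-version` value needs to change before `gcloud container clusters create` can succeed.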

Impacted by this bug? Give it a 👍.
