ci: fix gke network starvation #1654

Merged
merged 1 commit into from
May 24, 2023

@brlbil brlbil commented May 24, 2023

Both cilium and cilium-cli use the same GCP project to provision GKE clusters.

If enough clusters are provisioned simultaneously, we get an error like the one below:

The network "default" does not have available private IP space in 10.0.0.0/9 to reserve a /14 block for pods for cluster {Zone=us-west2-a, ProjectNum=185287498374, ProjectName=*** ....

With the current defaults, listed below, around 31 clusters can be provisioned simultaneously.

  • subnetwork (used by nodes) : /22
  • pod cidr : /14
  • service cidr : /20

The current default network ranges are far larger than our use cases need: only 2 nodes are provisioned per cluster,
and at most 29 pods and 9 services are created.

To match these network range requirements, the defaults have been changed as follows:

  • subnetwork : /26 (it is not used in externalworkloads)
  • pod cidr : /21 (the largest prefix length, and so the smallest range, that is allowed)
  • service cidr : /24
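
As a rough sanity check of the new sizes, the sketch below compares the raw address count of each new range with the per-cluster needs quoted above (2 nodes, 29 pods, 9 services). The dictionary names are illustrative only. The counts are raw block sizes: they ignore addresses reserved by the platform and the way GKE carves the pod range into per-node blocks, so the usable numbers are lower.

```python
def addresses(prefix_len: int) -> int:
    """Number of IPv4 addresses in a block with the given prefix length."""
    return 2 ** (32 - prefix_len)


# Per-cluster needs, as stated in the PR description.
needs = {"subnetwork": 2, "pod cidr": 29, "service cidr": 9}

# New default prefix lengths, as stated in the PR description.
new_prefixes = {"subnetwork": 26, "pod cidr": 21, "service cidr": 24}

# Raw capacity of each range (assumption: no reserved addresses subtracted).
capacity = {name: addresses(prefix) for name, prefix in new_prefixes.items()}

for name, size in capacity.items():
    print(f"{name}: /{new_prefixes[name]} holds {size} addresses, need {needs[name]}")
```

Even the reduced ranges leave ample headroom: 64 addresses for 2 nodes, 2048 for 29 pods, and 256 for 9 services.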

With these changes, the number of clusters that can be provisioned simultaneously is more than 3500.
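
The two cluster counts can be roughly reproduced with a little arithmetic. This sketch assumes the pod range is the limiting factor and that it is drawn from 10.0.0.0/9, as the error message suggests; the function name is hypothetical. The results are upper bounds, since other allocations in the same network also consume address space, which is consistent with the "around 31" and "more than 3500" figures.

```python
import ipaddress

# Address pool named in the error message.
pool = ipaddress.ip_network("10.0.0.0/9")


def max_blocks(network: ipaddress.IPv4Network, prefix_len: int) -> int:
    """Upper bound on non-overlapping blocks of the given prefix length in network."""
    return 2 ** (prefix_len - network.prefixlen)


old_limit = max_blocks(pool, 14)  # 32 pod ranges of size /14
new_limit = max_blocks(pool, 21)  # 4096 pod ranges of size /21

print(f"/14 pod ranges in {pool}: {old_limit}")
print(f"/21 pod ranges in {pool}: {new_limit}")
```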

Note: These changes have already been merged into cilium in a recent PR.

Links to successful runs are below:

externalworkloads.yaml

gke.yaml

multicluster.yaml

With current defaults, the number of clusters that can be
provisioned simultaneously in the same project is around 31.

This commit changes the default subnet range to /26,
the pod IP range to /21, and the service IP range to /24.

After these changes, the maximum number of clusters that
can be provisioned will be more than 3500.

Signed-off-by: Birol Bilgin <birol@cilium.io>
@brlbil brlbil temporarily deployed to ci May 24, 2023 04:49 — with GitHub Actions Inactive
@brlbil brlbil added the area/CI Continuous Integration testing issue or flake label May 24, 2023
@brlbil brlbil temporarily deployed to ci May 24, 2023 05:49 — with GitHub Actions Inactive
@brlbil brlbil force-pushed the pr/brlbil/ci-fix-gke-network-starvation branch from 599b8a3 to 355c1de Compare May 24, 2023 06:55
@brlbil brlbil temporarily deployed to ci May 24, 2023 06:55 — with GitHub Actions Inactive
@brlbil brlbil marked this pull request as ready for review May 24, 2023 06:55
@brlbil brlbil requested review from a team as code owners May 24, 2023 06:55

@viktor-kurchenko viktor-kurchenko left a comment

LGTM.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 24, 2023
@tklauser tklauser merged commit 3defcac into main May 24, 2023
35 of 37 checks passed
@tklauser tklauser deleted the pr/brlbil/ci-fix-gke-network-starvation branch May 24, 2023 08:42