Reduce resource requests of istio ingress gateway and adapt autoscaling accordingly. #9250

ScheererJ · 2024-02-26T10:00:36Z

How to categorize this PR?

/area networking
/area auto-scaling
/area cost
/kind enhancement

What this PR does / why we need it:
Reduce resource requests of istio ingress gateway and adapt autoscaling accordingly.

Because istio ingress gateways are split across zones in highly available seed setups, their resource requests are oversized for all but very large, active seed clusters. To reduce this waste, this change cuts the CPU request to a quarter and the memory request to half of the previous values. The general assumption is that, thanks to its priority, the istio ingress gateway can obtain additional CPU if it is available on the node. The memory limit keeps its current value, and the priority may again help the gateway avoid being out-of-memory killed. As an additional safeguard for memory, the autoscaling is extended to cover memory as well, so that a scale-up can happen under memory pressure. In addition, the scale-up and scale-down behaviour is now specified explicitly, with a fast scale-up and a slow scale-down.
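To illustrate why the autoscaling has to be adapted together with the requests: the Kubernetes HorizontalPodAutoscaler measures utilization relative to the resource requests and, when several metrics are configured, follows the largest replica proposal. The following is a minimal sketch of that rule only. All numbers (usage, requests, target, replica count) are hypothetical and not taken from this PR, and the sketch ignores the HPA tolerance, min/max replicas and stabilization windows.

```python
import math


def utilization_percent(usage: float, request: float) -> float:
    """Utilization as the HPA sees it: average usage relative to the request."""
    return 100.0 * usage / request


def desired_replicas(current: int, utilization: float, target: float) -> int:
    """Core HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current * utilization / target)


def recommendation(current: int, metrics: list[tuple[float, float]]) -> int:
    """With several metrics (here CPU and memory), the largest proposal wins."""
    return max(desired_replicas(current, used, target) for used, target in metrics)


# Hypothetical per-pod usage and requests (CPU in millicores, memory in MiB).
cpu_usage, mem_usage = 300.0, 900.0
old_cpu_request, old_mem_request = 1000.0, 2048.0
# The PR reduces requests to a quarter (cpu) and half (memory).
new_cpu_request, new_mem_request = old_cpu_request / 4, old_mem_request / 2

target = 80.0  # hypothetical target utilization in percent, for both metrics
replicas = 2

before = recommendation(replicas, [
    (utilization_percent(cpu_usage, old_cpu_request), target),
    (utilization_percent(mem_usage, old_mem_request), target),
])
after = recommendation(replicas, [
    (utilization_percent(cpu_usage, new_cpu_request), target),
    (utilization_percent(mem_usage, new_mem_request), target),
])
print(before, after)
```

With these made-up values, the same absolute usage that kept the deployment at two replicas under the old requests asks for three replicas under the reduced requests, because the measured utilization rises as the requests shrink.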

Which issue(s) this PR fixes:
None.

Special notes for your reviewer:
For the istio ingress gateway spanning multiple zones, i.e. the default istio ingress gateway, the autoscaling is not optimal because it does not take zones into account. This means the deployment may scale up in a zone that is not under pressure.
However, as we have not seen many scale-up operations for istio, this is left as a follow-up step. It might be a good idea to combine the single-zone istio ingress gateways into a virtual multi-zonal one and get rid of the existing default one, but that requires more changes and may be done as a follow-up.
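The zone problem can be pictured with a small sketch. It assumes, purely for illustration, that a new replica is placed in the zone that currently runs the fewest gateway pods (a count-based spreading rule); the actual placement depends on the scheduler and the spread constraints in use. Zone names, pod counts and utilization values are hypothetical.

```python
def zone_for_new_pod(pods_per_zone: dict[str, int]) -> str:
    """Assumed placement rule: the zone that currently runs the fewest pods."""
    return min(pods_per_zone, key=pods_per_zone.get)


def zone_under_pressure(utilization_per_zone: dict[str, float]) -> str:
    """The zone whose gateway pods are busiest."""
    return max(utilization_per_zone, key=utilization_per_zone.get)


# Hypothetical state of a multi-zonal gateway deployment.
pods_per_zone = {"zone-a": 2, "zone-b": 1, "zone-c": 2}
utilization_per_zone = {"zone-a": 95.0, "zone-b": 30.0, "zone-c": 40.0}

placed = zone_for_new_pod(pods_per_zone)
hot = zone_under_pressure(utilization_per_zone)
print(placed, hot)
```

In this made-up state the additional pod lands in `zone-b`, while the pressure is in `zone-a`, which is the mismatch described in the notes above.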

Release note:

Resource requests of the istio ingress gateway are reduced, and its horizontal autoscaling behaviour is specified in more detail, including scale-up under memory pressure.

ScheererJ · 2024-02-26T10:01:03Z

/cc @vlerenc @voelzmo @dguendisch @axel7born @DockToFuture

vlerenc · 2024-02-26T10:36:23Z

Thank you @ScheererJ. I hope we will not break or tear down anything with this change ("never touch a running system", but in this case...). What can we do as flanking measures? I could get the numbers from dev and staging. Maybe you can ping me?

Anything else? I should also have the restart count and such, but what matters is whether the API servers remained accessible, and I do not know whether we would notice a small decline in availability (a large one, probably).

DockToFuture

/lgtm
Did you take a look at the historic data? Is one new pod per minute good enough for upscaling?
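As a rough aid for answering that question, the sketch below counts how many rate-limit periods a scale-up needs when at most a fixed number of pods may be added per period. The function name and the replica numbers are hypothetical; the only figure taken from this thread is "one new pod per minute". Since the first pod can typically be added right away, the wall-clock time is about one period less than the count.

```python
import math


def scale_up_periods(current: int, wanted: int, pods_per_period: int = 1) -> int:
    """Number of policy periods needed to add the missing replicas."""
    missing = max(0, wanted - current)
    return math.ceil(missing / pods_per_period)


# Starting from two replicas, with one new pod allowed per period (one minute).
for wanted in (3, 6, 12):
    print(wanted, scale_up_periods(current=2, wanted=wanted))
```

Comparing such numbers with the largest scale-ups seen in the historic data would show whether the rate is sufficient.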

gardener-prow · 2024-02-26T13:20:54Z

LGTM label has been added.

Git tree hash: 2b40a1e4056b42b027a006de39d3e9052ce31c2c

pkg/component/istio/test_charts/ingress_autoscaler.yaml

axel7born · 2024-02-28T13:55:35Z

/lgtm

rfranzke

/approve

gardener-prow · 2024-02-28T14:18:51Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rfranzke

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [rfranzke]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gardener-prow bot requested review from ialidzhikov and shafeeqes February 26, 2024 10:00

gardener-prow bot added the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) Feb 26, 2024

gardener-prow bot requested review from axel7born, dguendisch, DockToFuture, vlerenc and voelzmo February 26, 2024 10:01

Adapt tests

72ee2df

gardener-prow bot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) and removed the size/S label Feb 26, 2024

DockToFuture reviewed Feb 26, 2024

gardener-prow bot assigned DockToFuture Feb 26, 2024

gardener-prow bot added the lgtm label (indicates that a PR is ready to be merged) Feb 26, 2024

axel7born reviewed Feb 26, 2024

rfranzke requested review from axel7born and DockToFuture February 28, 2024 13:25

gardener-prow bot assigned axel7born Feb 28, 2024

rfranzke reviewed Feb 28, 2024

gardener-prow bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Feb 28, 2024

gardener-prow bot merged commit f930e8e into gardener:master Feb 28, 2024
17 checks passed
