-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-23221: internalServiceChanged: Fix target port logic #1052
base: master
Are you sure you want to change the base?
OCPBUGS-23221: internalServiceChanged: Fix target port logic #1052
Conversation
@Miciah: This pull request references Jira Issue OCPBUGS-23221, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
30433aa
to
53baea6
Compare
ci/prow/e2e-aws-ovn-single-node failed with the following error:
I filed OCPBUGS-33162 for this error. |
ci/prow/e2e-aws-operator and ci/prow/e2e-azure-operator both failed because |
/assign |
@Miciah will this also need to be backported to 4.12, like #864 (comment) ? |
53baea6
to
584ef42
Compare
@Miciah: This pull request references Jira Issue OCPBUGS-23221, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
https://github.com/openshift/cluster-ingress-operator/compare/53baea6f04f80b40128edd4d55cae58e7dd13dc3..584ef427468eac9d8a5a1948e11ae26f673d3114 updates the logic to preserve the same fields that we ignore when checking whether an update is needed, and also adds an E2E test. |
Yeah, if anyone requests a backport and PM agrees, it makes sense to backport this PR. |
584ef42
to
600ed28
Compare
https://github.com/openshift/cluster-ingress-operator/compare/584ef427468eac9d8a5a1948e11ae26f673d3114..600ed28e6f713ec4fe0d2e94b37061f96bf61852 adds comments about keeping cmp options and |
ci/prow/e2e-azure-operator failed because the installer timed out waiting for the API server respond after bootstrap. ci/prow/e2e-aws-operator failed because |
for i, updatedPort := range updated.Spec.Ports { | ||
for _, currentPort := range current.Spec.Ports { | ||
if currentPort.Name == updatedPort.Name { | ||
updated.Spec.Ports[i].TargetPort = currentPort.TargetPort | ||
updated.Spec.Ports[i].NodePort = currentPort.NodePort |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this mean that we are allowing the spec.ports.nodePorts to be changed from what we expect, to what is set in current?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something/someone changed the nodePort, and only the nodePort, then L147-148 would return "unchanged" because we ignore it. However, if something changed the nodePort, and also changed something we don't ignore, and we get to here, then we change it. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this mean that we are allowing the spec.ports.nodePorts to be changed from what we expect, to what is set in current?
It does. This is explained in the comments:
cluster-ingress-operator/pkg/operator/controller/ingress/internal_service.go
Lines 120 to 122 in 4a306cb
// Ignore fields that the API, other controllers, or user may | |
// have modified. Note: This list must be kept consistent with | |
// the updated.Spec.Foo = current.Spec.Foo assignments below! |
cluster-ingress-operator/pkg/operator/controller/ingress/internal_service.go
Lines 166 to 168 in 4a306cb
// Preserve fields that the API, other controllers, or user may have | |
// modified. Note: This list must be kept consistent with | |
// serviceCmpOpts above! |
In particular, NodePort
is set by the API, and so NodePort
must be ignored in the comparison and preserved during the update so that the operator does not try to revert the value that the API set.
If something/someone changed the nodePort, and only the nodePort, then L147-148 would return "unchanged" because we ignore it. However, if something changed the nodePort, and also changed something we don't ignore, and we get to here, then we change it. Am I missing something?
updated.Spec.Ports[i].NodePort = currentPort.NodePort
copies the value that was observed (currentPort.NodePort
) to the final version of the service that we will use for the update (updated
). That is, the assignment preserves the current NodePort
value for updates.
Note that we initialize updated
by copying the spec
field from expected
:
cluster-ingress-operator/pkg/operator/controller/ingress/internal_service.go
Lines 151 to 152 in 4a306cb
updated := current.DeepCopy() | |
updated.Spec = expected.Spec |
Then if we want to preserve the current value for some subfield, we need to copy the value from current
:
cluster-ingress-operator/pkg/operator/controller/ingress/internal_service.go
Lines 169 to 178 in 4a306cb
updated.Spec.ClusterIP = current.Spec.ClusterIP | |
updated.Spec.ClusterIPs = current.Spec.ClusterIPs | |
updated.Spec.ExternalIPs = current.Spec.ExternalIPs | |
updated.Spec.HealthCheckNodePort = current.Spec.HealthCheckNodePort | |
updated.Spec.IPFamilies = current.Spec.IPFamilies | |
updated.Spec.IPFamilyPolicy = current.Spec.IPFamilyPolicy | |
for i, updatedPort := range updated.Spec.Ports { | |
for _, currentPort := range current.Spec.Ports { | |
if currentPort.Name == updatedPort.Name { | |
updated.Spec.Ports[i].NodePort = currentPort.NodePort |
At the end of all this, updated
is the result of merging current
and expected
, copying the field values that the operator should update from expected
and copying the field values that the operator should preserve (that is, leave alone) from current
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still,
If something/someone changed the nodePort, and only the nodePort, then L147-148 would return "unchanged" because we ignore it.
Does that mean the change to nodePort won't get saved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return false, nil
statement on line 147/148 means that updateInternalService
won't do an update. In that case, there's no need to preserve NodePort
's value—we're not doing an update anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. If we never expect current to hold any values that are different for the ignored fields, then why do we have to preserve them later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The operator preserves the ignored fields so that it doesn't inadvertently update fields that we intend for it to ignore.
The operator ignores these fields because it isn't supposed to be managing them. In particular, the API sets NodePort
, so the operator should never try to update NodePort
, even if expected.Spec.Ports[i].NodePort
differs from current.Spec.Ports[i].NodePort
.
Maybe the word "expected" is misleading. expected
has some fields where the operator manages the field and actually expects the field to have a certain value, as well as some fields where the operator is intended to ignore changes to the field and preserve the current value, whatever that value may be.
Ultimately, the operator should update the service if a managed field (for example, spec.internalTrafficPolicy
) does not have the expected value, and the operator should update that particular field. However, the operator should not update the service if only ignored fields (for example, nodePort
) changed, and the operator should not update an unmanaged field if the operator has to update the service because some managed field value needs an update.
e2e-gcp-operator succeeded, and the operator logs have the expected output:
|
Multiple CI jobs failed because |
t.Fatal(err) | ||
} | ||
|
||
findPort := func(name string, service *corev1.Service) *corev1.ServicePort { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this e2e test find/update the port on the IngressController, and check to make sure it is updated on the Service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestHostNetworkPortBinding
verifies that updating IngressController.spec.endpointPublishingStrategy.hostNetwork.statsPort
updates the deployment. TestReconcileInternalService
verifies that the "metrics" port exists, and that the operator is managing it. 🤔... Would it make sense to add a check to TestHostNetworkPortBinding
to verify that the service uses "metrics" (and not a number) as the target port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3a56b8d adds an assertion on the expected ports in TestHostNetworkPortBinding
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestHostNetworkPortBinding
would have prevented the original bug if it was testing the same thing as the bug report. It is testing assertContainerHasPort(t, routerContainer, "metrics", 9936)
. Doesn't it need to assert that the service now has the port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, maybe the new assertion for expected ports should have the new port numbers too? https://github.com/openshift/cluster-ingress-operator/pull/1052/files/cfba4d9ad778d01a6ab874682ad0c2d95f81eab0#diff-cf4b4e5424070a666a364d1d1b04011478d888d6d3d65c943ca7783333b7a6e4R1018-R1034
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestHostNetworkPortBinding
would have prevented the original bug if it was testing the same thing as the bug report. It is testingassertContainerHasPort(t, routerContainer, "metrics", 9936)
. Doesn't it need to assert that the service now has the port?
Is the assertion that I added what you expected?
Or, maybe the new assertion for expected ports should have the new port numbers too? https://github.com/openshift/cluster-ingress-operator/pull/1052/files/cfba4d9ad778d01a6ab874682ad0c2d95f81eab0#diff-cf4b4e5424070a666a364d1d1b04011478d888d6d3d65c943ca7783333b7a6e4R1018-R1034
The port changes on the pod, but the port doesn't need to change on the service if the service just references the target port by name instead of by number. That's the idea behind #864. The fault is that the update logic didn't actually work properly, and so the service would still have port 1936 instead of port "metrics".
t.Fatal("serviceMonitorChanged does not behave as a fixed-point function") | ||
} else { | ||
if updatedChanged, _ := serviceMonitorChanged(sm1, sm3); !updatedChanged { | ||
t.Error("serviceMonitorChangedChanged reported changes but did not make any update") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
t.Error("serviceMonitorChangedChanged reported changes but did not make any update") | |
t.Error("serviceMonitorChanged reported changes but did not make any update") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3a56b8d
to
3d06500
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you could add a test case for NodePort changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix internalServiceChanged to copy target ports from the expected service rather than preserving target ports from the current service. Only copy NodePort and other fields that we want to preserve. Before this commit, the internalServiceChanged function explicitly preserved the current service's ports' target ports when reconciling the service. However, the function was intended to update the target ports if they changed. In particular, if the current service specifies a numeric target port for the metrics port, then the operator should replace the port number with a port name so that the service uses the correct port even if the pod container's port number changes. Without this change, updating an IngressController's spec.endpointPublishingStrategy.hostNetwork.statsPort API field can break metrics as the service's metrics port's numeric target port may no longer match the pod container's metrics port. Follow-up to commit 44a60f5 and commit 9c6e368. This commit fixes OCPBUGS-23221 https://issues.redhat.com/browse/OCPBUGS-23221 * pkg/operator/controller/ingress/internal_service.go (internalServiceChanged): Fix logic that preserves the current service's fields not to preserve the ports' target ports and instead to preserve the same fields that we ignore with serviceCmpOpts. * pkg/operator/controller/ingress/internal_service_test.go (Test_internalServiceChanged): Verify that internalServiceChanged actually updates the service when it reports that the service has changed. Add a test case to verify that updates to NodePort are ignored. * test/e2e/operator_test.go (TestReconcileInternalService): New test to verify that the operator reconciles the internal service's metrics port's target port by setting a bogus target port and checking that the operator reverts the test's update. * test/e2e/all_test.go (TestAll): Add TestReconcileInternalService to parallel tests.
This is a follow-up to the previous commit for consistency. * pkg/operator/controller/canary/daemonset_test.go (Test_canaryDaemonsetChanged): * pkg/operator/controller/canary/namespace_test.go (Test_canaryNamespaceChanged): * pkg/operator/controller/canary/route_test.go (Test_canaryRouteChanged): * pkg/operator/controller/ingress/deployment_test.go (Test_deploymentConfigChanged): * pkg/operator/controller/ingress/load_balancer_service_test.go (Test_loadBalancerServiceChanged) (Test_loadBalancerServiceAnnotationsChanged): * pkg/operator/controller/ingress/monitoring_test.go (Test_serviceMonitorChanged): * pkg/operator/controller/ingress/namespace_test.go (Test_routerNamespaceChanged): * pkg/operator/controller/ingress/nodeport_service_test.go (Test_nodePortServiceChanged): * pkg/operator/controller/ingress/poddisruptionbudget_test.go (Test_podDisruptionBudgetChange): * pkg/operator/controller/ingressclass/ingressclass_test.go (Test_ingressClassChanged): Verify that the function made some update to the object if it reported that the object changed.
Fix the update logic in crdChanged and nodePortServiceChanged to preserve fields that are ignored when checking whether an update is necessary. Follow-up to commit 4a306cb, commit dd962af, commit 5579aa1, and commit d7ffd6e. * pkg/operator/controller/gatewayapi/crds.go (crdChanged): Preserve spec.Conversion. * pkg/operator/controller/ingress/nodeport_service.go (nodePortServiceChanged): Preserve spec.clusterIPs, spec.ipFamilies, and spec.ipFamilyPolicy.
Set explicit values for the liveness, readiness, and startup probes in the router deployment's yaml manifest. Previously, the router deployment yaml manifest omitted these values when the intended values were equal to the default values. This meant that the API server would set the default values for these parameters, and so the probes ultimately ended up with the intended parameter values. However, omitting these values in the yaml manifest also had the unintended side-effect of making the operator log bogus diffs for these parameters whenever it updated the deployment. After this commit, the log output for a deployment update has a less noisy diff. * pkg/manifests/assets/router/deployment.yaml: Specify explicit values for the failureThreshold, httpGet.scheme, periodSeconds, successThreshold, and timeoutSeconds parameters for the liveness, readiness, and startup probes. * pkg/operator/controller/ingress/deployment.go (copyProbe): Add a copyDefaultValues parameter. If this parameter is true, copyProbe will copy the value for a field even if is equal to the default value. (hashableProbe): Call copyProbe with copyDefaultValues set to false so that empty and default values result in the same hash. (deploymentConfigChanged): Call copyProbe with copyDefaultValues set to true so that default values get copied to the updated object gets diffed for logging. * pkg/operator/controller/ingress/deployment_test.go (checkProbes): New helper function to verify that the given container has the expected probes and probe parameters. (TestDesiredRouterDeploymentSpecAndNetwork): (TestDesiredRouterDeploymentVariety): Use checkProbes.
4d0f520
to
2f0e787
Compare
I realized that I was still missing |
/test e2e-aws-operator |
/test e2e-gcp-operator |
1 similar comment
/test e2e-gcp-operator |
I noticed your changes. I think this is causing an issue in a 4.12 bug that I'm looking into. Posted in slack. Update: we may need to backport to 4.12. |
Internal service and NodePort diffs I am seeing in 4.12:
|
Looks like nodes never became available:
/test e2e-gcp-ovn |
Regarding this failure in e2e-hypershift:
Slack thread discussing quota related issues for Hypershift: https://redhat-internal.slack.com/archives/C01CQA76KMX/p1715956957922319 |
Right, it makes sense to backport at least add17d7 to 4.12 as #864 was backported to 4.12.
Right, these should be fixed by d21abc8 and 1badfc6, so we can backport those too.
This should have been fixed by 5579aa1, which is proposed for backport to 4.13. If you're seeing OCPBUGS-13190 on 4.12, please add 4.12 to OCPBUGS-13190's "Affects Versions", and we can propose backporting #927 to 4.12 as well. The diffs in the ingress-operator logs from the latest e2e-gcp-operator run look really clean now. They are too large to paste in a comment, but you can review them using the following commands:
However, three anomalies jumped out at me. First, the LoadBalancer-type service for
The second update happens about 18 minutes after the first update. It also happens after the serial tests appear to have finished. I suspect it has something to do with the fact that The second anomaly that I noticed was this update from the
The "3h20m" value has to be from this update:
However, The third anomaly I noticed was this:
#1016 has a logic error, which causes the operator to log this message even when it didn't do an update: cluster-ingress-operator/pkg/operator/controller/ingress/controller.go Lines 1135 to 1145 in 009644a
(I've actually noticed this before, but I never got around to filing a bug report...). #1016 was backported to 4.15, so it would be nice to fix it and backport the fix to 4.15. It's just log noise, though, so it might not be justifiable to backport on its own. It is rather noisy, though, and it's trivial to fix, so if you like, I can bundle a fix for this log message into this PR. |
/label qe-approved tested it 4.16.0-0.ci.test-2024-05-20-061242-ci-ln-qyfl0pt-latest
|
/test e2e-hypershift |
@Miciah thanks for the thorough analysis and the bug numbers we should plan to backport to fix all the spurious updates seen in 4.12.40. At the moment I think we can safely ignore the logs from CI testing. I have a summary of all the spurious updates seen in 4.12.40 and will coordinate the backports of this bug and any other new one needed, including one to address #1016 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That is a lot of overlooked deployment updates.
@@ -1514,7 +1520,7 @@ func hashableProbe(probe *corev1.Probe) *corev1.Probe { | |||
|
|||
var hashableProbe corev1.Probe | |||
|
|||
copyProbe(probe, &hashableProbe) | |||
copyProbe(probe, &hashableProbe, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change will make a big difference.
@@ -1645,7 +1652,7 @@ func copyProbe(a, b *corev1.Probe) { | |||
Port: a.ProbeHandler.HTTPGet.Port, | |||
Host: a.ProbeHandler.HTTPGet.Host, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the entire function be gated by the copyDefaultValues
? I see here it still allows changes to HTTPGet and TerminationGracePeriod. In the future, how will developers know which values to allow changes, and which to prevent changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Port
is required, and the default value for Host
is the empty string, so I believe we can copy them unconditionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, if you wouldn't mind, can you add a comment like Values that are always updated
and group them, and another comment Values that are only updated when requested
and group them all under a
if copyDefaultValues {
. I feel like this may help/force future developers to make a decision about what to do with values in copyProbe
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 3faf391.
/test e2e-aws-ovn-single-node |
There have been a number of fixes for spurious updates since 4.12 (though not all of these have been confirmed to affect 4.12 generally or 4.12.40 specifically):
I don't understand what you mean by that. I think we should always be looking at the logs from CI testing, especially when we are modifying update logic, and most especially when we are specifically trying to address spurious updates. Do you mean that we should ignore the three issues that I mentioned in #1052 (comment) for now and not try to fix the "successfully updated Infra CR with Ingress Load Balancer IPs" log message or diagnose the other anomalies?
All right, I will hold off on fixing the "successfully updated Infra CR with Ingress Load Balancer IPs" issue until a separate bug report is filed. In my opinion, we should backport in chronological order (#879 first, then #927, then #947, and so on) in order to keep the history consistent among branches and avoid conflicts as much as possible, but we will encounter merge conflicts along the way in any case with some of these PRs (certainly with #1052). |
Yes, that is what I meant. |
Three bugs created for the anomalies listed in #1052 (comment) : |
* pkg/operator/controller/ingress/deployment.go (copyProbe): Group together the assignments for the case that copyDefaultValues is true.
@Miciah: This pull request references Jira Issue OCPBUGS-23221, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@Miciah: This pull request references Jira Issue OCPBUGS-23221, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@Miciah: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
internalServiceChanged:
Fix target port logicFix
internalServiceChanged
to copy target ports from the expected service rather than preserving target ports from the current service. Only copyNodePort
and other fields that we want to preserve.Before this PR, the
internalServiceChanged
function explicitly preserved the current service's ports' target ports when reconciling the service. However, the function was intended to update the target ports if they changed.In particular, if the current service specifies a numeric target port for the metrics port, then the operator should replace the port number with a port name so that the service uses the correct port even if the pod container's port number changes. Without this change, updating an IngressController's
spec.endpointPublishingStrategy.hostNetwork.statsPort
API field can break metrics as the service's metrics port's numeric target port may no longer match the pod container's metrics port.Follow-up to #864.
Test_*Changed
: Verify that some update was madeThis is a follow-up to the previous change for consistency.
crdChanged
,nodePortServiceChanged
: Fix preserve logicFix the update logic in
crdChanged
andnodePortServiceChanged
to preserve fields that are ignored when checking whether an update is necessary.Follow-up to #567, #890, and #927.
Set SessionAffinity to avoid confusing diffs
If a service does not specify
spec.sessionAffinity
, the API sets a default value of "None". This means that when creating or updating a service, the empty value and "None" are equivalent, and so when reconciling a service, cluster-ingress-operator treats these values as equal. This results in the correct update behavior: The operator ignores the API defaulting, but if someone sets any value other than the empty value or "None", the operator reverts the update; and if the operator leavesspec.sessionAffinity
empty when reverting the update, the API again defaults it to "None".However, even the updating behavior is correct, it can cause confusing log output when the operator logs the diff with the changes: If the operator has the empty value in the updated object but the current value is actually "None", this shows up in the diff, even though the actual update does not change the field value when taking defaulting into account.
This change adds an explicit
spec.sessionAffinity: None
setting to the services that the operator manages to avoid having the API defaulting show up in these diffs.Follow-up to #407.
Test_desiredLoadBalancerService
: Add assertionsAdd assertions for service
type
,externalTrafficPolicy
,internalTrafficPolicy
, andports
, for consistency withTest_desiredInternalIngressControllerService
.Test_desiredInternalIngressControllerService
: Fix godocFix capitalization of the test function name in the godoc.
TestHostNetworkPortBinding
: Uset.Log
,t.Cleanup
Update
TestHostNetworkPortBinding
to comply with current conventions. Convert some comments intot.Log
statements, replacedefer
witht.Cleanup
, and tweak some error messages.TestHostNetworkPortBinding
: Check service portsUpdate
TestHostNetworkPortBinding
to verify that the internal service has the expected ports. In particular, the metrics port should have its target port set to the name "metrics" rather than a port number.Test_deploymentConfigChanged
: Use k8s.io/utils/ptrReplace use of the deprecated "k8s.io/utils/pointer" package by using "k8s.io/utils/ptr" instead.
Set "service-ca-bundle" volume's default mode
Set an explicit value of 420 decimal (0644 octal) for the router deployment's "service-ca-bundle" volume's default mode.
Previously, the router deployment yaml manifest had
defaultMode
indented incorrectly, and so the operator did not actually set the intended value for the volume's default mode. Because the intended value is equal to the default value, the volume ultimately ends up with the intended default mode anyway. However, the fact that the operator did not specify the default mode made log output show a bogus diff for the volume when the operator updated the deployment.After this change, the yaml is correctly indented, and the log output for a deployment update has a less noisy diff.
Follow-up to #399.
Specify default mode on configmap/secret volumes
Set an explicit value of 420 decimal (0644 octal) for the default mode of all configmap or secret volumes in the router deployment that are defined in Go code (as opposed to being defined in the yaml manifest): client-ca, default-certificate, error-pages, metrics-certs, rsyslog-config, service-ca-bundle, and stats-auth.
After this change, the log output for a deployment update has a less noisy diff.
Specify parameter values for router probes
Set explicit values for the liveness, readiness, and startup probes in the router deployment's yaml manifest.
Previously, the router deployment yaml manifest omitted these values when the intended values were equal to the default values. This meant that the API server would set the default values for these parameters, and so the probes ultimately ended up with the intended parameter values. However, omitting these values in the yaml manifest also had the unintended side-effect of making the operator log bogus diffs for these parameters whenever it updated the deployment.
After this change, the log output for a deployment update has a less noisy diff.
copyProbe
: Refactor to group copying of defaultsGroup together the assignments for the case that
copyDefaultValues
is true.