Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gateway-api: Race condition between routes and Gateway #25573

Merged
merged 1 commit into from
May 26, 2023

Conversation

sayboras
Copy link
Member

There was a race condition between http/tls routes and the underlying Gateway resource. Basically, if the HTTP or TLS Route resource is created before Gateway resource, its status will be updated with the value gateway not found. However, the reconciliation for HTTP or TLS routes is not triggered even after mentioned Gateway is created. The issue is due to an unnecessary predicate on GenerationChangedPredicate, which only checks for changes in spec.

This commit is to remove GenerationChangedPredicate predicate in HTTP and TLS routes.

Status for gateway not found

{
  "lastTransitionTime": "2023-05-22T05:24:01Z",
  "message": "Gateway.gateway.networking.k8s.io \"gateway-tlsroute\" not found",
  "observedGeneration": 1,
  "reason": "InvalidTLSRoute",
  "status": "False",
  "type": "Accepted"
}

Testing was done before and after the changes with below step:

  • Create one TLSRoute with a non-existent Gateway. This TLSRoute status should say gateway not found
  • Create mentioned Gateway Resource with a valid GatewayClass
  • Verify the TLSRoute status, which should be updated to Accepted.

Fixes: #25487

@maintainer-s-little-helper
Copy link

Commit 6d958fc44bc78fee2fe108c9214953f99e96b866 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels May 22, 2023
@maintainer-s-little-helper
Copy link

Commit 6d958fc44bc78fee2fe108c9214953f99e96b866 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@sayboras sayboras added release-note/bug This PR fixes an issue in a previous release of Cilium. and removed dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. labels May 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 22, 2023
@sayboras sayboras added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels May 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.13.3 May 22, 2023
@sayboras sayboras marked this pull request as ready for review May 22, 2023 08:39
@sayboras sayboras requested a review from a team as a code owner May 22, 2023 08:39
@sayboras sayboras requested a review from youngnick May 22, 2023 08:39
@sayboras
Copy link
Member Author

sayboras commented May 23, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathConfig High-scale IPcache Test ingress policy enforcement

Failure Output

FAIL: Expected command: kubectl exec -n kube-system cilium-2cs6m -- bpftool map update pinned /sys/fs/bpf/tc/globals/cilium_world_cidrs4 key 0 0 0 0 0 0 0 0 value 1 

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/83/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@sayboras
Copy link
Member Author

/test-runtime

@sayboras
Copy link
Member Author

sayboras commented May 24, 2023

/test-1.26-net-next

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathConfig High-scale IPcache Test ingress policy enforcement

Failure Output

FAIL: Expected command: kubectl exec -n kube-system cilium-4fbbg -- bpftool map update pinned /sys/fs/bpf/tc/globals/cilium_world_cidrs4 key 0 0 0 0 0 0 0 0 value 1 

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/130/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@sayboras
Copy link
Member Author

Re-run as the report is not accessible.

@sayboras
Copy link
Member Author

hitting the flake #25605, all other CI are ✔️ , required review is in.

Marking this ready to merge.

@sayboras sayboras added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 24, 2023
@squeed
Copy link
Contributor

squeed commented May 25, 2023

@sayboras needs rebase

@squeed squeed added dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels May 25, 2023
There was a race condition between http/tls routes and the underlying
Gateway resource. Basically, if the HTTP or TLS Route resource is created
before Gateway resource, its status will be updated with the value
`gateway not found`. However, the reconciliation for HTTP or TLS routes
is not triggered even after mentioned Gateway is created. The issue is due
to an unnecessary predicate on GenerationChangedPredicate, which only
checks for changes in spec.

This commit is to remove GenerationChangedPredicate predicate in HTTP and
TLS routes.

Status for gateway not found

```json
{
  "lastTransitionTime": "2023-05-22T05:24:01Z",
  "message": "Gateway.gateway.networking.k8s.io \"gateway-tlsroute\" not found",
  "observedGeneration": 1,
  "reason": "InvalidTLSRoute",
  "status": "False",
  "type": "Accepted"
}
```

Testing was done before and after the changes with below step:

- Create one TLSRoute with a non-existent Gateway. This TLSRoute status should say `gateway not found`
- Create mentioned Gateway Resource with a valid GatewayClass
- Verify the TLSRoute status, which should be updated to Accepted.

Fixes: cilium#25487
Signed-off-by: Tam Mach <tam.mach@cilium.io>
@sayboras sayboras removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label May 25, 2023
@sayboras
Copy link
Member Author

sayboras commented May 25, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing With host policy Tests NodePort

Failure Output

FAIL: Can not connect to service "tftp://192.168.56.11:31285/hello" from outside cluster (1/10)

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/179/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@sayboras
Copy link
Member Author

Hit another flake #25524, marking this ready to merge after rebase.

@sayboras sayboras added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 25, 2023
@sayboras
Copy link
Member Author

/test-1.26-net-next

@sayboras sayboras merged commit 8507ecd into cilium:main May 26, 2023
58 checks passed
@sayboras sayboras deleted the tam/flake-conformance-api branch May 26, 2023 06:51
@sayboras sayboras mentioned this pull request May 28, 2023
10 tasks
@sayboras sayboras added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels May 28, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport done to v1.13 in 1.13.3 Jun 2, 2023
mhofstetter added a commit to mhofstetter/cilium that referenced this pull request Nov 6, 2023
This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: cilium#25573 & cilium#25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
mhofstetter added a commit to mhofstetter/cilium that referenced this pull request Nov 8, 2023
This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: cilium#25573 & cilium#25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
squeed pushed a commit that referenced this pull request Nov 10, 2023
This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: #25573 & #25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
gandro pushed a commit to gandro/cilium that referenced this pull request Nov 15, 2023
[ upstream commit 30d3a6f ]

This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: cilium#25573 & cilium#25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
sujoshua pushed a commit to sujoshua/cilium that referenced this pull request Nov 16, 2023
This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: cilium#25573 & cilium#25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
gandro pushed a commit that referenced this pull request Nov 16, 2023
[ upstream commit 30d3a6f ]

This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: #25573 & #25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
pjablonski123 pushed a commit to pjablonski123/cilium that referenced this pull request Dec 15, 2023
This commit adds the missing reference-grant watch to the TLS route controller.

Fixes: cilium#25573 & cilium#25711

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.13.3
Backport done to v1.13
Development

Successfully merging this pull request may close these issues.

CI: ConformanceGatewayAPI - TestConformance/TLSRouteSimpleSameNamespace failed
3 participants