Strange behavior when using NodePort with clustermesh #24692

Open
seb-lafond opened this issue Apr 3, 2023 · 7 comments
Labels
area/clustermesh: Relates to multi-cluster routing functionality in Cilium.
kind/bug: This is a bug in the Cilium logic.
kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
pinned: These issues are not marked stale by our issue bot.
sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.

Comments


seb-lafond commented Apr 3, 2023

Is there an existing issue for this?

#21261

  • I have searched the existing issues

What happened?

Based on a Slack discussion: https://cilium.slack.com/archives/C53TG4J4R/p1680172555618329

Not sure if this is an issue or normal behavior. We installed two clusters in tunnel mode (let's call them DC1 and DC2) with Cluster Mesh activated and running. No global services are configured. Each cluster has a NodePort Service listening on port 30030. From a DC1 node or pod, when trying to communicate with DC2 via node01-dc2.lab.it:30030, we always get a response from DC1, as if the DC1 node were intercepting the request. We do not see this behavior when Cluster Mesh is not running.

For example, with Rebel Base:
curl node01-dc1.lab.it:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC1"}
curl node01-dc2.lab.it:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC1"}

What we expected instead:
curl node01-dc1.lab.it:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC1"}
curl node01-dc2.lab.it:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC2"}

Outside the cluster, normal behavior is observed when reaching DC1 or DC2.

Cilium Version

Client: 1.13.0 c9723a8 2023-02-15T14:18:31+01:00 go version go1.19.6 linux/amd64
Daemon: 1.13.0 c9723a8 2023-02-15T14:18:31+01:00 go version go1.19.6 linux/amd64

Kernel Version

Linux 5.10.0-20-amd64 #1 SMP Debian 5.10.158-2 (2022-12-13) x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.1", GitCommit:"8f94681cd294aa8cfd3407b8191f6c70214973a4", GitTreeState:"clean", BuildDate:"2023-01-18T15:51:25Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

cilium-sysdump-20230404-145836.zip

Relevant log output

No response

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@seb-lafond seb-lafond added kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Apr 3, 2023
@squeed squeed added area/clustermesh Relates to multi-cluster routing functionality in Cilium. sig/agent Cilium agent related. labels Apr 4, 2023
squeed (Contributor) commented Apr 4, 2023

That doesn't sound good!
Would it be possible for you to upload a sysdump from an affected cluster?

@squeed squeed added the need-more-info More information is required to further debug or fix the issue. label Apr 4, 2023
@rafernandez commented:

Hello,

I have exactly the same problem.
To be sure, I created two Kubernetes clusters from a standard installation via Vagrant, with Cluster Mesh activated and running.

OS version: Ubuntu 22.04
Kernel version: 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Cilium version: v1.13.1
Kubernetes version: v1.23.17 (via kubeadm)
Topology: 2 clusters (DC1 and DC2), each with 1 master and 1 node

master-node-dc1
worker-node-dc1
master-node-dc2
worker-node-dc2

Cilium was installed in a standard way:

DC1:

helm upgrade -i cilium cilium/cilium --version 1.13.1 --namespace kube-system --set "ipam.operator.clusterPoolIPv4PodCIDR=21.0.0.0/16" --set "cluster.name=cluster-dc1" --set k8sServiceHost=192.168.62.10 --set k8sServicePort=6443 --set "cluster.id=1" --set "kubeProxyReplacement=strict" 

DC2 :

helm upgrade -i cilium cilium/cilium --version 1.13.1 --namespace kube-system --set "ipam.operator.clusterPoolIPv4PodCIDR=31.0.0.0/16" --set "cluster.name=cluster-dc2" --set k8sServiceHost=192.168.63.10 --set k8sServicePort=6443 --set "cluster.id=2" --set "kubeProxyReplacement=strict"

I deployed the rebel-base pod on each cluster with a NodePort Service listening on port 30030.
I see the same thing when I try to reach port 30030 on DC2 from a node or a pod in DC1:

root@master-node-dc1:~# curl http://worker-node-dc1:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC1"}
root@master-node-dc1:~# curl http://worker-node-dc2:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC1"}

Outside the cluster (my workstation), it works fine:

~$ curl 192.168.62.10:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC1"}
~$ curl 192.168.63.10:30030
{"Galaxy": "Alderaan", "Cluster": "Cluster-DC2"}

Attached is the Cilium sysdump: cilium-sysdump-20230404-162849.zip

Thank you in advance for your help.

giorio94 (Member) commented Apr 6, 2023

I just stumbled upon the same issue, which occurs only when NodePort handling by the kube-proxy replacement (KPR) is enabled (otherwise cross-cluster NodePort services are not supported).

@giorio94 giorio94 removed the need-more-info More information is required to further debug or fix the issue. label Apr 6, 2023
@squeed squeed removed the needs/triage This issue requires triaging to establish severity and next steps. label Apr 6, 2023
giorio94 (Member) commented Apr 21, 2023

I've had a fresh look at this issue, and my understanding is that it relates to how NodePorts are handled for socket load balancing (the issue does not manifest when targeting the NodePort from outside the cluster). More specifically, when the dst port is in the NodePort range, we check whether the dst address belongs to a local or remote node [1], and in that case we look up the service with the wildcard (0.0.0.0) address [2]. Hence, if the given NodePort service exists in the local cluster, we will always target one of the local backends, even if the target address is that of a remote node.
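
For reference, here's a simplified paraphrase of that datapath logic (names and signatures only approximate bpf/bpf_sock.c around [1] and [2]; this is not the verbatim source):

/* Simplified paraphrase of the socket-LB wildcard NodePort lookup;
 * approximate names, not the verbatim source. */
static struct lb4_service *sock4_wildcard_lookup(struct lb4_key *key)
{
	struct remote_endpoint_info *info;
	__u16 dport = bpf_ntohs(key->dport);

	/* Only destination ports within the NodePort range qualify. */
	if (dport < NODEPORT_PORT_MIN || dport > NODEPORT_PORT_MAX)
		return NULL;

	/* [1] Is the destination address a node? The ipcache only
	 * distinguishes the local host from "some remote node"; it does
	 * not say which cluster a remote node belongs to. */
	info = ipcache_lookup4(&IPCACHE_MAP, key->address, V4_CACHE_KEY_LEN);
	if (!info || (info->sec_label != HOST_ID &&
		      info->sec_label != REMOTE_NODE_ID))
		return NULL;

	/* [2] Replace the node address with the wildcard 0.0.0.0 before
	 * the service lookup. If the same NodePort exists in the local
	 * cluster, this matches the local service even when key->address
	 * was a node of the remote cluster, which is the bug reported
	 * above. */
	key->address = 0;
	return lb4_lookup_service(key, true);
}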

One challenge that I see with fixing this is that, AFAIK, at the moment we do not have any way to tell in the datapath whether a given node belongs to the local or a remote cluster. Even knowing that, though, it is not 100% clear to me what the correct approach would be. My intuition is that a reasonable tradeoff could be to not perform the load-balancing decision locally when the destination is a node in a remote cluster, since a NodePort service for that port might not exist on the remote cluster, or might have different backends (hence matching the behavior when reaching that IP:port from outside the cluster); see the sketch below.
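
To make that tradeoff concrete, a purely hypothetical sketch could look like the following. Note that the cluster_id field on ipcache entries and the LOCAL_CLUSTER_ID constant do not exist today (as said, the datapath currently cannot tell local from remote-cluster nodes); they are assumed here only for illustration:

/* Hypothetical variant: skip the local load-balancing decision when the
 * destination node belongs to a remote cluster. cluster_id and
 * LOCAL_CLUSTER_ID are assumed for illustration and do NOT exist today. */
static struct lb4_service *sock4_wildcard_lookup_sketch(struct lb4_key *key)
{
	struct remote_endpoint_info *info;
	__u16 dport = bpf_ntohs(key->dport);

	if (dport < NODEPORT_PORT_MIN || dport > NODEPORT_PORT_MAX)
		return NULL;

	info = ipcache_lookup4(&IPCACHE_MAP, key->address, V4_CACHE_KEY_LEN);
	if (!info || (info->sec_label != HOST_ID &&
		      info->sec_label != REMOTE_NODE_ID))
		return NULL;

	/* A node in a remote cluster is left untranslated, so the
	 * connection is routed to that node and load-balanced there,
	 * matching what is observed from outside the cluster. */
	if (info->sec_label == REMOTE_NODE_ID &&
	    info->cluster_id != LOCAL_CLUSTER_ID)
		return NULL;

	key->address = 0;
	return lb4_lookup_service(key, true);
}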

One final note is that this issue also affects the clustermesh-apiserver NodePort service with the default Helm values configuration (since the NodePort is set to a fixed value there), when kubeProxyReplacement is set to strict. While the first connection to the remote kvstore works properly, reconnections will be redirected to the local kvstore rather than the correct one (I've opened #25033 to add a warning in the Helm values file). The cilium CLI, instead, does not explicitly configure a NodePort, which strongly reduces the likelihood that this happens.

/cc @cilium/sig-datapath

[1]: https://github.com/cilium/cilium/blob/9c06b258463ab5629d16d1ce6b9ecaa5b1e391d7/bpf/bpf_sock.c#L198-L201
[2]: https://github.com/cilium/cilium/blob/9c06b258463ab5629d16d1ce6b9ecaa5b1e391d7/bpf/bpf_sock.c#L205-L206

@giorio94 giorio94 added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. and removed sig/agent Cilium agent related. labels Apr 21, 2023
giorio94 added a commit to giorio94/cilium that referenced this issue Apr 21, 2023
Cilium is currently affected by a known bug (cilium#24692) when NodePorts are
handled by the KPR implementation, which occurs when the same NodePort
is used both in the local and the remote cluster. This causes all
traffic targeting that NodePort to be redirected to a local backend,
regardless of whether the destination node belongs to the local or the
remote cluster. This affects also the clustermesh-apiserver NodePort
service, which is configured by default with a fixed port. Hence, let's
add a warning message to the corresponding values file setting.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@github-actions

This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jun 21, 2023
@giorio94 giorio94 removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jun 21, 2023
@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Oct 21, 2023
@giorio94 giorio94 added pinned These issues are not marked stale by our issue bot. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Oct 21, 2023