Problems with coredns timeouts and pods DNS resolution with bpf.masquerade enabled #32489

pentago opened this issue May 12, 2024 · 7 comments
Labels
  • help-wanted: Please volunteer for this by adding yourself as an assignee!
  • info-completed: The GH issue has received a reply from the author
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • needs/triage: This issue requires triaging to establish severity and next steps.

Comments

@pentago

pentago commented May 12, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

After enabling bpf.masquerade=true, CoreDNS starts timing out and other pods can't resolve anything.

Cilium Version

Client: 1.15.4 9b3f9a8 2024-04-11T17:25:42-04:00 go version go1.21.9 linux/arm64
Daemon: 1.15.4 9b3f9a8 2024-04-11T17:25:42-04:00 go version go1.21.9 linux/arm64

Kernel Version

Linux dev-control-plane 6.6.26-linuxkit #1 SMP Sat Apr 27 04:13:19 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

Kubernetes Version

Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Regression

No response

Sysdump

cilium-sysdump-20240512-205923.zip

Relevant log output

[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:57283->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:38103->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:53718->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:33906->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:34466->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:60107->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:34493->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:41721->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:38282->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 7792796240999121637.6686654646417607282. HINFO: read udp 10.42.0.9:35967->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. A: read udp 10.42.0.9:43732->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. AAAA: read udp 10.42.0.9:45840->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. A: read udp 10.42.0.9:33932->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. AAAA: read udp 10.42.0.9:38568->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. AAAA: read udp 10.42.0.9:38284->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. A: read udp 10.42.0.9:45192->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. AAAA: read udp 10.42.0.9:34840->10.100.0.254:53: i/o timeout
[ERROR] plugin/errors: 2 google.com. A: read udp 10.42.0.9:32915->10.100.0.254:53: i/o timeout


Output from a random test pod:
nginx@test-5dd9d7b595-786r7:/$ curl google.com
curl: (6) Could not resolve host: google.com

I install Cilium with this:

helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --set cluster.name=$CLUSTER_NAME \
  --set kubeProxyReplacement=true \
  --set ipv4.enabled=true \
  --set ipv6.enabled=false \
  --set k8sServiceHost=$CLUSTER_NAME-control-plane \
  --set k8sServicePort=6443 \
  --set ipam.mode=cluster-pool \
  --set ipam.operator.clusterPoolIPv4PodCIDRList="10.42.0.0/16" \
  --set ipam.operator.clusterPoolIPv4MaskSize=24 \
  --set k8s.requireIPv4PodCIDR=true \
  --set autoDirectNodeRoutes=true \
  --set routingMode=native \
  --set endpointRoutes.enabled=true \
  --set ipv4NativeRoutingCIDR="10.0.0.0/8" \
  --set bpf.tproxy=true \
  --set bpf.preallocateMaps=true \
  --set bpf.hostLegacyRouting=false \
  --set bpf.masquerade=true \
  --set enableIPv4Masquerade=true \
  --set encryption.enabled=true \
  --set encryption.type=wireguard \
  --set encryption.nodeEncryption=true \
  --set encryption.strictMode.enabled=true \
  --set encryption.strictMode.cidr="10.0.0.0/8" \
  --set encryption.strictMode.allowRemoteNodeIdentities=true \
  --set rollOutCiliumPods=true \
  --set operator.rollOutPods=true

cilium status output:

root@dev-worker2:/home/cilium# cilium status
KVStore:                 Ok   Disabled
Kubernetes:              Ok   1.29 (v1.29.2) [linux/arm64]
Kubernetes APIs:         ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    True   [eth0    172.18.0.2 fc00:f853:ccd:e793::2 fe80::42:acff:fe12:2 (Direct Routing)]
Host firewall:           Disabled
SRv6:                    Disabled
CNI Chaining:            none
Cilium:                  Ok   1.15.4 (v1.15.4-9b3f9a8c)
NodeMonitor:             Listening for events on 8 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok
IPAM:                    IPv4: 2/254 allocated from 10.42.2.0/24,
IPv4 BIG TCP:            Disabled
IPv6 BIG TCP:            Disabled
BandwidthManager:        Disabled
Host Routing:            BPF
Masquerading:            BPF   [eth0]   10.0.0.0/8 [IPv4: Enabled, IPv6: Disabled]
Controller Status:       18/18 healthy
Proxy Status:            OK, ip 10.42.2.223, 0 redirects active on ports 10000-20000, Envoy: embedded
Global Identity Range:   min 256, max 65535
Hubble:                  Ok              Current/Max Flows: 137/4095 (3.35%), Flows/s: 1.83   Metrics: Disabled
Encryption:              Wireguard       [NodeEncryption: Enabled, cilium_wg0 (Pubkey: vQfrUsFvKKYFvplB8kScoY0EAl5F6YLRYkYB/DbILnw=, Port: 51871, Peers: 2)]
Cluster health:          3/3 reachable   (2024-05-12T19:12:44Z)
Modules Health:          Stopped(0) Degraded(0) OK(11) Unknown(3)

Anything else?

Everything works fine until bpf.masquerade is enabled.
That setting alone triggers the issue, as I've tried a number of different configurations.
My environment is the latest kind cluster running on Docker for Mac.
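(The kind cluster definition isn't part of this report; a minimal sketch of what a matching config typically looks like -- default CNI and kube-proxy disabled so Cilium's kube-proxy replacement can take over, one control-plane plus two workers as in the cluster health output -- would be:)

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true   # Cilium is installed as the CNI afterwards
  kubeProxyMode: "none"     # matches kubeProxyReplacement=true in the Helm values
nodes:
  - role: control-plane
  - role: worker
  - role: worker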

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct
@pentago added the kind/bug, kind/community-report, and needs/triage labels May 12, 2024
@squeed
Contributor

squeed commented May 15, 2024

I'm not able to reproduce this. I installed a kind cluster with bpf.masquerade and it works as expected.

Did you try changing this setting on a running cluster, or was it from scratch?

@squeed added the need-more-info (More information is required to further debug or fix the issue.) label May 15, 2024
@pentago
Author

pentago commented May 15, 2024

I create a cluster from scratch each time.

In your tests, are you able to resolve anything from a test pod, like the Alpine packages repo?
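(For reference, a quick way to test that from a throwaway pod -- a sketch; the image and hostname are just examples, dl-cdn.alpinelinux.org being the Alpine package mirror:)

kubectl run dns-test --rm -it --restart=Never --image=alpine:3.19 -- \
  nslookup dl-cdn.alpinelinux.org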

@github-actions bot added the info-completed label and removed the need-more-info label May 15, 2024
@squeed
Contributor

squeed commented May 15, 2024

I tried with your exact setup -- except on Linux -- and it worked perfectly. There must be some kind of strange discrepancy -- maybe the Mac is the issue?

One strange thing I see is this line in cilium-dbg status:

Encryption:                           Wireguard       [NodeEncryption: OptedOut, cilium_wg0 (Pubkey: XXX, Port: 51871, Peers: 2)]

whereas on my cluster, I see

Encryption:              Wireguard       [NodeEncryption: Enabled, cilium_wg0 (Pubkey: XXXX, Port: 51871, Peers: 1)]

Not sure if that's potentially an issue. What happens if you disable encryption?
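(For what it's worth, trying that against the chart values used above would just mean flipping the encryption flags -- a sketch, assuming the rest of the install is kept via --reuse-values:)

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set encryption.enabled=false \
  --set encryption.nodeEncryption=false \
  --set encryption.strictMode.enabled=false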

@pentago
Author

pentago commented May 15, 2024

I noticed the encryption status come and go as I make changes to the values file and apply them with helm upgrade. By default it's enabled and works fine.

I suspect the issue is a Mac thing as well, I'm just not sure how to debug it.
I guess the setup is much more complex on Macs than on Linux because of Docker Desktop's underlying VM. It would be great to have some documentation covering that test case.

@squeed
Contributor

squeed commented May 16, 2024

Yeah, at the end of the day, Docker on Mac is not really a supported platform; it's useful for development -- and many Cilium developers use it! But I'm not sure that we have the expertise to dig into these sorts of issues.

@squeed added the help-wanted label May 16, 2024
@pentago
Author

pentago commented May 16, 2024

So last night I made some progress.
Apparently BPF masquerading works, but only if routing is changed from native to tunnel mode.
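(In Helm terms that change would presumably look like this -- a sketch based on the install command above, with vxlan assumed as the tunnel protocol:)

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set routingMode=tunnel \
  --set tunnelProtocol=vxlan \
  --set autoDirectNodeRoutes=false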

Let's say my setup has:

  • Docker subnet CIDR: 10.100.0.0/24
  • nodes CIDR: 172.18.0.0/24
  • pods CIDR: 10.42.0.0/16
  • services CIDR: 10.43.0.0/16

What would be the correct value for ipv4NativeRoutingCIDR?

Perhaps that's causing the issue on my end.

Most importantly, are bpf.masquerade and routingMode=native supposed to be used together?
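(For reference, a minimal sketch of the values in question, using the CIDRs listed above -- whether the pod CIDR or the broader 10.0.0.0/8 from the original install is the right choice is exactly the open question here:)

routingMode: native
autoDirectNodeRoutes: true
bpf:
  masquerade: true
ipv4NativeRoutingCIDR: "10.42.0.0/16"   # pods CIDR; the original install above used "10.0.0.0/8"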

@pentago
Author

pentago commented May 16, 2024

So I ran into this article where, apparently, the CoreDNS ConfigMap needs to have a fixed nameserver instead of relying on /etc/resolv.conf (not sure why though).

After I tried this, there were no CoreDNS timeout errors and traffic flowed as expected.
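(The article isn't linked above, but the change presumably amounts to editing the Corefile in the coredns ConfigMap along these lines -- a sketch, with arbitrary public resolvers standing in for whatever fixed nameserver was actually used:)

.:53 {
    # ... other default plugins (errors, health, kubernetes, cache, ...) left unchanged ...
    # forward . /etc/resolv.conf        # default: follows the node's resolv.conf, i.e. the Docker-internal 10.100.0.254 seen in the logs
    forward . 1.1.1.1 8.8.8.8           # fixed upstream nameservers instead
}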

I used this config:

cluster:
  name: dev

kubeProxyReplacement: true
ipv4:
  enabled: true
ipv6:
  enabled: false

k8sServiceHost: dev-control-plane
k8sServicePort: 6443

ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList: "10.42.0.0/16"  # Pods CIDR
    clusterPoolIPv4MaskSize: 24

k8s:
  requireIPv4PodCIDR: true

autoDirectNodeRoutes: true
routingMode: native
endpointRoutes:
  enabled: true

ipv4NativeRoutingCIDR: "10.42.0.0/16"  # Pods CIDR

bpf:
  tproxy: true
  preallocateMaps: true
  hostLegacyRouting: false
  masquerade: true

ipMasqAgent:
  enabled: true
  config:
    nonMasqueradeCIDRs:
      - 10.42.0.0/16 # Pods CIDR

enableIPv4Masquerade: true

encryption:
  enabled: true
  type: wireguard
  nodeEncryption: true
  strictMode:
    enabled: true
    cidr: "10.42.0.0/16"  # Pods CIDR
    allowRemoteNodeIdentities: true

externalIPs:
  enabled: true

nodePort:
  enabled: true

hostPort:
  enabled: true

hubble:
  enabled: true
  relay:
    enabled: true
    rollOutPods: true
  ui:
    enabled: true
    rollOutPods: true

rollOutCiliumPods: true
operator:
  rollOutPods: true
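(For completeness, applying a values file like this instead of individual --set flags -- assuming it's saved as values.yaml -- is just:)

helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --values values.yaml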

Other than this, I'd really appreciate a heads-up about any conflicts or misconfigurations in the CIDRs I used in the chart values that I'm not aware of.
