Orchestrator version (e.g. kubernetes, mesos, rkt):
[root@master-0 ~]# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.12", GitCommit:"12031002905c0410706974560cbdf2dad9278919", GitTreeState:"clean", BuildDate:"2024-03-15T02:15:31Z", GoVersion:"go1.21.8", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.12", GitCommit:"12031002905c0410706974560cbdf2dad9278919", GitTreeState:"clean", BuildDate:"2024-03-15T02:06:14Z", GoVersion:"go1.21.8", Compiler:"gc", Platform:"linux/amd64"}
Operating System and version:
[root@master-0 ~]# uname -a
Linux master-0 4.19.90-52.33.v2207.ky10.x86_64 #1 SMP Fri Dec 22 17:04:59 CST 2023 x86_64 x86_64 x86_64 GNU/Linux
Testing process and results:
[root@worker-0 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: p1p1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master nm-bond state UP group default qlen 1000
link/ether 9c:c2:c4:55:f6:4a brd ff:ff:ff:ff:ff:ff
3: em1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 9c:c2:c4:5f:0f:aa brd ff:ff:ff:ff:ff:ff
4: p1p2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master nm-bond state UP group default qlen 1000
link/ether 9c:c2:c4:55:f6:4a brd ff:ff:ff:ff:ff:ff
5: p5p1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master nm-bond state UP group default qlen 1000
link/ether 9c:c2:c4:55:f6:4a brd ff:ff:ff:ff:ff:ff
6: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 9c:c2:c4:5f:0f:ab brd ff:ff:ff:ff:ff:ff
7: p5p2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master nm-bond state UP group default qlen 1000
link/ether 9c:c2:c4:55:f6:4a brd ff:ff:ff:ff:ff:ff
8: nm-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 9c:c2:c4:55:f6:4a brd ff:ff:ff:ff:ff:ff
inet 10.83.3.51/24 brd 10.83.3.255 scope global noprefixroute nm-bond
valid_lft forever preferred_lft forever
[root@worker-0 ~]# ethtool nm-bond
Settings for nm-bond:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 40000Mb/s
Duplex: Full
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Link detected: yes
[root@worker-0 ~]# ethtool -k nm-bond
Features for nm-bond:
rx-checksumming: off [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp-mangleid-segmentation: on
tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
[root@master-0 ~]# ip link show cali9d99c50de22
76833: cali9d99c50de22@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-e677d036-a8cf-3323-ce92-c0682de0a022
[root@master-0 ~]# ethtool -k cali9d99c50de22
Features for cali9d99c50de22:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: on
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp-mangleid-segmentation: on
tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: off [fixed]
tx-sctp-segmentation: on
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: on
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
Do you use VXLAN? Are the nodes in different subnets? I suspect offloading is turned off on vxlan.calico. It is turned off by default due to a kernel bug in older kernels, but we are turning it back on in 3.28. You can set "ChecksumOffloadBroken=true" in the FelixConfiguration's featureDetectOverride field; you would then need to restart the nodes. You can also turn it on manually using ethtool. Let us know if it helped.
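For reference, the two workarounds suggested above might look roughly like this. This is a sketch: the FelixConfiguration name "default" and the interface name "vxlan.calico" are the usual defaults and may differ in your cluster.

```shell
# Override Felix's checksum-offload feature detection, as suggested above
# (nodes need a restart for this to take effect).
kubectl patch felixconfiguration default --type merge \
  -p '{"spec":{"featureDetectOverride":"ChecksumOffloadBroken=true"}}'

# Or manually change TX checksum offload on the VXLAN device with ethtool,
# run on each node.
ethtool -K vxlan.calico tx on
```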
@tomastigera Yes, all nodes are connected to the same switch and VXLAN is not used; I have set the encapsulation to None. In subsequent testing I found that the total bandwidth reaches the expected value when using concurrency parameters (-P 10), but it is still only about half of the physical bandwidth in single-threaded scenarios.
I also tried using ethtool to disable rx-checksumming and tx-checksumming, but I didn't see any significant change.
I also tried starting two iperf3 containers on the same node and testing the same target simultaneously. I expected the results of the two iperf3 containers to add up to the physical bandwidth, but in fact each iperf3 result was lower. I can't see where the problem might be: with encapsulation: None, Calico only needs to maintain the local routing table and veth pairs, so there should not be such a large gap versus the physical network.
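For anyone reproducing the comparison above, the single-stream vs. parallel-stream runs were along these lines (the target address is a placeholder):

```shell
# Server side, in the target pod or on the target node:
iperf3 -s

# Client side, single stream for 30 seconds:
iperf3 -c <target-ip> -t 30

# Client side, 10 parallel streams (the case where aggregate
# bandwidth reached the expected value):
iperf3 -c <target-ip> -t 30 -P 10
```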
Expected Behavior
Under normal circumstances, Calico's network performance overhead should be within 10%, right?
Current Behavior
Calico's performance is below expectations and seems unstable. How can I further analyze the cause of the network degradation and resolve it?
The throughput loss when testing from a node to a pod is over 50%!
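Some generic starting points for narrowing down where single-stream throughput is lost (standard Linux tools, not Calico-specific; the target address is a placeholder):

```shell
# Watch per-CPU softirq load while the test runs; a single saturated
# core can bottleneck one stream even when -P 10 reaches line rate.
mpstat -P ALL 1

# Check the bond and the cali* veth for drops while testing.
ip -s link show nm-bond

# Inspect congestion window, RTT and retransmits of the test flow.
ss -ti dst <target-ip>
```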
Possible Solution
Have I missed any important kernel parameters or Calico configuration?
Steps to Reproduce (for bugs)
Context
1. node to node:
2. pod to node:
3. node to pod:
4. pod to pod:
Your Environment