Fix socket cpu migration #2076
base: master
When we use all existing CPUs and set RPS, this is OK. The CPU will change only once, and only for loopback, because for loopback we set skb_hash in tcp_make_synack and this skb returns to us. That is not a problem, because the CPU will later be calculated correctly in netif_rx_skb_internal for all other skbs.
functionfor
typo
scripts/tfw_lib.sh
Outdated
@@ -92,19 +91,83 @@ irqbalance_ban_irqs()
systemctl restart irqbalance.service >/dev/null
}

# This function prepares cpu mask for RSS and RPS.
# It takes into account that we can't calculate
# value, which is greater when (1 << 63) and cam't
cam't
typo
Fixed
Do not review the second commit. I will move it to a separate PR.
LGTM, please don't forget to remove the second commit.
LGTM
scripts/tempesta.sh
Outdated
# and broken HTTP1. Some network interfaces have some strange suffix
# like @if14, and we should remove it from device name.
declare devs=$(ip addr show up | grep -P '^[0-9]+' | awk '{ sub(/:/, "", $2); print $2}' \
| awk '{ split($0,a,"@"); print a[1] }')
awk is a nice and powerful language, so this can be done in one awk program
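One way to fold the pipeline above into a single awk program is sketched below. The function name `list_up_devs` is hypothetical and not part of the patch; it reads `ip addr show up`-style text on stdin so it can be tried on sample input without touching real interfaces.

```shell
# Hypothetical helper (not from the patch): read `ip addr show up` output
# on stdin and print one interface name per line, without the trailing
# colon and without any "@..." suffix such as @if14.
list_up_devs() {
    awk '/^[0-9]+:/ {
        name = $2
        sub(/:$/, "", name)   # drop the trailing colon
        sub(/@.*/, "", name)  # drop a suffix such as @if14
        print name
    }'
}

# Usage: ip addr show up | list_up_devs
```

Only lines that start with an interface index match the pattern, so the separate `grep -P` step is no longer needed.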
@@ -46,12 +46,12 @@ tls_mod=tempesta_tls
tdb_mod=tempesta_db
Please update the copyright year
Fixed
Socket CPU migration can lead to two problems: performance degradation and response reordering, which leads to broken HTTP1. Previously we used RSS and RPS to prevent it, but there were several problems in our scripts:
- we excluded loopback interfaces from the setup, because we didn't take the response reordering problem into account.
- we didn't take into account that some interfaces have a suffix like @if14, which we should remove from the device name in our scripts.
- we didn't try to set up combined RSS queues, only RX queues, but there are a lot of cases when a network interface has only combined queues.
- we didn't take into account overflow when calculating 1 << x for x greater than or equal to 64.
- we didn't take into account overflow when writing a value greater than (1 << 32) - 1 to rps_cpus when setting up RPS.
- we didn't set up RPS for a network interface if RSS setup failed.
- we didn't ban IRQs for irqbalance for each network device immediately. With a lot of devices there is a big race between setting RSS for the first device and banning its IRQs. This race is enough for the irqbalance daemon to change our settings.
This patch fixes all these problems. Closes #2075
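The two overflow items can be illustrated with a minimal sketch that builds an all-CPUs mask without ever shifting by 32 or more. It assumes the kernel's CPU bitmap format of comma-separated 32-bit hex words, most significant word first. The function name `cpus_mask_all` is hypothetical and is not the function added by this patch.

```shell
# Hypothetical helper (not from the patch): print a hex mask with the
# lowest $1 bits set, split into comma-separated 32-bit words, most
# significant word first.
cpus_mask_all() {
    local full rest mask
    full=$(( $1 / 32 ))   # number of completely filled 32-bit words
    rest=$(( $1 % 32 ))   # bits in the partial high word
    mask=""
    # The shift count here is always below 32, so it cannot overflow.
    if [ "$rest" -gt 0 ]; then
        mask=$(printf '%x' $(( (1 << rest) - 1 )))
    fi
    while [ "$full" -gt 0 ]; do
        mask="${mask:+$mask,}ffffffff"
        full=$(( full - 1 ))
    done
    echo "${mask:-0}"
}
```

For example, `cpus_mask_all 40` prints `ff,ffffffff`, a value that does not fit into a single 32-bit word.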
Socket CPU migration can lead to two problems:
performance degradation and response reordering,
which leads to broken HTTP1.
Previously we used RSS and RPS to prevent it, but
there were several problems in our scripts:
- we excluded loopback interfaces from the setup, because
  we didn't take the response reordering problem into account.
- we didn't take into account that some interfaces
  have a suffix like @if14, which we should remove
  from the device name in our scripts.
- we didn't try to set up combined RSS queues, only
  RX queues, but there are a lot of cases when a network
  interface has only combined queues.
- we didn't take into account overflow when calculating
  1 << x for x greater than or equal to 64.
- ... as $(perl -le 'printf("%x", (1 << '$CPUS_N') - 1)').
  But in this case we use all CPUs, not only one.
- we didn't set up RPS for a network interface if RSS setup
  fails.
This patch fixes all these problems.
Closes #2075
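For the case where only one CPU should be selected, a mask with a single bit set can be built the same overflow-safe way. This is a sketch under the assumption that the target file accepts comma-separated 32-bit hex words, most significant word first. The name `cpu_mask_one` is hypothetical and does not come from the patch.

```shell
# Hypothetical helper (not from the patch): print a hex mask with only
# bit $1 set, as comma-separated 32-bit words, most significant first.
cpu_mask_one() {
    local word bit mask
    word=$(( $1 / 32 ))   # index of the 32-bit word that holds the bit
    bit=$(( $1 % 32 ))    # bit position inside that word, always < 32
    mask=$(printf '%x' $(( 1 << bit )))
    # Append one all-zero word for every lower 32-bit word.
    while [ "$word" -gt 0 ]; do
        mask="$mask,00000000"
        word=$(( word - 1 ))
    done
    echo "$mask"
}
```

Because the shift count never reaches 32, the result stays correct for CPU indexes of 64 and above: `cpu_mask_one 64` prints `1,00000000,00000000`.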