This is a duplicate of #3611, but since that issue has a long history of comments and had gone stale, I decided to create a new one with updated argumentation.
One of the more important changes in the SM 3.2 repair was sticking to the one-job-per-host rule. I believe that for a bigger cluster this can kill any parallelism at the node level. Let's analyze a big cluster: 2 DCs with 30 nodes each, where all nodes have `max_repair_ranges_in_parallel = 7`. By default, each keyspace in such a cluster consists of 60 * 256 = 15360 token ranges (256 vnodes per node). Assuming that the keyspace has replication `{'dc1': 3, 'dc2': 3}`, there are (30! / (3! * 27!))^2 = 4060^2 = 16,483,600 possible replica sets. Assuming that token ranges are distributed uniformly across all possible replica sets, it is rather unlikely that a single repaired replica set owns more than one token range (the expected count is 15360 / 16483600 ≈ 0.001). Combined with the fact that SM sends a repair job for only a single replica set at a time, this results in SM sending only a single token range per repair job despite `max_repair_ranges_in_parallel = 7`.
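For a quick sanity check of the arithmetic above, here is a minimal Go sketch (standalone numbers only, not SM code) that reproduces the counts for the assumed topology of 2 DCs x 30 nodes, 256 vnodes per node, and RF 3 per DC:

```go
package main

import (
	"fmt"
	"math/big"
)

func main() {
	// Assumed topology: 2 DCs x 30 nodes, 256 vnodes per node, RF 3 per DC.
	const nodesPerDC, vnodes, rf = 30, 256, 3

	tokenRanges := 2 * nodesPerDC * vnodes // 60 * 256 = 15360

	// C(30, 3) replica-set choices per DC, squared for the two DCs.
	perDC := new(big.Int).Binomial(nodesPerDC, rf) // 4060
	replicaSets := new(big.Int).Mul(perDC, perDC)  // 16483600

	fmt.Println("token ranges:", tokenRanges)
	fmt.Println("possible replica sets:", replicaSets)

	// Expected token ranges per replica set under a uniform distribution:
	// 15360 / 16483600 ≈ 0.00093, i.e. almost every repaired replica set
	// owns at most one token range.
	expected := float64(tokenRanges) / float64(replicaSets.Int64())
	fmt.Printf("expected ranges per replica set: %.5f\n", expected)
}
```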
This behavior could be controlled by an additional flag or a repair config option in `scylla-manager.yaml`, for example as sketched below.
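Purely as an illustration of what such an option could look like (no such option exists today; the name and semantics are hypothetical):

```yaml
# scylla-manager.yaml -- hypothetical option, not part of the current config
repair:
  # Allow batching token ranges from multiple replica sets owned by the
  # same host into one repair job, so that max_repair_ranges_in_parallel
  # can actually be utilized on big clusters. Illustrative name only.
  batch_replica_sets_per_host: true
```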
In terms of testing, it would be good to see the performance improvement on a big cluster, e.g. 2 DCs with 15 nodes each, a keyspace with RF 3 in each DC, and a setup in which the repair actually has to do some work (missing rows on some nodes). This bigger setup would definitely require help from QA.