Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There is a 2%-4% performance regression in sysbench select_random_points #51898

Closed
Yui-Song opened this issue Mar 19, 2024 · 10 comments · Fixed by #53195
Closed

There is a 2%-4% performance regression in sysbench select_random_points #51898

Yui-Song opened this issue Mar 19, 2024 · 10 comments · Fixed by #53195

Comments

@Yui-Song
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. deploy a tidb cluster
  2. run sysbench select_random_points

2. What did you expect to see? (Required)

No performance regression

3. What did you see instead (Required)

  • 2424-01-30, QPS= 62463 , commit 25f44f5
    image

  • 2424-01-31, QPS= 59776 , commit 27fbccb
    image

4. What is your TiDB version? (Required)

@Yui-Song Yui-Song added the type/bug This issue is a bug. label Mar 19, 2024
@Yui-Song
Copy link
Contributor Author

/type performance
/type regression
/sig execution
/severity critical
/label affects-8.0

@Yui-Song
Copy link
Contributor Author

/remove-label may-affects-7.5
/remove-label may-affects-7.1
/remove-label may-affects-6.5
/remove-label may-affects-6.1
/remove-label may-affects-5.4

@XuHuaiyu
Copy link
Contributor

截屏2024-03-20 11 27 32

In the new CPU profile, we can see that there is additional CPU overhead in the index usage reporter section. This overhead was introduced by commit 7087f70.
We will conduct tests on this commit to confirm whether the regression is indeed introduced by this commit.

@XuHuaiyu
Copy link
Contributor

Change the sig label to sig/sql-infra

@XuHuaiyu XuHuaiyu added sig/sql-infra SIG: SQL Infra and removed sig/execution SIG execution labels Mar 20, 2024
@YangKeao YangKeao self-assigned this Mar 20, 2024
@YangKeao
Copy link
Member

I tried to run sysbench random_points on #50773 and #50643. The change is not significant:

image

Actually, the variance is too big 🤦 (The upper 3 are #50773, and the next 3 are #50643)

@Yui-Song
Copy link
Contributor Author

/remove-severity critical
/severity major

@Yui-Song
Copy link
Contributor Author

/remove-label may-affects-7.5
/remove-label may-affects-7.1
/remove-label may-affects-6.5
/remove-label may-affects-6.1
/remove-label may-affects-5.4

@Yui-Song
Copy link
Contributor Author

Tracing back to this PR will lead to regression in the following three benchmarks: BenchmarkBatchPointGet, BenchmarkPreparedPointGet, and BenchmarkPointGet.

https://github.com/pingcap/tidb/blob/master/Makefile#L299
make bench-daily TO=/path/to/file.json

img_v3_029d_1bbff454-ff74-481b-bfce-9cf340c0076g

@YangKeao
Copy link
Member

YangKeao commented May 11, 2024

To re-produce this issue, I wrote a simpler implementation of select_random_points in go, with client-side connection balancer, and it has a much stabler performance: https://gist.github.com/YangKeao/c02ada99d2602d5e7fb5fb94f0581015

For the commit with index usage, the average QPS (after enough warm-up, though I don't know why warm-up does help) is 58393, and the commit without index usage is 59029. The degradation is around 1.07%.

(Also, I don't know why this go-version is a little bit slower than the lua version, maybe the go client is not as fast as the driver used by sysbench).

I have had some ideas to boost the performance of index usage a little. See #53195

@Yui-Song
Copy link
Contributor Author

This issue is related to another issue, #51852. All three workloads involve a large number of coprocessor operations. The performance can be fixed through the PR #53429 and #52887. For detailed details, please refer to the PR analysis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants