
As a Developer I want to analyse the Security Server proxy performance to find bottlenecks in the current code #1360

Open
Labels: help wanted (Extra attention is needed)
raits opened this issue Sep 22, 2022 · 2 comments

raits (Contributor) commented Sep 22, 2022

We should investigate the current Security Server proxy implementation to see if there are any bottlenecks in the messaging that could be improved.

A suitable setup for the investigation is a client/test runner (e.g., Apache bench tool "ab"), 2 * Security Servers and a mock service:

Client/test runner → SS1 (ClientProxy) → SS2 (ServerProxy) → Mock service

The JIRA issue this was created from can be found here: https://nordic-institute.atlassian.net/browse/XRDDEV-1568

Acceptance criteria:

  • Performance testing is done on Java 11
  • The proxy messaging paths (client and server proxies) are analysed and potential bottlenecks documented
    • For example, the results are visualized using Flame Graphs
  • Tools and configurations used for testing are documented
  • Suggestions to improve proxy performance are documented
zpotoloom commented
Using niis/xroad-security-server-standalone:bionic-7.0.2

The main performance bottleneck lies in https://github.com/nordic-institute/X-Road/blob/develop/src/proxy/src/main/java/ee/ria/xroad/proxy/protocol/ProxyMessage.java

There's an unused variable that was only used during the REST POC implementation:
public static final int REST_BODY_LIMIT = 8192; //store up to limit bytes into memory

Currently every REST message body is dumped to disk in a non-optimal way:
https://github.com/nordic-institute/X-Road/blob/develop/src/common/common-util/src/main/java/ee/ria/xroad/common/util/CachingStream.java#L59C39

java.nio uses an 8 KB buffer size by default. For a single request with a 10 MB message body, this results in an "impressive" number of IO operations:

root@ss:/# inotifywait -m -r /var/tmp/xroad/ > inotify.out
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
^C
root@ss:/# cat inotify.out | sort | uniq -c | sort -rn
   3012 /var/tmp/xroad/ MODIFY tmpattach18211735386472262976.tmp
   1348 /var/tmp/xroad/ MODIFY tmpattach5517139367076729444.tmp
    861 /var/tmp/xroad/ ACCESS tmpattach18211735386472262976.tmp
      2 /var/tmp/xroad/ OPEN tmpattach5517139367076729444.tmp
      2 /var/tmp/xroad/ OPEN tmpattach18211735386472262976.tmp
      2 /var/tmp/xroad/ CLOSE_WRITE,CLOSE tmpattach5517139367076729444.tmp
      2 /var/tmp/xroad/ CLOSE_WRITE,CLOSE tmpattach18211735386472262976.tmp
      1 /var/tmp/xroad/ DELETE tmpattach5517139367076729444.tmp
      1 /var/tmp/xroad/ DELETE tmpattach18211735386472262976.tmp
      1 /var/tmp/xroad/ CREATE tmpattach5517139367076729444.tmp
      1 /var/tmp/xroad/ CREATE tmpattach18211735386472262976.tmp
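To illustrate the buffer-size effect behind these numbers, here is a rough standalone sketch (not the actual CachingStream code) counting how many write calls a fixed copy buffer produces for a 10 MB body. With the 8 KB default each 10 MB body needs 1280 writes; a 256 KB buffer would need only 40:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferSizeDemo {

    // Copies 'in' to 'out' with a fixed-size buffer and returns the number of
    // write calls issued (each one a potential syscall on a FileOutputStream).
    public static long copyWithBuffer(InputStream in, OutputStream out, int bufSize) throws IOException {
        byte[] buf = new byte[bufSize];
        long writes = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[10 * 1024 * 1024]; // a 10 MB message body
        long small = copyWithBuffer(new ByteArrayInputStream(body), OutputStream.nullOutputStream(), 8 * 1024);
        long large = copyWithBuffer(new ByteArrayInputStream(body), OutputStream.nullOutputStream(), 256 * 1024);
        // prints "8 KB buffer: 1280 writes, 256 KB buffer: 40 writes"
        System.out.println("8 KB buffer: " + small + " writes, 256 KB buffer: " + large + " writes");
    }
}
```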

1 kB message body

ab -c 10 -t 60 -H 'X-Road-Client: CS/ORG/1111/TestClient' \
http://host.docker.internal:8080/r1/CS/ORG/1111/TestService/perftest/1k.json
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking host.docker.internal (be patient)
Finished 3284 requests


Server Software:
Server Hostname:        host.docker.internal
Server Port:            8080

Document Path:          /r1/CS/ORG/1111/TestService/perftest/1k.json
Document Length:        1024 bytes

Concurrency Level:      10
Time taken for tests:   60.026 seconds
Complete requests:      3284
Failed requests:        0
Total transferred:      4889876 bytes
HTML transferred:       3362816 bytes
Requests per second:    54.71 [#/sec] (mean)
Time per request:       182.782 [ms] (mean)
Time per request:       18.278 [ms] (mean, across all concurrent requests)
Transfer rate:          79.55 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    4   2.7      3      21
Processing:    31   81  32.9     90     235
Waiting:       30   80  32.6     90     234
Total:         32   85  34.6     95     239

Percentage of the requests served within a certain time (ms)
  50%     95
  66%    104
  75%    109
  80%    112
  90%    121
  95%    132
  98%    143
  99%    154
 100%    239 (longest request)
iostat -d -k 1 60 sdc | awk 'BEGIN {count = 0; r_sum = 0; w_sum = 0} /sdc/ {count++; r_sum += $3; w_sum += $4} END {printf "Average Read IOPS: %.2f\nAverage Write IOPS: %.2f\n", r_sum/count, w_sum/count}'
Average Read IOPS: 2.20
Average Write IOPS: 539.06

10MB message body

2 concurrent connections was the maximum the testing machine could handle without timeouts
(AMD Ryzen 7 PRO 4750U, cheap M.2 SSD, Docker Desktop on Windows)

ab -s 60 -c 2 -t 60 -H 'X-Road-Client: CS/ORG/1111/TestClient' \
http://host.docker.internal:8080/r1/CS/ORG/1111/TestService/perftest/10M.json
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking host.docker.internal (be patient)
Finished 42 requests


Server Software:
Server Hostname:        host.docker.internal
Server Port:            8080

Document Path:          /r1/CS/ORG/1111/TestService/perftest/10M.json
Document Length:        10485760 bytes

Concurrency Level:      2
Time taken for tests:   83.779 seconds
Complete requests:      42
Failed requests:        1
   (Connect: 0, Receive: 0, Length: 1, Exceptions: 0)
Total transferred:      429934323 bytes
HTML transferred:       429916160 bytes
Requests per second:    0.50 [#/sec] (mean)
Time per request:       3989.489 [ms] (mean)
Time per request:       1994.744 [ms] (mean, across all concurrent requests)
Transfer rate:          5011.48 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   0.6      2       3
Processing:   701 2601 7179.6   1618   47884
Waiting:        0  812 468.1    844    3345
Total:        702 2602 7179.6   1620   47885

Percentage of the requests served within a certain time (ms)
  50%   1620
  66%   1650
  75%   1678
  80%   1693
  90%   1763
  95%   1826
  98%  47885
  99%  47885
 100%  47885 (longest request)
root@ss:/# iostat -d -k 1 60 sdc | awk 'BEGIN {count = 0; r_sum = 0; w_sum = 0} /sdc/ {count++; r_sum += $3; w_sum += $4} END {printf "Average Read IOPS: %.2f\nAverage Write IOPS: %.2f\n", r_sum/count, w_sum/count}'
Average Read IOPS: 2.01
Average Write IOPS: 14064.13


I would propose implementing a configurable buffer size for the attachment storage process.
This would allow tuning performance for different use cases depending on average message sizes.

Also, for small messages it would be nice to have an option to store attachments in memory instead of on disk.
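A minimal sketch of how such a store could look (hypothetical class and parameter names, not the actual X-Road API): bodies up to a configurable in-memory limit stay in a byte buffer, and larger ones spill to a temp file written through a BufferedOutputStream with a tunable buffer size.

```java
import java.io.*;

// Sketch of the proposal: in-memory up to 'memoryLimit' bytes, then spill
// to disk with a configurable write-buffer size. Names are illustrative.
public class SpillingBodyStore implements Closeable {
    private final int memoryLimit;
    private final int diskBufferSize;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private File spillFile;
    private OutputStream disk;

    public SpillingBodyStore(int memoryLimit, int diskBufferSize) {
        this.memoryLimit = memoryLimit;
        this.diskBufferSize = diskBufferSize;
    }

    public void write(byte[] data, int off, int len) throws IOException {
        // Spill once the in-memory limit would be exceeded.
        if (disk == null && memory.size() + len > memoryLimit) {
            spillToDisk();
        }
        if (disk != null) {
            disk.write(data, off, len);
        } else {
            memory.write(data, off, len);
        }
    }

    private void spillToDisk() throws IOException {
        spillFile = File.createTempFile("tmpattach", ".tmp");
        disk = new BufferedOutputStream(new FileOutputStream(spillFile), diskBufferSize);
        memory.writeTo(disk); // flush what was buffered in memory so far
        memory = null;
    }

    public boolean isInMemory() {
        return disk == null;
    }

    @Override
    public void close() throws IOException {
        if (disk != null) {
            disk.close();
            spillFile.delete();
        }
    }
}
```

With a suitable default (e.g. the existing 8192-byte REST_BODY_LIMIT as the memory threshold), small messages would never touch the disk at all, while large ones would generate far fewer write operations than today.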

raits (Contributor, issue author) commented Sep 27, 2023

Hello @zpotoloom!

Thank you for looking into this issue and proposing a solution. Your suggestion to make the buffer size configurable so that users can tune it to their needs makes sense, as does bypassing the disk altogether for smaller messages.

Unfortunately, we will not be able to introduce this change for version 7.4.0 yet, but we will look into implementing the suggestion for version 7.5.0.
