Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuring nighthawk-client not to reusing session makes results 10 times worse (and utilization of w CPU/RAM assign to nighthawk server is very low less than 10% ) #853

Open
lzielezinska opened this issue May 18, 2022 · 4 comments
Assignees
Labels
bug Something isn't working

Comments

@lzielezinska
Copy link

Title: Results for nighthawk client configured not to reuse the session are 10 times less (and utilization of w CPU/RAM assign to nighthawk server is very low, less than 10% )

Description:
Command used to run nighthawk:
nighthawk_client -p http2 --max-requests-per-connection 1 --address-family v4 "https://<IP>:<PORT>" --concurrency auto --rps "1000" --connections "$10" --duration "60" --request-body-size "400" --transport-socket '{"name": "envoy.transport_sockets.tls", "typed_config": { "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext","max_session_keys":"0"}}'

When the paramater --transport-socket is not passed, we have noticed 10 times better results and almost 100% utilization of CPU assigned to nighthawk_server. We would like to run nighthawk in a way that client is not reusing the session, parameter max_session_keys":"0" configured that.

Reproduction steps:
Run nighthawk with parameter:
nighthawk_client ... --transport-socket '{"name": "envoy.transport_sockets.tls", "typed_config": { "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext","max_session_keys":"0"}}' ...
force Nighthawk client not to reuse session, but makes result very low and Nighthawk_server almost not working

@lzielezinska lzielezinska added the bug Something isn't working label May 18, 2022
@lzielezinska lzielezinska changed the title nighthawk client configured to the not-reusing session are 10 times less (and utilization of w CPU/RAM assign to nighthawk server is very low less than 10% ) Configuring nighthawk-client not to reusing session makes results 10 times worse (and utilization of w CPU/RAM assign to nighthawk server is very low less than 10% ) May 18, 2022
@mum4k
Copy link
Collaborator

mum4k commented May 25, 2022

Thank you for the bug report @lzielezinska. Nighthawk reuses significant portions of the Envoy code which includes code that establishes TLS sessions. It is possible that you have discovered a hot spot in the Envoy code.

To confirm this, would you be able to provide a few more details?

  1. The full command line and nighthawk output (including all the counters) for the execution without the custom transport socket.
  2. The full nighthawk output (including all the counters) for the execution with the custom transport socket.
  3. Could you also indicate what this is running against (what is the server). Is it the Nighthawk test server or some other system? If it is another system - were you able to verify its CPU usage during the experiment with the custom transport socket?

We can look to see if there is anything obvious in the outputs and counters. Once we have this, we can try to reproduce this problem and try to profile the execution.

@mum4k mum4k self-assigned this May 25, 2022
@pawelrxyz
Copy link

pawelrxyz commented May 31, 2022

hello let me share some info.

@mum4k
Copy link
Collaborator

mum4k commented Jun 2, 2022

Thank you for providing the details @pawelrxyz. The lowered CPU usage suggests some thread/lock contention. I will try to reproduce this problem and profile it to identify the contended path.

@mum4k
Copy link
Collaborator

mum4k commented Aug 3, 2022

I reran the experiment and collected CPU and contention profiles. I wasn't able to reproduce the problem and did not see any unexpected behavior when I executed the test.

nighthawk_output_standard.txt is an experiment that runs without the custom TLS socket, i.e. session resumption was enabled. Confirmed by counters:

    counters {
      name: "ssl.handshake"
      value: 886082
    }
    counters {
      name: "ssl.session_reused"
      value: 885932
    }

nighthawk_output_custom_socket.txt is an experiment that runs with the custom TLS socket, i.e. session resumption was disabled. The counter ssl.session_reused is missing, sessions weren't reused.

Each test took about 5 min of wall time. I observed the following:

  • NH client contentions went down from 10.68s to 1.37s.
  • NH client CPU usage went up from 61s to 220s.
  • NH test server contentions went down from 711.19s to 546.72s.
  • NH test server CPU usage went up from 20.18s to 77.11s.

Since contentions went down, I no longer suspect lock contention. The increased CPU usage in the NH test server seems to be directly related to the need to negotiate TLS sessions which is expected. See the diff below which compares the two runs:
nh_test_server_cpu_diff

The increased CPU usage in the NH client seems to be related to more calls to sched_yield() which I suspect is due to threads waiting for TLS session negotiation on the server side. When that is filtered out, the remaining increased CPU usage is also related to negotiation of TLS sessions. See the diff below which compares the two runs:
nh_client_cpu_diff

Since I wasn't able to reproduce this, I could use your help here. For one, you could review this experiment and point out if I diverged from yours anywhere. Even better, if you can still reproduce this, could you provide both CPU and contention profiles from your run, so that we can see what the NH client and test server are doing while you are observing the decreased CPU usage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants