Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parallel/test-tls-server-verify is awfully slow on Windows #1461

Closed
jbergstroem opened this issue Apr 18, 2015 · 21 comments
Closed

parallel/test-tls-server-verify is awfully slow on Windows #1461

jbergstroem opened this issue Apr 18, 2015 · 21 comments
Labels
test Issues and PRs related to the tests. windows Issues and PRs related to the Windows platform.

Comments

@jbergstroem
Copy link
Member

Running this test on one of our jenkins windows bots consistently takes more than 35 seconds, while taking 0.6s on my (os x) desktop.

@jbergstroem jbergstroem added windows Issues and PRs related to the Windows platform. test Issues and PRs related to the tests. labels Apr 18, 2015
@mathiask88
Copy link
Contributor

I think the cause for the long runtime is in OpenSSL. The s_client on windows generates extra entropy by loading screen memory and this takes ~1-2s on each spawn. I don't get why they are not using something like windows CryptGenRandom() function but I'm no openssl or even crypto expert so correct me if I am wrong.

@jbergstroem
Copy link
Member Author

@mathiask88 thanks for digging that up. Seems to have been around for a while.

@trevnorris
Copy link
Contributor

/cc @indutny @piscisaureus on using CryptGetRandom()

@piscisaureus
Copy link
Contributor

@mathiask88 The test has always been slow for me, but I never got any further than figuring out that the openssl command-line client is really slow. Thanks for digging this up.

@mathiask88
Copy link
Contributor

@jbergstroem Yes and I think it is not necessary. The client runs app_RAND_load_file() that returns on windows something like C:\.rnd and tries to read that entropy file. If this fails RAND_status() gets called and if this is called for the first time openssl gets initialized with RAND_poll() and that function calls CryptGetRandom() on Windows NT+ or Windows CE 3.0+ anyway. So I think RAND_screen() is a relict where OpenSSL didn't use an OS-based seed initialization.

But I tested a bit more and it seems that the screen loading is not the bottleneck here. It doesn't take ~1s as I thought. If I run a server like in the test and then a *.bat file with echo QUIT | openssl-cli.exe s_client -connect 127.0.0.1:12346 the openssl-cli takes about 2s to exit after _(un)authed.

@orangemocha
Copy link
Contributor

The test is slow on Windows because of two 1 second delays in the Windows version of openssl-cli.exe. We have reported the issue to openssl: http://rt.openssl.org/Ticket/Display.html?user=guest&pass=guest&id=3849

Now working on a workaround, modifying the test to let all those test cases run in parallel instead of serialized, so that it finishes quickly in spite of the openssl-cli delay.

We will submit the change to io.js as well as Node.

@Fishrock123
Copy link
Member

We will submit the change to io.js as well as Node.

Awesome, thanks!

@orangemocha
Copy link
Contributor

Do you folks mind reviewing the change here: nodejs/node-v0.x-archive#25368, to make sure we have a consistent +1? I can then cherry pick the commits to io.js in another PR. Thanks!

@rvagg
Copy link
Member

rvagg commented May 23, 2015

/cc @nodejs/crypto to @orangemocha's comment above, sounds positive to me but could do with a word of encouragement from someone in the crypto team

@shigeki
Copy link
Contributor

shigeki commented May 27, 2015

I looked at this and found that we can have one more improvement to use child.kill() not to wait for child processes to be ended. shigeki@ad45f88

RAND_screen() on my Windows takes more than one second. I think we need not to have a good client randomness in this test. Adding a -no-rand-screen option for openssl s_client on Windows is one of ideas to remove this overhead. shigeki@12be7ec and shigeki@dae36fb

The benchmark results are

Current
TotalSeconds      : 86.2152129

joyent/node#25368 
TotalSeconds      : 41.0233476

joyent/node#25368 + child.kill()
TotalSeconds      : 33.9199469

joyent/node#25368 + child.kill() + -no-rand-screen
TotalSeconds      : 9.5351881

9.5 seconds are still slow compared to Unix but it seems to be enough good for CI.

@jbergstroem
Copy link
Member Author

I'm all for --no-rand-screen; this openssl binary isn't used for anything other than verifying our test suite.

@orangemocha
Copy link
Contributor

Regarding using child.kill(), I wonder if this changes the test in way that could limit its effectiveness in catching bugs. But I will add shigeki@ad45f88 to the PR and let people who are more experienced in this area provide feedback.

Regarding, --no-rand-screen it sounds like a wonderful feature. But it would be a floating patch on a dependency, and I wonder if that would be appropriate here. In node we usually tend to limit those floating patches to changes that have already been made in the upstream project, or are high priority and at least have a clear path for being accepted upstream.

@orangemocha
Copy link
Contributor

Updated nodejs/node-v0.x-archive#25368

@shigeki
Copy link
Contributor

shigeki commented May 28, 2015

@orangemocha The patch of -no-rand-screen would not be accepted to the upstream because it can not be applied in a general use. This fix was made carefully to be as small as minimum and limited to have changes of only openssl-cli.exe binary so that libopenssl.a linked to node is nothing to do with it.

There are private patches which are already applied in deps/openssl. For openssl-1.0.2a case, these are
https://github.com/nodejs/io.js/blob/master/deps/openssl/doc/UPGRADING.md#2-apply-private-patches
But I agree that those private patches should be kept in minimum.

@orangemocha
Copy link
Contributor

@shigeki : added all your commits to nodejs/node-v0.x-archive#25368 to get feedback there.

The PR at nodejs/node-v0.x-archive#25368 is ready for review. cc @nodejs/crypto

@shigeki
Copy link
Contributor

shigeki commented May 29, 2015

@indutny @bnoordhuis Could you give a comment on shigeki@12be7ec if we can add it into deps/openssl? It surely saves our time to wait for CI results on Windows.

@bnoordhuis
Copy link
Member

@orangemocha @shigeki If you want your changes to get landed in io.js, can you file them as a PR here?

@orangemocha
Copy link
Contributor

sure @bnoordhuis , against v1.x?

@bnoordhuis
Copy link
Member

@orangemocha master (or next if you want it to go into v3.x.) If it needs to get back-ported to a v1.x. release, please mention that in the pull request.

@orangemocha
Copy link
Contributor

io.js PR is here: #1836

joaocgreis added a commit to JaneaSystems/node that referenced this issue Jun 3, 2015
OpenSSL s_client introduces some delay on Windows. With all clients
running sequentially, this delay is big enough to break CI. This fix runs
the clients in parallel (unless the test includes renegotiation),
reducing the total run time.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
joaocgreis added a commit to JaneaSystems/node that referenced this issue Jun 3, 2015
Different servers must use different ports. Since we can count only on
common.PORT and common.PORT+1, run only 2 servers in parallel.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
joaocgreis added a commit to JaneaSystems/node that referenced this issue Jun 3, 2015
When running in parallel, it is not easy to identify what server and
client failed when the test fails. This adds identifiers to all lines
of console output.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
joaocgreis pushed a commit to JaneaSystems/node that referenced this issue Jun 3, 2015
For better performance of the test, the parent kills child processes
so as not to wait them to be ended.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
shigeki pushed a commit to shigeki/node that referenced this issue Mar 27, 2018
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
MylesBorins pushed a commit that referenced this issue Mar 28, 2018
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Backport-PR-URL: #19638
Fixes: #1461
PR-URL: #1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
shigeki pushed a commit to shigeki/node that referenced this issue Aug 15, 2018
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
rvagg pushed a commit that referenced this issue Nov 23, 2018
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: #1461
PR-URL: #1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
rvagg pushed a commit that referenced this issue Nov 23, 2018
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: #1461
PR-URL: #1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
rvagg pushed a commit that referenced this issue Nov 24, 2018
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: #1461
PR-URL: #1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
shigeki pushed a commit to shigeki/node that referenced this issue Feb 26, 2019
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
sam-github pushed a commit to sam-github/node that referenced this issue Sep 5, 2019
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: nodejs#1461
PR-URL: nodejs#1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
BethGriggs pushed a commit that referenced this issue Sep 19, 2019
In openssl s_client on Windows, RAND_screen() is invoked to initialize
random state but it takes several seconds in each connection.
This added -no_rand_screen to openssl s_client on Windows to skip
RAND_screen() and gets a better performance in the unit test of
test-tls-server-verify.
Do not enable this except to use in the unit test.

Fixes: #1461
Backport-PR-URL: #28230
PR-URL: #1836
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Issues and PRs related to the tests. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

No branches or pull requests

9 participants