Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

re-enable crypto dh keygen testing on Arm #32233

Open
rhenwood-arm opened this issue Mar 12, 2020 · 9 comments
Open

re-enable crypto dh keygen testing on Arm #32233

rhenwood-arm opened this issue Mar 12, 2020 · 9 comments
Labels
arm Issues and PRs related to the ARM platform. crypto Issues and PRs related to the crypto subsystem. test Issues and PRs related to the tests.

Comments

@rhenwood-arm
Copy link

rhenwood-arm commented Mar 12, 2020

A Crypto test has been disabled on AArch64 as part of PR: #31178

After a brief chat with Sam, he suggests a discussion about this should include @tniessen. I would like to discover the root-cause of the poor performance. The build logs for the job have expired for the Arm builds - is there a record of the hardware that was used for the Arm build?

EDIT: corrected summary.

@addaleax addaleax added crypto Issues and PRs related to the crypto subsystem. test Issues and PRs related to the tests. arm Issues and PRs related to the ARM platform. labels Mar 12, 2020
@sam-github
Copy link
Contributor

To be clear, its not that all crypto testing is disabled, its that two specific tests are disabled, and one of those tests was introduced in that PR, so its not exactly a side-effect.

Anyhow, the arm hardware that would have been used is the arm hardware used for any arm builds.

Which I don't know anything about, but the build job is https://ci.nodejs.org/job/node-test-commit-arm/

I suggest you PR a removal of the skips, we'll trigger a build, you can see what fails, and then follow the build failures to the system info for the machines that failed. Its a bit java specfic, but it'll look like https://ci.nodejs.org/computer/test-packetnet-ubuntu1604-arm64-2/systemInfo

Other than that, we don't really know much about the machines, but I should have ssh access (though some of the arms are behind jump hosts), so if you have specific questions (that is, commands you'd like me to run, maybe cat /proc/cpus?), then I can try to do that for you.

@nodejs/platform-arm might know more about this.

@rhenwood-arm
Copy link
Author

@sam-github - thanks for correcting my inaccurate description. I will work on re-triggering a build so I get some data to review.

@sam-github
Copy link
Contributor

No trouble. And sorry we don't have a better inventory of all the ci machines, but there are a fair number, and the absence of automated tooling to update that kind of info, it would just drift out of sync.

@sam-github sam-github changed the title re-enable crypto testing on Arm re-enable crypto dh keygen testing on Arm Mar 12, 2020
@rvagg
Copy link
Member

rvagg commented Mar 13, 2020

(Not up to speed on context here, just chiming in about machine details)

More info in the description of the machine @ https://ci.nodejs.org/computer/test-packetnet-ubuntu1604-arm64-2/, I hope this is helpful:

Donated by Packet.net: Type 2A 96-core Cavium ThunderX Baremetal ARMv8

We also have a good relationship with Packet.net through their ARM ecosystem investment. If you need more expertise then ping @vielmetti and he might be able to connect to more information and/or expertise.

@bnoordhuis
Copy link
Member

The tests were disabled because they time out and I suppose that's not too surprising because they need to do a lot of (computationally expensive) cryptography in a fairly short amount of time.

Probably the best way forward is to break them up into smaller files.

@tniessen
Copy link
Member

Probably the best way forward is to break them up into smaller files.

Could we maybe just disable (or extend) the timeout on affected machines?

@bnoordhuis
Copy link
Member

The slower machines already get a 2x timeout (grep tools/test.py for TIMEOUT_SCALEFACTOR) and disabling it altogether is bad for obvious reasons (they'd hang forever.)

@tniessen
Copy link
Member

If they still run into timeouts, it indicates that they are more than twice as slow as the other machines, right? Shouldn't the timeout ratio should match the expected execution time ratio?

@bnoordhuis
Copy link
Member

It's just these two tests, other tests pass fairly reliably within the timeout. I don't want to tweak TIMEOUT_SCALEFACTOR because that might end up hiding slow tests in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arm Issues and PRs related to the ARM platform. crypto Issues and PRs related to the crypto subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

No branches or pull requests

6 participants