Crashing in nss init test #1839

Open
martinthomson opened this issue Apr 18, 2024 · 3 comments
Comments

@martinthomson
Member

I'm not sure whether this is misuse of NSS APIs or a bug in NSS initialization, but I hit this quite often. I don't have time to investigate right now.

     Running tests/init.rs (target/debug/deps/init-18835beddefbfa4e)

running 2 tests
Assertion failure: lock != NULL, at ../../../../pr/src/pthreads/ptsynch.c:175
test init_withdb ... ok
error: test failed, to rerun pass `-p neqo-crypto --test init`

Caused by:
  process didn't exit successfully: `/home/martin/code/neqo/target/debug/deps/init-18835beddefbfa4e` (signal: 6, SIGABRT: process abort signal)
Stack trace
* thread #3, name = 'init_withdb', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff7dba9fc libc.so.6`pthread_kill + 300
    frame #1: 0x00007ffff7d66476 libc.so.6`raise + 22
    frame #2: 0x00007ffff7d4c7f3 libc.so.6`abort + 211
    frame #3: 0x00007ffff7f6a513 libnspr4.so`PR_Assert(s="lock != NULL", file="../../../../pr/src/pthreads/ptsynch.c", ln=175) at prlog.c:571:5
    frame #4: 0x00007ffff7f8713a libnspr4.so`PR_Lock(lock=0x0000000000000000) at ptsynch.c:175:5
    frame #5: 0x00007ffff7f79f45 libnspr4.so`PR_CallOnce(once=0x00005555556d02b8, func=(init-d2141f227c08e7ce`nss_doLockInit at nssinit.c:534:1)) at prinit.c:774:5
    frame #6: 0x000055555562eb65 init-d2141f227c08e7ce`nss_Init(configdir="/home/martin/code/neqo/test-fixture/db", certPrefix="", keyPrefix="", secmodName="secmod.db", updateDir="", updCertPrefix="", updKeyPrefix="", updateID="", updateName="", initContextPtr=0x0000000000000000, initParams=0x0000000000000000, readOnly=1, noCertDB=0, noModDB=0, forceOpen=0, noRootInit=0, optimizeSpace=0, noSingleThreadedModules=0, allowAlreadyInitializedModules=0, dontFinalizeModules=0) at nssinit.c:580:9
    frame #7: 0x000055555562f1d9 init-d2141f227c08e7ce`NSS_Initialize(configdir="/home/martin/code/neqo/test-fixture/db", certPrefix="", keyPrefix="", secmodName="secmod.db", flags=1) at nssinit.c:889:12
    frame #8: 0x00005555555e36a2 init-d2141f227c08e7ce`neqo_crypto::init_db::_$u7b$$u7b$closure$u7d$$u7d$::h6cccd5c60e5c4ea4 at lib.rs:163:13
    frame #9: 0x00005555555e3d32 init-d2141f227c08e7ce`std::sync::once_lock::OnceLock$LT$T$GT$::get_or_init::_$u7b$$u7b$closure$u7d$$u7d$::h2550b0c13514ffb8 at once_lock.rs:250:50
    frame #10: 0x00005555555e3c0b init-d2141f227c08e7ce`std::sync::once_lock::OnceLock$LT$T$GT$::initialize::_$u7b$$u7b$closure$u7d$$u7d$::h87f60e1ed9e123a1(p=0x00007ffff7a15110) at once_lock.rs:376:19
    frame #11: 0x00005555555e4167 init-d2141f227c08e7ce`std::sync::once::Once::call_once_force::_$u7b$$u7b$closure$u7d$$u7d$::h146f8429fe820fd0(p=0x00007ffff7a15110) at once.rs:208:40
    frame #12: 0x00005555555e489a init-d2141f227c08e7ce`std::sys_common::once::futex::Once::call::hfa47d030df88ef30(self=0x00005555556d0200, ignore_poisoning=true, f=0x00007ffff7a152a8) at futex.rs:124:21
    frame #13: 0x00005555555e4037 init-d2141f227c08e7ce`std::sync::once::Once::call_once_force::h3e05b6c06111e63d(self=0x00005555556d0200, f={closure_env#0}<core::result::Result<neqo_crypto::NssLoaded, neqo_crypto::err::Error>, std::sync::once_lock::{impl#0}::get_or_init::{closure_env#0}<core::result::Result<neqo_crypto::NssLoaded, neqo_crypto::err::Error>, neqo_crypto::init_db::{closure_env#0}<&str>>, !> @ 0x00007ffff7a152f8) at once.rs:208:9
    frame #14: 0x00005555555e3b99 init-d2141f227c08e7ce`std::sync::once_lock::OnceLock$LT$T$GT$::initialize::h2a16e900f7cbd97b(self=0x00005555556d01c8, f={closure_env#0}<core::result::Result<neqo_crypto::NssLoaded, neqo_crypto::err::Error>, neqo_crypto::init_db::{closure_env#0}<&str>> @ 0x00007ffff7a15320) at once_lock.rs:375:9
    frame #15: 0x00005555555e3e04 init-d2141f227c08e7ce`std::sync::once_lock::OnceLock$LT$T$GT$::get_or_try_init::hf774e982ab37704a(self=0x00005555556d01c8, f={closure_env#0}<core::result::Result<neqo_crypto::NssLoaded, neqo_crypto::err::Error>, neqo_crypto::init_db::{closure_env#0}<&str>> @ 0x00007ffff7a15398) at once_lock.rs:298:9
    frame #16: 0x00005555555e3cfd init-d2141f227c08e7ce`std::sync::once_lock::OnceLock$LT$T$GT$::get_or_init::hba25c0f3d5f8da29(self=0x00005555556d01c8, f={closure_env#0}<&str> @ 0x00007ffff7a15400) at once_lock.rs:250:15
    frame #17: 0x00005555555e309d init-d2141f227c08e7ce`neqo_crypto::init_db::hf7895a0b6d64f1f4(dir=(data_ptr = "/home/martin/code/neqo/test-fixture/dbneqo-crypto/tests/init.rs", length = 38)) at lib.rs:149:15
    frame #18: 0x00005555555e22be init-d2141f227c08e7ce`init::init_withdb::h1911f7a1b0bd088f at init.rs:47:5
    frame #19: 0x00005555555e1c87 init-d2141f227c08e7ce`init::init_withdb::_$u7b$$u7b$closure$u7d$$u7d$::hd4e1bf586e84a384((null)=0x00007ffff7a155d6) at init.rs:46:17
@larseggert
Collaborator

Interesting! I don't hit this when I do cargo nextest run (which I normally use), but I can reproduce with cargo test.

@martinthomson
Member Author

nextest might not run the two tests concurrently. I get the same crash when running the test binary directly. It seems to be down to a race or contention between concurrent attempts to initialize.

I took a brief look at the NSS initialization code, and a few things might point at problems. For instance, the lack of locking around NSS_IsInitialized is concerning (I can guess why, but it's blatantly unsafe). The fact that most of the initialization routine is not covered by a mutex is also a little worrying. Neither immediately explains this crash, but it probably needs some investigation (it might be the true source of our NSS initialization woes, the ones from #482).
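
To make the concern about unguarded initialization concrete, here is a minimal sketch of serializing entry into a routine that is not safe to run from two threads at once. Everything in it is hypothetical: fragile_init is a stand-in that only counts concurrent callers, not an NSS function, and guarded_init is not neqo code. It is not a proposed fix for this crash, only an illustration of what "covered by a mutex" would mean.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

// Number of threads currently inside the fragile routine, and the highest
// value that number ever reached.
static IN_INIT: AtomicUsize = AtomicUsize::new(0);
static MAX_IN_INIT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for an initialization routine that must not be
// entered by two threads at once. It only records how many callers overlap.
fn fragile_init() {
    let now = IN_INIT.fetch_add(1, Ordering::SeqCst) + 1;
    MAX_IN_INIT.fetch_max(now, Ordering::SeqCst);
    thread::yield_now(); // give other threads a chance to overlap
    IN_INIT.fetch_sub(1, Ordering::SeqCst);
}

// One process-wide lock that every caller takes before initializing.
static INIT_LOCK: Mutex<()> = Mutex::new(());

// Assumed wrapper: hold the lock for the whole duration of the call.
fn guarded_init() {
    // Recover the guard even if another thread panicked while holding it.
    let _guard = INIT_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    fragile_init();
}

// Call the guarded routine from several threads and report the largest
// number of callers that were ever inside fragile_init at the same time.
fn run_concurrently(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(guarded_init))
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    MAX_IN_INIT.load(Ordering::SeqCst)
}
```

With the lock held across the whole call, run_concurrently never observes more than one caller inside fragile_init. A lock on the Rust side only helps if every path into the library goes through it, which is an assumption this sketch does not verify for neqo.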

@larseggert
Collaborator

larseggert commented Apr 18, 2024

It seems to be failing in the "twice" tests I added as part of 3151adc. If I comment those out, I no longer get the failure.

Specifically, init_twice_withdb.
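
For context on what initializing twice looks like from the Rust side: the stack trace shows init_db going through std::sync::OnceLock::get_or_init, so a second call in the same process would be expected to reuse the first result rather than run the closure again. A minimal sketch of that pattern follows. The names expensive_init, init_db_sketch, and init_from_threads are hypothetical, and the stored type is simplified to a string instead of neqo's NssLoaded.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Counts how many times the hypothetical initializer actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Holds the outcome of the first initialization attempt for the whole process.
static LOADED: OnceLock<Result<&'static str, String>> = OnceLock::new();

// Hypothetical stand-in for the real library initialization.
fn expensive_init(dir: &str) -> Result<&'static str, String> {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    if dir.is_empty() {
        Err("no database directory given".to_string())
    } else {
        Ok("loaded")
    }
}

// Assumed shape of an init function built on OnceLock: the closure runs at
// most once, and later callers get the stored result, whatever `dir` they pass.
fn init_db_sketch(dir: &str) -> &'static Result<&'static str, String> {
    LOADED.get_or_init(|| expensive_init(dir))
}

// Call the init function from several threads and report how many times the
// initializer ran.
fn init_from_threads(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(|| {
                init_db_sketch("db");
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    INIT_CALLS.load(Ordering::SeqCst)
}
```

OnceLock guarantees the closure runs once even under contention, which suggests that any race here sits below this layer or on a path that bypasses it. That reading is an inference from the trace, not something confirmed in this thread.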
