Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodejs runs out of memory trying to connect to HTTPS host with self-signed certificate #37757

Closed
nils91 opened this issue Mar 15, 2021 · 15 comments
Labels
confirmed-bug Issues with confirmed bugs. tls Issues and PRs related to the tls subsystem.

Comments

@nils91
Copy link
Contributor

nils91 commented Mar 15, 2021

  • Version: 15.11.0 (also tested with 12 and 14 on same machine)
  • Platform: Ubuntu 16.04 (Linux 4.4.0-1127.8.2.vz7.151.14 deps: update openssl to 1.0.1j #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux)
  • Subsystem: https

What steps will reproduce the bug?

When trying to connect to a https-enabled host and using a self-signed certificate provided to node via NODE_EXTRA_CA_CERTS node hangs for a few minutes, using more and more memory, until it exits with Allocation failed - JavaScript heap out of memory. Full stacktrace below.

How often does it reproduce? Is there a required condition?

The server needs to be up and the correct server certificate needs to be used.

What is the expected behavior?

I would expect the connection to be made successfully, since the self-signed certificate has been provided to node.

What do you see instead?

<--- Last few GCs --->

[15068:0x52f3bd0]   324861 ms: Scavenge (reduce) 4053.6 (4128.3) -> 4053.4 (4129.3) MB, 72.6 / 0.0 ms  (average mu = 0.851, current mu = 0.769) allocation failure
[15068:0x52f3bd0]   325180 ms: Scavenge (reduce) 4054.9 (4129.3) -> 4054.8 (4131.1) MB, 86.6 / 0.0 ms  (average mu = 0.851, current mu = 0.769) allocation failure
[15068:0x52f3bd0]   325442 ms: Scavenge (reduce) 4056.6 (4131.1) -> 4056.4 (4132.6) MB, 71.5 / 0.0 ms  (average mu = 0.851, current mu = 0.769) allocation failure


<--- JS stacktrace --->

FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
 1: 0xa7f490 node::Abort() [node]
 2: 0x9a5c4d node::FatalError(char const*, char const*) [node]
 3: 0xc6c2ae v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xc6c627 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xe360a5  [node]
 6: 0xe65633 v8::internal::EvacuateNewSpaceVisitor::Visit(v8::internal::HeapObject, int) [node]
 7: 0xe721e6 v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::MemoryChunk*, long*) [node]
 8: 0xe5e30f v8::internal::Evacuator::EvacuatePage(v8::internal::MemoryChunk*) [node]
 9: 0xe5e588 v8::internal::PageEvacuationTask::RunInParallel(v8::internal::ItemParallelJob::Task::Runner) [node]
10: 0xe503a9 v8::internal::ItemParallelJob::Run() [node]
11: 0xe74170 void v8::internal::MarkCompactCollectorBase::CreateAndExecuteEvacuationTasks<v8::internal::FullEvacuator, v8::internal::MarkCompactCollector>(v8::internal::MarkCompactCollector*, v8::internal::ItemParallelJob*, v8::internal::MigrationObserver*, long) [node]
12: 0xe749b3 v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [node]
13: 0xe74d75 v8::internal::MarkCompactCollector::Evacuate() [node]
14: 0xe874e1 v8::internal::MarkCompactCollector::CollectGarbage() [node]
15: 0xe433a8 v8::internal::Heap::MarkCompact() [node]
16: 0xe44d38 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
17: 0xe482dc v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
18: 0xe0bf3a v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
19: 0xe05444 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [node]
20: 0xe07540 v8::internal::FactoryBase<v8::internal::Factory>::NewRawOneByteString(int, v8::internal::AllocationType) [node]
21: 0xe1955e v8::internal::Factory::NewStringFromOneByte(v8::internal::Vector<unsigned char const> const&, v8::internal::AllocationType) [node]
22: 0xc839e2 v8::String::NewFromOneByte(v8::Isolate*, unsigned char const*, v8::NewStringType, int) [node]
23: 0xbb5914 node::crypto::GetFingerprintDigest(node::Environment*, evp_md_st const*, x509_st*) [node]
24: 0xbb736e node::crypto::X509ToObject(node::Environment*, x509_st*) [node]
25: 0xbb81e4 node::crypto::GetPeerCert(node::Environment*, std::unique_ptr<ssl_st, node::FunctionDeleter<ssl_st, &SSL_free> > const&, bool, bool) [node]
26: 0xc2b567 node::crypto::TLSWrap::GetPeerCertificate(v8::FunctionCallbackInfo<v8::Value> const&) [node]
27: 0xcd8cbb  [node]
28: 0xcda26c  [node]
29: 0xcda8e6 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
30: 0x14fa219  [node]
./run.sh: line 5: 15068 Aborted                 node index.js

Additional information

All files i´ve used to reproduce: node_https_test.zip
https_renew_cert.sh is a script used to generate the key/certificate pair. (Sample certificate and key is also included in the zip, so you don´t have to regenerate them, if you don´t have oppenssl installed)
server.js is the server. Execute with node server.js.
index.js is the client. Started from run.sh
run.sh: Starts the client and sets NODE_EXTRA_CA_CERTS. (You need to update the path to the certificate)

@RaisinTen
Copy link
Contributor

❯ node -v
v15.2.0
❯ cp ~/Downloads/node_https_test.zip .
❯ unzip node_https_test.zip
Archive:  node_https_test.zip
  inflating: cert.conf               
  inflating: cert.pem                
  inflating: https_renew_cert.sh     
  inflating: index.js                
  inflating: key.pem                 
  inflating: run.sh                  
  inflating: server.js               
❯ export NODE_EXTRA_CA_CERTS="$PWD/cert.pem"
❯ node server.js
^Z
[1]  + 9905 suspended  node server.js
❯ node index.js
^Z
[2]  + 9921 suspended  node index.js
❯ fg %node\ server.js
[1]  - 9905 continued  node server.js
^Z
[1]  + 9905 suspended  node server.js
❯ fg %node\ index.js
[2]  - 9921 continued  node index.js
An error Error: unable to verify the first certificate
    at TLSSocket.onConnectSecure (node:_tls_wrap:1498:34)
    at TLSSocket.emit (node:events:329:20)
    at TLSSocket._finishInit (node:_tls_wrap:933:8)
    at TLSWrap.ssl.onhandshakedone (node:_tls_wrap:707:12) {
  code: 'UNABLE_TO_VERIFY_LEAF_SIGNATURE'
}

Can't reproduce. Am I doing something wrong here?

@nils91
Copy link
Contributor Author

nils91 commented Mar 15, 2021

Looks fine to me. Just did exactly what you did, reproduces the problem for me. Are you running ubuntu? I also tried it on fedora after you answered and then i get the same error you do.

@RaisinTen
Copy link
Contributor

Yes, Ubuntu.

@nils91
Copy link
Contributor Author

nils91 commented Mar 16, 2021

I´ve set up a repo with an action to reproduce the issue: https://github.com/nils91/node_https_test.
Action output:

Run actions/setup-node@v2.1.2
Attempting to download 15.x...
Not found in manifest.  Falling back to download directly from Node
Acquiring 15.11.0 from https://nodejs.org/dist/v15.11.0/node-v15.11.0-linux-x64.tar.gz
Extracting ...
/usr/bin/tar xz --strip 1 --warning=no-unknown-keyword -C /home/runner/work/_temp/b0e5d0e1-53c0-465e-a30d-7fc2e5c30f1e -f /home/runner/work/_temp/49fe5a6a-bbc1-4afa-b53c-46330d65ee3c
Adding to the cache ...
Done
0s
Run uname -a
Linux fv-az189-765 5.4.0-1040-azure 42-Ubuntu SMP Fri Feb 5 15:39:06 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
0s
Run node -v
v15.11.0
5s
Run node server.js &
0s
Run chmod +x ./run.sh
2m 11s
Run ./run.sh

<--- Last few GCs --->

[1730:0x5d04ef0]   106385 ms: Mark-sweep (reduce) 2046.2 (2085.8) -> 2046.1 (2086.3) MB, 2473.3 / 0.0 ms  (average mu = 0.080, current mu = 0.011) allocation failure scavenge might not succeed
[1730:0x5d04ef0]   108843 ms: Mark-sweep (reduce) 2047.1 (2086.3) -> 2047.0 (2087.3) MB, 2427.4 / 0.0 ms  (average mu = 0.047, current mu = 0.013) allocation failure scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xa7f490 node::Abort() [node]
 2: 0x9a5c4d node::FatalError(char const*, char const*) [node]
 3: 0xc6c2ae v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xc6c627 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xe360a5  [node]
 6: 0xe36c4c  [node]
 7: 0xe445db v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 8: 0xe482dc v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 9: 0xe0c077 v8::internal::Factory::AllocateRawWithAllocationSite(v8::internal::Handle<v8::internal::Map>, v8::internal::AllocationType, v8::internal::Handle<v8::internal::AllocationSite>) [node]
10: 0xe128b8 v8::internal::Factory::NewJSObjectFromMap(v8::internal::Handle<v8::internal::Map>, v8::internal::AllocationType, v8::internal::Handle<v8::internal::AllocationSite>) [node]
11: 0xe13fa3 v8::internal::Factory::NewJSArrayBuffer(std::shared_ptr<v8::internal::BackingStore>, v8::internal::AllocationType) [node]
12: 0xc96b99 v8::ArrayBuffer::New(v8::Isolate*, std::shared_ptr<v8::BackingStore>) [node]
13: 0xbb79c3 node::crypto::X509ToObject(node::Environment*, x509_st*) [node]
14: 0xbb81e4 node::crypto::GetPeerCert(node::Environment*, std::unique_ptr<ssl_st, node::FunctionDeleter<ssl_st, &SSL_free> > const&, bool, bool) [node]
15: 0xc2b567 node::crypto::TLSWrap::GetPeerCertificate(v8::FunctionCallbackInfo<v8::Value> const&) [node]
16: 0xcd8cbb  [node]
17: 0xcda26c  [node]
18: 0xcda8e6 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
19: 0x14fa219  [node]
./run.sh: line 5:  1730 Aborted                 (core dumped) node index.js
Error: Process completed with exit code 134.

@nils91
Copy link
Contributor Author

nils91 commented Mar 18, 2021

I might have found a workaround: Removing the line

keyUsage = digitalSignature, keyEncipherment

from the cert.conf file and regenerating the certificate using the new config allows node to get a connection. However, that might not be applicable to all use-cases, as some applications can be picky about the keyUsage configuration.
Edit: keyUsage, at least right now, requires keyCertSign to be set. See openssl/openssl#1418 for details. For this issue, adding keyCertSign to the keyUsage works as a workaround.

@RaisinTen
Copy link
Contributor

I forked your repo here and ran some tests on few other versions. In spite of running the test on v15.11.0 again, the function called after X509ToObject in the stack trace seems to differ:

<--- Last few GCs --->

[1560:0x6003ef0]    85165 ms: Mark-sweep (reduce) 2043.8 (2083.1) -> 2043.4 (2083.8) MB, 1600.2 / 0.0 ms  (+ 0.0 ms in 69 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 1880 ms) (average mu = 0.485, current mu = 0.233) [1560:0x6003ef0]    87646 ms: Mark-sweep (reduce) 2044.4 (2083.8) -> 2044.3 (2084.6) MB, 2454.3 / 0.0 ms  (average mu = 0.255, current mu = 0.011) allocation failure scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
 1: 0xa7f490 node::Abort() [node]
 2: 0x9a5c4d node::FatalError(char const*, char const*) [node]
 3: 0xc6c2ae v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xc6c627 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xe360a5  [node]
 6: 0xe65633 v8::internal::EvacuateNewSpaceVisitor::Visit(v8::internal::HeapObject, int) [node]
 7: 0xe721e6 v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::MemoryChunk*, long*) [node]
 8: 0xe5e30f v8::internal::Evacuator::EvacuatePage(v8::internal::MemoryChunk*) [node]
 9: 0xe5e588 v8::internal::PageEvacuationTask::RunInParallel(v8::internal::ItemParallelJob::Task::Runner) [node]
10: 0xe503a9 v8::internal::ItemParallelJob::Run() [node]
11: 0xe74170 void v8::internal::MarkCompactCollectorBase::CreateAndExecuteEvacuationTasks<v8::internal::FullEvacuator, v8::internal::MarkCompactCollector>(v8::internal::MarkCompactCollector*, v8::internal::ItemParallelJob*, v8::internal::MigrationObserver*, long) [node]
12: 0xe749b3 v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [node]
13: 0xe74d75 v8::internal::MarkCompactCollector::Evacuate() [node]
14: 0xe874e1 v8::internal::MarkCompactCollector::CollectGarbage() [node]
15: 0xe433a8 v8::internal::Heap::MarkCompact() [node]
16: 0xe44d38 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
17: 0xe482dc v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
18: 0xe0bf3a v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
19: 0xe05444 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [node]
20: 0xe07540 v8::internal::FactoryBase<v8::internal::Factory>::NewRawOneByteString(int, v8::internal::AllocationType) [node]
21: 0xe0ddfd v8::internal::Factory::NewStringFromUtf8(v8::internal::Vector<char const> const&, v8::internal::AllocationType) [node]
22: 0xc83882 v8::String::NewFromUtf8(v8::Isolate*, char const*, v8::NewStringType, int) [node]
23: 0xbb774e node::crypto::X509ToObject(node::Environment*, x509_st*) [node]
24: 0xbb81e4 node::crypto::GetPeerCert(node::Environment*, std::unique_ptr<ssl_st, node::FunctionDeleter<ssl_st, &SSL_free> > const&, bool, bool) [node]
25: 0xc2b567 node::crypto::TLSWrap::GetPeerCertificate(v8::FunctionCallbackInfo<v8::Value> const&) [node]
26: 0xcd8cbb  [node]
27: 0xcda26c  [node]
28: 0xcda8e6 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
29: 0x14fa219  [node]
./run.sh: line 5:  1560 Aborted                 (core dumped) node index.js
Error: Process completed with exit code 134.

@RaisinTen
Copy link
Contributor

https://github.com/RaisinTen/node_https_test/runs/2140664480?check_suite_focus=true says that the crash was introduced in v15.5.0, so I think the relevant change lies somewhere in #36597. Since I don't see any change in src/crypto*, I can't really tell where the relevant change lies. ¯\_(ツ)_/¯

@RaisinTen
Copy link
Contributor

cc @nodejs/crypto

@RaisinTen RaisinTen added the confirmed-bug Issues with confirmed bugs. label Mar 18, 2021
@addaleax addaleax added the tls Issues and PRs related to the tls subsystem. label Mar 23, 2021
@schamberg97
Copy link
Contributor

schamberg97 commented Mar 24, 2021

https://github.com/RaisinTen/node_https_test/runs/2140664480?check_suite_focus=true says that the crash was introduced in v15.5.0, so I think the relevant change lies somewhere in #36597. Since I don't see any change in src/crypto*, I can't really tell where the relevant change lies. ¯_(ツ)_/¯

There, however, was an OpenSSL update in 15.5.0. Perhaps this caused the problem?

@addaleax
Copy link
Member

#37889 contains some background on the OpenSSL change that might be related here, as well as a standalone repro.

@addaleax
Copy link
Member

Looks like this is the relevant infinite loop:

while (X509_check_issued(cert->get(), cert->get()) != X509_V_OK) {

I'm not really familiar enough with the OpenSSL APIs to understand why, though.

@nils91
Copy link
Contributor Author

nils91 commented Mar 25, 2021

According to its documentation X509_check_issued checks if 'the keyUsage field (if present) of issuer allows certificate signing'. That method is defined in here and seems to return X509_V_ERR_KEYUSAGE_NO_CERTSIGN if no keyCertSign is present (what #37889 is about). As far as i can tell after looking through the source for openssl and the snippet @addaleax posted above for a bit there is no branch in the code that stops the loop if that error occurs.
Actually it seems to be X509_V_ERR_UNSPECIFIED.

@fknx
Copy link

fknx commented Mar 29, 2021

I am not sure if this is relevant, but I've verified the behavior change from OpenSSL 1.1.1g to 1.1.1h. With OpenSSL 1.1.1g the connection openssl.exe s_client -connect 127.0.0.1:4001 -CAfile localhost.cer fails with the Verify return code: 21 (unable to verify the first certificate) error, while it succeeds with OpenSSL 1.1.1h (Verify return code: 0 (ok)).

nils91 added a commit to nils91/node that referenced this issue Apr 4, 2021
@ramosbugs
Copy link

I can confirm this issue shows up on Mac OS 10.15.7 as well, though it just infinite loops for me (haven't let it run long enough to exhaust available memory, but I see it increasing steadily).

To make matters worse, Let's Encrypt's official recommendation (which I think is correct) includes the exact keyUsage extension that triggers this bug: https://letsencrypt.org/docs/certificates-for-localhost/#making-and-trusting-your-own-certificates

thanks for looking into this!

targos pushed a commit that referenced this issue Apr 29, 2021
Refs: #37757
Refs: #37889

PR-URL: #37990
Fixes: #37757
Reviewed-By: James M Snell <jasnell@gmail.com>
@sangaman
Copy link

I might have found a workaround: Removing the line

keyUsage = digitalSignature, keyEncipherment

from the cert.conf file and regenerating the certificate using the new config allows node to get a connection. However, that might not be applicable to all use-cases, as some applications can be picky about the keyUsage configuration. Edit: keyUsage, at least right now, requires keyCertSign to be set. See openssl/openssl#1418 for details. For this issue, adding keyCertSign to the keyUsage works as a workaround.

I was running into this bug and this comment saved me. With a cert that just had digitalSignature and keyEncipherment, I could not use tls and node rapidly gobbled up memory until crashing. If I just add certificate signing to the list of usages, the OOM goes away and tls worked. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. tls Issues and PRs related to the tls subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants