Revert "tls: add highWaterMark option for connect" #33387

puzpuzpuz · 2020-05-13T12:30:51Z

Reverts commit 58682d8

As it was discussed in #33262, TLS assumes 16KB as the maximum record size, so it makes no sense to expose hwm option in tls.connect(). Even if we do so, it's also a problem as connections may be reused in https (see #30107 (comment)).

Adds a note on applicability of highWaterMark for tls.TLSSocket (though, I'm not sure if that's the right place for this note).

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
documentation is changed or added
commit message follows commit guidelines

This reverts commit 58682d8.

richardlau · 2020-05-13T12:48:28Z

Can we revert this (as opposed to (doc?) deprecate) as #32786 went out in 13.4.0 and 14.1.0?

doc/api/https.md

BridgeAR · 2020-05-13T12:41:10Z

doc/api/tls.md

@@ -650,6 +650,10 @@ Methods that return TLS connection metadata (e.g.
 [`tls.TLSSocket.getPeerCertificate()`][] will only return data while the
 connection is open.

+The `highWaterMark` parameter of the duplex [Stream][] makes no sense for
+`tls.TLSSocket`, because the TLS protocol assumes 16KB as the maximum record
+size.


I suggest to move this comment down below and to mention that it's not supported by extending the following entry * ...: Any [`socket.connect()`][] option not already listed..

Moved it. Please check the update.

puzpuzpuz · 2020-05-13T12:57:36Z

Can we revert this (as opposed to (doc?) deprecate) as #32786 went out in 13.4.0 and 14.1.0?

@richardlau that's a good question. I don't think anyone started using this option already. I could keep entries in changes sections and add a new one for the removal. This will help to avoid confusion when users see an option that silently disappeared. WDYT?

richardlau · 2020-05-13T12:59:58Z

Can we revert this (as opposed to (doc?) deprecate) as #32786 went out in 13.4.0 and 14.1.0?

@richardlau that's a good question. I don't think anyone started using this option already. I could keep entries in changes sections and add a new one for the removal. This will help to avoid confusion when users see an option that silently disappeared. WDYT?

I think I'd be okay with that. The code will just ignore the option in the case anyone is setting it, correct?

Also restores change entries for `highWaterMark` option in `tls` and `https` docs.

puzpuzpuz · 2020-05-13T13:12:26Z

I think I'd be okay with that. The code will just ignore the option in the case anyone is setting it, correct?

Yes, that's right. Also that option wasn't working as expected when set (see #33262), so this PR shouldn't break any user code.

addaleax · 2020-05-13T17:52:39Z

@benjamingr Is bringing this over in the issue thread, and it probably makes sense to discuss it there, but what does the highWaterMark option have to do with the TLS record size? I don’t really see how they are related, conceptually. The highWaterMark option is telling us how much data to buffer at most in order to reduce latency at the cost of extra memory – but it doesn’t matter whether it takes 1 or 2 or 20 TLS frames to write that, does it?

addaleax

See the comment above – I’m not opposed but we should be clear about why it makes sense to remove this.

puzpuzpuz · 2020-05-13T18:15:57Z

The highWaterMark option is telling us how much data to buffer at most in order to reduce latency at the cost of extra memory – but it doesn’t matter whether it takes 1 or 2 or 20 TLS frames to write that, does it?

@addaleax as it was discussed in #33262, such buffering does not happen in tls for some reasons. So, we either need to revert the change, or to fix tls. I came up with the revert as the fastest option, but the fix is certainly more valuable for users. So, I'm going to check what it takes to fix the option.

addaleax · 2020-05-13T18:28:07Z

@puzpuzpuz This test passes for me:

Test in the fold

'use strict';
const common = require('../common');
if (!common.hasCrypto)
  common.skip('missing crypto');

const assert = require('assert');
const tls = require('tls');
const { PassThrough } = require('stream');
const fixtures = require('../common/fixtures');

const pem = (n) => fixtures.readKey(`${n}.pem`);

const server = tls.createServer({
  key: pem('agent1-key'),
  cert: pem('agent1-cert')
}, common.mustCall((client) => {
  // Do not read from the client here.
  client.destroy();
  server.close();
}));

server.listen(0, common.mustCall(() => {
  const stream = tls.connect({
    port: server.address().port,
    rejectUnauthorized: false,
    highWaterMark: 128000,
  }, common.mustCall(() => {
    const pt = new PassThrough();
    pt.pipe(stream);
    for (let i = 0; i < 256; i++)
      pt.write('a'.repeat(1000));
    process.nextTick(() => {
      assert.strictEqual(stream.writableLength, 128000);
    });
  }));
  stream.on('error', () => {});
}));

So I’d take that as proof that the HWM option does work?

benjamingr · 2020-05-13T18:44:46Z

@addaleax

but it doesn’t matter whether it takes 1 or 2 or 20 TLS frames to write that, does it?

That has been my understanding as well. I agree we shouldn't remove highWatermark support from tls just because it doesn't work with HTTPs.

In general setting a highWatermark in one part of a stream doesn't ensure that backpressure will be correctly respected throughout the different transformations of the stream.

(A semi related interesting side note is the time when David Fowler who built the new .net web framework raised this issue of buffering streams when discussing I/O interfaces in Deno a while ago denoland/deno#387 (comment) to denoland/deno#387 (comment) )

@puzpuzpuz I'm sorry but I'm still confused:

as it was discussed in #33262, such buffering does not happen in tls for some reasons.

In #33262 (comment) @bnoordhuis said:

I don't think readableHighWaterMark can be made meaningful for TLS/HTTPS because that 16k will still be sitting in a buffer somewhere.

There is a difference between "doesn't currently work" and "doesn't conceptually make sense". I'm not sure why having 16k sitting in a buffer somewhere mean Node.js can't buffer multiples of 16K with higher highWaterMark.

puzpuzpuz · 2020-05-13T19:34:34Z

@puzpuzpuz This test passes for me:
...
So I’d take that as proof that the HWM option does work?

Not really or, to be more precise, it works partially. The original discussion was about readableHighWaterMark part of HWM, so the test should roughly look like the following one.

Test in the fold

'use strict';
const common = require('../common');
if (!common.hasCrypto)
  common.skip('missing crypto');

const assert = require('assert');
const tls = require('tls');
const { Buffer } = require('buffer');
const fixtures = require('../common/fixtures');

const pem = (n) => fixtures.readKey(`${n}.pem`);

const hwm = 16 * 1024; // this value works, while 128000 does not

const server = tls.createServer({
  key: pem('agent1-key'),
  cert: pem('agent1-cert')
}, common.mustCall((socket) => {
  socket.write(Buffer.allocUnsafe(hwm), common.mustCall(() => {
    socket.destroy();
    server.close();
  }));
}));

server.listen(0, common.mustCall(() => {
  const socket = tls.connect({
    port: server.address().port,
    rejectUnauthorized: false,
    highWaterMark: hwm,
  }, common.mustCall(() => {}));
  socket.on('data', (data) => {
    assert.strictEqual(data.length, hwm);
  });
  socket.on('error', () => {});
}));

puzpuzpuz · 2020-05-13T19:40:45Z

@puzpuzpuz I'm sorry but I'm still confused:

as it was discussed in #33262, such buffering does not happen in tls for some reasons.

In #33262 (comment) @bnoordhuis said:

I don't think readableHighWaterMark can be made meaningful for TLS/HTTPS because that 16k will still be sitting in a buffer somewhere.

There is a difference between "doesn't currently work" and "doesn't conceptually make sense". I'm not sure why having 16k sitting in a buffer somewhere mean Node.js can't buffer multiples of 16K with higher highWaterMark.

@benjamingr I wasn't saying "doesn't conceptually make sense", but "doesn't currently work" (and it doesn't work as expected - see #33387 (comment)). Could you specify what exactly confuses you?

addaleax · 2020-05-13T19:41:33Z

@puzpuzpuz But that’s not what the HWM indicates – it’s not the size of the chunks in which data is read, it is the limit up to which data is read before a 'data' handler or .read() call consumes it.

puzpuzpuz · 2020-05-13T19:45:13Z

@puzpuzpuz But that’s not what the HWM indicates – it’s not the size of the chunks in which data is read, it is the limit up to which data is read before a 'data' handler or .read() call consumes it.

@addaleax then my understanding wasn't correct. Let me double check how HWM works. Thanks for the clarification.

benjamingr · 2020-05-13T19:49:12Z

@puzpuzpuz the goal of watermarks is to enable steaming and save I/O at the cost of memory. As an example see this diagram from the Chromium wiki that describes how Chrome downloads MP4 files:

In your example setting the highWatermark to 128k only changes the amount of buffering not the size of each 'data'.

puzpuzpuz · 2020-05-13T20:17:27Z

@addaleax then my understanding wasn't correct. Let me double check how HWM works. Thanks for the clarification.

@addaleax after checking the documentation it seems that my understanding was correct, i.e. HWM sets the up to limit for a readable stream. Considering this, I agree that the option in tls works correctly by the definition of HWM.

Yet, I'm curious about the 16KB limitation. Do you happen to know where it comes from? TLS record size may be the answer, but as discussed here multiple records could be buffered when reading from the socket.

bnoordhuis · 2020-05-14T10:51:38Z

I'm basing my comment in the other issue on this line from doc/api/stream.md:

The amount of data potentially buffered depends on the highWaterMark option

In the context of receiving data (sending might a different matter), a HWM < 16k is meaningless because it's effectively ignored (because TLS frame size.)

HWM > 16k is really HWM + 16k because the TLS frame record buffer isn't under control of JS land. Any HWM buffering is on top of that frame record buffer.

benjamingr · 2020-05-14T10:59:35Z

, a HWM < 16k is meaningless because it's effectively ignored (because TLS frame size.)

That makes sense, I'm +1 on at least documenting HWM < 16K not being supported.

Any HWM buffering is on top of that frame record buffer.

Right, that is my expectation and why I'd use HWM - why is there an issue with hwm > 16K?

puzpuzpuz · 2020-05-14T11:41:11Z

That makes sense, I'm +1 on at least documenting HWM < 16K not being supported.

I'll close this PR and submit another one with the doc change. But I'm also curious about the reason behind this restriction.

benjamingr · 2020-05-14T11:56:41Z

But I'm also curious about the reason behind this restriction.

What restriction in particular? HWM must be at least 16K because buffering less than that in TLS doesn't make much sense because any size smaller than one frame record can't be understood without the rest of the frame.

I don't think it has to be at most 16Kb, I think Anna's test shows it (with the support already added) can be more than 16Kb and be multiple frames.

We can discuss another property that lets users control the size of each chunk emitted (separate from hwm), that would presumably have to be multiples of 16kb though I'm not sure we'd even want to support customizing that on the stream (as it's an implementation detail).

puzpuzpuz · 2020-05-14T12:04:04Z

What restriction in particular?

The "I'd use HWM - why is there an issue with hwm > 16K?" one from #33387 (comment)

puzpuzpuz · 2020-05-14T12:07:09Z

I don't think it has to be at most 16Kb, I think Anna's test shows it (with the support already added) can be more than 16Kb and be multiple frames.

The test shows that the option applies to writableHighWaterMark in the underlying TLS stream. And if I'm not mistaken the latest messages consider readableHighWaterMark behavior.

bnoordhuis · 2020-05-14T21:40:13Z

why is there an issue with hwm > 16K?

It's not a reliable measure if you use it to limit memory usage.

Say I have M bytes of memory. Most people probably expect S=M/HWM, where S is the number of TLS streams that can exist simultaneously.

But it doesn't work that way. The real formula is S=M/(HWM+X), where X is the additional overhead of the TLS stream: frame record buffer, keying material, etc.

X can be sizable (an extra 40 or 50k is not exceptional) and can vary over time. Maybe I set HWM=32k but the actual memory footprint might well be double that.

benjamingr · 2020-05-15T00:00:23Z

I always thought of hwm like one would a literal watermark, it's the amount of data I have where I stop asking for more. It was never the absolute maximum amount of data I can hold in a stream. It's only the amount of data I stop asking for new data at.

There is no expectation on my part that setting the hwm somehow limits how much memory my program could consime. If you believe users have such an expectation we should probably document the fact it is not possible to calculate the number of streams a server can support with a given amount of memory based on hwm

bnoordhuis · 2020-05-15T17:13:42Z

I asked a few people and if what you're saying is the right way to think about HWM, then our docs are supremely terrible at conveying that because no one guessed it correctly (including me.)

From stream.md:

Once the total size of the internal read buffer reaches the threshold specified
by `highWaterMark`, the stream will temporarily stop reading data from the
underlying resource until the data currently buffered can be consumed (that is,
the stream will stop calling the internal `readable._read()` method that is
used to fill the read buffer).

My, ah, proof readers all thought that means it's a memory usage cap. And yes, I was careful to avoid leading questions. :-)

jasnell · 2020-05-15T22:30:46Z

@benjamingr :

I always thought of hwm like one would a literal watermark, it's the amount of data I have where I stop asking for more. It was never the absolute maximum amount of data I can hold in a stream. It's only the amount of data I stop asking for new data at.

This is correct. Once the highwatermark is hit the object will still happily keep buffering more but will keep telling you to stop. Specific stream implementations could choose to enforce stricter limits but doing so is optional. Specifically, it's a threshold, not a limit.

benjamingr · 2020-05-16T12:43:06Z

@jasnell @bnoordhuis I opened a pull request partially based on James's comment and my previous one for the docs. #33432

Feel free to make changes/suggestions directly or if you believe we should make a more substantial change in the docs

BridgeAR · 2020-05-23T16:46:28Z

I am going to close this due to the discussion above. Please reopen in case someone would rather keep this open. I think it might be a good idea to open a PR that adds the test mentioned above by @addaleax

puzpuzpuz · 2020-05-23T18:41:58Z

@BridgeAR thanks for closing this PR. I wasn't able to follow the discussion for some time and after reading the updates I agree with all points.

@benjamingr thanks for opening the doc enhancement PR!

Revert "tls: add highWaterMark option for connect"

6324732

This reverts commit 58682d8.

puzpuzpuz added tls Issues and PRs related to the tls subsystem. doc Issues and PRs related to the documentations. labels May 13, 2020

puzpuzpuz force-pushed the revert-tls-hwm-option branch from 16d1ce8 to c23f5d3 Compare May 13, 2020 12:31

puzpuzpuz mentioned this pull request May 13, 2020

HTTPS highwatermark not working #33262

Closed

BridgeAR reviewed May 13, 2020

View reviewed changes

doc: add note for hwm restriction in tls

de46fc4

Also restores change entries for `highWaterMark` option in `tls` and `https` docs.

puzpuzpuz force-pushed the revert-tls-hwm-option branch from c23f5d3 to de46fc4 Compare May 13, 2020 13:11

addaleax requested changes May 13, 2020

View reviewed changes

benjamingr mentioned this pull request May 16, 2020

doc: add comment about highWaterMark limit #33432

Closed

4 tasks

BridgeAR closed this May 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert "tls: add highWaterMark option for connect" #33387

Revert "tls: add highWaterMark option for connect" #33387

puzpuzpuz commented May 13, 2020

richardlau commented May 13, 2020

BridgeAR May 13, 2020

puzpuzpuz May 13, 2020

puzpuzpuz commented May 13, 2020

richardlau commented May 13, 2020

puzpuzpuz commented May 13, 2020

addaleax commented May 13, 2020

addaleax left a comment

puzpuzpuz commented May 13, 2020

addaleax commented May 13, 2020

benjamingr commented May 13, 2020

puzpuzpuz commented May 13, 2020 •

edited

puzpuzpuz commented May 13, 2020

addaleax commented May 13, 2020

puzpuzpuz commented May 13, 2020

benjamingr commented May 13, 2020

puzpuzpuz commented May 13, 2020

bnoordhuis commented May 14, 2020

benjamingr commented May 14, 2020

puzpuzpuz commented May 14, 2020

benjamingr commented May 14, 2020

puzpuzpuz commented May 14, 2020

puzpuzpuz commented May 14, 2020

bnoordhuis commented May 14, 2020

benjamingr commented May 15, 2020

bnoordhuis commented May 15, 2020

jasnell commented May 15, 2020

benjamingr commented May 16, 2020

BridgeAR commented May 23, 2020

puzpuzpuz commented May 23, 2020

Revert "tls: add highWaterMark option for connect" #33387

Revert "tls: add highWaterMark option for connect" #33387

Conversation

puzpuzpuz commented May 13, 2020

Checklist

richardlau commented May 13, 2020

BridgeAR May 13, 2020

Choose a reason for hiding this comment

puzpuzpuz May 13, 2020

Choose a reason for hiding this comment

puzpuzpuz commented May 13, 2020

richardlau commented May 13, 2020

puzpuzpuz commented May 13, 2020

addaleax commented May 13, 2020

addaleax left a comment

Choose a reason for hiding this comment

puzpuzpuz commented May 13, 2020

addaleax commented May 13, 2020

benjamingr commented May 13, 2020

puzpuzpuz commented May 13, 2020 • edited

puzpuzpuz commented May 13, 2020

addaleax commented May 13, 2020

puzpuzpuz commented May 13, 2020

benjamingr commented May 13, 2020

puzpuzpuz commented May 13, 2020

bnoordhuis commented May 14, 2020

benjamingr commented May 14, 2020

puzpuzpuz commented May 14, 2020

benjamingr commented May 14, 2020

puzpuzpuz commented May 14, 2020

puzpuzpuz commented May 14, 2020

bnoordhuis commented May 14, 2020

benjamingr commented May 15, 2020

bnoordhuis commented May 15, 2020

jasnell commented May 15, 2020

benjamingr commented May 16, 2020

BridgeAR commented May 23, 2020

puzpuzpuz commented May 23, 2020

puzpuzpuz commented May 13, 2020 •

edited