Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sending raw headers #2103

Closed
1 task done
pimterry opened this issue Dec 6, 2022 · 7 comments
Closed
1 task done

Support sending raw headers #2103

pimterry opened this issue Dec 6, 2022 · 7 comments

Comments

@pimterry
Copy link
Contributor

pimterry commented Dec 6, 2022

Is there an existing issue for this?

  • I've searched for any related issues and avoided creating a duplicate issue.

Description

(This is not a bug, it's a feature request, but there's no available template for that in https://github.com/websockets/ws/issues/new).

It would be helpful to be able to set raw headers for a WS client connection. This is notably useful because many services now use bot detection (e.g. many services that use Cloudflare/Akamai/etc) which blocks requests from all non-browser clients, detecting them both by TLS fingerprint and by fingerprinting details of HTTP connections, e.g. the header order, and the casing of case-insensitive header names.

Node itself already supports this in various HTTP APIs (e.g.) by accepting a flat array of [key, value, key, value] headers for the headers option (in addition to the object header format) which are then sent as-is without any further transformation to handle duplicates or header name casing. Headers are similarly exposed in the same raw format by the request.rawHeaders property.

Apparently this is also helpful for performance when proxying, which was Node's original justification for supporting this, allowing proxies to receive and forward the raw headers with no processing necessary.

It would be useful if WS also supported this. Right now WS assumes that opts.headers is always an object, e.g. here:

ws/lib/websocket.js

Lines 715 to 721 in ea76193

opts.headers = {
...opts.headers,
'Sec-WebSocket-Version': opts.protocolVersion,
'Sec-WebSocket-Key': key,
Connection: 'Upgrade',
Upgrade: 'websocket'
};

ws version

All (feature request)

Node.js Version

All (feature request)

System

All (feature request)

Expected result

It should be possible to set raw headers like so:

new WebSocket(wsUrl, {
    headers: ['my-header', 'my-value', 'Another-Header', 'another-value', 'my-header', 'dupe-value']
});

and have the upstream request made with all headers (including the duplicate) in the given order, and with the header casing preserved as provided.

Actual result

It's not possible to provide raw header data.

Attachments

I'm happy to investigate this myself and open a PR for it - I'm mainly just looking for confirmation that this is a feature you're open to.

I think because changing this would imply tracking headers internally everywhere in raw format, this could complicate internal header handling slightly, but I think it's manageable without much impact. It could also subtly change the format of headers sent upstream in the existing object header option case (e.g. the best approach would imply preserving the user-specified case from object header keys, instead of lowercasing automatically as now). There should be no changes though that would affect the normal semantics of the request, and I expect that's acceptable to everybody who isn't already concerned with sending raw headers like this anyway.

@lpinca
Copy link
Member

lpinca commented Dec 6, 2022

If it does not add a lot of complexity I'm open to it. Feel free to open a PR.

@lpinca
Copy link
Member

lpinca commented Dec 6, 2022

I think this will be kinda messy and support for a flat array was backported only back to Node.js 14.14.0. ws still supports Node.js 10.

The header casing and order is preserved also when using an object.

$ cat index.js 
const WebSocket = require('ws');

const server = new WebSocket.Server({ port: 0 }, function () {
  const ws = new WebSocket(`ws://localhost:${server.address().port}`, {
    headers: { 'A': '1', 'b': 2 }
  });

  ws.on('open', ws.close);
});

server.on('connection', function (ws, req) {
  console.log(req.rawHeaders);
  server.close();
});
$ node index.js 
[
  'A',
  '1',
  'b',
  '2',
  'Sec-WebSocket-Version',
  '13',
  'Sec-WebSocket-Key',
  'YXby+A0rUpyIjJvrED+bcA==',
  'Connection',
  'Upgrade',
  'Upgrade',
  'websocket',
  'Sec-WebSocket-Extensions',
  'permessage-deflate; client_max_window_bits',
  'Host',
  'localhost:52113'
]

so I think we can accept the user provided object as is and override the Sec-WebSocket-* headers and the other headers set by ws only if they are not already set.

The only thing missing would be duplicates but I think no browser sends duplicate headers in the WebSocket handshake request.

What do you think?

@pimterry
Copy link
Contributor Author

pimterry commented Dec 9, 2022

These are all good points! Thanks for looking into this.

I've just done some further investigation and testing of the specific issue that inspired this (httptoolkit/httptoolkit#363) and it seems that my header issue isn't directly caused by not reproducing raw headers as I had assumed.

In this specific case, the issue just comes down to the Origin header. This connection works:

new WebSocket('wss://remote-auth-gateway.discord.gg/?v=2', {
    headers: {
        Origin: 'https://discord.com'
    }
});

but changing Origin to origin fails with a 403.

Most likely that's a Discord server bug: header names should be case insensitive, but they've never worried about that because all browsers send it in uppercase.

In my case, this lowercasing happened because I have to use separate logic for proxying with WS compared to HTTP requests, because forwarding the raw headers directly isn't supported, and as part of transforming the headers into the object form they were being lowercased incorrectly.

I'm going to fix that now on my side. In future it would still be useful to have a raw header API like this in WS, for consistency with Node's HTTP APIs, for more direct control generally (e.g. if you really do want to control the order of duplicate headers for some reason - browsers aren't the only clients you might want to emulate) and to avoid the performance cost & extra hassle of transforming raw headers into objects for every proxied websocket.

In the meantime though, this means it's no longer urgent for me, and so imo waiting until WS completely drops support for Node versions <14 before looking at this seems like the easiest option.

@lpinca
Copy link
Member

lpinca commented Dec 9, 2022

The feature has basically always been supported if an array of arrays is used.

var http = require('http');
var net = require('net');

var server = net.createServer();

server.on('connection', function (socket) {
  var data = '';

  socket.setEncoding('utf-8');
  socket.on('data', function (chunk) {
    process.stdout.write(chunk);

    data += chunk;

    if (/\r\n\r\n$/.test(data)) {
      socket.write(
        'HTTP/1.1 200 OK\r\n' +
          'Content-Length: 0\r\n' +
          'Connection: close\r\n\r\n'
      );
    }
  });
});

server.listen(function () {
  var request = http.get({
    port: server.address().port,
    headers: [
      ['foo', '1'],
      ['foo', '2']
    ]
  });

  request.on('response', function (response) {
    response.on('end', function () {
      server.close();
    });

    response.resume();
  });
});

Ironically it is not documented here https://nodejs.org/api/http.html#httprequestoptions-callback. I think it should be.

There is no need to wait for ws to drop support for Node.js < 14, but I find it hard to justify the added complexity (three different ways to specify headers and the mess it means for redirect handling) only to allow duplicate header names. Again, the user provided casing is already preserved (when not following redirects) and the user provided order can be preserved with a small patch.

@pimterry
Copy link
Contributor Author

There is no need to wait for ws to drop support for Node.js < 14, but I find it hard to justify the added complexity (three different ways to specify headers and the mess it means for redirect handling) only to allow duplicate header names.

That makes sense, thanks. That said, it's not just support for duplicate header names - it is also just a more convenient & performant API if you already have the headers in a raw format, as proxies generally will. But regardless, my immediate problem is now fixed, so if you're definitely not interested in this then that's fair enough, and it's not a huge problem from my POV.

It's not urgent anyway, so it can always just be ignored until a more concrete use case appears.

the user provided order can be preserved with a small patch

You mean by checking whether the extra headers that will be added are already set, and preserving their order if so, right?

That makes sense to me. You're right that that's the more important issue, since browsers don't generally send duplicate headers for websockets right now, and that would definitely help when dealing with any header fingerprinting issues like this in future.

@lpinca
Copy link
Member

lpinca commented Dec 12, 2022

You mean by checking whether the extra headers that will be added are already set, and preserving their order if so, right?

Correct.

@pimterry
Copy link
Contributor Author

You mean by checking whether the extra headers that will be added are already set, and preserving their order if so, right?

Correct.

Ok, so I started to look into this, and it seems like this actually works already too 😆

More specifically, on master with node v16.18, when you run this:

const WebSocket = require('ws');

const server = new WebSocket.Server({ port: 0 }, function () {
  const ws = new WebSocket(`ws://localhost:${server.address().port}`, {
    headers: {
      'Sec-WebSocket-Extensions': 'bad-value',
      Upgrade: 'websocket',
      host: `localhost:${server.address().port}`,
      'A': '1',
      'sec-websocket-key': 'qweasd',
      'Sec-WebSocket-Version': '-1',
      connection: 'Upgrade',
      'b': 2
    }
  });

  ws.on('open', ws.close);
});

server.on('connection', function (ws, req) {
  console.log(req.rawHeaders);
  server.close();
});

It prints the raw headers received by the server, in the same order as they're defined above, just resetting the casing & values:

[
  'Sec-WebSocket-Extensions',
  'permessage-deflate; client_max_window_bits',
  'Upgrade',
  'websocket',
  'host',
  'localhost:33561',
  'A',
  '1',
  'Sec-WebSocket-Key',
  '64M1COpC6V+3nTAS70/rgw==',
  'Sec-WebSocket-Version',
  '13',
  'Connection',
  'Upgrade',
  'b',
  '2'
]

Just to document this explicitly, I think that means:

  • It's already possible to control the order of all headers, except duplicates.
  • It's possible to control the case of all headers, except those controlled by WS (e.g. sec-websocket-key and connection above are sent as Sec-WebSocket-Key and Connection).
  • It's possible to control the value of all headers, except those controlled by WS (i.e. all Sec-WebSocket-* headers, Connection & Upgrade values above are always overridden).

Given that, I'm going to close this. I think we've shown you can control almost everything already, the uncontrolled details all match browser behaviour anyway, and the original issue was mainly caused by me not using the API correctly to control the case with these tools that are already available.

I would still find a raw header API nice-to-have one day, just for convenience (since I'm using raw headers everywhere else) and to guarantee perfect reproduction of headers in general (i.e. casing for controlled headers, duplicates, etc) but it's just not that important - there's enough control of raw formatting here already and currently I don't have a single example where those extras would matter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants