Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement net.Socket interceptor #515

Open
wants to merge 75 commits into
base: main
Choose a base branch
from

Conversation

kettanaito
Copy link
Member

@kettanaito kettanaito commented Mar 3, 2024

Yet another attempt at the Socket-based interceptor. This time, we are tapping into the HTTPParser from Node.js to give us the HTTP message parsing.

Roadmap

  • Implement SocketController.respondWith(). The challenge is to convert the Fetch API Response to the HTTP message and push it to the socket instance. We can use https://github.com/har-tools/http-message if nothing else works.
  • Implement error replays. Depends on how we handle the compatibility tests below.
    • I think duplicate connect and ready events are OK since those are internal and no consumers really depend on them. The main worry is the duplicate .write() callbacks since the request body would be written twice in the case of request listener (body read 1) + a bypassed request to an existing hostname (body read 2). We can always suppress the callbacks and then replay them.
  • Push the mockConnect() event until after the "request" interceptor event has been awaited.
    • Cannot do this. ClientRequest won't start writing the request body until it receives the "connect" and "ready" events from the socket.
  • Add a new test suite for compatibility
    • 1: Bypassed request to an existing host (no request listeners).
    • 2: Bypassed request to a non-existing host (no request listeners).
    • 3: Mocked request to an existing host (no response returned).
    • 4: Mocked request to an existing host (mocked response returned).
    • 5: Mocked request to a non-existing host (no response returned).
    • 6: Mocked request to a non-existing host (mocked response returned).
  • Socket compliance:
    • Respect the signal Socket constructor options (I presume for aborting the connection). Consider AbortSignal.timeout(n) as well. test(MockHttpSocket): add "signal" tests #559
    • Respect the right order of events in all compatibility scenarios.
    • Support Socket.setTimeout() that should time out the connection if it hasn't been established within the given time window. Support these for actual socket connections. test: add setTimeout tests #558
    • Add compliance tests for the case when a single Socket is reused for multiple requests. We need to collocate request/responses correctly and handle their streams independently (likely bound the request/response stream to particular request/response instances instead of setting it as a property on the MockHttpSocket class).
  • Support TLS connections. Those use tls.connect(), not net.connect().
  • Add compliance test for non-HTTP transfer over Socket (e.g. SMTP, see this). We should do nothing about these, it's out of scope for this feature.
    • Irrelevant since we aren't patching net.connect() anymore. We utilize MockSocket to create a custom http.Agent and it's only applied to the http module.
  • Destroy the this.requestStream and this.responseStream at the right time. Consider that the consumer might've not consumed either of the streams by the time the *End() methods on MockHttpSocket are called (that will cause a premature close error).
  • Clean up ClientRequest/utils for it contains a lot of unused code now. chore: clean up ClientRequest utils #557
  • Inherit TLSSocket properties like .authorized and .encrypyed (see Fix socket behaviour #455). This is related to inheriting other properties like timeout and signal. (fix(MockHttpSocket): forward tls socket properties #556)
  • Update README.md since the interception algorithm is fundamentally different now.
  • Fix follow-redirect-http test, it's pending forever (skipped in 230c238).
  • fix(MockHttpSocket): handle response stream errors #548

@kettanaito
Copy link
Member Author

We can spy on ClientRequest.prototype.write() and ClientRequest.prototype.end() to set an internal [kRequestStream] readable stream. That way, we can access the stream before the socket emits the connect event.

Since the body stream is the last thing we need to call the request listener, we can call the listener even before deciding whether to emit the connect+ready on the socket instance.

@mikicho, what do you think about this approach?

@mikicho
Copy link
Contributor

mikicho commented Mar 3, 2024

Yet another attempt at the Socket-based interceptor

The N time's a charm. :)

Since the body stream is the last thing we need to call the request listener, we can call the listener even before deciding whether to emit the connect+ready on the socket instance.

I like the direction you are going!

Would it work in the following case?

const req = http.request(...)
req.write('sync') // OK

await sleep(1000)
req.write('async') // I think it will work, but I'm not sure about it. my concern is that the ClientRequest would await for the `connect` event to continue
req.end()

@mikicho
Copy link
Contributor

mikicho commented Mar 3, 2024

It appears that the process.binding function is going to be removed.
Update: seems like we can use const { HTTPParser } = require('_http_common') (or import { HTTPParser } from 'node:_http_common') instead.

Also, a question, Would the socket-based interceptor support fetch as well? or it would still require a different interceptor?
If not, we may implement very similar apparch with setGlobalDispatcher

const self = this
const originalConnect = net.Socket.prototype.connect

net.Socket.prototype.connect = function (normalizedOptions) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ...normalizedOptions here, if it's passed on as arguments to connect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gioragutt, I think it should be but Node.js provides here a single normalized object of options. Let's dive into the source for net.Socket.connect to verify that's the case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, how is the spread in line 44 working? JS syntax wise it wouldn't make sense otherwise

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I write about this here:

/**
* @note In some cases, "Socket.prototype.connect" will receive already
* normalized arguments. The call signature of that method will differ:
* .connect(port, host, cb) // unnormalized
* .connect([options, cb, normalizedSymbol]) // normalized
* Check that and unwrap the arguments to have a consistent format.
*/
const unwrappedArgs = Array.isArray(args[0]) ? args[0] : args
const normalizedSocketConnectArgs = net._normalizeArgs(unwrappedArgs)
const createConnection = () => {
return originalConnect.apply(this, args)
}

TL;DR Node.js will normalize the args for .connect() in some cases but not the others. That's why we get different type of args if the socket connection is called as a part of http.request() and raw like new Socket().connect().

@kettanaito
Copy link
Member Author

kettanaito commented Mar 4, 2024

Request event duplication issue

With this approach, we have to mock the connection to the host every time in order for the ClientRequest (the higher-level consumer) to start streaming the buffered request body through the socket (it won't start that until it receives the connect event).

This means a few different sequences of events depending on the type of the request.

Request to an existing endpoint with a mocked response

  1. lookup (mocked).
  2. connect (mocked).
  3. ready (mocked).
  4. INTERNAL.
    1. Reading the request body.
    2. Firing "request" on the intercepted.
    3. Piping the mocked response to the socket.
  5. data.
  6. response.
  7. end.

Request to a non-existing endpoint with a mocked response

The same thing as the one above.

Request to a non-existing host without a mocked response

This is where things get a bit tricky.

  1. lookup (mocked).
  2. connect (mocked).
  3. ready (mocked).
  4. INTERNAL (the same request streaming to the interceptor).
  5. error. Emit the suppressed lookup (connection) error since there's no mock in the interceptor so the consumer must receive the error about requesting a non-existing host.

This scenario is different from the event order in the unpatched scenario. The initial lookup event would include an error, and mirror itself in the error event on the socket also.

In our case, the lookup/connect/ready are successful. This is problematic for a few reasons:

  1. Diverges from the "normal" event flow.
  2. The socket errored and when the client will try writing to it (request body), while we will be able to forward those chunks to the HTTPParser, the underlying socket will throw Error: write EBADF error, which I believe indicates an invalid write.

While no direct consumers care about the lookup/connect/ready events, the error event is rather crucial. In either case, keeping the event flow as close to the normal behavior as possible is desirable.

Ideas how to resolve this

Option 1: Shadow request

We can spy on the net.Socket, and whenever it's constructed, construct a "shadow" copy of it internally and run the entire request/response on that shadow while keeping the original socket pending.

class SocketController {
    constructor(socket) {
        this.socket = socket;
        // Create a shadow socket.
        this.shadow = new net.Socket(...socketArgs);
        
        // Always mock the connection on the shadow.
        this.mockConnect(this.shadow)
        
        // Observe the shadow, not the original socket.
        this.shadow.write = proxy
        this.shadow.push = proxy
        
        this.onShadowHasMock = (response) => {
            // If the shadow socket received a mocked response
            // from the "request" listener in the interceptor,
            // mock the original socket connection (it doesn't matter now)
            // and pipe the response to the original socket.
            this.mockConnect(this.socket)
            pipeResponse(response, this.socket)
        }
        
        this.onShadowHasNoMock = () => {
            // If the interceptor had no mocked response for this request,
            // continue with the original socket as-is.
            // The original socket has been pending up to this moment.
        }
        
    }
}

In order for us to still respect the original socket and not to implement a dummy Socket class to provision this "waiting" period, we can do something like an infinitely pending promise on the crucial methods of the original socket that we then resolve in case of onShadowHasNoMock.

this.socket.emit = applyProxyHere((...args) => {
    waitForShadow().then(() => {
        Reflect.apply(...args)
    })
})

If the shadow socket has a mock, the waitForShadow promise will keep pending and won't ever be resolved (alternatively, we can either resolve it with a flag or reject it and provide a .catch() handler). If the shadow socket finds no mocked response, it will resolve the promise, which will trigger the method on the original socket in the order that they were originally called.

This should include at least .emit and .write, but may require to add the promise plug to more methods.

Problems with this solution:

  • The shadow socket won't receive any writes until the original socket pipes them to the shadow socket. And the original socket won't start piping unless the ClientRequest receives the connect event from it. Catch 22.

@kettanaito kettanaito force-pushed the feat/yet-another-socket-interceptor branch from 54aa046 to 5d9a812 Compare March 4, 2024 12:25
@kettanaito
Copy link
Member Author

I tried solving the socket event order issue in f7fded8. I believe it works great, now I'm writing tests to confirm that.

@mikicho
Copy link
Contributor

mikicho commented Mar 5, 2024

this.mockConnect(this.shadow)

How is this propogate the connect/ready/lookup events to the ClientRequest so it will continue to write the data?

Catch 22

Problems with this solution:

Probably not a problem, but I want to make sure I understand, in this option we will emit connect/ready/lookup twice in case of non-mocked request, right?

While no direct consumers care about the lookup/connect/ready events

It seems like Axios doesn't use this, but got exposes this data to its users, I'm not sure for what tho...

@mikicho
Copy link
Contributor

mikicho commented Mar 5, 2024

I opened a feature request for official mocking API like undici has: nodejs/node#51979

@kettanaito
Copy link
Member Author

It seems like Axios doesn't use this, but got exposes this data to its users, I'm not sure for what tho...

It doesn't expose that data but rather hooks into those socket events to provide timings measurements. That's okay. If you take a look at the current state of the branch, it emits the events correctly for all scenarios we've talked about. There's no duplication.

I opened a feature request for official mocking API like undici has: nodejs/node#51979

This is great! I'd certainly love to hear Node's opinion on this. I would count on this landing any time soon so our work on the Socket interceptor is relevant for at least a few years.

Probably not a problem, but I want to make sure I understand, in this option we will emit connect/ready/lookup twice in case of non-mocked request, right?

Not anymore.

@mikicho
Copy link
Contributor

mikicho commented Mar 6, 2024

Not anymore.

This is impressive! how so?

@kettanaito
Copy link
Member Author

kettanaito commented Mar 9, 2024

Need help

I've pushed the latest state of the changes that supports both HTTP and HTTPS requests. I've moved things around a bit, mainly made the MockSocket to be agnostic of the type of the messages sent (just giving us the observability), introduced a MockHttpSocket and utilizes the mock socket and pipes its streams through the HTTP parser, and then integrates with the new _ClientRequestInterceptor that's now Agent-based (we aren't mocking the ClientRequest at all now).

I'm encountering errors when intercepting HTTP/HTTPS requests with a body. Error:

pnpm test:node test/modules/http/intercept/http.request.test.ts
 FAIL  modules/http/intercept/http.request.test.ts > intercepts a POST request
AbortError: The operation was aborted
 ❯ Readable.emit ../node:events:513:28

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Serialized Error: { code: 'ABORT_ERR' }
Caused by: Error: Premature close
 ❯ Readable.emit ../node:events:513:28

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Serialized Error: { code: 'ERR_STREAM_PREMATURE_CLOSE' }
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/1]⎯

This indicates either that the MockSocket connection has been closed (not the case, the connection is never established to begin with), or that the MockSocket is still in writable state when we try to .emit('data') when forwarding data events from the original socket (in the passthrough() method). The latter seems to be more likely since this.writable is still true. You can also observe the finish/end events on the mock socket to see that they are emitted after the passthrough() start.

This makes sense because right now we call socket.passthrough() in the interceptor on the request event, which fires as soon as the request headers are sent. There may be some time passing until the request body has been written completely. I think this indicates an issue that we mustn't start the passthrough until the socket is finished or is in the readable state.

I've tried things like

// the interceptor
socket.once('end', () => socket.passthrough())

But they haven't fixed the problem.

Isolated example

I've put up an isolated example of this in https://github.com/kettanaito/node-tls-mock. Strangely, all tests are passing there, even for requests with body. The socket implementation seems to be the same. I advise to inspect the differences first to see if I've missed something while porting it to this branch.

@kettanaito
Copy link
Member Author

Fixing the failing tests in #548.

@kettanaito
Copy link
Member Author

Opened a fix to forward TLS Socket properties (authorized, encrypted, etc.): #556

@kettanaito kettanaito marked this pull request as ready for review April 17, 2024 21:05
@kettanaito kettanaito changed the title feat: implement socket interceptor feat: implement net.Socket interceptor Apr 18, 2024
@mikicho
Copy link
Contributor

mikicho commented Apr 20, 2024

Maybe we want to merge this to fix IPv6?

@mikicho mikicho assigned mikicho and unassigned mikicho Apr 20, 2024
@mikicho
Copy link
Contributor

mikicho commented Apr 20, 2024

Nock (unintentionally?) covers an edge case of a request that ends in the connect event.
With current socket-based implementation the request never ends.

const { ClientRequestInterceptor } = require('@mswjs/interceptors/ClientRequest')
const http = require('http');

const interceptor = new ClientRequestInterceptor({
  name: 'my-interceptor',
})
interceptor.apply();
interceptor.on('request', ({ request }) => {
  request.respondWith(new Response())
});


const req = http.request('http://example.com')

req.on('socket', socket => {
  socket.once('connect', () => {
    console.log(1);  // never get called when innterceptor applied.
    req.end()
  })
  socket.once('secureConnect', console.log)
})

WDYT?

@mikicho
Copy link
Contributor

mikicho commented Apr 20, 2024

How do we need to handle Expect: 100-continue request header?

What Node.js does:

  1. flush headers (docs, source)
  2. Server side (should) returns 100 Continue response (docs)
  3. Node emits continue event. (docs)
Reproducible code
const { ClientRequestInterceptor } = require('@mswjs/interceptors/ClientRequest')
const http = require('http');
const interceptor = new ClientRequestInterceptor({
  name: 'my-interceptor',
})
interceptor.apply();
interceptor.on('request', ({ request }) => {
  request.respondWith(new Response())
});
const req = http.request({
  host: 'example.com',
  path: '/',
  headers: { Expect: '100-continue' },
})
req.on('response', res => {
  res.on('data', console.log)
  res.on('end', () => { })
})
req.on('continue', () => {
  console.log(1); // never get called when interceptor applied.
  req.end()
})

I wonder if we should emit a continue event before calling the interceptor.
I think I can open PR when we decide how to handle this.

// not connecting anymore, the response was mocked.
// If the socket has been destroyed, the error was mocked.
// Treat both as the result of the listener's call.
if (!socket.connecting || socket.destroyed) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the Socket world, the socket won't be writableEnded until .end() is called, which can take time (e.g. request body is streaming).

Instead, I rely on .connecting, which we immediately set to false now in .mockConnect(), or the .destroyed, which will also immediately be set to false it you mock an error.

@kettanaito
Copy link
Member Author

Nock (unintentionally?) covers an edge case of a request that ends in the connect event.

@mikicho, we cannot cover this. connect depends on req.end() so it cannot be called in the connect event listener when the interceptor is enabled. Are there any other details around that test case in Nock (is it a regression)?

How do we need to handle Expect: 100-continue request header?

This reads to be an http.ClientRequest level of behavior. Is it not happening already? That's odd.

We should support it but we should also be careful not to prematurely support it. As in, if it shouldn't belong to the Socket level and we ignore a potential bug that prevents Node.js from calling checkContinue() as it does in a non-interceptor scenario.

Let's add an integration test for this first and then see! Would love to dive deeper to learn what exactly Node.js does step-by-step, and at which step it stops when using the interceptor.

@kettanaito
Copy link
Member Author

@mikicho, hi! You've mentioned there are also some remaining issues left here. Could you please post them in the comment to this pull request? I'd love to see what's left so I can tackle some of them whenever I have a moment. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants