Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow calling next on finish #137

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kwasimensah
Copy link

Pull Request for #136 (comment). Not married to this, but showing a rough outline of what I'm asking for. Defaults to the current behavior of ending the request cycle if sending the file is successful. Allows post-processing the request.

Still figuring out the right way to add a test that:

  1. starts a request
  2. makes the server sleep for 5 seconds (approximate a long database query)
  3. the client times out/aborts the request
  4. The server still calls the next middleware

My use case involves puppeteer and making server.close wait until after 4). But it looks like you have pure unit tests here so it might be more setup than you want.

Note on-finished isn't good enough because the "finish" event isn't fired if "close" is fired first (the client aborts the request). And I'm assuming people would want to know if the file didn't completely send.

Defaults to the current behavior of ending the request cycle.
Alllows post-processing the request.
@dougwilson
Copy link
Contributor

Note on-finished isn't good enough because the "finish" event isn't fired if "close" is fired first (the client aborts the request). And I'm assuming people would want to know if the file didn't completely send.

Did you actually test it? It sounds like you are making a big assumption here, as on-finished is not just a listener for the Node.js "finish" event -- it wouldn't need to exist if it where. It also listens on the close event if it fires first due to the client closing the request.

Copy link
Contributor

@dougwilson dougwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are the only changes, it wouldn't be merged into this module under the current philosophy of our middlewares, where if you can already achieve the thing, we don't add knobs to just do what you can already do. This seems to just add two .once calls you can add right before invoking this middleware to achieve identical behavior.

@kwasimensah
Copy link
Author

kwasimensah commented Jul 20, 2020

Yes, I tried listening to "finish" and "close". Given:

async function myDatabaseMiddleWare(req, res, next) {
   await databaseQuery1();
   await databaseQuery2();
  next();
}

if server.close is called between databaseQuery1 and databaseQuery2, the "close" even will fire for req but the middleware will still try to call databaseQuery2. Express doesn't try to stop the rest of the middleware function.

This means server.close will return before its safe to actually tear down the database connection because it will still be used for databaseQuery2. I'm trying to catch this with the (super simplified) code below.

const requestsStillRunningMiddleware = 0;

// at the start of the middleware stack
app.use((req, res, next)=>{
  ++requestsStillRunningMiddleware;
  next();
})


// at the end of the middleware stack
app.use((req, res, next)=>{
  --requestsStillRunningMiddleware;
  if(requestsStillRunningMiddleware=== 0){
      // signal that it's safe to clean up resources used in middleware.
   }
})

But it requires all middleware to call next (which express-static does not do right now). I'm beginning to think it'll just be easier to move express-static out of this sandwich than try to get it call next().

Terminus and Lightship require an external service to ping your server

Not that I am aware of. Can you show where that is needed? I use terminus in my own apps and have nothing pinging the server, so I'm really confused on where you are getting that information from.

You're right, I misread this. Lightship runs its own service on your own computer and terminus seems to do it's own thing without creating a new server.

They do not keep track of connections between the server and backend services (postgres/redis/etc) except through explicit health checks. But this doesn't allow you to make sure no one is using those resources before you consider the server "closed".

That is correct, as they need you to provide that mechanism. The reason they track the HTTP connections is because Node.js does that. I use MySQL through the "mysql" module and it does the same thing: tracks checked out connections. This means that I can use terminus to join the two tracking mechanisms together to get a shutdown that does not terminate my database connections in the middle of a transaction.

Right, and I want to provide that mechanism by tracking unfinished middleware stacks. For postgres for example, connections come from a pool. So if i was just tracking open connections databaseQuery2 wouldn't have taken it's connection out yet so shutdown logic wouldn't see it and think you're down to 0 after databaseQuery1 and before databaseQuery2 starts. so you open yourself up to a race.

@kwasimensah
Copy link
Author

kwasimensah commented Jul 20, 2020

For tests, I wanted your advice on how you'd like to see an integration-type test i.e. the ability cancel/timeout a request and to make the _serve function more complicated to deal with a middleware the sleeps.

I also noticed the tests don't actually use express for setting up a middleware stack and I wanted to see if that's something you would want given that it would add a non-trivial amount of logic to this test.

@dougwilson
Copy link
Contributor

It sounds like you're trying to implement a graceful shutdown system. There are some examples on the Express website: http://expressjs.com/en/advanced/healthcheck-graceful-shutdown.html Those shutdown modules should be providing all the functionality you are looking for out of the box for managing shutting down your server.

@kwasimensah
Copy link
Author

If these are the only changes, it wouldn't be merged into this module under the current philosophy of our middlewares, where if you can already achieve the thing, we don't add knobs to just do what you can already do. This seems to just add two .once calls you can add right before invoking this middleware to achieve identical behavior.

Touche. I'll just add the callbacks on close and finish

@dougwilson
Copy link
Contributor

For tests, I wanted your advice on how you'd like to see an integration-type test i.e. the ability cancel/timeout a request and to make the _serve function more complicated to deal with a middleware the sleeps.

Yea, no problem at all! I would need you to provide over some kind of format what you want the test to do and I can help transform that test behavior into something that can fit into our test suite.

I also noticed the tests don't actually use express for setting up a middleware stack and I wanted to see if that's something you would want given that it would add a non-trivial amount of logic to this test.

That's right, at this module works with several other non-Express systems that all adhere to the req,res,next type of contract. Testing in Express will cause us to miss compatibility issues where we actually use Express-specific "isms" and thus only test against Node.js HTTP server directly (which everything, including Express, provides compatibility to).

@kwasimensah
Copy link
Author

It sounds like you're trying to implement a graceful shutdown system. There are some examples on the Express website: http://expressjs.com/en/advanced/healthcheck-graceful-shutdown.html Those shutdown modules should be providing all the functionality you are looking for out of the box for managing shutting down your server.

Terminus and Lightship require an external service to ping your server, which is overkill for an integration test. All of them including http-terminator keep track of connections between the client and the server. They do not keep track of connections between the server and backend services (postgres/redis/etc) except through explicit health checks. But this doesn't allow you to make sure no one is using those resources before you consider the server "closed".

I'll just add the callbacks on close and finish and that will unblock me for now.

@dougwilson
Copy link
Contributor

Terminus and Lightship require an external service to ping your server

Not that I am aware of. Can you show where that is needed? I use terminus in my own apps and have nothing pinging the server, so I'm really confused on where you are getting that information from.

They do not keep track of connections between the server and backend services (postgres/redis/etc) except through explicit health checks. But this doesn't allow you to make sure no one is using those resources before you consider the server "closed".

That is correct, as they need you to provide that mechanism. The reason they track the HTTP connections is because Node.js does that. I use MySQL through the "mysql" module and it does the same thing: tracks checked out connections. This means that I can use terminus to join the two tracking mechanisms together to get a shutdown that does not terminate my database connections in the middle of a transaction.

@kwasimensah
Copy link
Author

Terminus and Lightship require an external service to ping your server

Not that I am aware of. Can you show where that is needed? I use terminus in my own apps and have nothing pinging the server, so I'm really confused on where you are getting that information from.

They do not keep track of connections between the server and backend services (postgres/redis/etc) except through explicit health checks. But this doesn't allow you to make sure no one is using those resources before you consider the server "closed".

That is correct, as they need you to provide that mechanism. The reason they track the HTTP connections is because Node.js does that. I use MySQL through the "mysql" module and it does the same thing: tracks checked out connections. This means that I can use terminus to join the two tracking mechanisms together to get a shutdown that does not terminate my database connections in the middle of a transaction.

argh, I accidentally edited #137 (comment) instead of replying to this comment.

@dougwilson
Copy link
Contributor

So if i was just tracking open connections databaseQuery2 wouldn't have taken it's connection out yet so shutdown logic wouldn't see it and think you're down to 0 after databaseQuery1 and before databaseQuery2 starts. so you open yourself up to a race.

Ah, so that means you are not mapping a HTTP request/response to a single database transaction. That is certainly a strange design, not to encapsulate all your db calls into a single transaction per request, but that is the nature of these things -- there are so many ways to design these systems, it is difficult to make a one-size-fits-all.

I want to provide that mechanism by tracking unfinished middleware stacks.

The downside is that the current Express router design makes this impossible, as Express cannot tell an unfinished middleware from one that is doing activity -- this boils down to middleware just being a function call where the function may or may not re-enter the router, depending on the design of the application. It seems like you want to attempt to track this by having the middleware here always call next() -- but of course you will quickly run into middleware after middleware you attempt to use does not do this, making it a very long road to get this change to happen by approaching the problem in this way.

Of course, the good news is that, for example, you can implement the change in this PR completely outside of the middleware in question, which makes the behavior suddenly work, seemingly. The biggest downside to this PR is that it is assuming that when you call "next()" it goes to the middleware after this middleware, but that is not actually how the Express router works -- next() calls the next middleware after the last one it entered for that given request/response pair. As an example, say you had your example above

app.use(thisModuleUsingYourNewOption)
app.use((req, res, next) => { console.log('1'); next() }
app.use(async function myDatabaseMiddleWare(req, res, next) {
   console.log('2');
   await databaseQuery1();
   console.log('3');
   await databaseQuery2();
   console.log('4');
  next();
})
app.use((req, res, next) => { console.log('5'); next() }

If the file was completed sending the response during databaseQuery1(), you would end up with the following sequence logged: 1, 2, 5, 3, 4. I'm not sure if your code would expect the out-of-order execution, but that would be the current behavior. It would mean, for example, if 5 needed something from the database queries, it would not exist there by surprise, so you'd always need to check in each middleware which of your previous middleware ran, as your order is no longer going to be the same depending on where the client aborted.

There are various other edge cases as well you'll need to look out for, but if your application is not worried about such cases, there won't be any issues caused by it. It's really a case-by-case basis for how the various apps are designed.

@kwasimensah
Copy link
Author

So if i was just tracking open connections databaseQuery2 wouldn't have taken it's connection out yet so shutdown logic wouldn't see it and think you're down to 0 after databaseQuery1 and before databaseQuery2 starts. so you open yourself up to a race.

Ah, so that means you are not mapping a HTTP request/response to a single database transaction. That is certainly a strange design, not to encapsulate all your db calls into a single transaction per request, but that is the nature of these things -- there are so many ways to design these systems, it is difficult to make a one-size-fits-all.

I want to provide that mechanism by tracking unfinished middleware stacks.

The downside is that the current Express router design makes this impossible, as Express cannot tell an unfinished middleware from one that is doing activity -- this boils down to middleware just being a function call where the function may or may not re-enter the router, depending on the design of the application. It seems like you want to attempt to track this by having the middleware here always call next() -- but of course you will quickly run into middleware after middleware you attempt to use does not do this, making it a very long road to get this change to happen by approaching the problem in this way.

Of course, the good news is that, for example, you can implement the change in this PR completely outside of the middleware in question, which makes the behavior suddenly work, seemingly. The biggest downside to this PR is that it is assuming that when you call "next()" it goes to the middleware after this middleware, but that is not actually how the Express router works -- next() calls the next middleware after the last one it entered for that given request/response pair. As an example, say you had your example above

app.use(thisModuleUsingYourNewOption)
app.use((req, res, next) => { console.log('1'); next() }
app.use(async function myDatabaseMiddleWare(req, res, next) {
   console.log('2');
   await databaseQuery1();
   console.log('3');
   await databaseQuery2();
   console.log('4');
  next();
})
app.use((req, res, next) => { console.log('5'); next() }

If the file was completed sending the response during databaseQuery1(), you would end up with the following sequence logged: 1, 2, 5, 3, 4. I'm not sure if your code would expect the out-of-order execution, but that would be the current behavior. It would mean, for example, if 5 needed something from the database queries, it would not exist there by surprise, so you'd always need to check in each middleware which of your previous middleware ran, as your order is no longer going to be the same depending on where the client aborted.

There are various other edge cases as well you'll need to look out for, but if your application is not worried about such cases, there won't be any issues caused by it. It's really a case-by-case basis for how the various apps are designed.

That is completely surprising. I believed that console.log('1') wouldn't be invoked until thisModuleUsingYourNewOption called next() and seems to run counter to https://expressjs.com/en/guide/using-middleware.html

If the current middleware function does not end the request-response cycle, it must call next() to pass control to the next middleware function. Otherwise, the request will be left hanging.

I think the docs could use some clarity on whats considered " the next middleware function". The current wording makes it seem like request-response cycle stops on thisModuleUsingYourNewOption until it calls next.

@dougwilson
Copy link
Contributor

dougwilson commented Jul 20, 2020

Hi @kwasimensah apologies, I did mess up the example I provided -- that's what I get for trying to write this all up while I'm supposed to be at work :P Perhaps would you like to shelve this conversation until the weekend, so I can have some time to spend on it? I feel really bad now if that mistake lead you down a rabbit hole and I want to try and provide you accurate information, but with all the week-based distractions at hand I think it will be hard to serve you well on this complex topic.

@kwasimensah
Copy link
Author

Hi @kwasimensah apologies, I did mess up the example I provided -- that's what I get for trying to write this all up while I'm supposed to be at work :P Perhaps would you like to shelve this conversation until the weekend, so I can have some time to spend on it? I feel really bad now if that mistake lead you down a rabbit hole and I want to try and provide you accurate information, but with all the week-based distractions at hand I think it will be hard to serve you well on this complex topic.

Works for me to wait till the weekend.

@dougwilson
Copy link
Contributor

Thank you, I really appreciate it! Please ping me once into the weekend if you don't see a response to this from me; many people stop by these modules and demand immediate attention, so can be a distraction, causing me to either loose the needed time or just forget to follow up on a specific issue, so don't feed bad to ping me if it's into Saturday and you haven't heard anything more.

But also please feel free to continue to work on this pull request over the week. For example, updating it by making it do something you cannot already do without modifications to the module, adding documentation around your new feature to the README and any appropriate tests. And if you don't think you're going to work more on this pull request, you're also welcome to close it -- it isn't mergable as it stands, at least, so would need continued work to get into a landable state regardless. Could be a good thing to work on async to the discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants