
Requests are held when a certain call rate per second is exceeded #707

Open
jRichardeau opened this issue May 17, 2022 · 9 comments
Labels
api: logging Issues related to the googleapis/nodejs-logging-winston API. logging-resilience Issues related to logging resilience priority: p3 Desirable enhancement or fix. May not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@jRichardeau

Hello,

I'm using @google-cloud/logging-winston and the express middleware for logging purposes.

When using it in production in a service that receives more than 100 requests/second, I see a huge increase in the service's memory usage and latency.

Moreover, when the transport is configured with the redirectToStdout option, the problem does not occur.

I think I've identified what is holding the requests, but I don't know why; it happens in the readable-stream module.
If we add a console.log in /node_modules/readable-stream/lib/_stream_writable.js, in the writeOrBuffer function, like this: console.log('bufferedRequestCount', state.bufferedRequestCount); then the value of state.bufferedRequestCount grows continually, holding all the requests in memory.

I think the pain threshold is about 50 requests/s; below that it's OK.
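
As an aside, the same buffering behaviour can be illustrated with plain Node.js streams (this is only a sketch of the symptom, separate from the reproduction example further below, and does not use logging-winston at all): when writes arrive faster than the underlying sink can drain, the stream's internal buffer just keeps growing.

const { Writable } = require('stream');

// A sink that takes 100 ms per chunk, i.e. it can absorb only ~10 writes/s.
const slowSink = new Writable({
  write(chunk, encoding, callback) {
    setTimeout(callback, 100);
  },
});

// Push ~50 writes/s into it and watch the buffered bytes grow without bound.
setInterval(() => slowSink.write('log entry\n'), 20);
setInterval(() => console.log('buffered bytes:', slowSink.writableLength), 1000);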

You can reproduce the issue with the example below.

Environment details

  • OS: Mac OS 12.2.1
  • Node.js version: 12.22.7
  • npm version: 6.14.15
  • @google-cloud/logging-winston version: 4.2.3
  • @google-cloud/trace-agent: 5.1.6
  • express: 4.17.3
  • winston: 3.6.0

Steps to reproduce

  1. Create an express application.
  2. Use the @google-cloud/logging-winston middleware for Express.
  3. Call an endpoint with a request rate > 50/s.

Example code of express application

const express = require('express');
const winston = require('winston');
const lw = require('@google-cloud/logging-winston');
const port = 3000;

const logger = winston.createLogger();

async function init() {
  // makeMiddleware attaches the Cloud Logging transport to the logger and
  // returns an Express middleware that adds a per-request `req.log` logger.
  const mw = await lw.express.makeMiddleware(logger);

  const app = express();
  app.use(mw);

  app.get('/hello', (req, res) => {
    req.log.info('Hello');
    res.send('Hello World!');
  });

  app.listen(port, () => {
    console.log(`App listening on port ${port}`);
  });
}

init();

I'm using Artillery to call the endpoint at a given request rate; this is the YAML configuration file:

config:
  target: "http://localhost:3000"
  phases:
    - duration: 60
      arrivalRate: 50
      name: Warm up

scenarios:
  - name: "Calling express"
    flow:
      - get:
          url: "/hello"
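
Assuming the config above is saved as load-test.yml (the filename is just an example), the test can then be run with:

npx artillery run load-test.yml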

When the calls stop, the "buffer" is emptied, but in production the calls never stop 😄 so memory usage and latency keep increasing.

Thank you for your help

@jRichardeau jRichardeau added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 17, 2022
@product-auto-label product-auto-label bot added the api: logging Issues related to the googleapis/nodejs-logging-winston API. label May 17, 2022
@losalex losalex assigned losalex and unassigned daniel-sanche Jun 5, 2022
@jRichardeau
Author

Hi, any news on this?
Thanks

@losalex losalex added the logging-resilience Issues related to logging resilience label Aug 29, 2022
@losalex
Contributor

losalex commented Nov 10, 2022

Thanks @jRichardeau for opening this issue, and really sorry for the late response.
I wonder if this issue is still relevant - can you please let me know if you are still facing issues after upgrading to the latest @google-cloud/logging-winston?

@jRichardeau
Author

Thank you for your reply @losalex, but after testing again I'm still facing the same issues, even after upgrading @google-cloud/logging-winston and winston to the latest versions.

NB: You may have to change the arrivalRate to 100 in the example above to reproduce.

@losalex losalex added priority: p3 Desirable enhancement or fix. May not be included in next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Nov 11, 2022
@losalex
Contributor

losalex commented Nov 11, 2022

Thanks @jRichardeau for the response - correct me if I'm wrong, but it seems there was a slight improvement in overall throughput, though still not enough. One more question - it seems you performed the testing on your local Mac, correct? If so, I wonder how powerful it is and whether you have had a chance to test it in other environments such as Google Cloud.

@jRichardeau
Author

There may be a slight improvement, but do you really think it's a power issue?
The problem is not that the buffer fills up faster than it empties; it never empties at all when there are too many requests, it just grows.

But OK, I will try it on GCP.

Here is my local config:

[image: local machine configuration]

@losalex
Contributor

losalex commented Nov 14, 2022

@jRichardeau, thanks for the answer - yes, I wanted to eliminate any factors related to hardware or connectivity, since those usually have a major impact on performance. Also, correct me if I'm wrong, but the readable-stream problem you are raising is not related to the upload of log entries, but rather to the middleware interaction, correct?

@jRichardeau
Author

I can't tell you much more about where the issue comes from; I've already spent some time trying to figure out whether the problem is in my code, and it seems it isn't.

The only thing I can say is that "when the transport is configured with the redirectToStdout option, the problem does not occur", so it seems to come from uploading the log entries, but I can't be sure about it.

Does it happen only when using the middleware? Probably, but it would have to be tested.

Did you manage to reproduce the problem with the example above?

@jRichardeau
Author

Hello,

I finally tested it on GCP and encountered the same performance issue.
So we forced the redirectToStdout option on all our services, together with the new useMessageField: false option.
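
For reference, a rough sketch of that configuration (only the two option names come from this thread; the rest of the wiring is an assumption):

const winston = require('winston');
const { LoggingWinston } = require('@google-cloud/logging-winston');

// redirectToStdout writes structured entries to stdout (to be forwarded by
// the platform's logging agent) instead of calling the Logging API directly;
// useMessageField: false keeps the payload at the top level rather than
// nesting it under a "message" field.
const transport = new LoggingWinston({
  redirectToStdout: true,
  useMessageField: false,
});

const logger = winston.createLogger({ transports: [transport] });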

@losalex losalex removed their assignment Jul 3, 2023
@mochsner

mochsner commented Oct 30, 2023

We seem to be running into a similar issue on our platform. On GCP we have found that when we use GCP as a sink with extremely large objects (and/or circular objects), the process eventually crashes with a heap overflow. This occurred more frequently on GCP (App Engine) than locally, but we were eventually able to reproduce it locally by pointing to a GCP App Engine logging destination manually.

const { LoggingWinston } = require('@google-cloud/logging-winston');

const transport = new LoggingWinston({
  labels: {
    name: "local",
  },
  projectId: process.env.PROJECT_ID,
  keyFilename: process.env.GCLOUD_APPLICATION_DEFAULT_CREDENTIALS,
  // Surface transport errors instead of swallowing them.
  defaultCallback: (err) => {
    if (err) {
      console.error(`GcpLogger Error occurred: ${err}`, err);
    }
  },
});

We've gone through and reduced our overall object sizes for logs, but we are not comfortable enabling Winston sending to GCP directly at the moment, so we need to use the console as a sink, which has an impact on performance, since my understanding is that console.log is fully synchronous (even on App Engine).
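
For illustration, a minimal version of that fallback (the JSON formatter is an assumption; the point is only that the sink is winston's built-in Console transport):

const winston = require('winston');

// Interim setup: log JSON lines to stdout and let the platform's logging
// agent forward them, instead of using the LoggingWinston transport.
const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [new winston.transports.Console()],
});

logger.info('hello from the console sink');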

Notably:

  1. We did not have any issues or memory leaks when JUST using Winston sending to Serilog's SEQ data sink.
  2. We did have cloudTrace.start() running for a while (import * as cloudTrace from '@google-cloud/trace-agent';) on our servers, but I believe we were running into the issue even after this was disabled.
