Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stackdriver Trace Integration #57

Open
grant opened this issue Jul 18, 2019 · 18 comments
Open

Stackdriver Trace Integration #57

grant opened this issue Jul 18, 2019 · 18 comments

Comments

@grant
Copy link
Contributor

grant commented Jul 18, 2019

Stackdriver Trace Integration

Add Stackdriver Trace support to the Functions Framework.

Library: https://github.com/googleapis/cloud-trace-nodejs
Google Issue: https://issuetracker.google.com/issues/124761993

@stew-r Can you add details as to what this would do? Would this be a tutorial or part of the actual framework?

Example Code

"@google-cloud/debug-agent": "^4.0.1",
"@google-cloud/profiler": "^2.0.2",
"@google-cloud/trace-agent": "^4.1.0",
import * as debugAgent from '@google-cloud/debug-agent';
import * as profiler from '@google-cloud/profiler';
import * as trace from '@google-cloud/trace-agent';

let toolsAreSetup = false;

/**
 * Starts Google Cloud tooling.
 * - StackDriver Trace
 * - StackDriver Profiler
 * - StackDriver Debugger
 */
export function startTooling() {
  // Ensure tools are only setup once per process.
  if (toolsAreSetup) {
    return null;
  } else {
    toolsAreSetup = true;
  }

  // Start Trace
  trace.start();

  // Start profiler
  profiler
    .start({
      serviceContext: {
        service: getServiceName(),
        version: 'v1',
      },
    })
    .catch(err => {
      console.error('Failed to start profiler', err);
    });

  // Start debugger
  const debug = debugAgent.start({
    serviceContext: {
      service: getServiceName(),
      version: 'v1',
      // version: functions.config().tsc.version
    },
  });

  // Return all tools
  return {
    trace,
    profiler,
    debug,
  };
}

/**
 * Gets the name of the function for logging.
 * @returns {string} The name of the function with a '_function' suffix.
 */
function getServiceName(): string {
  const serviceName = process.env.FUNCTION_TARGET || 'unknown_cloud';
  return `${serviceName}_function`;
}
@stew-r
Copy link
Contributor

stew-r commented Jul 18, 2019

Ideally, this would be part of the Functions Framework and a user of the framework gets trace support built in automatically. The user does not need to think about importing a client or doing any setup. They can disable this feature by passing some configuration.

@grant
Copy link
Contributor Author

grant commented Jul 19, 2019

I added a WIP PR for this feature.

  • Should these tools be added as an explicit dependency to the FF? (Is that an acceptable increase in GCF dependencies)?
  • Should the disable flag be something like:
    • --disable-stackdriver, FUNCTION_DISABLE_STACKDRIVER

Edit: It looks like there's a codelab for getting this setup:
https://codelabs.developers.google.com/codelabs/cloud-function-logs-traces/index.html?index=..%2F..index#0

@Elyx0
Copy link

Elyx0 commented Sep 28, 2019

Struggling aswell here from googleapis/nodejs-logging#591 any updates?

@grant
Copy link
Contributor Author

grant commented Jan 9, 2020

So far there has been no priority for this feature, no upvotes.
The issue is that require('@google-cloud/trace-agent').start(); must be the first module loaded.

I don't think it would be hard to implement that, but we need someone to help implement a good PR.

@Elyx0
Copy link

Elyx0 commented Mar 18, 2020

Can I add this myself @grant, does it need to be the first module in the index.src exporting the cloud functions or required inside every functions?

@dinigo
Copy link

dinigo commented Mar 31, 2020

As I understad it, this feature was available at some point, but was removed afterwards, and now it's being re-implemented in the FF?

What kind of help do you need in the PR? I may be able to savage some time from here and there. We (Divisadero) rely on Cloud Functions and want more visibility of what's happening within executions

@grant
Copy link
Contributor Author

grant commented Mar 31, 2020

I'm not an expert in Stackdriver Trace so I'm not sure of the requirements for this.

We need a proposed implementation for optional trace support that would not affect the performance of all functions that don't enable it. Then we need a PR that follows that proposal that we can get approved by the eng team.

I think GCF might already have trace support, I'm not sure.

@manwithsteelnerves
Copy link

manwithsteelnerves commented Apr 1, 2020

Any update on this?
Can someone suggest how can I assure require('@google-cloud/trace-agent').start(); will be loaded at start?

Currently by default the trace dashboard showing some details but not sure if I add a trace agent, I can debug my major problem. Currently I have delays of around more than 10-40 seconds for each function, which is really bothering. And this won't happen every time.

Can someone give more details on this agent?

  1. Request started - Internal log from system
  2. Start - Point where my express app code starts
  3. End - Point where the function ends

I have no clue why its taking so much time to start executing my code :|

RequestTrace

@manwithsteelnerves
Copy link

Can someone give a hint on how to consider the stack trace logging in the above image?

@grant
Copy link
Contributor Author

grant commented Apr 13, 2020

I haven't looked at this since.
I would assume the steps would be:

@sk-
Copy link
Contributor

sk- commented Jun 9, 2020

Hi @grant, setting the environment variable does not work.

However, I have partially succeeded on enabling opentelemetry for both background and http functions.

Note that we only get HTTP spans, and we don't get specific FASS ones (see https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/faas.md).

In order to achieve that, one needs to inject a middleware before finishing the creation of the express app. This is similar to what I did to extract the trace and execution id (see googleapis/nodejs-logging#591 (comment)).

It'd be great if you could add the executionId and traceId to the context, before calling the background function.

Also, if you could add a hook to inject a custom middleware without having to fiddle with the inner implementation of the framework and express, that would make our code much more reliable.

My idea is to write either a functions-framework plugin for opentelemetry or an express middleware adding the trace.

I think a plugin is not possible as we cannot control when the script is being included, specially if you are using Firebase Functions. On the other hand, the middleware requires a reliable way of injecting the plugin.

@grant
Copy link
Contributor Author

grant commented Jul 31, 2020

Looking at internal worker, here is the older Node versions (non FF) feature support. (I've verified this code does not contain internal workings.)

I think we simply need to include something like this in the FF at the top of the invoker.

let tracer;
if (process.env.GOOGLE_CLOUD_TRACE_ENABLED &&
    process.env.GOOGLE_CLOUD_TRACE_ENABLED.toLowerCase() === 'true') {
  // If Trace API is not enabled in user projet, below will still succeed but
  // traces will not be registered.
  // Override root span name as it uses request.path for requests to an
  // 'express' server.Also force traces flush after each function execution by
  // setting buffer size to 0.
  tracer = require('@google-cloud/trace-agent').start({
    // As the worker only services one request at a time, this enables an
    // optimization that reduces overhead introduced by context tracking across
    // asynchronous boundaries.
    clsMechanism: 'singular',
    serviceContext: {service: FUNCTION_NAME, version: FUNCTION_VERSION},
    rootSpanNameOverride: FUNCTION_NAME,
    bufferSize: 0,
    stackTraceLimit: -1,
    contextHeaderBehavior: 'require',
    samplingRate: 0,
  });
}

let debug;
if (process.env.GOOGLE_CLOUD_DEBUG_ENABLED &&
    process.env.GOOGLE_CLOUD_DEBUG_ENABLED.toLowerCase() === 'true') {
  // If Debugger API is not enabled in user projet, below will still succeed but
  // debugee will not be registered.
  debug = require('@google-cloud/debug-agent').start({
    serviceContext: {service: FUNCTION_NAME, version: FUNCTION_VERSION},
  });
}

if (process.env.GOOGLE_CLOUD_PROFILER_ENABLED &&
    process.env.GOOGLE_CLOUD_PROFILER_ENABLED.toLowerCase() === 'true') {
  // If Profiler API is not enabled in user projet, below will still succeed but
  // Profiler data will not be registered.
  require('@google-cloud/profiler').start({
    serviceContext: {service: FUNCTION_NAME, version: FUNCTION_VERSION},
  });
}

This piece of code must be injected before any other code is executed (simple variables/functions are OK).

There's also some other code required.

// When invoking a function...
 setTraceHeader(res);
 // execute users function


/**
 * Adjusts res object before it's ready to be passed back.
 * @param {!express.Response} res
 */
var setTraceHeader = function(res) {
  try {
    if (tracer != null && tracer.isRealSpan(tracer.getCurrentRootSpan())) {
      res.set(GOOGLE_CLOUD_SPAN_SAMPLED_HEADER_FIELD, '1');
    } else {
      res.set(GOOGLE_CLOUD_SPAN_SAMPLED_HEADER_FIELD, '0');
    }
  } catch (ex) {
    console.error('Error when setting trace header ' + ex.toString());
  }
};

Deps:

    "@google-cloud/trace-agent" : "3.3.1",
    "@google-cloud/debug-agent" : "3.0.0",
    "@google-cloud/profiler" : "0.2.2"

(Internal cs/worker/nodejs/worker.js)

@kintel
Copy link

kintel commented Aug 1, 2020

@grant I think your suggestion will work, but keep in mind that @google-cloud/trace-agent is will be deprecated in favour of OpenTelemetry sometime the next 2-6 months.

@grant
Copy link
Contributor Author

grant commented Sep 23, 2020

Well, it looks like we should add this then?
https://github.com/open-telemetry/opentelemetry-js-contrib/blob/master/examples/express/tracer.js

The instructions/API for using this is very confusing imo.

@sk-
Copy link
Contributor

sk- commented Sep 25, 2020

@grant Below is the config helper we are using in our codebase, with some comments to clarify what each config means. Note that it's not necessary to explicitly add the plugins, as those will be autodetected. Note also that this code is using OpenTelemetry 0.10, as there are some breaking changes in the last version and the google pacakges have not been updated yet.

const opentelemetry = require('@opentelemetry/api');
const {
  TraceExporter,
} = require('@google-cloud/opentelemetry-cloud-trace-exporter');
const { NodeTracerProvider } = require('@opentelemetry/node');
const { BatchSpanProcessor } = require('@opentelemetry/tracing');
const {
  CompositePropagator,
  HttpCorrelationContext,
  HttpTraceContext,
  ProbabilitySampler,
} = require('@opentelemetry/core');
const {
  CloudPropagator,
} = require('@google-cloud/opentelemetry-cloud-trace-propagator');

exports.initTracing = () => {
  const provider = new NodeTracerProvider({
    // This is important to keep the cost reasonable
    sampler: new ProbabilitySampler(0.01),
  });
  provider.register({
    // Use CloudPropagator + defaults
    propagator: new CompositePropagator({
      propagators: [
        new HttpCorrelationContext(),
        new HttpTraceContext(),
        // The first 2 propagators are the default ones, and the CloudPropagator is the one used to extract Google defined headers.
        new CloudPropagator(),
      ],
    }),
  });

  // This is the exporter used to send the traces to Google Cloud
  const exporter = new TraceExporter();

  provider.addSpanProcessor(new BatchSpanProcessor(exporter));
  opentelemetry.trace.setGlobalTracerProvider(provider);
};

@aabmass
Copy link

aabmass commented Sep 25, 2020

Note that it's not necessary to explicitly add the plugins, as those will be autodetected

@sk how are the plugins autodetected? I don't see anything about that in the OTel JS docs. The individual plugins would still have to be installed as well, but you're saying you don't need the plugins config?

    plugins: {
      express: {
        enabled: true,
        path: '@opentelemetry/plugin-express',
      },
      http: {
        enabled: true,
        path: '@opentelemetry/plugin-http',
      },
    }

@sk-
Copy link
Contributor

sk- commented Sep 25, 2020

@aabmass the Getting Started guide says:

OpenTelemetry can collect tracing data automatically using plugins. Vendors/Users can also create and use their own. Currently, OpenTelemetry supports automatic tracing for:

@sk-
Copy link
Contributor

sk- commented Dec 10, 2020

Any progress on this?

One issue we have been experiencing is that some spans are sent after the functions ends, which trigger an error, because resources are not available after the function ends.

So we would need a way for firesbase-functions/function-framework to allow us to register a completed hook and in that hook call forceFlush. See open-telemetry/opentelemetry-js#1296

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants