Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lambda: OTEL not working. Timeout when using adotInstrumentation #26414

Closed
willdady opened this issue Jul 19, 2023 · 6 comments
Closed

Lambda: OTEL not working. Timeout when using adotInstrumentation #26414

willdady opened this issue Jul 19, 2023 · 6 comments
Assignees
Labels
@aws-cdk/aws-lambda Related to AWS Lambda bug This issue is a bug. cause/not-a-bug Not a bug (might still be a documentation issue, might still need work) documentation This is a problem with documentation.

Comments

@willdady
Copy link

Describe the bug

Adding the adotInstrumentation property to a lambda function causes it to timeout. The function handler does not execute. I am following the example shown in the docs.

Expected Behavior

Function should execute handler and not timeout.

Current Behavior

Function times-out without executing the function handler. See Reproduction Steps below for the code in-use.

info	service/telemetry.go:121	Setting up own telemetry...
2023-07-19T00:03:35.170Z	info	service/telemetry.go:143	Serving Prometheus metrics	{"address": ":8888", "level": "basic"}
2023-07-19T00:03:35.211Z	info	components/components.go:30	In development component. May change in the future.	{"kind": "exporter", "data_type": "metrics", "name": "logging", "stability": "in development"}
2023-07-19T00:03:35.330Z	info	service/service.go:90	Starting otelcol...	{"Version": "v0.62.1", "NumCPU": 2}
2023-07-19T00:03:35.330Z	info	extensions/extensions.go:42	Starting extensions...
2023-07-19T00:03:35.330Z	info	pipelines/pipelines.go:74	Starting exporters...
2023-07-19T00:03:35.330Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "traces", "name": "awsxray"}
2023-07-19T00:03:35.330Z	info	pipelines/pipelines.go:82	Exporter started.	{"kind": "exporter", "data_type": "traces", "name": "awsxray"}
2023-07-19T00:03:35.449Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-07-19T00:03:35.450Z	info	pipelines/pipelines.go:82	Exporter started.	{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-07-19T00:03:35.450Z	info	pipelines/pipelines.go:86	Starting processors...
2023-07-19T00:03:35.569Z	info	pipelines/pipelines.go:98	Starting receivers...
2023-07-19T00:03:35.570Z	info	pipelines/pipelines.go:102	Receiver is starting...	{"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2023-07-19T00:03:35.570Z	info	otlpreceiver/otlp.go:71	Starting GRPC server	{"kind": "receiver", "name": "otlp", "pipeline": "traces", "endpoint": "0.0.0.0:4317"}
2023-07-19T00:03:35.570Z	info	otlpreceiver/otlp.go:89	Starting HTTP server	{"kind": "receiver", "name": "otlp", "pipeline": "traces", "endpoint": "0.0.0.0:4318"}
2023-07-19T00:03:35.570Z	info	pipelines/pipelines.go:106	Receiver started.	{"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2023-07-19T00:03:35.570Z	info	pipelines/pipelines.go:102	Receiver is starting...	{"kind": "receiver", "name": "otlp", "pipeline": "metrics"}
2023-07-19T00:03:35.570Z	info	pipelines/pipelines.go:106	Receiver started.	{"kind": "receiver", "name": "otlp", "pipeline": "metrics"}
2023-07-19T00:03:35.570Z	info	service/service.go:107	Everything is ready. Begin running and processing data.
{"level":"debug","msg":"Register ","response :":"{\n\t\"functionName\": \"CdkOtelLambdaTestStack-FunctionWithOtel04BB4E1A-yvDLQCigmrFR\",\n\t\"functionVersion\": \"$LATEST\",\n\t\"handler\": \"index.handler\"\n}"}
{"level":"debug","msg":"Waiting for event..."}
2023/07/19 00:03:35 Registered extension ID: "e0f13bc4-632a-41b7-a37a-7478a6847d13"
EXTENSION	Name: collector	State: Ready	Events: [INVOKE, SHUTDOWN]
START RequestId: 68e67878-56ea-490a-ba93-d5a4f077668a Version: $LATEST
2023-07-19T00:03:42.270Z	info	service/collector.go:196	Received shutdown request
2023-07-19T00:03:42.270Z	info	service/service.go:116	Starting shutdown...
2023-07-19T00:03:42.270Z	info	pipelines/pipelines.go:118	Stopping receivers...
2023-07-19T00:03:42.270Z	info	pipelines/pipelines.go:125	Stopping processors...
2023-07-19T00:03:42.270Z	info	pipelines/pipelines.go:132	Stopping exporters...
2023-07-19T00:03:42.270Z	info	extensions/extensions.go:56	Stopping extensions...
2023-07-19T00:03:42.270Z	info	service/service.go:130	Shutdown complete.
{"level":"debug","msg":"Received ","event :":"{\n\t\"eventType\": \"SHUTDOWN\",\n\t\"deadlineMs\": 1689725024209,\n\t\"requestId\": \"\",\n\t\"invokedFunctionArn\": \"\",\n\t\"tracing\": {\n\t\t\"type\": \"\",\n\t\t\"value\": \"\"\n\t}\n}"}
{"level":"debug","msg":"Received SHUTDOWN event"}
{"level":"debug","msg":"Exiting"}
2023-07-19T00:03:42.310Z 68e67878-56ea-490a-ba93-d5a4f077668a Task timed out after 10.10 seconds

END RequestId: 68e67878-56ea-490a-ba93-d5a4f077668a
REPORT RequestId: 68e67878-56ea-490a-ba93-d5a4f077668a	Duration: 10102.44 ms	Billed Duration: 10000 ms	Memory Size: 128 MB	Max Memory Used: 79 MB	
XRAY TraceId: 1-64b72852-11a867dc7998f13e0475f10f	SegmentId: 3cf46649712b6496	Sampled: true

Reproduction Steps

The following stack creates 2 function instances using the same handler code. The one with the adotInstrumentation fails with a timeout.

Stack

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as sqs from 'aws-cdk-lib/aws-sqs';
import * as lambdaNodeJS from 'aws-cdk-lib/aws-lambda-nodejs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as path from 'path';

export class CdkOtelLambdaTestStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const queue = new sqs.Queue(this, 'Queue');

    const sharedFunctionConfig: lambdaNodeJS.NodejsFunctionProps = {
      runtime: lambda.Runtime.NODEJS_18_X,
      entry: path.join(__dirname, 'lambda', `queue-put.ts`),
      environment: {
        QUEUE_URL: queue.queueUrl,
      },
      timeout: cdk.Duration.seconds(10),
    };

    // This function fails with a timeout!
    const functionWithOtel = new lambdaNodeJS.NodejsFunction(
      this,
      'FunctionWithOtel',
      {
        ...sharedFunctionConfig,
        adotInstrumentation: {
          layerVersion: lambda.AdotLayerVersion.fromJavaScriptSdkLayerVersion(
            lambda.AdotLambdaLayerJavaScriptSdkVersion.LATEST,
          ),
          execWrapper: lambda.AdotLambdaExecWrapper.REGULAR_HANDLER,
        },
      },
    );
    queue.grantSendMessages(functionWithOtel);

    // This executes as expected
    const functionWithoutOtel = new lambdaNodeJS.NodejsFunction(
      this,
      'FunctionWithoutOtel',
      {
        ...sharedFunctionConfig,
      },
    );
    queue.grantSendMessages(functionWithoutOtel);
  }
}

Function code

import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';

const { QUEUE_URL } = process.env;

const sqsClient = new SQSClient({});

export const handler = async (event: any) => {
  console.log('Got event: ', JSON.stringify(event));

  console.log('Sending message to queue');

  await sqsClient.send(
    new SendMessageCommand({
      QueueUrl: QUEUE_URL!,
      MessageBody: JSON.stringify({
        message: 'Hello world!',
      }),
    }),
  );

  console.log('Done');

  return JSON.stringify({ message: 'Done' });
};

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.80.0

Framework Version

No response

Node.js Version

18

OS

MacOS

Language

Typescript

Language Version

No response

Other information

No response

@willdady willdady added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 19, 2023
@github-actions github-actions bot added the @aws-cdk/aws-lambda Related to AWS Lambda label Jul 19, 2023
@indrora
Copy link
Contributor

indrora commented Jul 19, 2023

Does this happen when you create the lambda manually?

Create the lambda as you would normally in the CDK, add the appropriate layer, and enable tracing following the instructions here: https://aws-otel.github.io/docs/getting-started/lambda/lambda-js

@indrora indrora added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jul 19, 2023
@willdady
Copy link
Author

Yes. It still fails. I discovered the actual error causing the timeout is Cannot redefine property: handler. This also occurs when setting up OTEL manually when following the instructions described here: https://aws-otel.github.io/docs/getting-started/lambda/lambda-js.

I discovered this issue occurs because the Javascript emitted by esbuild uses Object.defineProperty when the source Typescript file is using es module exports e.g. export const handler = .... It will work fine if your source Typescript uses a cs module export like module.exports = { handler }; because esbuild will pass that through to the resulting transpiled file unchanged.

Object.defineProperty causes modules.exports to be immutable so OTEL is not able to redefine (monkey-patch) the handler function, hence the error.

I think it would be a good idea to update the CDK docs to mention that you MUST use an cs module export when using adotInstrumentation AND esbuild. It will not work otherwise.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jul 20, 2023
@indrora
Copy link
Contributor

indrora commented Jul 20, 2023

Interesting! Technically, that's outside the bounds of the AWS CDK, but you're right that it should be clearly documented. It sounds like this also needs to be logged with the AWS OTEL Lambda layer team ( https://github.com/aws-observability/aws-otel-lambda ) so that they can clearly inform users.

Turns out, they know this problem: aws-observability/aws-otel-lambda#99

This should be documented on both sides.

@indrora indrora self-assigned this Jul 20, 2023
@indrora indrora added documentation This is a problem with documentation. cause/not-a-bug Not a bug (might still be a documentation issue, might still need work) and removed needs-triage This issue or PR still needs to be triaged. labels Jul 20, 2023
@indrora
Copy link
Contributor

indrora commented Jul 20, 2023

I've opened a PR with the OTEL documentation: aws-otel/aws-otel.github.io#594

@pahud
Copy link
Contributor

pahud commented Nov 30, 2023

Closing this issue as not relevant to CDK. Please open a new one if any concerns.

@pahud pahud closed this as completed Nov 30, 2023
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-lambda Related to AWS Lambda bug This issue is a bug. cause/not-a-bug Not a bug (might still be a documentation issue, might still need work) documentation This is a problem with documentation.
Projects
None yet
Development

No branches or pull requests

3 participants