
End to end tracing support - Lambda #983

Merged

66 commits merged into open-telemetry:main on Oct 25, 2021

Conversation

bhautikpip
Contributor

bhautikpip and others added 30 commits February 4, 2021 16:21
bhautikpip and others added 2 commits August 13, 2021 11:43
Co-authored-by: garrettwegan <22334769+garrettwegan@users.noreply.github.com>
@bhautikpip
Contributor Author

@Aneurysm9 @MrAlias Please take a look whenever you get a chance. This PR is derived from #882

tp := sdktrace.NewTracerProvider(
sdktrace.WithResource(res),
)
lambda.Start(<some lambda handler>)
Member

Does the example need either otel.SetTracerProvider(tp) or passing tp explicitly into the Lambda handler? Otherwise the variable is unused.

Contributor

Good catch, made a suggestion
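For reference, a minimal runnable sketch of what the suggestion amounts to; myHandler is a hypothetical handler, and the commented-out alternative assumes the WithTracerProvider option documented later in this PR:

```go
package main

import (
	"context"

	"github.com/aws/aws-lambda-go/lambda"
	"go.opentelemetry.io/contrib/instrumentation/github.com/aws/aws-lambda-go/otellambda"
	"go.opentelemetry.io/otel"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// myHandler is a hypothetical, minimal Lambda handler.
func myHandler(ctx context.Context) (string, error) {
	return "ok", nil
}

func main() {
	// In the original example the provider is also given the detected
	// resource via sdktrace.WithResource(res).
	tp := sdktrace.NewTracerProvider()

	// Register the provider globally so the wrapper can pick it up
	// through otel.GetTracerProvider(); without this, tp is unused.
	otel.SetTracerProvider(tp)
	lambda.Start(otellambda.WrapHandlerFunction(myHandler))

	// Alternative: pass tp explicitly to the wrapper instead.
	// lambda.Start(otellambda.WrapHandlerFunction(myHandler, otellambda.WithTracerProvider(tp)))
}
```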

|`faas.name` | MyLambdaFunction
|`faas.version` | $LATEST

Of note, `faas.id` and `cloud.account.id` are not set by the Lambda resource detector because they are not available outside a Lambda invocation. For this reason, when using the AWS Lambda Instrumentation these attributes are set as additional span attributes.
Member

Could this be solved through lazy initialization?

Contributor

Not without asking the user to add more code. The faas.id and cloud.account.id attributes are only available inside of the handler, and our solution is meant to be strictly a wrapper of the handler such that we can't actually do anything inside the wrapper. So we could ask the user to lazy-initialize the tracer provider in their handler, but that's a much worse experience.

Member

Why would the user have to do that lazy initialization, rather than the framework? E.g., at this point it seems like you'd have access to those invocation-scoped variables. It's true that you'd have to do it separately for each possible reflective type of the handler function.

Contributor

I think the problem is that we would then couple the Lambda resource detector and the Lambda instrumentation, which ought to be two separate components. The Lambda instrumentation does indeed have access to those invocation-scoped attributes and is able to initialize the tracer provider. But the Lambda resource detector must operate only with what is available in the static environment, so if a user wants to initialize a custom tracer provider outside their handler, as seen in the example, it will work the same.

I agree that it's unfortunate the Lambda resource detector doesn't capture these attributes, but I believe this is a limitation of Lambda itself more than of any individual instrumentation, which is why the span attribute is called out in the spec and there's talk of not even marking faas.id as a resource attribute at all.

Member

Thanks for explaining. I was commenting based on "what would an operator find most useful" vs "what is a clean way to factor the code." Given the ambiguity / ongoing discussion around faas.id, I think this code makes a reasonable choice.
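To make the split concrete, a minimal sketch (not from the PR) of wiring the Lambda resource detector into a tracer provider outside the handler; the detector constructor and import path follow the example later in this PR, and faas.id / cloud.account.id are left to the instrumentation as span attributes:

```go
package main

import (
	"context"
	"log"

	lambdadetector "go.opentelemetry.io/contrib/detectors/aws/lambda"
	"go.opentelemetry.io/otel"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// The detector only reads the static Lambda environment, so it can run
	// here, outside any invocation; it yields faas.name, faas.version, etc.
	detector := lambdadetector.NewResourceDetector()
	res, err := detector.Detect(ctx)
	if err != nil {
		log.Fatalf("failed to detect lambda resources: %v", err)
	}

	// faas.id and cloud.account.id are not on this resource; the Lambda
	// instrumentation adds them to each span during the invocation instead.
	tp := sdktrace.NewTracerProvider(sdktrace.WithResource(res))
	otel.SetTracerProvider(tp)
}
```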

tp := sdktrace.NewTracerProvider(
sdktrace.WithResource(res),
)
lambda.Start(<some lambda handler>)
Contributor

Suggested change
lambda.Start(<some lambda handler>)
otel.SetTracerProvider(tp)
lambda.Start(myHandler)

Contributor

@codeboten left a comment

I realize this PR has been needing reviews for some time; it might be possible to get the code here merged more quickly if we break up the PR.

go.opentelemetry.io/contrib/detectors/aws/lambda v0.22.0
go.opentelemetry.io/contrib/instrumentation/github.com/aws/aws-lambda-go/otellambda v0.22.0
go.opentelemetry.io/contrib/propagators/aws v0.22.0
go.opentelemetry.io/otel v1.0.0-RC2
Contributor

should these be updated to 1.0.0 or 1.0.1?

github.com/aws/aws-lambda-go v1.24.0
github.com/stretchr/testify v1.7.0
go.opentelemetry.io/contrib v0.22.0
go.opentelemetry.io/otel v1.0.0-RC2
Contributor

should these be updated to 1.0.0 or 1.0.1?

Contributor Author

Updated!

Contributor

Looks like it still needs to be updated?

detectors/aws/lambda/detector.go (resolved)
require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/stretchr/testify v1.7.0
go.opentelemetry.io/otel v1.0.0-RC2
Contributor

Please update the versions

Contributor

@bhautikpip if you have a minute would you be able to update these versions to stable and rebase the PR so conflicts/CI checks are resolved?

Contributor Author

@codeboten @willarmiros updated the version and resolved the conflicts. It should pass the CI checks.

Contributor

Thanks!

@willarmiros
Contributor

I realize this PR has been needing reviews for some time; it might be possible to get the code here merged more quickly if we break up the PR.

I agree and wish it were made in a more modular fashion, but as it stands there are only two components in this PR - the Lambda instrumentation and the Lambda resource detector. We could separate them, but the detector is only 5 files, so I'm not sure a 25-file vs. 30-file PR would be more digestible and worth the effort :( Open to other suggestions though if I'm not realizing something.

@alolita
Member

alolita commented Oct 12, 2021

@dashpole can you please review and approve? We are blocked from merging this PR due to an inadequate number of approver reviews. 🙏

| Options | Input Type | Description | Default |
| --- | --- | --- | --- |
| `WithTracerProvider` | `trace.TracerProvider` | Provide a custom `TracerProvider` for creating spans. Consider using the [AWS Lambda Resource Detector][lambda-detector-url] with your tracer provider to improve tracing information. | `otel.GetTracerProvider()`
| `WithFlusher` | `otellambda.Flusher` | This instrumentation will call the `ForceFlush` method of its `Flusher` at the end of each invocation. Should you be using asynchronous logic (such as `sdktrace`'s `BatchSpanProcessor`), it is very important for spans to be `ForceFlush`'ed before [Lambda freezes](https://docs.aws.amazon.com/lambda/latest/dg/runtimes-context.html) to avoid data delays. | `Flusher` with noop `ForceFlush`
Contributor

Would anyone ever not want to flush at the end of an invocation?

Contributor

IIRC we didn't want to force Lambda customers using async patterns to incur additional execution time from this instrumentation, but I'm not 100% sure what the reasoning behind making the default no-op was. Perhaps @Aneurysm9 remembers better.
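For illustration, a minimal sketch of opting in to flushing via the options from the table above, assuming the wrapper accepts those options and that the SDK tracer provider's ForceFlush satisfies the otellambda.Flusher interface:

```go
package main

import (
	"context"

	"github.com/aws/aws-lambda-go/lambda"
	"go.opentelemetry.io/contrib/instrumentation/github.com/aws/aws-lambda-go/otellambda"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// handler is a hypothetical, minimal Lambda handler.
func handler(ctx context.Context) (string, error) {
	return "ok", nil
}

func main() {
	// With an asynchronous processor (e.g. the BatchSpanProcessor) spans sit
	// in a buffer, so flushing before Lambda freezes the environment matters.
	tp := sdktrace.NewTracerProvider()

	lambda.Start(otellambda.WrapHandlerFunction(handler,
		// Use this provider instead of the default otel.GetTracerProvider().
		otellambda.WithTracerProvider(tp),
		// Assumed: *sdktrace.TracerProvider satisfies otellambda.Flusher via
		// its ForceFlush method, so buffered spans are exported per invocation.
		otellambda.WithFlusher(tp),
	))
}
```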


// wrap lambda handler function
func main() {
lambda.Start(otellambda.WrapHandlerFunction(HandleRequest))
Contributor

Perhaps InstrumentHandler() is a better description of what this does? Or NewHandler() would match otelhttp's naming.

Contributor

Not sure about NewHandler since we're not instantiating a new function handler, we are indeed just wrapping it. I think InstrumentHandler() makes sense but would like to have @bhautikpip weigh in.

Contributor Author

Yep, InstrumentHandler() looks better to me.
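For clarity, a minimal sketch of what the agreed rename would look like at the call site, assuming only the name changes:

```go
package main

import (
	"context"

	"github.com/aws/aws-lambda-go/lambda"
	"go.opentelemetry.io/contrib/instrumentation/github.com/aws/aws-lambda-go/otellambda"
)

// HandleRequest is an illustrative no-op handler.
func HandleRequest(ctx context.Context) (string, error) { return "ok", nil }

func main() {
	// Before: lambda.Start(otellambda.WrapHandlerFunction(HandleRequest))
	lambda.Start(otellambda.InstrumentHandler(HandleRequest))
}
```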

// init aws config
cfg, err := awsConfig.LoadDefaultConfig(ctx)
if err != nil {
panic("configuration error, " + err.Error())
Contributor

Should we return this error instead of panicking?
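A minimal sketch of the alternative being suggested, assuming the config is loaded inside the handler so the error can be returned rather than panicking (the handler signature here is illustrative):

```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-lambda-go/lambda"
	awsConfig "github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// HandleRequest returns the configuration error so the Lambda runtime can
// record a failed invocation instead of the process crashing on panic.
func HandleRequest(ctx context.Context) (string, error) {
	cfg, err := awsConfig.LoadDefaultConfig(ctx)
	if err != nil {
		return "", fmt.Errorf("configuration error: %w", err)
	}
	client := s3.NewFromConfig(cfg)
	_ = client // use the client, e.g. client.ListBuckets(ctx, &s3.ListBucketsInput{})
	return "ok", nil
}

func main() {
	lambda.Start(HandleRequest)
}
```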

input := &s3.ListBucketsInput{}
result, err := s3Client.ListBuckets(ctx, input)
if err != nil {
fmt.Printf("Got an error retrieving buckets, %v", err)
Contributor

Should we return the error here?

fmt.Printf("Got an error retrieving buckets, %v", err)
}

fmt.Println("Buckets:")
Contributor

nit: Probably better to use log.Info, rather than fmt for logging. Here and elsewhere in this file.

}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://api.github.com/repos/open-telemetry/opentelemetry-go/releases/latest", nil)
if err != nil {
fmt.Printf("failed to create http request, %v\n", err)
Contributor

Generally, the function that handles the error should be the one to log it, rather than a function both logging and returning the error. Same applies below.

Contributor Author

Are you suggesting we remove the error handling? Sorry, I did not understand your comment.

Contributor

I'm saying you (probably?) don't need the print statement, since you are returning the error.

Contributor Author

Yeah, that's fair, but I think we can keep this since it might be useful for debugging and helps show where the error is happening.
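One possible middle ground, sketched here rather than taken from the PR: wrap the error with context via %w and return it, so the caller's single log line still shows where the failure happened:

```go
package handler

import (
	"context"
	"fmt"
	"net/http"
)

// latestRelease is a hypothetical helper: it annotates the error and returns
// it, leaving the single logging decision to the caller while keeping the
// "where did this happen" information the print statement was providing.
func latestRelease(ctx context.Context) (*http.Response, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		"https://api.github.com/repos/open-telemetry/opentelemetry-go/releases/latest", nil)
	if err != nil {
		return nil, fmt.Errorf("create http request: %w", err)
	}
	return http.DefaultClient.Do(req)
}
```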

detector := lambdadetector.NewResourceDetector()
res, err := detector.Detect(ctx)
if err != nil {
fmt.Printf("failed to detect lambda resources: %v\n", err)
Contributor

should we return (ideally with log.Fatalf) here?

github.com/aws/aws-lambda-go v1.24.0
github.com/stretchr/testify v1.7.0
go.opentelemetry.io/contrib v0.22.0
go.opentelemetry.io/otel v1.0.0-RC2
Contributor

Looks like it still needs to be updated?

ctx = i.configuration.Propagator.Extract(ctx, mc)

var span trace.Span
spanName := os.Getenv("AWS_LAMBDA_FUNCTION_NAME")
Contributor

should this be configurable through config options, rather than env vars? Or is this guaranteed to be set in lambda?

@bhautikpip
Copy link
Contributor Author

@dashpole addressed your comments! Feel free to take a look again.

@willarmiros
Contributor

@MrAlias @Aneurysm9 now that this is approved, any chance we can get it merged?

@Aneurysm9 merged commit d86dc5a into open-telemetry:main on Oct 25, 2021
@MrAlias mentioned this pull request on Oct 27, 2021