Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Express-OTEL] Transaction Name Quality #6301

Closed
3 tasks done
smeubank opened this issue Nov 25, 2022 · 14 comments · Fixed by #6334
Closed
3 tasks done

[Express-OTEL] Transaction Name Quality #6301

smeubank opened this issue Nov 25, 2022 · 14 comments · Fixed by #6334
Assignees

Comments

@smeubank
Copy link
Member

Is there an existing issue for this?

How do you use Sentry?

Sentry Saas (sentry.io)

Which package are you using?

@sentry/node

SDK Version

7.21.1

Framework Version

node v16.16.0, express 4.16.4

Link to Sentry event

https://sentry.io/organizations/sentry-test/performance/summary/?project=4504209954111488&query=&referrer=performance-transaction-summary&statsPeriod=24h&transaction=%7B%22ismaster%22%3A%22%3F%22%7D&unselectedSeries=p100%28%29

Steps to Reproduce

  1. @danielkhan has a demo express app running locally interacting with MongoDB

Expected Result

Quality transaction names

Some of these examples are from API routes and others from DB interactions

Actual Result

Here we see the different transaction type examples, for HTTP the method is not set, unparameterized transaction names, or names which don't fit the expectation

image

checking the Sentry-SDKs org as well we see a mix of <>, custom, route. Would need to look into more individual events there to check the transaction source as the cause, and unclear how much of the data is a mix of Sentry SDK or OTEL SDK -> Sentry

Also not sure that we have data for the express OTEL demo service yet, screenshot is the NextJs services

https://sentry.io/organizations/sentry-sdks/performance/?project=4504094581063680&project=4504078691663872&project=4504078555545600&project=4504118601187328&project=4504118275604480&project=4504118274883584&statsPeriod=7d

image

@AbhiPrasad
Copy link
Member

@smeubank for the opentelemetry-demo, I elected to use opentelemetry to instrument the NextJS server-side instead of sentry just to see what the data looks like, we can switch to Sentry later on.

re: Here we see the different transaction type examples, for HTTP the method is not set, unparameterized transaction names, or names which don't fit the expectation

This is because we only do minimal transformation of transaction name/span description/operation in the SDK. When we introduce the Relay side transforms, this will improve over time.

@smeubank
Copy link
Member Author

further context:

the source of a URL has an impact on the product will group and display transactions in the Performance product

Due to concerns around high cardnality during the implementation and rollout of Dynamic Sampling improvements to the Sentry backend, it was decided that if txns do not have properly parameterized txn names then they cannot really provide a lot of value since there will be some variance in agregating the performance data associated with them

Outcome:

  • source == url == <> transactions are placed into this "bucket" and diplayed in product as such,
    • these transactions may actually not functionally belong together - degraded product experience
  • source == route == normal txn name is displayed and product works as expected
  • source == custom == normal txn name is displayed and product works as expected
    • assume user is setting name in SDK in a way best suited to their desired experienced

@smeubank
Copy link
Member Author

Parent issue, where work was done to improve this across JS frameworks:

#5342

Scroll to ->

image

@AbhiPrasad
Copy link
Member

In OpenTelemetry, the semantic convention attributes have clear outlines about their potential cardinality. For example with the HTTP semantic conventions (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md), http.route is low cardinality (source === 'route') while http.target is high cardinality (source === 'url')

@danielkhan
Copy link

I'd like us to use Jaeger as a reference here.
The transaction name should be something like the HTTP Method and the HTTP route.
If it's a database request it should be named like that and we should show the icon for the respective database.
This is a bit of a random list of improvements but we should prioritize them as it's a huge part of the value prop around OTel.

Capto_Capture 2022-11-25_11-42-50_AM

@danielkhan
Copy link

@AbhiPrasad would it be possible to handle these very obvious cases in the JS SDK until Relay catches up?

@AbhiPrasad
Copy link
Member

We elected to just put basic transformations in favor of an MVP - I would just wait till the Relay work is done for this since that is the more holistic solution, otherwise we will have to keep duplicating this for Ruby, Java (soon to release as well).

@danielkhan
Copy link

@AbhiPrasad hopefully Relay is done before all the other SDKs are.
In the meantime, I'd think of this of a simple mapping table that helps us test this logic locally on the SDK before it's added to the relay.
I am all in for a holistic, central solution but the cost/reward ratio of the temporary fix seems pretty tempting.

Or let's put it like that: The way it is now, I don't want to show it to anyone.

@jan-auer
Copy link
Member

@AbhiPrasad where are these values from the "actual result" screenshot in the issue coming from? These entries look like a mixture of payloads, low-level span ops, and incomplete information (in the case of "HTTP GET"). If we ignore the <<unparameterized>> case for a moment, even server-side clustering will not be able to restore better transaction names for those cases. How come that something like dns.lookup ends up in a transaction name?

Now, to <<unparameterized>>: It depends what is in there. Assuming that these are proper URLs, clustering will help here. I'm just surprised that we're not getting proper route names in Express. @danielkhan mentioned that opentelemetry instrumentation is able to get the route properly - I assume referring to the "router - /" span in the Jaeger screenshot.

The conversation about semantic conventions absolutely makes sense, but to me it looks like the root cause of the original issue is a plain SDK bug.

@AbhiPrasad
Copy link
Member

How come that something like dns.lookup ends up in a transaction name

This is what the OpenTelemetry span name is, and our baseline heuristic is to turn the otel span name -> sentry transaction name by default (OpenTelemetry span names are low cardinality values).

We can adjust this heuristic once we start doing semantic conventions - we can generate a more useful transaction name based on a combination of the span name + span attributes.

Now, to <>: It depends what is in there

This was coming from transactions from vanilla javascript, which was generated by a Sentry SDK, not a OpenTelemetry one - the OpenTelemetry transactions should be always low cardinality, we've taken many steps to make sure that is the case.

@danielkhan
Copy link

This was coming from transactions from vanilla javascript, which was generated by a Sentry SDK, not a OpenTelemetry one - the OpenTelemetry transactions should be always low cardinality, we've taken many steps to make sure that is the case.

Just to make sure, only voting-browser is instrumented with our vanilla SDK, everything else is OTel.

Capto_Capture 2022-11-24_02-13-13_PM

@AbhiPrasad
Copy link
Member

AbhiPrasad commented Nov 25, 2022

Just to make sure, only voting-browser is instrumented with our vanilla SDK, everything else is OTel.

My mistake then. This is still due to our transformation problem since we set transaction source based on attribute existence in otel: https://github.com/getsentry/sentry-javascript/blob/master/packages/opentelemetry-node/src/utils/parse-otel-span-description.ts#L101. This helps decide transaction name quality.

Looking at https://sentry.io/organizations/sentry-test/performance/voting-service-green:6f73334593524e8ba3a9a7934d0ba6ac/?transaction=GET+%2F, an example event from voting-service-green, it seems to be have an http.route of an empty string, so we fall back to the high cardinality http.target.

image

This is something we'll have to fix with the express auto-instrumentation for OpenTelemetry node - they are not setting the parameterized route correctly.

@danielkhan
Copy link

Got it

This is something we'll have to fix with the express auto-instrumentation for OpenTelemetry node - they are not setting the parameterized route correctly.

Yes, but let's not rely on that solution the meantime. There may be a good reason why it is like that and Jaeger can make sense of the data as is.

I did some digging and this might be a solution in the meantime:

If span.kind === server: ${http.method} ${http.target}
If span.kind === client: ${http.method}

The discussion we are having is a good example why it makes sense to temporarily implement this directly on the SDK.
That's the only way to validate this.

@AbhiPrasad
Copy link
Member

#6345 should fix the incorrect status code (we were expecting the wrong type here).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

4 participants