Clarification on duplicated instrument names under different meters #2064

joaopgrassi · 2021-10-27T11:55:56Z

TL;DR

What should be the SDK behavior when two instruments with the same name are registered under different meters and no views are registered/involved?

Meter                                      Instrument                        Unit
com.postgress                              sql.query.errors                     1 
com.mysql                                  sql.query.errors                     1

The behavior in .NET is that the second instrument is dropped and never reaches the exporter. In Go, it seems we get errors (as per this Slack thread)

Long version

Hey all. I wanted to get my feet wet with OTel metrics, so I started by reading the metrics API and SDK specification. Then, I started playing around with the .NET SDK and I ran into some confusion around the instrument names. I have a long thread with @cijothomas and we were both a bit confused so I'm opening this so we can get some clarification.

Lets say we have this use-case:
We develop an app that records request_count (just as an example, could be anything). The app uses a third-party package that has its own meter and an instrument with the same name to record its own request_count.

Meter                                      Instrument                        Unit
MyCompany.MyProduct.MyApp                  request_count                     1 
MyLibrary.MyService                        request_count                     1

We now have two instruments with the same name, under different meters. This should be fine, as the API spec is clear on this:

Different Meters MUST be treated as separate namespaces. The names of the Instruments under one Meter SHOULD NOT interfere with Instruments under another Meter.

This is also listed as a capability in the Compliance matrix of Implementations with the Specification:

It is possible to register two instruments with same name under different Meters.

So I went ahead and created a sample app to "simulate" this use case. You can find the code here.
It's a simple console app that imports a class lib to simulate the third-party package. The console app records on its own instrument first, then invokes a method on the library which records on its own instrument.
Both meters are registered in the Console app as:

 using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .AddMeter(AppMeter.Name) // Meter for the app we are developing
    .AddMeter("MyLibrary*")  // Meter for the third-party library
    .AddConsoleExporter()
    .Build();

When I run the sample app, I expected this output in the console:

Export request_count, Meter: MyCompany.MyProduct.MyApp/1.0
(2021-10-27T09:38:27.5162313Z, 2021-10-27T09:38:27.5349794Z] LongSum
Value: 1
Export request_count, Meter: MyLibrary.MyService/1.0
(2021-10-27T09:38:27.5162313Z, 2021-10-27T09:38:27.5349794Z] LongSum
Value: 1

But instead, I only saw this:

Export request_count, Meter: MyCompany.MyProduct.MyApp/1.0
(2021-10-27T09:38:27.5162313Z, 2021-10-27T09:38:27.5349794Z] LongSum
Value: 1

While debugging, I saw that the second instrument is dropped. There is a map of instrument names, and since the names are the same, the second is simply dropped. It never reaches the exporter.
The same thing happens if I use the OTLP exporter to export to a collector:

otel-local-collector_1  | 2021-10-13T16:19:48.423Z      INFO    loggingexporter/logging_exporter.go:56  MetricsExporter {"#metrics": 1}
otel-local-collector_1  | 2021-10-13T16:19:48.423Z      DEBUG   loggingexporter/logging_exporter.go:66  ResourceMetrics #0
otel-local-collector_1  | Resource labels:
otel-local-collector_1  |      -> service.name: STRING(unknown_service:getting-started)
otel-local-collector_1  | InstrumentationLibraryMetrics #0
otel-local-collector_1  | InstrumentationLibrary MyCompany.MyProduct.MyApp 1.0
otel-local-collector_1  | Metric #0
otel-local-collector_1  | Descriptor:
otel-local-collector_1  |      -> Name: request_count
otel-local-collector_1  |      -> Description:
otel-local-collector_1  |      -> Unit:
otel-local-collector_1  |      -> DataType: Sum
otel-local-collector_1  |      -> IsMonotonic: true
otel-local-collector_1  |      -> AggregationTemporality: AGGREGATION_TEMPORALITY_CUMULATIVE
otel-local-collector_1  | NumberDataPoints #0
otel-local-collector_1  | StartTimestamp: 2021-10-13 16:19:48.5310304 +0000 UTC
otel-local-collector_1  | Timestamp: 2021-10-13 16:19:48.5433528 +0000 UTC
otel-local-collector_1  | Value: 1

I started a discussion in the .NET slack channel and it was pointed out, that the API spec is clear, but that this does not mean the SDK is responsible for resolving/flattening the meter names (e.g., appending the meter name to the instrument to produce a "unique" metric stream name). I completely agree with that.
It was also mentioned, that Views should be used to "solve" this problem. The point was that I should use a View to "rename" the conflicting metric by providing a view with a custom name. This logic is defined in the SDK spec here.
The relevant portion is:

If there is an error [...] or a conflict (e.g. the View requires to export the metrics using a certain name, but the name is already used by another View), provide a way to let the user know (e.g. expose self-diagnostics logs).

I read this multiple times and still can't wrap my head around it. What is confusing/not clear to me is

What if I don't use/want views?
In the statement above, the part "View requires to export the metrics using a certain name, but the name is already used by another View" is a bit confusing. To me, that refers to conflicting view names and not instrument names. For example: I created a view with name view_a and configured it to export instruments with request_*. Then, I create another view, also called view_a, that also collects request_*. This leads to a a conflict, because although the instruments match for both views, my view conflicts with an existing one. In fact, the view names would clash, even if the instruments' names were different. To me, it is not clear whether this section refers to the view names (view_a), or the instrument names (request_*).

Another thing that came up in the discussion was that there's limitations in the exporters. For example Prometheus does not support duplicated metric names. This makes sense to me, especially in this case. I expected the SDK to not drop the instruments due to duplicated names. However, this makes it the exporters' job to do whatever they need to to make it work. In this case, prepending the meter names to the instrument name, to make it "unique". But if the metric data never reaches the exporters in the first place, this is not possible.
This is particularly concerning to me, because as an application owner, I would expect:

NOT having to know the names of the instruments in the library (what if the code is not open source?)
NOT having to configure a View in order to "rename" a conflicting instrument from the third-party library with a instrument from my app (seems like more of a workaround tbh)
all metrics to show up - I imagine this would be hard to debug and frustrating for users. (E.g., I just added OTel, configured the MeterProvider correctly with my meters, but my metrics don't show up. 😟).

So, in essence, what should be the SDK behavior when we have two instruments with the same name, registered under different meters, and no views involved? And if there's views involved, should the view name be used to check conflicts or the instrument names?

The text was updated successfully, but these errors were encountered:

reyang · 2021-10-27T16:55:27Z

Lets say we have this use-case: We develop an app that records request_count (just as an example, could be anything). The app uses a third-party NuGet package that has its own meter and an instrument with the same name to record its own request_count.
Meter                                      Instrument                        Unit
MyCompany.MyProduct.MyApp                  request_count                     1 
MyLibrary.MyService                        request_count                     1
We now have two instruments with the same name, under different meters. This should be fine, as the API spec is clear on this:

Different Meters MUST be treated as separate namespaces. The names of the Instruments under one Meter SHOULD NOT interfere with Instruments under another Meter.

From API perspective this is totally fine.
From SDK perspective, this is confusing and not fine unless the conflict is resolved by the application developer or by the library developer.
I can see that calling the Instrument request.count is not well aligned with the spirit of the semantic conventions. I think they should be called myapp.http.client.request.count and myservice.http.request.count.

So, in essence, what should be the SDK behavior when we have two instruments with the same name, registered under different meters, and no views involved? And if there's views involved, should the view name be used to check conflicts or the instrument names?

It's currently documented here https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#view.

joaopgrassi · 2021-10-28T08:30:11Z

conflict is resolved by the application developer or by the library developer.

But this is exactly one of my concerns from above. I'm not sure this should be a burden on the application developer/owner. If it's a responsibility on the library author, then I think we should make it clear that they need to append the Meter name to the instrument name to make it "unique" (or any other technique). I didn't find any guidance like this in the spec.

The only thing I found was this in the Supplementary Guidelines:

It is highly recommended that you align with the OpenTelemetry Semantic Conventions, rather than inventing your own semantics.

Which doesn't really help much I think. In reality I think it's not far-fetched to assume users will have their own semantics that are relevant to their business/use-case which are not defined in the semantic conventions.

But let's assume they all follow the conventions: We already have a problem here don't we? If they do follow the conventions they will end up with duplicate instrument names, unless something uses the meter name in the instrument at some point.

I can see that calling the Instrument request.count is not well aligned with the spirit of the semantic conventions. I think they should be called myapp.http.client.request.count and myservice.http.request.count.

request.count was just an example, but yes, that makes sense to me. But this is the issue I'm trying to get clarification here. Where is myapp or myservice prefixes added to the instrument names? Is it by the SDK? is it by the Lib author? Is it by the exporter?

It's currently documented here main/specification/metrics/sdk.md#view.

I'm confused. This is the instructions for views and I asked what should happen when no views are involved. I'm very fresh into metrics, so if it's something that I'm lacking I assume everyone that is also new to metrics using OTel will also face.

In that link, there's a section:

[...] Here are some examples when a View might be needed:

And there's nothing saying that a view should be used to "solve" conflicting instrument names. It only has this:

If the MeterProvider has no View registered, take the Instrument and apply the default configuration.

This could all be that it's a "known concept" in the metrics space and since I'm new to all of this I have this hard time understanding it. But this is the exact point why we should clarify and make things better because I guess most of the target audience of OTel will also not have this knowledge.

dyladan · 2021-11-03T14:55:43Z

I agree with Joao that this is a bug in the implementation. From a library author perspective, I should be able to obtain a meter and be sure the name I use won't conflict with other libraries. The current spec states quite clearly that instruments with the same name created by different meters must not interfere with each other. To me, that means it is not up to the end-user application, but that this should be guaranteed by the API/SDK.

I can see that calling the Instrument request.count is not well aligned with the spirit of the semantic conventions.

IMO this is unrelated. The behavior should not depend on the library author picking a "correct" semantic convention as the application developer may want to create custom metrics and there could always be cases where there is no semantic convention yet.

reyang · 2021-11-11T06:37:37Z

Here are some related discussions #2035 (comment).

jmacd · 2022-02-04T21:32:13Z

See #2317.

jmacd · 2022-08-04T20:03:10Z

I believe this was addressed in #2317, please re-open if not.

joaopgrassi added the spec:metrics Related to the specification/metrics directory label Oct 27, 2021

github-actions bot assigned carlosalberto Oct 27, 2021

pirgeo mentioned this issue Nov 17, 2021

Instrument name collisions for different Meters open-telemetry/opentelemetry-dotnet#2627

Closed

jmacd closed this as completed Aug 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarification on duplicated instrument names under different meters #2064

Clarification on duplicated instrument names under different meters #2064

joaopgrassi commented Oct 27, 2021 •

edited

reyang commented Oct 27, 2021

joaopgrassi commented Oct 28, 2021

dyladan commented Nov 3, 2021

reyang commented Nov 11, 2021

jmacd commented Feb 4, 2022

jmacd commented Aug 4, 2022

Clarification on duplicated instrument names under different meters #2064

Clarification on duplicated instrument names under different meters #2064

Comments

joaopgrassi commented Oct 27, 2021 • edited

TL;DR

Long version

reyang commented Oct 27, 2021

joaopgrassi commented Oct 28, 2021

dyladan commented Nov 3, 2021

reyang commented Nov 11, 2021

jmacd commented Feb 4, 2022

jmacd commented Aug 4, 2022

joaopgrassi commented Oct 27, 2021 •

edited