Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

semconv: "normal" updates should not require updating import statements #4886

Closed
muncus opened this issue Feb 6, 2024 · 5 comments
Closed
Labels
enhancement New feature or request

Comments

@muncus
Copy link

muncus commented Feb 6, 2024

Problem Statement

OTel SDK updates frequently require widespread editing of import statements for the semconv packages, to ensure that the SDK, any detectors used, and the consumer's code are all using the same imported version.
I find this tedious, because one must first determine the semconv version the SDK, and any detectors are using (which usually means reading the code itself), then make updates to one's own code (or worse, await updates to the detector).
I find the effort required to be significantly greater than any other library, especially for a minor update, and I'd love to have an easier way forward.

As an example, a routine renovate PR required this additional fix. (note, this is not a conversation about renovate, a human could have made the same PR).

The relevant code here is doing nothing particularly fancy, but does add a (stable, well-known) attribute, which requires using the semconv package.

The test failures only occur when a test is performed on GCP, where the gcp detector returns a non-null resource, and failed to merge it with the resource.WithTelemetrySDK() Resource.

Proposed Solution

In the spirit of #4876, we could give SDK consumers a way to "just use the semconv built into the SDK" through a method like resource.GetSDKSemConv(), which just returns whatever semconv package the SDK is using.

The SDK consumer code no longer needs to import semconv directly:

	res, err := resource.New(ctx,
		// Use the GCP resource detector to detect information about the GCP platform
		resource.WithDetectors(gcp.NewDetector()),
		// Keep the default detectors
		resource.WithTelemetrySDK(),
		// Add your own custom attributes to identify your application
		resource.WithAttributes(
			resource.GetSDKSemConv().ServiceNameKey.String("my-application"),
		),
	)

If an SDK consumer wishes to be explicit about the semconv versions they export, they can still do so.

This would also give detector authors the ability (should they choose) to avoid needing to bump all of their semconv imports when updating to a new version, relying instead on the SDK versions in their go.mod files (thus eliminating one source of semconv version conflicts!)

Alternatives

#4876 alone may reduce the number of runtime errors that occur because of this, but updates will remain tedious.

Versioning and releasing the semconv package separately would allow go's dependency resolution to address conflicts, but would mean all SDK consumers are forced to use the same semconv version, which I do not believe is desired.

Prior Art

Similar issues have been raised in the past: #2341, opentelemetry-go-contrib/3657. Each of those have additional linked references.

@muncus muncus added the enhancement New feature or request label Feb 6, 2024
@MrAlias
Copy link
Contributor

MrAlias commented Feb 6, 2024

OpenTelemetry semantic conventions are not guaranteed to be stable version to version: open-telemetry/opentelemetry-specification#3443

In the spirit of #4876, we could give SDK consumers a way to "just use the semconv built into the SDK" through a method like resource.GetSDKSemConv(), which just returns whatever semconv package the SDK is using.

This is not possible given the mentioned instability. The API across packages is not guaranteed to be stable nor compatible. What if the "semantic conventions" returned by GetSDKSemConv() has no ServiceNameKey or that key name has changed? Given you can already determine the semantic conventions used for a Resource via its schema URL, there is nothing stopping you from attempting the solution you mention.

We have no plans to support this, it has been discussed in the past and has been rejected. We provide stable APIs for these packages but make not guarantees about stability or compatibility across these packages. This is intentional given the mentioned restrictions.

@MrAlias MrAlias closed this as not planned Won't fix, can't repro, duplicate, stale Feb 6, 2024
@MrAlias
Copy link
Contributor

MrAlias commented Feb 6, 2024

@muncus definitely drop by the OpenTelemetry semantic conventions working group if you would like to express your opinion about how this instability is affecting you. Having more user voices on this issue would be very helpful.

@muncus
Copy link
Author

muncus commented Feb 7, 2024

(quoted out of order)

We have no plans to support this, it has been discussed in the past and has been rejected. We provide stable APIs for these packages but make not guarantees about stability or compatibility across these packages. This is intentional given the mentioned restrictions.

Can you direct me to past discussions? I'm willing to follow up with the semconv working group as you suggested, but i'd like to be well informed before joining that discussion. I found little in the github search beyond the issues i've already linked.

This is not possible given the mentioned instability. The API across packages is not guaranteed to be stable nor compatible. What if the "semantic conventions" returned by GetSDKSemConv() has no ServiceNameKey or that key name has changed? Given you can already determine the semantic conventions used for a Resource via its schema URL, there is nothing stopping you from attempting the solution you mention.

As a library consumer, I'd prefer this behavior over the current behavior.
If the ServiceNameKey is no longer present, that's a compile-time error (and will be highlighted in an IDE). The current behavior is a runtime error in some environments. A change in value can be detected with a test (or a read of the changelog), and remedied with a bit of Collector rewriting. These events are rarer than routine SDK updates and I believe this method will result in lower long-term effort.

In practice, i think users will likely come to the same conclusion as the discussion in #4846 - that semconv should be avoided in one's own service code (because of the maintenance overhead), which seems like an undesirable outcome. 😞

I look forward to catching up on the past discussion, and continuing this line of though with the semconv working group - thanks for the suggestion.

@jsuereth
Copy link

jsuereth commented Feb 7, 2024

We discussed this issue a bit in the TC. This is actually a known issue in the specification, with a lot of nuance to it:

I believe this is actually an issue with the Resource Model and specification, not specifically semconv or Go. We plan to make progress at that level, via directly looking at #3361 as we make incremental improvements to Resource.

Foundation-ally, Resource currently conflates three aspects:

  • Identity
  • Description
  • Relationships

We're going to find better ways to tease these apart in the specification.

@felix
Copy link

felix commented Apr 3, 2024

So we bumped into this today. We had

import semconv "go.opentelemetry.io/otel/semconv/v1.4.0"

and updated some OTel components that used a newer version. We did not update the semconv package version. This results in a runtime error across all services, which is worrying. The fix is identified in the error:

import semconv "go.opentelemetry.io/otel/semconv/v1.24.0"

but it would be much nicer at compile time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants