Passing OpenTelemetry (otel) Env Vars to the Shim Runtime #10173

Open
Tracked by #10
Mossaka opened this issue May 6, 2024 · 0 comments


Mossaka commented May 6, 2024

What is the problem you're trying to solve

As a maintainer of the runwasi project, I've identified OpenTelemetry collection from containerd shims as an area for improvement. Currently, containerd supports collecting traces, metrics, and logs through OTLP exporters using the standard OTLP env vars (e.g. #8645 and otel config options), but it lacks a similar mechanism for collecting detailed otel telemetry directly from the shims.

Describe the solution you'd like

As discussed in the community meeting on April 24, 2024, we reached a consensus that a good first step toward supporting shim otel is to pass the OTLP env vars down from containerd to the shim.

More specifically, we may want to pass these env vars to the start command of the shim:

const (
	sdkDisabledEnv = "OTEL_SDK_DISABLED"

	otlpEndpointEnv       = "OTEL_EXPORTER_OTLP_ENDPOINT"
	otlpTracesEndpointEnv = "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"
	otlpProtocolEnv       = "OTEL_EXPORTER_OTLP_PROTOCOL"
	otlpTracesProtocolEnv = "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL"

	otelTracesExporterEnv = "OTEL_TRACES_EXPORTER"
)

Note: it seems that containerd already passes OS env vars to the shim binary at https://github.com/containerd/containerd/blob/main/pkg/shim/util.go#L71. This proposal is to make these env vars more explicit in the shim runtime APIs.
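
To make the idea concrete, here is a minimal sketch of what explicit forwarding could look like. It is not containerd's actual code: `filterOTelEnv`, `shimStartCommand`, the binary name, and the bare `start` argument are hypothetical names used only for illustration, and the real shim invocation takes more flags than shown here.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// otelEnvKeys is the allowlist of variables named in this proposal.
var otelEnvKeys = []string{
	"OTEL_SDK_DISABLED",
	"OTEL_EXPORTER_OTLP_ENDPOINT",
	"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
	"OTEL_EXPORTER_OTLP_PROTOCOL",
	"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL",
	"OTEL_TRACES_EXPORTER",
}

// filterOTelEnv (hypothetical helper) returns the KEY=VALUE entries of
// environ whose key is in otelEnvKeys, preserving their original order.
func filterOTelEnv(environ []string) []string {
	var out []string
	for _, kv := range environ {
		key, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // skip entries that are not KEY=VALUE
		}
		for _, want := range otelEnvKeys {
			if key == want {
				out = append(out, kv)
				break
			}
		}
	}
	return out
}

// shimStartCommand (hypothetical helper) builds, but does not run, a shim
// "start" command whose environment is baseEnv plus the allowlisted OTel
// variables taken from environ.
func shimStartCommand(binary string, baseEnv, environ []string) *exec.Cmd {
	cmd := exec.Command(binary, "start")
	cmd.Env = append(append([]string{}, baseEnv...), filterOTelEnv(environ)...)
	return cmd
}

func main() {
	// Assumed binary name; only the resulting environment is printed.
	cmd := shimStartCommand("containerd-shim-example-v1", nil, os.Environ())
	fmt.Println(cmd.Env)
}
```

With an explicit allowlist like this, the set of variables a shim can rely on becomes part of the contract instead of a side effect of inheriting the daemon's whole environment.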

Additional context

We have also discussed a few other approaches in the community call:

  1. Wrap each trace span in an event sent to containerd
  2. Use the existing ttrpc communication channels to transport telemetry data, which would mean defining new APIs for OTEL

Both approaches share the same downside: they expose too many OTEL-specific concepts to containerd.

I am mostly interested in the impact this method might have on containerd's performance. I'd love to gather your feedback, additional ideas, or concerns regarding the proposed method.

FYI @cpuguy83 @dmcgowan
