Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add opentelemetry support #2152

Merged
merged 6 commits into from Jun 16, 2021
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
18 changes: 12 additions & 6 deletions client/client.go
Expand Up @@ -10,15 +10,16 @@ import (

"github.com/containerd/containerd/defaults"
grpc_middleware "github.com/grpc-ecosystem/go-grpc-middleware"
"github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc"
controlapi "github.com/moby/buildkit/api/services/control"
"github.com/moby/buildkit/client/connhelper"
"github.com/moby/buildkit/session"
"github.com/moby/buildkit/session/grpchijack"
"github.com/moby/buildkit/util/appdefaults"
"github.com/moby/buildkit/util/grpcerrors"
opentracing "github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
"go.opentelemetry.io/otel/propagation"
"go.opentelemetry.io/otel/trace"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"
)
Expand Down Expand Up @@ -54,8 +55,9 @@ func New(ctx context.Context, address string, opts ...ClientOpt) (*Client, error
needWithInsecure = false
}
if wt, ok := o.(*withTracer); ok {
unary = append(unary, otgrpc.OpenTracingClientInterceptor(wt.tracer, otgrpc.LogPayloads()))
stream = append(stream, otgrpc.OpenTracingStreamClientInterceptor(wt.tracer))
var propagators = propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{})
unary = append(unary, otelgrpc.UnaryClientInterceptor(otelgrpc.WithTracerProvider(wt), otelgrpc.WithPropagators(propagators)))
stream = append(stream, otelgrpc.StreamClientInterceptor(otelgrpc.WithTracerProvider(wt), otelgrpc.WithPropagators(propagators)))
}
if wd, ok := o.(*withDialer); ok {
gopts = append(gopts, grpc.WithContextDialer(wd.dialer))
Expand Down Expand Up @@ -182,12 +184,16 @@ func loadCredentials(opts *withCredentials) (grpc.DialOption, error) {
return grpc.WithTransportCredentials(credentials.NewTLS(cfg)), nil
}

func WithTracer(t opentracing.Tracer) ClientOpt {
func WithTracer(t trace.Tracer) ClientOpt {
return &withTracer{t}
}

type withTracer struct {
tracer opentracing.Tracer
tracer trace.Tracer
}

func (wt *withTracer) Tracer(instrumentationName string, opts ...trace.TracerOption) trace.Tracer {
return wt.tracer
Comment on lines +195 to +196

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a concrete Tracer here, I'd suggest having a TracerProvider as this is dropping information coming from the instrumentation that should be recorded on spans created by the returned Tracer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I used Tracer was because there was no way to get access to the current traceprovider from the context.

}

func resolveDialer(address string) (func(context.Context, string) (net.Conn, error), error) {
Expand Down
6 changes: 3 additions & 3 deletions client/solve.go
Expand Up @@ -22,10 +22,10 @@ import (
"github.com/moby/buildkit/solver/pb"
"github.com/moby/buildkit/util/entitlements"
ocispec "github.com/opencontainers/image-spec/specs-go/v1"
opentracing "github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
fstypes "github.com/tonistiigi/fsutil/types"
"go.opentelemetry.io/otel/trace"
"golang.org/x/sync/errgroup"
)

Expand Down Expand Up @@ -93,8 +93,8 @@ func (c *Client) solve(ctx context.Context, def *llb.Definition, runGateway runG
statusContext, cancelStatus := context.WithCancel(context.Background())
defer cancelStatus()

if span := opentracing.SpanFromContext(ctx); span != nil {
statusContext = opentracing.ContextWithSpan(statusContext, span)
if span := trace.SpanFromContext(ctx); span != nil {
tonistiigi marked this conversation as resolved.
Show resolved Hide resolved
statusContext = trace.ContextWithSpan(statusContext, span)
}

s := opt.SharedSession
Expand Down
4 changes: 2 additions & 2 deletions cmd/buildctl/common/common.go
Expand Up @@ -8,9 +8,9 @@ import (
"time"

"github.com/moby/buildkit/client"
opentracing "github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
"github.com/urfave/cli"
"go.opentelemetry.io/otel/trace"
)

// ResolveClient resolves a client from CLI args
Expand Down Expand Up @@ -64,7 +64,7 @@ func ResolveClient(c *cli.Context) (*client.Client, error) {

ctx := CommandContext(c)

if span := opentracing.SpanFromContext(ctx); span != nil {
if span := trace.SpanFromContext(ctx); span != nil {
opts = append(opts, client.WithTracer(span.Tracer()))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The span.Tracer() method has been removed and will not be present starting with v1.0.0-RC1. It is also less safe with OpenTelemetry to re-use a Tracer from an incoming Span as Tracer can contain information related to different instrumentation. Instead, any instrumentation that needs a Tracer should accept a TracerProvider and use that to construct a Tracer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The span.Tracer() method has been removed and will not be present starting with

That is very surprising to me. Having access to the current tracer to create child spans inside the caller's span has been a key feature to me, as it was in opentracing to add flexible instrumentation to the packages. With this removed, I don't see a logical way how tracing would be implemented in a project like BuildKit. I do not want to use global tracer. It is a code smell and makes it so that the packages make assumptions of how tracing is organized in the final binary. If project depending on Buildkit wants to set up a global tracer, everything will still work. BuildKit's packages should be reusable on their own; in many cases, different components of it are included in other tools. The packages do not make assumptions that they only work as singletons or that you can't have two instances of an object using different tracers. I also don't want every constructor/pkg to contain code on how to initialize the tracer and carry that information. That would be a big maintenance cost with no additional value. Sometimes in BuildKit a context goes through 10+ layers of components.

Letting functions to ignore tracing if none is configured and add extra contextualized child spans if the caller code set up a tracer/parent-span seems a very logical way to manage this. Even if a project is very complicated, with different components and tracing points, the root context usually starts from a very specific location(eg. gprc client).

I looked at the issues open-telemetry/opentelemetry-go#1892 open-telemetry/opentelemetry-go#1887 and to me, if this behavior is not expected, it is an issue with the delegation of the global tracer. I saw some delegation in the global package and thought this was what it was for already. Above else, I think it shows how bad pattern using a global tracer is for synchronizing individual components.

Why would you even want to take parent span from context but use a different tracer. Doesn't it automatically mean that your data is messed up and you can't get a proper report back?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you even want to take parent span from context but use a different tracer. Doesn't it automatically mean that your data is messed up and you can't get a proper report back?

The Tracer carries information regarding the instrumenting library. This information includes the name of the instrumenting library and optionally its version and the schema identifier for attributes produced by the instrumentation. In this sense it is different from the OpenTracing tracer which, to my understanding, carried the export pipeline configuration that in OpenTelemetry is carried by the TracerProvider.

With the Tracer carrying this additional information it is not safe to take a Tracer from a parent span and use it directly. It may identify a different instrumenting library and provide incorrect information regarding the attribute schema.

It seems that what you're looking for is the ability to extract a TracerProvider from a Span, from which a Tracer scoped to the currently-instrumenting library could be created. We'd love to talk more about these use cases at our next SIG meeting, though I expect providing that ability would need to go through the OTEP process before being included in a new specification version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean a different implementation of Span/Tracer than the sdk/trace? It is confusing to me why you are ok with these different implementations to share same spancontext and automatically link to it while they contain the traceid/spanid of something that you claim could be part of the wrong trace. If you want these implementations to use same spancontext key but provide a way to possibly detect conflicts between implementations, add a validation method for it. Also, isn't this the whole point of using these interfaces that I do not target a specific implementation, and my trace-points will always work, whatever tracer is configured by the caller. If my code only works with a specific implementation of tracer I would need to validate it anyway.

To have reusable packages, I need a way for my library to just define the tracing points. The library should not care how the actual tracing infrastructure is configured, if two instances of the same object use the same tracer or not etc. This is all up to the caller. Also, every constructor/function should not take and pass tracer objects around. They can of course choose to optionally take them if they want to provide override points, but the default path should just work. With this change, you are not providing a way to achieve that(without me defining my own wrapper that I can continue to pass with context, meaning packages will be incompatible with non-buildkit callers).

It seems that what you're looking for is the ability to extract a TracerProvider from a Span, from which a Tracer scoped to the currently-instrumenting library could be created

That might technically work, but if it means a traceid per spanid it makes no logical sense. I just need a way to pass tracing context through my components. You do it half-way by providing a spancontext but force me to use a global variable for the other half of what StartSpan needs. That's not an option if you spent a lot of time designing your components so that they would not use the global state and have such limitations. Another thing I don't get is that getting access to TraceProvider is actually dangerous. Eg. my library should not have a possibility to add span processors or shut it down. My library should only have the capability to create additional spans in the context that caller provides and that is exactly the only thing enabled by the Tracer interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To have reusable packages,

An example of a package that I claim is quite useless atm. https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/net/http#NewTransport . This is supposed to be a super reusable package, for different services, implementations of roundtrip etc. It takes the Tracer(Provider) on NewTransport that is the context that has nothing to do with tracing. This is a background pool with its own lifecycle, the init code is nowhere near the place users would make requests, somewhere there is a cleanup routine to clean up old connections etc. None of it has anything to do with request tracing. The actual tracing configuration comes in with the request to RoundTrip and is passed to the next RoundTrip implementation. This works perfectly fine for maintaining the relationship of the spans that are managed by context passed to RoundTrip. But because it doesn't read the tracer itself from context and that is taken with a side-channel there is no actual other way to use this component than to use a global tracer, meaning this component is not reusable at all and spans passed to RoundTrip might be from a completely different tracer.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you available tomorrow for a conversation with @MrAlias and I? I fear that we may be talking past each other or not operating from consistent assumptions and understandings and it may be faster to resolve that with synchronous communication. It is very important to me to understand your concerns and whether addressing them may require changes to the public API we are looking to stabilize in the near term.

}

Expand Down
29 changes: 9 additions & 20 deletions cmd/buildctl/common/trace.go
Expand Up @@ -3,26 +3,24 @@ package common
import (
"context"
"os"
"strings"

"github.com/moby/buildkit/util/appcontext"
"github.com/moby/buildkit/util/tracing/detect"
_ "github.com/moby/buildkit/util/tracing/detect/jaeger"
opentracing "github.com/opentracing/opentracing-go"
"github.com/opentracing/opentracing-go/ext"
otlog "github.com/opentracing/opentracing-go/log"
"github.com/urfave/cli"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/codes"
"go.opentelemetry.io/otel/trace"
)

func AttachAppContext(app *cli.App) error {
ctx := appcontext.Context()

tracer, err := detect.OTTracer()
tracer, err := detect.Tracer()
if err != nil {
return err
}

var span opentracing.Span
var span trace.Span

for i, cmd := range app.Commands {
func(before cli.BeforeFunc) {
Expand All @@ -34,10 +32,8 @@ func AttachAppContext(app *cli.App) error {
}
}

span = tracer.StartSpan(name)
span.LogFields(otlog.String("command", strings.Join(os.Args, " ")))

ctx = opentracing.ContextWithSpan(ctx, span)
ctx, span = tracer.Start(ctx, name)
span.SetAttributes(attribute.KeyValue{Key: "command", Value: attribute.ArrayValue(os.Args)})
tonistiigi marked this conversation as resolved.
Show resolved Hide resolved

clicontext.App.Metadata["context"] = ctx
return nil
Expand All @@ -47,7 +43,7 @@ func AttachAppContext(app *cli.App) error {

app.ExitErrHandler = func(clicontext *cli.Context, err error) {
if span != nil {
ext.Error.Set(span, true)
span.SetStatus(codes.Error, err.Error())
Comment on lines 46 to +47

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

span will always be non-nil.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except in here it will be nil if the Before callback never gets called.

}
cli.HandleExitCoder(err)
}
Expand All @@ -60,7 +56,7 @@ func AttachAppContext(app *cli.App) error {
}
}
if span != nil {
span.Finish()
span.End()
tonistiigi marked this conversation as resolved.
Show resolved Hide resolved
}
return detect.Shutdown(context.TODO())
}
Expand All @@ -70,10 +66,3 @@ func AttachAppContext(app *cli.App) error {
func CommandContext(c *cli.Context) context.Context {
return c.App.Metadata["context"].(context.Context)
}

type nopCloser struct {
}

func (*nopCloser) Close() error {
return nil
}
1 change: 1 addition & 0 deletions cmd/buildctl/main.go
Expand Up @@ -13,6 +13,7 @@ import (
"github.com/moby/buildkit/util/appdefaults"
"github.com/moby/buildkit/util/profiler"
"github.com/moby/buildkit/util/stack"
_ "github.com/moby/buildkit/util/tracing/detect/jaeger"
"github.com/moby/buildkit/version"
"github.com/sirupsen/logrus"
"github.com/urfave/cli"
Expand Down
25 changes: 19 additions & 6 deletions cmd/buildkitd/main.go
Expand Up @@ -25,7 +25,6 @@ import (
"github.com/docker/go-connections/sockets"
"github.com/gofrs/flock"
grpc_middleware "github.com/grpc-ecosystem/go-grpc-middleware"
"github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc"
"github.com/moby/buildkit/cache/remotecache"
inlineremotecache "github.com/moby/buildkit/cache/remotecache/inline"
localremotecache "github.com/moby/buildkit/cache/remotecache/local"
Expand Down Expand Up @@ -53,10 +52,12 @@ import (
"github.com/moby/buildkit/version"
"github.com/moby/buildkit/worker"
specs "github.com/opencontainers/image-spec/specs-go/v1"
"github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
"github.com/urfave/cli"
"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
"go.opentelemetry.io/otel/propagation"
"go.opentelemetry.io/otel/trace"
"golang.org/x/sync/errgroup"
"google.golang.org/grpc"
)
Expand All @@ -69,6 +70,8 @@ func init() {
reexec.Init()
}

var propagators = propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{})

type workerInitializerOpt struct {
config *config.Config
configMetaData *toml.MetaData
Expand Down Expand Up @@ -211,13 +214,15 @@ func main() {
}
}

tracer, err := detect.OTTracer()
tracer, err := detect.Tracer()
if err != nil {
return err
}

streamTracer := otelgrpc.StreamServerInterceptor(otelgrpc.WithTracerProvider(constTracerProvider{tracer: tracer}), otelgrpc.WithPropagators(propagators))

unary := grpc_middleware.ChainUnaryServer(unaryInterceptor(ctx, tracer), grpcerrors.UnaryServerInterceptor)
stream := grpc_middleware.ChainStreamServer(otgrpc.OpenTracingStreamServerInterceptor(tracer), grpcerrors.StreamServerInterceptor)
stream := grpc_middleware.ChainStreamServer(streamTracer, grpcerrors.StreamServerInterceptor)

opts := []grpc.ServerOption{grpc.UnaryInterceptor(unary), grpc.StreamInterceptor(stream)}
server := grpc.NewServer(opts...)
Expand Down Expand Up @@ -531,8 +536,8 @@ func getListener(addr string, uid, gid int, tlsConfig *tls.Config) (net.Listener
}
}

func unaryInterceptor(globalCtx context.Context, tracer opentracing.Tracer) grpc.UnaryServerInterceptor {
withTrace := otgrpc.OpenTracingServerInterceptor(tracer, otgrpc.LogPayloads())
func unaryInterceptor(globalCtx context.Context, tracer trace.Tracer) grpc.UnaryServerInterceptor {
withTrace := otelgrpc.UnaryServerInterceptor(otelgrpc.WithTracerProvider(constTracerProvider{tracer: tracer}), otelgrpc.WithPropagators(propagators))

return func(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp interface{}, err error) {
ctx, cancel := context.WithCancel(ctx)
Expand Down Expand Up @@ -736,3 +741,11 @@ func getDNSConfig(cfg *config.DNSConfig) *oci.DNSConfig {
}
return dns
}

type constTracerProvider struct {
tracer trace.Tracer
}

func (tp constTracerProvider) Tracer(instrumentationName string, opts ...trace.TracerOption) trace.Tracer {
return tp.tracer
}
7 changes: 2 additions & 5 deletions go.mod
Expand Up @@ -23,12 +23,11 @@ require (
github.com/gofrs/flock v0.7.3
github.com/gogo/googleapis v1.4.0
github.com/gogo/protobuf v1.3.2
github.com/golang/protobuf v1.5.0
github.com/golang/protobuf v1.5.2
github.com/google/go-cmp v0.5.5
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510
github.com/gorilla/mux v1.8.0 // indirect
github.com/grpc-ecosystem/go-grpc-middleware v1.2.0
github.com/grpc-ecosystem/grpc-opentracing v0.0.0-20180507213350-8e809c8a8645
github.com/hashicorp/go-immutable-radix v1.0.0
github.com/hashicorp/golang-lru v0.5.3
github.com/hashicorp/uuid v0.0.0-20160311170451-ebb0a03e909c // indirect
Expand All @@ -43,8 +42,6 @@ require (
github.com/opencontainers/runc v1.0.0-rc93
github.com/opencontainers/runtime-spec v1.0.3-0.20200929063507-e6143ca7d51d
github.com/opencontainers/selinux v1.8.0
github.com/opentracing-contrib/go-stdlib v1.0.0
github.com/opentracing/opentracing-go v1.2.0
github.com/pkg/errors v0.9.1
github.com/pkg/profile v1.5.0
github.com/serialx/hashring v0.0.0-20190422032157-8b2912629002
Expand All @@ -55,8 +52,8 @@ require (
github.com/tonistiigi/vt100 v0.0.0-20210615222946-8066bb97264f
github.com/urfave/cli v1.22.2
go.etcd.io/bbolt v1.3.5
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.20.0
go.opentelemetry.io/otel v0.20.0
go.opentelemetry.io/otel/bridge/opentracing v0.20.0
go.opentelemetry.io/otel/exporters/otlp v0.20.0
go.opentelemetry.io/otel/exporters/trace/jaeger v0.20.0
go.opentelemetry.io/otel/sdk v0.20.0
Expand Down