Spans aren't being marked as errors in Cloud Trace #730

Open
andypwarren opened this issue Oct 4, 2023 · 11 comments

@andypwarren

Hi,

I'm instrumenting a gRPC server with OpenTelemetry and Google Cloud Trace. I can see spans in my Trace dashboard, but they aren't coloured red when an RPC returns an internal error. I'm using otelgrpc.UnaryServerInterceptor() (code here), which calls span.SetStatus with the OTel error code and the gRPC message if any of these statuses is returned:

  • grpc_codes.Unknown
  • grpc_codes.DeadlineExceeded
  • grpc_codes.Unimplemented
  • grpc_codes.Internal
  • grpc_codes.Unavailable
  • grpc_codes.DataLoss

I've also tried calling span.SetStatus outside the interceptors, and Cloud Trace doesn't colour those spans red either, so I don't think the problem is with the interceptor code.
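
For readers following along, the classification described above can be sketched roughly as follows. This is a minimal illustration, not the otelgrpc source: GRPCCode is a local stand-in for google.golang.org/grpc/codes.Code so that the snippet compiles with the standard library alone, and IsServerError is a hypothetical helper name.

```go
package main

import "fmt"

// GRPCCode is a local stand-in for google.golang.org/grpc/codes.Code,
// used here so the sketch has no external dependencies.
type GRPCCode uint32

// Numeric values follow the gRPC status code table.
const (
	OK               GRPCCode = 0
	Unknown          GRPCCode = 2
	DeadlineExceeded GRPCCode = 4
	NotFound         GRPCCode = 5
	Unimplemented    GRPCCode = 12
	Internal         GRPCCode = 13
	Unavailable      GRPCCode = 14
	DataLoss         GRPCCode = 15
)

// IsServerError (hypothetical helper) reports whether a server-side span
// should get an error status, using the six codes listed in this issue.
func IsServerError(c GRPCCode) bool {
	switch c {
	case Unknown, DeadlineExceeded, Unimplemented, Internal, Unavailable, DataLoss:
		return true
	default:
		return false
	}
}

func main() {
	for _, c := range []GRPCCode{OK, NotFound, Internal} {
		fmt.Printf("code %d -> error status: %v\n", c, IsServerError(c))
	}
}
```

Code 13 is Internal, which matches the rpc.grpc.status_code attribute seen on the failing span.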

I've created a simple demo app to reproduce this using the example grpc-go Greeter service with the addition of tracing using otel and cloudtrace.

When forcing a request to fail, this is what I see in Cloud Trace:

[Screenshot: Screen Shot 2023-10-04 at 16 21 05]

The interceptor has added the attribute rpc.grpc.status_code: 13, but the span status isn't showing up.

Ideally this would produce a red dot in the trace graph and the span would be coloured red.

Many thanks,

Andy

@dashpole dashpole added bug Something isn't working priority: p2 labels Oct 5, 2023
@dashpole dashpole self-assigned this Oct 5, 2023
@dashpole
Contributor

dashpole commented Oct 6, 2023

This seems suspicious...

case codes.Error:
	sp.Status = &statuspb.Status{Code: int32(codepb.Code_UNKNOWN), Message: s.Status().Description}
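
To give the quoted case some context, here is a rough, self-contained sketch of how a two-valued OTel status might be translated into a google.rpc-style status. Only the Error branch comes from the snippet above; the Ok and Unset branches are assumptions, and OTelCode, RPCStatus and ToRPCStatus are local stand-ins rather than the exporter's real types.

```go
package main

import "fmt"

// OTelCode is a local stand-in for go.opentelemetry.io/otel/codes.Code.
type OTelCode int

const (
	Unset OTelCode = iota
	Error
	Ok
)

// RPCStatus is a local stand-in for the google.rpc.Status message.
type RPCStatus struct {
	Code    int32
	Message string
}

const (
	rpcOK      int32 = 0 // google.rpc.Code OK
	rpcUnknown int32 = 2 // google.rpc.Code UNKNOWN
)

// ToRPCStatus (hypothetical) maps an OTel span status to an RPC status.
// The Error branch mirrors the quoted line; the other branches are assumed.
func ToRPCStatus(code OTelCode, description string) *RPCStatus {
	switch code {
	case Ok:
		return &RPCStatus{Code: rpcOK}
	case Error:
		return &RPCStatus{Code: rpcUnknown, Message: description}
	default:
		// Unset: leave the status empty (assumption).
		return nil
	}
}

func main() {
	st := ToRPCStatus(Error, "boom")
	fmt.Printf("code=%d message=%q\n", st.Code, st.Message)
}
```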

@dashpole
Contributor

dashpole commented Oct 6, 2023

Seems potentially related to #143

@dashpole
Contributor

dashpole commented Oct 6, 2023

@aabmass do you remember why we set codes.Error to codepb.Code_UNKNOWN?

@aabmass
Contributor

aabmass commented Oct 6, 2023

Unknown represents an unknown error, along the lines of the HTTP 500 status code. Since OTel only has two possible statuses (OK and ERROR), gRPC's UNKNOWN (error) seems reasonable.

Do you know what status codes actually show red in Cloud Trace?

@dashpole
Contributor

dashpole commented Oct 6, 2023

It does seem like we are doing the right thing based on https://pkg.go.dev/google.golang.org/genproto/googleapis/rpc/code#Code

// Unknown error. For example, this error may be returned when
// a Status value received from another address space belongs to
// an error space that is not known in this address space. Also
// errors raised by APIs that do not return enough error information
// may be converted to this error.
//
// HTTP Mapping: 500 Internal Server Error
Code_UNKNOWN Code = 2

> Do you know what status codes actually show red in Cloud Trace?

I'll see if I can find the answer to that question.

@dashpole
Contributor

dashpole commented Oct 6, 2023

I tested all status codes, and none of them makes the span look like an error.

@dashpole
Contributor

dashpole commented Oct 6, 2023

I'll reach out to the trace UI team.

@BradleyChatha

BradleyChatha commented Oct 10, 2023

For further context, the way we're currently getting around this is by setting the attribute /http/status_code to 500, regardless of whether the context is for an HTTP server or not.

It seems to be the only way to make the trace UI render it as an error.
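
A minimal sketch of that workaround, assuming span attributes are modelled as a plain map so the example runs with the standard library alone. WithErrorMarker is a hypothetical helper; with the real OpenTelemetry Go SDK the equivalent step would presumably be a call such as span.SetAttributes(attribute.Int("/http/status_code", 500)) on the failing span.

```go
package main

import "fmt"

// httpStatusKey is the attribute name given in the comment above.
const httpStatusKey = "/http/status_code"

// WithErrorMarker (hypothetical helper) returns a copy of attrs and, when
// failed is true, adds /http/status_code=500 so the trace UI shows an error.
func WithErrorMarker(attrs map[string]int64, failed bool) map[string]int64 {
	out := make(map[string]int64, len(attrs)+1)
	for k, v := range attrs {
		out[k] = v
	}
	if failed {
		out[httpStatusKey] = 500
	}
	return out
}

func main() {
	// A failing gRPC span: the interceptor already recorded status code 13.
	attrs := WithErrorMarker(map[string]int64{"rpc.grpc.status_code": 13}, true)
	fmt.Println(attrs[httpStatusKey])
}
```

The helper copies the map rather than mutating it, so a successful span keeps its original attributes untouched.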

@andypwarren
Author

Hi @dashpole, is there any update on this?

@dashpole
Contributor

The Cloud Trace folks are aware of the issue and suggested the same workaround pointed out above: #730 (comment). I'm not sure about timelines, but I'll post here when there are updates.

@shraddhaag

+1. We are facing this problem as well. Thanks for pointing to the workaround!
