Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unparseable metadata error desperately needs better error response #36245

Open
ravenblackx opened this issue Apr 3, 2024 · 4 comments
Open

Comments

@ravenblackx
Copy link

What version of gRPC and what language are you using?

1.59.4 (but it's not fixed later), C++

What did you do?

Send an http-based grpc request with (for example) header x-bin: $$$

What did you expect to see?

At least one of:

  • InvalidArgument status code
  • error message describing the problem
  • log message describing the problem
  • grpc message works anyway and just doesn't parse this header as binary

What did you see instead?

  • UnknownError status code (2)
  • "Stream removed" as the error message
  • other end of connection just sees stream reset and doesn't even get "Stream removed".
  • no log messages at all.
  • even with debug logging turned on, there is no log message associated with the problem header directly, there is only messages like this logged on every subsequent header of the request.
    hpack_parser.cc:705]                  HTTP:8439:HDR:SVR: x-forwarded-proto: https (parse error: INTERNAL: Error parsing 'x-bin' metadata: illegal base64 encoding [type.googleapis.com/grpc.status.int.stream_id='0'])
    

Error message content comes from here and it only gets logged here, in the debug log of the next header - there is no error/warn/info log when the actual error occurs in or around the call to ParseBinary.

This inadequate error response caused a 6 week long debugging process in production trying to find what was causing these unexpected failures, with the team that owns the service believing that the client/proxy closed the connection (because "Stream removed" as an error sounds like something someone else did, not a confession) and the team that owned the proxy confident that it was the server that closed it. The onset of these failures approximately coincided with an update to grpc 1.59.4 from, I think, 1.56, so it might be that an older grpc version didn't abort on unparseable -bin headers, but it also might just be that no clients were sending them until after the update and the timing was a coincidence.

@dfawley
Copy link
Member

dfawley commented Apr 4, 2024

This situation is not covered in https://github.com/grpc/grpc/blob/master/doc/statuscodes.md, but INTERNAL seems most appropriate since it looks very similar to: "Error parsing request proto". Go would do the following:

https://github.com/grpc/grpc-go/blob/c31cec33dd5d1b524a010acbef9d6932dfc791dc/internal/transport/http2_server.go#L454-L457

That is, INTERNAL with a grpc-message header indicating the key/value of the header that was problematic and the error from decoding (presuming the client is using an application/grpc content-type), and also HTTP status 400 (Bad Request).

@ravenblackx
Copy link
Author

ravenblackx commented Apr 8, 2024

Additional variations on the same problem - a header with a | character in the key, causing Illegal header key: [key] which, in the same way, only logs in the debug logs of subsequent headers, not any kind of directly related or higher-priority log message, and with the same unhelpful "stream removed" message logged (and literally just closing the stream with no message, from the client/proxy perspective).

And we haven't seen happening, but I noticed while looking through the code for the one which was happening, that similarly, a non-bin header with value containing any ascii characters <32 would provoke the exact same kind of unhelpful failure state.

@yashykt
Copy link
Member

yashykt commented Apr 9, 2024

This is unfortunate. Can you tell me what the error_details field seen is?

@ravenblackx
Copy link
Author

I think you're asking about the error sent to the client? If so, there is none, because the server just closes the stream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants