Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[iomgr][Windows] Return proper error code to client when connection is reset #33502

Merged
merged 2 commits into from
Jun 22, 2023

Conversation

drfloob
Copy link
Member

@drfloob drfloob commented Jun 21, 2023

Fixes #24206 by ensuring that IOCP/socket errors in the iomgr on_read callback are properly annotated with the gRPC Unavailable status. The WindowsEventEngine was already doing this correctly (try running the client with $env:GRPC_EXPERIMENTS="event_engine_client").

This also adds two small cleanups:

  • Cleanly prints statuses with their child statuses in a few spots within the chttp2 transport logging (previously, child messages were printed with garbled bits)
  • Adds friendly names to a subset of WSA errors that we're likely to see from common operations. The top-level status message will no longer just say "WSA Error" in many cases.

CC @Hamza-Q

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

@@ -70,12 +70,42 @@ absl::Status grpc_os_error(const grpc_core::DebugLocation& location, int err,
}

#ifdef GPR_WINDOWS
std::string WSAErrorToShortDescription(int err) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Windows not provide a function that does this already? I would have expected that it would. Posix has had strerror() for a long time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@drfloob drfloob Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use FormatMessage in gpr_format_message, but it produces longer messages. For example, for the WSAECONNRESET error, FormatMessage produces "An existing connection was forcibly closed by the remote host", but for a top-level Status summary (which becomes the error strings that clients receive), something like "Connection reset" will do. The short version seems friendlier.

Here is an example error:

D0620 18:33:23.211000000 15608 src/core/ext/transport/chttp2/transport/chttp2_transport.cc:2198] MARK_STREAM_CLOSED: t=000001946F442850 s=000001946F441290(id=1) read+write [UNKNOWN:Endpoint read failed {occurred_during_write:0, created_time:"2023-06-21T01:33:23.210383893+00:00", file_line:2534, file:"src/core/ext/transport/chttp2/transport/chttp2_transport.cc", children:[UNAVAILABLE:Connection Reset {file:"src/core/lib/iomgr/tcp_windows.cc", file_line:189, created_time:"2023-06-21T01:33:23.2102455+00:00", wsa_error:10054, grpc_status:14, os_error:"An existing connection was forcibly closed by the remote host.\r\n", syscall:"IOCP/Socket"}]}]

This propagates to the surface as an error with status GRPC_STATUS_UNAVAILABLE and message "Connection Reset".

Edit: adding linebreaks in the example for readability
Edit: Replaced the Cancel pings log with a "mark stream closed" log, which is actually what propagates up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Windows not provide a function that does this already?

To add, if you go code searching for things like WSAECONNRESET, you'll find that an error-to-short-name mapping is not uncommon. It does seem like an oversight for Windows to not provide this already.

GRPC_CHTTP2_IF_TRACING(
gpr_log(GPR_INFO, "%p CANCEL PINGS: %s", t, error.ToString().c_str()));
GRPC_CHTTP2_IF_TRACING(gpr_log(GPR_INFO, "%p CANCEL PINGS: %s", t,
grpc_core::StatusToString(error).c_str()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my education: why is ToString not the right thing?

Copy link
Member Author

@drfloob drfloob Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows at least, the child errors are printed with embedded hex when you use status.ToString(). Try to pick out the file line number for the child error in this example:

D0620 15:33:35.497000000 19492 src/core/ext/transport/chttp2/transport/chttp2_transport.cc:2198] MARK_STREAM_CLOSED: t=000001C9618D5830 s=000001C9618DB710(id=1) read+write [UNKNOWN: Endpoint read failed [type.googleapis.com/grpc.status.str.file='src/core/ext/transport/chttp2/transport/chttp2_transport.cc'] [type.googleapis.com/grpc.status.int.file_line='2534'] [type.googleapis.com/grpc.status.time.created_time='2023-06-20T22:33:35.497401355+00:00'] [type.googleapis.com/grpc.status.children=''\x01\x00\x00\x08\x02\x12\x44\x41n existing connection was forcibly closed by the remote host.%0D%0A\x1aX\n1type.googleapis.com/grpc.status.time.created_time\x12#2023-06-20T22:33:35.497371459+00:00\x1a\x34\n-type.googleapis.com/grpc.status.int.file_line\x12\x03\x31\x39\x30\x1aM\n(type.googleapis.com/grpc.status.str.file\x12!src/core/lib/iomgr/tcp_windows.cc'] [type.googleapis.com/grpc.status.int.occurred_during_write='0']]

@drfloob drfloob marked this pull request as ready for review June 21, 2023 19:03
@drfloob drfloob merged commit 055158b into grpc:master Jun 22, 2023
64 of 65 checks passed
@copybara-service copybara-service bot added the imported Specifies if the PR has been imported to the internal repository label Jun 23, 2023
mario-vimal pushed a commit to mario-vimal/grpc that referenced this pull request Jul 13, 2023
…s reset (grpc#33502)

Fixes grpc#24206 by ensuring that IOCP/socket errors in the iomgr on_read
callback are properly annotated with the gRPC Unavailable status. The
WindowsEventEngine was already doing this correctly (try running the
client with `$env:GRPC_EXPERIMENTS="event_engine_client"`).

This also adds two small cleanups:
* Cleanly prints statuses with their child statuses in a few spots
within the chttp2 transport logging (previously, child messages were
printed with garbled bits)
* Adds friendly names to a subset of WSA errors that we're likely to see
from common operations. The top-level status message will no longer just
say "WSA Error" in many cases.

CC @Hamza-Q
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bloat/none imported Specifies if the PR has been imported to the internal repository lang/core per-call-memory/neutral per-channel-memory/neutral release notes: yes Indicates if PR needs to be in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GRPC-Core Client inconsistency between Windows/Unix when server shuts down during a request
3 participants