Fix attach goroutine/fd leak when no I/O is ready #45052
base: master
Conversation
Force-pushed from 19e39b2 to 53b4d3d
/cc @corhere
I wonder if the client-disconnects solution is more clever than it needs to be. In the case of TCP, the lowest common denominator for connection state tracking, each peer can either gracefully close its write stream (FIN) or forcefully abort the whole connection (RST), but there is no affordance in the TCP state machine to signal gracefully that a peer is no longer interested in receiving any more data. And the only portable way to check whether a socket is still open is to try reading or writing at least one byte and see if it succeeds. What if we always used the client-closed-for-writing signal as our cue to stop copying stdout/stderr from the container? That is, have a goroutine which calls `conn.Read()` even when stdin isn't attached to the container, and closes stdout/stderr when the read returns `io.EOF` or similar. The same code path would work for UNIX domain sockets, TCP sockets, WebSockets and Windows named pipes; no platform-specific event handling required! And all the bases would be covered for TCP: FIN, RST and keep-alive timeout will all result in EOF on read.
@corhere You are right, I think this could work.
So it seems like there is a problem with that 3rd commit, possibly because the read side is shut down when there's nothing on stdin.
Sigh, the CLI half-closes the connection to signal EOF on stdin. So much for that idea. It's looking like the sort of OS-specific close notifications you had implemented initially is really the only thing we could do to fix the leaks within the current API versions. Using socket half-close to signal stdin-closed rather than attach-done is arguably a design flaw in our container-attach protocol. With the WebSocket attach protocol also busted (ws provides framing but no multiplexing, so stdout can't be distinguished from stderr) I see an opportunity to fix all these flaws... the third time around. I'm imagining a new WebSocket attach protocol (paired with new endpoints?) for attach and exec which:
Rebased this and removed the 3rd commit, which didn't work as hoped.
I'm not super thrilled about the un-idiomatic use of callbacks in `notifyClosed`, but I can understand why: the more idiomatic designs (a synchronous blocking call returning an error, or returning a child context in the same vein as `"os/signal".NotifyContext`) would take a nontrivial amount of additional plumbing and refactoring. I have to concede that you have gone with the most pragmatic implementation.
LGTM!
Force-pushed from 287892b to ffb951d
Rebased and moved EINTR handlers to the unix_noeintr package.
Force-pushed from 00b1448 to f9468e1
LGTM (with a couple of nits), with a bit of sadness about not being able to solve this on Windows, but I don't begrudge not wanting to figure out a separate async I/O API.
Force-pushed from f9468e1 to 39fce09
Failure on rootless looks related: https://github.com/moby/moby/actions/runs/8428203898/job/23080313424?pr=45052#step:8:1254
I kicked CI again to see if it's flaky.
When only stdin is attached the goroutine can only ever exit if:

1. The container pipe is closed while trying to write to it
2. The client closes the stdin read pipe

This is because `io.Copy` does a read on the read side then a write to the write side. If reading from the client's stdin pipe blocks, the goroutine will never get notified that the container pipe is closed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Updated the integration test to hopefully make it less likely to flake.
Force-pushed from 2df3ff1 to ac05457
Force-pushed from 2fb7062 to e87ce9d
In cases where the client disconnects and there is nothing to read from a stdio stream after that disconnect, the copy goroutines and file descriptors are leaked because `io.Copy` is just blocked waiting for data from the container's I/O stream.

This fix only applies to Linux. Windows will need a separate fix.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This var is for the incoming attach request. Just within this one function we also have `cfg` and `ctr`, so `c` just makes things more confusing. Not to mention `c` usually references a container object in other parts of the code.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
OK, I think the test issue was that it was using the main integration daemon, and the goroutine count was fluctuating at just the right times to break it.
Ping, does this LGTY?
BTW, I've opened a proposal on Go to extend the `syscall.RawConn` interface (or add an additional interface) to receive close notifications.
Ping |
```go
err := unix_noeintr.EpollCtl(epFd, unix.EPOLL_CTL_ADD, int(fd), &unix.EpollEvent{
	Events: unix.EPOLLHUP,
	Fd:     int32(fd),
})
if err != nil {
	log.G(ctx).WithError(err).Warn("notifyClosed: failed to register fd for close notifications")
	return
}

events := make([]unix.EpollEvent, 1)
if _, err := unix_noeintr.EpollWait(epFd, events, -1); err != nil {
```
I don't think EPOLLHUP is emitted on a graceful close? Not 100% sure, but in that case wouldn't it cause this goroutine to block indefinitely? Should this also poll for EPOLLRDHUP?
If anything, passing EPOLLHUP to EPOLL_CTL_ADD is not needed, because epoll always reports EPOLLHUP events. EPOLLRDHUP is sent when the peer closes the write side; EPOLLHUP, as I recall, is when the connection is gone.
I did a quick test with pure syscalls: https://gist.github.com/vvoland/cc9d31f0202cd233c2c2affd74294e6f

And it looks like `shutdown` + `close` on the client side doesn't trigger EPOLLHUP. This might not actually be the same in our case, since we're much higher up the stack and there might be other things actually triggering the EPOLLHUP along the way.
But I must admit I don't know exactly how EPOLLHUP works, so I'll just trust you if you say that it works as expected 👍🏻
I think EPOLLRDHUP would fire in the TCP half-close situation, which is why the simpler implementation that doesn't require epoll wouldn't work, as discussed above.
When only stdin is attached the goroutine can only ever exit if:

1. The container pipe is closed while trying to write to it
2. The client closes the stdin read pipe

This is because `io.Copy` does a read on the read side then a write to the write side. If reading from the client's stdin pipe blocks, the goroutine will never get notified that the container pipe is closed.
Another similar case is when the client disconnects but `io.Copy` is blocked waiting for data from the container's stdout/stderr streams. This is significantly trickier to resolve.
The 2nd commit fixes this for Linux only.
Fixes #37182