Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io: make EPOLLERR awake AsyncFd::readable #4444

Closed
wants to merge 9 commits into from

Conversation

BraulioVM
Copy link
Contributor

Ref: #4349

Motivation

Currently, if a user is awaiting on a AsyncFd becoming readable and we
get a polling error (which may happen in some cases, see the referenced
issue), the awaiter will never be woken. Ideally, they would be notified
of this error in some way.

Solution

The solution here involves communicating to the Readiness object that
there was an error, and when that happens, the Readiness future will
return a synthetic error that boils down to "there was a polling error".
If the user wants to find more information about the underlying error,
they would have to find some other way (not clear how yet? as pointed
out in the issue, for sockets one could use SO_ERROR).

Open questions

  1. How do we go about testing? My manual tests involved using the example piece of code in EPOLLERR does not wake up AsyncFd::readable() #4349 but that requires sudo which might not be great for automated testing. It's also kind of an involved setup. I will have to look into this.
  2. EPOLLERR also triggers is_write_closed in the mio event. I don't know if we should notify writers of an error or just of "write closed".
  3. More things that I've written down here and there in the source code

Currently, if a user is awaiting on a `AsyncFd` becoming readable and we
get a polling error (which may happen in some cases, see the referenced
issue), the awaiter will never be woken. Ideally, they would be notified
of this error in some way.

The solution here involves communicating to the `Readiness` object that
there was an error, and when that happens, the `Readiness` future will
return a synthetic error that boils down to "there was a polling error".
If the user wants to find more information about the underlying error,
they would have to find some other way (not clear how yet? as pointed
out in the issue, for sockets one could use `SO_ERROR`).
@Darksonn Darksonn added A-tokio Area: The main tokio crate M-io Module: tokio/io labels Jan 29, 2022
Comment on lines 252 to 253
// should we wakeup all waiters in case of error? I guess that
// depends on the semantics of EPOLLERR but I don't know yet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to wake up all of them on error.

Comment on lines +446 to +451
// the information is read via the `curr` in the scheduled io
if ready.is_error() {
return Poll::Ready(Err(
std::io::Error::new(std::io::ErrorKind::Other, "Polling error")
));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could get a better error for the user here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in trying to find the underlying cause of the polling error? The issue contained some discussion on that. See #4349 (comment) and #4349 (comment)

The way I see it this interface would just return something along the lines of "Polling error" and then the user of this interface (who knows what kind of FD they are working with) would do whatever is needed to recover the underlying error (using SO_ERROR for example, if dealing with a socket).

I wish there was an ErrorKind::PollingError though, but maybe it's just that I don't know enough rust error handling to know how to expose this error to the user of the interface?

// TODO: I'm not sure why this safety claim is true. What about
// it being `Done` prevents it from being shared? As far as I can tell
// the `wake0` method in the scheduledio doesn't care about this being
// Done?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a bit unclear, but I think it's correct. How can it transition to Done while being shared?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me get back to you on that, I have to think through it

I've added some throwaway code to implement this functionality, which I
already implemented for `readable()`. I haven't been able to test it
yet, but that's the next thing I plan to do.

After coming up with a satisfactory testing story, I will think through
each of the TODOs one by one
@BraulioVM
Copy link
Contributor Author

I've implemented the functionality (epollerr awaking waiters) for the AsyncRead slot as well (also for the AsyncWrite one actually, but that one is tricky because ready.is_error() implies ready.is_write_closed(), which already waiters with a write interest). This is not part of the original issue but it makes sense to me to be consistent? We can do it in separate PRs if you prefer though.

My current plan is to find a satisfactory way to test these changes. It doesn't seem immediately obvious how to do it, because the only way I know to trigger a EPOLLERR is through a setup that generally requires root permissions (the one in the issue). Once I write some tests, I will think through the TODOs I've left in the code, and clean-up the code I've already written (which isn't great yet).

Please let me know if you have any thoughts

@BraulioVM
Copy link
Contributor Author

I think I've found a way to test this automatically! By searching EPOLLERR on https://github.com/torvalds/linux (who would have guessed this would actually work) I've found https://github.com/torvalds/linux/blob/46f4945e2b39dda4b832909434785483e028419d/fs/eventfd.c#L175-L176 Indeed, the man page for eventfd (2) mentions:

If an overflow of the counter value was detected, then
select(2) indicates the file descriptor as being both
readable and writable, and poll(2) returns a POLLERR
event.

So this looks very promising! I will try this over the following days

@BraulioVM
Copy link
Contributor Author

So this looks very promising!

That was wrong. It's true that if the counter's value is 0xfffffffff then polling on the eventfd will return EPOLLERR, but it's very hard to write that value to the eventfd in the first place. According to the man page

As noted above, write(2) can never overflow the
counter. However an overflow can occur if 2^64 eventfd
"signal posts" were performed by the KAIO subsystem
(theoretically possible, but practically unlikely)

I will search for other ways to trigger EPOLLER

I also include the code for a couple of manual tests I've introduced to
verify my changes work as expected. Ideally we would have good automated
unit tests but I haven't found reliable ways to reproduce `EPOLLERR`
that do not require `root`
Added an early return that simplifies things a bit IMO and simplified an
if/else if/else branch below
Poll::Ready(Ok(ReadyEvent {
tick: TICK.unpack(curr) as u8,
ready,
ready: direction.mask(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah this change is not correct. I thought I was simplifying the if/else if/else but it's not a correct transformation. Will fix

@BraulioVM
Copy link
Contributor Author

I've addressed a few of the TODOs that I had left in my changes, but the main problem with this PR remains: automated testing.

I have not found any way to reliably reproduce an EPOLLERR that doesn't require root (or CAP_SYS_ADMIN). I have checked in two example programs that can be run with sudo that can reproduce EPOLLER, just to show what kind of manual testing I'm doing. Even then, those programs do not cover all the changes that I've made.

Before investing more in these clunky manual tests, I would like to hear if you have any tips on how to test this. Some ideas that I have:

  1. Allow some tests to use sudo and have them run in a VM in CI that allows us to do this. If we had something like this, I would put more time in improving the tests I currently have (using fuse to trigger an epollerr) and make them more amenable to run in an automated way.
  2. Mocking. I think this would require a lot of very intrusive changes in the io module just to be able to simulate a mio giving us a is_error() event.
  3. Something else?

@BraulioVM
Copy link
Contributor Author

Maybe I can add unit tests in the schedule_io module. I know unit tests are generally discouraged in this project but they could be worth it here. I'll have a look later

@Darksonn
Copy link
Contributor

Darksonn commented Feb 6, 2022

If adding unit tests will allow you to test it, then that's fine.

@BraulioVM
Copy link
Contributor Author

BraulioVM commented Feb 7, 2022

(this is still very much WIP)

I think I might be able to unit test part of the changes. I've written one such test in https://github.com/tokio-rs/tokio/pull/4444/files#diff-585bbdb06d4f023035fc2520bc464acb733f7f212921936f13c9873798cbfc92R602 , and will keep trying to add more tests over the following days

@Darksonn
Copy link
Contributor

Darksonn commented Feb 7, 2022

Ok. Let me know when you need additional review.

@Blub
Copy link

Blub commented Jun 1, 2022

I'm running into the same issue. And I think an ErrorKind::Other is a bad way signal this, IMO a simple wake-up should be fine? (Perhaps with an is_error() check on the AsyncFd?). Alternatively, for some cases such as fuse, being able to wait explicitly for EPOLLERR would also be useful.
In fact, other EPOLL* flags can also be of particular interest. Eg. to poll freezer cgroup.events you want to explicitly wait for EPOLLPRI while EPOLLIN is always true.
Maybe we can expose a bit more of the io driver machinery so we can make custom registrations?

@yorickdewid
Copy link

I'm seeing the same issue. I've a readiness call on an network-over-USB device. If the USB device is removed, the kernel driver will wait for all handles/sockets to the device to be closed which will never happen because tokio is waiting for a future that will never be woken (SO_ERROR is set to ENETDOWN). In turn this locks the entire kernel USB subsystem, no USB devices can be add or removed until the socket is closed and the kernel resources freed to the kernerl.

Since there is no easy workaround, has there been any progress on this PR?

@folkertdev
Copy link
Contributor

doing some cross-linking here. In #5566, Interest::PRIORITY and AsyncFd::ready have been added. On linux (and android, I guess) that makes it possible to wait for EPOLLPRI (cc @Blub)

I've now opened an additional PR that reports error readiness, see #5781. That means that with AsyncFd::ready you can check for read and/or error readiness, which I suspect provides a workaround for #4349.

What it doesn't do yet is interact with the polling functions (e.g. poll_read_ready). My understanding is that those poll functions are useful when implementing custom Async{Read, Write} instances for system primitives?

@Darksonn
Copy link
Contributor

I'm going to close this as solved by #5781. But let me know on the issue if I've missed something.

@Darksonn Darksonn closed this Nov 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-io Module: tokio/io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants