Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subscriber: record a histogram of wake-to-poll times #50

Closed
seanmonstar opened this issue Jun 15, 2021 · 4 comments
Closed

subscriber: record a histogram of wake-to-poll times #50

seanmonstar opened this issue Jun 15, 2021 · 4 comments
Labels
C-subscriber Crate: console-subscriber. E-easy Effort: easy. Good for newcomers S-feature Severity: feature. This is adding a new feature.

Comments

@seanmonstar
Copy link
Member

A way for someone to better understand how the runtime itself is behaving is to record a history of how long it takes a task to be polled after it was been woken. We could either record this information per task, or on a whole. I think as a whole is probably sufficient, and would mean less data needs to be sent per task (see #47 (comment)). But if there's a very helpful use case for knowing this data per task, we could do that.

@seanmonstar seanmonstar added the S-feature Severity: feature. This is adding a new feature. label Jun 15, 2021
@seanmonstar seanmonstar added C-subscriber Crate: console-subscriber. E-easy Effort: easy. Good for newcomers labels Aug 24, 2021
@sieut
Copy link

sieut commented Sep 10, 2021

Hi @seanmonstar, I've been reading up on #47 to see how to do this. Do you mind if I take a stab at this issue?

@hawkw
Copy link
Member

hawkw commented Sep 10, 2021

It's possible that this data is also collected by tokio's runtime metrics feature (tokio-rs/tokio#4073), which I would like to start integrating into the console. It might be worth looking at the upstream API to see if that's the case.

Otherwise, we can definitely record this with the data we're already aggregating.

@sieut
Copy link

sieut commented Sep 11, 2021

Thank you @hawkw for the note. I read the RFC and it doesn't look like there's something we can use for this issue specifically. It looks like that work is more concerned with collecting perf-counter type metrics than aggregating latency.

hawkw pushed a commit that referenced this issue Apr 19, 2023
Each task displays the sum of the time it has been idle and busy, as
well as the total. The idle time includes the time between when a task
is woken, and when the runtime actually polls that task.

There are cases where a task may be scheduled for a long time after
being woken, before it is polled. Especially if many tasks are woken at
the same time and don't yield back to the runtime quickly.

To add visibility to this, the total time that a task is scheduled
before being polled has been added. Additionally, a new task state
`Scheduled` has been added. This is displayed in both the tasks table
and in the task detail view.

In the `console-api`, the total `scheduled_time` for the task has been added to
the `TaskStats` message in `tasks.proto`.

This is the first of two parts. In the second part (#409), a histogram for
scheduled time will be added, the equivalent of the poll time histogram
which is already available on the task detail screen.

To show a pathological case which may lead to needing to see the
scheduled time, a new example has been added to the `console-subscriber`

## PR Notes

This PR does something adjacent to what is described in #50, but not quite.

The unicode character used for a `SCHED` task is ⏫.

The second PR (#409) will record a scheduled time histogram the same as it
recorded for poll times. These two changes should go in together (and certainly
shouldn't be released separately). However, this PR is already quite big, so they'll
be separated out.

The idea is that this PR isn't merged until the next one has also been reviewed
and approved. It would be good to get some feedback at this stage though.

The task list view with the new column for `Sched` time:

<img width="1032" alt="a tasks table view for the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232456977-2921f884-4673-420f-ba4f-3646627d44db.png">

The `Task` block in the task detail view showing the new `Scheduled` time entry.

<img width="510" alt="The task block on the task detail view for the rx task in the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232457332-e455e086-9468-42c9-8fda-7965d8d1e6f8.png">
hawkw pushed a commit that referenced this issue Sep 29, 2023
Each task displays the sum of the time it has been idle and busy, as
well as the total. The idle time includes the time between when a task
is woken, and when the runtime actually polls that task.

There are cases where a task may be scheduled for a long time after
being woken, before it is polled. Especially if many tasks are woken at
the same time and don't yield back to the runtime quickly.

To add visibility to this, the total time that a task is scheduled
before being polled has been added. Additionally, a new task state
`Scheduled` has been added. This is displayed in both the tasks table
and in the task detail view.

In the `console-api`, the total `scheduled_time` for the task has been added to
the `TaskStats` message in `tasks.proto`.

This is the first of two parts. In the second part (#409), a histogram for
scheduled time will be added, the equivalent of the poll time histogram
which is already available on the task detail screen.

To show a pathological case which may lead to needing to see the
scheduled time, a new example has been added to the `console-subscriber`

## PR Notes

This PR does something adjacent to what is described in #50, but not quite.

The unicode character used for a `SCHED` task is ⏫.

The second PR (#409) will record a scheduled time histogram the same as it
recorded for poll times. These two changes should go in together (and certainly
shouldn't be released separately). However, this PR is already quite big, so they'll
be separated out.

The idea is that this PR isn't merged until the next one has also been reviewed
and approved. It would be good to get some feedback at this stage though.

The task list view with the new column for `Sched` time:

<img width="1032" alt="a tasks table view for the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232456977-2921f884-4673-420f-ba4f-3646627d44db.png">

The `Task` block in the task detail view showing the new `Scheduled` time entry.

<img width="510" alt="The task block on the task detail view for the rx task in the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232457332-e455e086-9468-42c9-8fda-7965d8d1e6f8.png">
@hds
Copy link
Collaborator

hds commented Feb 14, 2024

This task is done now per task. As we don't currently have any "as a whole" or "per runtime" histograms, I think that we can consider it complete after #406 was merged.

Any histograms per runtime would be added once runtimes themselves have been instrumented in Tokio.

@hds hds closed this as completed Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-subscriber Crate: console-subscriber. E-easy Effort: easy. Good for newcomers S-feature Severity: feature. This is adding a new feature.
Projects
None yet
Development

No branches or pull requests

4 participants