meta: Runtime metrics stabilization #4073

Open
6 of 7 tasks
carllerche opened this issue Aug 26, 2021 · 14 comments
Assignees
Labels
A-tokio Area: The main tokio crate C-enhancement Category: A PR with an enhancement or bugfix. M-metrics Module: tokio/runtime/metrics

Comments

@carllerche
Member

carllerche commented Aug 26, 2021

Tracks the stabilization of runtime statistics.

RFC: #3845
PRs: #4043

Roadmap

  • Release current implementation as unstable to crates.io (chore: prepare Tokio v1.11.0 #4083)
  • Polish docs (remove TODO)
  • Add more counters
  • Write tokio-metrics, providing a higher-level API that is easier to consume and understand.
  • Resolve open questions.
  • Validate design by evaluating users' experience reports.

Open questions

  • Naming: Should the type be named Metrics, Stats, or PerfCounters?
  • Should RuntimeStats::workers() return &[WorkerStats] or an iterator?
  • Should there be a feature flag to enable stats explicitly?
    • Should there be a runtime Builder option to enable / disable stats?
  • Builder API for more complex configurations, like histograms (rt: instrument task poll times with a histogram #5685).
  • inc_budget_forced_yield_count should become a per-worker metric.
  • Should some current counters be lowered to internal counters?
    • steal_count
    • steal_operations
    • overflow_count
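
To make the `workers()` question above concrete, here is a minimal sketch of the two return shapes. The struct layouts, the `new` constructor, and the `workers_slice` / `workers_iter` method names are hypothetical and exist only so both options can sit side by side; this is not Tokio's implementation.

```rust
/// Hypothetical per-worker stats; the real type may hold different fields.
#[derive(Debug, Default, Clone)]
pub struct WorkerStats {
    pub poll_count: u64,
}

/// Hypothetical runtime-level stats holding one entry per worker.
pub struct RuntimeStats {
    workers: Vec<WorkerStats>,
}

impl RuntimeStats {
    pub fn new(workers: Vec<WorkerStats>) -> Self {
        Self { workers }
    }

    /// Option A: return a slice. Callers get indexing and `len()`, but the
    /// API commits to storing all worker stats contiguously.
    pub fn workers_slice(&self) -> &[WorkerStats] {
        &self.workers
    }

    /// Option B: return an iterator. The storage layout stays private, at
    /// the cost of random access.
    pub fn workers_iter(&self) -> impl Iterator<Item = &WorkerStats> + '_ {
        self.workers.iter()
    }
}
```

With either shape, a caller can still aggregate across workers, for example by summing `poll_count`. The difference is mainly in how much of the internal layout becomes part of the stable API.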

Additional counters

  • Duration between last two polls.
    • There are open questions related to how this should be used.
  • Worker queue depth
    • It is unclear how this should be tracked.
@carllerche carllerche added C-enhancement Category: A PR with an enhancement or bugfix. A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime labels Aug 26, 2021
@Darksonn Darksonn added M-metrics Module: tokio/runtime/metrics and removed M-runtime Module: tokio/runtime labels Aug 27, 2021
@carllerche
Member Author

@LucioFranco @Matthias247 any opportunities to try using the metrics?

@Matthias247
Contributor

I was on vacation and after that mostly blocked on other things. But I might be able to try this out this week.

I will however say upfront that I expect mostly to report back on the general accessor APIs and what integration into an application will look like. I think that putting poll_count/steal_count/etc. on a dashboard will not be super useful for most people, because the numbers in themselves have no significant meaning. They don't necessarily indicate that something is right or wrong. The not-yet-implemented timing metrics are more interesting, because they would indicate issues with code in tasks blocking too long. I will nevertheless check and see what the other metrics look like.

@carllerche
Member Author

We should add a counter tracking the number of "false-positive" runtime wakeups. This would be incremented when a worker wakes up without having any work to do.
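
As a rough illustration of the counter described above, the sketch below models a worker that counts every wakeup and separately counts the wakeups that found an empty queue. `Worker`, `WorkerCounters`, `push`, `on_wake`, and the field names are all hypothetical stand-ins, not Tokio internals, and the queue holds plain integers instead of tasks.

```rust
use std::collections::VecDeque;

/// Hypothetical per-worker counters.
#[derive(Debug, Default)]
pub struct WorkerCounters {
    /// Total number of times the worker woke up.
    pub wakeups: u64,
    /// Wakeups that found nothing to run ("false positives").
    pub noop_wakeups: u64,
}

/// Hypothetical worker with a local run queue of task ids.
#[derive(Default)]
pub struct Worker {
    queue: VecDeque<u32>,
    pub counters: WorkerCounters,
}

impl Worker {
    /// Queues a task id for the next wakeup.
    pub fn push(&mut self, task: u32) {
        self.queue.push_back(task);
    }

    /// Called when the worker is unparked. Drains the queue and returns how
    /// many tasks ran; a wakeup that ran nothing is counted as a no-op.
    pub fn on_wake(&mut self) -> usize {
        self.counters.wakeups += 1;
        let mut ran = 0;
        while let Some(_task) = self.queue.pop_front() {
            ran += 1; // a real worker would poll the task here
        }
        if ran == 0 {
            self.counters.noop_wakeups += 1;
        }
        ran
    }
}
```

The ratio of `noop_wakeups` to `wakeups` is the number a dashboard would likely care about, since the raw count grows with uptime.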

@e-ivkov

e-ivkov commented Dec 31, 2021

The not-yet-implemented timing metrics are more interesting, because they would indicate issues with code in tasks blocking too long.

I second this. In our project built on tokio, we implemented a custom macro to track poll times. It would be very useful to have it in tokio.
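
For readers wondering what tracking poll times by hand might look like, here is a minimal sketch using only the standard library. It is not the macro mentioned above: `PollStats`, `TimedPoll`, and the `drive` helper are hypothetical names, and the wrapper boxes the inner future to avoid unsafe pin projection.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Hypothetical shared record of poll timings (not a Tokio type).
#[derive(Default)]
pub struct PollStats {
    polls: AtomicU64,
    total_nanos: AtomicU64,
    max_nanos: AtomicU64,
}

impl PollStats {
    pub fn poll_count(&self) -> u64 {
        self.polls.load(Ordering::Relaxed)
    }
    pub fn total(&self) -> Duration {
        Duration::from_nanos(self.total_nanos.load(Ordering::Relaxed))
    }
    pub fn max(&self) -> Duration {
        Duration::from_nanos(self.max_nanos.load(Ordering::Relaxed))
    }
}

/// Wraps a future and times every call to its `poll` method.
pub struct TimedPoll<F> {
    inner: Pin<Box<F>>, // boxed so no unsafe pin projection is needed
    stats: Arc<PollStats>,
}

impl<F: Future> TimedPoll<F> {
    pub fn new(inner: F, stats: Arc<PollStats>) -> Self {
        Self { inner: Box::pin(inner), stats }
    }
}

impl<F: Future> Future for TimedPoll<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let start = Instant::now();
        let result = self.inner.as_mut().poll(cx);
        let nanos = start.elapsed().as_nanos() as u64;
        // Record this poll: count, running total, and longest single poll.
        self.stats.polls.fetch_add(1, Ordering::Relaxed);
        self.stats.total_nanos.fetch_add(nanos, Ordering::Relaxed);
        self.stats.max_nanos.fetch_max(nanos, Ordering::Relaxed);
        result
    }
}

/// Waker that does nothing; enough to drive a future by hand in a demo.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polls `fut` to completion. Demo only: a real runtime parks the thread.
pub fn drive<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

In an application, `TimedPoll::new(fut, stats.clone())` would be handed to the runtime's spawn function instead of `drive`. A large `max()` relative to the average poll time is the kind of signal that points at a task blocking too long.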

@LucioFranco LucioFranco changed the title meta: Runtime stats stabilization meta: Runtime metrics stabilization Feb 16, 2022
zonyitoo added a commit to shadowsocks/shadowsocks-rust that referenced this issue Apr 24, 2022
@zonyitoo
Contributor

zonyitoo commented May 5, 2022

It has been quite a long time. What's the status of this issue? @carllerche

It is quite useful for library authors to know the number of workers in the current Runtime.

@LucioFranco
Member

Work has stalled a bit on this, but we would be happy to mentor people on this work.

@zonyitoo
Contributor

Hello @LucioFranco, is this feature ready to be published?

@LucioFranco
Member

I don't believe this is ready for stabilization.

@Noah-Kennedy
Contributor

What's the status on this? It looks like nothing has happened for a while here.

@LucioFranco, I would be willing to pick up where you left off here, I just need to know what needs done.

@LucioFranco
Member

I believe we have a bunch of metrics in the runtime now and tokio-metrics has an initial version published. @carllerche would know the rest. Go ahead and pick it up, I won't have time for a bit. Happy to be tagged for reviews tho.

@carllerche
Member Author

I aim to stabilize this feature by EOY.

As I look at the current counters, I think some of the more "scheduler implementation detail" counters should be switched to be internal counters. For example, the following counters are more useful for debugging the scheduler than debugging application performance issues:

  • steal_count
  • steal_operations
  • overflow_count

@carllerche
Member Author

carllerche commented May 24, 2023

At the risk of reopening bikeshedding, we must also decide on a name: metrics, stats, or counters.

Given that we have more than just counters, that probably rules out counters as a name.

Let's see what ChatGPT suggests 😆 :

In the context of Tokio's scheduler or any similar system, the terms "metrics" and "statistics" can overlap significantly. However, based on the definitions provided, the examples you've listed—how many tasks were scheduled, how many times the worker went to sleep, a histogram of how long it took to poll each task—could be considered both metrics and statistics. Here's why:

  1. As Statistics: These measures describe certain aspects of your data or system operation. They give you quantitative summaries of the system performance and behaviors, which is in line with the definition of statistics.

  2. As Metrics: These numbers can also be viewed as metrics, especially if you're using them to track and assess the performance of your system, identify bottlenecks, or measure against a particular performance target or benchmark.

So, these terms can be interchangeable in many cases. The term you choose to use can depend on your specific context or preference, or how you intend to use the data. If you're using these numbers primarily to understand system behavior or describe what's happening, "statistics" might be the more appropriate term. If you're using them to evaluate system performance or track against goals, "metrics" might be more suitable.

In general, in software systems and libraries like Tokio, we often talk about "performance metrics" or just "metrics" as it implies ongoing tracking and often is used for making decisions about system improvements or changes.

@Noah-Kennedy
Contributor

Artificial Indecisiveness

@cloneable

cloneable commented Jul 20, 2023

Hi! Sorry, is it too late to ask to turn all gauges into counter pairs?
Metrics like active_tasks_count or injection_queue_depth are fast-moving gauges, and even taking a snapshot every few seconds doesn't say much about what's going on inside Tokio. It would be better to use two counters: one for additions and one for removals. When snapshotting, one can then calculate the rate at which items went into a queue, or how many tasks were spawned during the snapshotting interval. The current queue length and the number of active tasks are the delta between the two counters, if needed. That makes it much more usable for monitoring.
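
The counter-pair idea can be sketched roughly as follows. `CounterPair`, `Snapshot`, and their methods are hypothetical names chosen for illustration and are not part of Tokio's metrics API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter pair standing in for a gauge such as a queue depth.
/// Both counters only ever increase.
#[derive(Default)]
pub struct CounterPair {
    added: AtomicU64,
    removed: AtomicU64,
}

/// Point-in-time copy of both counters.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Snapshot {
    pub added: u64,
    pub removed: u64,
}

impl CounterPair {
    pub fn record_add(&self) {
        self.added.fetch_add(1, Ordering::Relaxed);
    }

    pub fn record_remove(&self) {
        self.removed.fetch_add(1, Ordering::Relaxed);
    }

    pub fn snapshot(&self) -> Snapshot {
        Snapshot {
            added: self.added.load(Ordering::Relaxed),
            removed: self.removed.load(Ordering::Relaxed),
        }
    }
}

impl Snapshot {
    /// The gauge value recovered from the pair. The two loads in `snapshot`
    /// are not atomic together, so saturate instead of underflowing.
    pub fn depth(&self) -> u64 {
        self.added.saturating_sub(self.removed)
    }

    /// Additions that happened between `earlier` and this snapshot.
    pub fn added_since(&self, earlier: &Snapshot) -> u64 {
        self.added.wrapping_sub(earlier.added)
    }

    /// Removals that happened between `earlier` and this snapshot.
    pub fn removed_since(&self, earlier: &Snapshot) -> u64 {
        self.removed.wrapping_sub(earlier.removed)
    }
}
```

Dividing `added_since` or `removed_since` by the time between two snapshots gives a throughput rate, which a plain gauge sampled every few seconds cannot provide. The instantaneous depth is still available through `depth()`.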


8 participants