Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for total accumulated process CPU usage #1044

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

bruceg
Copy link

@bruceg bruceg commented Aug 14, 2023

Most, if not all, CPU usage accounting for processes provides values that count from the creation of the process. This total value is useful for a variety of accounting tasks beyond the snapshot value that is currently available in sysinfo. This change adds a fn total_cpu_usage to trait ProcessExt to provide that value.

Note that this has explicit breaking stubs for the FreeBSD and Windows implementations, as I was unclear how best to implement those functions. I would be happy to fill those in to complete this PR but I would appreciate a little direction what the best implementation path is between storing a new field in struct Process or computing the value on-the-fly in fn total_cpu_usage.

Most, if not all, CPU usage accounting for processes provides values that count
from the creation of the process. This total value is useful for a variety of
accounting tasks beyond the snapshot value that is currently available in
sysinfo. This change adds a `fn total_cpu_usage` to `trait ProcessExt` to
provide that value.
src/traits.rs Outdated Show resolved Hide resolved
@GuillaumeGomez
Copy link
Owner

So for Windows, there is hope depending if there are queries to allow to get this information or not. For FreeBSD, might be more complicated depending if https://docs.rs/libc/latest/x86_64-unknown-freebsd/libc/struct.kinfo_proc.html has the information you need or not.

Also one thing to check: are you sure that utime and stime never get reset in linux? (and mac too)

@bruceg
Copy link
Author

bruceg commented Aug 15, 2023

Also one thing to check: are you sure that utime and stime never get reset in linux? (and mac too)

I believe those values are 64-bit counters in the kernel, and so won't reset for a large number of millennia.

@bruceg
Copy link
Author

bruceg commented Aug 15, 2023

For FreeBSD, might be more complicated depending if https://docs.rs/libc/latest/x86_64-unknown-freebsd/libc/struct.kinfo_proc.html has the information you need or not.

The info is available in the (more) portable getrusage system call. What are your thoughts on calling that, either when the struct Process is created or when calling total_accumulated_cpu_usage? Never mind, struct kinfo_proc includes the rusage data, so everything I need is there already.

@bruceg bruceg marked this pull request as ready for review August 15, 2023 23:08
@bruceg
Copy link
Author

bruceg commented Aug 15, 2023

Would you like me to squash down some of the fixup commits?

@GuillaumeGomez
Copy link
Owner

Hi, just to say I didn't forget about this PR. Still wondering if I want this feature or not. I don't have the need for it myself so it'd be complicated to find out when it's broken or to know what's to be expected when adding support for new platforms.

@bruceg
Copy link
Author

bruceg commented Aug 28, 2023

FWIW I'm not the only user looking for this data (though that user would want other bits as well).

@GuillaumeGomez
Copy link
Owner

I hear that, doesn't conflict the reasons I listed why I was hesitating. ;)

@cederberg
Copy link

I saw this PR just now. Might I suggest that the API simply return the accumulated CPU use in seconds instead of a percentage? It should be easy enough to calculate a percentage from CPU-seconds, but the inverse would lose a lot of precision. If at all possible.

Also, traditionally people like to see utime and stime as two separate counters. Although personally, I'd just sum them right up. But for a library it might be better to more closely expose the available data as the OS provides?

Reading the file changes I also cannot help but to note that skipping the percentage calculus would also make the changes smaller... 😀

@GuillaumeGomez
Copy link
Owner

I saw this PR just now. Might I suggest that the API simply return the accumulated CPU use in seconds instead of a percentage? It should be easy enough to calculate a percentage from CPU-seconds, but the inverse would lose a lot of precision. If at all possible.

That means we should likely provide an API to get CPU seconds as well separately in addition to getting total CPU percentage. Forcing both codes to exist would likely make it more clear, not sure.

Also, traditionally people like to see utime and stime as two separate counters. Although personally, I'd just sum them right up. But for a library it might be better to more closely expose the available data as the OS provides?

I think it'd be better to not provide internals too much and just give the total time. Simpler to handle different OSes.

Reading the file changes I also cannot help but to note that skipping the percentage calculus would also make the changes smaller... 😀

I think it's fine to keep it. Like I said above, keeping total CPU time on one side and total process time on the other will likely make things more clear (simpler I don't know).

Anyway, @bruceg, after thinking about it for quite some time, I think having this feature is ok. What do you think about what @cederberg and I talked about?

@bruceg
Copy link
Author

bruceg commented Sep 11, 2023

Sorry, I'm not entirely clear what "the API" being discussed references. The proposed method returns cumulative CPU seconds, so I think it's already doing what @cederberg is requesting. Am I misunderstanding?

There is one way to reconcile the possible usages, though it is a breaking change. The method could return a new type that has methods to access both values. I don't believe there is a way to make the new type work like the f32 it currently returns, though. This would, however, making extending support to add separate system and user times simpler.

@bruceg
Copy link
Author

bruceg commented Sep 11, 2023

In any case, I'd be happy to do whatever is needed to move the code forward to make it acceptable to all.

@GuillaumeGomez
Copy link
Owner

I didn't double-check after reading:

Reading the file changes I also cannot help but to note that skipping the percentage calculus would also make the changes smaller... 😀

but I should definitely have as you indeed don't return percentage but the total CPU seconds (in f32, that's where I think @cederberg got confused). So it seems good to me as is, sorry about that.

@cederberg: Does it look like what you want with this clarification?

@cederberg
Copy link

Right. I was confused by two things:

  1. Using f32 instead of u64 as I would've expected.
  2. The changes in windows/process.rs, in particular total_accumulated_cpu_usage that seems to perform a calculation of percentage of total system CPU time.

On another note, isn't the hard-coded 100 value for ticks per second in /linux/process.rs defined properly somewhere?

// from FreeBSD source /bin/ps/print.c
let accum_cpu_usage = (kproc.ki_runtime as f64 / 1000000.0) as f32
+ kproc.ki_childtime.tv_sec as f32
+ kproc.ki_childtime.tv_usec as f32 / 1000000.0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is quite right. Here is the relevant code from FreeBSD:

	secs = k->ki_p->ki_runtime / 1000000;
	psecs = k->ki_p->ki_runtime % 1000000;
	if (sumrusage) {
		secs += k->ki_p->ki_childtime.tv_sec;
		psecs += k->ki_p->ki_childtime.tv_usec;
	}

It is clear that child time is only added if sumrusage is true. It is set by the -S switch:

   -S      Change  the way the process times, namely cputime, systime, and
	       usertime, are calculated	by  summing  all  exited  children  to
	       their parent process.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right, I thought this needed to account for the child's time, but that is handled separately.

@bruceg
Copy link
Author

bruceg commented Sep 12, 2023

2. The changes in [windows/process.rs](https://github.com/GuillaumeGomez/sysinfo/pull/1044/files#diff-54c5b015ad8152587800fa45b984b4bce88054dda2afc331cf5b4bfc7c9fe2ae), in particular `total_accumulated_cpu_usage` that seems to perform a calculation of percentage of total system CPU time.

Right, that confused me too, but that is essentially a copy of the calculations in compute_cpu_usage without calculating a delta.

On another note, isn't the hard-coded 100 value for ticks per second in /linux/process.rs defined properly somewhere?

The relevant fields in stat export their data in units of "jiffies" aka "ticks". That tick is defined by HZ which is fixed at 100 (for all arches except for alpha apparently, which effectively doesn't matter at this point).

.old_system_user_cpu
.saturating_add(self.old_system_sys_cpu) as f32
* self.nb_cpus as f32
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I don't think this is right. It should only return old_process_user_cpu + old_process_sys_cpu. The other parts are dividing with the global CPU time and number of CPUs, which just isn't right here.

@cederberg
Copy link

Could I ask again why f32 is used to return the result instead of u32 or u64?

The source values are all integers. Floats are slow and prone to rounding errors. And if parts of a second is really of interest, we could just as well return milliseconds in a u64, right?

@cederberg
Copy link

The relevant fields in stat export their data in units of "jiffies" aka "ticks". That tick is defined by HZ which is fixed at 100 (for all arches except for alpha apparently, which effectively doesn't matter at this point).

You are surely right, but what I'm after is a reference to SystemInfo that already contains this value (fetched via sysconf instead of hard-coded): clock_cycle: sysconf(_SC_CLK_TCK)

@bruceg
Copy link
Author

bruceg commented Sep 13, 2023

Could I ask again why f32 is used to return the result instead of u32 or u64?

The source values are all integers. Floats are slow and prone to rounding errors. And if parts of a second is really of interest, we could just as well return milliseconds in a u64, right?

f32 was used to mirror the existing interfaces that use the same types and to allow returning a base unit of seconds independent of the underlying calculations. I would actually prefer to use f64 for the rounding/scaling issue, so I would be easily convinced to go that way. I strongly doubt any performance difference would be measurable here, if it even exists.

I will defer to the judgement of @GuillaumeGomez on the correct type/units to use if he has an opinion.

@bruceg
Copy link
Author

bruceg commented Sep 13, 2023

You are surely right, but what I'm after is a reference to SystemInfo that already contains this value (fetched via sysconf instead of hard-coded): clock_cycle: sysconf(_SC_CLK_TCK)

Ah, I had not seen that sysinfo was already fetching that, thanks for the pointer. I'll use that, even though it will always everywhere be 100 (thus saith Linus).

@bruceg
Copy link
Author

bruceg commented Sep 13, 2023

I'm seeing a bunch of tests failing on CI that I can't explain. In particular, this test on Ubuntu is failing now, but I haven't changed either the test or the code underlying it AFAICT. Any idea what might be going on?

@GuillaumeGomez
Copy link
Owner

The check_processes_cpu_usage test is flaky so you can ignore it. I suppose check_processes_total_accumulated_cpu_usage needs to be fixed though. ;)

f32 was used to mirror the existing interfaces that use the same types and to allow returning a base unit of seconds independent of the underlying calculations. I would actually prefer to use f64 for the rounding/scaling issue, so I would be easily convinced to go that way. I strongly doubt any performance difference would be measurable here, if it even exists.

I was also surprised you used f32 and not u64. If there is no big reason to use f32, I'd prefer if you used u64 for coherency. Also, what are you referring to when you talk about "the existing interfaces" btw?

The source values are all integers. Floats are slow and prone to rounding errors. And if parts of a second is really of interest, we could just as well return milliseconds in a u64, right?

About "Floats are slow", they are in fact not that slow. At same size, a float is always slower than an integer, however a smaller sized float is faster than an integer, so f16 is faster than i32, f32 is faster than u64, etc. I should really try to find where I saw these benchmarks but it was very interesting. However in here, performance isn't really relevant I think considering that the impact should be close to non-existent.

@bruceg
Copy link
Author

bruceg commented Sep 13, 2023

The check_processes_cpu_usage test is flaky so you can ignore it. I suppose check_processes_total_accumulated_cpu_usage needs to be fixed though. ;)

Agreed, I'm looking at them.

I was also surprised you used f32 and not u64. If there is no big reason to use f32, I'd prefer if you used u64 for coherency. Also, what are you referring to when you talk about "the existing interfaces" btw?

I used f32 mostly because cpu_usage used f32, and the natural base unit for the return value is seconds for which the fractional part is significant. Using u64 is possible but would require scaling it to some arbitrary amount to cover the existing OSes. AFAICT every other interface that returns time is in units of seconds too. Mind you, those others tend to have lower underlying granularity (i.e. uptime).

@GuillaumeGomez
Copy link
Owner

From what I can see, only FreeBSD seems to actually have a significant fractional part. We could keep using f32 (and stop having intermediate f64 too, unless precision is that important?) or we could instead return CPU total time in milliseconds and use u64. If you have an opinion @cederberg here too?

@bruceg
Copy link
Author

bruceg commented Sep 20, 2023

So, in the most recent commit, I added a OnceLock to store the global clock tick scaling data for MacOS, needed to turn the process timing values into seconds, since this is constant once it is generated. However, the std version of that type is not stabilized yet in the MSRV for this project. How would you prefer I move forward on this?

  1. Pass down timing info into the Process::new functions and store it in the struct for each process (FWIW I started down that path, and the parameter ended up needing to be added to quite a few functions),
  2. Re-calculate the scaling data in Process::new, causing repeated calls to mach_timebase_info,
  3. Add a dependency on once_cell to get the stable version of this type, or
  4. Implement the OnceLock manually using Once and an unsafe block?

Note that some of this is applicable to the Linux side as well, but there we are only fetching the clock tick value through sysconf, which is less onerous.

Edit: Technically there is a 5th option, which is to bump the MSRV to 1.70.0, but I figure that's a bridge too far given that it becomes a breaking change for some users.

@GuillaumeGomez
Copy link
Owner

Edit: Technically there is a 5th option, which is to bump the MSRV to 1.70.0, but I figure that's a bridge too far given that it becomes a breaking change for some users.

Next release will be a major one so it's definitely not a concern.

So, in the most recent commit, I added a OnceLock to store the global clock tick scaling data for MacOS, needed to turn the process timing values into seconds, since this is constant once it is generated. However, the std version of that type is not stabilized yet in the MSRV for this project. How would you prefer I move forward on this?

I had equivalent "static" info. For example in freebsd. I don't think using a static is a good tactic here. Why not creating a struct which query this information on initialization and then keep it around? If the object is destroyed/recreated, then we re-query the information, it's fine. In the documentation it's explicitly written that the System object should be instantiated only once and then should be kept around. Did I miss something obvious maybe?

@bruceg
Copy link
Author

bruceg commented Sep 21, 2023

Why not creating a struct which query this information on initialization and then keep it around? If the object is destroyed/recreated, then we re-query the information, it's fine. In the documentation it's explicitly written that the System object should be instantiated only once and then should be kept around. Did I miss something obvious maybe?

I can set that up in the System object and pass it down. It just ends up adding a parameter to quite a few functions in the chain, but it's all internal so not a big deal. I'll make that change.

Edit: Handling the conversion in update_process instead of when retrieving the value ended up simplifying the code.

@GuillaumeGomez
Copy link
Owner

Hi, sorry forgot a bit about it. It seems like some test is failing on freebsd. Do you need help?

@bruceg
Copy link
Author

bruceg commented Oct 17, 2023

Hi, sorry forgot a bit about it. It seems like some test is failing on freebsd. Do you need help?

Yes, I think I will. We opted to take another path to provide this value since we only need to support Linux and MacOS at this point, and so the motivating factor necessitating this change is effectively moot. We may come back to this if we ever need to support Windows, but that's a long ways down the road.

@GuillaumeGomez
Copy link
Owner

Noted. I'll try to come back to this when I have enough time (hopefully in a few weeks...) then if you didn't before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants