Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incorrect process cpu usage? #1225

Closed
systemro opened this issue Mar 15, 2024 · 12 comments
Closed

incorrect process cpu usage? #1225

systemro opened this issue Mar 15, 2024 · 12 comments

Comments

@systemro
Copy link

systemro commented Mar 15, 2024

Describe the bug
we make an application that runs at 1Hz to get

  • the overall system CPU usage
  • processes by name CPU usage

we found that the cpu usage for the processes is inaccurate, we do get random spike and if we sum all the processes cpu usage, it is not the same as the overall CPU usage

below is the screenshot, the top graph (blue line) indicates the overall CPU usage, and the graph below shows each processes cpu usage.
if we sum the cpu usage of each processes on the 2nd graph(the red vertical line, with numbers dialog), the total cpu usage is > 1200%,

  • the PC should have a max value of 1200% only
  • it is more than the overall cpu usage which is about 78%

image

system:

  • linux ubuntu 20.04

version:

  • sysinfo: v0.30.7

may we know what is the causes for this inaccuracy, and how can we improve it?

@systemro systemro added the bug label Mar 15, 2024
@GuillaumeGomez
Copy link
Owner

Without code, not much I can do.

@systemro
Copy link
Author

systemro commented Mar 18, 2024

below is the code i used

#[derive(Debug)]
struct Output {
    global_cpu_percentage: f64,
    processes_status: Vec<ProcessStatus>
}

#[derive(Debug)]
struct ProcessStatus {
    cpu_percentage: f64,
    pid: u32,
    name: String,
    memory_bytes: f64,
}

fn main() {
    let mut sys = sysinfo::System::new_all();
    
    loop {
        std::thread::sleep(std::time::Duration::from_secs(1));

        sys.refresh_specifics(
            sysinfo::RefreshKind::new()
                .with_memory(sysinfo::MemoryRefreshKind::everything())
                .with_cpu(sysinfo::CpuRefreshKind::everything()),
        );

        let global_cpu_percentage = sys.global_cpu_info().cpu_usage() as f64;
        
        // list of process interested to be monitored by us, below is just a sample of process name only
        let processes_name = vec![
            "htop",
            ".vscode"
        ];

        let processes_status = processes_name
            .into_iter()
            .flat_map(|process_name| sys.processes_by_name(&process_name))
            .map(|process| process.pid())
            .collect::<Vec<_>>()
            .into_iter()
            .filter_map(|pid| {
                sys.refresh_process_specifics(
                    pid,
                    sysinfo::ProcessRefreshKind::new().with_cpu().with_memory(),
                );

                let process = sys.processes().get(&pid)?;

                let name = process
                    .exe()
                    .unwrap()
                    .to_str()
                    .unwrap_or_else(|| process.name())
                    .to_string();

                Some(ProcessStatus {
                    cpu_percentage: process.cpu_usage() as f64,
                    pid: pid.as_u32(),
                    name,
                    memory_bytes: process.memory() as f64,
                })
            })
            .collect::<Vec<_>>();

        let output = Output {
            global_cpu_percentage,
            processes_status
        };

        println!("{:?}", output);

    }
}

@GuillaumeGomez
Copy link
Owner

Code looks good. Not sure why you get processes CPU usage above 1200 when adding them though. As for global CPU usage, it's the total percentage of CPU usage across the system, so its maximum is 100% whereas processes can use more than one CPU so they can use more than 100%.

@systemro
Copy link
Author

we think the global cpu usage is correct, but not the processes cpu usage

we tried to get the same information using the python package of psutil,
we never get the CPU spike on the processes cpu usage, we felt something is strange on this lib, but we cant figure it out what is the causes..

@GuillaumeGomez
Copy link
Owner

One last question: what does htop display in comparison?

@systemro
Copy link
Author

we have gather the cpu usage data using the same process and plot them all

  • using sysinfo crate
  • using psutil python package
  • using top command

you can see from the screenshot, there is a spike when using the sysinfo crate

image

@systemro
Copy link
Author

systemro commented Mar 22, 2024

i am not sure if this is the cause, if we refresh all the processes at start we dont get the random CPU spikes anymore

sysinfo::RefreshKind::new()
                .with_processes(sysinfo::ProcessRefreshKind::new().with_cpu().with_memory())

it seems that refreshing individual process by PID inside an iterator will cause the random CPU spike

@GuillaumeGomez
Copy link
Owner

Ah ok, I got your issue. As explained in the documentation:

⚠️ To start to have accurate CPU usage, a process needs to be refreshed twice because CPU usage computation is based on time diff (process time on a given time period divided by total system time on the same time period).

So the first CPU usage reading is always invalid.

@GuillaumeGomez
Copy link
Owner

I think my answer was completely off target. Looking at your code again, you did it correctly. The only thing I'm suspicious about is the System::new_all. In this case, you retrieve a lot of information, including environment variables and things like that. Could you give a try to System::new and then a refresh for only the process information you want?

@systemro
Copy link
Author

we made the changes as suggested, it seems the random CPU spike does not exist anymore, do you know what could cause this behavior in the System::new_all?

in addition, we are not able to query the processes_by_name, we would assume in the start of the application, we should at least call once, so sys cant populate the processes right, and then just refresh by specific

sysinfo::RefreshKind::new()
     .with_processes(sysinfo::ProcessRefreshKind::new().with_cpu().with_memory())

@GuillaumeGomez
Copy link
Owner

processes_by_name is looking only over processes already known to sysinfo, so if you didn't fill it before, it will do nothing indeed.

So the call you wrote is exactly what you need to do first instead of new_all.

Like mentioned, I think the issue is that sysinfo is taking a lot of CPU to gather all information, which you're not interested about for most. I think I should maybe remove new_all to force people to be careful about it...

@GuillaumeGomez
Copy link
Owner

I opened #1234 to remove the new_all methods. That will make it less likely to have your issue again. Closing this issue then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants