⚡I have a Coral, but my CPU usage is still high #3860

Closed
NickM-27 opened this issue Sep 16, 2022 · 15 comments
Labels
documentation Documentation for Frigate

Comments

@NickM-27
Sponsor Collaborator

NickM-27 commented Sep 16, 2022

Common config issues causing higher than expected CPU load:

We get this question pretty often. While object detection is a considerable load on Frigate, there are other loads as well:

  • Decoding the video stream to run motion & object detection
  • Other modifications like resizing or filtering the stream (which runs on the CPU)
  • Calculating motion detection (which runs on the CPU)

Using hwaccel to decode the stream is always highly recommended:

As explained in the docs, decompressing video streams takes a significant amount of CPU power. Video compression uses key frames (also known as I-frames) to send a full frame in the video stream. The frames that follow include only the differences from the key frame, and the CPU has to reconstruct each frame by merging those differences with the key frame. More detailed explanation. Higher resolutions and frame rates mean more processing power is needed to decode the video stream, so try to set them on the camera itself to avoid unnecessary decoding work.

Please see the hardware acceleration docs for how to set up hardware acceleration for your GPU.
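
As a rough illustration of why resolution and frame rate matter, the sketch below compares the number of pixels per second a decoder has to produce for two streams. The camera settings are hypothetical, and pixel rate is only a crude proxy: actual CPU cost also depends on the codec, the bitrate, and whether hwaccel is in use.

```python
def decode_pixel_rate(width: int, height: int, fps: float) -> float:
    """Pixels per second the decoder has to produce for one stream."""
    return width * height * fps


# Hypothetical camera settings, chosen only for illustration.
main_stream = decode_pixel_rate(3840, 2160, 30)
sub_stream = decode_pixel_rate(1280, 720, 5)

print(f"main stream: {main_stream / 1e6:.1f} Mpx/s")
print(f"sub stream:  {sub_stream / 1e6:.1f} Mpx/s")
print(f"ratio:       {main_stream / sub_stream:.0f}x")
```

With these assumed numbers, the main stream requires dozens of times more decoded pixels per second than the sub stream, which is why lowering resolution and frame rate on the camera is usually the first thing to try.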

Poorly configured camera detect config:

It is important to properly set your camera's detect config.

Frigate will resize the frames from the decoded camera stream to whatever is set in detect -> width / height unless it is the same size as the actual stream.

This means if your camera is 1280 x 960 and your detect config is:

detect:
  width: 1080
  height: 720

then Frigate will resize the frames to 1080 x 720, which uses a non-negligible amount of CPU. This is why it is recommended to run detect at the actual size of your stream.

NOTE: The default detect config is 1080 x 720, so you should always set width and height explicitly to exactly match your stream's resolution.
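
The check described above can be sketched in a few lines of Python. This is not Frigate's code, just a minimal illustration; `Resolution`, `needs_resize`, and `aspect_ratio_changes` are hypothetical helper names, and the numbers come from the example in this section.

```python
from typing import NamedTuple


class Resolution(NamedTuple):
    width: int
    height: int


def needs_resize(stream: Resolution, detect: Resolution) -> bool:
    """True when the detect size differs from the stream, so frames must be scaled."""
    return stream != detect


def aspect_ratio_changes(stream: Resolution, detect: Resolution) -> bool:
    """True when scaling would also distort the image (compared by cross-multiplication)."""
    return stream.width * detect.height != stream.height * detect.width


camera = Resolution(1280, 960)       # actual stream size from the example
mismatched = Resolution(1080, 720)   # detect config from the example
matched = Resolution(1280, 960)      # detect config equal to the stream

print(needs_resize(camera, mismatched))          # True: frames get resized on the CPU
print(aspect_ratio_changes(camera, mismatched))  # True: 4:3 stream squeezed into 3:2
print(needs_resize(camera, matched))             # False: no resize needed
```

In this example the mismatched config costs CPU for the resize and also changes the aspect ratio of the frames that detection runs on.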

Motion detection:

Motion detection is run on the CPU. The higher the resolution of the stream, the more work it is to detect motion frame to frame. This is one of the reasons why using high resolutions is discouraged.

It is also important to add motion masks over areas that are unlikely to contain objects, such as trees, skylines, etc.

NOTE: Motion masks are not meant to block out actual objects; do not use them for this.

@Majestic7979

Found this issue because, despite using a GTX1080 + tensorrt image, installing the required nvidia stuff on the host machine, and verifying with nvidia-smi in the container, I still get 200-300% CPU usage. Each of the links provided by Nick is 404, so unfortunately I can't troubleshoot using the info in here.

@MagoPoza

Good afternoon, I have 2 tapo cameras configured: one C310 and one C320WS. I adjusted the settings with version 0.12.0 and got 33% CPU consumption on each one.
With the 0.12.1 update, the CPU usage on the C320WS has dropped to 17% (great), but on the C310 it has risen to 47% without my touching any of the settings.

@NickM-27
Sponsor Collaborator Author

You should make your own support issue.

@MagoPoza

"You should create your own support issue."

You're right, I'm sorry. I just created a new issue: #6831

@rovingclimber

rovingclimber commented Jul 10, 2023

Hi!

I just wanted to add a comment here regarding another cause that's not listed here ...

I have an Asrock J5040-itx board (Intel J5040 CPU) and I added an M.2 Coral dual TPU (E-Key) to the on-board E-key slot. I noticed that, after adding the TPU, CPU utilization was high, even though I was now offloading detect to the TPU successfully.

After a bunch of testing I've confirmed that this setup causes high CPU utilization at idle, even if the coral drivers aren't loaded. I'm guessing there's some sort of basic hardware incompatibility that's causing this. I've demonstrated this with a fresh proxmox install (no VMs / containers), a fresh Debian install and even just running a Live CD - with the TPU card installed, CPU usage is at ~15% at idle, removing the TPU it's down at ~1%. Under load it's even worse, seems like the TPU is basically nix-ing the CPU at a hardware level by some mechanism unrelated to OS / Frigate or drivers.

Just thought this was worth mentioning because it might explain otherwise unexplained poor performance for some people. I've ordered this adapter to see if it works any better (should also allow access to both TPUs, as only one shows via the M.2 slot on the board)

Attached is cpu chart from a fresh, idle proxmox install showing the effect of just plugging the TPU in:

[Image: Coral M2 issues]

@NickM-27
Sponsor Collaborator Author

I think more info would be needed to say if the coral is the cause, and that does not match my experience with pcie corals at all.

@rovingclimber

rovingclimber commented Jul 10, 2023

Something really horrible is happening....

I've just done a bunch of benchmark testing on a fresh Debian install. I've run multiple tests with and without the coral PCIe card plugged in to the M2 slot (no coral driver installed, just the hardware, nothing should be "touching" the TPU), and found the following:

  • Disk I/O (measured with bonnie++) is fairly similar with and without the coral stick (maybe 10% apart).
  • Single-threaded CPU benchmark using 7z b -mmt1 shows a small performance hit with the coral plugged in (2477 vs 2566).
  • Multi-threaded CPU benchmark using 7z b -mmt shows a huge performance hit with the coral stick plugged in:
      • I get ~9200 MIPS avg compression without the coral stick plugged in.
      • I get ~1000 MIPS avg with the stick plugged in.
  • Anecdotally, even at idle, HTOP shows one core randomly pegged at high usage for some bursts when the coral is plugged in.

So anyway, this definitely isn't a Frigate issue, but I can confirm that on some hardware there is an issue some people might hit that can cause horrible performance.

If anyone else out there is using a J5040-ITX board, this is something worth checking out!

@PozoSer

PozoSer commented Sep 26, 2023

@NickM-27
Sponsor Collaborator Author

Thanks, looks like a couple links got changed, fixed them 👍

@felipecrs
Contributor

felipecrs commented Nov 6, 2023

I would recommend adding some other bullets to the issue description:

  • Conversion of audio codec (which runs on the CPU)
  • Audio detection (for Frigate 0.13 onwards) (which runs on the CPU)

@NickM-27
Sponsor Collaborator Author

NickM-27 commented Nov 6, 2023

Do you have any data on CPU usage? My audio conversion ffmpeg process converts pcma to aac and uses 0.1% of a single core

@felipecrs
Contributor

felipecrs commented Nov 6, 2023

Conversion to AAC for me (Intel J4125) with go2rtc/ffmpeg takes around 3-4% of CPU for each camera. Currently I am converting 4 cameras, which usually means above 10% of usage.

It is never negligible for me, but it varies a lot over time.

@felipecrs
Contributor

I am testing this right now. I extracted three cameras to go2rtc.yaml and started the go2rtc add-on standalone for comparison. Then I opened up the three streams in VLC in AAC to trigger the ffmpeg conversion for all of them.

When checking the usage from Supervisor, it's usually 10%. When checking the usage from glances, it's usually 30%.

Supervisor's CPU usage is calculated against 100% while Glances calculates against all cores (i.e. 400%).
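
To compare readings taken on the two scales, one can be converted to the other. The sketch below assumes 4 cores (implied by the 400% figure) and uses hypothetical helper names; it is only arithmetic, not how Supervisor or Glances compute their values.

```python
def per_core_to_total(per_core_percent: float, cores: int) -> float:
    """Convert a reading on the 0..cores*100 scale to the 0..100 scale."""
    return per_core_percent / cores


def total_to_per_core(total_percent: float, cores: int) -> float:
    """Convert a reading on the 0..100 scale to the 0..cores*100 scale."""
    return total_percent * cores


cores = 4  # assumption: "400%" means four cores

print(per_core_to_total(30, cores))  # 7.5 (the Glances reading on the 0..100 scale)
print(total_to_per_core(10, cores))  # 40 (the Supervisor reading on the 0..400 scale)
```

On a common scale the two readings (roughly 7.5% vs 10% of the whole CPU) are in the same ballpark.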

@NickM-27
Sponsor Collaborator Author

NickM-27 commented Nov 6, 2023

Right, go2rtc restreaming is much more than audio transcoding (which is done using ffmpeg). From what I have seen, audio transcoding itself is very lightweight: the ffmpeg process used for that uses less than 1% of a single core on my computer.

@felipecrs
Contributor

That's interesting. I should refine my testing methodology then. But it's good to know I'm not spending that many CPU cycles on this conversion. Thank you.
