Linux: CPU usage vs. load average

Submitted by root on Tue, 04/12/2016 - 15:37

Whatever monitoring solution you use, CPU utilization is important, but if you really want to know what is happening on your system, you should look at the load average. I explain why below.

Current CPU utilization does not reflect the actual load of the system: a heavily loaded host does not necessarily have its CPU usage at or near 100%.
Furthermore, CPU utilization tends to generate a significant number of false alerts in monitoring tools, even when the alert threshold is close to 100%. The explanation follows.

There is one way to benefit from this type of monitoring: receive an alert when a server's CPU utilization stays at 100% over a 5-minute interval.
In that situation, someone should take a look at the server, even though this does not necessarily mean the system is overloaded. If you receive too many alerts, the interval can be increased.
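
As a rough illustration of such a check, here is a minimal sketch that derives CPU utilization from two samples of the first line of /proc/stat. The function name cpu_busy_pct is made up for this example, and counting idle plus iowait as "not busy" is an assumption; a real monitoring tool may compute the figure differently.

```shell
# cpu_busy_pct: read two or more "cpu ..." lines on stdin (older sample first,
# each one being the first line of /proc/stat) and print the busy percentage
# between consecutive samples.
cpu_busy_pct() {
    awk '
        /^cpu / {
            total = 0
            for (i = 2; i <= 9; i++) total += $i   # user .. steal
            idle = $5 + $6                          # idle + iowait (assumption)
            if (seen++) {
                dt = total - prev_total
                di = idle - prev_idle
                if (dt > 0) printf "%.1f\n", 100 * (dt - di) / dt
            }
            prev_total = total
            prev_idle = idle
        }'
}

# Possible usage on a Linux host, sampling over a 5-minute window:
#   { head -n 1 /proc/stat; sleep 300; head -n 1 /proc/stat; } | cpu_busy_pct
```

An alert rule would then compare the printed value with the chosen threshold.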

Load Average

What brings more value to the feedback we receive from our servers is the load average.
The load average consists of three values: the averages over the last 1, 5 and 15 minutes.

The three load average values (one, five and fifteen minutes) are calculated over the whole time since the system started, but older activity decays exponentially, at different speeds: by a factor of e after 1, 5 and 15 minutes respectively. Without going too deep, the most important point is that the 1-minute load average does not cover only the last 60 seconds of activity, although it is not far off: about 63% of it comes from the last minute and 37% from the time since startup. The same goes for the 5- and 15-minute values.
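
The 63% figure is 1 - 1/e. A small simulation can make this visible; it is only a sketch, assuming the average is refreshed every 5 seconds and the system starts idle, and load_after is a name invented for this example.

```shell
# load_after: simulate the exponentially damped average for a constant number
# of runnable processes, starting from an idle system (load 0).
#   $1 = averaging window in seconds (60, 300 or 900)
#   $2 = elapsed time in seconds
#   $3 = constant number of runnable processes
# Assumption: the average is refreshed every 5 seconds.
load_after() {
    awk -v window="$1" -v elapsed="$2" -v n="$3" 'BEGIN {
        step = 5
        decay = exp(-step / window)
        load = 0
        for (t = step; t <= elapsed; t += step)
            load = load * decay + n * (1 - decay)
        printf "%.2f\n", load
    }'
}

load_after 60 60 1     # prints 0.63: one busy process for one minute
load_after 300 60 1    # prints 0.18: the 5-minute value reacts more slowly
```

After one minute with a single busy process, the 1-minute value has covered only about 63% of the distance to 1, which is where the 63%/37% split comes from.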

Each process using or waiting for CPU adds 1 to the load number. In Linux, processes in uninterruptible sleep (usually those waiting for disk activity) are also included. This means you can have a heavily loaded (and probably unresponsive) server with CPU utilization near zero, and a CPU-based check will not send any alert.
One relevant example: a stalled NFS share can put the processes using it in uninterruptible sleep, increasing the load while CPU usage is apparently normal. So you would have zero alerts for a troubled system.
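
To see which of the two states is driving the load, you can count the processes in state R (running or runnable) and D (uninterruptible sleep). The helper below is a sketch with an invented name, count_load_states; it works on process states as printed by ps, so treat the result as an approximation of what the kernel counts.

```shell
# count_load_states: read "STATE COMMAND" lines on stdin, as printed by
# `ps -eo state=,comm=`, and count the states that contribute to the load.
count_load_states() {
    awk '
        $1 ~ /^R/ { running++ }   # running or waiting for CPU
        $1 ~ /^D/ { blocked++ }   # uninterruptible sleep, usually I/O
        END { printf "running=%d uninterruptible=%d\n", running, blocked }'
}

# Possible usage on a Linux host:
#   ps -eo state=,comm= | count_load_states
# List the blocked processes themselves:
#   ps -eo state=,pid=,comm= | awk '$1 ~ /^D/'
```

A high uninterruptible count next to low CPU usage points to the stalled I/O case described above.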

Load average versus CPU threads

There is one more aspect relevant to load average monitoring: the number of CPUs.
On two systems, the first with 4 processors and the second with 10, a 1-minute load average of 6 does not have the same impact.
So the load value alone does not give us the true status of the system. It must be weighed against the number of CPU threads.

Taking the example below, we can easily conclude that a load average of 33 is OK for this system, as it has 4 CPUs with 16 threads each (64 threads in total):

user@gzlinux $ grep -E "cpu cores|siblings|physical id" /proc/cpuinfo | xargs -n 11 echo | sort | uniq
physical id : 0 siblings : 16 cpu cores : 8
physical id : 1 siblings : 16 cpu cores : 8
physical id : 2 siblings : 16 cpu cores : 8
physical id : 3 siblings : 16 cpu cores : 8
user@gzlinux $ cat /proc/cpuinfo | grep processor | wc -l
64
user@gzlinux $ cat /proc/loadavg
33.34 34.92 22.28 10/46253 386038
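
The comparison can be reduced to one number by dividing the load by the thread count. This is a minimal sketch; load_per_cpu is a name made up here, and reading a result below 1 as "headroom left" is a rule of thumb rather than a hard limit.

```shell
# load_per_cpu: divide a load average by the number of CPU threads.
#   $1 = load average, $2 = number of threads
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (cpus <= 0) exit 1
        printf "%.2f\n", load / cpus
    }'
}

load_per_cpu 6 4        # prints 1.50: more work than the 4 processors can run
load_per_cpu 6 10       # prints 0.60: the same load fits on 10 processors
load_per_cpu 33.34 64   # prints 0.52: the 64-thread system shown above

# Possible usage on a live host (nproc reports the processors available
# to the current process, which is usually the thread count):
#   read -r one five fifteen rest < /proc/loadavg
#   load_per_cpu "$one" "$(nproc)"
```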

Inspiration and further information about load average:

https://en.wikipedia.org/wiki/Load_%28computing%29
http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages


