Simulation

Up until now my focus was on one core. In real life it is more likely that several POWER processors are assigned to an AIX virtual machine (logical partition, lpar). With a dedicated configuration each virtual processor is backed by a physical core, so there is one less variable to think about.

Let’s assume I have an lpar with 8 virtual processors mapped to POWER7 cores. Each core can run single threaded, in SMT2, or in SMT4, depending on the workload. I created a simulation in Excel that shows the lpar’s CPU utilization both as core capacity used and as time spent on logical CPUs. Check out the simulation sheet in power7.xls.

Each “1” in the blue area represents a maxed out thread (logical CPU) on the machine.
Green cells show CPU capacity from cores and threads:
– for core it is the sum of core capacity used
– for thread it is time spent on logical CPUs

As you “turn on” more threads you are doing the job of the OS scheduler. You will see both core capacity used and time spent on logical CPUs for the whole machine. It might help to understand how these numbers relate…and diverge as we put more load on the lpar.

We are already familiar with the scenario where 1 logical CPU is maxed out by 1 hog. Core capacity used is 0.625, while the hog process is running on the logical CPU the whole time. That one thread/logical CPU makes the lpar’s cores 7.8% busy (0.625 of 8 cores); at the same time AIX thinks the lpar’s CPU utilization is 3.1% (1 of 32 logical CPUs).

power7_thread1
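The two percentages for the single-hog case can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not the spreadsheet itself; the constant and function names are made up for illustration, and the 0.625 figure is the single-threaded core capacity quoted above.

```python
CORES = 8              # virtual processors in the example lpar
THREADS_PER_CORE = 4   # POWER7 in SMT4 exposes 4 logical CPUs per core
ST_CAPACITY = 0.625    # core capacity used by one maxed-out thread (from the text)


def one_hog_utilization():
    """Return (core-based %, logical-CPU-based %) for one hog on the lpar."""
    core_pct = 100.0 * ST_CAPACITY / CORES
    logical_cpu_pct = 100.0 * 1 / (CORES * THREADS_PER_CORE)
    return core_pct, logical_cpu_pct


core_pct, logical_cpu_pct = one_hog_utilization()
print(f"core view: {core_pct:.1f}%, logical CPU view: {logical_cpu_pct:.1f}%")
```

Same hog, same second of wall clock time, two different answers: it all depends on what you divide by.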

 

The default AIX scheduling policy is to try to run all processes single threaded, because that gives each process the most power, that is, the fastest execution. So each hog is assigned to a different core. With 8 hogs active, core utilization is already 62.5%, but the OS sees that as 25%.

power7_thread8

When load goes beyond the number of cores, SMT2 kicks in: the core splits and overall throughput (represented by core capacity used) increases, although with diminishing returns. We doubled the number of hogs, but core capacity used only went up from 5 to 7.

power7_thread16

As we dispatch more hogs, some of the cores must switch to SMT4 to run them. With 32 hogs running, all threads are used and both core and thread utilization are at 100%. At this point core capacity used is 8 seconds and logical CPU capacity used is 32 seconds for each wall clock second.

power7_thread32
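If you do not have Excel at hand, the four scenarios above can be sketched in Python. The per-core capacities are taken or inferred from the numbers in this post: 0.625 for one busy thread, 0.875 for two (7 divided by 8 cores), and 1.0 for four. The text gives no figure for a core with three busy threads, so the sketch deliberately leaves that case out. All names here are hypothetical and this is not a copy of the spreadsheet logic.

```python
THREADS_PER_CORE = 4  # POWER7 SMT4: 4 logical CPUs per core

# Core capacity used by one core, keyed by its number of busy threads.
# Values come from this post; 3 busy threads is omitted on purpose (unknown).
CORE_CAPACITY = {0: 0.0, 1: 0.625, 2: 0.875, 4: 1.0}


def utilization(busy_threads_per_core):
    """Summarize an lpar given the busy thread count of each core."""
    cores = len(busy_threads_per_core)
    core_used = sum(CORE_CAPACITY[n] for n in busy_threads_per_core)
    thread_used = sum(busy_threads_per_core)
    return {
        "core_capacity_used": core_used,
        "core_pct": 100.0 * core_used / cores,
        "logical_cpu_time": thread_used,
        "logical_cpu_pct": 100.0 * thread_used / (cores * THREADS_PER_CORE),
    }


# Thread layouts for the scenarios discussed in the text (8 cores).
scenarios = {
    1: [1] + [0] * 7,  # one hog, single threaded
    8: [1] * 8,        # one hog per core, single threaded
    16: [2] * 8,       # every core in SMT2
    32: [4] * 8,       # every core in SMT4
}

for hogs, layout in scenarios.items():
    u = utilization(layout)
    print(f"{hogs:2d} hogs: core {u['core_pct']:5.1f}%  "
          f"logical CPU {u['logical_cpu_pct']:5.1f}%")
```

Passing a layout with three busy threads on a core raises a KeyError, which is intentional: I would rather have it fail than guess a number.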

 

I created a similar Excel sheet for POWER8: POWER8.xls.

What do you see? On POWER8, what is the core-based utilization when the OS is telling us the machine is 50% utilized?
(hint…94.5%)

 

If you check the utilization sheets, you can guess what I will write about next.
