The power of POWER7

I got the idea to summarize core, thread utilizations, and thread response times from IBM AIX POWER7 CPU Usage and Throughput…a great article!

Another great source is Local, Near & Far Memory part 3 – Scheduling processes to SMT & Virtual Processors, “With POWER7 we have off or SMT=2 and SMT=4. With SMT=4 we get four programs/processes (or four threads of execution) at the same time.  This is a technique that boosts performance when there are lots of processes and/or threads to run at the same time… as SMT increases the number of instructions executed by a single thread on the CPU-core goes down but then as we are running more threads the throughput goes up.”

The article goes on to explain the performance boost the POWER7 processor gets when running in higher simultaneous multi-threading (SMT) modes. Using its relative power estimates, I created this table:

smt  relative power  core used  thread  response time       logical CPU  CPU%
1    1               0.62       0.62    1                   1            25    <- 62% core capacity used shows up as 25%
2    1.4             0.88       0.44    1.43 (=0.62/0.44)   2            50    <- 88% core capacity used shows up as 50%
4    1.6             1          0.25    2.5  (=0.62/0.25)   4            100

The power of POWER7 increases as more threads are used. Core capacity used is the relative power divided by the maximum (1.6), so a single thread uses 1/1.6 = 62% of the core.
With 2 threads the estimated boost is 40% more capacity compared to 1 thread; 2 threads make the core 88% busy, each thread using 44% of the core.
4 threads produce 60% more power than 1 thread; the core is maxed out, each thread using 25% of the core.
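
The arithmetic behind the "core used" and "thread" columns can be sketched in a few lines of Python. This is only an illustration: the relative power values come from the table, while the names RELATIVE_POWER, MAX_POWER, core_used and thread_share are made up for this example, and dividing by the 4-thread value assumes a core running 4 threads is 100% busy.

```python
# Relative power estimates per number of active threads (from the table above).
RELATIVE_POWER = {1: 1.0, 2: 1.4, 4: 1.6}

# Assumption: a core running 4 threads counts as 100% busy.
MAX_POWER = RELATIVE_POWER[4]


def core_used(threads):
    """Fraction of core capacity consumed with `threads` busy threads."""
    return RELATIVE_POWER[threads] / MAX_POWER


def thread_share(threads):
    """Fraction of the core that each individual thread gets."""
    return core_used(threads) / threads


for n in sorted(RELATIVE_POWER):
    print(f"threads={n} core used={core_used(n):.3f} per thread={thread_share(n):.4f}")
```

Rounded to two decimals, the printed values should line up with the table; the table itself appears to be built from the unrounded figures.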

[Figure: relative performance by SMT mode]

Thread response times are calculated from thread utilizations relative to the 1st thread, i.e. less utilization means longer response time. Response times increase as more threads are active. With 4 concurrent threads it takes 2.5 times longer to finish the same job as when the core runs 1 thread.
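
As a rough sketch of that calculation: the response time factor is the single-thread share of the core divided by the share each thread gets at the higher thread count. The names THREAD_SHARE and relative_response_time are hypothetical, and the unrounded shares (0.625, 0.4375) are my assumption about what sits behind the rounded 0.62 and 0.44 in the table; they would explain why the table shows 1.43 rather than 1.41.

```python
# Per-thread share of the core for 1, 2 and 4 busy threads.
# Assumption: unrounded values behind the table's 0.62, 0.44 and 0.25.
THREAD_SHARE = {1: 0.625, 2: 0.4375, 4: 0.25}


def relative_response_time(threads):
    """How many times longer a job takes than with 1 thread on the core."""
    return THREAD_SHARE[1] / THREAD_SHARE[threads]


for n in sorted(THREAD_SHARE):
    print(f"threads={n} response time x{relative_response_time(n):.2f}")
```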

The “logical CPU” column shows how many logical CPUs are kept busy by the hogs (CPU-bound processes).
When hogs max out the logical CPUs, each hog is using 1 CPU second per wall clock second. AIX tools such as vmstat and mpstat report CPU% at the logical CPU (thread) level.

Therefore we can express the amount of CPU used by hog(s) as
1) core capacity consumed
2) time spent on logical CPU

Unfortunately some tools still think “2) time spent on logical CPU” when they are reporting figures from “1) core capacity consumed”…remember my low OS CPU utilization in AWR report and in OEM grid control?
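
To make the gap between the two views concrete, here is a minimal sketch that converts a logical-CPU percentage into the core capacity it corresponds to. It assumes a single core in SMT=4 mode with hogs fully occupying their logical CPUs, and it only knows the three points from the table; SMT, CORE_USED and core_capacity_from_cpu_pct are hypothetical names, not part of any AIX tool.

```python
SMT = 4  # logical CPUs per core (assumption: one core in SMT=4 mode)

# Core capacity used for 1, 2 and 4 busy threads (unrounded table values).
CORE_USED = {1: 0.625, 2: 0.875, 4: 1.0}


def core_capacity_from_cpu_pct(cpu_pct):
    """Map logical-CPU% (25, 50 or 100) to the fraction of core capacity used."""
    busy_threads = round(cpu_pct / 100 * SMT)
    if busy_threads not in CORE_USED:
        # The table has no estimate for other thread counts (e.g. 3 threads).
        raise ValueError(f"no estimate for {busy_threads} busy threads")
    return CORE_USED[busy_threads]


for pct in (25, 50, 100):
    print(f"logical CPU% {pct:3d} -> core capacity {core_capacity_from_cpu_pct(pct):.1%}")
```

A reading of 25% at the logical CPU level therefore already means well over half of the core is consumed.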
