POWER7 vs POWER8

In this post I will compare maximum workloads between lpars using POWER7 and POWER8 processors.

Detailed processor information for the two test lpars

Processor Type: PowerPC_POWER7
 Processor Implementation Mode: POWER 7
 Processor Version: PV_7_Compat
 Number Of Processors: 10
 Processor Clock Speed: 4228 MHz
 System configuration: type=Shared mode=Uncapped smt=4 lcpu=40 mem=92160MB psize=16 ent=1.00

Processor Type: PowerPC_POWER8
 Processor Implementation Mode: POWER 8
 Processor Version: PV_8_Compat
 Number Of Processors: 8
 Processor Clock Speed: 4157 MHz
 System configuration: type=Shared mode=Uncapped smt=8 lcpu=64 mem=102400MB psize=16 ent=0.80

My expectation for processor performance is based on specint_cpu_explorer from Karl Arao

I picked the closest processor from the list based on clock speed and type

ResultPerCore: 54.63
 Operating System: IBM AIX V7.1
 Processor MHz: 4228
 Processor: POWER7+
 System: IBM Power 730 Express (4.2 GHz, 16 core)

ResultPerCore: 79.00
 Operating System: IBM AIX V7.1
 Processor MHz: 4192
 Processor: POWER8
 System: IBM Power E870 (4.19 GHz, 80 core)

Expected workload increase between POWER7 and POWER8 processors based on specint_rates

79/54.63=1.44

Test results generated by my hog load script for both test lpars

lpar using POWER7 with 10 cores: 1821 tx/s, 182 tx/s/core
lpar using POWER8 with  8 cores: 2521 tx/s, 315 tx/s/core

Workload ratios between POWER7 and POWER8 when the lpars are maxed out

lpar 2521/1821 = 1.38
core 315/182 = 1.73

POWER7 vs POWER8

References
hog
Lpar test
Lpar test POWER8


Lpar test POWER8

In lpar test I looked at an lpar running on POWER7 virtual processors. This time I will run the same test on an lpar with POWER8.

In power8.xlsx I calculated these theoretical utilization numbers:

smt   core perf   core use   thread perf   thread use   thread resp
  1        1.00       0.50          1.00         0.50          1.00
  2        1.45       0.73          0.73         0.36          1.38
  4        1.89       0.95          0.47         0.24          2.12
  8        2.00       1.00          0.25         0.13          4.00
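
To spell out one row (my reading of the sheet): at SMT4 the core delivers 1.89 times single-thread throughput spread over 4 threads, so each thread runs at 1.89/4 = 0.47 of single-thread speed and a fixed piece of work takes 1/0.47 = 2.12 times longer to respond. The SMT8 row works out the same way: 2.00/8 = 0.25 and 1/0.25 = 4.00.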

My expectation is a 2-fold workload increase between single-threaded and SMT8 modes.
power8_performance

At the same time each thread will run 4 times slower, a trade-off for more throughput.

In my load test script (hog) I am using a PL/SQL loop to drive up the load on the CPUs:

declare a number := 1; begin for i in 1..40000 loop a := ( a + i )/11; end loop; end;

I use no data/sql, so there is no variation in execution (like a change in execution plan, or use of direct path reads during full table scans, etc.). Therefore the same hog can be used on different databases to compare performance. I am counting the number of executions (“tx_s”) and the response time of each loop (“res_ms”).
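
For illustration, a minimal sketch of how one timed iteration could be reported the way hog does (this is not the actual hog code; the variable names and the dbms_output call are mine):

-- minimal sketch, not the hog script itself: time one loop and report res_ms
declare
  a      number;
  t0     number;
  res_ms number;
begin
  t0 := dbms_utility.get_time;                    -- hundredths of a second
  a  := 1;
  for i in 1..40000 loop
    a := ( a + i ) / 11;
  end loop;
  res_ms := ( dbms_utility.get_time - t0 ) * 10;  -- convert centiseconds to milliseconds
  dbms_output.put_line('res_ms=' || res_ms);
end;
/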

lpar with POWER8> hog plsql_loop nobind '0 80 1' | tee hog_plsql_loop_`hostname -s`_${ORACLE_SID}_`date +%m%d%y_%H%M`.log

power8 top activity

Please see data collected in power8.xlsx lpar test sheet.

On this lpar we do not have “app” output from lparstat (I am guessing the sysadmins did not expose it), but vmstat’s physical processors consumed (pc) steadily increased during the test. Core capacity used by the lpar maxed out at around 8; it has 8 virtual processors backed by sufficient core capacity from the shared pool.

DB and session CPU consumption (“db_cpu”/”sess_cpu”) matches the core capacity used (“pc”) by the machine, so the database is responsible for the load on the machine.

“sess_cpu” matches “purr”, the core capacity used by all hog processes as captured by ‘pprof’; therefore Oracle session CPU usage is expressed in core capacity.

At the same time, hog process CPU consumption expressed as time spent on CPU (the “time” column) matches the number of hogs. Each hog is designed to max out (run constantly on) a logical CPU/thread.

Up until around 60 hogs “os_cpu” matches “time” from ‘pprof’, so we know OS CPU utilization is time based; the OS is busy from the hogs.

At high load “os_cpu” and “ash_cpu” go above the maximum time based CPU capacity of 64. The same is true for the core based DB and session CPU from the time model (“tm_cpu” and “db_cpu”). It looks like these metrics won’t give us true CPU capacity used (time or core based); they mix in some time component after the CPUs are maxed out. All metrics are documented in the hog post.

Let’s see if the test results match my expectations.

Workload increase is linear from 1 to 8 hogs; we have 8 virtual processors.
Workload went up from 1169 tx/s to 1830 tx/s, a 56% increase, when the processor switched from single-threaded to SMT2.
Workload went up from 1169 tx/s to 2521 tx/s, a 2.15-fold increase, when the processor switched from single-threaded to SMT8.
We are close to (slightly better than) the calculated workload increase for POWER8.

Response time started at 6 ms with 1 hog and goes up to 25 ms with 64 hogs… a 3.7-fold increase… a bit faster than I expected. The calculation was based on the expected performance gain for a typical workload. Perhaps the workload generated by hog favors POWER8, so I revised the estimated performance increase for POWER8; see the updated numbers in the lpar test sheet in power8.xlsx.

power8 throughput and response time

Looking at the data… for this type of workload I would not recommend running the DB with SMT8 (above 32 threads). It won’t provide much additional throughput and response time is significantly worse.

Our production DB is running on POWER8 with 16 virtual CPUs. It is time to apply some fuzzy logic. During batches roughly half of the virtual processors switched from SMT4 to SMT8 (ballpark). Per the table above the thread response factor goes from 2.12 to 4.00, an 89% slowdown, and with only half of the processors affected the average expected response time increase is 44% (=(4-2.12)/2.12/2). It matches the batch run time increase between the good day (4/18) and the slow day (4/27). You can check the post What happened to the batch? for details.

In the next post I will compare test results between POWER7 and POWER8.

Cputoolkit on lpar

While researching CPU behavior on AIX/POWER/lpar I found Karl Arao’s article on cores vs threads, v2 vs x2. The article refers to the cputoolkit tool from https://karlarao.wordpress.com/scripts-resources/.

Soon I realized I needed to compare CPU usage from several sources. This approach allowed me to observe both time spent on CPU and core based usage. I ended up creating my own script (hog) to generate load and capture CPU related information.

I was wondering how hog compares to cputoolkit. I added the driver sql from cputoolkit, so I was expecting similar results.

--lotslios, borrowed from Karl Arao's cputoolkit, I think he got it from a script by Tanel Poder
--calibrated where clause for around 0.3 sec run time on test lpar
select /*+ cputoolkit ordered use_nl(b) use_nl(c) use_nl(d) full(a) full(b) full(c) full(d) */ count(*) into a 
from sys.obj$ a, sys.obj$ b, sys.obj$ c, sys.obj$ d 
where a.owner# = b.owner# and b.owner# = c.owner# and c.owner# = d.owner# and rownum <= 3000000;

In cputoolkit the data is gathered by sampling v$sql every few seconds, then we use the data from the last sample for each set. I was wondering if all logical IOs (LIOs) are accounted for this way.

In hog I calibrate the sessions to run for a set time and catch the number of transactions (executions of the driver sqls) and LIOs from v$mystat on exit. This approach guarantees exact counts and also makes the transition between sets smoother; I do not have to kill Oracle sessions.
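
As an illustration, the v$mystat lookup behind those counters could look something like this (a sketch only, not the exact hog query; it assumes select access to the v$ views):

-- sketch: per-session LIOs and execution count from v$mystat
select sn.name, ms.value
  from v$mystat ms, v$statname sn
 where sn.statistic# = ms.statistic#
   and sn.name in ('session logical reads', 'execute count');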

I opted to report data normalized on time. This way I can easily compare information from ASH, instrumentation, and OS.

The hog script is one self-contained executable; it is easy to deploy, requires no change to the DB, and connects as sysdba.

I picked an lpar with 10 virtual processors, smt=4, 40 logical CPUs. During the tests there was enough available core capacity in the shared pool to fully utilize all processors.

cputoolkit

sh cputoolkit 1 1 50 $ORACLE_SID

Workload starts at 179526 LIOs/sec for the 1st session. I see an almost linear increase up until around 10 sessions (makes sense, we have 10 virtual processors). After that additional sessions bring diminishing returns. Max workload is around 2.3 million LIOs/sec at 40 sessions. I calculated LIOs/sec by multiplying LIOS_ELAP by the number of sessions. It matches ‘Logical Reads Per Sec’ from loadprof.sql, which is recorded by cputoolkit during the test. I uploaded the logs in cputoolkit_031717.zip

cputoolkit_lpar_throughput

The 1st session can produce 179526 LIOs per second. By the time we run 40 concurrent sessions, performance for each session degrades to 57910 LIOs/sec (57910*40 is about 2.3 million, matching the max workload above).

At the sql level, as we run more concurrent sessions each sql runs a bit longer; ELAP_EXEC is up from 0.35 to 0.92 seconds with 40 sessions. This means fewer executions by each session; EXEC/sec/CPU is down from 2.37 to 1.09 executions per second.

cputoolkit_session_throughput

You can see all data points in cputoolkit.xlsx

For some reason LIOs per execution (the LIOS_EXEC column in cputoolkit.xlsx) is slightly decreasing, which is not the case with hog. This would mean our driver sql executes with fewer logical IOs… not likely. I suspect a measurement error from the periodic sampling of buffer_gets from v$sql. Data points from Karl also show decreasing LIOS_EXEC, resulting in a downward-sloping LIOs/Elap even when running fewer sessions than cores.

But the real issue on AIX/POWER/lpar is that cpu_time in v$sql does not represent time spent on CPU. It is core capacity used. Therefore CPU WAIT can’t be calculated by subtracting cpu_time from elapsed_time.
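
To make the point concrete, this is the kind of naive calculation that is misleading on AIX/POWER (a sketch; the substitution variable is illustrative). elapsed_time is wall clock microseconds while cpu_time here reflects core capacity, so the difference is not a wait time:

-- misleading on AIX/POWER: cpu_time is core based, not time spent on CPU
select sql_id,
       elapsed_time / 1e6                  ela_s,
       cpu_time / 1e6                      cpu_s,
       ( elapsed_time - cpu_time ) / 1e6   naive_cpu_wait_s
  from v$sql
 where sql_id = '&sql_id';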

hog

hog lio nobind '0 50 1'

Workload starts at 179000 LIOs/sec for the 1st session. Similarly to cputoolkit I see an almost linear increase up until around 10 sessions. After that there are diminishing returns from additional sessions. Max workload is 2.2 million LIOs/sec at 40 sessions. I calculated LIOs/sec from slio_k, which is thousands of ‘session logical reads’ from v$mystat. It matches the test results from cputoolkit.
hog_lpar_throughput

The 1st session can deliver 179000 LIOs/sec, which degrades to 67650 by the time we run 40 concurrent sessions.

Workload graphs beyond 20 sessions are jerky. I think it is from the POWER processor switching between SMT levels to run more sessions, i.e. when a processor changes from SMT2 to SMT4 there will be a slight degradation in performance until all threads are utilized… check out the power7.xlsx utilization sheet to see how that can happen.

At the sql level, as we run more concurrent sessions each sql runs a bit longer: res_ms (brown line) is up from 357 msec to 932 msec with 40 sessions (very similar to the 0.35 to 0.92 seconds with cputoolkit). This means fewer sql executions by each hog: hog throughput (tx_s/hog, blue line) is down from 2.8 to 1.07 executions per second with 40 concurrent hogs running.
hog_session_throughput

For workload and sql response time, hog matches the results from cputoolkit.

With hog I collect enough information to calculate the same figures reported by cputoolkit. “time” from pprof is the sum of time spent on CPU by Oracle sessions. Assuming the sqls fully utilize the sessions, it is equivalent to CPU used by the driver sqls (see the hog sheet in cputoolkit.xlsx, time -> CPUSECS). ELAP_EXEC (brown) is the duration of the test divided by the number of sql executions. From these I was able to calculate CPU WAIT (gray line). Waits start to show up after all 40 logical CPUs are saturated; beyond that elapsed time starts to diverge from CPU used.

cpu_wait.jpg

Converting/substituting core based CPU usage to time spent on CPU allowed me to calculate CPU WAIT on AIX/POWER/lpar. I obtained time spent on CPU from pprof. I could use ASH ‘ON CPU’, but ASH would continue to grow beyond 40 as we run more hogs. On the other hand, pprof CPU usage maxes out at 40 since we have 40 logical CPUs providing a maximum of 40 seconds of CPU time each wall clock second.
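
Put another way (my shorthand; the spreadsheet works this out per hog count): each hog wants a full logical CPU, so per wall clock second the demand is (number of hogs) CPU seconds while the supply tops out at 40 CPU seconds from pprof. The difference is CPU WAIT, and it stays at zero until the 40 logical CPUs saturate.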

I started by testing cputoolkit on AIX/POWER/lpar. I noted problems with CPU WAIT because cpu_time is on a different scale… it is not time on AIX/POWER. Then I tested with hog. I updated it to use the same sql to drive the load. It produced similar results for lpar throughput and sql response time. Hog collects both core capacity used and time spent on CPU by Oracle sessions. With that information I was able to calculate CPU WAIT. This post is another example of why it is important to understand the difference between the two ways CPU is interpreted/reported on AIX/POWER/lpar.

Soft thoughts

Learning

There is always the next level.

I am still learning. What I wrote here mostly exists in bits & pieces on the Internet. Not much new… it was new for me. During my research I focused on CPU utilization on AIX/POWER/lpar. I felt the CPU used by the DB somehow was not accounted for in AWR reports, nor was it reflected in CPU utilization at the OS level.

I built on findings by Marcin Przepiorowski in his article on Oracle on AIX – where’s my cpu time ?. I also benefited from Jeremy Schneider’s excellent summary Understanding CPU on AIX Power SMT Systems. Another great source was IBM AIX POWER7 CPU Usage and Throughput by ksun. Applying the information allowed me to explain the low CPU utilization of the machine/lpar, and the low CPU usage numbers by the DB. It also led me to the root cause of slow DB performance.

Why Write?

Writing my experience down forced me to think it over. I reached a whole new level of understanding. I hope you will learn something by reading this. I also hope someone out there will point out mistakes in my posts; that will give me another push to dig deeper.

Rule #1

The DB is not operating in thin air. It is executing queries on behalf of the applications. It is responding to those requests by using resources from the OS/HW. Neither side is controlled by the DBA. Yet we are frequently called upon to fix slowness in the DB.

Rule #1: the DBA is guilty. The application is slow because of the DB unless proven otherwise. If the DB is using too much resource on the OS, the DBA is guilty again. But how can a DBA know both the app and the OS/HW? In today’s specialized world you can’t cover it all. So the people supporting the app and the OS are your friends. They can help you find unusual patterns of activity affecting the DB. Do not expect them to find you; you need to reach out… you are lucky if they are approachable.

Management may not recognize this; they think pushing the DBA harder is the best way to solve the problem. This can quickly become a distraction: you may be asked to participate in fire calls and report progress every few hours. That time would be better spent on thinking, or consulting with the few people relevant to the problem. All hands on deck can be counterproductive. Ultimately the client is telling you what to do: you may let them know it is not helping, but I see no point sweating this too much.

Engaging too many “experts” and feeding them information is another sure way to waste time. While we do not yet know what hit us, any help is welcome; we just need to balance it somehow.

Our job

As DBAs we are responsible for keeping the DB humming. If something unusual is going on you need to recognize it. Do yourself a favor, know your DB. Know the numbers in your load profile. Know the type of workload the application is requesting. Understand the limitations of the HW the DB is running on. Basic stuff, like 1000 sql executions/sec, 10 hard parses/sec, 2 msec average db file sequential reads (which can’t possibly come from physical disk), max 10000 IOPS for the SAN, max 11 seconds of CPU core capacity on the lpar for the main DB (which is on a different scale than the 128 CPU seconds the OS thinks the lpar has)…

Dead-ends

So far I have not posted about dead-ends/side tracks. Those are a necessary evil. There is no way to know they are dead-ends until you pursue them. You may conclude faster that they are irrelevant (victims instead of root causes) by learning from others.

Some of the pitfalls we had:
– using only AWR reports to explain slowness…not everything is in there
– trying to shoestring the problem into one of the seemingly possible scenarios based on data in AWR reports
– picking the wrong day (with less load) for baseline to “prove” slowness was from increased workload
– picking the wrong day (with sudden jump in number of soft parses) for baseline to “prove” slowness was from increased soft parse
– in the heat of the battle making changes on “expert” advice without verifying or understanding
– disk IO/speed trap on machine with CPU starvation
– allowing yourself to be distracted by constant/useless conference calls
– not following through with application owners to explain why the DB can’t cope and to work together on a solution

At the end

Our week-long DB slowness ended before we knew what hit us. At that time a lot of attention was devoted to the batches. We thought they were the reason why daytime activity was slower. The vendor stopped churning through the whole data set and started to process only the subset of accounts that matter. The fastest sql is the one not running 🙂

Little did we know that the batches were the victims of hard parsing by other sessions. At around the same time they modified the batch, the sqls with literals also stopped coming. So we were back to normal. A couple of months later I tracked down the application responsible. By that time management had moved on and was not interested in pursuing it further.

The same application still puts out bursts of similar sqls a few times a year. During my research I created a tool (lprof) to catch hard parsing as it happens. A couple of months ago the tool helped me identify the issue on the spot. It was “just” a 6 hour burst during business hours; it did not interfere with the overnight batches and users did not really notice it (there were no complaints). I notified management and they acknowledged it. I am out of the environment by now, but this blog may help DBAs the next time it happens.

Thanks for reading.

lprof

How to catch the problem next time?

We had a slow DB for a whole week. We started with AWR reports. I like to look at the load profile first. For our hard parsing problem it did not really help. The time model section had more information (ballooning parse time elapsed), but not enough to catch the increase in ‘parse time cpu’. In awrdiff_0418_0427_dbsidxxx.html the data for ‘parse time cpu’ is listed somewhere down among hundreds of other metrics.

awrdiff_parse_time_cpu

Wow, we had a 1184% increase in ‘parse time cpu’. But that information is buried in an AWR report generated long after the fact.

Wouldn’t it be nice to see the relevant/main metrics in real time?

In my load test script (hog) I combine information from several sources. I wanted to see some of the AIX specific output from lparstat (app, physc) together with CPU related info from ASH (sampled) and sysstat (instrumented)… it would be nice to have similar information every few seconds.

I also wanted something generic enough to catch issues with a wide range of resources… CPU, disk or network.

I wanted it not to spill too much information onto my screen, so I made it easy to pick and choose metrics.

I wanted to see results every few seconds allowing me to observe changes in resource usage real time.

I wanted it lightweight, working in a simple terminal (no GUI).

I wanted it to run without modification on both AIX and Linux.

Script lprof was born.

Linux or AIX> lprof

Oracle load profile sampler
Usage
        lprof <sleep seconds>
Output (any/all of these)
        time    date +%H%M%S
        app     lprstat available physical processors in the shared pool or vmstat r (when lparstat is not avail)
        physc   lprstat number of physical processors consumed or vmstat pc (AIX) or vmstat id (on linux)
        busy    lprstat lbusy (percentage of logical processor utilization) or vmstat busy (us+sy) 
        cpu_os% v$osstat, round((&BUSY_TIME_2-&BUSY_TIME_1)/100/&SLEEP,2)/&CPU_COUNT*100
        cpu_os  v$osstat, (&BUSY_TIME_2-&BUSY_TIME_1)/100/&SLEEP
        db_cpu  v$sys_time_model, stat_name = 'DB CPU'/&SLEEP
        cpu_bgr v$sys_time_model, stat_name = 'background cpu time'/&SLEEP
        lio_k   v$sysstat where name='session logical reads' [1000]
        pio     v$sysstat where name='physical reads'
        rmbps   v$sysstat where name='physical read total bytes' [MB]
        wmbps   v$sysstat where name='physical write total bytes' [MB]
        riops   v$sysstat where name='physical read total IO requests'
        wiops   v$sysstat where name='physical write total IO requests'
        seqr_ms v$system_event, 'db file sequential read'
        execs   v$sysstat where name='execute count'
        prs_tot v$sysstat where name='parse count (total)'
        prs_h   v$sysstat where name='parse count (hard)'
        prs_ela v$sysstat where name='parse time elapsed'
        cpu_prs v$sysstat where name='parse time cpu'
        bstcl_k v$sysstat where name='bytes sent via SQL*Net to client'                
        brfcl_k v$sysstat where name='bytes received via SQL*Net from client'  
        rtrcl   v$sysstat where name='SQL*Net roundtrips to/from client'               
        bstdbl  v$sysstat where name='bytes sent via SQL*Net to dblink'                
        brfdbl  v$sysstat where name='bytes received via SQL*Net from dblink'  
        rtrdbl  v$sysstat where name='SQL*Net roundtrips to/from dblink'               
        load    v$osstat, LOAD average for last minute
        aas     v$active_session_history sessions with session_state = 'ON CPU' in last sample
        ash_cpu v$active_session_history with session_state = 'ON CPU'
        ash_prs v$active_session_history with session_state = 'ON CPU' and IN_PARSE
        ash_prh v$active_session_history with session_state = 'ON CPU' and IN_HARD_PARSE
        ash_lit v$active_session_history count sql-s with most literals in last sample, not normalized for time
        ash_tm  v$active_session_history latest sample_time
        aas1    v$active_session_history aas for most active event
        ash_1st v$active_session_history most active event
        aas2    v$active_session_history aas for 2nd most active event
        ash_2nd v$active_session_history 2nd most active event
        sleep   actual time between measurements, can increase on busy environment
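
As an illustration of how one of these columns can be derived (a sketch only, not the actual lprof implementation), the aas value corresponds to a query along these lines:

-- sketch: sessions ON CPU in the latest ASH sample
select count(*) aas
  from v$active_session_history
 where sample_time = ( select max(sample_time) from v$active_session_history )
   and session_state = 'ON CPU';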

It is my way to have a first look at a DB; it helps me feel out what is going on in the environment.

The script was developed and tested on AIX and Linux with Oracle 11gR2; you may need to make minor adjustments for other OS/DB versions.

uname -vM
7 IBM,8286-42A
 
uname -a
Linux ol6-112-rac1.localdomain 2.6.39-200.24.1.el6uek.x86_64 #1 SMP Sat Jun 23 02:39:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

Use case #1

On Linux I used hog to generate parse load, watched changes in number of parses per second and cpu used for parsing.

Use case #2

On AIX I used the same hog script to generate parse load, watched changes in number of parses per second and cpu used for parsing. Also watched how ASH shows the session is using 100% CPU. Observed the AIX specific core based CPU utilization reported by instrumentation (cpu_db, cpu_prs).

Use case #3

We had a similar episode. For several hours the DB maxed out the CPU on the lpar. It was from hard parsing. I was able to catch it in real time using lprof.

AIX host> lprof 5

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
Waits between measurements [sec] 5
System configuration: type=Shared mode=Uncapped smt=8 lcpu=128 mem=131072MB psize=12 ent=10.00 
Online Virtual CPUs                        : 16
We are using lparstat on AIX

time    app     physc   aas     aas1    ash_1st aas2    ash_2nd execs   prs_tot prs_h   lio_k   cpu_db  cpu_prs rmbps   wmbps   riops   wiops   seqr_ms
160428  0.00    11.60   110.60  35.39   cursor: 26.54   CPU     3140    908     11      511     8.5     4.64    498.5   11.7    2380    639     17.69   
160452  0.00    11.65   78.80   22.06   cursor: 18.91   CPU     3469    823     11      516     8.9     5.32    408.0   12.4    3166    609     8.17    
160515  0.00    11.45   190.00  55.10   library 45.60   CPU     2959    833     9       508     8.1     5.22    508.3   11.8    2361    624     15.06   
160539  0.00    11.62   151.20  51.41   cursor: 37.80   CPU     3070    810     14      514     8.6     4.99    466.4   10.8    2354    584     12.50   
160601  0.00    11.26   38.00   14.82   cursor: 9.12    CPU     2931    817     10      564     8.3     4.65    490.2   11.3    2394    611     15.00   
160626  0.00    11.45   80.60   33.05   cursor: 18.54   CPU     2724    762     9       530     8.6     4.77    492.7   11.2    2396    520     13.79   
160651  0.00    11.39   40.40   15.35   cursor: 9.70    CPU     2462    766     9       531     8.7     4.29    487.0   8.3     2109    406     20.06   
160717  0.00    11.67   82.60   23.95   cursor: 20.65   CPU     2743    827     13      575     8.4     4.60    512.0   9.3     2133    469     19.51   
160742  0.00    11.71   80.20   28.87   cursor: 20.85   CPU     3235    803     12      614     8.7     3.78    485.6   10.1    2052    455     21.03   
160808  0.24    11.33   121.40  38.85   cursor: 27.92   CPU     3051    809     8       491     8.7     4.23    439.5   10.8    2986    585     10.64  

# next day...normal...

time    app     physc   aas     aas1    ash_1st aas2    ash_2nd execs   prs_tot prs_h   lio_k   cpu_db  cpu_prs rmbps   wmbps   riops   wiops   seqr_ms
165733  2.93    8.84    17.80   6.41    CPU     4.81    SQL*Net 2005    612     14      655     4.5     .07     814.4   3.2     5214    255     2.23    
165743  2.29    9.46    17.40   6.61    CPU     4.87    SQL*Net 2240    617     11      873     4.6     .04     673.5   3.5     6298    264     1.76    
165754  1.85    9.96    22.60   7.68    CPU     6.33    SQL*Net 2356    709     15      789     4.4     .07     773.5   3.2     4439    207     1.28

The lpar used 11+ cores (physc); there was no core resource left in the shared pool (app=0). From the 11 cores around 8 were used by the DB (cpu_db). Around half of that was used for parsing (cpu_prs)… bad. Also, look at the high ‘db file sequential read’ times (seqr_ms), a side effect of CPU starvation.

File lprof.txt has the script with sample test runs, enjoy.

References:
Load profile
SQL Dashboard V2.1 (beta)

Root cause part 2

In the previous post I narrowed down the root cause to hard parsing. In this post we will track down the hard parsing sqls.

The AWR report (and DBA_HIST_SQLSTAT) may not have such information; each sql may execute only once (or a few times), so they are not big enough to make it into AWR.

But the ASH report’s “Top SQL using literals” section showed two FORCE_MATCHING_SIGNATUREs (FMS) with many sql versions. The DB was executing the same sqls, differing only in the literals passed in.

We can also find those FMSs using custom scripts; see how I do that in rootcause.txt. There are 2 FMSs with distinct sql counts of 90+ thousand on 4/27. They started to show up around 4/23, the same time parse CPU shot up.
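
The idea can be sketched with a query like this (not the exact query from rootcause.txt; the time window is illustrative):

-- sketch: FMSs with many distinct sql_ids in recent ASH history
select force_matching_signature,
       count(distinct sql_id) distinct_sqls
  from dba_hist_active_sess_history
 where sample_time > sysdate - 7
   and force_matching_signature <> 0
 group by force_matching_signature
 order by distinct_sqls desc;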

We can also find parse related information in ASH. We can even calculate time spent on logical CPUs for parsing. ASH also has the session count, the account used, the application connecting, the machine initiating the connections, etc… a gold mine of information. With such detail we can track down the source the parsing sqls were executed from.

Let’s look at how parse CPU from the 2 FMSs compares with CPU used for all parses and CPU used for all the sqls in sqlstat. Please see query “–parse_cpu_fms” in rootcause.txt.

Data points are also listed in the “rootcause” sheet of mining_awr.xlsx, or you can zoom in on
parse_cpu_fms.jpg


CPU_HARD_PARSE_SEC	ASH session_state='ON CPU' and IN_HARD_PARSE='Y' 
CPU_SEC_FMS		ASH session_state='ON CPU' for the 2 FMSs
CPU_HARD_PARSE_SEC_FMS	ASH session_state='ON CPU' and IN_HARD_PARSE='Y' for the 2 FMSs
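
A rough sketch of how a number like CPU_HARD_PARSE_SEC can be pulled from the ASH history (not the exact query in rootcause.txt; it assumes the usual ~10 second sample interval of DBA_HIST_ACTIVE_SESS_HISTORY):

-- sketch: approximate hard parse CPU seconds per day from sampled ASH
select trunc(sample_time) sample_day,
       sum( case when session_state = 'ON CPU' and in_hard_parse = 'Y' then 10 else 0 end ) cpu_hard_parse_sec
  from dba_hist_active_sess_history
 group by trunc(sample_time)
 order by 1;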

Most of the DB hard parse CPU (CPU_HARD_PARSE_SEC) came from hard parsing the sqls with the 2 FMSs (CPU_HARD_PARSE_SEC_FMS). The sqls with literals used most of the CPU for parsing since the CPU used during FMSs execution (CPU_SEC_FMS) is the same as CPU_HARD_PARSE_SEC_FMS.

For some reason 4/29 was different. That day parse related CPU usage disappeared; instead, CPU was used for executing the sqls with the FMSs (CPU_SEC_FMS, yellow line). Any guesses what happened that day? I may write about this in the future.

According to ASH we had 30 sessions executing the 2 FMSs. The sessions were connected from 4 machines; please see query “–parsing_sessions” in rootcause.txt. In my previous post Are we maxed on CPU? we saw the OS load increase by 30 on the slow day; these are the sessions responsible for the load increase.
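
A sketch of that kind of lookup (not the exact “–parsing_sessions” query; it assumes the MACHINE column is available in your ASH version, and &fms1/&fms2 stand for the 2 FMSs):

-- sketch: sessions and machines running the 2 FMSs
select machine,
       count(distinct session_id || ':' || session_serial#) sessions
  from dba_hist_active_sess_history
 where force_matching_signature in (&fms1, &fms2)
 group by machine
 order by sessions desc;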

During the execution of sqls related to the 2 FMSs the sessions were constantly hard parsing. That is a CPU intensive task; see query “–session_state by FORCE_MATCHING_SIGNATURE”.

FORCE_MATCHING_SIGNATURE SESSION EVENT                          I I TOTAL_TIME        CPU    WAITING
------------------------ ------- ------------------------------ - - ---------- ---------- ----------
    10458055140908993565 ON CPU                                 N N      59000      59000          0 batch related
    17710121666287808037 ON CPU                                 Y Y      47000      47000          0 hard parsing FMS
     6747409576012680712 ON CPU                                 Y Y      44270      44270          0 hard parsing FMS
      201958791624409026 ON CPU                                 N N      11100      11100          0
    16583812627080328788 WAITING db file sequential read        N N      13210          0      13210
     1409798687556705245 WAITING read by other session          N N      52440          0      52440

The hard parsing sessions together with the sqls in the batch increased CPU usage up until the DB used all the core capacity available to the lpar.

I am declaring hard parsing of sqls with literals as the root cause.

During my discovery I did a lot of research on (hard) parsing; it is no accident you will see parse_hard & parse_soft hog types in my load testing “hog” script. In a future post I may demonstrate how to generate (and catch) a parse heavy DB load.

At the beginning of my research I ruled out parsing:
– the number of parses (soft and hard) did not change a lot: soft parses decreased by 12%, hard parses increased by 19%
– the machine was running at around 40% CPU utilization during the slow week. I thought we had plenty of unused CPU capacity

I was wrong. Even without batches the load was around 60 on a machine with 128 logical CPUs/threads. On POWER8 that translates to close to 90% core utilization while the OS thinks the CPU is around 50% utilized (60 busy threads on 16 virtual processors is almost 4 threads per core, and per the power8.xlsx table SMT4 already uses about 95% of a core’s capacity, while 60/128 is only about 47% of the logical CPUs). Therefore CPU utilization was under reported; it was misleading.

But it gets worse: the lpar was not able to get full cores for all 16 virtual processors; only around 14 cores were available from the shared CPU pool, so we were beyond 90% of the core capacity available to the lpar.

How is it possible that such a small change in the number of hard parses maxed out the CPUs?
The extra 200 thousand hard parses on the slow day showed up as an increase from 11.4 to 13.7 hard parses per second (2.3 extra per second, roughly 200 thousand over a day); does not look like much, right?
It turns out the problem was not the number of hard parses but the type of sql we parsed. I won’t post the sql to protect the innocent. Each sql was a complex, 20K long sql; the DB processed close to 200 thousand of them per day during our slow week.

During our slow week, on top of normal DB activity and the overnight batches, an application started to bomb the DB with those complex sqls with literals. The 30 hard parsing sessions increased the load on the machine. The POWER8 processor turned on more threads to accommodate the load. Oracle sessions (and the sqls in those sessions) had slower response times because the POWER8 processor was running at a higher SMT level.

The 2nd biggest DB Time increase (after ‘hard parse elapsed’) came from waits on ‘db file sequential read’. A machine with CPU starvation serves disk requests slower, especially when data is coming from cache/memory. See the case study CPU starvation disguised as an I/O issue (yet another AWR case study) by Savvinov for a similar scenario.

During my analysis I got distracted a bit by the intricacies of AIX/POWER. I wasted some time on dead ends. I also learned a lot along the way. Hope you will benefit and your next root cause analysis can conclude in days instead of months.

An abbreviated version of my experience is available in powerpoint format, Oracle on aix v4.pptx.

Root cause part 1

In “What happened to the batch?” I noted the increased load from the DB. Where was the extra load coming from?

Most metrics are the same or lower in the DB load profile for the slow day. In the AWR diff report (awrdiff_0418_0427_dbsidxxx.html) parsing is down by 12% but hard parsing is up by 19%, from 11.4/sec to 13.7/sec. The diff report does not compare the “Instance Efficiency Percentages” section; for that we have to go back to the AWR reports. Between awr_0418_dbsidxxx.html and awr_0427_dbsidxxx.html “% Non-Parse CPU” changed from 92% to 46%… more than half of the CPU was used for parsing on 4/27.

Let’s visualize CPU used for parsing and by the whole DB. Sheet “whole db” in mining_awr.xlsx has both. I also included CPU used by all sqls found in sqlstat… which is not all the sqls executed by the DB… please remind me if you know of an article explaining what goes into AWR with the snapshots.

parse_cpu

DBCPU_sec	DBA_HIST_SYS_TIME_MODEL, 'DB CPU'
PARSE_CPU_S	DBA_HIST_SYSSTAT, 'parse time cpu'
SQL_CPU_SEC	DBA_HIST_SQLSTAT, sum(CPU_TIME_DELTA)
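
These series can be reproduced with queries along these lines (a sketch, not the exact mining_awr queries; DBA_HIST_SYSSTAT values are cumulative, so snapshot deltas are needed, and ‘parse time cpu’ is reported in centiseconds):

-- sketch: 'parse time cpu' in seconds per snapshot (ignores instance restarts)
select s.snap_id,
       s.end_interval_time,
       ( st.value - lag(st.value) over (order by s.snap_id) ) / 100 parse_cpu_s
  from dba_hist_sysstat st
  join dba_hist_snapshot s
    on  s.snap_id = st.snap_id
    and s.dbid = st.dbid
    and s.instance_number = st.instance_number
 where st.stat_name = 'parse time cpu'
 order by s.snap_id;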

Zoom in to see it better parse_cpu.jpg

Around 4/23 parse CPU (PARSE_CPU_S) shot up and became a big part of the overall CPU use by the whole DB (DBCPU_sec). It calmed down for the weekend just to go back up for the whole week of 4/25. Also, it looks like the combined CPU usage from DBA_HIST_SQLSTAT by all sqls (SQL_CPU_SEC) and parse CPU from DBA_HIST_SYSSTAT (PARSE_CPU_S) can’t push overall CPU utilization (DBCPU_sec) higher than 10000 seconds for any 15 min snapshot… core based DB CPU usage is maxed at around 11 sec (=10000/15/60).

The biggest DB Time increase on the slow day (4/27) came from parse elapsed. See the data points in mining_awr.xlsx, sheet “whole db”. This picture shows the correlation between ‘parse time elapsed’ and ‘parse time cpu’.

parse_elapsed

parse_elapsed.jpeg

We have around 25000 seconds of parse related elapsed time per 15 minute snap… that translates to 2.4 million seconds of added DB time from parsing (=25000*4*24) for the day (the length of my AWR reports). It matches the ‘hard parse time elapsed’ increase between awr_0418_dbsidxxx.html and awr_0427_dbsidxxx.html in the Time Model Statistics section.

Check out Variable Load Impact On Oracle Database Hard Parsing by Craig Shallahamer; it explains a scenario with ballooning parse elapsed.

Next I will investigate where those hard parsing sqls came from.