hog

In the previous post I calculated an lpar’s core capacity used and the CPU time accumulated on logical CPUs. My calculation was based on the estimated throughput increase of the POWER processor. I was wondering how close my results were to reality.

By this time I was tired of forking processes and manually collecting and summarizing results. I had to automate this, and so “hog” was born. You can download it here… hog.txt.

“hog” is a Korn shell script designed to run on either Linux or AIX without modification. It can execute several resource-intensive processes (hogs) in a controlled way. Different types of hogs are available to generate load on resources such as CPU or disk. It can bind hogs to specific CPUs or let the OS scheduler decide how to spread the load. It collects and reports CPU-related metrics (and more). It is a single self-contained script for simple deployment.
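To make the idea concrete, here is a minimal sketch of starting a number of CPU hogs for a fixed time and optionally binding them to processors. It is not taken from hog itself: the function names (burn, bind_hog, run_hogs) are made up for illustration, and a plain shell busy loop stands in for the bc loop that the real script uses. The binding step assumes taskset on Linux and bindprocessor on AIX, and, like hog, it silently ignores a failed bind.

```shell
#!/bin/sh
# Illustrative sketch only; names and structure are assumptions, not hog's code.

# burn SECONDS: spin in a busy loop until SECONDS have elapsed.
burn() {
    end=$(( $(date +%s) + $1 ))
    while [ "$(date +%s)" -lt "$end" ]; do :; done
}

# bind_hog PID CPU: try to pin PID to CPU; any failure is silently ignored.
bind_hog() {
    case "$(uname -s)" in
        Linux) taskset -cp "$2" "$1" >/dev/null 2>&1 ;;
        AIX)   bindprocessor "$1" "$2" >/dev/null 2>&1 ;;
    esac
    return 0
}

# run_hogs SECONDS N [CPU ...]: start N hogs; if CPUs are listed,
# bind the first hog to the first CPU, the second to the second, and so on.
run_hogs() {
    secs=$1
    n=$2
    shift 2
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        burn "$secs" &
        pid=$!
        pids="$pids $pid"
        if [ $# -gt 0 ]; then
            bind_hog "$pid" "$1"
            shift
        fi
        i=$((i + 1))
    done
    echo "started $n hogs:$pids"
    wait $pids
    echo "all hogs finished"
}

# Two unbound hogs for one second; the OS scheduler places them.
run_hogs 1 2
```

Calling run_hogs 1 2 30 31 instead would attempt to bind the two hogs to CPUs 30 and 31, roughly what hog's bind '30 31' option describes.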

You will get the readme when you execute it with no parameters:

oracle@ol6-112-rac1.localdomain:/home/oracle/tmp/opload
[RAC1]> hog

Oracle utilization test on Linux or IAX/POWER/lpar

USAGE
        hog [hog type] [bind 'cpu list'|nobind 'start end incr'] [test]

hog types:
        bc              hog is using bc to put load on CPU until we stop it
        bc2             bc loop to put load on CPU until set time, collect vmstat data
        bc3             bc loop to put load on CPU until set time, collect vmstat & pprof data (if pprof avail)
        plsql_loop      pl/sql loop running on the db to put load on CPU until set time, collect vmstat & pprof data
        lio             execute sql to generate logical IOs on the db to put load on CPU until set time, collect vmstat & pprof data
        disk_seqr       hogs with indexed reads to generate 'db file sequential read'
        disk_dirpr      hogs generating event 'direct path read'
        parse_hard      loop with hard parsing sql for set time, collect vmstat & pprof data
        parse_soft      loop with soft parsing sql for set time, collect vmstat & pprof data

        bind    bind hogs to specific processors, silently ignored when can't bind
        nobind  start specified number of hogs, let OS pick processors to run them on

        test    dry run

EXAMPLES
        dry run
                hog bc bind '31' test
        start 1 hog, bind it to cpu 31
                hog bc bind '31'
        start 2 hogs, bind them to cpu 30 and 31
                hog plsql_loop bind '30 31'
        start 1 hog, increment by 2 until we reach 6, report in between, let OS assign CPU-s
                hog bc nobind '1 6 2'

OUTPUT... only show those relevant to hog type picked
        hogs    number of hogs started
        app     lprstat, available physical processors in the shared pool (or vmstat's r)
        physc   lprstat, number of physical processors consumed (or vmstat's pc or id)
        lbusy%  lprstat, percentage of logical processor utilization (or vmstat's busy=us+sy)
        os_cpu% v$osstat, (&BUSY_TIME_2-&BUSY_TIME_1)/100/&SLEEP,2)/&CPU_COUNT*100
        os_cpu  v$osstat, (&BUSY_TIME_2-&BUSY_TIME_1)/100/&SLEEP
        db_cpu  v$sys_time_model, stat_name = 'DB CPU'/&SLEEP
        backgr  v$sys_time_model, stat_name = 'background cpu time'/&SLEEP
        seqr_ms v$system_event, 'db file sequential read'
        M       effective core, ((&IDLE_TIME_2-&IDLE_TIME_1)/100 + (&BUSY_TIME_2-&BUSY_TIME_1)/100)/&SLEEP
        load    v$osstat, stat_name = 'LOAD'
        aas     v$active_session_history active sessions
        ash_cpu v$active_session_history with session_state = 'ON CPU'
        ash_prs v$active_session_history with session_state = 'ON CPU' and IN_HARD_PARSE
        ses_cpu v$mystat, 'CPU used by this session'
        tm_cpu  v$sess_time_model, stat_name in ('DB CPU')
        ses_prs v$mystat, 'parse time cpu'
        slio_k  v$mystat, thousands of 'session logical reads'
        tx_s    sum of # of transactions by all hogs per second during the measuring window
        res_ms  response time, time took to run 1 transaction
        time    pprof ACC_time/STP-STT, process (hog) active time on CPU per second
        avg_t   TIME/hogs, active time on CPU for each hog per second
        purr    pprof ACC_time/STP-STT, PURR based process (hog) active time per second
        avg_p   PURR/hogs, PURR based active time on CPU per hog per second

PURR   Stands for Processor Utilization of Resources Register and its available per Hardware Thread Context.
        PURR provides an actual count of physical processing time units that a hardware thread has used.
        The hardware increments for PURR is done based on how each hardware thread is using the resources of the processor core.
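The v$osstat-based columns in the readme above (os_cpu, os_cpu% and M) are simple arithmetic on two snapshots taken &SLEEP seconds apart. A small worked example, using made-up snapshot values and assuming BUSY_TIME and IDLE_TIME are reported in hundredths of a second (which is what the division by 100 implies), might look like this. The function name os_cpu_metrics is hypothetical and awk is used only for the floating-point math.

```shell
#!/bin/sh
# Illustrative arithmetic only; the input numbers below are invented.

# os_cpu_metrics BUSY_1 BUSY_2 IDLE_1 IDLE_2 SLEEP CPU_COUNT
os_cpu_metrics() {
    awk -v b1="$1" -v b2="$2" -v i1="$3" -v i2="$4" -v secs="$5" -v cpus="$6" 'BEGIN {
        # CPU seconds consumed per second of wall clock time
        os_cpu = (b2 - b1) / 100 / secs
        # busy plus idle time per second, i.e. the "effective core" count
        m = ((i2 - i1) / 100 + (b2 - b1) / 100) / secs
        printf "os_cpu=%.2f os_cpu%%=%.2f M=%.2f\n", os_cpu, os_cpu / cpus * 100, m
    }'
}

# 30 s window, 4 CPUs: busy grew by 3000 cs, idle by 9000 cs.
os_cpu_metrics 120000 123000 500000 509000 30 4
# prints: os_cpu=1.00 os_cpu%=25.00 M=4.00
```

With these numbers one CPU's worth of time was busy out of four, so os_cpu% comes out at 25, and busy plus idle adds up to the four CPUs reported as M.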

The script was developed and tested on AIX and Linux with Oracle 11gR2; you may need to make minor adjustments for other OS/DB versions.

uname -vM
7 IBM,8286-42A
 
uname -a
Linux ol6-112-rac1.localdomain 2.6.39-200.24.1.el6uek.x86_64 #1 SMP Sat Jun 23 02:39:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

While developing “hog” I benefited from, and borrowed ideas from, these excellent sources:
Craig Shallahamer’s OP Load Generator
Karl Arao’s cputoolkit
Kyle Hailey’s Oracle CPU Time
Charles Hooper’s CPU Usage Monitoring – What are the Statistics and when are the Statistics Updated?

See examples in hog in action.txt. Please let me know if you have problems running it in your environment. Use at your own risk. Do not run it in a production environment…unless you know exactly what you are doing.

Next, we will put an lpar to the test.