In the previous post I calculated an lpar’s core capacity used and the CPU time accumulated on logical CPUs. My calculation was based on an estimated throughput increase of the POWER processor. I was wondering how close my results were to reality.
By this time I was tired of forking processes and manually collecting and summarizing results. I had to automate this, and so “hog” was born. You can download it here… hog.txt.
“hog” is a Korn shell script designed to run on either Linux or AIX. It should work on both without modification. It can execute several resource-intensive processes (hogs) in a controlled way. Different types of hogs are available to generate load on resources like CPU or disk. It can bind hogs to specific CPUs or let the OS scheduler decide how to spread the load. It will collect and report CPU-related metrics (and more).
You will get the readme when you execute it with no parameters:
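To make the idea of running hogs "in a controlled way" concrete, here is a minimal sketch of starting a fixed number of CPU hogs for a set time and waiting for them to finish. It is not taken from hog itself: the function names `burn_cpu` and `start_hogs` are made up for illustration, a plain shell busy loop stands in for the bc loop that hog uses, and it assumes your `date` supports `+%s`.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual hog code.

# burn_cpu SECONDS: spin in a busy loop until the deadline passes.
# Note: each iteration forks date, so part of the load is system time.
burn_cpu() {
    end=$(( $(date +%s) + $1 ))
    while [ "$(date +%s)" -lt "$end" ]; do
        :
    done
}

# start_hogs COUNT SECONDS: run COUNT hogs in parallel, wait for all
# of them to finish, then print how many were started.
start_hogs() {
    count=$1
    duration=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        burn_cpu "$duration" &   # each hog runs in its own background subshell
        i=$(( i + 1 ))
    done
    wait                         # block until every hog has exited
    echo "$count"
}

# Example: two hogs for one second each.
start_hogs 2 1
```

This sketch leaves CPU placement to the OS scheduler. For binding, the usual tools are `taskset -c` on Linux and `bindprocessor` on AIX, but whether hog uses exactly these is an assumption on my part.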
email@example.com:/home/oracle/tmp/opload [RAC1]> hog

Oracle utilization test on Linux or IAX/POWER/lpar

USAGE
    hog [hog type] [bind 'cpu list'|nobind 'start end incr'] [test]

    hog types:
        bc          hog is using bc to put load on CPU until we stop it
        bc2         bc loop to put load on CPU until set time, collect vmstat data
        bc3         bc loop to put load on CPU until set time, collect vmstat & pprof data (if pprof avail)
        plsql_loop  pl/sql loop running on the db to put load on CPU until set time, collect vmstat & pprof data
        lio         execute sql to generate logical IOs on the db to put load on CPU until set time, collect vmstat & pprof data
        disk_seqr   hogs with indexed reads to generate 'db file sequential read'
        disk_dirpr  hogs generating event 'direct path read'
        parse_hard  loop with hard parsing sql for set time, collect vmstat & pprof data
        parse_soft  loop with soft parsing sql for set time, collect vmstat & pprof data

    bind    bind hogs to specific processors, silently ignored when can't bind
    nobind  start specified number of hogs, let OS pick processors to run them on
    test    dry run

EXAMPLES
    dry run
        hog bc bind '31' test
    start 1 hog, bind it to cpu 31
        hog bc bind '31'
    start 2 hogs, bind them to cpu 30 and 31
        hog plsql_loop bind '30 31'
    start 1 hog, increment by 2 until we reach 6, report in between, let OS assign CPU-s
        hog bc nobind '1 6 2'

OUTPUT... only show those relevant to hog type picked
    hogs     number of hogs started
    app      lprstat, available physical processors in the shared pool (or vmstat's r)
    physc    lprstat, number of physical processors consumed (or vmstat's pc or id)
    lbusy%   lprstat, percentage of logical processor utilization (or vmstat's busy=us+sy)
    os_cpu%  v$osstat, (&BUSY_TIME_2-&BUSY_TIME_1)/100/&SLEEP,2)/&CPU_COUNT*100
    os_cpu   v$osstat, (&BUSY_TIME_2-&BUSY_TIME_1)/100/&SLEEP
    db_cpu   v$sys_time_model, stat_name = 'DB CPU'/&SLEEP
    backgr   v$sys_time_model, stat_name = 'background cpu time'/&SLEEP
    seqr_ms  v$system_event, 'db file sequential read'
    M        effective core, ((&IDLE_TIME_2-&IDLE_TIME_1)/100 + (&BUSY_TIME_2-&BUSY_TIME_1)/100)/&SLEEP
    load     v$osstat, stat_name = 'LOAD'
    aas      v$active_session_history active sessions
    ash_cpu  v$active_session_history with session_state = 'ON CPU'
    ash_prs  v$active_session_history with session_state = 'ON CPU' and IN_HARD_PARSE
    ses_cpu  v$mystat, 'CPU used by this session'
    tm_cpu   v$sess_time_model, stat_name in ('DB CPU')
    ses_prs  v$mystat, 'parse time cpu'
    slio_k   v$mystat, thousands of 'session logical reads'
    tx_s     sum of # of transactions by all hogs per second during the measuring window
    res_ms   response time, time took to run 1 transaction
    time     pprof ACC_time/STP-STT, process (hog) active time on CPU per second
    avg_t    TIME/hogs, active time on CPU for each hog per second
    purr     pprof ACC_time/STP-STT, PURR based process (hog) active time per second
    avg_p    PURR/hogs, PURR based active time on CPU per hog per second

PURR
    Stands for Processor Utilization of Resources Register and its available per Hardware Thread Context. PURR provides an actual count of physical processing time units that a hardware thread has used. The hardware increments for PURR is done based on how each hardware thread is using the resources of the processor core.
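The v$osstat-based columns (os_cpu, os_cpu% and M) are simple arithmetic on two snapshots taken &SLEEP seconds apart. The sketch below works through the formulas from the readme with made-up numbers. The function name `os_metrics` and the sample values are mine, not part of hog. BUSY_TIME and IDLE_TIME are reported by Oracle in hundredths of a second, hence the division by 100. The rounding that the os_cpu% expression hints at is left to `printf` here.

```shell
#!/bin/sh
# Hypothetical sketch: arithmetic behind os_cpu, os_cpu% and M.
# Arguments: BUSY_TIME_1 BUSY_TIME_2 IDLE_TIME_1 IDLE_TIME_2 SLEEP CPU_COUNT
os_metrics() {
    awk -v b1="$1" -v b2="$2" -v i1="$3" -v i2="$4" -v s="$5" -v n="$6" 'BEGIN {
        os_cpu = (b2 - b1) / 100 / s                     # CPU seconds used per second
        os_pct = os_cpu / n * 100                        # same, as % of CPU_COUNT
        m = ((i2 - i1) / 100 + (b2 - b1) / 100) / s      # idle + busy seconds per second
        printf "os_cpu=%.2f os_cpu%%=%.2f M=%.2f\n", os_cpu, os_pct, m
    }'
}

# Sample snapshots 60 seconds apart on a 4-CPU system (invented values).
os_metrics 1000 7000 5000 23000 60 4
# prints: os_cpu=1.00 os_cpu%=25.00 M=4.00
```

With these numbers the system burned 60 CPU seconds in a 60 second window, which is one CPU's worth of work, or 25% of the 4 CPUs Oracle sees. M comes out at 4 because idle plus busy time adds up to 4 CPU seconds for every second of wall clock time.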
While developing “hog” I benefited & borrowed ideas from these excellent sources:
Craig Shallahamer’s OP Load Generator
Karl Arao’s cputoolkit
Kyle Hailey’s Oracle CPU Time
Charles Hooper’s CPU Wait? LAG to the Rescue
See examples in hog in action.txt. Please let me know if you have problems running it in your environment. Use at your own risk. Do not run it in a production environment…unless you know exactly what you are doing.
Next we will put an lpar to the test.