Team with the best
Db2® LUW Performance Tools
company in the World

DB2 LUW Performance: Let the Elephant Hunt Begin

April 15, 2009, 11:43 pm
Posted by scott in DB2 Performance How-To
Hopefully you read the prior blog post on elephants and mosquitoes which discusses different approaches to statement performance analysis. If you haven't read it yet, please do so now. In this post, we will look at ways to hunt and kill your elephants. And, by no means do I favor cruelty to animals in any way, this is just a metaphor. Let us imagine that your phone just rang and your boss is screaming "What's happening RIGHT NOW?!?!?!!?!?" ...
We can find relevant information from several sources. Many DBAs begin by running the command:
db2 "list applications show detail"

This will give you a hint as to which connections are currently executing and if there are a number of connections in a lock wait status.

If you observe lock waits, you should next take a deeper look into the lock contention with the command:

db2 "get snapshot for locks on DBNAME"

This verbose report will show you which application connections are holding locks and which are waiting. Compounding matters, some of the connections that are waiting may have other connections waiting on them. A real snowball effect can occur if you have LOCKTIMEOUT set to -1 (infinity, never time out) or set higher than 30 seconds.

In the case of lock contention, your remedy usually involves killing the elephant with the command:

db2 "force application (application-handle) "

You are more likely to have lock contention issues in an OLTP database than a Data Warehouse (DW) database. OLTP databases are more prone to mosquito infestations than elephants, but the occasional elephant has been known to wreak havoc in the best tuned OLTP databases. Your DW databases, in the absence of a optimum physical design, are highly prone to the type of elephant that every DBA fears most - the type of long running, resource consuming query that sucks the life out of your system, makes the lights go dim, and causes angry phone calls demanding to know "What's happening right now??!!??!!"

OK, so, assuming your database is free of lock contention elephants, let us next try to find and kill the ones that are sucking the life out of your system.

To find details about connections that currently exist in your database, you need to use the command:

db2 "get snapshot for applications on DBNAME" Click Here to see sample output from this command

If you have a DB2 9 database, a SQL snapshot command can be used as an alternative:

db2 "select * from SYSIBMADM.SNAPAPPL"

Of course, you will likely want to request specific columns and add sort criteria to the above query, but first let's chat about the important numbers. You want to look for application connections that have high costs and inefficient ratios:

  • High Rows Read, and possibly high Rows Selected - although Rows Selected won't begin to increment until rows actually begin to be returned to the application
  • High Total Sorts - not all elephant queries perform sorts but most do
  • High Sort Time (ms), and high average Sort Time (Sort Time/Total Sorts)
  • High sort overflows
  • High Buffer pool data and index logical and physical reads
  • The UOW start timestamp is known but the UOW stop timestamp is NULL or not specified -- this will indicate that a transaction is currently in progress despite the fact that Application Status may show "UOW Waiting"
  • High Total User + System CPU Time used by agent(s)
  • High SQL compiler cost estimate in timerons
  • High SQL compiler cardinality estimate
    • The percentage of Rows Selected over the SQL compiler cardinality estimate might reveal how close the query is to completing (Rows Selected X 100 / Cardinality Estimate). If the percentage is high, you might let the elephant finish its damage rather than killing a user's query
  • High Index Read Efficiency (IREF) = (Rows read / Rows fetched)
  • If you have been notified that TEMPSPACE storage is running low, look for:
    • High number of Rows Written
    • High Total buffer pool write time (milliseconds)
    • A large number of Buffer pool temporary data and index logical and physical reads. Unfortunately, DB2 does not provide data about Buffer pool temporary data and index writes - which would be more telling of the culprit that is consuming your TEMPSPACE (Hey Toronto lab - HINT HINT HINT)
  • And, of course, we want to see:
    • The statement text
    • Who the guilty user is (CONNECT Authorization ID and Client login ID)
    • The user's phone number, email address, education level and training, and their title or position in the organization - but, alas, this information is not available from the snapshot so you may need to make some phone calls before you kill your elephant and ruin a C-Level executive's day (this is also sometimes called a "career limiting move")
Once you've identified your deadliest elephant and isolated "WHAT'S HAPPENING RIGHT NOW??!!!?!?!?!", you may want to kill the elephant unless it belongs to your CEO:
db2 "force application (application-handle)"

Before we wrap this up, let's return to that DB2 9 SQL Snapshot Query and make it a little more useful:

SELECT SUBSTR ( DB_NAME,1,8 ) AS DB_NAME, AGENT_ID AS FORCEAPPLID, ROWS_READ, ROWS_WRITTEN, (ROWS_READ / (ROWS_SELECTED + 1)) AS IREF, (AGENT_USR_CPU_TIME_S + AGENT_SYS_CPU_TIME_S) AS CPUSEC FROM SYSIBMADM.SNAPAPPL WHERE UOW_STOP_TIME IS NULL ORDER BY ROWS_READ DESC

This query may help you identify your elephants and show you the Application ID that you would want to FORCE. I added +1 to ROWS_SELECTED because I'm lazy and didn't want to write a CASE statement.

I fully realize there are many smart people worldwide who read this blog, and I am sure some of you have your own "war chest" of your favorite queries that you use. If you like, please post a comment and share your favorites. Posting anonymously (no fancy account required) only takes a minute.

Closing remarks

Some clients and customers tell me they'd like a performance monitoring tool that would show them every SQL statement, its resources consumed, who did it, when, and from where. Folks, this is Database Activity Monitoring or Auditing. If you want to see every statement, then you are obsessed with elephant hunting. Elephant hunting is not workload performance management and does not provide long term tuning and optimization. The next blog post on mosquito extermination will more fully address statement workload analysis and long term optimization for greater efficiency, speed, and performance reliability.

The Shameless Marketing Moment

Brother-Eagle Logo DBI's Brother-Eagle™ for DB2 LUW is a real-time monitor for DB2 database performance that automatically computes several key metrics for you. Best of all, the Standard Edition is FREE. Enterprise Edition enables additional drill downs to connection and lock activity, and you can easily FORCE connections if you like. So, if ELEPHANT HUNTING is your sport, Brother-Eagle® is your ideal tool.

Just for Fun!

Read my other blog for more thoughts and topics.

If you use the internet and enjoy downloading movies, music, and more, the subject of Net Neutrality may interest you. Visit www.savetheinternet.com and learn more.

With kindest regards,
Scott

Scott Hayes
President & CEO, DBI
IBM GOLD Consultant
DBI is an IBM Advanced, Industry Optimized, Business Partner
DBI is an Oracle Technology Network Partner
DBI provides products and services to over 2,000 customers worldwide
www.Database-Performance.info
www.Database-Auditing.info
Your Performance IS Our Business

DBI Logo

Printer friendly