Just a few weeks ago, I had the "pleasure" of working with a customer who had their database SORTHEAP set at 20,000 and SHEAPTHRES set around 950,000! As such, this customer was providing almost 80MB of memory per active sort, and allowing up to 47 of these 80MB sorts to run concurrently! OUCH!
Allowing sorts to complete successfully in SORTHEAP memory is certainly preferable to having sorts OVERFLOW to TEMPSPACE which requires physical I/Os to complete. When sorts OVERFLOW, we can measure the number of Sort Overflows, the percentage of sorts that Overflow as a percentage of Total Sorts, and we can measure the logical and physical I/O costs via:
- Buffer pool temporary data logical reads
- Buffer pool temporary data physical reads
- Buffer pool temporary index logical reads
- Buffer pool temporary index physical reads
However, guess what! When sorts successfully complete within SORTHEAP memory, there are very few measurable indicators of costs - only Total Sorts and Total sort time (ms) are accumulated. THE REST OF THE COSTS ARE DB2'S DEEP DARK SECRET.
Here's the rest of the story - Sorts that occur in SORTHEAP memory consume CPU cycles! Where high sort times go, high CPU utilization follows! To be a successful Performance Hero, you must reduce sort costs and sort times to reduce CPU consumption and improve response times.
In addition to measuring the number of Sorts per Transaction (SRTTX), we also need to measure (use database snapshots or file 'dbsnap2.txt' for your calculations):
- The Sort Overflow Percentage (SRTOVP)
SRTOVP = (Sort overflows * 100) / Total Sorts
- The Average Sort Elapsed Time (SRTMS) per Sort
SRTMS = Total sort time (ms) / Total sorts
- The Average Sort Elapsed Time (SRTMSTX) per Transaction
SRTMSTX = Total sort time (ms) / (Commit statements attempted + Rollback statements attempted)
For OLTP databases, the Sort Overflow Percentage (SRTOVP) should generally be 3% or less. For Data Warehouse databases, a larger percentage of sorts will normally overflow SORTHEAP - try to:
- Reduce and eliminate sorts by adding indexes, clustering indexes, or using MDC tables
- Make sure your TEMPSPACE tablespace is very well tuned with multiple containers spread across different physical devices
For OLTP databases, having SRTMS less than 50ms is desirable and less than 10ms is excellent. SRTMS will generally be higher for data warehouse databases, but you can use this metric to track your tuning effectiveness as you work to reduce sort costs and times.
Shameless Marketing Sound Byte: DBI's Brother-Eagle Standard Edition ( which is FREE ) measures SRTMS, SRTOVP, SRTTX, and several other key metrics for you automatically. And it's free. Did we mention the price? Free.
The Average Sort Elapsed Time per Transaction (SRTMSTX) is an important component for understanding where the time goes inside the database engine. Obviously, the less sort time per transaction then the faster end user transaction response times will be. Reducing sort times also reduces CPU consumption, which frees up more CPU cycles for other work, which generally results in faster response times for all users. SRTMSTX should also be used in combination with read and write I/O times per transaction (as discussed in prior posts), direct I/O times per transaction, lock times per transaction, and CPU time per transaction to determine where the majority of time goes inside the database, and what the real resource bottlenecks are.
How much do sorts really cost? Individual mileage will vary, but you may be pleasantly surprised by your own tuning successes. In some sort performance tests completed recently, a SQL statement caused a Sort Overflow with a 1MB SORTHEAP but completed in memory with a 2MB SORTHEAP (no overflow). Elapsed time improved by 10% and CPU utilization was reduced by 15%. When a Clustering index was implemented to eliminate the sort, Elapsed time improved by about 30% and CPU utilization was cut nearly in half!
The Shameless Marketing Moment
At DBI, our slogan is "Accountability Starts Here". Why is it that whenever there is an application performance performance problem, the database always gets blamed first and the issue lands on the DBA's back? Performance problems don't always belong to the DBA team - sometimes problems are outside the database. DBI's Brother-Thoroughbred™ was specifically designed to help DBA teams manage performance accountability, identify ownership of issues, accurately ascertain resource bottlenecks inside the database, and monitor achievement of Service Level Agreements (SLAs).
This Brother-Thoroughbred image shows the average database transaction response time, how much time is inside the database (owned by the DBA team), and how much time is outside the database.
This Brother-Thoroughbred image shows the time distribution of time spent inside the database. The largest pie slice represents the primary resource bottleneck. Clicking on a pie slice will automatically take you to Brother-Panther's statement analysis sorted according to the resource bottleneck selected. Click, click, solved!
Last week we helped a customer reduce CPU utilization by 40% and cut average transaction response times by roughly 20%. If you're ready to become a Performance Hero in your organization, contact DBI and we'll help you too!
Just for Fun
Read my most recent personal blog post: Man Injured by Tomato Basil Soup, No Data Breach Occurred
I hope you have been enjoying a merry holiday season and that you have a very happy, prosperous, and healthy new year!
President & CEO, DBI
IBM GOLD Consultant