Team with the best
Db2® LUW Performance Tools
company in the World

DB2 LUW Health and Fitness Monitoring

April 17, 2009, 12:24 am
Posted by scott in DB2 Performance Best Practices
I pushed my father in a wheel chair for the first time last month. As I looked at the man who was my Scout Master, who helped me achieve Eagle rank, who taught me to bowl, and inspired me with strong core values of helpfulness and community service, these wheel chair excursions marked a pivotal inspiration point for contemplating my own current health, and have given rise to my own self-improvement goals that you can read about in my personal blog. This year's new IDUG presentations and Ed Seminars will also include new information on DB2 Health monitoring. So, in this blog post, let's look into the health and fitness of your databases.
If you've been to a hospital or watched any popular TV shows, you've probably seen someone hooked up to a lot of monitoring equipment. This equipment monitors heart rate and breathing, and possibly other basic indicators of life. When the heart stops, alarms sound off and there is a rush of effort to revive the patient (I like Grey's Anatomy ).

At the most basic level, for some organizations, DB2 health monitoring is simply looking for the presence of a heart beat (can we connect to the database? ) and the absence of potentially fatal problems (are we running out of disk space or memory? ). However, the presence of life in a human does not infer readiness to run a marathon, nor does the absence of fatal problems in a database mean that the database is really ready to serve the demands of your business. Hence, the title for this blog post "DB2 LUW Health and FITNESS Monitoring."

The IBM DB2 LUW Health Monitor

I've heard varying opinions about its usefulness. So I decided to do my own due diligence by using it myself, reading the IBM documentation, and reading blogs of others in the DB2 community.

The Health Monitor is inextricably built into DB2 and it's free. In the current economy, free seems to be a fairly popular price. Ultimately, it will be up to you to decide the value of it to your organization by doing your own due diligence.

The DB2 Health Monitor keeps tabs on a pre-defined set of health indicators - most of which are indicative of "life or death" and testing for the absence of critically fatal problems. I'd like to see the Health Monitor include more functionality for monitoring wellness, fitness, and efficient performance, beyond the basic "life or death" indicators.

That said, the following Health Monitor indicators do provide some wellness, fitness, and efficiency clues:

  • db.tb_reorg_req (Reorganization Required)
  • db.tb_runstats_req (Statistics Collection Required)
  • db.catcache_hitratio (Catalog Cache Hit Ratio)
  • db.pkgcache_hitratio (Package Cache Hit Ratio)
  • db.spilled_sorts (Spilled Sorts, or Sort Overflows)

The DB2 Health Monitor is turned OFF or ON for a DB2 instance via the DBM CFG switch HEALTH_MON.

Use the command:

db2 "GET DBM CFG"
to see if the Health Monitor is currently ON or OFF.

To turn it on:

db2 "UPDATE DBM CFG USING HEALTH_MON ON"
To turn it off:
db2 "UPDATE DBM CFG USING HEALTH_MON OFF"

Obtaining information about the status of the health indicators is easily done with the command:

db2 “get health snapshot for db on DBNAME [show detail | with full collection]”

The "show detail" option gives about 4.5 hours of recent history for the indicators. Click this link to see sample output.

As an alternative to "show detail", the option "with full collection" can be used. This option will show you the status of all indicators, objects, and tables in painful detail for REORG and RUNSTATS criteria. You will have to redirect your output to a file, and the file will probably be very large.

To check the IBM supplied indicators for the database manager, use the command:

db2 "get health snapshot for dbm"

This will show you if DB2 is up (alive), the time DB2 started, and monitor heap utilization.

If you like using the SQL Snapshot capabilities of DB2, the health monitor data can also be retrieved via SQL Snaps:

db2 “SELECT * FROM TABLE( HEALTH_DB_INFO ( cast (NULL as VARCHAR(1) ) , -1) ) as HEALTH_DB_INFO”

And, if graphical user interfaces are your preference, the DB2 Health Center makes it relatively easy to view indicator status and update indicator thresholds:

START > All Programs > IBM DB2 > DB2COPY1 > Monitoring Tools > Health Center

The indicator threshold values can also be updated with command line commands such as:

db2 “UPDATE ALERT CFG FOR DATABASES USING db.lock_escal_rate SET ALARM 10, WARNING 5, SENSITIVITY 0, THRESHOLDSCHECKED YES , ACTIONSENABLED NO”

The Health Monitor can also be configured to send email alerts to specified email addresses. Unless you want to look at the Health Center all day or repeatedly run and review the command outputs, setting up email notifications to grab your attention makes sense. But, be careful that you set the thresholds to reasonable values and only enable those indicators that are important to your business - otherwise you might drown in alert noise.

Are you using the IBM DB2 Health Monitor? Do you like it? Is it helpful? Or do you have it shut off? Why? Please consider posting a comment to share your story with your peers.

Beyond Basic Health Monitoring: FITNESS Monitoring

Previous blog posts in this series have described some additional indicators that should be periodically reviewed to determine if your database is running efficiently:

  • Index Read Efficiency (IREF) - the ratio of Rows Read to Rows Fetched. A high ratio implies that necessary indexes are missing, or current indexes are not well defined.
  • Synchronous Read Percentage (SRP) - the percentage of database page reads that are completed synchronously. A high percentage indicates that good indexes are in place and a low percentage means that the database is satisfying queries by performing expensive data scans.
  • Table Rows Read per Transaction (TBRRTX) - the average number of Rows Read per Transaction for each table. A value greater than 10 in an OLTP database means it is highly likely that costly scans are occurring against the table.
  • Bufferpool Logical Index Reads per Transaction (BPLITX) - the average number of index pages read per Transaction. When BPLITX is greater than 6 X the number of Selects per Transaction, it is quite likely that costly leaf page scans are occurring and higher quality indexes are needed.
  • Asynchronous Pages read Per Request (APPR) - the average number of pages successfully read into the bufferpool for each prefetch request. For Data Warehouse databases with heavy asynchronous prefetch I/O, this metric indicates if prefetching is working effectively or not.

The five FITNESS indicators shown above can easily help you understand if your database is operating efficiently or not. In addition to these, consider evaluating aggregated SQL costs to verify that no single statement is using in excess of 10% of the database CPU during the prior 15, 30, or 60 minutes. Also look at Transaction Time Analysis to verify that response times are within business SLA (Service Level Agreement) requirements.

By evaluating FITNESS indicators in addition to basic health indicators, you will achieve a more complete picture of your organization's database wellness and ability to run marathons. By automating lights out alerting with email alerts for health and fitness indicators, productivity can be improved by directing your attention to the databases in need of help. And these days, with shrinking staff sizes and "IT Cost Optimization" in fashion, higher automation and productivity are business imperatives.

The Shameless Marketing Moment

Is your organization looking to Optimize IT Costs through automation?

DBI's Brother-Hawk™ for DB2 fully automates basic and fitness health indicator alerting with Email and/or SNMP Trap alerts so you can focus your attention on databases that truly need DBA attention.

Brother-Hawk Logo Alerts can be based on real-time metrics or computed near real-time metrics (recent history). Dozens of alert criteria are predefined with industry best practices thresholds. Advanced alerts such as "Notify me if a single SQL statement uses more than 20% of the CPU during the prior 30 minutes" or "Notify me if fewer than 95% of transactions have completed in less than one second" are available. Brother-Hawk is highly customizable and extensible - you can add any alert criteria that is important to your business and set alert rules for various timeframes and days of the week. And, an industry unique alert de-duplication feature can automatically suppress excessive alert noise. Visit www.DBIsoftware.com/brother-hawk.php to learn more.

Just for Fun

Let's connect on LinkedIn. You can find my profile at www.linkedin.com/in/scottrhayes - view my full profile to check out my reading list and groups. When you send your connection request, please indicate that you are a DB2 LUW Performance blog reader so that I don't accidentally click the "I don't know you" button. If you find these blog posts helpful or if I have helped you in other ways, I'd be very thankful to receive your recommendation and will reciprocate if I know you well.

With kindest regards,
Scott

Scott Hayes
President & CEO, DBI
Follow Me on Twitter
IBM GOLD Consultant
IBM Data Champion
DBI is an IBM Advanced, Industry Optimized, Business Partner
DBI is an Oracle Technology Network Partner
DBI provides products and services to over 2,000 customers worldwide
www.Database-Performance.info
www.Database-Auditing.info
Your Performance IS Our Business

DBI Logo

Printer friendly