Great webinar today, thank you. A quick question on clustering indexes. When trying to determine the best columns by looking at the highest aggregate sort time, what should one do if there is already a cluster index on the table? That is, what to do if the existing cluster index is suspected to not be the right one? Here's my answer...
I love this question - it is a good one!
Given that a CLUSTERING index already exists, is it clustered? What is its cluster ratio or cluster factor? Is the table being periodically reorganized according to the existing clustering index? If the table is not regularly being reorganized according to the clustering index, or if the cluster ratio or factors are low, then this would weigh in favor of the current clustering index not being the right, or optimized, clustering index.
Sometimes the mistake is also made of making the primary key (and its unique index) be the CLUSTER index as well. This is the wrong thing to do. If this is the case, then your current clustering index is incorrect. UNIQUE indexes have their RIDS maintained in sorted sequence amongst the index leaf pages, so sometimes just having a UNIQUE index in place is sufficient to help DB2 avoid a sort.
In fact, UNIQUE indexes are kind of a sneaky way of getting DB2 to have more than one clustering index.
As for your next steps, after looking into the aforementioned points, you might consider:
- Using your DBI tools, find the SQL that is driving the I/O to the table in question. Review the statement texts to see if any statements have an ORDER BY or GROUP BY clause that matches the definition of your current clustering index. If so, maybe the current clustering index is the right one (provided that it is not also defined as UNIQUE).
- Using your DBI tools, pass all of the statements that are driving I/O to the table to the IBM Design Advisor (Design Analysis option) and ask for MDC advice. Also check the boxes to preserve the contents of the Explain tables. The Design Advisor might give you an MDC recommendation or a CLUSTER index recommendation - this would be another sign that you don't have the right CLUSTER index currently in place.
- After the Design Advisor executes, look at your Explain tables to see if you can determine if the current CLUSTER index is being used. If it is not, you definitely do not have the optimum CLUSTER index in place and the current index can be dropped with certainty. This assumes, however, that you have analyzed a statement workload that spans a significant period of time - such as all statements driving I/O to the table during the past month.
- You could drop the current CLUSTER index and replace it with a regular non-clustered index. After doing so, capture additional SQL workloads and analyze the sort costs of all statements driving the I/O to the table in question. The statement(s) showing the highest percentage of aggregate sort time costs will reveal to you the optimized definitions for the ideal CLUSTER index.
- Implement a new CLUSTER index and reorganize the table data according to that new index, and remember to run the RUNSTATS utility "with distribution on all columns and detailed indexes all".
- Use your DBI tools to measure the performance effectiveness of the new CLUSTERING index - use Database Performance Trend charts to look for a reduction in sorts, and Statement Performance trend charts to look for reductions in sorts, sort time, CPU time, and Avg Elapsed time.
If you are reading this blog and would like to watch a recorded replay of this Webinar, it can be obtained by clicking the following link or directing your browser to: