Authors: Jason Strate and Ted Krueger
Audience: Performance DBAs/Developers
Reviewer: Ian Stirk
Chapter 5 Index myths and best practices
This chapter was a nice bit of fun: short, sharp and quite revealing. The following myths were discussed (typically illustrated with practical examples):
Databases don’t need indexes
Primary keys are always clustered
Online index operations don’t block
Any column in a multicolumn index can be used as a filter
Clustered indexes store rows in physical order
Fill factor is applied on row inserts/updates
All tables should have heap/clustered index
This was followed by some index best practices. Maybe the best advice given is that you should always test things yourself! The best practices included:
Use clustered indexes on Primary Key by default
Balance index counts (i.e. reads < updates)
Set FillFactor for indexes individually
Index Foreign Keys
Continuously review your index environment
Chapter 6 Index Maintenance
As data changes, indexes can degrade, which can reduce query performance. This chapter is about ensuring your indexes are optimal. The first section is concerned with physical and logical fragmentation. SQL to optimise the index is given. As throughout the book, there are some great examples to illustrate the points made.
Similar to index fragmentation maintenance, statistics maintenance is also discussed. Typically an index’s statistics are automatically updated when 20% of its underlying data changes. While this is fine in most cases, there are times when the statistics may need to be updated manually. Imagine an index with 100 million rows: if 1 million rows change each day, it would take about 20 days for the index’s statistics to be updated automatically, potentially giving sub-optimal performance until then. The use of maintenance plans and a custom update script is explained. (5/5)
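The wait implied by the 20% rule can be estimated with a little arithmetic. The sketch below is only an illustration: the function name and parameters are hypothetical, and it assumes the simplified 20% threshold quoted above and a steady daily change rate. Real workloads vary, so the actual wait may be shorter or longer.

```python
import math

def days_until_auto_update(row_count, rows_changed_per_day, threshold_percent=20):
    """Estimate how many days pass before enough rows change to trigger
    an automatic statistics update.

    Assumption: statistics refresh once `threshold_percent` of the rows
    have changed, and rows change at a constant daily rate.
    """
    if rows_changed_per_day <= 0:
        raise ValueError("rows_changed_per_day must be positive")
    # Number of changed rows needed to reach the threshold.
    rows_needed = math.ceil(row_count * threshold_percent / 100)
    # Whole days needed to accumulate that many changes.
    return math.ceil(rows_needed / rows_changed_per_day)

# The review's example: 100 million rows, 1 million changes per day.
print(days_until_auto_update(100_000_000, 1_000_000))
```

For a table of this size, an estimate like this is a reasonable prompt to consider a scheduled manual update rather than waiting for the automatic one.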
Chapter 7 Indexing Tools
This chapter discusses in depth the use of the missing index DMOs and Database Engine Tuning Advisor (DTA) to find what indexes your systems need. In both cases, the recommendations need to be consolidated and considered, rather than being blindly implemented.
Of the two, the missing index DMOs are simpler to use; you can easily identify the more obvious indexes that are missing from your system. However, they do have more limitations (e.g. column order is not specified, and only non-clustered indexes are considered). DTA is more comprehensive, provided you supply it with a trace workload file that contains all the relevant queries you want to optimize. The DTA can recommend clustered and non-clustered indexes, partitioning, and both the creation of new indexes and the dropping of existing ones. It was good to see a mention of the use of the plan cache as a new input to the DTA, a great feature that is often neglected. (5/5)
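To make "consolidated" concrete, here is a minimal sketch of one simple heuristic: merge suggestions that target the same table with the same key columns, and union their included columns. It is not the book's method. The dictionary fields and the sample table and column names are hypothetical stand-ins for whatever you extract from the missing index DMOs.

```python
from collections import defaultdict

def consolidate(suggestions):
    """Merge index suggestions that share a table and key columns.

    Each suggestion is a dict with hypothetical fields:
    'table', 'key_columns' (ordered sequence) and 'included_columns'.
    """
    merged = defaultdict(set)
    for s in suggestions:
        # Suggestions with identical table and key columns are combined.
        merged[(s["table"], tuple(s["key_columns"]))] |= set(s["included_columns"])
    return [
        {"table": table, "key_columns": keys, "included_columns": sorted(cols)}
        for (table, keys), cols in merged.items()
    ]

# Hypothetical suggestions: the first two differ only in included columns.
suggestions = [
    {"table": "Orders", "key_columns": ["CustomerId"], "included_columns": ["OrderDate"]},
    {"table": "Orders", "key_columns": ["CustomerId"], "included_columns": ["Total"]},
    {"table": "Orders", "key_columns": ["OrderDate"], "included_columns": []},
]
for index in consolidate(suggestions):
    print(index)
```

The merged list is still only a set of candidates; each one needs to be considered and tested before it is created.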
Chapter 8 Index Strategies
This chapter expands on the earlier chapters and provides more index usage information, including the use of heaps, clustered indexes, primary keys, foreign keys, GUIDs, columnstore indexes, included columns, filtered indexes, compression, and indexed views. Throughout the chapter the importance of testing is emphasised, and SET STATISTICS IO is used to record and compare metrics.
There is a very nice point that we are all using heaps more than we think, since temporary tables are by default heaps! This expands further to explain that heaps are often not so harmful, unless you do further filtering or sorting. There is an interesting point about testing the use of a heap that has a non-clustered index, and comparing this to a clustered index.
There is a minor error on page 216: it says filtered indexes were introduced in SQL Server 2005, but they were introduced in SQL Server 2008. (5/5)
Chapter 9 Query Strategies
This chapter looks at aspects of SQL queries that can prevent otherwise useful indexes from being used. Although the chapter is short, the approach is excellent: it is scientific, based on testing to reduce the number of reads.
The query aspects covered are:
I loved the examples in this chapter. I had wanted it to be longer, but I couldn’t think what else they could have included (related to indexes specifically). (5/5)
Chapter 10 Index Analysis
In many ways, this chapter is what the whole of the book has been building towards. It aims to provide a comprehensive approach to monitoring, analyzing, and implementing indexes. Ultimately, this chapter aims to provide you with the indexes your systems need.
The index life cycle is divided into three phases: monitoring, analysis, and implementation.
The monitoring phase covers the use of perfMon counters, DMOs, wait stats, and SQL trace. In each case, details of what to collect, their meaning and context, are explained. Sample SQL is provided to get you started in collecting data periodically. Some correlations between the perfMon counters and wait stats are discussed.
The analysis phase uses as its input the output from the monitoring phase. It is good to see expected values for various perfMon counters; these should prove useful in highlighting problem conditions. The various common wait stats are explained. For more information on interpreting the meaning of waits and perfMon counters, see Tom Davidson’s seminal paper SQL Server 2005 Waits and Queues, which is still relevant today!
I had expected the Performance Analysis of Logs (PAL) tool to be used on the perfMon counters, to automatically highlight the major problems on the box. This is a great tool and should be included in your performance toolkit.
SQL is given to identify heaps, duplicate indexes and overlapping indexes. The SQL code works if you follow the approach given, but for a more generic (and useful?) version you can replace the text:
There’s some useful SQL for identifying un-indexed foreign keys (but not for tables without primary keys).
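The difference between duplicate and overlapping indexes can be sketched in a few lines. This is an illustration under an assumed working definition, not the book's SQL: two indexes on the same table are treated as duplicates when their key columns match in the same order, and as overlapping when one index's key columns form a leading prefix of the other's. The function name and column names are hypothetical.

```python
def classify(index_a, index_b):
    """Compare the key-column tuples of two indexes on the same table.

    Returns 'duplicate', 'overlapping' or 'distinct' under the assumed
    definitions: identical ordered keys are duplicates; a leading-prefix
    relationship counts as overlapping.
    """
    if index_a == index_b:
        return "duplicate"
    shorter, longer = sorted((index_a, index_b), key=len)
    # Column order matters: (a, b) covers (a,) but not (b,).
    if longer[:len(shorter)] == shorter:
        return "overlapping"
    return "distinct"

print(classify(("CustomerId", "OrderDate"), ("CustomerId", "OrderDate")))
print(classify(("CustomerId",), ("CustomerId", "OrderDate")))
print(classify(("OrderDate",), ("CustomerId", "OrderDate")))
```

A real check would also have to account for included columns, filters and uniqueness before any index is dropped.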
The authors make an excellent point about using DTA. DBAs might want to prove their mettle by examining the data and manually creating the indexes. But often using the DTA may be a better option. A detailed walkthrough of using the DTA via the command line is provided.
The implementation phase is relatively small. It discusses the importance of communication with regard to impact analysis and status reports. Deployment and rollback scripts are briefly discussed, together with the importance of source control and having repeatable scripts. The last section concerns script execution.
There’s a nice section about using Twitter for help, with some of the more useful hashtags given (#sql, #sqlserver, #sqlhelp), together with the authors’ own accounts. The improvement cycle never stops of course, since data and usage change. At the end of this process (monitor, analyze, implement) it’s time to start the cycle again.
Did this chapter fulfil its aim? There’s a song by The Jam called The Butterfly Collector, a truly beautiful song, but it is just below the standard that makes a song a classic; you can feel it almost touching this imaginary mark... In many ways, that’s how I feel about this chapter: it is almost perfect, but I still have to do a fair amount of analysis and investigation to get the optimal indexes. What is missing is this: I want some software to look at SQL Server tables and indexes, stats, DMOs, SQL queries etc., and then, at the push of a button, give me the optimal data structures for these queries. Maybe the DTA is the closest we will come to this? Maybe I am asking too much? (4.5/5)
This book probably covers everything you would want to know about indexes. It has great depth and range, is full of relevant examples, and takes a methodical approach to performance tuning using indexes. The book is an excellent resource and will be useful to anyone looking to improve the performance of their SQL Server databases.