Page 1 of 2
Author: Jason Strate and Ted Krueger
Audience: Performance DBAs/Developers
Reviewer: Ian Stirk
Indexes are a primary means of making performance improvements on SQL Server. This book aims to tell you all you need to know about indexes, and more importantly, provides a methodological approach to index tuning, turning what is often viewed as an art into a science.
Indexes are typically a dry subject matter, however they are fundamental to both understanding the structure of data and helping improve query performance. The examples provided make the subject interesting and accessible.
Before reading the book, I had a list of questions that I hoped would be covered, this would let me know how detailed the book was. It successfully answered most of my questions – showing me the book had both depth and quality. These questions included:
Will it use practical examples to illustrate a point?
Does it mention the plan cache as an input to the DTA?
Does it identify the limit in the number of missing indexes?
Will it tell me how to fix a corrupt index?
Will it provide scripts to automatically optimize my databases?
I did wonder why anyone would want to write a book solely about indexes. After all, if you wanted to know about a given make of car, would you buy a book that’s only about the car’s engine? The answer I think, from a performance perspective, is yes.
Below is a chapter-by-chapter exploration of the topics covered, I’ve given each chapter its own rating.
Chapter 1 Index fundamentals
The chapter provides a comprehensive overview of what indexes are, the different types, and why they are important. The types of index discussed include heaps (sic), clustered, non-clustered, columnstore, xml, spatial, and full-text search. Also covered are primary keys, unique indexes, included columns, partitioned indexes, filtered indexes, and compression.
The various options used for creating an index are covered in good detail. These options include fillfactor, pad_index, sort_in_tempdb, ignore_dup_key, statisitics_norecompute, drop_existing, online, allow_row_locks, allow_page_locks, maxdop, data_compression, on partitions.
A brief overview of the system tables that contain index metadata is provided, including sys.indexes, sys.index_columns, sys.xml_indexes, sys.spatial_indexes, sys.column_store_dictionaries, and sys.column_store_segments.
Throughout the chapter there is an analogy made between indexes and books within a library. For example, the primary catalogue (e.g. book title) being equated with a clustered index, and a secondary catalogue (e.g. author name) with a non-clustered index. This is extended with the example of a large dictionary, where every letter has its own book, to reflect partitioning. Finally microfiche is used to reflect compression. (5/5)
Chapter 2 Index Storage Fundamentals
There is great detail in this chapter, probable more than you would need to know. Parts of it are academic, however it does give you a detailed understanding of how data is physically stored and related.
Rows, pages and extents are discussed, before delving deep into the various page types. The different types of page tie the data structures together, and physically implement the database. Page types covered are: File header, boot, PFS, GAM, SGAM, DCM, BCM, IAM, Data, Index, and LOB (text/image). Index organization in terms of heaps, b-trees and columnar is described.
Various DBCC commands are discussed, and example SQL code provided, to look at the content of individual physical pages. I’ve used this in the past when DBCC CHECKDB has reported corruption. (Unfortunately for me, the typical response of a database restore was not possible due to missing trans logs, but luckily the corruption was limited to a non-clustered index that could easily be rebuilt.)
The chapter continues with a discussion of a major cause of index degradation – fragmentation. It covers forwarded records (for heaps only) and page splits. (5/5)
Chapter 3 Index Statistics
I found this chapter’s title misleading, I was expecting it to cover the index statistics in the traditional sense i.e. information about the density and distribution of column values of an index, these are used by the optimizer to create an efficient access path to the data. However the chapter also included ‘statistics’ related to various index related DMOs (Dynamic Management Objects). Perhaps a better chapter title would have been Index Metrics.
The chapter provides a detailed look at what index statistics are and why they are important. Stale statistics can make SQL that should run in a few seconds take many minutes (or hours) to run. Helpfully, SQL is provided to examine the state of your index statistics.
Usage stats in relation to the DMO, sys.dm_db_index_usage_stats, are discussed next. This DMO contains metrics about how an index is being used (i.e. seek, scan, lookup), in terms of the number of batches that use this index (rather than the number of rows accessed via each method). It also provides associated last accessed dates for each index access method. This can be useful in determining how, when and if an index is used.
Operational stats in relation to the DMO , sys.dm_db_index_operational_stats, are then discussed. This DMO contains metrics about how an index is being used (i.e. seek, scan, lookup), in terms of the number of rows accessed via each method. Additional useful columns relate to row/page locks, lock escalations, latch contention, and compression are discussed in detail.
Finally, index metrics relating to space and fragmentation in the DMO, sys.dm_db_index_physical_stats, are discussed. Fragmentation can result in data taking longer to read since the data is not in the required logical/physical sequence. Fragmentation is typically removed during a maintenance period. The SQL to defragment indexes (via a REBUIL D or REORGANIZE) is given later, in Chapter 6 Index Maintenance.
While it is great having lots of details about the various DMOs, and how the indexes are being used, I wanted more information about how to interpret the data (some is given), and what to do next. (4.5/5)
Chapter 4 XML, Spatial, and Full-Text Indexing
These are non-typical indexes, relating to unstructured data. I think they were included for completeness rather than for detail (after all they contain the word ‘index’). That said, in each case a satisfactory overview is given with examples to get you started in your investigations.
XML indexes allow you to query XML by storing path, tags, and value details in an index. Spatial data relates to the 2008 geometry and geography data types. Full-text indexing relates to indexing text based content/documents. (4/5)