Page 5 of 5
Author: Kalen Delaney et al
Publisher: Microsoft Press
Audience: DBAs and SQL Developers
Reviewer: Ian Stirk
Chapter 14 DBCC internals
Although the DBCC set of functionality does many things, its primary purpose is as to check data consistency. The author of the chapter has worked extensively with the DBCC source code.
The chapter begins with a look at the need to shrink data files and databases. Shrinking can cause problems including: expensive resource usage (I/O and CPU), changes the buffer pool content, increased transaction log usage, and introduces index fragmentation. Owning to these problems it’s suggested that shrinking should only be used when absolutely necessary e.g. when a smaller database is needed or the transaction log needs to be resized.
The chapter proceeds with a discussion of data file shrink, it describes how shrinking works (i.e. scan file backwards using the GAM and PFS to get highest page number with data , obtain exclusive locks, and move rows to lowest available empty page, then repeat). This is followed by an examination of log file shrink, which by comparison is much simpler, where any virtual log files at end of file that are inactive can be removed. The DBCC SHRINKFILE command does the actually shrinking.
The heart of the chapter relates to consistency checking. I/O subsystems, hardware, and disks are mentioned as the chief causes of corruptions. It is recommended to run the consistency checker regularly, perhaps once per week. The DBCC CHECKDB is the most comprehensive command, which examines the physical and logical structures to ensure they are correct. Repairs can also be implemented if requested. DBCC CHECKDB requires a transitionally consistent database, a hidden snapshot is provided to cater for this (it holds data that has changed as the DBCC CHECKDB command runs).
The DBCC CHECKDB section continues with a very detailed examination of its purpose. Details of each type of check are discussed, together with the meaning of any error messages output. The DBCC CHECKDB regular output is discussed, and the utility’s progress can be monitored by the percent_complete column of the sys.dm_exec_requests DMV. Various DBCC CHECKDB parameter options are discussed including: NOINDEX, ALL_ERRORMSGS, NO_INFOMSGS, and PHYSICAL_ONLY. The important area of database repairs shows what is corrected and what will result in data loss.
If you want to know more about data consistency checking, this chapter provides an authoritative source, with deep dives into algorithms and structures.
I was surprised that setting the page_verfify_option to CHECKSUM wasn’t mentioned. This feature enables many corruptions to be reported when they occur, thus reducing the time you’re unknowingly using a corrupt database. Similarly checking msdb.dbo.suspect_pages for problems could also have been useful. That said, the author acknowledges he is only discussing using DBCC CHECKDB.
I did wonder if DBCC is really about internals, instead, is it not an application that uses knowledge of internals to do its work?
This is a detailed book by well-known and respected authors. Although the content is deep, it is purposely narrow in range. Just as you don’t need to be a mechanic to drive a car, you don’t need to know low-level SQL Server internals to run code adequately; however, knowing about the internals of SQL Server will not only satisfy your intellectual curiosity, but also provide clues on how to optimize your SQL queries and troubleshoot problems.
As an indication of the depth of the book, here is an extract on indexes: “An index row doesn’t use the TagB orFsize row header values. In place of the Fsize field, which indicates where the fixed-length portion of a row ends, the page header pminlen value is used to decode an index row. The pminlen value indicates the offset at which the fixed-length data portion of the row ends. If the index row has no variable-length or nullable columns, that is the end of the row.”
In some ways, this book seemed less ‘special’ than its previous editions. This is not a fault of the book per se, but more because increasingly SQL Server internal information is being blogged.
Each chapter has a half-page conclusions section, this might have been better as a 2 or 3-page summary.
Often there is little linkage and flow between chapters by different authors, while this has the advantage that some chapters can be read independently, there is a degree of overlap and also less coherence than might have been possible with better editing (e.g. chapters on query execution, query optimizer and plan caching might have been better written by one author). That said, all chapters are individually well written, barring some basic English errors that the editors should have caught (I have submitted a list of around 10 errors to the companion website).
This is a big, heavy, and detailed book, I had to physically tear it into two parts so it was easier to carry and read. My initial notes on the book numbered 30 pages, and several sections required more than one reading.
You can download the first chapter for free, together with a detailed table of contents, from its page on the O'Reilly website:
If you want to know more about SQL Server, how the pages are linked together, the structures involved, and how the optimizer works, I can certainly recommend this book.