Page 1 of 5
Author: Kalen Delaney et al
Publisher: Microsoft Press
Audience: DBAs and SQL Developers
Reviewer: Ian Stirk
The introduction says "This book is intended to be read by anyone who wants a deeper understanding of what SQL Server does behind the scenes", so how does it fare?
This book has a revered pedigree, with previous editions containing some of the most detailed explanation of SQL Server internals available. Additionally, the list of authors reads like a Who’s Who of the SQL Server world.
Owing to the increasing size and complexity of the SQL Server software, this book is selective in its content, concentrating on the query processor and storage engine. It specifically excludes client programming, business intelligence, security, and other non-core engine areas.
Below is a chapter-by-chapter exploration of the topics covered.
Chapter 1 SQL Server 2012 architecture and configuration
This first chapter provides an overview of the database engine, its key components, what they are and how they interact. In essence it aims to put the subsequent chapters into context.
SQL Server editions and installation are briefly discussed. SQL Server metadata is examined in terms of compatibly views, catalog views, Dynamic Management Objects (DMOs), and other metadata (e.g. system functions and stored procedures). This metadata can be very useful in diagnosing problems e.g. identifying what SQL statements are blocking.
The major components of the engine are identified (i.e. protocol layer, query optimization, query execution, storage engine, and the SQL Server Operating System [SQLOS]) and briefly discussed. They are discussed extensively in later chapters.
The SQL Server Configuration Manager is introduced as the main tool to manage services and network protocols. Other system-related configuration that you might want to change is discussed in relation to the operating system (e.g. page file location, optimize system for background services) and SQL Server options (e.g. Max Server Memory, Max Degree of Parallelism).
Included is a useful list of configuration options that you might want to check/change to optimize your SQL Servers. There is a very valid warning related to any changes, please ensure you test changes thoroughly, a change that might generally improve performance may actually be detrimental on your system.
While discussing parallelism, the point is made that the Dynamic Management View (DMV) sys.dm_os_tasks can be used to observe parallelism, however no further information is provided. It may have been more informative to say the DMV contains the column exec_context_id, which shows the different threads for the same process id (spid), a screenshot might have been useful too.
Overall, the chapter provides a gentle introduction to what the book is about and what you can expect in subsequent chapters. There are good links to other chapters.
Chapter 2 The SQLOS
The SQL Server Operating System (SQLOS) is a layer between SQL Server and the operating system (OS). Having a separate layer allows the various SQL Server components to take advantage of this centralized and scalable functionality, without them each becoming bloated themselves.
SQLOS’s main functions are scheduling and memory management. Its other functions include the management of synchronization, memory brokers, exception handling, deadlock detection, extended events, asynchronous I/O, and CLR hosting.
The Non-Uniform Memory Access (NUMA) architecture is explained in detail. In NUMA architecture, groups of the CPUs have their own memory bus, which helps with scalability. Previously, with Symmetric Multiprocessing (SMP), memory access was via the same memory bus, which could become a limiting factor, and reduce performance.
The SQL Server scheduler was developed to cater for the specific requirements of SQL Server, rather than rely on the OS which treated all threads in a similar manner. Such requirements include the use of lightweight fibres instead of threads, and reduced context switching of threads. The relationship of schedulers to CPUs, workers, and tasks is explained in depth. The scheduler internals can be viewed using various DMVs, these and their main columns are described, however no examples are given.
The buffer pool is the main memory component in SQL Server. Its purpose is to cache data, making subsequent data usage much faster (since you don’t need to go to the slow disks to get the data). Additionally, memory is also used to store execution plans. The relationship between stale/changed data, the lazywriter and checkpoint process is given in detail. The memory internals can be viewed using various DMVs, these and their main columns are described, however no examples are given.
The purpose of the Resource Governor is to restrict and prioritise the resources (CPU and memory) that users/groups/workloads can have. The purpose and architecture of the Resource Governor is described, and a useful sample code example given (it has an off-hours and production groups). The Resource Governor internals can be viewed using various DMVs, these and their main columns are described, however no examples are given.
For a number of years the software industry has been moving towards an event driven model i.e. instead of polling for an action, events are raised and registered clients can use the information. Extended Events (XE) allow you to determine what is happening on your SQL Server, they are the successor to SQL Trace/Profiler. The lightweight XE architecture is described in detail, and a useful sample code example given (i.e. it traces completed SQL statements). Luckily SQL Server 2012 includes a GUI to help create XE sessions, previously there was a third-party GUI add-in for SQL Server 2008. A step-by-step walkthrough of creating an XE session via the GUI is included.
For me, the chapter didn’t need to discuss the Resource Governor or Extended Events. I think of these as system applications that make use of SQLOS, rather than being an integral part of it. However, both these sections are well written and useful, but in a different context.
The sections on the scheduler and memory management are very detailed, especially after the gentle introduction of Chapter 1. I do wonder if the content could have been presented in a more understandable manner (I had to reread several sections).
Chapter 3 Databases and database files
This chapter describes the internals of databases and the files they are composed of. Databases are stored as OS files, each database having at least one data file and a transaction log file. Integrity constraints ensure the database is logically consistent. Databases can grow and shrink. Various metrics are described, e.g. maximum number of database per instance (32,767), maximum number of objects per database (231-1).
Database files are described as consisting of three types of file: primary data files (file extension .mdf), log files (.ldf), and optionally, secondary files (.ndf). Additionally, databases can contain FILESTREAM, FileTable, and full-text data files. Examining the view sys.database_files shows important information about these different file types. There is a step-by-step walkthrough of creating a database, using both the GUI and SQL.
The ‘expanding and shrinking database’ section starts with a warning that shrinking is generally a bad thing unless you really need to reclaim space (it is resource intensive and creates fragmentation). Where possible you want to restrict the file growth (set it to none) which should be possible if you know your data volumes and processing cycle, rather than have uncontrolled growth. You can add additional database space using the ALTER DATABASE with MODIFY FILE command. Ideally you want to initialize data files instantly, thus improving performance, by using Instant File Initialization (IFI) this is controlled via the SE_MANAGE_VOLUME_NAME setting in Windows.
The next section discusses how you can use filegroups to group related data for maintenance and administration purposes. It is possible to restore at the filegroup level, so if a database contains PnL and Sales information each on their own filegroup, if there is a problem with one filegroup the other can still be made available. There’s some useful example filegroup creation SQL provided.
Since altering databases is a large part of many DBAs work, it’s apt that altering databases is discussed. Items discussed include change database name, add/remove data files/filegroups/logs, change file size, and mark filegroup as READONLY. Details of the composition of data pages are given, together with their containers (uniform and mixed extents). Special pages, which record what extents have been allocated, are described in the Global Allocation Map (GAM) and Shared Global Allocation Map (SGAM). Tools and examples that examine the content of GAM and SGAM pages are provided. Other special pages (PFS, DIFF, and ML) are briefly described.
The next section discusses various database options, which can be seen in the view sys.databases. The more common settings are discussed, including: user_access_desc (SINGLE_USER, RESTRICTED_USER, MULTI_USER), state_desc (OFFLINE, ONLINE, EMERGENCY), is_read_only (READ_ONLY, READ_WRITE), various SQL options (e.g. ANSI_*), recovery_model (FULL, BULK_LOGGED, SIMPLE), and page_verify_option.
In some ways, this chapter represents the more traditional deep dive into database structures. This chapter was relatively easy to read, but a few sections may need rereading, there was a good flow, with most sections largely building on previous ones. There was good use of screenshots and links to other chapters for further/deeper understanding.
The use of Performance Monitor to monitor space was briefly described but no information was given on how to set it up or how to create logs. Perhaps link for this could have been provided? That said, the author mentions this is not a How-To book, and of course a single book has limits.