Author: Basit A. Masood-Al-Farooq
Publisher: Packt Publishing
Audience: Potential database developers
Reviewer: Ian Stirk
This short but wide-ranging book aims to cover the SQL Server 2014's development essentials, from the beginning, to in-depth knowledge, how does it fare?
The book’s introduction says “This book will provide you with all the skills you need to successfully design, build, and deploy databases using SQL Server 2014. Starting from the beginning, this book gives you an insight into the key stages of the SQL Server database process, provides you with an in-depth knowledge of the SQL Server database architecture, and shares tips to help you design the new database.”
Below is a chapter-by-chapter exploration of the topics covered.
Chapter 1 Microsoft SQL Server Database Design Principles
This chapter aims to cover the major steps in database design. It begins with a look at the place of database design within systems development, noting a poor design can lead to slow or unused systems.
Database design is broken down, and described briefly in the following steps:
- Requirements gathering/analysis (interviews, analysis, user and functional requirements)
- Conceptual design (meets requirements, independent of database vendor, E-R diagrams)
- Logical design (maps E-R diagram to database vendor)
- Physical design (map to database structures)
- Implementation/loading (create database and tables, load data, configure)
- Testing/evaluation (performance, integrity, concurrency, security)
Table design is considered next, tables contain columns and rows that store data. Business entities form the basis of tables, and the entity’s attributes become table columns. The associations between the different tables are described using 1-to-1, 1-to-many, and many-to-many relationships.
Data integrity ensures data is reliable and follows business rules. Integrity is examined briefly in terms of domain (e.g. check constraint), entity (e.g. primary key), referential (e.g. foreign key), and user-defined (e.g. triggers). Normalization is discussed as a means of reducing data redundancy, so the data exists only once. The first 3 normal forms are described briefly.
An overview of SQL Server’s file architecture is given, consisting of a primary data file, zero or more secondary data files, and a log file. The structure of a database page (i.e. header, data rows, offset array) is described, as are extents (8 contiguous pages), the transaction log (allow rollback/recovery), and filegroups (allow easier admin). Finally, the importance of data types is described.
This chapter provides a whirlwind review of the major steps and considerations in creating a database. While it is wide-ranging, it often lacks depth. Several features are mentioned in passing (e.g. checkpoint), without any explanation being given.
The database design section mentions using data flow diagrams (DFDs), instead some of the various more ‘modern’ UML diagrams should have been mentioned. The definitions of the normal forms feel awkward, and composite keys are referenced in second normal form - without first describing what a composite key is. Deriving the business entities from the user/functional requirements is not discussed.
The chapter does provide a good review of the major steps in database design (especially if you already know the subject matter), which for the most part is independent of SQL Server itself. There are some useful links for further information. There’s a useful diagram that describes the working of the transaction log.
Chapter 2 Understanding DDL and DCL Statements in SQL Server
After describing database design in chapter 1, this chapter progresses to implementing the database, with reference to Data Definition Language (DDL) and Data Control Language (DCL).
The chapter continues with a brief description of the SQL Server 2014’s systems databases:
- master –configuration info, logins, roles, info on other databases
- model – template for new databases
- msdb – used by SQL Server Agent, SSIS packages, backups
- tempdb – contains user and system temporary objects
- resource – internal system objects
- distribution – used for replication
The various recovery models are described: simple (log truncated when transaction ends), bulk-logged (reduced logging, smaller log), and full (has the most backup options). Creating, modifying, and dropping databases are described briefly, using both T-SQL (Microsoft’s version of SQL) and SQL Server Management Studio (SSMS). Creating and managing schemas, which group ‘related’ objects by scope or ownership is described, and brief examples given.
Having looked at the database and schemas, the chapter moves onto creating and managing tables. The various types of table are described (i.e. temp, system, users, partitioned, file, memory-optimized). Brief examples of creating and modifying tables are given using both T-SQL and SSMS. The chapter ends with a look at the DCL T-SQL to grant, deny, and revoke permissions to securables.
This chapter provides a brief overview of how to create and alter databases and tables using both T-SQL and SSMS. The content feels minimal, often containing basic syntax and few examples. The SQL syntax provided seems to be based on Books Online (BOL) – an invaluable resource for any SQL Server professional. There’s a useful link to storing the database in Windows Azure.
The chapter should have mentioned point-in-time restores are not possible for bulk-logged recovery, if the log contains bulk-logged entries. Additionally, it states “All transactions on memory-optimized OLTP tables are fully atomic, consistent, isolated, and durable (ACID)”, this is not completely correct, SQL Server 2014 introduced the concept of delayed durability. The WITH NOWAIT clause of the ALTER DATABASE command is not explained. More detail, examples, tips, and common problems would make the chapter more useful.
Chapter 3 Data Retrieval Using Transact-SQL Statements
After creating the database and tables, the next step is to retrieve the data. Various methods of retrieval are described in this chapter. The AdventureWorks2012 database is used in the examples.
The chapter opens with a look at the ubiquitous SELECT statement, which can be used to get data, assign data to variables, and call functions. Its basic use is described, together with the WHERE clause, and the IN and BETWEEN keywords.
The section on T-SQL functions describes briefly, system functions, deterministic and non-deterministic functions, together with various function types (i.e. aggregates, configuration, cursor, date/time, maths, metadata, rowset, and security related).
Multi-table queries are described with reference to the UNION, EXCEPT, and INTERSECT keywords. The various types of JOINs are described (i.e. inner, left/right outer, full outer, cross, and self join). Sub-queries are briefly discussed as nested, correlated and non-correlated. Common table expressions (CTEs) are described as temporary results that simplify queries, allowing multiple use of the same table, and allowing recursive queries. ORDER BY and GROUP BY are described, as is the HAVING clause, TOP, DISTINCT, and PIVOT/UNPIVOT. Lastly the chapter describes the various window functions, used for ranking, moving averages, and cumulative sums.
The chapter provide a basic overview of data retrieval. Although the chapter is wide–ranging in content, it lacks depth, for example, under string functions only the CONCAT function is mentioned, additionally, only simple examples are given. Some topics are mentioned without being described e.g. stored procedures. There’s a good point about not using the * wildcard in queries.
There are very few examples which have output results that you can compare with, which rather defeats the purpose of using the AdventureWorks2012 database. I’m not sure the reader can gain much, except the bare syntax and concepts from this chapter.
Chapter 4 Data Modification with SQL Server Transact-SQL Statements
This chapter describes how to modify data using the INSERT, UPDATE, DELETE, MERGE, TRUNCATE, and SELECT…INTO statements. The MERGE example has some useful code that illustrates the use of the OUTPUT keyword for debugging/troubleshooting, however, this is not highlighted in the text.
As with other chapters, basic BOL-like syntax is provided, with the major properties described. Some basic examples are provided, and some outputs are given, and some explained! (I note this because in much of the book there are no outputs, and fewer explanations). There’s a useful SQL Server 2014 specific feature described, SELECT…INTO can now operate in parallel, I note this because there are very few 2014 specific development features in this book (and in SQL Server itself!).
Chapter 5 Understanding Advanced Database Programming Objects and Error Handling
This chapter discusses some of the more complex programming objects.
The chapter opens with a look at creating and using variables, including local, cursor and table variables. This is followed with a look at flow control keywords i.e. BEGIN… END (groups code), IF…ELSE (decision making), CASE (decision making), WHILE (looping), BREAK, CONTINUE, RETURN, GOTO, and WAITFOR. Basic examples are provided, without any results being shown.
Next, simple and indexed (materialised) views are described. The former is just a SQL definition that becomes populated at run-time, while the latter contains physical data that changes as the underlying base tables’ data changes. Syntax and basic examples are provided.
The advantages of using stored procedures are described (reduced network traffic, security, reusable, modular). The different types are discussed (user-defined, natively compiled, system, temporary, extended, and CLR). Syntax and examples are provided, using both T-SQL and SSMS.
User defined functions (UDFs) are described as being like stored procedures but without support for OUTPUT parameters. Both scalar (return one value) and table-valued functions (returns a table) are described, and examples provided.
The chapter continues with a look at triggers, these are routines that run in response to an event, they’re typically used to enforce business rules. Syntax and examples are provided, for both DML and DDL triggers. The chapter ends with a look at error handling, using the TRY…CATCH structure. Syntax and examples are provided.
The chapter provides a look at the more ‘advanced’ SQL Server programming features. There are some useful tips given, including:
- tables variable are good for small dataset
- avoid using SELECT * in views
- use SET NOCOUNT ON
- table variables now support non-unique clustered and non-clustered indexes
There’s a tip about using DBCC FREEPROCCACHE to clear the procedure cache, however this should mention its downsides (these are given in chapter 6!). It also fails to mention that functions, when used in SELECT or WHERE statements, can result in poor performance. There is nothing about Logon triggers, or SQL Server 2014’s new security commands. The chapter could have mentioned indexed views have many limitations.
Chapter 6 Performance Basics
The chapter opens with a look at the importance of the Service Level Agreement (SLA), and the general importance of indexes as a major tool for improving performance.
The relational engine (AKA the Query Processor) takes a SQL query and creates a query plan and is then executed. The various stages and aspects of the optimization process are described (parsing and binding, query optimization [trivial or full plan], non/parallel plan, plan cache, and plan aging). There’s a useful diagram illustrating the optimization process.
SQL Server 2014’s in-memory OLTP engine allows the creation of in-memory optimized tables, has a lock and latch free design, using multi-version concurrency control (MVCC) – all these things can improve performance significantly. Additionally, natively-compiled stored procedures can improve some performance by x50. A wizard is available to help identify and migrate the relevant tables and stored procedures. Some limitations are described e.g. no foreign keys.
The chapter then moves onto the major section of this chapter (and book), indexes. Indexes typically improve retrieval performance, but can slow data modifications. Clustered indexes and heaps are briefly described. The b-tree structure of indexes are described and illustrated. The following aspects of indexes are touched upon: non-clustered indexes, composite indexes, covering indexes, unique indexes, spatial indexes, partitioned indexes, filtered indexes, full text indexes, XML indexes, memory-optimized indexes (hash and non-hash), and columnstore indexes. In many cases only a short paragraph without examples is given.
There’s a very useful section concerning guidelines for indexes including:
- don’t over-index
- index columns used by foreign keys and joins
- limit key columns to high selectivity columns
- rebuild based on fragmentation level
Database statistics often cause problems when they become stale. Code is provided to show when the statistics were last updated, using both DBCC SHOW_STATISTICS and sys.stats. Unfortunately, neither show the number of rows changed since the statistics were last updated (i.e. are they stale or not?).
The chapter then describes the fundamentals of transactions, explaining ACID (Atomic, Consistent, Isolated, Durable), but without examples. Basic SQL is provided for starting, committing, rolling back and saving a transaction. An overview of locking is given, including its associated problems (lost updates, dirty reads, nonrepeatable reads, phantoms), the locking object hierarchy, lock types, and transaction isolation levels. All quite basic with little depth.
The chapter end with a very brief overview of some of the more common monitoring and troubleshooting tools (activity monitor, SQLServer:Locks performance object, Dynamic Management Views [DMVs], SQL Server Profiler, sp_who2, and Extended Events.
This chapter is particularly wide ranging, covering potentially difficult areas. The content is discussed mostly at a superficial level. There’s a good point that unique indexes often help the optimizer produce better plans. There’s a good description and illustration of columnstore index structure. The guidelines on index design and optimization are useful, but would be better expanded. There’s a useful hint about using sys.dm_exec_cached_plans to find plans but no example SQL provided. Again, several features are mentioned but not defined (e.g. latches, spinlocks, and memory grants).
The chapter states “…stored procedures that operate on memory-optimized tables, though written in Transact-SQL, are compiled to highly efficient machine code”, this is only true of natively compiled stored procedures, normal stored procedures need to use slower interop instead. There’s a note about “abbreviated to xVelocity where appropriate” which doesn’t make sense.
This book contains SQL syntax and typically brief example code covering a wide-range of SQL Server development topics. Although it is wide-ranging, it lacks depth, even for an ‘essentials’ book.
If you are new to SQL Server, which is the intended market, this book will give you a basic understanding of using SQL Server from a developer’s perspective, taking you from say level 0 to level 3 or 4 (out of 10). It does not contain enough detail to be a useful teaching aid. If you already understand the subject matter, this book can be used as a refresher, with the proviso it has limited depth.
Little of the book relates specifically to SQL Server 2014. About 95% of the book is equally applicable to SQL Server 2012. which is not surprising since the major changes in SQL Server 2014 from a developer’s perspective tend to be internal. You can see the changes here: http://msdn.microsoft.com/en-us/library/bb510411.aspx. I suspect the 2014 in the title is to catch the market for new SQL Server 2014 books.
In many ways, the book contains the right topics and subheadings, but to be more useful, the content should have been expanded to include more detail, examples, tips, troubleshooting remarks, and explained output/results.