SQL Server Query Tuning and Optimization (Packt)
Article Index
SQL Server Query Tuning and Optimization (Packt)
Chapters 2 - 6
Chapters 7 - 9
Chapters 10 - 12, Conclusion

Chapter 10 Intelligent Query Processing

Intelligent Query Processing (IQP) offers a group of features that automatically improve the performance of your SQL queries. Since its inauguration in SQL Server 2017, more features have been added, and existing functions improved. Some of these features need Query Store to be enabled.

The chapter opens with an overview of what IQP is, before discussing some of the more prominent IQP features, including:

  • Parameter-sensitive plan optimization – multiple plans are created based on the distribution and density of parameter/data in the underlying table
  • Memory grant feedback – helps ensure the query has the appropriate amount of memory, thus preventing spills to tempdb
  • Persistence and percentile – using Query Store allows information to be persisted across server restarts. This persistence, together with the ability to use the history of a query’s execution, allows improved memory grant feedback
  • Cardinality estimation feedback – provides better estimation of rows used by the query
  • Degree of parallelism feedback – helps correct parallelism problems for repeated queries
  • Interleaved execution – helps give multistatement table-valued functions better cardinality estimates
  • Table variable deferred compilation – helps give better cardinality estimates
  • Adaptive joins – defers the type of join used until execution time, where an optimal join type typically depends on the number of rows involved

In each case, a useful discussion of the item is given, showing the problem it is trying to correct, and helpful example code provided.

Increasingly, with each version of SQL Server, more automated tuning features are being added to correct known problems. Perhaps in the future we wont need performance tuning experts. A very useful and interesting chapter.

Chapter 11 An Introduction to Data Warehouses

Most of what’s been discussed in previous chapters is applicable to both OLTP and Data Warehouse systems. This chapter now concentrates on optimizations that relate to Data Warehouse systems.

The chapter opens with a comparison between OLTP and Data Warehouse systems. Data Warehouse systems typically use historical data, with complex long running queries (often performing aggregations). The database is typically denormalized and often more suitable for ad hoc queries. The discussion continues with reference to star schema, fact and dimension tables, and additional normalization giving a snowflake schema. Big Data is briefly mentioned, with reference to Microsoft’s products (Parallel Data Warehouse, HDInsight, and Azure Synapse Link for SQL).

Star join queries join fact and dimension tables, typically aggregating and applying filters. Heuristics are used to identify the fact and dimension table allowing the query to be optimized. Bitmap filtering can be applied, for parallel plans, which does a semi-join early, allowing rows to be dropped before entering the parallel processing step. Examples of this are provided and related to the plan content.

Columnstore indexes offer an order of magnitude of performance improvement. Introduced in SQL Server 2012, in later versions they are updatable, and can involve a clustered index. Data is stored in columns instead of rows, and compressed to improve performance further. Performance benefits include reduced IO, segment elimination and batch mode processing. In batch mode processing, instead of processing/getting one row at a time (the default), the data is processed in batches of 1000 rows. In SQL Server 2012 only a few operators could take advantage of batch mode processing, this has been increased in later versions of SQL Server. The plan shows both the estimated and actual mode used. SQL code is provided to create columnstore indexes using both clustered and non-clustered indexes.

This chapter explains some data warehouse system specifics, notable batch mode processing and use of columnstore indexes. There’s a useful table comparing OLTP and data warehouse systems. There’s a helpful reference to big data, with reference to Microsoft products, together with a reference to Kimball’s standard data warehouse book.

It might have been instructive to mention that the input should be sorted when creating the columnstore index (else insert into a clustered index), and also use a MAXDOP setting of 1 (this ensure data is input into distinct segments, which also helps with segment elimination).

Chapter 12 Understanding Query Hints

This chapter starts with a look at the history of optimizer research (starting in the 1970s), and then looks at how join order and query complexity affect the optimizer together with its limitations. Finally, the use of hints, and the various types of hints are discussed.

Join order affects the amount of data flowing between operators. Joins can be redefined using relational mathematics, to give alternatives that give the same results. As the number of joins increases, so the number of potential plans increases dramatically. The optimizer needs to balance the cost of optimization time with the production of a quality plan. Heuristics are used to reduce the number of potential plans.

Often, it’s possible to break down a complex query to produce a better plan. Having multiple, but simpler queries, gives the optimizer more options, also having intermediary temporary tables allows up to date statistics to be created and used in the plan. Breaking a query down into more steps is preferable to using hints.

Hints should only be used as a last resort since they limit the number of transformations that can be applied, reducing the optimizer’s functionality. Hints need more maintenance, and their impact may change with each version or service pack of SQL Server. Before hints are used, ensure the problem doesn’t relate to IO or cardinality estimates (ways of updating the stats have been discussed earlier).

The last section discusses, with examples, the various types of hints, including: joins, aggregations, force order, index, forcescan, forseek, fast n, and noexpand. Lastly the creation and use of plan guides is examined, and there’s a useful reference to using Query Store hints too.

This chapter provided some interesting reading, especially the section on breaking down a query (there’s a link to the author’s paper on the same topic provided). The comment about using hints as a last resort is useful. The chapter includes helpful sample code that illustrates what happens when you use a particular hint.

Conclusion

This book aims to give you the tools and knowledge to get peak performance from your queries and applications. It differs from other performance related books by concentrating on the factors that influence the production of a good execution plan, and thus improve performance.

The book is generally easy to read, this is quite an achievement since some of the concepts are deep and undocumented. There are good links between the chapters, helpful website links for further information, good diagrams to support the discussions, and helpful example code to support the assertions made. I also enjoyed the author’s pragmatic approach e.g. use hints as a last resort, and always test recommendations on your own system.

As mentioned at the outset, this book is an update of a previously published book relating to SQL Server 2014. Excluding the two new chapters (Query Store and IQP), I would estimate it is at least 80% the same as its predecessor. That book had appendices relating to: References, White papers, Articles, Research papers and Books which I found very useful since it contained some great resources for investigating topics further. In my review I wished other books contained something similar. However, this section has been removed from the current edition of the book.

It might be argued that the book is not as diverse as it could be, since it doesn’t cover other aspects of performance, e.g. RAID levels, SQLDIAG, wait stats and queues. I think this misses the point, the book concentrates on those factors that influence the quality of the execution plan.

The book is suitable for intermediate to advanced level developers, and should take your level of understanding of performance tuning in general, and optimizer in particular, from level 5 to 8 or 9. Although the book is primarily for SQL Server 2022, it typically indicates in which version of SQL Server a given feature was introduced, so should prove useful in earlier versions.

I enjoyed reading this book. If you spend time troubleshooting performance problems, or you want a deeper understanding of the optimizer and the factors that influence it, I can heartily recommend it.

To keep up with our coverage of books for programmers, follow @bookwatchiprog on Twitter or subscribe to I Programmer's Books RSS feed for each day's new addition to Book Watch and for new reviews.

Banner


Artificial Intelligence, Machine Learning, and Deep Learning (Mercury Learning)

Author: Oswald Campesato
Publisher: Mercury Learning
Date: February 2020
Pages: 300
ISBN: 978-1683924678
Print: 1683924673
Kindle: B084P1K9YP
Audience: Developers interested in machine learning
Rating: 4
Reviewer: Mike James

Another AI/ML book - is there room for another one?



Learn to Code by Solving Problems

Author: Dr. Daniel Zingaro
Publisher: No Starch Press
Date: June 2021
Pages: 335
ISBN: 978-1718501324
Print: 1718501323
Kindle: B08FH92YL8
Audience: People wanting to learn Python
Rating: 4
Reviewer: Mike James
Solving problems - sounds good?


More Reviews

 



Last Updated ( Wednesday, 23 November 2022 )