Administering Relational Databases on Microsoft Azure
Article Index
Administering Relational Databases on Microsoft Azure
Chapters 5 - 6
Chapters 7 - 10
Chapters 11 - 15; Conclusion

Chapter 7. Monitor Azure Resources using the Azure Monitor

While the previous chapter looked at using various methods to monitor applications and services, this chapter focuses on monitoring using Azure Monitor.

There’s a walkthrough on setting up Azure Monitor for Azure SQL Database. Azure SQL Analytics can monitor the selected performance metrics on all your Azure SQL Databases, Azure Managed Instances, and elastic pools, from a single view. It is possible to see insights, errors, timeouts, blocking, and waits etc. Note: you need to enable diagnostics and output to log analytics workspace.

There are various views, as part of Azure SQL Analytics, and these provide some very helpful information (e.g. CPU usage). Perhaps the most interesting part relates to the Intelligent Insights report, which is used for proactive monitoring, and early detection of database degradation. 

To make things easier to decipher, Power BI reports can be used on the underlying data, and an example of this is provided.

The next section discusses Extended Events (XEs), which is a lightweight, scalable replacement for SQL Profiler. This section describes the underlying XE components of session, event, global fields but has weak explanations and not immediately relevant! (e.g. the description of an Event). Next, the system_health XE is discussed, this is not provided on Azure by default (unlike on-premise), hence some useful code is given here. There is an example XE to capture deadlock information, with the deadlock information stored in the system_health XE, and code is provided to decode it. 

Automatic tuning is briefly discussed (although describing the benefits seems long winded), namely:  force plan, create index, and drop index. 

The last section looks at optimizing Azure SQL on VM. It discusses the various types of VM available, being optimized for compute or memory or IO etc. The various types of managed disks are described, before looking at SQL Server storage best practices (e.g. separate volumes for data and log files). I would expect this section to include details of the common problem where the Azure SQL VM is built from a gallery image, which has typically been built on a server with 2 CPUs - this means you will need to change the SQL Server MAXDOP and number of tempdb files on machines with more than 2 CPUs to gain better performance.  

This was another interesting and useful chapter. Helpful example code was provided (e.g. system_health XE). As usually, there is some bad grammar.

Chapter 8. Optimize Query Performance in SQL Server

The various stages of the Query Lifecycle are explained, namely: Parsing (validates syntax, creates parse tree), Algebrizer (check parse tree objects exist, validates types and permissions), Query Optimizer (create cost-based execution plans, choose ‘best’), and Execution (executes).

Next, we look at the different formats and types of the query execution plan: graphical (friendly, line thickness, text, xml). estimated (query not executed), actual (may differ from estimated plan), and live query stats (shows animated actual execution plan progressing – can monitor while query still running). Guidance is given on reading the graphical execution plan. 

There’s some good advice concerning identifying problem areas in execution plans: table scan v index scan v index seek, RID lookup and Key lookup, sort operator, parallelism, warnings in plan (eg tempdb spill), inaccurate row estimates, missing indexes, SARGability, missing and stale stats.

The execution plan is typically created using the parameters used when it is first called. If these parameters are atypical, it can result in subsequent calls to the execution plan (using more typical parameters), running slowly. This is parameter sniffing, and various solutions to it are discussed (e.g. optimize for unknown, option(recompile), dynamic SQL, database-scopes setting).

ColumnStore indexes offer another optimization technique, where the data is stored column-wise (instead of row-wise), this typically offers greater opportunity for data compression, resulting in more data per page. Compression rates of 90% are typical, with a concomitant improvement in performance. This is especially useful for long running, scan based OLAP queries.

In-memory OLTP is a technology that can significantly improve OLTP performance. The tables are held in memory, and there’s no locking or blocking, giving significantly better performance.  

There’s a brief section on using DMVs/DMFs for performance monitoring, github links to about 20 useful DMV queries are provided (e.g. top 10 CPU consuming queries). It might have been better to discuss at least one of these scripts.

SQL code is either running on the CPU else waiting on a resource. SQL Server records these as wait stats, and examining them can be a very useful method for determining what problems an instance has. The underlying DMV sys.dm_db_wait_stats is examined, together with some of the more common types of wait, and their meaning.

I think most people assume, since Azure SQL offers many in-built automated features (e.g. backup, failover), that index defragmentation and update statistics are also automatically implemented – however, this is not true! You must implement index defragmentation and update statistics yourself, and a useful link to a well-knows routine (AzureSQLMaintenance) is given.

Query Store records details of your queries, automatically capturing query history, plans, and runtime stats. It is enabled in Azure by default (but not on-premise, until SQL Server 2022). There are some very useful Query Store reports (e.g. top queries based on duration) – unfortunately these are not discussed here.

Next, there’s a lab exercise, on using a covering index to improve performance. While this is useful, I cannot understand why it is placed between the topics of Query Store and tempdb configuration… A little later, there’s a lab exercise on using Query Store to resolve a performance issue – that lab should have been placed here!

Tempdb configuration can be critical to SQL Server performance, and a little information is given on how its configuration can be inspected.

Max degree of parallelism (MAXDOP) is a setting that can improve the performance of long running queries. Its value in relation to the NUMA architecture and number of CPUs is briefly discussed. I was surprised there was no mention of its related feature called Cost Threshold for Parallelism.

Next, there is a lab about using Query Store to resolve a performance issue – this is in the wrong place!

Overall, I enjoyed this chapter, it provides an up-to-date overview of features to consider when investigating performance. However, be warned, there is much more that could be discussed. That said, it is a good start, and certainly provides a range of important topics. Much of this chapter should be familiar with most on-premise DBAs.

I noticed as I was reading the chapter, the chapter’s subheading levels didn’t match those given in the Table of Contents (ToC), this was quite annoying - I often use the heading’s level to ‘see’ where I am within a section. This applies to other chapters too. I am reading the Kindle version of the book. 

The statistics profile image is incorrect, it is the same as the previous image of showplan_text.

Chapter 9. Perform Automation of Tasks

This chapter covers a wide range of topics in relation to automating tasks, topics include: 

  • Azure Cloud Shell 

  • Azure Portal 

  • Azure PowerShell 

  • Azure CLI 

  • ARM 

  • Azure Automation 

  • SQL Agent Jobs 

  • Azure Elastic Jobs 

  • Azure Logic Apps  

In each case, a useful overview of the basic capabilities of these automation tools is given, and sometimes a worked example. There are some good discussions on when to use a feature, together with its limitations – again this is selective.

Sometimes, a list of items to be discussed is given, but it doesn’t match the order in which they are then discussed – which can be disorientating. The section on using a Linked Server between on-premise and Azure Databases comes out of nowhere, no link between topics. The example itself is useful (just to know you can do this can be valuable). 

In discussing the many advantages of using ARM templates, it misses the obvious one, that it provides consistency. If different people try the same thing manually, it’s easy to make mistakes, having an ARM template provides consistency. 

There’s a useful section on SQL Agent, covering multi-server automation, Master/Target, configuring database mail, operators, and alerts. Using maintenance jobs via the maintenance plan wizard is briefly discussed

This chapter covers a lot of useful information, but its overall structure made it feel muddled. Again, problems exist with the English (e.g. “Why does PowerShell cmdlets so easy to use?”, and “It is advised that you give the utmost importance when you’re working with the corresponding toolsets.”)

Chapter 10. Plan and Implement High Availability and Disaster Recovery in Azure

Recovery from a local or wider problem is obviously very important for system availability. This chapter opens with a brief look at some terminology. Firstly, High Availability (HA), which aims to get a system online quickly after a local failure (e.g. bad disk). It then looks briefly at Disaster Recovery (DR) where the problem is typically wider (e.g. earthquake in data center or region). This is followed with a look at Recovery Time Objective (RTO) – the maximum allowable downtime, and Recovery Point Objective – the amount of data that can be lost. 

There’s a helpful table that decodes the meaning of the various availability 9s (e.g. 99.999% equates to a downtime of 5.26 minutes per year). The importance of SLAs is noted, and the number of 9s in the various SLAs for various Azure databases is given.

There’s an examination of the various HA/DR options available with both IaaS and PaaS. Topics discussed include Availability Sets, Availability Zones, Azure Site Recovery, Windows Failover Cluster, Log Shipping, and AlwaysOn Availability Groups.

There’s a useful section on database backup and restore, including many step-by-step walkthroughs.

I found the structure of this chapter particularly troublesome, with it being difficult to decipher the topics within the expected subheadings in relation to the ToC.


Last Updated ( Saturday, 23 April 2022 )