Pro SQL Server Internals
Article Index
Pro SQL Server Internals
Chapters 5 -11
Chapters 12 -17
Chapters 18 - 24
Chapter 25 - 29
Chapters 30 - 35



Chapter 18 Troubleshooting Blocking Issues

Blocking is a very common in databases and is to be expected, however excessive blocking decreases concurrency and throughput and should be investigated with a view to reducing it. The chapter discusses a general approach to troubleshooting i.e. find the queries involved in the blocking, find out why they’re blocking, and fix the root cause. Various tools are given for investigating blocking, these include DMVs for real-time troubleshooting, and the ‘blocked process report’ for later analysis.

Examples are included to examine real time blocking issues, the key DMV is sys.dm_tran_locks, but it needs joining with sys.dm_exec_requests and sys.dm_os_waiting_tasks to get additional useful information. The troubleshooting code provided should prove very useful in investigating your own blocking problems.

The chapter ends with a look at collecting blocking information for later analysis. This involves enabling the ‘blocked process report’ setting, and collecting the blocking event (via SQL Trace easiest). The last section details how to interpret this captured output.

This chapter contains a good overview of what blocking is, and approaches to investigate both real-time and for later investigation. Some very useful code is provided that identifies the code being blocked and that causing the blocking. Often looking at the SQL involved and the execution plan will indicate the blocking is due to unnecessary scans caused by suboptimal SQL.

Chapter 19 Deadlocks

Deadlocks occur when 2 sessions own resources and then want the resource the other session has. In many ways deadlocks can be viewed as irresolvable blocking. A deadlock manager runs every 5 seconds to detect deadlocks and resolves them by rolling back the transaction with the least log space, rollback is influenced by the set_deadlock_priority option.

The chapter provides sample code and descriptions of deadlocks that are due to:

  • volatile data being updated

  • non-optimized queries causing scan to be used

  • simultaneous read/update statements

  • multiple updates of the same row

Troubleshooting deadlocks is similar to troubleshooting blocking. There are various ways to get a deadlock graph, including SQL profiler, Extended Events (XE), trace flag 1222, and the system_health XE session. The content of the deadlock graph is decoded and explained.

Various methods of reducing deadlocks are given, these include:

  • optimize queries (scan often the problem, so add index)

  • keep locks as short as possible (keep transactions short, update at end)

  • use lowest transaction isolation level that provides the required data consistency

  • access entities in same order

  • avoid updating the row multiple times in same transactions when multiple indexes involved

  • use retry logic (TRY/CATCH, 1205 error code – sample code)

This chapter provides a comprehensive review of deadlocks, what they are, how to investigate them, and potential solutions. As always, useful code was given. I was a surprised no mention was made of deadlocks being caused by lock escalations, more so because it’s the subject of the next chapter.

Chapter 20 Lock Escalation

This chapter explains the process of lock escalation. Row-level locks are good for concurrency, but can be expensive on memory. In lock escalation, the row locks are replaced by a table lock. Typically, when 5000 row or page level locks are acquired on an object, SQL Server tries to escalate the locks to a table lock. The operation succeeds if there are no other sessions having incompatible locks, otherwise the lock escalation is retried for every 1250 additional row/page locks on the given object.

There’s an interesting section on troubleshooting lock escalation problems. Various tools can be used to investigate lock escalation including using wait stats, SQL trace, various Performance Monitor (perfMon) counters, and lock escalation extended events. It is possible to control lock escalation behaviour using the lock_escalation table setting and various trace flags.

The chapter provides a good explanation of lock escalation, what it is and why it is a concern. There’s a useful code sample that shows less locks being used when LOCK_ESCALATION=TABLE is set. The author recommends keeping lock escalation enabled unless you find it introduces noticeable blocking issues. There’s a useful point that WITH ROWLOCK doesn’t affect lock escalation behaviour!

Chapter 21 Optimistic Isolation Levels

Optimistic isolation levels are a useful way to handle blocking and data consistency. In essence, while data is being modified, other sessions can read an older version of committed data (in the version store) instead of being blocked. A cleanup process runs regularly to tidy up the version store, this can be seen via perfMon counters.

Various DMVs can be used to troubleshoot version store and transaction problems. There is a good overview of the row versioning process, which reduces blocking but comes at a cost (increased load on tempdb, overhead in data modification/retrieval, and increased fragmentation).

The various optimistic transaction isolation levels are discussed, namely:

  • Read committed snapshot (database option that changes behaviour of readers)

  • Snapshot (separate transaction level, use in code)

This chapter provides a good discussion of the Optimistic Isolation Levels, what they are and how they are useful in reducing blocking, the cons associated with them are also discussed. There’s an interesting point that switching on read committed snapshot isolation can be great emergency way of removing blocking issues, since it removes writers/readers blocking with no code changes.

Chapter 22 Application Locks

This chapter discusses a relatively little used method of locking. Applications can place locks on resources, these are not related to database objects. In essence the user controls the locking process via sp_getapplock and sp_releaseapplock. Two sessions can’t execute same code at same time, it’s similar to mutexes in that it serializes code access. Useful code samples illustrate these concepts.

Use of application locks is relatively rare, so I’m not sure why it was including in this book, perhaps to provide an alternative more controllable locking mechanism. This chapter had no subsections or summary section.

Chapter 23 Schema Locks

Schema locks are used to ensure the database schema is consistent during code execution e.g. to prevent table alteration while other queries accessing a table.

Schema modification locks (Sch-M) are acquired before any schema changes, and are held until the end of the transaction e.g. used when altering the partition function. They can seriously affect system availability when changes cause scans. Schema stability locks (Sch-S) are used during DML query compilation and execution. Useful code is provided to illustrate the impact of these locks.

SQL Server 2014 introduced low-priority locks, which aims to improve concurrency during online index rebuilds and partition switching operations.

Chapter 24 Designing Transaction Strategies

This chapter summarises some very useful rules garnered from previous chapters, to reduce blocking (and deadlocks) and improve concurrency, namely:

  • can be improved with query optimization (also Optimistic transaction isolation levels)

  • make transaction as short as possible

  • update data close to end of transaction (reduced X lock duration)

  • if read and write to same table, best to read first. Also avoid updating same rows multiple times, especially if many indexes

  • don’t mix DML and DDL statements in one transaction

  • be careful of lock escalation, especially with OLTP. Try to work with batches < 5000 rows.

  • considered retry logic for deadlocks

  • read uncommitted says we don’t care about data consistency…

  • be careful of using repeatable read and serializable for OLTP

  • read committed snapshot is good option but be careful of impact on tempdb

  • different use cases need different approaches and isolation levels


Last Updated ( Tuesday, 01 May 2018 )