|Expert SQL Server In-Memory OLTP (2nd Ed)|
|Written by Ian Stirk|
Page 1 of 3
Author: Dmitri Korotkevitch
This book aims to explain SQL Server 2016 In-Memory OLTP, which can dramatically improve performance, how does it fare?
In-Memory OLTP (Online Transaction Processing) is the primary reason to migrate from older versions of SQL Server (pre-2014). Under the right conditions it can offer significant performance benefits, with code typically running 20 to 40 times faster.
This book is an updated version of the author’s SQL Server 2014 book. Whilst SQL Server 2014 introduced In-Memory OLTP technology, it contained many limitations, most of these have been removed in SQL Server 2016.
This book is targeted at SQL Server developers and architects wanting to know how to take advantage of the potential performance improvements. Some background knowledge of SQL Server, especially performance, will be useful.
This is a single topic book, having 13 chapters and 4 short appendices, contained within around 300 working pages.
Below is a chapter-by-chapter exploration of the topics covered.
Chapter 1 Why In-Memory OLTP?
The chapter opens with a look at the background of In-Memory OLTP. Historically, memory was expensive, it made sense to store data on disks and bring it into memory for processing. With the advent of much cheaper memory, a different processing model could be used. In-Memory OLTP was designed for scalable fast processing. The design resulted in In-Memory structures, a concurrency model that eliminates locks and latches, and natively compiled stored procedures that use significantly fewer CPU cycles – all of which can significantly improve performance.
Additionally, In-Memory OLTP is integrated seamlessly with other SQL Server 2016 components, there is no need for system refactoring – this is a major advantage when compared with other database vendors. There are still some limitations, but significantly fewer than with SQL Server 2014.
The chapter continues with a look at In-Memory OLTP architecture. While In-Memory objects are fully integrated into SQL Server, and can be used transparently, internally they are different. The objects are stored in memory, they live outside the buffer pool, and have different structures compared with on-disk structures. Classic (i.e. interpreted) SQL can access In-Memory tables via interop, and this is typically 2 to 4 times faster than accessing the corresponding on-disk table. Natively compiled stored procedures can only access In-Memory tables, and are typically much faster than using interpreted SQL.
This chapter provides a useful overview of how In-Memory OLTP arose, together with its major components.
Useful discussions, diagrams, practical example code, website links, and inter-chapter links are given throughout. These traits apply to the whole of the book.
Chapter 2 In-Memory OLTP Objects
This chapter opens with a look at preparing a database to use In-Memory OLTP. Firstly, a separate filegroup with the keyword CONTAINS MEMORY_OPTIMIZED_DATA is required, a code example of this is provided. Next, sample code is provided to create an In-Memory table, and the various keywords discussed (memory_optimized, durability). Indexes are created in-line at the same time as the creation of the table. The two types of index are briefly discussed, hash indexes are optimized for key lookups, and nonclustered indexes are useful for searching for a range of data.
The chapter continues with a look at working with In-Memory tables. Interpreted SQL treats In-Memory tables like on-disk tables, and most SQL is supported but there are a few restrictions (e.g. can’t truncate tables). Next, natively compiled stored procedures are examined, these can only access In-Memory tables, and are compiled into DDLs for improved performance.
The chapter ends with a look at the optimistic concurrency model used by In-Memory OLTP, this is lock and latch free, using versioned records similar to the snapshot isolation level. Example code is provided to compare the performance of interpreted SQL running against both In-Memory tables and on-disk tables, and this is compared with running natively compiled stored procedures against In-Memory tables. The results show that In-Memory tables perform better than on-disk tables, additionally In-Memory OLTP provides improved scalability.
This chapter provides helpful example code to create the relevant In-Memory structures, additionally, the performance advantage of In-Memory tables, and natively compiled stored procedures are demonstrated.
Chapter 3 Memory-Optimized Tables
This chapter provides greater implementation detail. First, the structure of on-disk and In-Memory tables are discussed and compared. In-Memory tables don’t use pages (except the Bw-tree), and data rows reference each other via pointers, a table needs to include at least 1 index, and indexes are embedded into the table.
The chapter continues with a review of the optimistic concurrency model used. In-Memory table data is never updated, instead the existing row is marked as deleted and a new row inserted, this is controlled via the BeginTs and EndTs columns on each row. This approach means there is no need for locks and latches, improving performance and scalability. Deleted rows are removed later by the Garbage Collector. The format of the data row is discussed, having a row header (containing: BeginTs, EndTs, StmtId, idxLinkcount, Index pointer array), and a Payload (the actual data).
Next, the process of compiling the In-Memory tables is discussed. A separate DLL is created for each table, these DLLs are used to access and manipulate data in payload section of the row. The section continues with a look at various limitations, these relate to: data types, table features, and database-level limitations – however these limitations are significantly fewer than with SQL Server 2014.
The chapter ends with a brief look at how In-Memory OLTP interacts with High Availability (HA) technologies. AlwaysOn failover cluster, Availability Groups, log shipping, and replication are supported. Mirroring is not supported, and this is to be expected since AlwaysOn Availability Groups is the preferred HA mechanism.
This chapter provides a helpful comparison of on-disk and In-Memory tables. There’s a useful overview of the Multiversion Concurrency Control (MVCC) mechanism, showing why locks and latches are no longer necessary. The discussion of the data row format provides a helpful insight into the internals. The remaining limitations discussed should prove useful when migrating existing systems to In-Memory OLTP.
Chapter 4 Hash Indexes
The chapter starts with an overview of what hashing is - it provides a compact value of data, which typically improves performance. Ideally each hash will have its own bucket, but uniqueness can’t be guaranteed, and collisions can occur, and this can impact performance.
After examining the structure of the hash index, the chapter continues with a discussion on the importance of the bucket count. The bucket size is pre-allocated, too many and space is wasted, too few and collisions occur resulting in traversing more rows in a bucket to get to correct data - decreased performance. Example code is provided that illustrates the performance impact of having too high and too low bucket counts. Ideally, the bucket count should relate to the number of unique keys, but also take into account any future expected growth.
Various Dynamic Management Views (DMVs) are shown to examine bucket counts and row chain lengths etc. It is possible to change the bucket count after the table has been created, by altering the table (new in SQL Server 2016), which creates a new table object in the background.
The chapter next looks at hash indexes and SARGability – the ability to use predicates to search via an index. Hash indexes are very efficient for point-lookup equality searches. It should be noted, for composite indexes, you need to specify all the columns in the composite index – there is no partial matching as is possible with B-tree indexes.
The chapter ends with a look at statistics on In-Memory tables. Statistics describe the distribution and density of column values, and are used by the optimizer to estimate cardinality, and thus help produce an efficient query plan. The statistics on SQL Server 2016 In-Memory tables/indexes are updated automatically as the data changes (new in SQL Server 2016), however any stored procedures that use these tables are not automatically recompiled to take advantage of the new statistics – it is important to manually refresh these stored procedures if the data changes significantly. Ideally the stored procedures should be altered rather than recompiled since the latter is deferred until needed, which may impact performance.
This chapter provides a very useful discussion of hash indexes, describing the importance of the bucket count, and its role in performance. The importance of manually refreshing stored procedures after the statistics are updated is noted.
|Last Updated ( Tuesday, 01 May 2018 )|