SQL Server 2012 Data Integration Recipes

Author: Adam Aspin
Publisher: Apress
Pages: 1003
ISBN: 978-1430247913
Audience: Developers needing to bring data into SQL Server
Rating: 4.5
Reviewer: Kay Ewbank

Data integration is quite a narrow topic in the overall area of databases, but it’s one that you have to get right if your database project is going to be successful.

This book gives detailed step by step instructions for taking data to and from a wide variety of data sources, and for how work with the data once you’ve got it into SQL Server. This might seem an easy topic - why do you need to read a heavyweight book to learn how to import and clean up data? The aim here is to provide as much automation as possible, though, to save on time and errors, so this isn’t just about how to carry out a task, but how to include it as part of a T-SQL or VB.NET script.




The first seven chapters of the book look at how to get data into and out of SQL Server, with examples showing how to connect to different data sources, and what problems you’re likely to encounter. Aspin then moves on to the wider aspects of ETL, looking at metadata analysis, data transformation, profiling source data, and ways you can optimize the data load.

The examples are mainly written in T-SQL, with a smattering of MDX where Aspin is showing how to handle Analysis Services data. Where he’s writing SSIS scripts he mainly uses VB.NET, though there are a few examples that use C#. The book covers pretty much everything about using SSIS (SQL Server Integration Services), and also covers other aspects of ETL.




The early chapters each take a single data source, and give step by step instructions on importing and exporting data to and from it, so you’re shown how to script the import and export data from Microsoft Office (Excel and Access), flat files, XML, SQL, SQL Server, and a round-up of miscellaneous sources. The details go further than just import and export; Aspin looks at modifying the data while loading and choosing the subset of data to load. He also shows how to deal with specific problems such as converting a complex Access database to SQL Server, using BULK INSERT, and flattening XML files prior to insert. The chapter on loading SQL files covers Oracle, DB2, MySQL, Sybase, Teradata and Postgre. The miscellaneous options are SQL Server Analysis Services, OLAP, images and documents, Visual FoxPro, dBASE, ODBC, and Windows Management Instrumentation.

Getting the data into SQL Server is the easy bit, though, and Aspin then moves on to making it usable. There’s a good chapter explaining metadata and how to analyze and query metadata from a number of SQL databases, and a thorough coverage of data transformation, including pivoting data, normalizing, consolidating multiple sources, and dealing with the various different types of slowly changing dimensions. Aspin shows how to use T-SQL and alternatively SSIS in each case. Data profiling to gain a high level understanding of what you’ve got or not got in particular columns of data is the next topic, with in-depth coverage of using the SSIS Data Profiling Task alongside alternative methods.

Once the data is all loaded and worked over, you’ll probably need to reload those data values that have changed, a topic Aspin calls Delta Data Management. He looks at the options of working out what data has changed externally and alternatively identifying it during the SQL Server ETL Load, with analysis of what the advantages and disadvantages are. The related topic of change tracking gets its own chapter, as does ways to make ETL faster. The topic of organizing and optimizing data loads is covered particularly well, and could save you a lot of time if you follow the advice. The chapter on Logging and Auditing and using the logging framework is also very good, and brings the main book to a close.

Most database developers would find some elements of this book useful. You’re unlikely ever to need all of it - if you’re working with DB2 data, the section on working with Access will be wasted, for instance, and in general you’ll probably only use one small part of each chapter - the bit relevant to your particular circumstances. However, the scripts and T-SQL are clearly written and explained, and while you may only use a small percentage of the overall material, that’s down to the multiple options rather than any weakness of the book.  



The Big Book of Small Python Projects

Author: Al Sweigart
Publisher: No Starch Press
Date: June 2021
Pages: 432
ISBN: 978-1718501249
Print: 1718501242
Kindle: B08FH9FV7M
Audience: Novice Python developers
Rating: 4
Reviewer: Lucy Black
A project book? A good way to learn Python?

SQL Server 2022 Query Performance Tuning (Apress)

Author: Grant Fritchey
Publisher: Apress
Pages: 745
Audience: DBAs & SQL Devs
Rating: 4.7
Reviewer: Ian Stirk 

A popular performance tuning book gets updated for SQL Server 2022, how does it fare?

More Reviews

Last Updated ( Tuesday, 29 January 2013 )