Author: Joe Celko
Publisher: Morgan Kaufmann
Aimed at: Advanced SQL developers or computer science students
Reviewed by: Kay Ewbank
This is a must-read book for any SQL programmer, covering advanced topics in an understandable way.
This is the fifth edition of Joe Celko’s seminal work on SQL programming, updated to cover ANSI SQL 2011.
Celko is famed in the SQL community; he served for ten years on the ANSI/ISO SQL Standards Committee, and contributed to both the SQL-89 and SQL-92 standards. He’s equally well known for his many books, magazine and web articles, some of which have been contributed to this site, see Sharpen Your Coding Skills.
Since the first edition twenty years ago, this book has been aimed at taking experienced SQL programmers and giving them the extras to make them experts.
While the title of the book refers to advanced SQL programming, Celko starts with the underlying structures – the data declaration features. He has chapters discussing databases versus file systems, transactions and concurrency control, tables, keys, normalization, views, and other schema objects. Throughout the book, there is ample SQL code, all of it ANSI standard SQL. This does mean you’ll need to ‘tweak’ it to reflect the idiosyncrasies of the dialect of SQL used by your particular server, but that shouldn’t be too much of a problem for the target readers of this book.
One of the best chapters in this part of the book covers DDL flaws that you should avoid, starting with tibbling. This is Celko’s name for the tendency to start object names with some ‘identifier’ showing what type they are; tbl for table, hence tibble. Other things covered in this chapter include attribute splitting, overloading design flaws, and non-normal form redundancy.
One of Celko’s most endearing characteristics is his wry use of examples. For schema level overloading, he uses the example of ‘automobile, squid and Lady Gaga’ to refer to the temptation to put unrelated data into one logical unit. Examples like this make you laugh, but illustrate the point being made very clearly.
The second part of the book covers data types, with chapters on the different types – numeric, character, and temporal, along with nulls, table and set operations.
For many readers, the real heart of the book starts with Part 3, which covers row and column level features such as comparison operators, subquery predicates, between and overlaps, case, like and similar to predicates. It also has extensive coverage of Select, aggregate functions, and OLAP aggregation. The chapters on OLAP aggregations and on advanced select statements are excellent. If someone’s not clear on the different join types, point them in the direction of this chapter, it will give them real insights into how and when to create joins.
Part 4 (by this time we’re already at page 545), is about data structures in SQL. There are chapters on graphs, trees and hierarchies, queues and matrices. The material on trees and hierarchies is covered in much greater detail in Celko’s book dedicated completely to this topic.
There’s a really good section next on typical queries, starting with a chapter on partitioning and aggregating data in queries that gives as good an explanation as I’ve read anywhere. Sub-sequences, regions, runs, gaps and islands discusses handling more complex sequences of data, and covers ideas such as swapping and sliding values in a list, and condensing and folding lists. There’s a chapter on data from auctions and how to work with it, and a good discussion on relational division using the example of pilots and planes they can fly. Celko is particularly good on temporal data, and his chapter on temporal queries shows this.
The final part of the book has the title ‘Implementation and coding issues’, and it has chapters discussing procedural, semi-procedural and declarative programming in SQL; nesting levels; and embedded SQL, CLI Dynamic SQL, and SQL/PSM.
I really enjoyed this book. Someone once described a previous edition as being really good so long as you’re already a really good programmer, which I think is a bit harsh. It’s true that you wouldn’t choose this book if you’d never programmed or met SQL before, but you can learn a lot without needing to be an expert; it’s just you won’t get the full benefit from Celko’s expertise. Celko has strong opinions and isn’t afraid to share them, but even if there are situations where you don’t agree with what he says, you learn a lot. He’s excellent at giving understandable explanations of complex ideas. You’ll probably get to some sections of the book and think “I’m never going to have to code for that”, but that’s not a problem; just make use of the parts you do need to learn.