|Joe Celko’s Trees and Hierarchies in SQL for Smarties|
Author: Joe Celko
Publisher: Morgan Kaufmann
Aimed at: Advanced SQL developers or computer science students
Pros: Lucid and understandable descriptions of abstract topics backed up by real code
Cons: Not a light read
Reviewed by: Kay Ewbank
Hierarchies and Trees are never going to provide the material for a light read, so be prepared.
If you’re a database developer, chances are you’ll already know Joe Celko’s work from his column in DBMS Magazine, and from his writing in other books and on the web.
This is the second edition of his book on using trees and hierarchies in SQL. In the introduction Celko says his real aim in writing the book is to save him the trouble of answering emails about the techniques and to overcome the need to paste more code on forums, newsgroups and blogs.
The book is fairly short, but packed with useful stuff if you need to represent trees and hierarchies in SQL as nested sets, or if you need to understand graph theory for your computer science course. Of course, the number of SQL programmers who’ll need to do this is probably quite limited, but it’s an interesting read anyway, if only for the insights it provides into what can be achieved in SQL.
The early chapters in particular are much more academic than you would normally expect in a practical book on database programming, but the material means you have to understand abstract concepts. The SQL used in the book is standard ANSI, and doesn’t make use of any of the proprietary methods added by Oracle, IBM and Microsoft, which means the code can be used as-is in databases including MySQL.
Given the material covered, you can’t expect this to be an easy read, but Celko has an excellent writing style and explains the ideas as clearly and simply as possible. He is very much a ‘show by example’ writer, but so long as you don’t mind that, the book works well.
The book kicks off with an intro to graph theory, and what trees and hierarchies are in terms of graphs, and how graphs fit into the development of databases. There are chapters on the adjacency list model, path enumeration models, nested set model, and frequent insertion trees. After a look at binary trees, the general material ends with a look at other models for trees. Celko then moves on to look at proprietary extensions for trees, looking at how Oracle, DB2 and Microsoft have incorporated ways to handle trees in their databases.
There are two chapters on hierarchies, looking at data modelling and encoding schemes, and there’s also an extensive chapter on hierarchical database systems - essentially IMS. This might seem a strange topic to include, but as Celko points out, there’s a good chance that there’s still more data held in IMS databases than is held in SQL, so you may well have to interact with IMS at some point.
If you are familiar with the first edition of the book, the new material is mainly in three extra chapters, covering graphs in SQL, Petri nets and state transition graphs. The chapter on graphs in SQL shows how recursive common table expressions can be used instead of loops to get faster query execution, with plenty of samples.
The chapter on Petri nets is fairly short, essentially a heads up on what they are (a cross between a state transition diagram and a board game, according to Celko), and a minimal description of why they’re useful. There’s also a data definition language example showing how to model a Petri net, along with code for using it. The final new chapter covers state transition graphs and how they can be used for enforcing business rules.
Celko does his best to make the entire subject of trees and hierarchies understandable. He’s good at coming up with real-world examples so you can follow the abstract concept - a newsgroup and its message threads make up a frequent insertion tree, nested sets are like ‘counting tags’ in XML or HTML, and so on. This helps, but the material is still pretty abstract. A lot of SQL programmers have been experimenting with trees and hierarchies, and if you fall into that category - or have data sets that are hierarchical - then this is the best general description you’re likely to find.
|Last Updated ( Tuesday, 27 March 2012 )|