HBase in Action

Authors: Nick Dimiduk & Amandeep Khurana
Publisher: Manning, 2012
Pages: 360
ISBN: 978-1617290527
Aimed at: Database developers who want to learn how to use HBase
Rating: 4.5
Reviewed by: Kay Ewbank

This book sets out to teach those with experience of other databases how to build applications using HBase. How well does it succeed?

One barrier to learning about open-source software is that the documentation is usually sketchy or non-existent. This is understandable - all of us know that writing the manual is time-consuming and fairly thankless, and when there aren’t limitless resources to throw at the problem, it’s going to be low on the priority list. All that doesn’t make it any easier to break into a new technology, though.

HBase is the NoSQL database developed as part of Apache’s Hadoop project. The authors, Nick Dimiduk and Amandeep Khurana, describe HBase in Action as the HBase User’s Guide, intended to teach developers who probably have some experience with other databases how to build applications using HBase.

 

As such, the book opens with a couple of chapters that introduce HBase and get you started using it. They also introduce the example database application used throughout the book: TwitBase, a simplified clone of Twitter. The authors use the HBase Java client library for these early chapters, and there’s code on most pages. The material is covered at an ideal level, and because Dimiduk and Khurana have developers as the target audience, they focus on what you actually need to know.

By Chapter 3 the authors are on to Distributed HBase, HDFS and MapReduce. They start by describing the problem MapReduce is used to solve (efficient batch handling of large amounts of offline data), then give an overview of MapReduce and how to use it on dataflows. After showing how HBase works in distributed mode, they put HBase and MapReduce together in an HBase MapReduce app.

 

Part 2 of the book is titled Advanced Concepts, although it starts with a look at designing schemas in HBase. The next chapter moves on to using HBase with the observer and endpoint coprocessors. In HBase terms, this means pushing some computation out to the HBase nodes, where it runs in parallel across all of HBase’s RegionServers.
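
To make the endpoint idea concrete, here is a minimal sketch in plain Python. It does not use the HBase API and is not taken from the book: the `REGIONS` data and the function names `region_sum` and `endpoint_style_sum` are hypothetical, and a thread pool merely stands in for the RegionServers. It only illustrates the pattern of sending a small computation to each partition of the data, running it in parallel, and merging the partial results on the client.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a table split into regions (not HBase's API).
# Each inner list plays the role of the rows held by one RegionServer.
REGIONS = [
    [("user1", 3), ("user2", 5)],
    [("user3", 7)],
    [("user4", 1), ("user5", 4)],
]


def region_sum(rows):
    """Computation that would run next to the data: a partial sum for one region."""
    return sum(value for _key, value in rows)


def endpoint_style_sum(regions):
    """Run region_sum on every region in parallel, then merge the partial sums."""
    # max(1, ...) keeps the pool valid even when there are no regions.
    with ThreadPoolExecutor(max_workers=max(1, len(regions))) as pool:
        partials = list(pool.map(region_sum, regions))
    return sum(partials)


total = endpoint_style_sum(REGIONS)
print(total)
```

The point of the pattern is that only the small partial results travel back to the client, rather than every row.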

Coprocessors were added to HBase in the 0.92 release and the authors stress they’re untested in production deployments. HBase coprocessors, being so new, aren’t that well understood, and this chapter makes the book worth getting even if it’s the only bit you use. The final chapter in this part of the book covers alternative HBase clients: scripting from UNIX, JRuby, REST, Python, and an alternative Java HBase client called asynchbase.

Part 3 of the book shows two example HBase applications, an online time-series database and a geographical information system. The final part of the book looks at putting HBase into operation. It starts with a chapter on deploying HBase with discussions of how to plan your cluster, which distribution to use, and how to configure the system. The final chapter looks at ongoing management - monitoring your cluster, performance testing and tuning, cluster management, and backup and replication.

This is a really interesting book. It’s well written and readable, even when explaining difficult topics. The code is well explained, and used to illustrate relevant points rather than just to fill space. Some aspects seemed to be in an odd order: deployment comes last, and schema design is placed among the ‘advanced’ topics. That’s a very minor caveat, though. At the end of reading the book I felt I had a much clearer understanding of HBase, and would be reasonably happy to write a real system using it.

Related Reviews

HBase: The Definitive Guide

 


Last Updated (Wednesday, 27 March 2013)
 
 

   
Copyright © 2014 i-programmer.info. All Rights Reserved.