The Definitive Guide to MongoDB

Authors: David Hows, Eelco Plugge, Peter Membrey, Tim Hawkins
Publisher: Apress, 2013
Pages: 308
ISBN: 978-1430258216
Aimed at: programmers who want to learn MongoDB
Rating: 4.5

"A Complete Guide to Dealing with Big Data Using MongoDB". Does it live up to this claim?

 

This second edition has been updated to cover the changes in MongoDB 2.4, such as hashed indexes.

The authors start Part 1: MongoDB Basics with a look at the philosophy and ideas behind the creation of MongoDB, and how the design decisions affect the way the database works. JSON is also introduced in this chapter. While the authors are obviously fans of MongoDB and the NoSQL model, they do also mention its drawbacks so you don’t get a completely biased view.

Having told you why MongoDB is a good idea, the authors then move on to how to install it on Linux and Windows, along with the PHP driver and the Python driver.

 


 

A MongoDB database consists of collections of documents, with indexes to improve performance. The way all this works is the subject of Chapter 3, which also covers geospatial indexing. Working with data (querying, updating, adding and removing documents) is covered in the next chapter.

By Chapter 5 the authors are on to GridFS, and the way it can be used to store and retrieve large files. GridFS is the specification used by all the MongoDB drivers, and it overcomes the limit of 16MB per MongoDB document. The idea is that if you have large files that you want to store using MongoDB, they're split into chunks that are stored as separate documents and accessed using GridFS.
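As a rough illustration of the chunking idea, here is a minimal plain-Python sketch. It doesn't use a MongoDB driver. The helper names, the sample payload and the tiny chunk size are assumptions made for the example; only the field names (files_id, n, data, length, chunkSize) follow the GridFS convention.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny for the example, real chunks are far larger


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Cut a byte string into numbered chunk documents."""
    return [
        {"files_id": file_id, "n": i, "data": data[start:start + chunk_size]}
        for i, start in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Put chunks back in order by their sequence number and join them."""
    ordered = sorted(chunks, key=lambda c: c["n"])
    return b"".join(c["data"] for c in ordered)


payload = b"hello gridfs"  # hypothetical file contents
chunks = split_into_chunks("file-1", payload)

# One metadata document describes the file; the chunks hold the bytes.
file_doc = {"_id": "file-1", "length": len(payload), "chunkSize": CHUNK_SIZE}

print(len(chunks), reassemble(chunks))
```

Each chunk stays well under the per-document limit, and the sequence number is what lets a driver put the file back together.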

 

 

Part 2 of the book is dedicated to Developing with MongoDB. There are chapters on developing for MongoDB with PHP and Python, and a chapter on ‘advanced queries’ that looks at text search, the aggregation framework, and MapReduce. The chapter on using PHP is good at identifying where PHP's way of working differs from what would be an ideal match with MongoDB, and how to get around this. The aggregation framework was introduced in MongoDB 2.2, and consists of a set of pipeline operators that you can put together to form sequences of operations on your data. The first operator works on all the data in the collection; later ones in the pipeline work on the output from the earlier operators. The chapter looks at $group, $limit, $match, $sort, $unwind, $project, and $skip.
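To make the pipeline idea concrete, here is a small plain-Python analogy. It isn't the MongoDB API: the sample documents, field names and helper functions are all made up for illustration. Each stage only ever sees the output of the stage before it, in the spirit of $match, $group, $sort and $limit.

```python
from collections import defaultdict

# Hypothetical sample documents.
orders = [
    {"customer": "ann", "status": "paid", "amount": 30},
    {"customer": "bob", "status": "paid", "amount": 20},
    {"customer": "ann", "status": "paid", "amount": 25},
    {"customer": "cat", "status": "open", "amount": 99},
    {"customer": "bob", "status": "paid", "amount": 5},
]


def match(docs, criteria):
    """Keep documents whose fields equal every value in criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def group_sum(docs, key, field):
    """Group documents by key and sum field within each group."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    return [{"_id": k, "total": v} for k, v in totals.items()]


def sort_desc(docs, field):
    """Order documents by field, largest first."""
    return sorted(docs, key=lambda d: d[field], reverse=True)


def limit(docs, n):
    """Keep only the first n documents."""
    return docs[:n]


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage, args in stages:
        docs = stage(docs, *args)
    return docs


pipeline = [
    (match, ({"status": "paid"},)),       # works on the whole collection
    (group_sum, ("customer", "amount")),  # works only on the matched documents
    (sort_desc, ("total",)),
    (limit, (1,)),
]

result = run_pipeline(orders, pipeline)
print(result)  # [{'_id': 'ann', 'total': 55}]
```

Putting the filtering stage first means every later stage has less data to process, which is the usual reason for ordering a pipeline this way.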

The third and final part of the book is titled Advanced MongoDB with Big Data. There’s a useful chapter on administering MongoDB, then in-depth chapters on optimization, replication, and sharding. The optimization chapter looks at how to evaluate query performance using the profiler and explain(), and how to use the two together to optimize a query. The section on how MongoDB selects which index to use was interesting, as was the section on using hint() to force the use of a specific index.

The chapter on replication shows how to manage the oplog, in terms of setting its size to meet the needs of synchronizing replicas, and explains what the stats actually mean. The sharding chapter starts with a good explanation of why you need to shard, then goes on to analyze different sharding options, including the use of MongoDB’s balancer, the use of hashed shard keys, and tag sharding, which lets you specify which data should be located in a particular shard. I’d have liked longer chapters on both replication and sharding, but what is there is good.
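The appeal of a hashed shard key is that keys which arrive in increasing order still end up spread across shards. The sketch below shows that effect in plain Python. It is a simplification under stated assumptions, not MongoDB's actual placement logic: the shard names are hypothetical, SHA-256 is a stand-in hash, and the hash space is simply cut into equal ranges.

```python
import hashlib
from collections import Counter

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def hashed_key(value):
    """Map a key to a 64-bit integer (SHA-256 is a stand-in for this sketch)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(value, shards=SHARDS):
    """Cut the 64-bit hash space into equal ranges, one range per shard."""
    index = (hashed_key(value) * len(shards)) >> 64
    return shards[index]


# Sequential ids, the kind of key that would otherwise pile up on one shard.
placement = [shard_for(order_id) for order_id in range(1000)]
print(Counter(placement))
```

The trade-off is that range queries on the original key can no longer be aimed at a single shard, because neighbouring keys no longer live next to each other.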

The authors have done a good job on this book. The high-level explanations really make sense, and the technical material is clear. I’d still have liked more on the advanced topics, but it’s a good read.

 

Related Reviews 

MongoDB the Definitive Guide (O'Reilly)

MongoDB Applied Design Patterns (O'Reilly)

MongoDB in Action (Manning)

 


 

Last Updated ( Wednesday, 16 July 2014 )