Authors: Jeremy Leipzig & Xiao-Yi Li
Publisher: O'Reilly, 2011
Aimed at: Existing users of R
Pros: One innovative and interesting project
Cons: Lacks explanation, second project unimpressive
Reviewed by: Mike James
This is an example of the very thin but very focused books that O'Reilly has started to produce. This one has a cover price of just under $30, which puts the cost at around $1 per page! Even though you can get it discounted to around half this price, it still seems expensive at 50 cents per page. But enough about money: if the content is good enough you can charge what you like. After all, it isn't the paper that you are buying, but the information.
This is a book about using R to create data mashups. This is not an idea that will occur immediately to most programmers. R is a statistical language, and it does both handle large datasets and produce graphics. However, most programmers will think of using some other, more commonly encountered language to create an entire system. So, from the point of view of demonstrating that R can do this sort of thing, the book has some value.
The book deals with the construction of a single project - a presentation of foreclosure data. This is a fairly standard task so, even if you aren't interested in the data, the methods are general. The big problem is that the presentation of the ideas isn't very good.
The book launches into its subject matter without much scene setting - it's a short book so that is good - but then it continues with minimal explanation. The first chapter is about cleaning up address data using a regular expression. If you know about regular expressions then it is obvious - if you don't then you won't really follow.
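To give readers who haven't met regular expressions a feel for the kind of clean-up involved, here is a minimal sketch. It is written in Python rather than R, and the function name `clean_address` and the normalization rules are my own assumptions for illustration, not the book's code.

```python
import re

def clean_address(raw: str) -> str:
    """Normalize a street address string (hypothetical rules, for illustration)."""
    # Trim the ends and upper-case so later comparisons are case-insensitive.
    addr = raw.strip().upper()
    # Collapse runs of whitespace into single spaces.
    addr = re.sub(r"\s+", " ", addr)
    # Drop a trailing unit designator such as "APT 4B", "UNIT 12" or "#12".
    addr = re.sub(r"\s+(?:(?:APT|UNIT|STE)\.?\s+|#\s*)\w+$", "", addr)
    return addr

print(clean_address("  123  Main St   Apt 4B "))
print(clean_address("500 Oak Ave #12"))
```

The point is only that a couple of patterns can turn inconsistent address strings into something a geocoder is more likely to accept; real address data would need more rules than this.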
From here we move on to getting geocoding data from Yahoo, parsing XML and constructing a geographical map. These are all very simple uses of standard programming facilities in R, or in packages you can add to R. The explanations are not great, and if you know R the only thing you are going to get from this chapter is that R can do this. If you don't know R then it might be an incentive to learn it.
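The XML-parsing step is the sort of thing that can be sketched in a few lines. The example below is in Python rather than R, and the XML layout (`ResultSet`, `Result`, `Latitude`, `Longitude`) is a made-up stand-in; it is not claimed to match the real response format of the Yahoo service or the code in the book.

```python
import xml.etree.ElementTree as ET

# Hypothetical geocoder response; the element names are assumptions.
SAMPLE = """
<ResultSet>
  <Result>
    <Latitude>39.95</Latitude>
    <Longitude>-75.16</Longitude>
  </Result>
</ResultSet>
"""

def parse_lat_lon(xml_text: str) -> tuple[float, float]:
    """Return (latitude, longitude) from the first Result element."""
    root = ET.fromstring(xml_text.strip())
    result = root.find("Result")
    if result is None:
        raise ValueError("no Result element in response")
    lat = result.findtext("Latitude")
    lon = result.findtext("Longitude")
    if lat is None or lon is None:
        raise ValueError("Result is missing Latitude or Longitude")
    return float(lat), float(lon)

print(parse_lat_lon(SAMPLE))
```

Once each address has a coordinate pair like this, plotting the points on a map is a matter of handing them to whatever mapping package is in use.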
Chapter 2, the final chapter, is about adding statistics to the map. Essentially it shows how to acquire and manipulate census data and do standard things like produce correlation plots. This isn't really a "mashup" because the new data aren't used in conjunction with the original map. This is far less impressive as a project using R than the previous example.
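For readers unsure what sits behind a correlation plot, the underlying calculation is a Pearson correlation coefficient between two columns of data. A minimal sketch in Python (not R) follows; the function name `pearson` and the numbers are invented for illustration and are not census figures or results from the book.

```python
from math import sqrt

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two deviation norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (norm_x * norm_y)

# Invented values standing in for two per-area variables.
income = [30.0, 45.0, 60.0, 75.0]
foreclosures = [40.0, 31.0, 18.0, 12.0]
print(pearson(income, foreclosures))
```

A value near -1 or 1 indicates a strong linear relationship, and a value near 0 indicates none; a correlation plot is a grid of such values for every pair of variables.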
The book finishes with an appendix on getting and using R. It is just over a page and entirely pointless in a book of this length.
This is more an essay or a magazine article than a book. If it were full of amazing nuggets of information it might be worth the paper it is printed on, but it isn't. It is very fast moving, fails to set up the problem about to be solved, and then fails to discuss the solution to the problem that wasn't set up.
Unless you are a struggling R beginner who wants to use the book as a crib sheet to implement the same sort of project, or you just want proof that R can do this, avoid this book.