Title: R in Action: Data Analysis and Graphics with R
Aimed at: Those who already know some statistics
Pros: Highly practical and well explained
Cons: Doesn't cover deep programming aspects
Reviewed by: Janet Swift
The R language opens up many opportunities for statistical analysis. Is this the book you need?
This book could just as easily be reviewed in the Mathematics books section because it is very heavy on statistical practice. This isn't a bad thing, but if you are a programmer looking for a primer on R as a language you might need to look elsewhere. This is not to say that the book is light on details of the R language, but the account is angled towards the practitioner who wants to get to grips with actually doing some statistics as quickly as possible. It really does answer the "how do I" question, especially if you are already familiar with another stats package.
The book is divided into four sections - Getting Started, Basic Methods, Intermediate Methods and Advanced Methods - titles that don't tell you much about what they contain!
Getting Started goes over the basics of working with R. By its end you will know what R is all about, how to load some data and how to plot some charts. The chapter on creating a dataset takes you quickly through the data structures that R supports, but from a user's, rather than a programmer's, point of view. The final two chapters in the section expand on this to include details such as transforming and generally manipulating the data - which is a big part of any statistical analysis.
The book has a slight tendency to assume that you will remember everything that has been introduced earlier, and it rarely bothers to point forward or back to where you can find out more about something. It's not a big problem, however, because you can always go hunting for yourself.
Once you get out of the first part the emphasis is very much on the statistics. Part II describes how to create basic charts and perform basic statistical tests - t-tests and equivalent non-parametric tests. My guess is that this is the section of most use to the stats beginner who is trying to solve a homework problem using R.
Part III of the book begins to get more advanced and shows you how to perform regression and ANOVA using R. This is to be welcomed because R doesn't do these things in the same way as a package like SPSS, say. The final part of the section deals with power analysis (i.e. how big a sample do I need?) and resampling - both of which are unusual as "intermediate" topics, but it is good to see them introduced so early.
If regression and ANOVA are intermediate topics, what can be waiting for the reader in an advanced section? The answer is the generalized linear model and principal components/factor analysis, which are indeed advanced statistics even if they are basic tools in some disciplines. The final two chapters are also interesting - advanced missing values and advanced graphics. The treatment of missing values is worth buying the book for on its own - it is a topic that is all too often ignored.
Don't buy this book if you really don't have a clue about statistics - it isn't a statistics primer. It assumes that you know much of the theory, fills in some gaps for you, and then shows you how to do the job using R. Generally the descriptions of how to achieve some task or other include enough asides and comments to make you think about the task and perhaps even invent your own way of doing the job. It also isn't a good book to buy if you are looking for something to solve your programming problems - it isn't good on the specifics of importing data from a particular application, say - but there are cookbooks for this and you can always search the web for a pre-programmed solution.
If you want to discover how to do statistics using R then this is a really good place to start. Recommended.