Author: Allen B. Downey
Publisher: O'Reilly, 2011
Aimed at: Programmers
Pros: Good treatment of estimation
Cons: Too slight, badly organized
Reviewed by: Mike James
The subtitle of this book is "Probability and Statistics for Programmers" and the idea that such a book is needed makes sense. Increasingly, programmers have to cope with the difficult ideas of statistics and probability in their everyday programming. The question is: what makes a good approach to stats for programmers?
According to this book the answer is to present some code in Python to do some basic stats jobs. The code doesn't really do very much at all and overall it doesn't really help. A better approach might have been to give some examples in R and at the very least try to keep the examples going to the end of the book. Alternatively the author could just exploit the fact that programmers are good at algorithmic thinking and explain how things work in that style.
Chapter 1 goes over what probability and statistics are all about and gives an example that runs through the book. There are a lot of vague explanations and definitions of terms that I doubt many will find particularly useful.
Chapter 2 is where the book gets down to the details and it starts off with some descriptive statistics - compute the mean and variance - but without any motivation by way of what these descriptive stats are describing, i.e. the location and spread of a distribution. Then we have a digression into distributions and how to create a histogram using Python. This might be reasonable in a book with plenty of pages to spare but there are much more interesting and important topics to cover than drawing a histogram. Mixed in are some ideas of probability density functions and outliers but it's a bit of a mess. Oddly, conditional probability is also introduced in this chapter - it might have been better to start with some ideas of probability.
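To make the point about motivation concrete, here is a minimal sketch of what "location" and "spread" mean. It is my own illustration, not code from the book, and the three small data sets are made up for the purpose: two share a mean but differ in variance, and the third is a shifted copy that changes the mean but not the variance.

```python
import statistics

# Hypothetical data, chosen only to illustrate the idea.
tight = [9.8, 9.9, 10.0, 10.1, 10.2]      # centered on 10, little spread
wide = [6.0, 8.0, 10.0, 12.0, 14.0]       # centered on 10, much more spread
shifted = [v + 5.0 for v in tight]        # same shape as tight, moved along

for name, data in [("tight", tight), ("wide", wide), ("shifted", shifted)]:
    # The mean reports where the data sit; the variance reports how scattered they are.
    print(name, statistics.mean(data), statistics.pvariance(data))
```

The mean tracks where the values sit and the variance tracks how far they scatter, which is the one sentence of motivation the chapter needs.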
The next chapter moves on to cumulative distributions. Again, in a book with lots of pages to spare you might want to indulge the luxury of looking at this topic in detail, but for a programmer it should be obvious anyway. Then on to random numbers, and again a programmer should be up on this topic. Where there are any subtle points - like using the CDF to generate a random number from the distribution - the author says:
"It might not be obvious why this works, but since it is easier to implement than to explain, let's try it out."
What can be said after that?!
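For what it is worth, the explanation is short. If U is uniform on [0, 1) and F is a continuous, strictly increasing CDF, then P(F^-1(U) <= x) = P(U <= F(x)) = F(x), so F^-1(U) has exactly the distribution you wanted. A minimal sketch for the exponential distribution follows. It is my own illustration rather than the book's code, and the function name, rate and seed are chosen purely for the example.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one value from an exponential distribution by inverting its CDF."""
    u = rng.random()                  # uniform on [0, 1)
    # CDF: F(x) = 1 - exp(-rate * x); solving F(x) = u for x gives the line below.
    return -math.log(1.0 - u) / rate


rng = random.Random(42)               # fixed seed, purely for repeatability
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)  # should land close to the theoretical mean 1 / rate = 0.5
```

Three lines of argument and one line of algebra are hardly harder to explain than to implement.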
Chapter 4 is on continuous distributions and notice we haven't actually looked at probability yet. This chapter takes the form of a short catalog of distributions introduced in a strange order. We do arrive at probability in Chapter 5, but wait - we have been working with continuous probability distributions in earlier chapters and this is a much more subtle idea than discrete probability. This is a very odd order in which to introduce ideas. Straight after a cursory explanation of the basic rules of probability we have the Monty Hall problem - great fun, but why not actually tell the reader what the laws of probability are in a bit more detail before exposing them to the most unintuitive example in all of probability? Then on to the binomial distribution - now we really are going backwards from continuous to discrete - and finally a long explanation of the trendy Bayes' Theorem. I say trendy because it does seem to be. It might be the core of probabilistic inference but it is also misused more often than not.
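The Monty Hall problem is at least a case where a programmer can check the unintuitive answer by brute force before trusting the argument. The sketch below is my own, not the book's. It assumes the standard rules: three doors, one car, and a host who always opens a door that is neither the contestant's pick nor the car.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the original pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


rng = random.Random(0)                # fixed seed, purely for repeatability
trials = 100_000
stay_rate = sum(play(False, rng) for _ in range(trials)) / trials
switch_rate = sum(play(True, rng) for _ in range(trials)) / trials
print(stay_rate, switch_rate)         # theory: 1/3 for staying, 2/3 for switching
```

A simulation like this shows what the answer is, but not why, which is the reason the laws of probability deserve a fuller treatment first.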
Chapter 6 is called operations on distributions and yet it starts off with some more descriptive statistics - skewness. Then we are into convolution and eventually the central limit theorem. Why not introduce the central limit theorem in a more user-friendly way - why introduce the convolution operator at all? This isn't a textbook and mostly avoids the hard algebra.
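As an example of what a more user-friendly route might look like, the central limit theorem can be demonstrated by simulation with no mention of convolution. This is a sketch under my own assumptions (uniform draws, 30 terms per sum, a hypothetical helper name), not something taken from the book.

```python
import random
import statistics


def standardized_sums(n_terms, n_sums, rng):
    """Return sums of n_terms uniform(0, 1) draws, shifted and scaled to mean 0, variance 1."""
    mean = n_terms * 0.5                    # each uniform(0, 1) draw has mean 1/2
    std = (n_terms / 12.0) ** 0.5           # and variance 1/12
    return [
        (sum(rng.random() for _ in range(n_terms)) - mean) / std
        for _ in range(n_sums)
    ]


rng = random.Random(1)                      # fixed seed, purely for repeatability
z = standardized_sums(n_terms=30, n_sums=20_000, rng=rng)
within_one_sd = sum(abs(v) < 1.0 for v in z) / len(z)
print(statistics.mean(z), statistics.pstdev(z), within_one_sd)
# For a standard normal these would be about 0, 1 and 0.683.
```

The individual draws are flat, yet the sums already behave very much like a normal distribution, and the reader has seen that without any algebra.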
I have to admit that I was worried when I reached Chapter 7 on the all-important idea of hypothesis testing and I was right to be. The ideas are introduced in a way that isn't wrong but is prone to misunderstanding. This is the core idea in statistics and the whole book could have been devoted to it rather than the mixed-up, low-level stuff actually covered. Then we get to interpreting the result of a test which is explained in a way that makes the Bayesian approach seem preferable. If you do want to be a Bayesian then at least take the trouble to understand the frequentist argument before binning it.
Chapter 8 is about estimation and this is another very subtle topic. Surprisingly, this time it is dealt with in a reasonable, even if superficial, way. The idea of presenting estimation as a guessing game is a good one but again it doesn't go far enough. For example, the idea of the distribution of the estimator is just ignored and without it you can't really begin to formulate the idea of a best estimator. Finally we have the obligatory section on Bayesian estimation.
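The missing idea is easy to show in code: an estimator is itself a random quantity, and you compare estimators by looking at how their estimates are distributed over repeated samples. The sketch below is my own illustration, with a hypothetical helper name and arbitrary sample sizes. It compares the sample mean and the sample median as estimators of the center of a standard normal distribution.

```python
import random
import statistics


def estimator_spread(estimator, n, repeats, rng):
    """Apply an estimator to many fresh samples; return the std. dev. of its estimates."""
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.pstdev(estimates)


rng = random.Random(7)                # fixed seed, purely for repeatability
mean_spread = estimator_spread(statistics.mean, n=25, repeats=5_000, rng=rng)
median_spread = estimator_spread(statistics.median, n=25, repeats=5_000, rng=rng)
print(mean_spread, median_spread)
# Theory for the mean: 1 / sqrt(25) = 0.2. The median's estimates scatter more widely.
```

Both estimators are centered on the true value, but for normal data the mean's estimates are more tightly clustered, and that is the sense in which one estimator can be "better" than another.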
The final chapter is a lightning overview of the troubled topic of correlation. It is introduced in a fairly simple way - correlation measures the degree of linear relationship - and the non-parametric measure is introduced as well. Again the problem is that it is too shallow and doesn't do the subject justice.
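The "linear" qualifier matters more than a quick overview suggests. In the sketch below, which is mine and not the book's, I assume the non-parametric measure is a rank correlation of the Spearman kind, computed as the Pearson correlation of the ranks. The data are made up: y is an exact, increasing function of x, just not a linear one.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank values from 1 to n (assumes no ties, which holds for the data below)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Rank correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


xs = [i / 10 for i in range(1, 51)]
ys = [math.exp(x) for x in xs]        # perfectly determined by x, but far from linear
print(pearson(xs, ys), spearman(xs, ys))
```

The rank correlation comes out at 1 (up to floating-point rounding) because the relationship is perfectly monotonic, while the Pearson coefficient falls noticeably short of 1 even though y is completely determined by x.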
This isn't a terrible book, even if it comes close to being so, because in the main it isn't actually wrong. It is slight and it could be misleading. It simply isn't an overview of anything recognizable as modern statistics. It has the feeling of being written by someone viewing the subject from very far away. A more organized presentation would be better, as would a presentation of the deeper ideas behind probabilistic and statistical reasoning.
If you are interested in probability and statistics in programming you would be much better advised to read one of the many books on the subject. There is most certainly room for a book on statistics and probability in general aimed at the practicing programmer - but this isn't it.