Author: Allen B Downey
Audience: Python programmers
Reviewer: Mike James
Learning about Bayesian stats while programming in Python seems like a good idea. What could possibly go wrong?
This is a book in the "Think X" series from author Allen Downey, published by O'Reilly. The books in the series all start from the postulate that as a Python programmer you can use your programming skill to learn other topics. This is an idea I happen to agree with and I make use of it whenever I can. However, in this case things are more complicated.
The first thing to say is that Bayesian statistics is one of the two mainstream approaches to modern statistics. The first is the frequentist approach which leads up to hypothesis testing and confidence intervals as well as a lot of statistical models, which Downey sets out to cover in Think Stats. The second is Bayesian statistics, which is based on a single theorem that relates to conditional probabilities. In many ways it is Bayesian statistics that is the new kid on the block and to be a Bayesian is to be novel and slightly radical in the statistical community - although as the approach matures this is less and less the case.
If there is any controversy over Bayesian stats it certainly isn't centered around Bayes's theorem, which is accepted by everyone as long as the conditional quantities it relates really are probabilities. Things get a little more controversial when the probabilities being used start to degenerate into measures of belief or confidence. There are lots of complex issues surrounding the use of the approach and this book ignores them all.
It is important to know that all of the examples are in Python and you need to know how to program in Python to get almost anything from this book. You can't simply read it and skip the programs and hope to understand the ideas - the ideas are all in what the programs do.
The first chapter introduces the idea of conditional probability and it mostly does this by examples. This works well as a way of introducing this basic idea and by page 5 we have reached Bayes's theorem in both its simple form as something to do with probabilities and in its extended form as a rule of inference from data. We next have a simple example and then the Monty Hall problem explained in terms of conditional probability. If you get lost on this example don't be surprised because you are not alone. Why introduce this most difficult problem at the point the reader is just getting to grips with the basic concepts?
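For readers who do get lost, it may help to see the Monty Hall problem written out as a single Bayesian update. What follows is a minimal sketch of my own, not code from the book: the function name `bayes_update` and the door labels are assumptions made for illustration, and it assumes the standard rules of the game (you pick door A, and Monty always opens a goat door that is not yours, choosing at random when he has a choice).

```python
from fractions import Fraction

def bayes_update(priors, likelihoods):
    """Multiply each prior by its likelihood, then normalize to sum to 1."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Hypotheses: the car is behind door A, B or C. You have picked door A.
priors = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# Data: Monty opens door B and reveals a goat.
# Probability of seeing that data under each hypothesis:
likelihoods = {
    "A": Fraction(1, 2),  # car behind A: Monty opens B or C at random
    "B": Fraction(0),     # car behind B: Monty would never open B
    "C": Fraction(1),     # car behind C: Monty is forced to open B
}

posterior = bayes_update(priors, likelihoods)
for door, p in posterior.items():
    print(door, p)
```

Under these assumptions the posterior puts 1/3 on door A and 2/3 on door C, which is the usual argument for switching.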
From here the book goes progressively deeper into using Python and a set of objects that have been created to work with distributions. Chapter 2 introduces the probability mass function, i.e. a discrete distribution, and an object which represents it. This is used to go over the examples in Chapter 1 in a programming problem format.
Chapter 3 is titled "Estimation" but it looks increasingly like a manual for the software. We also meet the problem of the prior distribution, i.e. what should it be? The idea offered is that if you have enough data then the prior becomes irrelevant because the resulting posterior distributions, and hence any conclusions you draw, tend to converge. Yes, this is true, but if you have that much data you probably don't need statistics. The next chapter explains how you can extract simple statistics from the posterior distribution.
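The convergence claim is easy to check for yourself. The following is a rough sketch of my own rather than the book's software: it estimates the heads probability of a coin on a grid of candidate values, once with a flat prior and once with a prior skewed towards low values, and reports a simple statistic of each posterior, its mean. The names `posterior_mean`, `flat_prior` and `skewed_prior`, the 99-point grid and the fixed 70% heads rate are all assumptions chosen for illustration.

```python
def posterior_mean(prior, heads, tails):
    """Posterior mean of a coin's heads probability over a grid of hypotheses.

    prior maps each candidate probability x to its (unnormalized) prior weight.
    """
    # Likelihood of the data under hypothesis x is x**heads * (1 - x)**tails.
    unnormalized = {x: w * x**heads * (1 - x)**tails for x, w in prior.items()}
    total = sum(unnormalized.values())
    return sum(x * w for x, w in unnormalized.items()) / total

grid = [i / 100 for i in range(1, 100)]          # 0.01, 0.02, ..., 0.99
flat_prior = {x: 1.0 for x in grid}              # every value equally likely
skewed_prior = {x: (1 - x) ** 4 for x in grid}   # strongly favours low values

gaps = []
for n in (10, 100, 500):
    heads = 7 * n // 10                          # pretend 70% of tosses are heads
    tails = n - heads
    flat = posterior_mean(flat_prior, heads, tails)
    skewed = posterior_mean(skewed_prior, heads, tails)
    gaps.append(abs(flat - skewed))
    print(f"n={n:4d}  flat prior: {flat:.3f}  skewed prior: {skewed:.3f}")
```

The gap between the two posterior means shrinks as the number of tosses grows, which is the sense in which the choice of prior stops mattering.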
Chapter 5 explains the idea of odds and the odds form of Bayes's rule plus a lot of additional features added to the software to make it possible to do probability calculations. Chapter 6 adds decision analysis into the mixture but spends most of its time discussing what a PDF is and how to represent it in software.
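The odds form says that posterior odds are prior odds multiplied by the likelihood ratio. As a minimal sketch, with function names and numbers that are my own illustrative assumptions and not the book's API:

```python
from fractions import Fraction

def odds(p):
    """Convert a probability into odds in favour."""
    return p / (1 - p)

def probability(o):
    """Convert odds in favour back into a probability."""
    return o / (1 + o)

def update_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes's rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio

# Hypothetical numbers: a hypothesis with prior probability 1/4, and data
# that is three times as likely if the hypothesis is true as if it is false.
prior_odds = odds(Fraction(1, 4))
posterior_odds = update_odds(prior_odds, likelihood_ratio=3)
posterior_probability = probability(posterior_odds)
print(prior_odds, posterior_odds, posterior_probability)
```

Here prior odds of 1/3 become posterior odds of 1, i.e. a probability of 1/2, which agrees with applying Bayes's theorem directly to the same numbers.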
Chapter 7 is called Prediction and it shows how to use a Poisson process and Chapter 8 continues the look at the Poisson in terms of wait times.
So far all of the discussion has been about one-dimensional problems and Chapter 9 moves us into 2D with a look at joint distributions, marginals and conditionals that can be derived from them.
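To show what deriving marginals and conditionals from a joint distribution amounts to, here is a small sketch of my own, unrelated to the objects the book provides. The joint distribution is stored as a dictionary keyed by pairs of outcomes; the weather and punctuality figures are invented, and `marginal` and `conditional` are hypothetical helper names.

```python
from collections import defaultdict
from fractions import Fraction

# Invented joint distribution over (weather, arrival).
joint = {
    ("rain", "late"): Fraction(3, 20),
    ("rain", "on time"): Fraction(1, 20),
    ("dry", "late"): Fraction(2, 20),
    ("dry", "on time"): Fraction(14, 20),
}

def marginal(joint, axis):
    """Sum out the other variable, keeping the one at position axis (0 or 1)."""
    out = defaultdict(Fraction)          # Fraction() is zero
    for outcome, p in joint.items():
        out[outcome[axis]] += p
    return dict(out)

def conditional(joint, axis, value):
    """Distribution of the other variable given outcome[axis] == value."""
    other = 1 - axis
    selected = {o[other]: p for o, p in joint.items() if o[axis] == value}
    total = sum(selected.values())       # this is the marginal of `value`
    return {k: p / total for k, p in selected.items()}

weather = marginal(joint, 0)
arrival_given_rain = conditional(joint, 0, "rain")
print(weather)
print(arrival_given_rain)
```

With these made-up numbers the marginal probability of rain is 1/5, and the probability of being late given rain is 3/4.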
Chapter 10 looks at approximate Bayesian computation, i.e. sampling. Chapter 11 is about hypothesis testing, but Bayesian rather than classical hypothesis testing - so don't expect to find out about significance here.
The remaining chapters cover evidence, simulation, simple models and higher dimensional situations.
Each chapter starts off with an example and while you might want to praise this feature as being practical it doesn't really give you a chance to discover the principle the chapter is about before you start getting deeper. Even the summaries and discussion at the end of each chapter fail to provide an adequate overview if you happen to have got lost on the way.
While in principle finding out how to convert some theoretical idea into code does help you understand the idea - if you can't do it then you certainly don't understand the idea - it doesn't always work the other way round. I often found myself trying to figure out what the programs were doing rather than what the ideas were. My preferred approach would be to explain the theory first - what probability is, how Bayes's theorem lets you translate prior distributions into posterior distributions and so on - and then have the program created to make sure I understand.
My final concern about the book is that it doesn't really emphasize the difficulties of the Bayes method. It doesn't point out that you are often not working with anything that can be interpreted as probability at all. It doesn't point out that as the number of variables grows there are real problems in gathering enough data to allow all of the dependencies to be taken into account. In short, it doesn't equip you with what you need to know to move on to more advanced things.
You can tell that I didn't like this book, but if you prefer a code-based approach to theory you might disagree. One thing that is certain is that there are a lot of good examples presented and the book is a great resource if you have to teach a class.
If you would like to form your own opinion of the book then you can download a PDF from http://greenteapress.com/
I can't recommend it if your goal is to understand Bayesian statistics and its place in the wider context of statistics.