Authors: Simon Rogers & Mark Girolami
Publisher: Chapman & Hall/CRC
Aimed at: Students preparing for a course in machine learning
Pros: Readable explanations of statistical techniques
Cons: Doesn't cover enough about machine learning
Reviewed by: Mike James
Given the interest in the online course on Machine Learning, what could be better than a bit of background reading?
This book looks as if it might be exactly the right stuff to get you started - but in practice there are a few things you need to know about it before you decide that it is for you.
The most important feature of the book is that it is a statistics-oriented account. In fact many of the chapters would be just as at home in a book on classical statistics. For example, the first chapter is on Linear Modelling and it is essentially about least squares linear regression. It is true that it is packaged up in some of the language of machine learning but what you are presented with could be found in any introduction to statistics. This said, it is well presented and the mathematics is broken down into manageable chunks that you should be able to follow. There is also a web site with MATLAB scripts that lets you try out the models, view the graphs and tweak the parameters.
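To give a flavour of the kind of model Chapter 1 deals with, here is a minimal Python sketch of least squares fitting of a straight line. It is not taken from the book or its MATLAB scripts; the function name `fit_line` and the toy data are made up purely for illustration.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data lying exactly on the line y = 1 + 2x (an assumption for the demo).
intercept, slope = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(intercept, slope)
```

This is the closed-form solution you would meet in any introductory statistics text, which is rather the point being made above.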
Chapter 2 is more linear modelling, but from the maximum likelihood point of view. Again, this is well explained, but it isn't the stuff that makes machine learning an exciting subject. You could argue that the prospective student of machine learning needs to know all of this before moving on, but this isn't really true. There are lots of machine learning techniques that don't need much statistical theory.
Chapter 3 introduces the Bayesian approach to machine learning, but this doesn't cover much more than basic Bayesian stats. After going over an example of coin tossing, it moves on to introduce the basic techniques of Bayesian stats. The next chapter pushes this further to some areas where it does look more like machine learning than classical stats.
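For readers who haven't met the coin tossing example before, the idea can be sketched in a few lines of Python. This is an illustration under the usual textbook assumption of a Beta prior on the probability of heads, not the book's own code; `update_beta` and `beta_mean` are hypothetical names.

```python
def update_beta(alpha, beta, heads, tails):
    """Posterior Beta parameters after observing some coin tosses.

    With a Beta(alpha, beta) prior on the probability of heads, the
    posterior is again a Beta distribution: add the counts to the prior.
    """
    return alpha + heads, beta + tails


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1), then observe 7 heads and 3 tails.
post_alpha, post_beta = update_beta(1, 1, heads=7, tails=3)
print(post_alpha, post_beta, beta_mean(post_alpha, post_beta))
```

The posterior mean sits between the prior mean of 0.5 and the observed proportion of heads, and moves toward the latter as more tosses come in.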
Chapter 5 is about classification, but it is a very narrow approach. We learn about the Bayes classifier and logistic regression but not about discriminant analysis. The second half of the chapter introduces non-probabilistic methods and at this point the book more or less abandons the classical stats approach - it really has no choice because the majority of machine learning algorithms don't have firm theoretical foundations. However, they do have probabilistic heuristics underlying them and, for example, the K nearest neighbour classifier can be viewed as an estimate of the Bayes classifier constructed using the sample density. The chapter concludes with a look at support vector machines.
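The link between K nearest neighbours and the Bayes classifier is easier to see in code. In the rough Python sketch below, the fraction of the K nearest training points that carry each label serves as a crude estimate of that class's probability at the query point, and the prediction is simply the label with the largest fraction. The function names and the toy data are my own assumptions, not anything from the book.

```python
import math
from collections import Counter


def knn_class_fractions(train, query, k):
    """Fraction of the k nearest training points carrying each label.

    `train` is a list of (point, label) pairs; points are tuples of numbers.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in neighbours)
    return {label: count / len(neighbours) for label, count in counts.items()}


def knn_predict(train, query, k):
    """Predict the label with the largest neighbour fraction."""
    fractions = knn_class_fractions(train, query, k)
    return max(fractions, key=fractions.get)


# Two small, well separated groups (made-up data).
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b")]
print(knn_class_fractions(train, (4, 4), k=3))
print(knn_predict(train, (4, 4), k=3))
```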
Chapter 6 is about clustering, predominantly the K-means approach and K-means augmented by kernel estimation. Again, clustering is mostly based on heuristics rather than deep statistical theory, so there isn't much justification for using the approach taken in the earlier parts of the book.
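To show just how heuristic the basic method is, here is a bare-bones Python sketch of the standard K-means loop: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. It is an illustration only, not the book's implementation; the starting centroids are passed in by hand (an assumption to keep the example deterministic) and the kernel variant is not covered.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain K-means on tuples of numbers, starting from given centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```

Nothing in the loop refers to a probability model, which is why the statistical machinery of the earlier chapters has little to grip on here.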
The final chapter returns to classical statistics, multivariate statistics this time, with a look at principal components and other latent variable models. Again, the presentation is quite good but more suited to a general statistics book than to one on machine learning.
At the end of the day the problem with this introduction is that it really doesn't cover the subject it claims to. There are so many missing techniques - the perceptron, neural networks, discriminant analysis, decision trees, Bayesian networks, reinforcement learning, genetic algorithms and so on. You can argue that some of these techniques are too advanced for a first course, but leaving out so many simply robs the book of any real machine learning flavour.
Even the classical statistics that are presented aren't particularly applied to machine learning problems and examples. Instead they relate to data analysis problems that don't really have much to do with machine learning. You also don't get any feeling for the way the techniques might be used in programs as online learners. The approach is static: you get some data, analyse it, derive a model, use the model - this really isn't machine learning.
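By way of contrast, this is roughly what I mean by an online learner: a model that is nudged a little every time a new observation arrives, rather than being fitted once to a fixed data set. The Python sketch below uses a simple gradient step on the squared error of a straight-line model. It is my own illustration, not something from the book, and the learning rate, function names and data stream are all assumptions.

```python
def online_update(w0, w1, x, y, learning_rate=0.05):
    """One online step for the model y = w0 + w1 * x.

    The weights move a small distance in the direction that reduces
    the squared error on the single observation (x, y).
    """
    error = y - (w0 + w1 * x)
    return w0 + learning_rate * error, w1 + learning_rate * error * x


def train_online(stream, passes=1, learning_rate=0.05):
    """Feed observations to the model one at a time."""
    w0, w1 = 0.0, 0.0
    for _ in range(passes):
        for x, y in stream:
            w0, w1 = online_update(w0, w1, x, y, learning_rate)
    return w0, w1


# Made-up observations from the line y = 1 + 2x, replayed many times
# to stand in for a long stream of incoming data.
stream = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
print(train_online(stream, passes=2000))
```

The model is usable at any point in the stream and keeps adapting as data arrives, which is the behaviour the static analyse-then-model approach never shows you.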
Having said all this, I have to admit that I enjoyed reading many of the chapters, but because of what I learned about standard statistical analysis rather than machine learning. If you are looking for a book that introduces model fitting in a sort of machine learning context then this is a really good book. If, on the other hand, you want a first course on machine learning then this one just doesn't cover the ground.