Thoughtful Machine Learning with Python 
Author: Matthew Kirk
A book on AI in Python, what could be better? Python is probably the most popular language for technical computing at the moment. It is easy to use, fairly powerful, and has lots of libraries that you can simply import to get the job done. Learning AI or machine learning in Python seems like a really good idea. However, the subtitle of this book is "A Test Driven Approach" which, for me at least, sets alarm bells ringing. The reason is that AI is a tough mathematical subject and test-driven programming is another big subject that you really don't want to get mixed up in when trying to master the first one. My fears were justified, as the opening chapter wastes a lot of time and effort on fundamental software engineering issues such as SOLID, TDD and a general discussion of the software problem. The best part about the chapter is its title, Probably Approximately Correct Software, which will only raise a smile if you already know so much about AI that you don't need to read the book. Things do get better, however, when we move on to the real topic of the book: machine learning. If you think that machine learning is just neural networks and this is all you want to know about, then you are going to be disappointed. This book covers a wide range of techniques and only devotes a small chapter to the hot topic of neural networks. After a chapter that outlines the different types of approach to machine learning (supervised, unsupervised and reinforcement learning), the book gets on to its first real topic, which is K-Nearest Neighbors. This is a very simple technique, but it involves computing the distance between two data points and there are a lot of different ways you can do this. Many are explained in detail with some math, but mostly by way of analogy. Matthew Kirk attempts to do without math throughout the book, but doesn't succeed simply because of the need for frequent equations to clarify ideas.
If you are having problems with math, this is going to leave you with only a vague understanding. One minute the discussion is in very simple words and then suddenly something sophisticated is mentioned.
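To make the point about K-Nearest Neighbors concrete, here is a minimal sketch of how the choice of distance measure plugs into the technique. It is not code from the book; the function names and the tiny training set are assumptions made up for illustration, and only the Python standard library is used.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences ("city block" distance).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute difference along any single coordinate.
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k, distance):
    # Sort the training points by distance to the query, keep the k closest,
    # and return the most common label among them.
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters labelled "a" and "b".
train = [
    ((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"),
    ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b"),
]

for metric in (euclidean, manhattan, chebyshev):
    print(metric.__name__, knn_predict(train, (1, 1), 3, metric))
```

The classifier itself is only a few lines; everything interesting is in the distance function passed to it, which is presumably why a book on the subject has to spend so long on the alternatives.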
Chapter 3 moves to Naive Bayes methods, that is, classical statistics. This is reasonably well described but leaves out so much. There is code to help you try things out, but without a deeper understanding of the ideas it is difficult to see what use it all is. Next we have decision trees and then on to one of the most difficult topics in AI: hidden Markov chains. This is about as mathematically complex as you can get and, as you can imagine, the coverage is very slight. We do get a rough explanation of the Viterbi algorithm and an example involving parts-of-speech tagging. Chapter 7 introduces the Support Vector Machine with a reasonable explanation of the idea of linear separability and how kernels can be used to convert the decision boundary to a curve, but it really doesn't explain how or why the SVM is any better than any other linear discriminant function; and of course the classical linear discriminant isn't discussed at all. Finally, for many readers, we get to neural networks. This is a reasonably good account of the ideas, but it doesn't really incorporate any of the new ideas that have produced the neural network revolution. You don't get any discussion of dropout or batch training, but it is still an adequate intro if you really know nothing about the topic. I found myself disagreeing with many of the suggestions on how to build a network and was puzzled by the inclusion of sine in the list of possible activation functions, as it is very far from being a common choice. The suggestion that one hidden layer is enough was also something of an oversimplification in this era of "deep" neural networks with many layers. Yes, they are harder to train and they do need more data, but these are the networks that are making headlines. The networks described here are more like small networks standing in for more traditional statistical methods.
You don't find any discussion of convolutional, adversarial or feedback networks, and this isn't unreasonable given the size of the book and its level, but again these are the things that the new AI revolution is based on. Chapter 9 is a fairly conventional account of clustering, which is more of a classical statistical method than machine learning. The penultimate chapter is about improving the data we feed to models and the final chapter is an overview.
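Going back to the Chapter 7 idea of linear separability and kernels, a small sketch may help show what is meant by a kernel turning a straight decision boundary into a curve. This is not the book's code and it is not a full SVM; the feature map `phi`, the matching `kernel` and the four data points are all assumptions chosen to keep the arithmetic easy to check.

```python
def phi(x):
    # Hypothetical feature map: lift a 1-D point onto the curve (x, x^2).
    return (x, x * x)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def kernel(a, b):
    # Gives the same value as dot(phi(a), phi(b)) without building phi:
    # a*b + (a^2)*(b^2) = a*b + (a*b)^2.
    return a * b + (a * b) ** 2


def separable_by_threshold(values, labels):
    # True if a single cut-off value puts each class entirely on one side.
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(pos) < min(neg) or max(neg) < min(pos)


# Toy data: the inner points are class +1, the outer points class -1.
points = [-2.0, -1.0, 1.0, 2.0]
labels = [-1, 1, 1, -1]

# No single threshold on the original axis separates the classes...
print(separable_by_threshold(points, labels))
# ...but a threshold on the lifted x^2 coordinate does.
print(separable_by_threshold([phi(x)[1] for x in points], labels))
# The kernel agrees with the explicit dot product in the lifted space.
print(all(kernel(a, b) == dot(phi(a), phi(b)) for a in points for b in points))
```

A straight cut in the lifted space corresponds to a curved boundary in the original one, and the kernel lets you work with the lifted space without ever computing it explicitly. What the sketch does not answer is the reviewer's question of why the SVM's particular choice of boundary is better than any other linear discriminant.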
This book really doesn't stand a fair chance. Machine learning is a topic that doesn't make much sense without at least some math. The math isn't rocket science, but without it you are working with things you don't really understand. Although each chapter starts out by being conversational and telling you a story without math, sooner or later it simply presents a few equations to clarify, with no attempt to explain what they mean. There are some very nice uses of geometry to explain the way the data is being treated, but this approach isn't used enough. Most machine learning can be understood better with some insight into what the data looks like in the data space. If you aren't good at math, this book isn't going to help you understand machine learning or the math you need. At best it could be an easy-read refresher for someone with most of the math, but not enough of the AI. You would still need to go away and read some more, but it is a reasonable overview as far as it goes. It does not, however, give you any insight into the current amazing state of machine learning that is responsible for so much interest.


Last Updated ( Friday, 28 July 2017 ) 