Deep Learning: A Practitioner's Approach
Authors: Josh Patterson and Adam Gibson
AI is important and profitable, but is this a good introduction?
One of the problems with AI is that once it becomes a bandwagon, everyone wants to get on board. Personally, I'm of the opinion that if you don't have a good understanding of the math, including some traditional stats and, believe it or not, some slightly more advanced geometry, then you are going to have a hard time mastering it. However, this doesn't mean you have to master the math to get something out of the subject; the real problem is finding out how much you need to know to avoid making stupid mistakes. And in AI, stupid mistakes are easy to make. They come in two forms: missing an opportunity to make something work, and not realizing that something only works for trivial reasons.
So a practitioner's introduction might be a good idea, even if it is a slightly paradoxical title: how can you be a practitioner if you need an introduction? Where do all introductions start? With a review of machine learning. Chapter 1 goes over the usual ground and doesn't go too deep. It also doesn't really give you an overview; it is best described as a collection of random simple topics. Chapter 2 continues the overview but focuses more on neural models. We go from the Perceptron to multi-layer networks and how they are trained. We also go through a list of activation functions. This is the point where the book starts to descend into a shopping list, or catalog, of methods. Nothing deep and not much to help you understand what is going on.
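To give a sense of how simple the starting point really is, here is a minimal sketch of the classic Perceptron learning rule with its original step activation - not code from the book, just the standard textbook algorithm trained on logical AND:

```python
# Minimal Rosenblatt Perceptron trained on logical AND (linearly separable).
# Update rule: w <- w + lr * (target - prediction) * x

def predict(weights, bias, x):
    # Step activation: the Perceptron's original activation function.
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s >= 0 else 0

def train(samples, epochs=20, lr=0.1):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(and_data)
print([predict(w, b, x) for x, _ in and_data])  # converges to [0, 0, 0, 1]
```

Swapping the step function for a smooth activation such as the sigmoid is what makes gradient-based training of multi-layer networks possible, which is exactly the transition Chapter 2 walks through.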
By Chapter 3 we are out of the overview and into more specific topics, but again we have a list of what is possible. Chapter 4 is even worse in this respect: deep belief networks, convolutional neural networks, recurrent networks, recursive networks and so on. All too discursive and not enough depth. This started to read more and more like a manager's waffle guide rather than a techie's bible.
Chapter 5 should be the most useful in the book because it is about implementing real neural networks. The first thing to note is that it makes use of DL4J and Java, which are not the most usual way into neural networks. However, if you want to use Java rather than, say, Python, this is no problem. After explaining how to install the software, we move on to an example, but there is no clue as to what the data represents, and so no way of understanding what is being done; you just have to follow the instructions. The second example is better - training on the MNIST handwriting data. Next we move on to recurrent networks and generating language. Finally, we look at implementing autoencoders.
The big problem is that most of the text is taken up with listings and there is very little explanation of why things are being done in a particular way. It is a zoo of examples rather than explanatory cases.
Chapter 6 deals with the always tricky problem of tuning networks. How many networks fail to work simply because a parameter needed tweaking? The trouble is that this is something that is hard to teach, because there are no firm rules for doing the job. The chapter gives lots of rules of thumb and does attempt to explain, but for me it misses the mark. There is so much more that can be said about designing networks in terms of dimensionality and sample size. The chapter soon drifts into other areas, such as using GPUs to speed things up. There are also topics, such as regularization, that I would argue are not really about tuning. The topic continues in the next chapter, but applied to specific networks. Again there is lots of advice, but not much clear explanation of what is going on. Suddenly there is a discussion of the Restricted Boltzmann Machine - RBM - which has hardly been explained.
Chapter 8 is titled "Vectorization", but it is mostly about data preparation, and we get coverage of topics such as how to compute a mean and standard deviation, which seems out of place this late in a book on an advanced topic. It does cover things like "bag of words" eventually, but there is no discussion of sparse coding techniques.
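To show just how elementary this material is, here is a hand-rolled sketch, not taken from the book, of the two preparation steps the chapter covers: standardizing a feature (subtract the mean, divide by the standard deviation) and building a fixed-vocabulary bag-of-words count vector:

```python
import math
from collections import Counter

def standardize(values):
    # Subtract the mean and divide by the (population) standard deviation,
    # the basic feature scaling covered in the chapter.
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def bag_of_words(text, vocabulary):
    # Word-count vector for one document over a fixed vocabulary.
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

scaled = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(scaled)  # mean 5, std 2: [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

vocab = ["deep", "learning", "java"]
print(bag_of_words("Deep learning in Java is still deep", vocab))  # [2, 1, 1]
```

The dense count vector above is exactly where sparse coding would normally enter the discussion, since most entries are zero for any realistic vocabulary - which is why its absence from the chapter is noticeable.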
The final chapter is about using DL4J together with Spark and Hadoop, and there are a number of appendices, some of which should have been chapters in the book - e.g. Appendix A, "What is Artificial Intelligence".
You should only consider this book if you want to work in Java and use DL4J rather than TensorFlow or one of the Python libraries. Even if you do want to use Java, this book doesn't really help you understand neural networks. There are short sections at a fairly low level that will tell you something, but there isn't a complete path from beginner to expert. What there is a lot of is listings. If you want a book of examples, then again you might find this useful, but the examples don't do much for your understanding of what is going on. There is also a tendency to indulge in "management-style" topic listings, as if making a list was the road to understanding. Lists can be helpful in showing the subject domain, but you then have to explore the topics in ways that expose their deeper meaning.
I can't recommend this book, simply because there are better alternatives, though perhaps not in Java.
Last Updated: Tuesday, 08 January 2019