Efficient R Programming
Authors: Colin Gillespie and Robin Lovelace
We all want to be efficient in any language and R is no different. What can we learn to do the job better?
This particular book takes a broad interpretation of "efficiency". It is taken to mean both how fast and economical your programs are and how efficient you can be in creating them. It isn't a particularly advanced book if you are a programmer, but many R users aren't particularly well-trained programmers. This is true of any language that is used by other disciplines to get a job done. In the case of R, its users tend to be statisticians or what we now term "data scientists".
The first chapter is very general and outlines a lot of what any programmer in any language should know. You are encouraged to learn touch typing and how to benchmark and use profiling. It is fairly low level and does point out that many skills are transferable between languages - a good point and one that might mean you don't need to read the book. However, it has to be admitted that R is a strange language compared to more mainstream object-oriented or even functional languages.
Chapter 2 is about installing and configuring R, including five tips for efficient R.
Chapter 3 is where the real core of the book gets going. It is about optimization techniques - some general but many specific to R. In particular, vector operations are better than scalar implementations using loops. This is something that users of R are often told and it is a feature shared by only a small group of languages, including MathCad and Octave. One nice aspect is that you get to see graphs showing how much faster things go. The final part of the chapter deals with using the byte compiler, which is a good idea for large data.
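To make the vectorization point concrete, here is a minimal sketch in base R. It is not taken from the book: the function names `loop_sum_sq` and `vec_sum_sq` and the vector size are invented for illustration. It computes a sum of squares once with an explicit loop and once with a single vectorized expression, and times both with `system.time()`.

```r
# Illustrative only: names and sizes are assumptions, not the book's code.
n <- 1e6
x <- as.numeric(seq_len(n))

# Scalar version: an explicit loop that visits each element in turn.
loop_sum_sq <- function(v) {
  total <- 0
  for (value in v) {
    total <- total + value^2
  }
  total
}

# Vectorized version: one expression applied to the whole vector.
vec_sum_sq <- function(v) sum(v^2)

# Time each version; "elapsed" is wall-clock time in seconds.
loop_time <- system.time(loop_result <- loop_sum_sq(x))[["elapsed"]]
vec_time  <- system.time(vec_result  <- vec_sum_sq(x))[["elapsed"]]

# The two results should agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(loop_result, vec_result)))

cat("loop:", loop_time, "s; vectorized:", vec_time, "s\n")
```

The timings depend on the machine and on the R version. Recent versions of R byte-compile loops automatically, so the gap may be smaller than on older versions, but the vectorized form is usually still the faster and the shorter of the two.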
Chapter 4 is about workflow and is more like a self-help manual than anything much to do with programming. I'm not saying that this chapter shouldn't be included, but you have been warned. You need to have the mind of a manager to want to read this.
Chapter 5 returns to programming matters and how to perform efficient I/O.
Chapter 6 is titled "Efficient Data Carpentry", which might leave some readers wondering what it is all about. The idea is that data cleaning and transformation is much like taking a rough piece of wood and working it into something finished.
Chapter 7 is on optimization. After looking at code profiling, the topic moves to micro-optimizations, i.e. which exact expression is best, and then to parallel computing. The final part of the chapter looks at using C++ from R.
The final three chapters have the feel of makeweights. They are not completely irrelevant but they are only just suitable. Chapter 8 goes over efficient hardware, which really means powerful hardware, as powerful as possible. The advice is a bit simplistic and, yes, you should buy an SSD and a 64-bit CPU and so on. Nothing much about GPUs is covered, but you do get to find out what a byte is and what RAM is. Chapter 9 has the title "Efficient Collaboration" and is on coding style, reformatting and using Git. Chapter 10, "Efficient Learning", is about how to find out about R, including advice such as using Stack Overflow and mailing lists.
The problem is that the real subject matter of this book probably runs to only a few articles rather than a complete book. I like the idea of the wider interpretation of "efficient" programming but there just isn't enough good material to make a convincing book. Whenever the ideas become even a little advanced you have to read the section more than once. You also need to be something of an R expert to get the most out of it.
The authors don't make enough of how different R is from other languages in its approach to objects, functions, data typing and so on. The material in the book would be better as a couple of chapters at the end of a more advanced R tutorial-type book.
This isn't a must-have book but it has got some useful information on being a better R programmer.
Last Updated (Tuesday, 31 October 2017)