Author: Conrad Carlberg Publisher: Que, 2011 Pages: 464 ISBN: 9780789747204 Aimed at: Students, researchers, professionals Rating: 5 Pros: A wellpresented introduction to statistics with lots of useful Excel examples Cons: Restricted to classical linear parametric stats
Reviewed by: Janet Swift
With statistics it's important you know how and why as well as churning out results.
Author: Conrad Carlberg Publisher: Que, 2011 Pages: 464 ISBN: 9780789747204 Aimed at: Students, researchers, professionals who want to use Excel for statistical analysis Rating: 5 Pros: A wellpresented introduction to statistics with lots of useful Excel examples Cons: Restricted to classical linear parametric stats. Reviewed by: Janet Swift
Excel has made both number crunching and data analysis more accessible. But with statistics it's important you know how and why as well as churning out results.
The introduction to this suggests it is suitable for those with little background in statistics. It also reveals that it can be used with any edition of Excel from 97 onwards  but while you don't need to be an Excel expert it doesn't teach you the basics of Excel  but there are plenty of other books for that plus online help on the Microsoft site.
Supporting material for this book is maintained on its own web page. On the Update tab you'll find an Errata list. It also acknowledges the fact that the figures are not as easy to read as the author would have liked. He does however have a solution:
There is a workbook for each chapter, and in each workbook you'll find a worksheet associated with each of that chapter's figures. I think you'll find it easier to see what's going on in the Excel worksheets than in the book's figures, and you'll also find it easier to follow the discussion by examining the contents of different worksheet cells on your own computer.
So even if you prefer to work through examples unaided, there's good reason to click on the links in the Download tab.
As a course in Statistics this book starts from the basics and goes quite a long way.
Chapter 1 has the title "About Variables and Values" and in the context of statistics a "variable" is an attribute or property that describes a person or a thing, such as surname, and points out that database jargon refers to variables as fields. It goes on to look at scales of measurement  category scales (also known as nominal scales); and numerical scales  and then the problems in distinguishing text values from interval values. Then comes a quick section on charting numeric variable and by the end of the chapter we are introduced to frequency distributions.
Chapter 2 is about measures of central tendency and uses Excel to calculate the mean, the median and the mode of a group of observations. Chapter 3, with the title "Variability: How Values Disperse" is where we meet standard deviation and variance with a discussion that looks at concepts as well as looking at how to calculate them before introducing Excel's functions. There also a look at bias in the standard deviation, a topic that is discussed again later in the book and the concept of degrees of freedom is introduced.
Chapter 4 is on correlation  first the concept and how to calculate it and then we get to Excel and to the Analysis ToolPak referred to by its Excel 2010 name, Data Analysis addin. Discussion continues to be interspersed with practical examples with the section "Correlations isn't Causation" being an important example. The latter part of the chapter covers multiple regression.
Chapter 5 returns to looking at nominal variables and moves very quickly to an Excel topic  pivot tables  before getting on to statistical tests and the null hypothesis. It goes into issues of interpretation of results and setting decision rules. A boxout tells us that the author's approach is frequentist, rather than Bayesian, but notes that this controversy is "well beyond the scope of this book". Next comes a good discussion of two fundamental assumptions that have to be met  random selection and independent selection. Shortly afterwards there a good summary of the steps of hypothesis testing. The second paret of te chapter is on twoway pivot tables and the ChiSquare test and distribution and the relevant Excel functions.
Chapter 6 starts with a complaint about the way that the Data Analysis addin makes things difficult for the statistician by returning results as vales rather than as formulas that could be explored to show what is going on. A main topic of the chapter is "internal validity"; the idea that experimental rescuers needs to be designed so that we can have confidence in its results. Carlberg goes through the factors that can compromise internal validity. After this he gives us a through exploration of the Ftest and the tool provided for it by the Excel addin, pointing out the shortcomings in the latter.
Chapter 7 returns to the Normal Distribution and introduces "skewness".and confidence intervals. The relevant Excel functions are introduced as well as thorough discussion of the concepts. The chapter concludes with the Central Limit Theorem.
The next three chapters are on Testing Differences Between Means. Chapter 8, which covers ztests and ttests opens by looking at the reasons for testing the differences between means. This chapter makes good use of charts to illustrate the points it makes, explaining:
You can teach yourself quite a bit about both the nature of a statistical test and about the data that plays into that test, by charting the data.
Chapter 9 is a detailed and discursive chapter that looks at three ttest approaches to investigating differences between means provided by Excel and the considerations associated with each of them. Then Chapter 10 looks at Analysis of Variance (ANOVA) and the F test as an alternative to ttests. You'll notice that this is pitched at a level that is pretty advanced  but the theoretical discussion continues to be readable and authoritative. There's a worksheet for the Scheffé Procedure and the chapter concludes with a discussion of planned orthogonal contrasts.
Chapter 11 goes into further issues concerning analysis of variance starting with Factorial ANOVA including the Data Analysis addin Two Factor with Replication tool. Next comes a discussion on interaction  the way factors operate jointly, then we look at the problem of unequal group size and considers the Two Factor without Replication tool. It concludes with a section on the limitations on the ANOVA tools in Excel, suggesting that in some circumstances you need different software.
The final four chapters comprise two on Multiple Regressional Analysis and two on Analysis of Covariance and in each case there's a chapter on the basics followed by one on Further Issues. These have the same balance of theory and practice using Excel established in earlier chapters and build on the ANOVA section.
Overall this is a readable and authoritative introduction to statistical analysis that makes highly intelligent, and where necessary critical, use of Excel. Like the author, I wish Excel had been available while I was engaged in research and now that it is this book will help researchers and business analysts in both the theory and practice of statistics.
TestDriven Database Development
Author: Mark Guernsey Publisher: AddisonWesley, 2013 Pages: 352 ISBN: 9780321784124 Aimed at: developers working with databases Rating: 4 Reviewed by: Kay Ewbank
Test Driven Development (TDD) is a wellknown and accepted technique for mainstream programming, but is less well known in the databa [ ... ]

Machine Learning for Hackers
Author: Drew Conway and John Myles White Publisher: O'Reilly Pages: 322 ISBN: 9781449303716 Aimed at: "hackers" Rating: 2 Pros: Practical approach to stats Cons: Mostly off topic and lacks explanation. Reviewed by: Mike James
Did any machines undertake any learning for the purposes of this book on a curre [ ... ]
 More Reviews 
