Data Smart (Wiley)

Author: Jordan Goldmeier
Publisher: Wiley
Date: November 2023
Pages: 448
ISBN: 978-1119931386
Print: 111993138X
Kindle: B0CJPP7XZM
Audience: Excel users
Level: Introductory
Category: Data Science
Rating: 4.5
Reviewer: Kay Ewbank

This is an updated edition of a well-regarded title that looks at accessible ways to combine statistics and machine learning with Excel to discover insights in your data.

It has been revised by Jordan Goldmeier, who wasn't the original author. He is a self-confessed Excel lover and a Microsoft MVP.


The book kicks off with a chapter titled 'everything you ever needed to know about spreadsheets but were too afraid to ask', in which Goldmeier introduces Excel tables and lookup formulas, pivot tables and array formulas. 

He then goes on to look at Power Query, Microsoft's data transformation and data preparation engine. The chapter considers how to use Power Query's graphical interface to retrieve data, and its editor to apply transformations and carry out the extract, transform, and load (ETL) processing of data.

Chapter three has the light-hearted title "Naïve Bayes and the Incredible Lightness of Being an Idiot." Goldmeier starts with what he says is the world's fastest intro to probability theory before going on to consider the chain rule, Bayes' rule, and how to use Bayes to create an AI model.

Two chapters on cluster analysis are next, starting with a look at using K-Means to segment your customer base, then going on to network graphs and community detection. 

Goldmeier then looks at regression, which he describes as the granddaddy of supervised artificial intelligence. The concepts are explained well, and the examples are carefully chosen to make the ideas clear.

Next comes a chapter on ensemble models that Goldmeier describes as a whole lot of bad pizza. He's referring to an episode of the US version of the sitcom The Office in which the boss asks whether it's better to have a small amount of really good pizza or a lot of really bad pizza. He then extrapolates, saying many AI implementations are closer to the 'lots of bad pizza' model.

A chapter on forecasting starts from the premise that there's no point worrying because you can't win, and Goldmeier backs this up by stating that the only guarantee in forecasting is that your forecast is wrong. He adds that this doesn't mean you shouldn't try forecasting, and that you'll still end up knowing more than nothing.

Chapters on optimization modeling and outlier detection consider whether these techniques could be described as data science. 

Goldmeier then looks at how to go beyond spreadsheets with a chapter on R.

Overall, this is a good introduction to data analysis using straightforward tools and mainstream techniques. I suspect most developers would find it more useful to use R and go further, but the book could help you get started with data analysis. Worth reading. 

 


