Telling Stories With Data

Author: Dr. Rohan Alexander
Publisher: Chapman & Hall/CRC
Date: July 2023
Pages: 598
ISBN: 978-1032134772
Print: 1032134771
Kindle: B0C97MMKPX
Audience: Data scientists
Level: Intermediate
Category: Data Science
Rating: 4.5
Reviewer: Kay Ewbank

The aim of this book is to show how you can build and share knowledge based on data, and how to use R to build data-driven applications.

The book is organized into six parts: Foundations, Communication, Acquisition, Preparation, Modeling and Applications.


The Foundations part of the book starts with an overview of the book's intent. The author then moves on to a set of worked examples that illustrate the principles covered in the rest of the book and follow the recommended workflow of plan, simulate, acquire, model and communicate.


Chapter 3 then introduces tools that can be used in the workflow to ensure your results can be reproduced: Quarto for documents that integrate text and R code, R Projects to make the project independent of a specific directory structure, and Git and GitHub for sharing code and data. The chapter also looks at using R.

Part Two of the book considers communication, with chapters on how to write an effective report, and how to make good use of graphs, tables and maps. 

Part Three is concerned with how you acquire useful data. There's a chapter on measurement and sampling that also looks at publicly available data such as census data and other government statistics. This is followed by a chapter on techniques for gathering data, such as scraping, OCR for data that isn't available digitally, and extraction from PDFs. This part of the book ends with techniques that you can use to acquire your own data, including conducting an experiment, running an A/B test, and running surveys.

Once you have acquired your data, the next part of the book considers how to prepare it, turning raw data into something that can be shared and explored. There's a good chapter on cleaning and preparing the data, and another useful one on storing and retrieving it, including how to use R data packages and Parquet.

Part Five gets on to data modeling, from exploratory data analysis to help you understand the data, through the use of linear models, to generalized linear models including logistic, Poisson, and negative binomial regression.

The final main part of the book considers applications of modeling. There's a chapter on making causal claims from observational data that looks at how you might make use of difference-in-differences, regression discontinuity, and instrumental variables. A chapter on multilevel regression with post-stratification shows how to use a statistical model to adjust for known biases. This part of the book ends with a chapter on the analysis of text-based data.

The final chapter is made up of advice on how to go further and what to read to support this.

Overall, this is a useful book if you want to do data analysis with some use of R. You do need to be reasonably confident with statistics, or willing to read around the material, but each chapter comes with a list of things you can read before working through it, and there are frequent suggestions for more material throughout the text. There are also lots of examples in R, and plenty of exercises to follow. If you're willing to put the work in, this is a book that will teach you a lot.



