Getting Started With Data Science

Author: Murtaza Haider
Publisher: IBM Press
Pages: 608
ISBN: 978-0133991024
Print: 0133991024
Kindle: B019D322UU
Audience: IT people who need to analyse data
Rating: 4.5
Reviewer: Kay Ewbank

This book aims to make data analytics more accessible by using interesting examples. 

If you've enjoyed books such as Freakonomics or Outliers, you'll feel at home reading this book, as it uses a similar approach: take an interesting question such as 'Does the higher price of cigarettes deter smoking?' and use that as the basis for some data analysis.

The aim is to teach you how to do your own analyses. Haider works through the examples in R, Stata, SPSS and SAS. Within the book, the examples are worked mainly in R plus one of the other languages. The code for the remaining languages is available for download from the IBM Press website, along with details of how to use it.

The book opens with a chapter called 'the bazaar of storytellers' that discusses what data science is and gives the author's definition of a data scientist. The next chapter, 'data in the 24/7 connected world', identifies sources of data that you can analyse, and also introduces the concept of big data. Chapter three looks at how data becomes meaningful when it is used as the basis for 'stories'. Haider's view is that the strength of data science lies in the power of the narrative, and that is what underpins most of the book.

From a practical perspective, the book begins to get useful in chapter four, which looks at how you can generate summary tables, including multi-dimensional tables. Next is a chapter on graphics and how to generate them. If you're thinking that it seems a bit odd to concentrate on the 'end result' first, you have to remember that the author's view is that data analysis is only useful if your audience actually looks at the results and understands them.

The next chapter gets more into the workings of data analysis with an examination of hypothesis testing using techniques such as t-tests and correlation analysis. Regression analysis is looked at next, based on the notion of "why tall parents don't have even taller children". This is a fun chapter, with examples including consumer spending on food and alcohol, housing markets, and whether the appearance of teachers affects their evaluations by students.
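
To give a flavour of the regression idea behind that chapter title, here is a minimal Python sketch (the book itself works mainly in R). The heights are invented for illustration and are not data from the book, and the helper name ols_slope_intercept is hypothetical. It fits a least-squares line of child height on parent height; a slope between 0 and 1 means the children of tall parents are predicted to be taller than average, but closer to the average than their parents.

```python
from statistics import mean

def ols_slope_intercept(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Made-up heights in cm (assumption: illustrative only, not from the book).
parent_heights = [160, 165, 170, 175, 180, 185]
child_heights = [166, 167, 171, 173, 177, 178]

slope, intercept = ols_slope_intercept(parent_heights, child_heights)

# Predicted child height for the tallest parent in the sample.
tall_parent = 185
predicted_child = intercept + slope * tall_parent

print(f"slope = {slope:.3f}")
print(f"predicted child height for a {tall_parent} cm parent = {predicted_child:.1f} cm")
```

With these invented numbers the fitted slope is below 1, so the prediction for the tallest parent falls between the sample average for children and the parent's own height. That pull towards the average is what the chapter title alludes to.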

A chapter on analysis of binary variables considers logit and probit models using data from New York transit use. Categorical data and multinomial variables are the topic of the next chapter, which expands on the ideas of logit models.
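
As a rough illustration of how the two models differ, the Python sketch below maps a linear score to a probability using the logit and the probit link. Everything specific here is an assumption made for the example: the coefficients, the distance-to-station predictor and the function names are invented, and none of it comes from the book's New York transit data.

```python
import math
from statistics import NormalDist

def logit_probability(z):
    """Logit model: logistic function of the linear score z."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_probability(z):
    """Probit model: standard normal CDF of the linear score z."""
    return NormalDist().cdf(z)

# Hypothetical coefficients, invented for illustration only.
INTERCEPT = 1.0
COEF_DISTANCE_KM = -0.8  # assumed: farther from a station, less likely to ride

def linear_score(distance_km):
    """Linear predictor for a traveller living distance_km from a station."""
    return INTERCEPT + COEF_DISTANCE_KM * distance_km

for distance in (0.5, 1.25, 3.0):
    z = linear_score(distance)
    print(distance, round(logit_probability(z), 3), round(probit_probability(z), 3))
```

Both links turn any score into a value between 0 and 1 and agree at a score of zero. In practice each model would be fitted separately and would produce its own coefficients, so feeding the same score to both, as done here, only shows the shape of the two curves.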

Spatial data analysis is covered next, taking us into the use of GIS and how it has expanded the options for data analysis. There's a good chapter on time series analysis looking at how regression models can be used with time series data, using the example of forecasting housing markets.

The final chapter introduces the field of data mining. It's more of a taster, discussing some of the techniques that can be used, but fun anyway.

Overall, this is a book that is accessible and interesting, and still manages to introduce the statistical techniques you need for real data analysis work. A good way to get into data analysis.

Related Reviews

Data Science and Big Data Analytics

Doing Data Science

R in Action: Data Analysis and Graphics with R (2e)

Learning To Love Data Science



Last Updated ( Tuesday, 26 July 2016 )