Mining the Social Web

Author: Matthew A. Russell
Publisher: O'Reilly, 2011
Pages: 360
ISBN: 978-1449388348
Aimed at: Programmers enthusiastic about the social web
Rating: 4.5
Pros: Covers many technical topics with lots of practical examples
Cons: Few really useful examples
Reviewed by: Mike James

The social web provides many opportunities for data mining. This book is fun, but not for the faint-hearted.

I never fail to be amazed at the range of high-tech things that Python is used for. This particular book is all about data mining as applied to the massive amounts of data generated every day by the social web. Be warned, however, that this is a fast-paced and technical view of a great many topics. It isn't for the faint-hearted, but it is a lot of fun and, who knows, it might even be useful.

The first chapter starts off with how to install Python and the necessary packages. The first example, to make sure it is all working, is a simple Twitter data collection application complete with some simple data analysis - frequency, lexical diversity and drawing graphs. The level of Python used throughout the book is never over-complex. If you can't program in Python, then as long as you can program in something you should be able to get value from the examples.
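To give a flavour of the frequency and lexical diversity measures mentioned above, here is a minimal sketch. It is not taken from the book: the sample tweets are invented, the function names are my own, and lexical diversity is assumed to mean the ratio of unique tokens to total tokens.

```python
from collections import Counter


def lexical_diversity(tokens):
    """Return unique tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def top_terms(tokens, n=3):
    """Return the n most frequent tokens as (token, count) pairs."""
    return Counter(tokens).most_common(n)


# Hypothetical sample data standing in for collected tweets.
sample_tweets = [
    "python makes data mining fun",
    "mining the social web with python",
    "data data everywhere",
]

# Naive whitespace tokenisation; real tweets would need more careful cleaning.
tokens = [word.lower() for tweet in sample_tweets for word in tweet.split()]

print(f"{len(tokens)} tokens, diversity {lexical_diversity(tokens):.2f}")
print(top_terms(tokens))
```

A value close to 1.0 means almost every word is different, while a low value points to a lot of repetition.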

Chapter 2 is on microformats and using them to extract data from web pages - geolocation, recipes and reviews. The next chapter takes a step into the world of email, which you might not think of as social media in the modern sense - but the data is still there to mine! This also introduces CouchDB, Lucene and RESTful web services.

Chapter 4 returns to Twitter and goes much deeper into processing the sort of data that you can get from it. A lot of this comes across as very ad-hoc rather than a worked-out approach, but this probably corresponds to the nature of the work. The later part of the chapter focuses on analysing social networks - clique detection and graph theory. The next chapter continues working with Twitter. This is another ad-hoc analysis, but in this case of Tim O'Reilly's tweets. Later we have a graphic indicating tweet frequency as a tag cloud.
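For readers unfamiliar with clique detection, the idea is to find groups in which everyone is connected to everyone else. The sketch below uses the basic Bron-Kerbosch algorithm in plain Python. It is my own illustration, not the book's code, and the friendship graph and its names are made up.

```python
def maximal_cliques(graph):
    """Yield each maximal clique of an undirected graph.

    graph maps every node to the set of its neighbours.
    """
    def expand(clique, candidates, excluded):
        # No candidates and nothing excluded: the clique cannot grow further.
        if not candidates and not excluded:
            yield clique
            return
        for node in list(candidates):
            yield from expand(
                clique | {node},
                candidates & graph[node],
                excluded & graph[node],
            )
            # node is fully explored: move it from candidates to excluded.
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(graph), set())


# Hypothetical mutual-follow graph.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"cat"},
}

for group in maximal_cliques(friends):
    print(sorted(group))
```

Here ann, bob and cat all follow each other and so form one clique, while cat and dan form another. On graphs of realistic size this simple version becomes slow, which is one reason to reach for a graph library instead.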

Chapter 6 moves on to consider LinkedIn, which in many ways has a more direct connection with making a profit. The focus of the chapter is on cluster analysis, and this is a fairly classical approach.

Chapter 7 changes gear a little and looks at the analysis of texts rather than structures and relationships. It introduces some natural language processing with NLTK, which continues into the following chapters on processing blogs. This is the toughest problem tackled in the book and, of course, it doesn't present a complete solution, but it is full of ideas.

Chapter 9 finally reaches Facebook, which perhaps presents the most diverse of the social data on offer, being a mix of blogging, images, micro-blogging and so on. The book rounds off with a chat about the semantic web - with no definite conclusions.

This is an interesting introduction to accessing social data, and it presents lots of different ways of working with it and displaying it. I have to admit that I enjoyed reading it, but there were times when I wondered why exactly we were doing something. Yes, it's fun, but is it useful? Perhaps the very fact that I'm asking this question suggests that I'm not, in the last analysis, the ideal reader for this book - even so, a few examples that were really useful would have made me feel happier.

A really enjoyable book and highly recommended to the appropriate reader.

Last Updated ( Monday, 16 May 2011 )