When you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig. Author Alan Gates is a co-founder of Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. This second edition, updated with programming examples, provides comprehensive coverage of key features such as the Pig Latin scripting language and the Grunt shell.
Author: Alan Gates and Daniel Dai
Publisher: O'Reilly
Date: November 2016
Pages: 347
ISBN: 978-1491937099
Print: 1491937092
Kindle: B01N8TQ46A
Audience: Data programmers
Level: Intermediate
Category: Data Science
- Understand Pig's data model, including scalar and complex data types
- Write Pig Latin scripts to sort, group, join, project, and filter your data
- Use Grunt to work with the Hadoop Distributed File System (HDFS)
- Build complex data processing pipelines with Pig's macros and modularity features
- Embed Pig Latin in Python for iterative processing and other advanced tasks
- Use Pig with Apache Tez to build high-performance batch and interactive data processing applications
- Create your own load and store functions to handle data formats and storage mechanisms.