Hortonworks has announced a sandbox environment for Hadoop, aimed at giving new users an easy way to learn Hadoop.
Hortonworks Sandbox is designed to be an easy-to-use environment for learning Apache Hadoop. The Sandbox is a free download of a complete, self-contained virtual machine with Apache Hadoop pre-configured. You also get a Web interface and a set of hands-on, step-by-step tutorials that you can use to learn about Hadoop. It is designed to help close the gap between wanting to learn and evaluate Hadoop and the complexity of spinning up an evaluation cluster.
The tutorials are based on what Hortonworks has learned while training people in its Hortonworks University training classes.
The Sandbox is a single-node implementation of the Hortonworks Data Platform (HDP) 1.2 that behaves just like a normal Hadoop environment. This lets you add your own datasets in an isolated, protected environment and evaluate how Hadoop would fit into your own data architectures.
While the Sandbox is primarily aimed at people who’ve not used Hadoop before, Hortonworks also intends the environment to be used by more experienced developers who want to explore the features of the projects included in the Sandbox: Apache Pig, Apache Hive, Apache HCatalog and Apache HBase.
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. HCatalog is a table and storage management service for data created using Apache Hadoop, while HBase is the Hadoop database, a distributed, scalable store for large tables.
Hortonworks plans to release new tutorials roughly once a month. The Sandbox will also contain demos and exercises covering integration with tools and applications from Hortonworks partners such as Teradata, Alteryx, Datameer, and Microsoft.