Google Launches Cloud Dataproc
Written by Kay Ewbank   
Friday, 02 October 2015

Google has launched a beta version of Google Cloud Dataproc, a managed service intended to make running Hadoop and Spark quicker and easier.

Google continues to expand its range of cloud services for working with Big Data (see Google Announces Big Data the Cloud Way). Now available in beta, Cloud Dataproc is a managed Spark and Hadoop service that lets you use open source data tools for batch processing, querying, streaming, and machine learning. The aim is to let you create clusters quickly, manage them easily, and save money by turning clusters off when you don't need them.


Clusters can range from three nodes up to hundreds of nodes, and the service is priced at 1 cent per virtual CPU in your cluster per hour, on top of the usual cost of running virtual machines and data storage. Clusters can include preemptible instances that have lower compute prices, and you’re charged by the minute, with a ten-minute minimum billing period. The claim is that you’ll be able to start, scale, and shut down a cluster in 90 seconds or less.
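As a rough illustration of the pricing described above, the following Python sketch estimates the Cloud Dataproc charge alone. The function name and parameters are hypothetical, rounding partial minutes up is an assumption, and the figure excludes the separate cost of the virtual machines and storage.

```python
import math

def dataproc_fee_usd(vcpus, minutes, rate_per_vcpu_hour=0.01, minimum_minutes=10):
    """Estimate the Cloud Dataproc charge in US dollars.

    Assumptions (not confirmed by the article): partial minutes are
    rounded up, and the ten-minute minimum applies per cluster run.
    Virtual machine and storage costs are NOT included.
    """
    # Per-minute billing with a ten-minute minimum.
    billed_minutes = max(math.ceil(minutes), minimum_minutes)
    # 1 cent per virtual CPU per hour, prorated by the minute.
    return vcpus * (billed_minutes / 60) * rate_per_vcpu_hour

# Example: a three-node cluster with 4 virtual CPUs per node (12 in total).
short_run = dataproc_fee_usd(vcpus=12, minutes=5)   # billed as 10 minutes
long_run = dataproc_fee_usd(vcpus=12, minutes=90)   # billed as 90 minutes
print(f"5-minute run:  ${short_run:.2f}")
print(f"90-minute run: ${long_run:.2f}")
```

Under these assumptions the ten-minute minimum means a five-minute run costs the same as a ten-minute one, which is why very short jobs do not get proportionally cheaper.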

The service comes with built-in integration with other Google Cloud Platform services, such as BigQuery, Cloud Storage, Cloud Bigtable, Cloud Logging, and Cloud Monitoring. You can interact with clusters and Spark or Hadoop jobs through the Google Developers Console, the Google Cloud SDK, or the Cloud Dataproc REST API. When you're done with a cluster, it can be turned off to save money, and data is safe because Cloud Dataproc is integrated with Cloud Storage, BigQuery, and Cloud Bigtable. A free 60-day trial of the Google Cloud Platform is available.

Because the service is based around Spark and Hadoop and other elements of the ecosystem such as Pig and Hive, developers will be able to begin work without needing to learn new tools or APIs, and existing projects or ETL pipelines can be moved to the new service without redevelopment.




More Information

Google Cloud Dataproc

Related Articles

Google Announces Big Data the Cloud Way 

Google Moves On From MapReduce, Launches Cloud Dataflow

Google Cloud Dataflow SDK 

Google BigQuery Service

Major Update to Google BigQuery

BigQuery Now Open to All 

