Google Cloud Dataflow SDK
Written by Kay Ewbank   
Monday, 05 January 2015

Google has released a Java software development kit (SDK) for Cloud Dataflow, its cloud analytics system launched in 2014.

 

Announcing the SDK on the Google Cloud Platform blog, Sam McVeety, Software Engineer at Google, said that the SDK will make it “easier for developers to integrate with our managed service while also forming the basis for porting Cloud Dataflow to other languages and execution environments.”

The blog post says Google hopes that the developer community will create apps that combine stream-based and batch-based processing models. The Cloud Dataflow SDK introduces a unified model for this, along with a set of windowing primitives that allow the same computations to be applied to both batch and streaming data sources.
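To give a feel for what a windowing primitive does, here is a minimal sketch in plain Java. It does not use the Cloud Dataflow SDK, and every name in it (FixedWindowSketch, Event, countPerWindow) is hypothetical and chosen only for illustration. It assigns each timestamped element to a fixed one-minute window and counts the elements per window. Because the computation reads from an Iterator, it behaves the same whether the elements come from a complete collection or arrive one at a time. A real streaming system also has to decide when to emit results and how to treat late data, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: none of these names come from the Cloud Dataflow SDK.
final class FixedWindowSketch {

    /** A value tagged with the event time (in milliseconds) at which it occurred. */
    static final class Event {
        final long timestampMillis;
        final String value;

        Event(long timestampMillis, String value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /**
     * Counts events per fixed window, keyed by the window's start time.
     * The input is an Iterator, so the method does not care whether the
     * elements come from a finite collection or are handed over one by one.
     */
    static Map<Long, Integer> countPerWindow(Iterator<Event> events, long windowSizeMillis) {
        Map<Long, Integer> counts = new TreeMap<>();
        while (events.hasNext()) {
            Event e = events.next();
            // Round the timestamp down to the start of the window it falls in.
            long windowStart = Math.floorDiv(e.timestampMillis, windowSizeMillis) * windowSizeMillis;
            counts.merge(windowStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A "batch" source: the whole data set is available up front.
        List<Event> batch = Arrays.asList(
            new Event(5_000L, "a"),
            new Event(59_999L, "b"),
            new Event(60_000L, "c"),
            new Event(125_000L, "d"));
        System.out.println(countPerWindow(batch.iterator(), 60_000L));

        // The same events arriving in a different order give the same counts,
        // because windows are assigned from event timestamps, not arrival order.
        List<Event> arrivals = Arrays.asList(
            batch.get(2), batch.get(0), batch.get(3), batch.get(1));
        System.out.println(countPerWindow(arrivals.iterator(), 60_000L));
    }
}
```

Both calls print {0=2, 60000=1, 120000=1}: two events fall in the first minute and one in each of the next two.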

Google also hopes developers will use the SDK to execute Dataflow on other service environments, commenting that “As Storm, Spark, and the greater Hadoop family continue to mature, developers are challenged with bifurcated programming models. We hope to relieve developer fatigue and enable choice in deployment platforms by supporting execution and service portability.”

Google also plans to release SDKs for other languages, and is already working on a Python 3 version. The Java version can be downloaded from GitHub.

 


More Information

Google Announces Open-Source Cloud Dataflow SDK for Java

Cloud Dataflow

DataflowJavaSDK on GitHub

Related Articles

Google Moves On From MapReduce, Launches Cloud Dataflow

 


Last Updated ( Monday, 05 January 2015 )