Apache Samza Adds Container Placements API
Written by Kay Ewbank   
Thursday, 09 July 2020

Apache's distributed stream processing framework Samza has been updated to version 1.5. Improvements include a simplified job submission workflow that provides better security, and the ability to move containers without having to restart an application.

 

Samza was originally developed at LinkedIn alongside Kafka, before being made open source and taken over by the Apache Software Foundation. It was designed to provide a simple way to develop and run stream processing jobs that can be used by non-programmers as well as developers. It uses Apache Kafka for messaging; Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management; and RocksDB for local state support. It has a simple callback-based “process message” API comparable to MapReduce, and supports managed state via snapshotting and restoration of a stream processor’s state.
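To give a feel for what a callback-based "process message" API with snapshot-and-restore state looks like, here is a minimal sketch in plain Python. It is not Samza's actual API: the class `WordCountTask`, its `process`, `snapshot` and `restore` methods, and the `run` helper are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Emit = Callable[[Tuple[str, int]], None]


class WordCountTask:
    """Toy stream task. Hypothetical, NOT Samza's API: it only shows the pattern."""

    def __init__(self) -> None:
        # Local state owned by the task (Samza keeps this kind of state in RocksDB).
        self.counts: Dict[str, int] = {}

    def process(self, message: str, emit: Emit) -> None:
        # The callback the framework would invoke once per incoming message.
        for word in message.split():
            self.counts[word] = self.counts.get(word, 0) + 1
            emit((word, self.counts[word]))

    def snapshot(self) -> Dict[str, int]:
        # Return a copy of the state so it can be stored elsewhere.
        return dict(self.counts)

    def restore(self, state: Dict[str, int]) -> None:
        # Rebuild local state from an earlier snapshot.
        self.counts = dict(state)


def run(task: WordCountTask, messages: List[str]) -> List[Tuple[str, int]]:
    # Stand-in for the framework loop: feed each message to the callback.
    out: List[Tuple[str, int]] = []
    for message in messages:
        task.process(message, out.append)
    return out


first = WordCountTask()
run(first, ["a b", "a"])
saved = first.snapshot()

# A replacement processor picks up where the first one left off.
second = WordCountTask()
second.restore(saved)
output = run(second, ["a c"])
print(output)  # [('a', 3), ('c', 1)]
```

The point of the sketch is that the task author writes only the per-message callback, while the surrounding loop and the handling of saved state would be the framework's job.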


This release adds a Container Placements API that lets you move or restart one or more containers (either active or standby) of your cluster-based applications from one host to another without restarting the application. The API can also be used to build maintenance, balancing and remediation tools.

Job Runner has been simplified and now just submits the Samza job to the YARN Resource Manager (RM) without executing any user code. Job planning happens on the ClusterBasedJobCoordinator instead. The developers say this simplified workflow addresses security requirements where job submissions need to be isolated in order to execute user code. It also reduces the operational burden in setups where a deployment could fail in multiple places.

Samza now has better facilities for managing and monitoring local state, with the addition of KV store metrics for RocksDB and a fix that makes "get store names" return the correct store names in the presence of side inputs. The Samza SQL API has been improved with support for subqueries in joins and validation of argument types in Samza SQL UDFs at the execution planning phase. There's also a new system producer for Azure Blob Storage.

 


More Information

Samza Site

Related Articles

Apache Samza Adds SQL

Apache Bigtop Adds OpenJDK 8 Support 

Apache Fluo Improves Spark Integration

Kafka 1 Becomes More Tolerant

Comparing Kafka To RabbitMQ

Apache Kafka Adds New Streams API

GoKa Stream Processing For Kafka

 


Last Updated ( Thursday, 09 July 2020 )