Apache Samza Adds Container Placements API
Written by Kay Ewbank   
Thursday, 09 July 2020

Apache's distributed stream processing framework Samza has been updated to version 1.5. Improvements include a simplified job submission workflow that provides improved security, and the ability to move containers without having to restart an application.


Samza is a stream processing framework originally developed alongside Kafka at LinkedIn before being made open source and taken over by the Apache Software Foundation. It was designed to provide a simple way to develop and run stream processing jobs that can be used by non-programmers as well as developers. It uses Apache Kafka for messaging; Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management; and RocksDB for local state support. It has a simple callback-based “process message” API comparable to MapReduce, and supports managed state via snapshotting and restoration of a stream processor’s state.
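To give a feel for the callback-plus-managed-state pattern described above, here is a minimal sketch in Python. It is not Samza's API (which is Java): the names WordCountTask, process, snapshot, restore and run are all hypothetical, chosen only to illustrate a per-message callback whose local state can be snapshotted and later restored.

```python
from typing import Callable, Dict, List, Tuple

Output = Tuple[str, int]


class WordCountTask:
    """Hypothetical stream task that keeps a running count per word."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def process(self, message: str, emit: Callable[[Output], None]) -> None:
        # Invoked once per incoming message, much like a map step,
        # except that it can read and update local state.
        for word in message.split():
            self.counts[word] = self.counts.get(word, 0) + 1
            emit((word, self.counts[word]))

    def snapshot(self) -> Dict[str, int]:
        # Return a copy so later updates do not alter the snapshot.
        return dict(self.counts)

    def restore(self, snapshot: Dict[str, int]) -> None:
        # Rebuild local state from a previously taken snapshot.
        self.counts = dict(snapshot)


def run(task: WordCountTask, messages: List[str]) -> List[Output]:
    """Toy driver (an assumption): feeds messages to the task in order."""
    out: List[Output] = []
    for message in messages:
        task.process(message, out.append)
    return out


if __name__ == "__main__":
    first = WordCountTask()
    print(run(first, ["a b", "a"]))
    saved = first.snapshot()

    # A replacement task picks up where the first one left off.
    second = WordCountTask()
    second.restore(saved)
    print(run(second, ["a"]))
```

In this sketch the second task continues counting from the restored state rather than starting from zero, which is the point of managed state: a processor that fails or is moved can resume without replaying its whole input.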


This release adds a Container Placements API that lets you move one or more containers (either active or standby) of your cluster-based applications from one host to another, or restart them, without restarting the application. The API can also be used to build maintenance, balancing and remediation tools.

Job Runner has been simplified and now simply submits the Samza job to the YARN Resource Manager (RM) without executing any user code. Job planning happens on the ClusterBasedJobCoordinator instead. The developers say this simplified workflow addresses security requirements where job submissions need to be isolated in order to execute user code. It also eases operations in cases where deployment failures could happen at multiple places.

Samza now has better facilities for managing and monitoring local state with the addition of KV store metrics for RocksDB, and a fix so that Get store names returns the correct store names in the presence of side inputs. The Samza SQL API has been improved with support for subqueries in joins and validation of argument types in Samza SQL UDFs at the execution planning phase. There's also a new system producer for Azure Blob Storage.



More Information

Samza Site

Related Articles

Apache Samza Adds SQL

Apache Bigtop Adds OpenJDK 8 Support 

Apache Fluo Improves Spark Integration

Kafka 1 Becomes More Tolerant

Comparing Kafka To RabbitMQ

Apache Kafka Adds New Streams API

GoKa Stream Processing For Kafka



Last Updated ( Thursday, 09 July 2020 )