Apache Samza Adds Container Placements API
Written by Kay Ewbank
Thursday, 09 July 2020
Apache Samza, the distributed stream processing framework, has been updated to version 1.5. Improvements include a simplified job submission workflow that tightens security, and the ability to move containers without restarting an application.
Samza is an open source framework that was originally developed at LinkedIn alongside Kafka, before being taken over by the Apache Software Foundation. It was designed to provide a simple way to develop and run stream processing jobs that can be used by non-programmers as well as developers. It uses Apache Kafka for messaging, Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management, and RocksDB for local state support. It has a simple callback-based “process message” API comparable to MapReduce, and supports managed state via snapshotting and restoration of a stream processor’s state.
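The callback model and managed state described above can be illustrated with a toy sketch. It is written in Python for brevity and is not Samza's Java API: the names `WordCountTask`, `process`, `snapshot` and `restore` are hypothetical, chosen only to show the shape of a per-message callback whose state can be saved and rebuilt.

```python
import json


class WordCountTask:
    """Toy stream processor: one callback per message, plus managed state.

    All names here are hypothetical and only illustrate the idea;
    this is not the Samza API.
    """

    def __init__(self):
        # Local state, loosely analogous to a local key-value store.
        self.counts = {}

    def process(self, message, emit):
        # Called once per incoming message; may emit output messages.
        for word in message.split():
            self.counts[word] = self.counts.get(word, 0) + 1
            emit((word, self.counts[word]))

    def snapshot(self):
        # Serialize the state so it could be stored durably.
        return json.dumps(self.counts, sort_keys=True)

    def restore(self, snapshot):
        # Rebuild the state from a previously taken snapshot.
        self.counts = json.loads(snapshot)


def run_demo():
    output = []
    task = WordCountTask()
    for msg in ["a b a", "b c"]:
        task.process(msg, output.append)

    saved = task.snapshot()

    # Simulate a restart: a fresh task restores the state and carries on.
    recovered = WordCountTask()
    recovered.restore(saved)
    recovered.process("a", output.append)
    return recovered.counts, output
```

In this sketch the recovered task continues counting from the saved totals rather than from zero, which is the point of snapshotting and restoring a processor's state.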
This release adds a Container Placements API, which lets you move or restart one or more containers (either active or standby) of your cluster-based applications from one host to another without restarting the application. The API can also be used to build maintenance, balancing and remediation tools.
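To make the idea concrete, here is a minimal Python sketch of the kind of maintenance tool such an API makes possible. It is a toy model rather than Samza's actual interface: `ToyJob`, `place` and `drain_host` are hypothetical names, and the only behaviour modelled is that moving a container affects that container alone while the job keeps running.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJob:
    """Hypothetical model of a running job; not Samza's real data structures."""
    placements: dict = field(default_factory=dict)          # container id -> host
    container_restarts: dict = field(default_factory=dict)  # container id -> count
    job_restarts: int = 0

    def place(self, container_id, destination_host):
        """Move (or restart in place) a single container."""
        if container_id not in self.placements:
            raise KeyError(f"unknown container: {container_id}")
        # Only the targeted container is stopped and started again;
        # the job and its other containers keep running.
        self.placements[container_id] = destination_host
        self.container_restarts[container_id] = (
            self.container_restarts.get(container_id, 0) + 1
        )


def drain_host(job, host, spare_host):
    """Example maintenance tool: move every container off one host."""
    moved = [cid for cid, h in job.placements.items() if h == host]
    for cid in moved:
        job.place(cid, spare_host)
    return moved
```

Calling `drain_host(job, "host-a", "host-c")` on a job with containers on `host-a` relocates only those containers and leaves `job_restarts` at zero, which mirrors the "without restarting the application" property described above.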
The Job Runner has been simplified and now just submits the Samza job to the YARN ResourceManager (RM) without executing any user code. Job planning happens on the ClusterBasedJobCoordinator instead. The developers say this simplified workflow addresses security requirements where job submissions must be isolated in order to execute user code. It also eases the operational pain of deployments that could fail at multiple places.
Samza now has better facilities for managing and monitoring local state, with the addition of KV store metrics for RocksDB and a fix that ensures Get store names returns the correct store names in the presence of side inputs. The Samza SQL API has been improved with support for subqueries in joins and validation of the argument types of Samza SQL UDFs at the execution planning phase. There's also a new system producer for Azure Blob Storage.
Last Updated (Thursday, 09 July 2020)