Apache Flink 1.5.0 Adds Support For Broadcast State

Written by Kay Ewbank

Friday, 08 June 2018

The latest version of Apache Flink has been released with a rewritten deployment and process model, and support for broadcast state.

Apache Flink is an open source platform for distributed stream and batch data processing. It consists of a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink includes several APIs, including the DataSet API for static data embedded in Java, Scala, and Python; the DataStream API for unbounded streams embedded in Java and Scala; the Table API with a SQL-like expression language embedded in Java and Scala; and the streaming SQL API that enables SQL queries to be executed on streaming and batch tables, with a syntax is based on Apache Calcite.

One addition to the latest release is a new SQL CLI client that is the first step to adding a service to execute streaming and batch SQL queries. The client provides a SQL shell to run exploratory queries on data streams.

flinkcli

The developers have redesigned and reimplemented large parts of Flink’s process model, and say that while there is still work to do on this reworking, the changes already implemented enable more natural Kubernetes deployments, and mean that all requests to the JobManager now happen through REST. The improvements also add support for dynamic resource allocation and dynamic release of resources on YARN and Mesos schedulers for better resource utilization. In a later version it will be possible to dockerize jobs and deploy them in a natural way as part of the container deployment.

The new support for broadcast state is something that developers using Flink have requested. Broadcast state is replicated across all parallel instances of a function, and might typically be used where you have two streams, a regular data stream alongside a control stream that serves rules, patterns, or other configuration messages. Changes to the processing of the regular stream are configured by the messages of the control stream. By broadcasting rules or patterns to all parallel instances of a function, they can be applied to all events of the regular stream.

The next improvement is aimed at improving the efficiency of failure recovery. While in normal use, Flink writes copies of an application’s state to a remote, persistent storage and loads it back in case of a failure. This ensures state information is not lost, but recovering the state takes time. The new feature, task-local state recovery, adds the ability to also keep a copy on the local disk of each machine of the application’s state, so recovery is faster.

Other improvements include extended join support for SQL and Table API with the addition of support for windowed outer equi-joins; and improved reading and writing JSON messages from and to connectors.

flinklogo

More Information

Flink website

Flink Gets Event-time Streaming

FLink Reaches Top Level Status

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.

Computer Science Under Threat
02/07/2025

As the demand for "entry-level" programmers declines, established university Computer Science (CS) departments are facing a shortfall of students. How should they adapt their admission policies and&nb [ ... ]

+ Full Story

pg_disatch - Run SQL Queries Asynchronously On PostgreSQL
24/06/2025

pg_disatch is meant to be a TLE-compliant alternative to pg_later but built on top of pg_cron. What makes it different?

+ Full Story

More News

Comments

or email your comment to: comments@i-programmer.info

Last Updated ( Friday, 08 June 2018 )

Recent Articles

Recent Book Reviews

Popular Articles

More Information

Related Articles

Comments