Apache Beam 2.70 Improves Python Support
Written by Kay Ewbank   
Thursday, 01 January 2026

Apache Beam, the open source SDK for defining batch and streaming data-parallel processing pipelines, is now available in a new version. Apache Beam 2.70 has been released with improved support for Flink and Python.

Apache Beam has a number of SDKs that you can use to build a program that defines a pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. Beam began life at Google, and is used by the Google Cloud Dataflow (GCD) service. Beam uses the same API as GCD.


The advantage Apache Beam offers is its single, unified API for both batch and streaming data processing, which sets it apart from most other parallel processing frameworks. It acts as an abstraction layer: the user says what needs processing, and the underlying runner, such as Spark or Flink, takes care of how and where it runs. The single API means you can write code just once for both batch and streaming data, then run the same pipeline on different runners without rewriting. This, along with the ability to use SDKs for a wide range of languages including Java, Python, Go, and SQL, makes Beam a good choice.

The notable changes in this release start with support for Apache Flink 1.20. Flink is an open source platform for distributed stream and batch data processing, with a streaming dataflow engine that handles data distribution and distributed computations over data streams. The Flink Runner in Beam translates Beam pipelines into Flink jobs.

Python improvements start with the addition of full support for Milvus integration, including Milvus enrichment and sink operations, with a new Milvus sink I/O connector added for Python. Milvus is an open-source vector database that provides scalable storage for large amounts of vector embeddings and supports high-performance similarity searches of vector data. The Beam developers have also added examples for the Milvus search enrichment handler on the Beam Website, including a Jupyter notebook example.

Other improvements include a change to make CloudPickle the default Pickle library, and support for ReadAllFromBigQuery for the Java runner. This was previously only available in the Python runner, meaning developers using the Java runner had to use the BigQuery client library to interact directly with BigQuery when they needed to periodically refresh data.
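Why the choice of pickle library matters: a pipeline's functions have to be serialized so they can be shipped to workers, and pipeline code is full of lambdas and closures. The sketch below, which is independent of Beam, shows that the standard library's pickle refuses a closure while cloudpickle serializes it by value. The names can_pickle and make_scaler are made up for the example, and the cloudpickle part only runs if that third-party package is installed.

```python
import pickle

try:
    import cloudpickle  # third-party package; optional for this sketch
except ImportError:
    cloudpickle = None


def can_pickle(obj, dumps=pickle.dumps):
    """Return True if `dumps` can serialize `obj`, False otherwise."""
    try:
        dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_scaler(factor):
    """Return a closure, the kind of function often passed to a Map step."""
    def scale(x):
        return x * factor
    return scale


triple = make_scaler(3)

# The standard pickle stores functions by reference (module plus name),
# so a function defined inside another function cannot be serialized.
print("stdlib pickle:", can_pickle(triple))

if cloudpickle is not None:
    # cloudpickle stores the function's code and captured variables,
    # and its output can be loaded with the standard pickle.loads.
    restored = pickle.loads(cloudpickle.dumps(triple))
    print("cloudpickle round trip:", restored(14))
```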

Beam 2.70 is available now.  


More Information

Beam Website

Related Articles

Apache Beam Moves To Java 8

Apache Beam Moves To Top Level
