Apache Pinot 1.4 Improves Multistage Engine

Written by Kay Ewbank

Tuesday, 14 October 2025

Apache Pinot 1.4 has been released with significant improvements to the Multistage Engine, Pauseless Consumption and Time Series Engine among a wide range of other enhancements. Pinot is a real-time distributed OLAP datastore that is purpose-built for low-latency, high-throughput analytics.

Pinot was originally developed at LinkedIn to power queries such as showing users who had viewed their profile. Pinot can perform typical analytical operations such as slice and dice, drill down, roll up, and pivot on large-scale multi-dimensional data.

Apache describes Pinot as being ideal for ingesting and immediately querying data from streaming or batch data sources (including Apache Kafka, Amazon Kinesis, Hadoop HDFS, Amazon S3, Azure ADLS, and Google Cloud Storage).

The improvements in the new version start with a new query mode for running Multistage Engine (MSE) queries against Pinot, heavily inspired by Uber's Presto over Pinot query architecture. The MSE Lite Mode runs queries following a Scatter-Gather paradigm, with a configurable limit on the number of records returned by each instance of the leaf stage. MSE Lite Mode can also scale to thousands of QPS with minimal hardware, meaning users can now run complicated multi-stage queries that use features such as sub-queries and window functions at high QPS and low latency.
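To make the Scatter-Gather idea with a per-leaf record limit concrete, here is a minimal Python sketch. It is not Pinot code: the function names, the `leaf_limit` parameter and the sample rows are all invented for illustration, and a real engine would run the leaf stages in parallel on separate servers.

```python
def leaf_scan(rows, predicate, leaf_limit):
    """Hypothetical leaf stage: filter local rows, stop once leaf_limit rows match."""
    matched = []
    for row in rows:
        if predicate(row):
            matched.append(row)
            if len(matched) >= leaf_limit:
                break  # the cap bounds how much each leaf sends upstream
    return matched


def scatter_gather(servers, predicate, leaf_limit):
    """Hypothetical broker step: send the scan to every server, then merge results."""
    gathered = []
    for local_rows in servers:  # sequential here; a real system would fan out in parallel
        gathered.extend(leaf_scan(local_rows, predicate, leaf_limit))
    return gathered


# Two pretend servers, each holding part of the table.
servers = [
    [{"user": "a", "views": 5}, {"user": "b", "views": 12}, {"user": "c", "views": 30}],
    [{"user": "d", "views": 50}, {"user": "e", "views": 1}],
]

result = scatter_gather(servers, lambda r: r["views"] >= 5, leaf_limit=2)
users = [r["user"] for r in result]
```

With a limit of two, the first server stops after its second match, so the row for user "c" is never sent. That is the trade-off such a limit implies: bounded work per leaf in exchange for possibly truncated input to later stages.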

There's also a new query optimizer for the Multistage Engine that can automatically eliminate or simplify redundant Exchanges. The optimizer can simplify Exchanges for arbitrarily complicated queries without the need for query hints. The optimizer supports group-by, joins and union-all, and can solve constant queries within the Broker itself.
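As a rough illustration of what makes an Exchange redundant, the sketch below drops a shuffle when its input is already partitioned on the keys the shuffle would produce. This is a general optimizer technique written as a toy, not Pinot's implementation: the `Node` class, the operator names and the assumption that non-exchange operators preserve partitioning are all simplifications made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Hypothetical plan node. For 'scan', partitioned_by is the table layout;
    for 'exchange', it is the set of keys the shuffle redistributes on."""
    op: str
    child: Optional["Node"] = None
    partitioned_by: Tuple[str, ...] = ()


def output_partitioning(node):
    """Return the keys a node's output is partitioned on."""
    if node is None:
        return ()
    if node.op in ("scan", "exchange"):
        return node.partitioned_by
    # Simplifying assumption: every other operator preserves its input partitioning.
    return output_partitioning(node.child)


def drop_redundant_exchanges(node):
    """Remove any exchange whose input is already partitioned on the same keys."""
    if node is None:
        return None
    node.child = drop_redundant_exchanges(node.child)
    if node.op == "exchange" and output_partitioning(node.child) == node.partitioned_by:
        return node.child  # the shuffle would not move any data
    return node


# aggregate <- exchange(user_id) <- scan partitioned by user_id: shuffle is redundant.
plan = Node("aggregate", Node("exchange", Node("scan", None, ("user_id",)), ("user_id",)))
plan = drop_redundant_exchanges(plan)

# aggregate <- exchange(country) <- scan partitioned by user_id: shuffle is needed.
kept = Node("aggregate", Node("exchange", Node("scan", None, ("user_id",)), ("country",)))
kept = drop_redundant_exchanges(kept)
```

After the rewrite, `plan` has the aggregate sitting directly on the scan, while `kept` still contains its exchange because the data really does need to be redistributed.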

The Multistage Engine has also been improved, and now supports multiple window functions in a single query plan. The team says this enables more expressive and efficient analytical queries with improved stage fusion and execution planning. The engine also adds support for ASOF JOIN, allowing time-aligned joins commonly used in time-series analytics.
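For readers unfamiliar with ASOF joins, the following Python sketch shows the usual "backward" form of the idea: each left-hand row is paired with the most recent right-hand row whose timestamp is not later than its own. It illustrates the general semantics only; Pinot's SQL syntax and the match conditions it supports are not shown here, and the trade and quote data are made up.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Pair each (ts, value) in left with the latest right row where right.ts <= ts.

    Rows with no earlier right-hand match get None.
    """
    right_sorted = sorted(right, key=lambda r: r[0])
    right_ts = [r[0] for r in right_sorted]
    joined = []
    for ts, value in left:
        # Index of the last right-hand timestamp that is <= ts, or -1 if none.
        idx = bisect_right(right_ts, ts) - 1
        match = right_sorted[idx][1] if idx >= 0 else None
        joined.append((ts, value, match))
    return joined


# Hypothetical data: (timestamp, trade id) and (timestamp, quoted price).
trades = [(3, "t1"), (7, "t2"), (1, "t0")]
quotes = [(2, 100.0), (5, 101.5), (7, 102.0)]

joined = asof_join(trades, quotes)
```

Here trade "t1" at time 3 picks up the quote from time 2, "t2" matches the quote with the identical timestamp, and "t0" has no earlier quote so it is left unmatched.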

Pauseless consumption has also been added to this version. This improves real-time analytics by minimizing ingestion delays and improving data freshness. Until now, real-time data ingestion was paused during the build and upload phases of the previous segment, meaning there was a gap in accessing the most recent data. Pauseless consumption allows Pinot to continue ingesting data while completing the build and upload phases of the previous segment.
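The effect on data freshness can be shown with a toy timing model. The function below is purely illustrative: its name, parameters and the numbers used are assumptions, and it ignores every other source of ingestion latency.

```python
def queryable_at(arrival, build_start, build_seconds, pauseless):
    """Toy model: when does an event arriving at `arrival` become queryable?

    Without pauseless consumption, events arriving while the previous segment
    is being built and uploaded wait until that phase ends.
    """
    build_end = build_start + build_seconds
    arrives_during_build = build_start <= arrival < build_end
    if arrives_during_build and not pauseless:
        return build_end
    return arrival


# An event arrives 5 seconds into a 30-second build and upload phase.
paused = queryable_at(arrival=105, build_start=100, build_seconds=30, pauseless=False)
unpaused = queryable_at(arrival=105, build_start=100, build_seconds=30, pauseless=True)
lag_saved = paused - unpaused
```

In this model the event is delayed by 25 seconds when consumption pauses and not at all when it does not, which is the gap the new feature is meant to close.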

Apache Pinot 1.4 is available now.

More Information

Apache Pinot Website

Related Articles

Apache Pinot 1.0 Released

Apache Iceberg Improves Spark Support

Spark BI Gets Fine Grain Security

Spark Announcements

Kafka Adds KRaft-Based Authorizer

Kafka 3.1 Adds OIDC Support
