Apache Pinot 1.0 Released
Written by Kay Ewbank   
Tuesday, 26 September 2023

Apache Pinot 1.0 has been released. The real-time distributed OLAP datastore has been purpose-built for low-latency, high-throughput analytics.

Pinot was originally developed at LinkedIn in 2013 to enable the company to run various queries including showing users who had viewed their profile. Highlights of the 1.0 release are an extension to the multi-stage query engine, upsert capabilities (delete, metadata TTL, segment preloading and segment compaction), NULL value support in queries, support for SPI-based pluggable indexes, and improvements to the Spark 3 connector.



Pinot can perform typical analytical operations such as slice and dice, drill down, roll up, and pivot on large scale multi-dimensional data.
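For readers unfamiliar with these terms, the following is a minimal sketch in plain Python of what slicing, rolling up and drilling down mean on a tiny in-memory data set. It does not use Pinot; the rows, the column names and the helper functions `slice_rows` and `roll_up` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (country, device, year) and one measure (views).
rows = [
    {"country": "US", "device": "mobile", "year": 2023, "views": 120},
    {"country": "US", "device": "desktop", "year": 2023, "views": 80},
    {"country": "DE", "device": "mobile", "year": 2023, "views": 50},
    {"country": "DE", "device": "mobile", "year": 2022, "views": 30},
]


def slice_rows(rows, **fixed):
    """Slice/dice: keep only rows whose dimensions match the fixed values."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


def roll_up(rows, dims, measure):
    """Roll up: sum the measure over every dimension not listed in dims."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r[measure]
    return dict(totals)


by_country = roll_up(rows, ["country"], "views")                      # coarse view
by_country_device = roll_up(rows, ["country", "device"], "views")     # drill down
mobile_2023 = slice_rows(rows, device="mobile", year=2023)            # slice
```

Grouping by fewer dimensions is a roll up, grouping by more is a drill down, and fixing dimension values is a slice. In Pinot itself these operations would be expressed as SQL queries rather than Python code.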

Apache describes Pinot as being ideal for ingesting and immediately querying data from streaming or batch data sources (including Apache Kafka, Amazon Kinesis, Hadoop HDFS, Amazon S3, Azure ADLS, and Google Cloud Storage).

It provides ultra low-latency analytics even at extremely high throughput, and is a columnar data store with several smart indexing and pre-aggregation techniques. The developers say it offers consistent performance based on the size of your cluster and an expected queries per second (QPS) threshold.

Features of the current version of Pinot include real-time support for upsert mutations (if exists update, else insert), which are used when it's not clear whether the respective row is already present in the database. It also supports native query-time JOINs through its multi-stage query engine, which efficiently manages complex analytical queries, including JOIN operations. The developers say this engine alleviates computational burdens by offloading tasks from brokers to a dedicated intermediate compute stage. Pinot can also handle semi-structured or unstructured data, and the team says it offers "improving" ANSI SQL compliance.
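The "if exists update, else insert" rule can be sketched in a few lines of Python. This is a sketch of the general upsert idea, not Pinot's implementation. The `Event` type, its fields and the `upsert` helper are hypothetical, and the rule that a late-arriving older event must not overwrite a newer one is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str      # hypothetical primary key
    ts: int       # hypothetical event timestamp used to order updates
    value: float


def upsert(table: dict, event: Event) -> None:
    """Insert the event if its key is new, otherwise update the existing row."""
    current = table.get(event.key)
    # Assumption: an older event arriving late must not replace a newer row.
    if current is None or event.ts >= current.ts:
        table[event.key] = event


table = {}
stream = [
    Event("order-1", 1, 10.0),   # new key: insert
    Event("order-2", 2, 5.0),    # new key: insert
    Event("order-1", 3, 12.5),   # existing key, newer: update
    Event("order-1", 2, 99.0),   # existing key, older: ignored
]
for event in stream:
    upsert(table, event)
```

The caller never has to check whether a row for the key already exists. The table ends up with one row per key, holding the most recent value.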

The team says the original query engine works very well for simpler filter-and-aggregate queries, but the broker could become a bottleneck for more complex queries. The new engine resolves this by introducing intermediary compute stages on the query servers, and brings Apache Pinot closer to full ANSI SQL semantics.
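To make the idea of an intermediate compute stage more concrete, here is a minimal Python sketch of a join in which rows are first shuffled by join key to a set of workers. Each worker then joins only its own partition, so the final step merely concatenates results. This is a general illustration of a hash-shuffle join, not a description of how Pinot implements it; the function names, the worker count and the sample data are all assumptions.

```python
from collections import defaultdict


def shuffle(rows, key, n_workers):
    """Partition rows by join key so that matching keys land on the same worker."""
    parts = [[] for _ in range(n_workers)]
    for row in rows:
        parts[hash(row[key]) % n_workers].append(row)
    return parts


def join_partition(left, right, key):
    """Hash join performed by one intermediate worker on its own partition."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def multi_stage_join(left, right, key, n_workers=3):
    """Shuffle both inputs, join per worker, then concatenate the partial results."""
    left_parts = shuffle(left, key, n_workers)
    right_parts = shuffle(right, key, n_workers)
    result = []
    for lp, rp in zip(left_parts, right_parts):
        result.extend(join_partition(lp, rp, key))  # the final step only collects rows
    return result


orders = [
    {"user_id": 1, "amount": 10},
    {"user_id": 2, "amount": 7},
    {"user_id": 1, "amount": 3},
]
users = [
    {"user_id": 1, "country": "NO"},
    {"user_id": 2, "country": "DE"},
]
joined = multi_stage_join(orders, users, "user_id")
```

Because the join work is spread across the intermediate workers, the component that returns the answer no longer has to hold and combine all the data itself. That is the bottleneck the team describes for the original engine.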

Apache says that for application developers, Pinot works well as an aggregate store that sources events from streaming data sources, such as Kafka, and makes them available for querying with SQL. You can also use Pinot to aggregate data across a microservice architecture into one easily queryable view of the domain.

Apache Pinot is available now.


More Information

Apache Pinot Website

Last Updated ( Tuesday, 26 September 2023 )