Apache Kylin Adds Real-time OLAP

Written by Kay Ewbank

Friday, 03 January 2020

Apache has released Kylin 3.0, with improvements including support for real-time OLAP, integration with Apache Livy, and a Curator-based job scheduler.

Kylin is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Apache Hadoop. It was originally developed at eBay before becoming an Apache project. The Kylin OLAP engine is made up of a metadata engine, a query engine, a job engine and a storage engine. It also includes a REST server to service client requests. The query engine is based on Apache Calcite.
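As a rough illustration of what servicing client requests over REST can look like, the sketch below builds, but does not send, a SQL query request for a Kylin server. The `/kylin/api/query` path and the `sql` and `project` fields follow Kylin's documented query API as far as I know it. The host, the `learn_kylin` project, the `kylin_sales` table and the ADMIN/KYLIN credentials are assumptions borrowed from a default sample installation, and `build_kylin_query` is a hypothetical helper name.

```python
import base64
import json
import urllib.request


def build_kylin_query(base_url, user, password, project, sql, limit=100):
    """Build (but do not send) a POST request for Kylin's query endpoint."""
    payload = {"sql": sql, "project": project, "limit": limit}
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Assumed values: a local server with the sample project and table loaded.
request = build_kylin_query(
    "http://localhost:7070", "ADMIN", "KYLIN",
    project="learn_kylin",
    sql="SELECT part_dt, SUM(price) FROM kylin_sales GROUP BY part_dt",
)
# To run it against a live server: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch runnable without a Kylin instance and makes the payload easy to inspect.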

While previous releases of Kylin had OLAP support, the main improvement in the new release is that Kylin can now prepare streaming data from sources such as Apache Kafka with a delay measured in milliseconds. This means Kylin can now support sub-second OLAP over historical batch data, near real-time streaming data and real-time streaming data. The Kylin team says this means you can now use one OLAP platform to serve different scenarios. The new functionality comes from the newly introduced Kylin real-time receiver and coordinator components.

The second improvement of note is the ability to submit Spark jobs via Apache Livy. Livy is an Apache incubating project that provides a service that can be used to interact with a Spark cluster over a REST interface. It lets you submit programmatic, fault-tolerant, multi-tenant Spark jobs from web and mobile apps with no Spark client needed. This means multiple users can interact with your Spark cluster concurrently and reliably.

The new support in Kylin means an administrator can configure Kylin to use Livy for Spark job submissions. The Spark job is submitted to the Livy server through Livy's REST API, instead of starting the Spark driver process locally. This makes it possible to manage and monitor the Spark resources, and also relieves pressure on the nodes where the Kylin job server is running.
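To give a feel for what a submission through Livy's REST API involves, here is a minimal sketch that builds, without sending, a batch request for a Livy server. The `POST /batches` endpoint and the `file`, `className`, `args` and `conf` fields come from Livy's REST API. Everything else is an assumption for illustration: the host name, the jar path, the class name and the arguments are placeholders, not the values Kylin itself generates, and `build_livy_batch` is a hypothetical helper.

```python
import json
import urllib.request


def build_livy_batch(livy_url, jar_path, main_class, args=(), conf=None):
    """Build (but do not send) a POST /batches request for a Livy server."""
    payload = {
        "file": jar_path,          # application jar, reachable by the cluster
        "className": main_class,   # entry point of the Spark job
        "args": list(args),        # command-line arguments for the job
        "conf": dict(conf or {}),  # Spark configuration properties
    }
    return urllib.request.Request(
        url=f"{livy_url.rstrip('/')}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values throughout; 8998 is Livy's usual default port.
batch = build_livy_batch(
    "http://livy-host:8998",
    jar_path="hdfs:///apps/example/cube-build-job.jar",
    main_class="com.example.CubeBuildJob",
    args=["--cube", "sales_cube"],
    conf={"spark.executor.memory": "4g"},
)
print(batch.get_method(), batch.full_url)
print(batch.data.decode("utf-8"))
```

Because the caller only sends an HTTP request, the Spark driver runs wherever Livy starts it rather than on the machine that submitted the job.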

The final major improvement is the addition of a job scheduler based on Apache Curator. It automatically discovers the Kylin nodes and holds a leader election among them to choose the node that will submit jobs. The Kylin team says that with this feature, you can easily deploy and scale out Kylin nodes without manually updating the node addresses in kylin.properties and restarting Kylin for the change to take effect.
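The idea behind electing one node to submit jobs can be shown with a toy, in-memory simulation. This is not Curator's API (Curator is a Java library built on ZooKeeper) and it is not Kylin's scheduler code. It only mimics the rule used by common ZooKeeper election recipes, in which each node registers with an increasing sequence number and the lowest surviving number leads. The class name and node addresses are hypothetical.

```python
import itertools


class ToyElection:
    """In-memory stand-in for a ZooKeeper-style leader election."""

    def __init__(self):
        self._sequence = itertools.count()
        self._members = {}  # node address -> sequence number

    def join(self, address):
        """Register a node; it receives the next sequence number."""
        self._members[address] = next(self._sequence)

    def leave(self, address):
        """Remove a node, as when its session ends or it crashes."""
        self._members.pop(address, None)

    def leader(self):
        """The member with the lowest sequence number leads, or None."""
        if not self._members:
            return None
        return min(self._members, key=self._members.get)


election = ToyElection()
for address in ["kylin-1:7070", "kylin-2:7070", "kylin-3:7070"]:
    election.join(address)

first_leader = election.leader()    # kylin-1:7070 joined first
election.leave(first_leader)        # simulate the leader going down
second_leader = election.leader()   # kylin-2:7070 takes over
```

New nodes only have to join to become candidates, so no node list needs to be maintained by hand.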

Kylin 3 is available for download from the Kylin website.

More Information

Kylin Website

Apache Kylin 2.5 Adds All-in-Spark Cubing Engine

Kylin 2.3.0 Adds SQL Server Support

Apache Kylin Gets Table Level ACL Management

Apache Kylin Adds RDBMS Support

Spark BI Gets Fine Grain Security

Last Updated ( Friday, 03 January 2020 )