Apache Impala 4 Supports Operator Multi-Threading
Written by Kay Ewbank   
Thursday, 29 July 2021

Apache Impala 4 has been released with many improvements, including support for multi-threading across all operators and support for all 99 TPC-DS queries without manual rewrites. The new version also improves authentication and authorization.

Impala is an open source, native analytic database for Apache Hadoop that provides a high-performance distributed SQL engine. It was originally developed by Cloudera, and donated to the Apache Software Foundation along with Apache Kudu.


Impala can be used to run SQL queries on data stored in HDFS, HBase, Apache Kudu, Amazon S3, and Microsoft ADLS without requiring data movement or transformation.

The multi-threading support in the new release overcomes an earlier limitation: a single query fragment ran in a quasi-single-threaded manner on each node. Scanners did run in multiple threads, but all other operators, such as joins and aggregations, ran in the main thread. The new release adds multi-threaded execution on a single node by running multiple fragment instances, each of which runs in a single thread. This gives significant performance improvements for some queries, in some cases making them up to seven times faster by taking better advantage of all the CPU cores.

The degree of parallelism used for operations that can benefit from multi-threaded execution is set by a query option called MT_DOP (multi-threading degree of parallelism). Until now, Impala supported setting MT_DOP only for queries consisting solely of scans and aggregates. This limitation has now been removed.
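
The execution model described here can be pictured with a small Python sketch. It is only an illustration of the idea, not Impala's implementation: the function names `partial_aggregate` and `run_with_dop` are hypothetical, the round-robin split is an assumption made for simplicity, and the `mt_dop` argument merely plays the role of the MT_DOP setting. Each worker stands in for one fragment instance that runs on a single thread, and the partial results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(rows):
    """One hypothetical 'fragment instance': count rows per key on one thread."""
    counts = Counter()
    for key in rows:
        counts[key] += 1
    return counts


def run_with_dop(rows, mt_dop):
    """Split the input into mt_dop slices, aggregate each slice in its own
    single-threaded worker, then merge the partial results."""
    if mt_dop < 1:
        raise ValueError("mt_dop must be at least 1")
    # Round-robin partitioning: an assumption, standing in for however
    # a real engine divides work between instances.
    slices = [rows[i::mt_dop] for i in range(mt_dop)]
    with ThreadPoolExecutor(max_workers=mt_dop) as pool:
        partials = list(pool.map(partial_aggregate, slices))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds the per-key counts together
    return dict(merged)


rows = ["a", "b", "a", "c", "b", "a"]
serial = run_with_dop(rows, mt_dop=1)    # one instance does all the work
parallel = run_with_dop(rows, mt_dop=4)  # four instances, same answer
```

The sketch shows the structure rather than the speed-up: CPython threads do not run CPU-bound code in parallel, whereas a native engine can spread the instances across cores. In Impala itself the option is set per session with a statement such as `SET MT_DOP=4;`.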

Another improvement in the new release is support for all 99 TPC-DS queries without manual rewrites. This includes ROLLUP, CUBE and GROUPING SETS, and uncorrelated subqueries in the select list. Support has also been added for the INTERSECT and EXCEPT set operations.
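
For readers unfamiliar with these SQL features, the following sketch shows what they compute. It uses Python's built-in `sqlite3` module purely as a stand-in so the example is runnable without a cluster; it does not talk to Impala, and the `sales` and `refunds` tables are made up for illustration. SQLite has no ROLLUP, so the last query spells out by hand the grouping sets that `GROUP BY ROLLUP(region, product)` is shorthand for in standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 'apple', 10), ('east', 'pear', 5), ('west', 'apple', 7);
    CREATE TABLE refunds (region TEXT, product TEXT);
    INSERT INTO refunds VALUES ('east', 'pear'), ('north', 'plum');
""")

# INTERSECT: (region, product) pairs that appear in both tables.
both = conn.execute("""
    SELECT region, product FROM sales
    INTERSECT
    SELECT region, product FROM refunds
""").fetchall()

# EXCEPT: pairs that were sold but never refunded.
sold_only = conn.execute("""
    SELECT region, product FROM sales
    EXCEPT
    SELECT region, product FROM refunds
    ORDER BY region, product
""").fetchall()

# ROLLUP(region, product) means the grouping sets
# (region, product), (region) and (), written out manually here.
rollup = conn.execute("""
    SELECT region, product, SUM(amount) FROM sales GROUP BY region, product
    UNION ALL
    SELECT region, NULL, SUM(amount) FROM sales GROUP BY region
    UNION ALL
    SELECT NULL, NULL, SUM(amount) FROM sales
""").fetchall()

conn.close()
```

With native support for ROLLUP, CUBE and GROUPING SETS, the three-part UNION ALL collapses into a single GROUP BY clause, which is the kind of manual rewrite the release is said to make unnecessary.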

Authentication and authorization features have been strengthened in the new release, which can now integrate with Apache Knox and supports SAML (Security Assertion Markup Language) authentication. Impala is also now FIPS (Federal Information Processing Standards) compliant. A number of LDAP (Lightweight Directory Access Protocol) features have been added, including support for LDAP search bind operations and user LDAP search bind.

Other authentication and authorization improvements include support for Ranger row-filtering policies, and support for basic role-related statements with Ranger. Kudu table ownership is also supported.

The full list of improvements can be seen in the Impala release notes, and Impala is available for download now.


More Information

Impala Website

Impala 4 Release Notes

Related Articles

Apache Kudu Improves Web Interface

Hadoop SQL Query Engine Launched

Cloudera Impala Real Time Query On Hadoop 

Apache Arrow Adds Streaming Binary Format 

HBase Adds MultiWAL Support

Apache Kafka Adds New Streams API

Apache Beam Moves To Top Level

Spark BI Gets Fine Grain Security

