Apache Phoenix Now HBase 2.0 Compatible

Written by Kay Ewbank

Friday, 20 July 2018

Apache Phoenix 5.0 has been released. This major version upgrade brings compatibility with HBase 2.0+ and adds support for Apache Hadoop 3.0.

Phoenix adds support for SQL-based OLTP and operational analytics for Apache Hadoop using Apache HBase as its backing store. It also provides integration with other projects in the Apache ecosystem such as Spark, Hive, Pig, Flume, and MapReduce.

Phoenix provides an open source SQL skin for HBase. You use it via standard JDBC APIs instead of the HBase client APIs to create tables, insert data, and query your HBase data. It compiles SQL queries to native HBase scans, and can be used to access data stored and produced in other Hadoop products such as Spark, Hive, Pig, Flume, and MapReduce. Phoenix uses secondary indexes to make queries faster, and makes use of parallel processing by executing aggregate queries through server-side hooks (called co-processors). This means queries are executed on the nodes where data is stored, greatly reducing the need to send data over the network.
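To make the server-side aggregation idea concrete, here is a toy Python model. It is not Phoenix or HBase code: the region layout, the sample values, and the names region_partial and merge_partials are all invented for illustration. The point is only that each node reduces its own rows to a small partial result, so just those partials need to cross the network.

```python
from typing import Iterable, List, Tuple

# Hypothetical data: each inner list stands for the rows held by one region server.
REGIONS: List[List[int]] = [[10, 20, 30], [40, 50], [60]]


def region_partial(rows: Iterable[int]) -> Tuple[int, int]:
    """Runs 'next to the data': reduce one region's rows to (sum, count)."""
    total = 0
    count = 0
    for value in rows:
        total += value
        count += 1
    return total, count


def merge_partials(partials: Iterable[Tuple[int, int]]) -> float:
    """Runs on the client: combine the small partial results into an average."""
    total = 0
    count = 0
    for part_sum, part_count in partials:
        total += part_sum
        count += part_count
    if count == 0:
        raise ValueError("no rows to aggregate")
    return total / count


# One small (sum, count) pair per region is all that the client receives.
partials = [region_partial(rows) for rows in REGIONS]
average = merge_partials(partials)
print(partials, average)
```

In this model the client handles three small tuples rather than six raw values; on a real cluster the saving would depend on table size and region layout.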

The new release has feature parity with the recently released 4.14.0 version. The developers say the highlights start with a cleanup of deprecated APIs that had been kept to support HBase 0.98; newer, better-performing APIs that support HBase 2.0 are used instead.

The developers have also refactored the coprocessor implementations to make use of the new Coprocessor or Observer APIs in HBase 2.0.

Alongside the support for HBase and Hadoop, the Hive and Spark integrations in the new version work with the latest versions of Hive (3.0.0) and Spark (2.3.0) respectively.

There are some new features in the release alongside the HBase 2.0 support. HBase namespaces are now surfaced in Phoenix. HBase supports the concept of namespaces in the form of myNamespace:MyTable, and Phoenix can now make use of this to group tables in a database-like way.
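The naming convention can be sketched in a few lines of Python. This is a simplified illustration, not Phoenix's own code: the helper hbase_table_name is hypothetical, and it assumes that a Phoenix schema maps to an HBase namespace when namespace mapping is switched on (in Phoenix this is controlled by the phoenix.schema.isNamespaceMappingEnabled property) and that the schema otherwise stays part of a dotted table name.

```python
def hbase_table_name(schema: str, table: str, namespace_mapping: bool) -> str:
    """Return a simplified HBase-side name for a Phoenix schema and table.

    Assumption for this sketch: with namespace mapping on, the schema becomes
    an HBase namespace (namespace:table); with it off, the schema is just a
    dotted prefix of a table in the default namespace.
    """
    if not schema:
        return table
    separator = ":" if namespace_mapping else "."
    return f"{schema}{separator}{table}"


mapped = hbase_table_name("myNamespace", "MyTable", namespace_mapping=True)
unmapped = hbase_table_name("myNamespace", "MyTable", namespace_mapping=False)
print(mapped, unmapped)
```

The sketch ignores identifier case normalization and quoting, which a real client would also have to handle.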

Other improvements include support for the MySQL-style Limit + Offset clause. When a query uses a Serial hint, or a Limit without an Order By, each scan can be limited using a page filter. The previous version could only use Limit. The new version forwards the relevant client iterators to the offset provided and then returns the result.
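A rough model of how such a paged query could be served, for a statement along the lines of SELECT * FROM my_table LIMIT 3 OFFSET 2 (my_table is a made-up name). The Python below is a toy simulation under stated assumptions, not Phoenix internals: the lists stand in for region scans read serially in row-key order, and page_filtered_scan and limit_offset are hypothetical helpers.

```python
from itertools import chain, islice
from typing import Iterable, List


def page_filtered_scan(rows: Iterable[int], max_rows: int) -> List[int]:
    """Stand-in for one scan capped by a page filter at max_rows rows."""
    return list(islice(rows, max_rows))


def limit_offset(regions: List[List[int]], limit: int, offset: int) -> List[int]:
    # No single scan ever needs to return more than limit + offset rows.
    cap = limit + offset
    scans = [page_filtered_scan(region, cap) for region in regions]
    # Client side: read the scans serially, skip `offset` rows, keep `limit`.
    merged = chain.from_iterable(scans)
    return list(islice(merged, offset, offset + limit))


# Hypothetical row keys spread over three regions.
REGIONS = [[1, 2, 3], [4, 5, 6, 7], [8, 9]]
page = limit_offset(REGIONS, limit=3, offset=2)
print(page)
```

Capping every scan at limit + offset rows is what keeps the work bounded: however large a region is, the client never receives more rows from it than the requested page could possibly need.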

The final improvement provides better handling of Big*Big joins. Phoenix already supported hash joins and sort-merge joins, but they were not processed very well when both sides of the join were large. The new release has a Hive-Phoenix handler that can access an Apache Phoenix table on HBase using HiveQL. This is much faster than the Hive-HBase handler because the Hive-Phoenix handler applies predicate pushdown.
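Why predicate pushdown helps can be shown with a small Python model. This is only an illustration of the general technique, not the Hive-Phoenix handler itself: the table contents, the is_large predicate, and both scan functions are invented for the example, and "shipped" simply counts the rows that would leave the storage layer.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]

# Hypothetical table contents held by the storage layer: ids 1 to 100.
TABLE: List[Row] = [{"id": i, "amount": i * 10} for i in range(1, 101)]


def scan_without_pushdown(predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Storage ships every row; the query engine filters afterwards."""
    shipped = list(TABLE)
    return [row for row in shipped if predicate(row)], len(shipped)


def scan_with_pushdown(predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Storage applies the predicate itself and ships only matching rows."""
    shipped = [row for row in TABLE if predicate(row)]
    return shipped, len(shipped)


def is_large(row: Row) -> bool:
    """Example predicate, similar in spirit to WHERE amount > 950."""
    return row["amount"] > 950


rows_a, shipped_a = scan_without_pushdown(is_large)
rows_b, shipped_b = scan_with_pushdown(is_large)
print(shipped_a, shipped_b)
```

Both scans return the same rows, but the pushed-down version moves only the matching ones, which is the kind of saving the handler is credited with.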


More Information

Phoenix Website

Apache Phoenix Improves HBase Support

HBase 1.4 With New Shaded Client

HBase Adds MultiWAL Support

Hadoop 3 Adds HDFS Erasure Coding

Hadoop 2.9 Adds Resource Estimator

Hadoop Adds In-Memory Caching

Hadoop SQL Query Engine Launched
