Apache Drill Adds New Data Formats
Written by Kay Ewbank   
Monday, 28 March 2022

Apache Drill has been updated with new data and storage formats and with backward compatibility for Hadoop 2, making this version usable by organizations that so far couldn't upgrade because of their reliance on Hadoop 2.

Apache Drill is an open-source, schema-free Big Data SQL query engine for Apache Hadoop, NoSQL, and Cloud storage. Drill is an open-source implementation inspired by Google's Dremel, the query technology more widely known through Google BigQuery.

This release fixes a problem that has existed since Drill moved to Hadoop 3 in version 1.17. That move cost Drill its backward compatibility with Hadoop 2, which meant that companies using Hadoop 2 couldn't move on from Drill 1.16. Drill 1.20 now includes a backport for Hadoop 2.

The second improvement of note is a new connector for Apache Phoenix, the open source, massively parallel, relational database engine that supports OLTP for Hadoop and uses Apache HBase as its backing store. With the new connector, Drill users can query Phoenix data directly from Drill and join it with data from other sources. The Drill developers say the Phoenix connector has extensive pushdowns, which will make queries as efficient as possible, and that it supports user impersonation so queries can run in Phoenix as the current Drill user.

The new version also adds support for writing data to JDBC data sources, meaning Drill can be used to write data to JDBC-compliant RDBMSs such as Oracle, MySQL, and Postgres.
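As a rough illustration of what writing to a JDBC source could look like, the sketch below builds a request body for Drill's REST query endpoint containing a CREATE TABLE AS statement. It assumes that writes are expressed as CTAS against a writable JDBC storage plugin; the plugin name `mysql`, the schema `sales`, the table `orders_copy`, the source path, and the local URL are all hypothetical and not taken from the release notes. The request is only built here, not sent.

```python
import json

# Assumed address of a local Drillbit's REST query endpoint.
DRILL_QUERY_URL = "http://localhost:8047/query.json"


def build_ctas_payload(target_table: str, select_sql: str) -> str:
    """Return the JSON body for a CREATE TABLE AS query.

    target_table is a fully qualified name such as plugin.`schema`.`table`
    (hypothetical names; the JDBC plugin must be configured as writable).
    """
    statement = f"CREATE TABLE {target_table} AS {select_sql}"
    return json.dumps({"queryType": "SQL", "query": statement})


# Hypothetical example: copy a Parquet file into a MySQL table.
payload = build_ctas_payload(
    "mysql.`sales`.`orders_copy`",
    "SELECT * FROM dfs.`/data/orders.parquet`",
)
print(payload)
```

The resulting string could then be sent as an HTTP POST with a `Content-Type: application/json` header to the query endpoint, or the same statement could simply be typed into the Drill shell.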

Support has also been added for new data file formats, specifically Apache Iceberg and SAS. Iceberg is a high-performance table format for huge analytic tables, and SAS files are the data files produced by SAS, the statistical software suite developed by SAS Institute for data management and advanced analytics.

Elsewhere, this release has what the developers describe as "very significant improvements" to the HTTP plugin that make it easier to access and work with data. The two most significant improvements are OAuth integration and automatic pagination. Drill’s HTTP connector now supports APIs that use OAuth 2.0 for authorization, and has been tested with Salesforce, Google Analytics, ClickUp, and Workday.

The support for automatic pagination makes it possible to configure Drill to make API calls in series, so that if a user requests 200 records from an API that returns at most 100 records per call, Drill will execute 2 API calls to retrieve all the desired data.
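The arithmetic behind the two-call example can be sketched as follows. This is not Drill's implementation, just a minimal illustration of offset-style pagination, assuming an API page size of 100 records; the function name and the `offset`/`limit` parameter names are hypothetical.

```python
import math
from typing import Dict, List


def plan_page_requests(requested: int, page_size: int) -> List[Dict[str, int]]:
    """Split a request for `requested` records into per-call offset/limit pairs."""
    if requested < 0 or page_size <= 0:
        raise ValueError("requested must be >= 0 and page_size must be > 0")
    calls = math.ceil(requested / page_size)
    return [
        {
            "offset": i * page_size,
            # The last page may be shorter than a full page.
            "limit": min(page_size, requested - i * page_size),
        }
        for i in range(calls)
    ]


# 200 records at 100 per page: two calls, at offsets 0 and 100.
for call in plan_page_requests(200, 100):
    print(call)
```

With these assumptions, a request for 250 records would take three calls, the last one fetching only 50 records.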

The new release is available for download now on the Drill website.

More Information

Apache Drill

Related Articles

Apache Drill Adds YARN Support  

MapR Releases Docker Container For Local Development

Apache Drill Reaches 0.6

Perform Data Queries Faster With Drill

 

 


Last Updated ( Monday, 28 March 2022 )