Apache Drill 1.19 Milestone Release Adds Cassandra Connector
Thursday, 09 September 2021

Apache Drill has been updated in what the developers are calling its biggest release ever. Version 1.19 adds new connectors for Apache Cassandra, Elasticsearch, and Splunk, along with Avro support for the Kafka plugin.

Apache Drill is an open-source, schema-free Big Data SQL query engine for Apache Hadoop, NoSQL, and Cloud storage. Drill is an open-source implementation inspired by Google's Dremel, the query system better known as the engine behind Google BigQuery.


Apache Drill is a distributed SQL query engine that works with most non-relational datastores, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A single query can join data from multiple datastores. Drill can be used by analysts, data scientists and developers to explore and analyze non-relational data without having to make the data more regimented for analysis. Drill processes the data in-situ without requiring users to define schemas or transform data.
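As a rough illustration of a single query joining two datastores, the sketch below builds a request for Drill's REST query endpoint. It assumes a Drillbit on localhost with its web port at 8047 and storage plugins named `mongo` and `dfs`; the database, collection, file path and column names are hypothetical. The request is built but not sent.

```python
import json
import urllib.request

# Assumptions for illustration only: host, port, plugin names (`mongo`, `dfs`),
# and every database, path and column name below are hypothetical.
DRILL_URL = "http://localhost:8047/query.json"

JOIN_SQL = """
SELECT u.name, o.total
FROM mongo.shop.users AS u
JOIN dfs.`/data/orders.json` AS o
  ON u.user_id = o.user_id
""".strip()


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request carrying one SQL query."""
    payload = {"queryType": "SQL", "query": sql}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(JOIN_SQL)

# To run it against a live Drillbit, something like this should work:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The point of the example is that the MongoDB collection and the JSON file appear side by side in one `FROM ... JOIN` clause, with no schema declared for either.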

The Cassandra connector uses the DataStax Java driver, while the Elasticsearch, XML and Splunk connectors have been written specifically for Drill. The new Avro support enables the use of Avro messages with schema registry support for Drill's Kafka storage plugin.

Other improvements in the new release include an integrated password vault for secure credential storage, support for Linux ARM64 systems, and new limit pushdowns for file systems, HTTP REST APIs and MongoDB. The new release also has streaming support for Drill's REST API, and new integration with Apache Airflow.
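The idea behind a limit pushdown can be sketched in a few lines. This is not Drill's code; `CountingSource` and both query functions are hypothetical names used only to show why handing the limit to the data source saves work.

```python
class CountingSource:
    """A toy data source that counts how many rows it actually reads."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_read = 0

    def scan(self, limit=None):
        for index, row in enumerate(self.rows):
            if limit is not None and index >= limit:
                return  # the source stops early when it knows the limit
            self.rows_read += 1
            yield row


def query_without_pushdown(source, limit):
    # The engine reads everything, then throws most of it away.
    return list(source.scan())[:limit]


def query_with_pushdown(source, limit):
    # The limit travels down to the source, which reads only what is needed.
    return list(source.scan(limit=limit))


full_scan = CountingSource(list(range(1000)))
pushed = CountingSource(list(range(1000)))
first = query_without_pushdown(full_scan, 10)
second = query_with_pushdown(pushed, 10)
```

Both calls return the same ten rows, but the first source reads all 1,000 rows while the second reads only ten.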

There's a new merge row set-based JSON reader that can be used for a "late schema" style of data reading whereby the schema is worked out while the read is happening. The developers say that the reader:

"Implements many tricks and hacks to handle schema changes while loading, and shows that, even with all these tricks, the only true solution is to actually have a schema."
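To make the "late schema" idea concrete, here is a minimal sketch that works out column types while reading JSON records and notes where a later record contradicts an earlier one. It is not Drill's reader; the function names and the type labels are illustrative assumptions.

```python
import json


def infer_type(value):
    """Map a parsed JSON value to an illustrative type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "varchar"
    if isinstance(value, list):
        return "array"
    return "map"


def read_with_late_schema(lines):
    """Build the schema during the read and record any type conflicts."""
    schema = {}
    conflicts = []
    for line_no, line in enumerate(lines, start=1):
        for column, value in json.loads(line).items():
            seen = infer_type(value)
            known = schema.get(column)
            if known is None or known == "null":
                schema[column] = seen  # first real sighting decides the type
            elif seen not in ("null", known):
                conflicts.append((line_no, column, known, seen))
    return schema, conflicts


sample = [
    '{"id": 1, "price": null}',
    '{"id": 2, "price": 9.5}',
    '{"id": "three", "price": 10}',
]
schema, conflicts = read_with_late_schema(sample)
```

The third record conflicts on both columns: `id` arrives as a string after being read as an integer, and `price` arrives as an integer after being read as a double. A reader with no declared schema has to guess how to reconcile such cases, which is the problem the developers' remark points at.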

The new JSON reader uses an expanded state machine when parsing rather than the complex set of if-statements in the current version.
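The difference between the two styles is easier to see in a small example. The sketch below is a table-driven state machine that checks the token sequence of a flat JSON object; it is a simplified illustration and not Drill's parser. The state names and token labels are assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    START = auto()
    KEY_OR_END = auto()
    COLON = auto()
    VALUE = auto()
    COMMA_OR_END = auto()
    KEY = auto()
    DONE = auto()


# Every legal move is one table entry; anything missing is an error.
TRANSITIONS = {
    (State.START, "{"): State.KEY_OR_END,
    (State.KEY_OR_END, "string"): State.COLON,
    (State.KEY_OR_END, "}"): State.DONE,
    (State.COLON, ":"): State.VALUE,
    (State.VALUE, "string"): State.COMMA_OR_END,
    (State.VALUE, "number"): State.COMMA_OR_END,
    (State.COMMA_OR_END, ","): State.KEY,
    (State.COMMA_OR_END, "}"): State.DONE,
    (State.KEY, "string"): State.COLON,
}


def accepts(tokens):
    """Return True if the token sequence is a well-formed flat object."""
    state = State.START
    for token in tokens:
        state = TRANSITIONS.get((state, token))
        if state is None:
            return False
    return state is State.DONE
```

Supporting a new construct means adding states and table entries rather than threading another condition through nested if-statements, which is the usual argument for this design.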

A second addition to the JSON support is a change to use streaming for REST JSON queries. This has been implemented to reduce the memory overhead when running a REST JSON query.
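Why streaming reduces memory use can be shown with a short generator. This is a general sketch of the technique rather than Drill's implementation; `stream_rows_as_json` and `fake_result_set` are hypothetical names.

```python
import json


def stream_rows_as_json(rows):
    """Yield a JSON array in small chunks instead of building it whole."""
    yield "["
    for index, row in enumerate(rows):
        if index:
            yield ","
        yield json.dumps(row)
    yield "]"


def fake_result_set(count):
    """Stand-in for a query result that produces rows lazily."""
    for i in range(count):
        yield {"id": i}


# Collected into a list here only so the output can be inspected;
# a server would write each chunk to the response as it is produced.
chunks = list(stream_rows_as_json(fake_result_set(3)))
```

Because each row is serialized and handed on as soon as it is available, only one row needs to be held in memory at a time, however large the full result is.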

The new release is available for download now on the Drill website.


More Information

Apache Drill

Related Articles

Apache Drill Adds YARN Support  

MapR Releases Docker Container For Local Development

Apache Drill Reaches 0.6

Perform Data Queries Faster With Drill
