Apache Arrow Flight Released
Written by Kay Ewbank
Monday, 28 October 2019
Apache has released a beta version of Apache Arrow Flight, an Arrow-native data messaging framework. This first release is designed to optimize transport of the Arrow columnar format over gRPC, Google’s HTTP/2-based general-purpose RPC library and framework.
Flight addresses a gap in Apache Arrow: Arrow's primary medium is in-memory data, but not all systems can be co-located. Arrow therefore needs an RPC layer for moving data between machines, and that is what Arrow Flight adds.
Apache Arrow is a columnar in-memory analytics layer that permits random access. It is language independent, can be used for flat and hierarchical data, and the data store is organized for efficient analytic operations.
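To make the columnar idea concrete, here is a minimal sketch in plain Python. It does not use the Arrow library; the field names and the helper functions `column_sum` and `get_value` are hypothetical, chosen only to show why one contiguous typed buffer per field suits analytic scans while still allowing random access by position.

```python
from array import array

# Row-oriented layout: one tuple per record (id, score).
rows = [(1, 9.5), (2, 7.25), (3, 8.0)]

# Column-oriented layout: one contiguous, typed buffer per field.
# "q" is a signed 64-bit integer, "d" is a 64-bit float.
columns = {
    "id": array("q", (r[0] for r in rows)),
    "score": array("d", (r[1] for r in rows)),
}


def column_sum(cols, name):
    """Aggregate a single field by scanning only that field's buffer."""
    return sum(cols[name])


def get_value(cols, name, index):
    """Random access: read one value of one field by its position."""
    return cols[name][index]


total_score = column_sum(columns, "score")
third_id = get_value(columns, "id", 2)
```

An aggregate over `score` never touches the `id` values, which is the property that makes a columnar layout attractive for analytics.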
Flight provides stream management. Data is handled as 'flights', each of which is a stream of Arrow record batches that you can interact with using get-stream and put-stream methods (named DoGet and DoPut in the Flight protocol).
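The get and put pattern can be pictured with a small in-memory sketch. This is not the Flight API and involves no networking: `ToyStreamStore`, `put_stream`, `get_stream` and the dictionary-based `Batch` type are all hypothetical stand-ins, assumed here only to show named streams being written and read one batch at a time.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for an Arrow record batch: column name -> values.
Batch = Dict[str, list]


class ToyStreamStore:
    """In-memory illustration of named streams of batches."""

    def __init__(self) -> None:
        self._streams: Dict[str, List[Batch]] = {}

    def put_stream(self, name: str, batches: Iterable[Batch]) -> int:
        """Append incoming batches to the named stream; return how many arrived."""
        stored = self._streams.setdefault(name, [])
        count = 0
        for batch in batches:
            stored.append(batch)
            count += 1
        return count

    def get_stream(self, name: str) -> Iterator[Batch]:
        """Yield the named stream's batches one at a time, in arrival order."""
        for batch in self._streams[name]:
            yield batch


store = ToyStreamStore()
store.put_stream("trips", [{"id": [1, 2]}, {"id": [3]}])

# The consumer processes each batch as it arrives instead of waiting for all data.
ids = [value for batch in store.get_stream("trips") for value in batch["id"]]
```

Working batch by batch means neither side has to hold the whole dataset in memory at once.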
While this release of Flight is only integrated with gRPC, Apache intends to add support for other libraries.
The developers say that this 0.15.0 release includes ready-to-use Flight implementations in C++ (with Python bindings) and Java, and that these libraries are suitable for beta users who are comfortable with API or protocol changes while the team continues to refine some low-level details in the Flight internals.
One of the biggest features that sets Flight apart from other data transport frameworks is parallel transfers, which allow data to be streamed to or from a cluster of servers simultaneously. This makes it easier for developers to create scalable data services that can serve a growing client base.
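As a rough sketch of what a parallel transfer looks like from the client side, the example below pulls one partition from each of several servers concurrently and then combines the results. Everything here is an assumption for illustration: the `CLUSTER` mapping, the node names and `fetch_partition` simulate servers in memory and stand in for real network calls, which a Flight client would make instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster: each endpoint name maps to the partition it holds.
CLUSTER = {
    "node-a": [1, 2, 3],
    "node-b": [4, 5],
    "node-c": [6],
}


def fetch_partition(endpoint):
    """Stand-in for a network call that pulls one stream from one server."""
    return list(CLUSTER[endpoint])


def fetch_all(endpoints):
    """Fetch every partition concurrently, then concatenate the results."""
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        # pool.map returns results in the same order as `endpoints`.
        parts = list(pool.map(fetch_partition, endpoints))
    return [value for part in parts for value in part]


result = fetch_all(["node-a", "node-b", "node-c"])
```

Because each partition travels over its own connection, total throughput can grow with the number of servers rather than being capped by a single link.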
While this is just a beta release, the developers say that in real-world use, Dremio has developed an Arrow Flight-based connector that has been shown to deliver 20-50x better performance than ODBC.