Apache Spark MapR Connector Provides JSON Support
Written by Kay Ewbank   
Monday, 05 June 2017

There's a new Native Spark Connector for MapR-DB JSON that gives developers APIs to access MapR-DB JSON documents from Apache Spark, using the Open JSON Application Interface (OJAI) API.

Apache Spark is an open source big data processing framework used for analytics on streaming and batch workloads. MapR-DB is a high-performance NoSQL database that supports two primary data models: JSON documents and wide column tables. A Spark connector is available for each data model. With the Spark/MapR-DB connectors, you can use MapR-DB as both a data source and a data destination for Spark jobs.

The Native Spark Connector for MapR-DB JSON supports loading data from a MapR-DB table as a Spark Resilient Distributed Dataset (RDD) of OJAI documents and saving a Spark RDD into a MapR-DB JSON table. (An RDD is the base format for storing data for use by Spark.)


The connector includes a set of APIs that enable MapR users to write applications that consume MapR-DB JSON tables and use them in Spark. It is a companion to the MapR-DB Binary Connector for Apache Spark, which can be used to write applications that consume HBase binary tables and use them in Spark.

The connector has two APIs that let you load data from a MapR-DB JSON table to a Spark RDD or save a Spark RDD to a MapR-DB JSON table. It also provides support for Scala bean classes, has a custom partitioner that allows you to partition data for better performance, and supports data locality. When the connector reads data from MapR-DB, it uses the data locality feature of MapR-DB to spawn the Spark executors.
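The article does not say how the connector's custom partitioner works internally. As a rough illustration of the general idea only, the sketch below shows range partitioning of JSON-style documents by their `_id` key in plain Python. The function names, the sample documents, and the split points are all hypothetical, and none of this is the connector's own API.

```python
from bisect import bisect_right
from collections import defaultdict


def range_partition(doc_id, split_points):
    """Return the partition index for doc_id.

    split_points must be sorted. Keys below the first split point go to
    partition 0; a key equal to a split point goes to the partition above it.
    """
    return bisect_right(split_points, doc_id)


def partition_documents(documents, split_points):
    """Group documents into partitions according to their "_id" field."""
    partitions = defaultdict(list)
    for doc in documents:
        partitions[range_partition(doc["_id"], split_points)].append(doc)
    return dict(partitions)


# Hypothetical sample data standing in for documents in a JSON table.
documents = [
    {"_id": "alice", "city": "Austin"},
    {"_id": "bob", "city": "Boston"},
    {"_id": "mallory", "city": "Miami"},
    {"_id": "zed", "city": "Zurich"},
]

# Two split points produce three key ranges: < "h", "h" to < "p", >= "p".
split_points = ["h", "p"]
partitions = partition_documents(documents, split_points)

for index in sorted(partitions):
    print(index, [doc["_id"] for doc in partitions[index]])
```

Keeping documents with neighbouring keys in the same partition is one common way to let each worker process a contiguous slice of a table, which is the kind of benefit the partitioner and data locality features are aiming at.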

The Native Spark Connector includes support for the DataFrame and Dataset APIs, so HBase and MapR-DB binary tables can be queried directly with Spark. The advantage is that this removes intermediary layers, making it easier to build faster data pipelines and reducing the latency associated with data movement.


More Information

MapR-DB OJAI Documentation

Related Articles

Apache Spark 2.0 Released

Apache Spark Technical Preview

Spark Announcements

Apache Releases Spark 1.6

Spark 1.4 Released

MOOC On Apache Spark 

Learning Spark (book review) 

 


Last Updated ( Monday, 05 June 2017 )