Apache Spark MapR Connector Provides JSON Support
Written by Kay Ewbank   
Monday, 05 June 2017

There's a new Native Spark Connector for MapR-DB JSON that gives developers APIs to access MapR-DB JSON documents from Apache Spark, using the Open JSON Application Interface (OJAI) API.

Apache Spark is an open source big data processing framework, which is used for analytics on streaming and batch workloads. MapR-DB is a high performance NoSQL database, which supports two primary data models: JSON documents and wide column tables. A Spark connector is available for each data model. With the Spark/MapR-DB connectors, you can use MapR-DB as a data source and as a data destination for Spark jobs.

The Native Spark Connector for MapR-DB JSON supports loading data from a MapR-DB table as a Spark Resilient Distributed Dataset (RDD) of OJAI documents and saving a Spark RDD into a MapR-DB JSON table. (An RDD is the base format for storing data for use by Spark.)
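As a rough conceptual model of that round trip, the sketch below uses plain Python rather than Spark or the connector itself. A dictionary stands in for a JSON table and a list of dictionaries stands in for an RDD of documents. The helper names load_table and save_documents are hypothetical and are not part of the connector's API; the only detail borrowed from OJAI is that each document is identified by its _id field.

```python
import json

def load_table(table):
    """Return every row of the stand-in table as a list of documents."""
    return [json.loads(raw) for raw in table.values()]

def save_documents(docs, table):
    """Write each document into the stand-in table, keyed by its _id field."""
    for doc in docs:
        table[doc["_id"]] = json.dumps(doc, sort_keys=True)

# A stand-in "table": row key -> JSON text (illustrative data only).
source = {
    "u1": '{"_id": "u1", "name": "Ada", "visits": 3}',
    "u2": '{"_id": "u2", "name": "Lin", "visits": 7}',
}

docs = load_table(source)                         # table -> collection of documents
frequent = [d for d in docs if d["visits"] > 5]   # a transformation step

target = {}
save_documents(frequent, target)                  # collection -> table
```

In a real job the middle step would be a distributed Spark transformation, but the shape is the same: read a table as documents, transform them, and write the result to another table.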


The connector includes a set of APIs that enable MapR users to write applications that consume MapR-DB JSON tables and use them in Spark. It is a companion to the MapR-DB Binary Connector for Apache Spark, which can be used to write applications that consume HBase binary tables and use them in Spark.

The connector has two APIs that let you load data from a MapR-DB JSON table to a Spark RDD or save a Spark RDD to a MapR-DB JSON table. It also provides support for Scala bean classes, has a custom partitioner that allows you to partition data for better performance, and supports data locality. When the connector reads data from MapR-DB, it uses the data locality feature of MapR-DB to spawn the Spark executors.
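To illustrate why a custom partitioner can help, here is a minimal Python sketch of range partitioning by document key. It assumes, purely for illustration, that partitions are defined by sorted split points on the _id field; the article does not say how the connector's partitioner actually works, and the function names and split points here are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def partition_for(key, split_points):
    """Return the index of the key range that `key` falls into."""
    return bisect_right(split_points, key)

def partition_documents(docs, split_points):
    """Group documents by the key range their _id belongs to."""
    parts = defaultdict(list)
    for doc in docs:
        parts[partition_for(doc["_id"], split_points)].append(doc)
    return dict(parts)

docs = [{"_id": "alice"}, {"_id": "mallory"}, {"_id": "zed"}, {"_id": "bob"}]

# Hypothetical boundaries: keys below "h", keys from "h" up to "q", keys from "q" on.
split_points = ["h", "q"]

partitions = partition_documents(docs, split_points)
```

When the partitions of a job line up with the key ranges already used by the table, each partition can be written to or read from one region of the table, which is the kind of benefit a partitioner aligned with the data layout is meant to give.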

The Native Spark Connector includes support for the DataFrame and Dataset APIs, so HBase and MapR-DB binary tables can be queried directly with Spark. The advantage is that this removes any intermediary layers, making it easier to construct faster data pipelines and reducing the latency associated with data movement.


More Information

MapR-DB OJAI Documentation




Last Updated ( Monday, 05 June 2017 )