IBM Big SQL Sandbox
Written by Kay Ewbank   
Tuesday, 19 September 2017

IBM has released a sandbox version of Big SQL for desktop use. The Sandbox comes as a single-node Docker image, and is designed to let you get started with Big SQL and the Hortonworks Data Platform.

Each Sandbox download comes preconfigured with sample data, a tutorial and an exercise for you to complete, and IBM says you'll be up and running in 30 minutes.

IBM Big SQL is IBM's SQL engine for Hadoop. IBM has worked with Hortonworks to integrate HDP (Hortonworks Data Platform) with IBM Big SQL. Big SQL 5 extends the capabilities of Hive, and makes use of HBase and Spark to provide an integrated analytics option.

 

Big SQL makes use of IBM Fluid Query to virtualize data from many different data stores such as Hive, HBase, Spark, DB2, Oracle, SQL Server, Netezza, Informix, Teradata, WebHDFS and object store.

IBM Fluid Query was introduced in 2015. It is powered by Netezza technology, and can be used to create federated queries where the data is drawn from a variety of sources, without the users of the data needing to deal with managing multiple data stores or query systems. Fluid Query can also be used to carry out and control bulk data movement between data repositories. Netezza created the first data warehouse appliance, and as an independent company also developed advanced analytics applications. It was bought by IBM in 2010.
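To make the idea of a federated query concrete, here is a minimal sketch in Python. It does not use Big SQL or Fluid Query at all: as an assumption for illustration, two separate in-memory SQLite databases stand in for two independent data stores, and the table names, column names and sample rows are all hypothetical. The point is only that a single SQL statement joins data held in both stores, so the person writing the query does not have to query each store separately and combine the results by hand.

```python
import sqlite3


def build_federated_connection():
    """Create two separate in-memory databases reachable from one connection.

    'main' and 'warehouse' are stand-ins for two independent data stores;
    all names and rows below are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    # ATTACH adds a second, independent database to the same connection.
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO warehouse.orders VALUES (?, ?)",
        [(1, 10.0), (1, 5.5), (2, 7.0)],
    )
    conn.commit()
    return conn


def total_spend_per_customer(conn):
    """Run one query that joins tables living in two different databases."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM main.customers AS c
        JOIN warehouse.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in total_spend_per_customer(build_federated_connection()):
        print(name, total)
```

In a real federated engine the stores would be different products on different machines, but the shape of the query is the same: one statement, several sources.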

Big SQL offers bi-directional integration with Spark, and supports synthesis between Spark executors and Big SQL worker nodes. Along with the big data support, it also supports SQL dialects from other offerings such as IBM DB2, IBM Netezza data warehouse appliances and Oracle Database, including built-in support for Oracle’s SQL and PL/SQL dialects. IBM's hope is that applications written against Oracle will be moved to run in Big SQL, because they can be moved across with minimal changes.

Big SQL also offers YARN integration through Slider. YARN (Yet Another Resource Negotiator) is Apache's cluster management technology, while Slider extends Hadoop and YARN to let other databases run in YARN without modification. Obviously thinking they hadn't included enough big data names and technologies, IBM has added a new technology to Big SQL called “Elastic Boost”. IBM says this can improve Big SQL's performance by up to 50% by enabling allocation of multiple workers per node for more efficient CPU and memory utilization.

Big SQL also comes with structured streaming with new APIs, and with an ANSI-compliant SQL parser that can run all 99 TPC-DS queries without the need for query modifications.

More Information

Big SQL Sandbox

IBM Big SQL

Related Articles

SQL At Hadoop Scale 

Hadoop Adds In-Memory Caching

Apache Spark With Structured Streaming

 

Last Updated ( Tuesday, 19 September 2017 )
 
 

   
Copyright © 2017 i-programmer.info. All Rights Reserved.