IBM Big SQL Sandbox
Written by Kay Ewbank   
Tuesday, 19 September 2017

IBM has released a sandbox version of Big SQL for desktop use. The Sandbox comes as a single-node Docker image, and is designed to let you get started with Big SQL and Hortonworks Data Platform.

Each Sandbox download comes preconfigured with sample data, a tutorial and an exercise for you to complete, and IBM says you'll be up and running in 30 minutes.

IBM Big SQL is IBM's SQL engine for Hadoop. IBM has worked with Hortonworks to integrate HDP (Hortonworks Data Platform) with IBM Big SQL. Big SQL 5 extends the capabilities of Hive, and makes use of HBase and Spark to provide an integrated analytics option.

 

Big SQL makes use of IBM Fluid Query to virtualize data from many different data stores, such as Hive, HBase, Spark, DB2, Oracle, SQL Server, Netezza, Informix, Teradata, WebHDFS and object stores.

IBM Fluid Query was introduced in 2015. It is powered by Netezza technology, and can be used to create federated queries where the data is drawn from a variety of sources, without the users of the data needing to deal with managing multiple data stores or query systems. Fluid Query can also be used to carry out and control bulk data movement between data repositories. Netezza created the first data warehouse appliance, and as an independent company also developed advanced analytics applications. It was bought by IBM in 2010.
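To make the idea of a federated query concrete, here is a minimal sketch in Python. It does not use Big SQL or Fluid Query. As a stand-in, it uses the standard library's sqlite3 module and SQLite's ATTACH DATABASE to join tables that live in two separate in-memory databases with a single SQL statement. The table names, the "warehouse" alias and the sample rows are all invented for illustration.

```python
import sqlite3


def federated_totals():
    """Join tables held in two separate SQLite databases with one query.

    This is only an analogy for federation: the caller writes a single
    SELECT and does not manage the two stores separately.
    """
    # First store: the default "main" in-memory database.
    conn = sqlite3.connect(":memory:")
    # Second store: an independent in-memory database, attached under a
    # hypothetical alias "warehouse".
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, amount REAL)")

    # Invented sample data.
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO warehouse.orders VALUES (?, ?)",
        [(1, 10.0), (1, 5.5), (2, 7.0)],
    )

    # One statement that draws on both databases.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM main.customers AS c
        JOIN warehouse.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in federated_totals():
        print(name, total)
```

In a real federated engine the sources would be different systems, such as Hive and Oracle, rather than two SQLite databases, but the shape of the query is the same: one statement, several stores.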

Big SQL offers bi-directional integration with Spark, and supports synthesis between Spark executors and Big SQL worker nodes. Along with the big data support, it also supports SQL dialects from other offerings such as IBM DB2, IBM Netezza data warehouse appliances and Oracle Database, including built-in support for Oracle’s SQL and PL/SQL dialects. IBM's hope is that applications written against Oracle will be moved to run in Big SQL, because they can be moved across with minimal changes.

Big SQL also offers YARN integration through Slider. YARN (Yet Another Resource Negotiator) is Apache's cluster management technology, while Slider extends Hadoop and YARN to let other databases run in YARN without modification. Obviously thinking they hadn't included enough big data names and technologies, IBM has added a new technology to Big SQL called “Elastic Boost”. IBM says this can improve Big SQL's performance by up to 50% by enabling allocation of multiple workers per node for more efficient CPU and memory utilization.

Big SQL also comes with an ANSI-compliant SQL parser that can run all 99 TPC-DS queries without the need for query modifications, along with structured streaming with new APIs.

More Information

Big SQL Sandbox

IBM Big SQL

Related Articles

SQL At Hadoop Scale 

Hadoop Adds In-Memory Caching

Apache Spark With Structured Streaming

 


Last Updated ( Tuesday, 19 September 2017 )