IBM Big SQL Sandbox

Written by Kay Ewbank

Tuesday, 19 September 2017

IBM has released a sandbox version of Big SQL for desktop use. The Sandbox comes as a single-node Docker image, and is designed to let you get started with Big SQL and Hortonworks Data Platform.

Each Sandbox download comes preconfigured with sample data, a tutorial and an exercise for you to complete, and IBM says you'll be up and running in 30 minutes.

IBM Big SQL is IBM's SQL engine for Hadoop. IBM has worked with Hortonworks to integrate HDP (Hortonworks Data Platform) with IBM Big SQL. Big SQL 5 extends the capabilities of Hive, and makes use of HBase and Spark to provide an integrated analytics option.


Big SQL makes use of IBM Fluid Query to virtualize data from many different data stores, such as Hive, HBase, Spark, DB2, Oracle, SQL Server, Netezza, Informix, Teradata, WebHDFS and object stores.

IBM Fluid Query was introduced in 2015. It is powered by Netezza technology, and can be used to create federated queries where the data is drawn from a variety of sources, without the users of the data needing to deal with managing multiple data stores or query systems. Fluid Query can also be used to carry out and control bulk data movement between data repositories. Netezza created the first data warehouse appliance, and as an independent company also developed advanced analytics applications. It was bought by IBM in 2010.
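To make the idea of a federated query concrete, here is a minimal sketch in Python. It does not use Fluid Query or Big SQL, and it is not their syntax. It assumes SQLite's ATTACH DATABASE as a stand-in, with two separate in-memory databases playing the role of two independent data stores. The names sales, crm, orders, customers and total_spend_by_customer are hypothetical and chosen only for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for a federation layer. Each attached
# in-memory database represents a separate, independent data store.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")  # hypothetical store 1
conn.execute("ATTACH DATABASE ':memory:' AS crm")    # hypothetical store 2

# Populate each store with its own table.
conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)",
    [(1, 20.0), (1, 5.0), (2, 12.5)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)


def total_spend_by_customer(connection):
    """Run one query that joins tables living in two different stores."""
    return connection.execute(
        """
        SELECT c.name, SUM(o.amount) AS total
        FROM crm.customers AS c
        JOIN sales.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


result = total_spend_by_customer(conn)
print(result)
```

The point of the sketch is that the caller writes a single SQL statement and never deals with the two stores separately, which is the convenience a federated query layer aims to provide across much more varied sources.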

Big SQL offers bi-directional integration with Spark, and supports synthesis between Spark executors and Big SQL worker nodes. Alongside its big data support, it also supports SQL dialects from other offerings, such as IBM DB2, IBM Netezza data warehouse appliances and Oracle Database, including built-in support for Oracle’s SQL and PL/SQL dialects. IBM's hope is that applications written against Oracle will be moved to run in Big SQL, because they can be moved across with minimal changes.

Big SQL also offers YARN integration through Slider. YARN (Yet Another Resource Negotiator) is Apache's cluster management technology, while Slider extends Hadoop and YARN to let other databases run in YARN without modification. Obviously thinking they hadn't included enough big data names and technologies, IBM has added a new technology to Big SQL called “Elastic Boost”. IBM says this can improve Big SQL's performance by up to 50% by enabling allocation of multiple workers per node for more efficient CPU and memory utilization.

Big SQL also comes with an ANSI-compliant SQL parser that can run all 99 TPC-DS queries without the need for query modifications, and it supports structured streaming with new APIs.


More Information

Big SQL Sandbox

IBM Big SQL

SQL At Hadoop Scale

Hadoop Adds In-Memory Caching

Apache Spark With Structured Streaming


Last Updated ( Tuesday, 19 September 2017 )
