Splice Machine 3 Improves SQL Coverage
Written by Kay Ewbank
Thursday, 23 January 2020
Splice Machine, a database that combines Hadoop and traditional relational abilities, has been updated. The new version offers improved SQL coverage, better workload management, and Kubernetes support.
The Splice Machine database is built on two technology stacks: Apache Derby, a Java-based ANSI SQL database, and HBase/Hadoop. The company says it provides the scale-out technology of Hadoop, the distributed real-time computing power of the key-value store HBase, and the full features of an RDBMS, including ANSI SQL and ACID transactions.
Splice Machine aims to provide both transactional and analytical functionality. Most database systems concentrate either on fast, index-based data access, or on scalability and support for unstructured data at the cost of ACID transactions and fast indexes. Splice Machine aims to do both, combining auto-sharding for scalability with ACID transactions and indexes.
Workload Management has been improved in this release with the addition of support for multiple OLAP (online analytical processing) queues. This means you can reserve cluster capacity for specific queries and isolate workloads from each other, so adequate resources are available when multiple resource-intensive queries run at the same time.
SQL coverage has been improved with support for DB2-specific SQL syntax. More generally, Splice Machine 3 has full Outer Join support, along with support for point-in-time queries, which make it possible to query the database as it existed at some time in the past. Trigger support is also improved, covering both the events that can trigger automatic actions and the actions that can be taken as a result.
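To make the idea of a point-in-time query concrete, here is a minimal Python sketch of a store that keeps every version of a row and answers "what was the value as of time T?". It is a toy illustration only, not Splice Machine's implementation or SQL syntax; the `VersionedTable` class, its method names, and the integer timestamps are all hypothetical.

```python
import bisect


class VersionedTable:
    """Toy key-value table that keeps every version of each row (hypothetical)."""

    def __init__(self):
        # key -> list of commit timestamps (ascending) and parallel list of values
        self._times = {}
        self._values = {}

    def put(self, key, value, ts):
        # Assumption: writes for a given key arrive in increasing timestamp order.
        self._times.setdefault(key, []).append(ts)
        self._values.setdefault(key, []).append(value)

    def get_as_of(self, key, ts):
        # Return the latest version committed at or before ts, or None if
        # the row did not exist yet at that time.
        times = self._times.get(key, [])
        i = bisect.bisect_right(times, ts)
        return self._values[key][i - 1] if i else None


table = VersionedTable()
table.put("acct-1", 100, ts=1)
table.put("acct-1", 250, ts=5)

print(table.get_as_of("acct-1", 3))  # value as it stood between the two writes
print(table.get_as_of("acct-1", 5))  # value after the second write
```

A query "as of" a timestamp simply ignores every version committed after that timestamp, which is why older versions have to be retained rather than overwritten.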
Replication options now include Active-Passive replication, meaning you can keep multiple DB clusters in sync automatically. Security has also been enhanced with support for Schema Access Restrictions, which let you restrict access to objects belonging to a specified schema so that other users cannot view or access them.
The area to have received the most changes is data science, with new support for Jupyter notebooks, JupyterHub and BeakerX.
JupyterHub provides a way to serve Jupyter notebooks to multiple users, while BeakerX is an added layer that sits on top of Jupyter, providing features including polyglot programming and cross-kernel variable support.
Polyglot programming support means you can use multiple different languages within a single Jupyter notebook, including SQL, R, Python, Java and Scala. BeakerX’s global namespace means you can build cross-language models, store variables in the global beakerx object, and access that data from other kernels. The new release has also added Model Workflow Management, based on MLflow.