GRAIL Open Sources Bigslice Cluster Computing For Golang
Written by Kay Ewbank   
Thursday, 17 October 2019

GRAIL has open sourced two projects, Bigslice and Bigmachine, which enable distributed computation across large datasets using simple Golang programs. 

Bigslice is a system for fast, large-scale, serverless data processing using Go. It exposes a composable API that lets the user express data processing tasks in terms of a series of data transformations that invoke user code.

The developers describe Bigslice as similar to data processing systems like Apache Spark and FlumeJava, but with different aims. Bigslice is built for Go and is used as an ordinary Go package. Users can reuse their existing Go code, and Bigslice binaries are compiled like ordinary Go binaries.

It is also serverless, and the team says that with nothing more than cloud credentials, Bigslice can be used to process large datasets without the use of any other external infrastructure.

Bigslice programs are regular Go programs, providing users with a familiar environment and tools. A Bigslice program can be run on a single node like any other program, but it is also capable of transparently distributing itself across an ad hoc cluster, managed entirely by the program itself.

The data processing features of Bigslice come in the form of a coherent set of operators that can be used to work with large data sets using ordinary Go code. The operators are familiar data transformation primitives such as map, filter, reduce, and join. While the user’s computations are sequential — they specify how a dataset is to be transformed, step-by-step, into the desired result — Bigslice parallelizes the computation and can distribute it across many processors and over large compute clusters.
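To make the idea concrete, here is a minimal sketch of a sequential-looking filter, map and reduce pipeline executed in parallel over shards. It uses only the Go standard library and is not the Bigslice API; the names shard and sumSquaresOfEvens are hypothetical, and goroutines stand in for the machines of a cluster.

```go
package main

import (
	"fmt"
	"sync"
)

// shard splits data into n pieces round-robin (hypothetical helper; n must be > 0).
func shard(data []int, n int) [][]int {
	shards := make([][]int, n)
	for i, v := range data {
		shards[i%n] = append(shards[i%n], v)
	}
	return shards
}

// sumSquaresOfEvens keeps the even values (filter), squares them (map)
// and adds them up (reduce). Each shard is processed in its own goroutine.
func sumSquaresOfEvens(data []int, nshard int) int {
	shards := shard(data, nshard)
	partial := make([]int, nshard) // one partial sum per shard, so no locking is needed
	var wg sync.WaitGroup
	for i, s := range shards {
		wg.Add(1)
		go func(i int, s []int) {
			defer wg.Done()
			for _, v := range s {
				if v%2 == 0 { // filter
					partial[i] += v * v // map, then shard-local reduce
				}
			}
		}(i, s)
	}
	wg.Wait()

	// Final reduce: combine the per-shard partial results.
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	data := []int{1, 2, 3, 4, 5, 6}
	fmt.Println(sumSquaresOfEvens(data, 3)) // prints 56
}
```

The caller describes only what should happen to the data; how many shards are used, and where they run, does not change the result.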

Bigslice achieves this by splitting the datasets into many smaller pieces and applying the transformations to each piece individually, so that each piece fits in memory and the pieces can be processed in parallel across many machines. When a transformation requires data to be rearranged (for operations like join or reduce), Bigslice reshuffles the data accordingly.
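The reshuffling step can be illustrated with a small standard-library sketch. It assumes hash partitioning by key, a common way to implement a shuffle; the article does not say how Bigslice does it internally, and the names pair, partition, shuffle, reduceShard and countByKey are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pair is a keyed record, for example a word and an occurrence count.
type pair struct {
	Key   string
	Count int
}

// partition maps a key to an output shard index (nshard must be > 0).
func partition(key string, nshard int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(nshard))
}

// shuffle regroups the input shards so that all records sharing a key
// end up in the same output shard.
func shuffle(in [][]pair, nshard int) [][]pair {
	out := make([][]pair, nshard)
	for _, s := range in {
		for _, p := range s {
			i := partition(p.Key, nshard)
			out[i] = append(out[i], p)
		}
	}
	return out
}

// reduceShard sums the counts per key within a single shard.
func reduceShard(s []pair) map[string]int {
	sums := make(map[string]int)
	for _, p := range s {
		sums[p.Key] += p.Count
	}
	return sums
}

// countByKey shuffles, reduces every shard independently and merges the results.
func countByKey(in [][]pair, nshard int) map[string]int {
	total := make(map[string]int)
	for _, s := range shuffle(in, nshard) {
		for k, v := range reduceShard(s) {
			total[k] = v // after the shuffle, each key lives in exactly one shard
		}
	}
	return total
}

func main() {
	in := [][]pair{
		{{"go", 1}, {"spark", 1}},
		{{"go", 1}, {"flume", 1}},
	}
	fmt.Println(countByKey(in, 4))
}
```

Because each key is confined to one shard after the shuffle, the per-shard reductions need no coordination and could run on different machines.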

Bigslice uses Bigmachine to manage an ad-hoc cluster of compute nodes to support distribution. Bigmachine is a toolkit for building self-managing serverless applications in Go. It provides an API that lets a driver process form an ad-hoc cluster of machines to which user code is transparently distributed. User code is exposed through services, which are stateful Go objects associated with each machine.
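The notion of a service as a stateful object tied to a machine can be sketched with Go's standard net/rpc package. This is a stand-in chosen for illustration, not the Bigmachine API: the "machines" here are in-process goroutines connected through net.Pipe, and the names Counter, startMachine and demo are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"sync"
)

// Counter is a stateful service; every simulated machine owns one instance.
type Counter struct {
	mu    sync.Mutex
	total int
}

// Add adds delta to the machine-local total and reports the new total.
func (c *Counter) Add(delta int, reply *int) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.total += delta
	*reply = c.total
	return nil
}

// startMachine simulates starting a machine in-process and returns
// an RPC client that the driver can use to call its service.
func startMachine() (*rpc.Client, error) {
	srv := rpc.NewServer()
	if err := srv.Register(&Counter{}); err != nil {
		return nil, err
	}
	driverEnd, machineEnd := net.Pipe()
	go srv.ServeConn(machineEnd) // serves until the connection is closed
	return rpc.NewClient(driverEnd), nil
}

// demo plays the driver: it starts two machines and calls their services.
func demo() ([]int, error) {
	a, err := startMachine()
	if err != nil {
		return nil, err
	}
	defer a.Close()
	b, err := startMachine()
	if err != nil {
		return nil, err
	}
	defer b.Close()

	var r1, r2, r3 int
	if err := a.Call("Counter.Add", 5, &r1); err != nil {
		return nil, err
	}
	if err := a.Call("Counter.Add", 2, &r2); err != nil {
		return nil, err
	}
	if err := b.Call("Counter.Add", 10, &r3); err != nil {
		return nil, err
	}
	return []int{r1, r2, r3}, nil
}

func main() {
	res, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(res) // prints [5 7 10]
}
```

The second call to machine a sees the state left by the first, while machine b keeps its own independent total, which is the sense in which a service is stateful and per-machine.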

More Information

Bigslice Home Page

Related Articles

Spark Gets NLP Library

Apache Spark With Structured Streaming

Spark BI Gets Fine Grain Security

Spark 2.0 Released

Go 1.13 Modernizes Number Literals

Last Updated ( Thursday, 17 October 2019 )