Grail Open Sources Bigslice Cluster Computing For Golang
Written by Kay Ewbank   
Thursday, 17 October 2019

GRAIL has open sourced two projects, Bigslice and Bigmachine, which enable distributed computation across large datasets using simple Golang programs. 

Bigslice is a system for fast, large-scale, serverless data processing using Go. It exposes a composable API that lets the user express data processing tasks in terms of a series of data transformations that invoke user code.

The developers describe Bigslice as similar to data processing systems such as Apache Spark and FlumeJava, but with different aims. Bigslice is built for Go and is used as an ordinary Go package: users can reuse their existing Go code, and Bigslice binaries are compiled like ordinary Go binaries.

It is also serverless: the team says that, with nothing more than cloud credentials, Bigslice can process large datasets without any other external infrastructure.



Bigslice programs are regular Go programs, providing users with a familiar environment and tools. A Bigslice program can be run on a single node like any other program, but it is also capable of transparently distributing itself across an ad hoc cluster, managed entirely by the program itself.

The data processing features of Bigslice come in the form of a coherent set of operators for working with large datasets using ordinary Go code. The operators are familiar data transformation primitives such as map, filter, reduce, and join. The user's computations are written sequentially (they specify how a dataset is to be transformed, step by step, into the desired result), while Bigslice parallelizes the computation and can distribute it across many processors and over large compute clusters.

Bigslice achieves this by splitting a dataset into many smaller pieces and applying the transformations to each piece individually, so that each piece fits in memory and the pieces can be processed in parallel across many machines. When a transformation requires data to be rearranged (for operations such as join or reduce), Bigslice reshuffles the data accordingly.

Bigslice uses Bigmachine to manage an ad hoc cluster of compute nodes to support distribution. Bigmachine is a toolkit for building self-managing serverless applications in Go. It provides an API that lets a driver process form an ad hoc cluster of machines to which user code is transparently distributed. User code is exposed through services, which are stateful Go objects associated with each machine.


More Information

Bigslice Home Page

