Kafka Graphs Framework Extends Kafka Streams
Written by Kay Ewbank   
Friday, 03 August 2018

A new graph processing framework for Apache Kafka extends Kafka Streams to provide distributed graph analytics based only on what is already provided by the Kafka layers. Kafka Graphs is a client layer for distributed processing of graphs.

Kafka Graphs provides a library for graph transformations alongside a distributed platform for executing graph algorithms. The developers say it was inspired by other platforms for graph analytics such as Apache Flink Gelly, Apache Spark GraphX, and Apache Giraph, but unlike these other frameworks it does not require anything other than what is already provided by the Kafka abstraction funnel.

The Kafka abstraction funnel is the name given by the developer of Kafka Graphs, Robert Yokota, to four layers of the Kafka ecosystem: Confluent's KSQL, the Kafka Streams DSL, the Kafka Streams Processor API, and the Producer-Consumer API. These form a hierarchy of abstractions that handle different parts of the stream being processed, with most cases being handled by KSQL and each subsequent layer handling a smaller volume, which is what forms the 'funnel'.

Graphs in Kafka Graphs are represented by two Kafka Streams tables, one for vertices and one for edges. Each entry in the vertex table consists of an ID and a vertex value, while each entry in the edge table consists of a source ID, a target ID, and an edge value. Once a graph is created, graph transformations can be performed on it.
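To make the two-table representation concrete, here is a minimal sketch in plain Python. It is not the Kafka Graphs API: ordinary dictionaries stand in for the Kafka Streams tables, and the helper names `map_vertices` and `out_degrees` are hypothetical, chosen only to illustrate what a transformation over the two tables might look like.

```python
from collections import Counter

# Vertex table: vertex ID -> vertex value.
# A plain dict stands in for a Kafka Streams table here (an assumption).
vertices = {1: "a", 2: "b", 3: "c"}

# Edge table: (source ID, target ID) -> edge value.
edges = {(1, 2): 1.0, (1, 3): 4.0, (2, 3): 2.0}


def map_vertices(vertex_table, fn):
    """Hypothetical transformation: apply fn to every vertex value."""
    return {vid: fn(value) for vid, value in vertex_table.items()}


def out_degrees(vertex_table, edge_table):
    """Hypothetical transformation: count outgoing edges per vertex.

    Vertices that never appear as a source get a count of 0.
    """
    counts = Counter(src for (src, _dst) in edge_table)
    return {vid: counts.get(vid, 0) for vid in vertex_table}


upper = map_vertices(vertices, str.upper)
degrees = out_degrees(vertices, edges)
```

Keeping vertices and edges in separate keyed tables means a transformation can touch one table without rewriting the other, which is the property the sketch tries to show.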

Kafka Graphs provides a number of graph algorithms based on the approach taken by Pregel, a system for large-scale graph processing. Algorithms in Kafka Graphs include breadth-first search, local clustering coefficient, single- and multiple-source shortest path, weakly connected components, and PageRank. Custom Pregel-based graph algorithms can also be added by implementing the ComputeFunction interface.
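The Pregel approach runs an algorithm as a series of "supersteps": in each one, every vertex that received messages updates its value and may send messages along its outgoing edges, and the run ends when no messages remain. The following is a rough single-process sketch of single-source shortest path in that style. It is plain Python, not Kafka Graphs code, and `sssp_compute` is a hypothetical stand-in for the kind of per-vertex logic a ComputeFunction implementation would hold; the real interface's signature is not shown in this article.

```python
import math


def sssp_compute(value, messages):
    """Hypothetical per-vertex step: keep the smallest distance seen so far."""
    return min([value, *messages])


def pregel_sssp(vertex_ids, edges, source, max_supersteps=30):
    """Run Pregel-style supersteps until no messages are in flight."""
    dist = {v: math.inf for v in vertex_ids}
    # Superstep 0: the source receives an initial message with distance 0.
    inbox = {source: [0.0]}
    for _ in range(max_supersteps):
        if not inbox:
            break  # no messages left, so every vertex has halted
        outbox = {}
        for v, messages in inbox.items():
            new_value = sssp_compute(dist[v], messages)
            if new_value < dist[v]:
                dist[v] = new_value
                # Only vertices whose distance improved send messages onward.
                for (src, dst), weight in edges.items():
                    if src == v:
                        outbox.setdefault(dst, []).append(new_value + weight)
        inbox = outbox
    return dist


# Edge table as (source ID, target ID) -> edge weight; vertex 5 is unreachable.
edges = {(1, 2): 1.0, (1, 3): 4.0, (2, 3): 2.0, (3, 4): 1.0}
distances = pregel_sssp([1, 2, 3, 4, 5], edges, source=1)
```

In a distributed setting the inbox and outbox would be exchanged between workers rather than held in local dictionaries; the sketch keeps them local purely for readability.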

Because Kafka Graphs is built on top of Kafka Streams, it is able to use the underlying partitioning scheme of Kafka Streams to support distributed graph processing. When multiple instances of its REST application are started on different hosts, they automatically work together to partition the set of vertices when executing a graph algorithm.
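The idea of splitting vertices across instances by key can be illustrated with a small sketch. This is an assumption-laden simplification rather than how Kafka Streams actually assigns work: CRC32 is used here only as a convenient deterministic hash, and `partition_for` and `assign_vertices` are hypothetical helper names.

```python
import zlib


def partition_for(vertex_id, num_partitions):
    """Map a vertex ID to a partition using a deterministic hash of its key.

    CRC32 is an illustrative stand-in, not the hash Kafka itself uses.
    """
    key_bytes = str(vertex_id).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions


def assign_vertices(vertex_ids, num_partitions):
    """Group vertex IDs by partition; each instance would own some partitions."""
    assignment = {p: [] for p in range(num_partitions)}
    for vid in vertex_ids:
        assignment[partition_for(vid, num_partitions)].append(vid)
    return assignment


assignment = assign_vertices(range(1, 11), num_partitions=3)
```

Because the mapping depends only on the key, any instance can work out which partition holds a given vertex without consulting the others.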

The developers say Kafka Graphs is still in its early stages, but it is available to try on GitHub.

More Information

Kafka Graphs on GitHub

Pregel

Related Articles

Apache Kafka Adds New Streams API

Open Source GraphQL Engine Launched

 


Last Updated ( Friday, 03 August 2018 )