Apache Kafka 0.8.2 Now Ready For You To Trial
Written by Kay Ewbank   
Friday, 06 February 2015

OK, we got the Kafka joke out of the way in the headline. So what is Kafka and why does it need a manager?

Apache Kafka 0.8.2 has been released, alongside news that Yahoo is making Kafka Manager available as an open source version.


Kafka is a fault-tolerant, low-latency, high-throughput distributed messaging system that is used in data pipelines at several companies. It was originally created at LinkedIn, where it forms a critical part of LinkedIn’s infrastructure and transmits data to all systems and applications. It is a top-level Apache project and is currently under active development.

The new version of Kafka has a new Java producer client. Until now, Kafka’s JVM clients haven’t changed much since the original release, and alongside the new producer Java client, the team is rewriting the consumer client. The new producer removes the distinction between the “sync” and “async” producer. Writing about the new version on the Confluent blog  Neha Narkhede said that

“effectively, all requests are sent asynchronously but always return a future response object that returns the offset as well as any error that may have occurred when the request is complete.”

The post adds that batching is now done whenever possible, meaning that the sync producer, under load, can get performance as good as the async producer. The batching means that any messages that arrive while a send is in progress are batched together.

The new version has yet another version of a delete topic feature. This is the third attempt at getting a working delete topic feature. Nerkhede says

“You would think deleting the data would be easy and not deleting it would be the hard part. But each of our previous attempts had flaws that appeared with usage. This time significant testing has been done and we are hopeful that your data will really be gone. “

A major improvement to the new version is in offset management. Kafka works out which messages have been consumed using offsets, markers it stores for the consumer’s position in the log of messages. Until now such markers were stored by the JVM client in ZooKeeper. When consumers want to update their position frequently, this can become a bottleneck, so the developers have added native offset storage functionality and a new API for making use of this. This makes offset storage horizontally scalable.

Other improvements include automated leader rebalancing, controlled shutdown, and stronger durability guarantees.

Alongside the release of Kafka 0.8.2, Yahoo is making Kafka Manager available in an open source version. Writing about the release on the Yahoo Engineering blog, Hiral P says that Kafka is used by many teams across Yahoo, including for the real-time analytics pipeline, where the Kafka cluster handles a peak bandwidth of more than 20Gbps (of compressed data).


The developers at Yahoo created a web-based tool called Kafka Manager to make it easier to maintain the Kafka clusters. The interface makes it easier to identify topics which are unevenly distributed across the cluster or have partition leaders unevenly distributed across the cluster.

It supports management of multiple clusters, preferred replica election, replica re-assignment, and topic creation. The manager was built with Scala, and the web console is based on the Play Framework which interacts with an actor based in-memory model built with Akka and Apache Curator. The Yahoo developers have also ported some utilities from Apache Kafka to work with the Apache Curator framework as well. Yahoo uses Curator to inspect the state of the cluster from Zookeeper.

The management utility has now been made available as an open source version on GitHub.




More Information

Kafka 0.8.2

Kafka Manager

Related Articles

Hadoop in Practice 2e (Manning)       

Amazon Kinesis For Real-Time Processing of Data       

Elastic MapReduce Demo Shows How to Handle Large Datasets 


To be informed about new articles on I Programmer, install the I Programmer Toolbar, subscribe to the RSS feed, follow us on, Twitter, FacebookGoogle+ or Linkedin,  or sign up for our weekly newsletter.



We Built A Software Engineer

One of the most worrying things about being a programmer today is the threat from AI. It has gone so far that NVIDA CEO Jensen Huang proclaims that you really shouldn't start training as a programmer  [ ... ]

The University of Tübingen's Self-Driving Cars Course

The recorded lectures and the written material of a course on Self-Driving Cars at the University of Tübingen have been made available for free. It's a first class opportunity to learn the in an [ ... ]

More News


raspberry pi books



or email your comment to: comments@i-programmer.info


Last Updated ( Friday, 06 March 2015 )