Apache Kafka 0.8.2 Now Ready For You To Trial
Apache Kafka 0.8.2 Now Ready For You To Trial
Written by Kay Ewbank   
Friday, 06 February 2015

OK, we got the Kafka joke out of the way in the headline. So what is Kafka and why does it need a manager?

Apache Kafka 0.8.2 has been released, alongside news that Yahoo is making Kafka Manager available as an open source version.

kafkabanner

Kafka is a fault-tolerant, low-latency, high-throughput distributed messaging system that is used in data pipelines at several companies. It was originally created at LinkedIn, where it forms a critical part of LinkedIn’s infrastructure and transmits data to all systems and applications. It is a top-level Apache project and is currently under active development.

The new version of Kafka has a new Java producer client. Until now, Kafka’s JVM clients haven’t changed much since the original release, and alongside the new producer Java client, the team is rewriting the consumer client. The new producer removes the distinction between the “sync” and “async” producer. Writing about the new version on the Confluent blog  Neha Narkhede said that

“effectively, all requests are sent asynchronously but always return a future response object that returns the offset as well as any error that may have occurred when the request is complete.”

The post adds that batching is now done whenever possible, meaning that the sync producer, under load, can get performance as good as the async producer. The batching means that any messages that arrive while a send is in progress are batched together.

The new version has yet another version of a delete topic feature. This is the third attempt at getting a working delete topic feature. Nerkhede says

“You would think deleting the data would be easy and not deleting it would be the hard part. But each of our previous attempts had flaws that appeared with usage. This time significant testing has been done and we are hopeful that your data will really be gone. “

A major improvement to the new version is in offset management. Kafka works out which messages have been consumed using offsets, markers it stores for the consumer’s position in the log of messages. Until now such markers were stored by the JVM client in ZooKeeper. When consumers want to update their position frequently, this can become a bottleneck, so the developers have added native offset storage functionality and a new API for making use of this. This makes offset storage horizontally scalable.

Other improvements include automated leader rebalancing, controlled shutdown, and stronger durability guarantees.

Alongside the release of Kafka 0.8.2, Yahoo is making Kafka Manager available in an open source version. Writing about the release on the Yahoo Engineering blog, Hiral P says that Kafka is used by many teams across Yahoo, including for the real-time analytics pipeline, where the Kafka cluster handles a peak bandwidth of more than 20Gbps (of compressed data).

kafkamanager

The developers at Yahoo created a web-based tool called Kafka Manager to make it easier to maintain the Kafka clusters. The interface makes it easier to identify topics which are unevenly distributed across the cluster or have partition leaders unevenly distributed across the cluster.

It supports management of multiple clusters, preferred replica election, replica re-assignment, and topic creation. The manager was built with Scala, and the web console is based on the Play Framework which interacts with an actor based in-memory model built with Akka and Apache Curator. The Yahoo developers have also ported some utilities from Apache Kafka to work with the Apache Curator framework as well. Yahoo uses Curator to inspect the state of the cluster from Zookeeper.

The management utility has now been made available as an open source version on GitHub.

kafkalogo

 

 

More Information

Kafka 0.8.2

Kafka Manager

Related Articles

Hadoop in Practice 2e (Manning)       

Amazon Kinesis For Real-Time Processing of Data       

Elastic MapReduce Demo Shows How to Handle Large Datasets 

 

To be informed about new articles on I Programmer, install the I Programmer Toolbar, subscribe to the RSS feed, follow us on, Twitter, FacebookGoogle+ or Linkedin,  or sign up for our weekly newsletter.

 

Banner


Google Opens Doors To Its Colaboratory
27/11/2017

Last month Google made another of its in-house data science tools freely available for anyone to use. Colaboratory is a document collaboration tool that has the ability to run code and show its output [ ... ]



Exploring AlphaGo Teach
13/12/2017

DeepMind has released AlphaGo Teach, an interactive tool that can be used to explore the opening moves of a game of Go. It lets you try out alternative moves and shows the probabilities of a win  [ ... ]


More News

 

 
 

 

blog comments powered by Disqus

<ASIN:1612931030>

Last Updated ( Friday, 06 March 2015 )
 
 

   
Banner
RSS feed of news items only
I Programmer News
Copyright © 2017 i-programmer.info. All Rights Reserved.
Joomla! is Free Software released under the GNU/GPL License.