Comparing Kafka To RabbitMQ
Written by Kay Ewbank
Tuesday, 05 September 2017
Researchers from Nokia Bell Labs have analyzed the relative merits of Apache Kafka and RabbitMQ, two open-source and commercially supported publish/subscribe systems, to see how they compare.
In a paper published in the Computer Science section of the Cornell University Library site, researchers Philippe Dobbelaere and Kyumars Sheykh Esmaili put forward a common comparison framework covering the core functionalities of publish/subscribe systems, considered the advantages of Kafka and RabbitMQ, and set out criteria for choosing between the two in different circumstances. The researchers' conclusion is that Kafka and RabbitMQ have very different histories and design goals, and distinct features.
RabbitMQ is an efficient implementation of the AMQP protocol, which offers a flexible routing mechanism built on the notions of exchanges and bindings. It is much closer to the classic messaging systems: it takes care of most of the consumption bookkeeping, its main design goal is to handle messages in memory, and its queue logic is optimized for empty or nearly empty queues.
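To make the exchange/binding idea concrete, here is a minimal in-memory sketch of topic-style routing. It is a toy model, not RabbitMQ's implementation and not a client library API: `ToyTopicExchange` and `topic_matches` are hypothetical names, and the only thing borrowed from AMQP is the convention that `*` matches exactly one dot-separated word and `#` matches zero or more.

```python
from collections import defaultdict, deque


def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb any number of leading words, including none.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


class ToyTopicExchange:
    """Hypothetical stand-in for an exchange: bindings decide which queues get a copy."""

    def __init__(self):
        self.bindings = []                 # list of (pattern, queue_name)
        self.queues = defaultdict(deque)   # queue_name -> pending messages

    def bind(self, queue_name: str, pattern: str) -> None:
        self.bindings.append((pattern, queue_name))

    def publish(self, routing_key: str, body: str) -> list:
        # A message is copied to every queue with at least one matching binding.
        targets = {q for pattern, q in self.bindings
                   if topic_matches(pattern, routing_key)}
        for q in targets:
            self.queues[q].append(body)
        return sorted(targets)


exchange = ToyTopicExchange()
exchange.bind("errors", "*.error")
exchange.bind("all_eu", "eu.#")
routed = exchange.publish("eu.error", "disk full")
```

The producer only names a routing key; which queues receive the message is decided entirely by the bindings, which is the flexibility the researchers refer to.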
Kafka, on the other hand, is designed around a distributed commit log, aiming at high throughput and at serving consumers of varying speeds. To that end, it departs from the classic principles of messaging systems in a few ways: it makes extensive use of partitioning at the expense of data order, and its queues are logical views on persisted logs, which allows replayability but requires manual retention policies. It also applies a number of very effective optimization techniques, most notably aggressive batching and reliance on persistent data structures and the OS page cache.
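The commit-log design can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not Kafka's code or client API: `ToyLog` is a hypothetical class, and `zlib.crc32` is used only as a deterministic stand-in for whatever hash a real partitioner applies to the key. It shows two consequences mentioned above: ordering holds only within a partition, and because records stay in the log after being read, a consumer can replay from any earlier offset.

```python
import zlib


class ToyLog:
    """Hypothetical partitioned append-only log; records are never removed on read."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: str, value: str):
        # Same key -> same partition, so per-key order is preserved.
        # Records with different keys may land in different partitions,
        # and no order is defined across partitions.
        p = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1   # (partition, offset)

    def read(self, partition: int, offset: int, max_records: int = 10):
        # Reading is a view on the log; the consumer tracks its own offset.
        return self.partitions[partition][offset:offset + max_records]


log = ToyLog(num_partitions=3)
p1, o1 = log.append("user-1", "login")
p2, o2 = log.append("user-1", "logout")

first_pass = log.read(p1, 0)
replay = log.read(p1, 0)      # reading again from offset 0 returns the same records
tail = log.read(p1, o2)       # a slower consumer can resume from its own offset
```

Nothing in this sketch ever deletes records, which is why a real deployment needs an explicit retention policy.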
In working out which one to choose, the researchers say that both systems are capable of delivering low-latency results. In the case of RabbitMQ, the difference between the at-most-once and at-least-once delivery modes is not significant. For Kafka, on the other hand, latency roughly doubles in the at-least-once mode. Additionally, if Kafka needs to read from disk, its latency can grow by up to an order of magnitude.
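For readers unfamiliar with the two delivery modes, the following sketch shows what they mean from a consumer's point of view. It is a deterministic simulation with a hypothetical `consume` function, not code from the paper or from either system, and it does not model latency at all. The only difference between the modes here is whether progress is recorded before or after a record is handled, and the consumer is assumed to crash once and restart from its last recorded position.

```python
def consume(records, mode, crash_at):
    """Simulate a consumer that crashes once while handling records[crash_at].

    mode is "at_most_once" (record progress first, then process) or
    "at_least_once" (process first, then record progress).
    Returns the list of records that were actually processed.
    """
    processed = []
    committed = 0      # position the consumer restarts from after a crash
    crashed = False
    pos = committed
    while pos < len(records):
        if mode == "at_most_once":
            committed = pos + 1                 # progress recorded before processing
            if pos == crash_at and not crashed:
                crashed = True
                pos = committed                 # restart: this record is skipped
                continue
            processed.append(records[pos])
        else:
            processed.append(records[pos])      # processing happens first
            if pos == crash_at and not crashed:
                crashed = True
                pos = committed                 # restart: this record is redone
                continue
            committed = pos + 1
        pos += 1
    return processed


lost = consume(["a", "b", "c"], "at_most_once", crash_at=1)
duplicated = consume(["a", "b", "c"], "at_least_once", crash_at=1)
```

In this run `lost` is `["a", "c"]` and `duplicated` is `["a", "b", "b", "c"]`: the first mode can drop a message, the second can deliver one twice.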
In terms of throughput, in the most basic setup (i.e. on a single node, with a single producer/channel, a single partition and no replication) RabbitMQ's throughput outperforms Kafka's. Increasing the Kafka partition count on the same node, however, can significantly improve its performance, demonstrating its superb scalability. Increasing the producer/channel count in RabbitMQ, on the other hand, improves its performance only moderately.
Both Kafka and RabbitMQ can scale further by partitioning flows over multiple nodes. In RabbitMQ, this requires additional special logic, such as the Consistent Hash Exchange and the Sharding Exchange. In Kafka, this comes for free. Finally, replication has a drastic impact on the performance of both systems, reducing the performance of RabbitMQ by 50% and that of Kafka by 75%.
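The idea behind spreading a message flow over several nodes by hashing can be sketched with a generic consistent-hash ring. This is an illustration of the general technique only, not the implementation of RabbitMQ's Consistent Hash Exchange or of Kafka's partition assignment; `ToyHashRing`, the node names and the choice of MD5 as the hash are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(text: str) -> int:
    # MD5 is used here only as a stable, well-spread hash; any such hash would do.
    return int(hashlib.md5(text.encode()).hexdigest(), 16)


class ToyHashRing:
    """Hypothetical consistent-hash ring mapping routing keys to node names."""

    def __init__(self, nodes, points_per_node: int = 100):
        # Each node is placed at many points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, routing_key: str) -> str:
        # Walk clockwise to the first point after the key's hash, wrapping around.
        i = bisect.bisect(self._points, _hash(routing_key)) % len(self._ring)
        return self._ring[i][1]


keys = [f"order-{i}" for i in range(200)]
three_nodes = ToyHashRing(["node-a", "node-b", "node-c"])
four_nodes = ToyHashRing(["node-a", "node-b", "node-c", "node-d"])

# Keys that change owner when node-d joins; all of them move to node-d.
moved = [k for k in keys if three_nodes.node_for(k) != four_nodes.node_for(k)]
```

The design point is that adding a node only moves keys onto the new node, so existing flows between the old nodes are left where they were.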
You can read the full findings on the Cornell University site.
|Last Updated ( Tuesday, 05 September 2017 )|