Differences between RabbitMQ and Kafka

Published by Big Data In Real World at March 8, 2021

Origin

Kafka originated at LinkedIn and was later open-sourced under the Apache Software Foundation. It was designed to solve problems inside LinkedIn and built for high-speed data integration and real-time processing of huge volumes of data.

RabbitMQ is one of the first open source message brokers to achieve a reasonable level of messaging features.

Design

Kafka employs a dumb broker and relies on smart consumers to read its buffer. The broker does not attempt to track which messages each consumer has read. Instead, Kafka retains all messages for a set amount of time, and consumers are responsible for tracking what they have consumed.

RabbitMQ uses a smart broker / dumb consumer model. The broker keeps track of consumer state and focuses on consistent delivery of messages to consumers that consume at roughly the same pace.
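To make the dumb broker / smart consumer idea concrete, here is a minimal in-memory sketch. It is a toy model, not the Kafka client API: the `Log` and `Consumer` classes and their methods are hypothetical names invented for illustration. The point it shows is that the broker only serves slices of an append-only log, while each consumer keeps its own offset.

```python
class Log:
    """Toy stand-in for a Kafka-style broker: an append-only list of messages."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the message just written

    def read(self, offset, max_messages=10):
        # The broker only serves a slice; it keeps no per-consumer state.
        return self._messages[offset:offset + max_messages]


class Consumer:
    """The consumer, not the broker, remembers how far it has read."""

    def __init__(self, log, start_offset=0):
        self._log = log
        self.offset = start_offset

    def poll(self, max_messages=10):
        batch = self._log.read(self.offset, max_messages)
        self.offset += len(batch)  # advance our own position
        return batch


log = Log()
for i in range(5):
    log.append(f"event-{i}")

fast = Consumer(log)
slow = Consumer(log)
fast_batch = fast.poll()                # reads everything written so far
slow_batch = slow.poll(max_messages=2)  # reads only the first two messages
```

Both consumers read the same log at their own pace, and reading removes nothing from it. In a smart broker model, that per-consumer bookkeeping would live on the broker side instead.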

Message handling

Kafka retains messages based on the retention time set on the topic, regardless of whether they have been consumed.

RabbitMQ removes messages as soon as they are consumed and acknowledged.
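The two retention policies can be contrasted with a small simulation. This is a sketch under stated assumptions: `RetainedLog` and `AckQueue` are hypothetical classes, not part of either product, and timestamps are passed in explicitly as plain numbers so the behaviour is deterministic.

```python
from collections import deque


class RetainedLog:
    """Kafka-like: messages stay until they are older than the retention time."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._entries = deque()  # (timestamp, message) pairs, oldest first

    def append(self, message, now):
        self._entries.append((now, message))

    def expire(self, now):
        # Drop entries older than the retention window, read or not.
        while self._entries and now - self._entries[0][0] > self.retention_seconds:
            self._entries.popleft()

    def messages(self):
        return [message for _, message in self._entries]


class AckQueue:
    """RabbitMQ-like: a message is removed once a consumer acknowledges it."""

    def __init__(self):
        self._ready = deque()
        self._unacked = {}
        self._next_tag = 1

    def publish(self, message):
        self._ready.append(message)

    def deliver(self):
        message = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = message  # held until acknowledged
        return tag, message

    def ack(self, tag):
        del self._unacked[tag]

    def depth(self):
        return len(self._ready) + len(self._unacked)


log = RetainedLog(retention_seconds=60)
log.append("a", now=0)
log.append("b", now=50)
log.expire(now=30)
within_window = log.messages()   # both messages are still retained
log.expire(now=70)
after_window = log.messages()    # "a" is now older than 60 seconds

queue = AckQueue()
queue.publish("a")
queue.publish("b")
tag, delivered = queue.deliver()
depth_before_ack = queue.depth()  # delivered but unacknowledged still counts
queue.ack(tag)
depth_after_ack = queue.depth()
```

In the log, age alone decides what is dropped. In the queue, acknowledgement alone does.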

Volume

Kafka is designed to store and stream huge volumes of data with very little overhead.

RabbitMQ is fast when the volume in its queues is low, and slows down as the volume goes up.

Performance

If you need both performance and volume, Kafka will be the best choice. You can easily get 100,000 messages per second.

RabbitMQ works best when volume is not too high, and slows down as the volume grows. You can expect around 20,000 messages per second.

Scaling

Kafka can be scaled horizontally, by adding more machines.

RabbitMQ can be scaled vertically, by adding more power to machines.

Horizontal scaling makes Kafka suitable for Big Data applications.

Monitoring

No single tool can be called the best for Kafka monitoring. Confluent, Datadog and a few other vendors have developed monitoring tools for Kafka.

RabbitMQ has built-in monitoring tools that let you inspect queues, connections, exchanges and user permissions.

Routing rules

Kafka takes a simple routing approach. Consumption is managed by consumers entirely. Consumers in Kafka can read messages from the beginning or from a certain offset.

RabbitMQ supports complex routing rules and keeps track of message states such as consumed and acknowledged.
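As an illustration of what broker-side routing rules look like, the sketch below imitates the matching used by RabbitMQ topic exchanges, where `*` stands for exactly one dot-separated word and `#` for zero or more words. It is a simplified model written for this post, not RabbitMQ code; the `matches` and `route` functions and the queue names are hypothetical.

```python
def matches(pattern, routing_key):
    """Return True if a topic-style binding pattern matches a routing key."""
    return _match(pattern.split("."), routing_key.split("."))


def _match(pattern_words, key_words):
    if not pattern_words:
        return not key_words  # both exhausted means a full match
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "#":
        # '#' may absorb zero or more words of the routing key.
        return any(_match(rest, key_words[i:]) for i in range(len(key_words) + 1))
    if not key_words:
        return False
    if head == "*" or head == key_words[0]:
        return _match(rest, key_words[1:])
    return False


def route(bindings, routing_key):
    """Return the queues whose binding pattern matches the routing key."""
    return [queue for queue, pattern in bindings if matches(pattern, routing_key)]


# Hypothetical bindings: (queue name, binding pattern)
bindings = [
    ("all_orders", "orders.#"),
    ("eu_only", "orders.eu.*"),
    ("audit", "#"),
]

eu_targets = route(bindings, "orders.eu.created")
us_targets = route(bindings, "orders.us.created")
other_targets = route(bindings, "payments.failed")
```

Here the broker decides which queues receive each message. In the Kafka approach described above, a consumer simply picks a topic and an offset and reads forward from there.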

Big Data In Real World

We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
