March 10, 2021
Ever wondered what the differences between RabbitMQ and Kafka are, and why Kafka is needed when RabbitMQ already exists? Read through this post and you will find the answer.
Origin
Kafka originated at LinkedIn and was later open-sourced under Apache. It was designed to solve issues inside LinkedIn and built for high-speed data integration and real-time processing of huge volumes of data.
RabbitMQ is one of the first open source message brokers to achieve a reasonable level of messaging features.
Design
Kafka employs a dumb broker and uses smart consumers to read its buffer. The broker does not attempt to track which messages each consumer has read. Kafka retains all messages for a set amount of time, and consumers are responsible for tracking their own position (offset) in the log.
RabbitMQ uses a smart broker / dumb consumer model. The broker keeps track of consumer state and focuses on consistent delivery of messages to consumers that consume at roughly the same pace as the broker delivers them.
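The two designs can be illustrated with a small in-memory sketch. This is a toy model rather than either product's real API: the class names `LogBroker`, `OffsetConsumer` and `QueueBroker` are made up for illustration, and networking, persistence and concurrency are all left out.

```python
from collections import deque


class LogBroker:
    """Kafka-style 'dumb broker': an append-only log with no per-consumer state."""

    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1  # offset of the newly appended message

    def read(self, offset, max_messages=10):
        # The caller decides where to read from; reading removes nothing.
        return self.log[offset:offset + max_messages]


class OffsetConsumer:
    """'Smart consumer': remembers its own position in the log."""

    def __init__(self, broker, offset=0):
        self.broker = broker
        self.offset = offset

    def poll(self):
        batch = self.broker.read(self.offset)
        self.offset += len(batch)
        return batch


class QueueBroker:
    """RabbitMQ-style 'smart broker': tracks delivered and acknowledged messages."""

    def __init__(self):
        self.ready = deque()
        self.unacked = {}
        self.next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        # Hand out the next message and remember it until it is acknowledged.
        if not self.ready:
            return None
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = self.ready.popleft()
        return tag, self.unacked[tag]

    def ack(self, tag):
        # Once acknowledged, the broker forgets the message entirely.
        del self.unacked[tag]


log = LogBroker()
for message in ["a", "b", "c"]:
    log.append(message)

first = OffsetConsumer(log)            # starts at the beginning
late = OffsetConsumer(log, offset=2)   # starts at a chosen offset
print(first.poll())   # ['a', 'b', 'c']
print(late.poll())    # ['c']
print(len(log.log))   # 3, the log is unchanged by reads

queue = QueueBroker()
queue.publish("job-1")
tag, body = queue.deliver()
queue.ack(tag)
print(queue.deliver())  # None, the acknowledged message is gone
```

In the log model, two consumers read the same data independently because the position lives in the consumer. In the queue model, the broker owns the delivery state.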
Message handling
Kafka retains messages based on the retention time set on the topic, whether or not they have been consumed.
RabbitMQ removes messages as soon as they are consumed and acknowledged.
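The difference between time-based retention and removal on acknowledgement can be sketched as follows. Both classes are hypothetical toy models. Timestamps are passed in explicitly so the example is deterministic, and expiring individual records is a simplification of how a real broker cleans up its storage.

```python
class RetainedLog:
    """Keeps every record until it is older than the retention window."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.records = []  # list of (timestamp, message)

    def append(self, message, now):
        self.records.append((now, message))

    def expire(self, now):
        # Drop records older than the retention window, consumed or not.
        cutoff = now - self.retention_seconds
        self.records = [(ts, m) for ts, m in self.records if ts >= cutoff]

    def messages(self):
        return [m for _, m in self.records]


class AckQueue:
    """Keeps a message only until a consumer acknowledges it."""

    def __init__(self):
        self.pending = {}
        self.next_tag = 1

    def publish(self, message):
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = message
        return tag

    def ack(self, tag):
        # Acknowledgement is what removes the message.
        self.pending.pop(tag)


log = RetainedLog(retention_seconds=60)
log.append("a", now=0)
log.append("b", now=50)
log.expire(now=70)
print(log.messages())  # ['b'], only "a" has aged out

queue = AckQueue()
first = queue.publish("a")
queue.publish("b")
queue.ack(first)
print(list(queue.pending.values()))  # ['b'], "a" was removed on ack
```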
Volume
Kafka is designed to store and stream huge volumes of data with very little overhead.
RabbitMQ is fast when the queues hold a low volume of messages. It slows down as the volume goes up.
Performance
If you need both performance and volume, Kafka will be the best choice. You can easily get 100,000 messages per second, although the exact figure depends on hardware, message size and configuration.
RabbitMQ works best when the volume is not too high, and you will see it slow down as the volume grows. You can expect around 20,000 messages per second.
Scaling
Kafka can be scaled horizontally, by adding more machines.
RabbitMQ is mostly scaled vertically, by adding more power to existing machines.
Horizontal scaling makes Kafka suitable for Big Data applications.
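One way to picture horizontal scaling is to split the data into partitions by message key and spread those partitions over the available machines. The sketch below is only an illustration of that idea: `assign_partition` and `spread` are hypothetical helpers, and `zlib.crc32` stands in for whatever hash function a real client would use.

```python
import zlib


def assign_partition(key: str, num_partitions: int) -> int:
    # Stable hash of the key, so the same key always lands in the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def spread(keys, num_partitions):
    # Group keys by the partition they are assigned to.
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[assign_partition(key, num_partitions)].append(key)
    return partitions


keys = [f"user-{i}" for i in range(12)]
for partition, members in spread(keys, 3).items():
    print(partition, members)
```

With more partitions, the same keys are divided into smaller groups that can be served by more machines.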
Monitoring
No single tool can be called the best for Kafka monitoring. Confluent, Datadog and a few other vendors have developed monitoring tools for Kafka.
RabbitMQ has built-in tooling for monitoring and managing queues, connections, exchanges and user permissions.
Routing rules
Kafka takes a simple routing approach. Consumption is managed entirely by consumers, which can read messages from the beginning or from a certain offset.
RabbitMQ supports complex routing rules and keeps track of message states such as consumed and acknowledged.
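As one example of a routing rule, RabbitMQ's topic exchanges match a message's dot-separated routing key against binding patterns, where `*` stands for exactly one word and `#` for zero or more words. The following is a minimal sketch of that matching logic in plain Python, not RabbitMQ's own implementation. The function names and the `bindings` dictionary are made up for illustration.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a dot-separated routing key matches a binding pattern."""

    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            # '*' matches exactly one word; otherwise the words must be equal.
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


def route(bindings, routing_key):
    # bindings maps a queue name to its binding pattern.
    return sorted(q for q, pattern in bindings.items()
                  if topic_matches(pattern, routing_key))


bindings = {
    "all_orders": "orders.#",
    "eu_created": "orders.eu.created",
    "audit": "#",
}
print(route(bindings, "orders.eu.created"))  # ['all_orders', 'audit', 'eu_created']
print(route(bindings, "payments.failed"))    # ['audit']
```

Here the broker decides which queues receive a message, whereas a Kafka consumer simply chooses a topic and an offset to read from.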