Apache Kafka vs Apache Storm

Published by Big Data In Real World on February 15, 2021

Kafka

A distributed, durable, and reliable message broker that can handle a high volume of real-time messages coming from real-time producers.

Storage for real-time streaming data.

Kafka has evolved quite a bit in recent years with the addition of Kafka Streams, which provides stream computation abilities.

Kafka Connect offers plug-and-play connections to many real-time sources.

From an architecture standpoint, a Kafka cluster is made up of broker nodes and uses ZooKeeper for coordination tasks.
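To make the "storage for real-time streaming data" idea concrete, here is a minimal in-memory sketch of an append-only log with per-consumer offsets. It is a toy model only: `ToyTopic`, `produce`, and `poll` are hypothetical names chosen for illustration and are not the Kafka client API. A real Kafka topic is also partitioned, persisted to disk, and replicated across brokers, none of which is modeled here.

```python
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a single-partition topic: an append-only log."""

    def __init__(self):
        self._log = []                      # messages are only ever appended
        self._committed = defaultdict(int)  # consumer group -> next offset to read

    def produce(self, message):
        """Append a message and return the offset it was stored at."""
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, group, max_messages=10):
        """Return unread messages for a group and advance that group's offset."""
        start = self._committed[group]
        batch = self._log[start:start + max_messages]
        self._committed[group] = start + len(batch)
        return batch


topic = ToyTopic()
for reading in ["temp=21", "temp=22", "temp=23"]:
    topic.produce(reading)

# Two consumer groups read the same stored messages independently.
first = topic.poll("dashboard", max_messages=2)
second = topic.poll("archiver")
rest = topic.poll("dashboard")
print(first, second, rest)
```

The point of the sketch is that reading does not remove messages: each consumer group simply tracks its own position in the stored log, so a slow or newly added consumer can still read everything from the beginning.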

Storm

A scalable, fault-tolerant, real-time analytics system.

Computation on real-time streaming data.

In Storm, a spout is a source of real-time streams and a bolt performs some computation on the stream. A set of spouts and bolts is connected together to form a Storm topology, which is capable of performing complex real-time computation.
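The spout and bolt roles can be illustrated with a small word-count sketch built from plain Python generators. This is an analogy, not Storm code: the function names `sentence_spout`, `split_bolt`, and `count_bolt` are hypothetical, the input sentences are made up, and a real topology would run distributed across worker processes rather than in a single thread.

```python
from collections import Counter


def sentence_spout():
    """Spout: the source of the stream (a fixed list here, for illustration)."""
    for sentence in ["kafka stores streams", "storm computes on streams"]:
        yield sentence


def split_bolt(sentences):
    """Bolt: split each incoming sentence into individual words."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def count_bolt(words):
    """Bolt: keep a running count per word and emit (word, count) updates."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


# "Topology": spout -> split bolt -> count bolt, wired together.
topology = count_bolt(split_bolt(sentence_spout()))
final_counts = dict(topology)  # later updates for a word overwrite earlier ones
print(final_counts)
```

Each stage only sees the records emitted by the stage before it, which is the same shape a Storm topology has: data flows from spouts through a chain (or graph) of bolts.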

From an architecture standpoint, a Storm cluster is made up of a Nimbus (master) node and supervisor (worker) nodes, and uses ZooKeeper for coordination tasks.

Using Kafka and Storm together

The high-level architecture below is very common in real-world, real-time stream processing applications.

Real-time stream producer => Kafka => Storm => NoSQL or Files

The real-time stream producer emits streaming records, which are fed to Kafka. Kafka stores the real-time messages and can even enrich them with a few computations or by joining them with other streams.

Storm then picks up the messages from Kafka for more custom and elaborate computations by passing the data through Storm topologies.

Processed data can be sent to a NoSQL database or can be persisted in files.
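The flow above (producer => Kafka => Storm => NoSQL) can be sketched end to end with standard-library stand-ins. Everything here is an assumption made for illustration: the click events are invented, a `deque` plays the role of Kafka, a plain loop plays the role of a Storm topology, and a dictionary plays the role of a NoSQL table. Unlike Kafka, the `deque` does not retain messages once they are read.

```python
import json
from collections import deque


def producer():
    """Real-time stream producer (hypothetical click events)."""
    for user, page in [("ann", "/home"), ("bob", "/cart"), ("ann", "/cart")]:
        yield json.dumps({"user": user, "page": page})


def run_pipeline():
    buffer = deque()  # stand-in for Kafka: holds messages until they are processed
    store = {}        # stand-in for a NoSQL table keyed by user

    # Stage 1: producer => Kafka
    for record in producer():
        buffer.append(record)

    # Stage 2: Kafka => Storm (parse each message and count page views per user)
    while buffer:
        event = json.loads(buffer.popleft())
        # Stage 3: Storm => NoSQL (write the updated aggregate)
        store[event["user"]] = store.get(event["user"], 0) + 1

    return store


page_views = run_pipeline()
print(page_views)
```

The buffer in the middle is what decouples the two sides: the producer can keep writing at its own pace while the processing stage reads and computes at its own pace.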
