Apache Kafka vs Apache Storm - Big Data In Real World

Apache Kafka vs Apache Storm


Kafka 

A distributed, durable, and reliable message broker that can handle a high volume of real-time messages coming from real-time producers.

Storage for real-time streaming data.

Kafka has evolved quite a bit in recent years with the addition of Kafka Streams, which provides stream computation abilities.

Kafka Connect offers plug-and-play connections to many real-time sources.

From an architecture standpoint, a Kafka cluster is made up of broker nodes and uses ZooKeeper for coordination tasks.
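To make the "storage for real-time streaming data" idea more concrete, here is a minimal sketch in plain Python. It is a toy, in-memory model and not the Kafka API: the class `ToyLog`, its methods, and the consumer group names are all hypothetical. It assumes the usual mental model of a Kafka topic partition as an append-only log in which each consumer group tracks its own read offset. Durability and distribution across brokers are not modeled.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one Kafka topic partition (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._records = []                # append-only list of messages
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def append(self, message):
        """Store a message and return the offset it was written at."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return up to max_records unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = ToyLog()
for reading in ["t=21.5", "t=21.7", "t=22.0"]:
    log.append(reading)

# Reading does not delete anything, so two independent consumer groups
# can each work through the same stored messages at their own pace.
dashboard_batch = log.poll("dashboard")
archiver_batch = log.poll("archiver", max_records=2)
```

Because messages stay in the log after they are read, a slow consumer (here the hypothetical "archiver" group) can catch up later without affecting a faster one.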

Storm 

A scalable, fault-tolerant, real-time analytics system.

Computation on real-time streaming data.

In Storm, a spout is a source of real-time streams and a bolt performs some computation on the stream. A set of spouts and bolts connected together forms a Storm topology, which is capable of performing complex real-time computation.

From an architecture standpoint, a Storm cluster is made up of a Nimbus master node and supervisor (worker) nodes, and uses ZooKeeper for coordination tasks.
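The spout, bolt, and topology roles can be illustrated with a small sketch in plain Python. This does not use the Storm API. The function names (`sentence_spout`, `split_bolt`, `count_bolt`) are hypothetical, the input is a fixed list rather than a live stream, and everything runs in a single process, whereas a real topology would be distributed across the cluster.

```python
from collections import Counter


def sentence_spout():
    """Spout: the source of the stream. Here it yields a fixed list instead of live data."""
    for sentence in ["kafka stores streams", "storm computes on streams"]:
        yield sentence


def split_bolt(sentences):
    """Bolt: split each incoming sentence into individual words."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def count_bolt(words):
    """Bolt: keep a running count per word and emit (word, count) after each update."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


# Wiring the spout and the bolts together plays the role of the topology.
topology = count_bolt(split_bolt(sentence_spout()))

# Later (word, count) pairs overwrite earlier ones, leaving the final count per word.
final_counts = dict(topology)
```

Each stage only knows about the stream it receives, which is what lets a topology be rearranged or extended by adding bolts without touching the source.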

Using Kafka and Storm together

The high-level architecture below is very common in real-world, real-time stream processing applications.

Real-time stream producer => Kafka => Storm => NoSQL or Files

The real-time stream producer emits streaming records that are fed into Kafka, where the messages are stored and can even be enriched with a few computations or joined with other streams.

Storm then picks up the messages from Kafka and passes them through Storm topologies for more custom and elaborate computations.

Processed data can be sent to a NoSQL database or can be persisted in files.
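The flow described above can be sketched end to end in plain Python. This is only an illustration of how the stages hand data to each other, under loud assumptions: a `deque` stands in for Kafka, a function stands in for the Storm topology, and a `dict` stands in for the NoSQL store. All names and the sample click events are hypothetical, and no real Kafka or Storm client is used.

```python
from collections import deque


def produce(buffer):
    """Producer: push raw click events into the buffer (the Kafka stand-in)."""
    for user, page in [("alice", "/home"), ("bob", "/cart"), ("alice", "/cart")]:
        buffer.append({"user": user, "page": page})


def process(buffer, store):
    """Processing stage (the Storm stand-in): drain the buffer and count page views per user."""
    while buffer:
        # Unlike real Kafka, this toy buffer forgets a message once it is read.
        event = buffer.popleft()
        store[event["user"]] = store.get(event["user"], 0) + 1


kafka_stand_in = deque()  # decouples the producer from the processing stage
nosql_stand_in = {}       # a dict plays the role of the NoSQL sink

produce(kafka_stand_in)
process(kafka_stand_in, nosql_stand_in)
```

The point of the buffer in the middle is that the producer never calls the processing stage directly, so either side can be slowed down, restarted, or replaced without changing the other.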

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.
