What is consumer offset and the purpose of consumer offset in Kafka? - Big Data In Real World


There is a lot of confusion about consumer offsets: what they are for and how to change them. The goal of this post is to give you a clear understanding of what a consumer offset is, how it is managed and how it can be changed.

What is consumer offset?

A consumer offset is used to track the messages that have been consumed by the consumers in a consumer group. A topic can be consumed by many consumer groups, and each consumer group can have many consumers. A topic is divided into multiple partitions.

Within a consumer group, each partition is assigned to exactly one consumer. A single consumer, however, can be assigned multiple partitions.

Consumer offsets are managed at the partition level, per consumer group. In summary:

  • A topic can have many partitions
  • A topic can be consumed by many consumer groups
  • Within a consumer group, only one consumer can be assigned to consume messages from a given partition
  • A consumer offset is managed at the partition level per consumer group
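The relationships in this summary can be sketched as a small Python model. This is a toy illustration only, not Kafka client code: the round-robin `assign_partitions` helper, the group names and the consumer names are all assumptions made for the example, and real Kafka chooses assignments through its own configurable assignors.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Toy round-robin assignment: every partition gets exactly one consumer."""
    assignment = defaultdict(list)
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


# Hypothetical topic with three partitions and two consumer groups.
partitions = [0, 1, 2]
groups = {
    "group-1": ["consumer-1", "consumer-2"],
    "group-2": ["consumer-1", "consumer-2"],
}

# Each group gets its own, independent assignment of the same partitions.
assignments = {
    group: assign_partitions(partitions, members)
    for group, members in groups.items()
}

# One offset is tracked per (group, partition) pair, all starting at 0 here.
offsets = {(group, p): 0 for group in groups for p in partitions}

for group, assignment in assignments.items():
    print(group, assignment)
print("offsets tracked:", len(offsets))
```

With two groups and three partitions the model tracks six offsets, one for every group and partition combination, and a consumer may own more than one partition while no partition has two owners inside the same group.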

What is the purpose of consumer offset?

Let’s say a topic has 2 partitions, partition-1 holds 100 messages, and the topic is consumed by 2 consumer groups.

Consumer group 1 – has 2 consumers and partition-1 is consumed by consumer-1

Consumer group 2 – has 2 consumers and partition-1 is consumed by consumer-2

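If the last message read by consumer-1 in consumer group 1 from partition-1 has offset 9, then consumer-1 has read 10 messages from partition-1 (offsets start from 0).

If the last message read by consumer-2 in consumer group 2 from partition-1 has offset 50, then consumer-2 has read 51 messages from partition-1 (offsets start from 0).

Note that the value Kafka stores when a consumer commits is the position of the next message to read, that is, the last processed offset plus 1 (10 and 51 in these two examples).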
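The arithmetic in these two examples can be checked with a short Python sketch. It treats each offset as that of the last message read; the `next_offset` field shows the value a Kafka consumer would commit, which is the position of the next message (last read plus 1). The `summarize` helper and the dictionary layout are assumptions made for illustration.

```python
def summarize(last_read_offset, partition_size):
    """Derive counts from the offset of the last message read (offsets start at 0)."""
    read = last_read_offset + 1
    return {
        "messages_read": read,
        # Position of the next message to read, which is what gets committed.
        "next_offset": last_read_offset + 1,
        "remaining": partition_size - read,
    }


PARTITION_SIZE = 100  # partition-1 holds 100 messages in the example

# Offset of the last message each group has read from partition-1.
last_read = {"group-1": 9, "group-2": 50}

summary = {
    group: summarize(offset, PARTITION_SIZE)
    for group, offset in last_read.items()
}

for group, numbers in summary.items():
    print(group, numbers)
```

The two groups progress independently through the same partition: one still has 90 messages left to read while the other has 49.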

The consumer offset is recorded in Kafka. If the consumer processing a partition goes down, then when it comes back it reads the recorded offset and resumes reading from where it left off. This avoids duplication in message consumption.

Every consumer in the consumer group can read the consumer offsets for the partitions it is responsible for, so a consumer that takes over a partition does not consume messages that the group has already consumed.
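The resume behaviour can be illustrated with a minimal in-memory simulation. It is a sketch under stated assumptions, not the Kafka client API: the `consume` function, the `committed` dictionary and the message names are hypothetical, and the commit is modelled as happening together with the read.

```python
def consume(group, partition, log, committed, max_messages):
    """Read up to max_messages starting at the group's committed offset."""
    start = committed.get((group, partition), 0)
    batch = log[start:start + max_messages]
    # Record the position of the next message to read for this group.
    committed[(group, partition)] = start + len(batch)
    return batch


# A partition holding ten messages, m0 through m9.
log = [f"m{i}" for i in range(10)]
committed = {}  # (group, partition) -> next offset to read

# A consumer in group-1 reads four messages, then goes down.
first_run = consume("group-1", 1, log, committed, max_messages=4)

# After a restart, the consumer picks up from the committed offset.
second_run = consume("group-1", 1, log, committed, max_messages=100)

# Another group has its own offset and starts from the beginning.
other_group = consume("group-2", 1, log, committed, max_messages=3)

print(first_run)
print(second_run)
print(other_group)
```

No message appears in both runs for group-1, because the second run starts at the committed offset. In this toy model reading and committing are a single step; a real consumer commits separately from processing, so a crash between the two can still cause some messages to be read again.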

Where is consumer offset stored?

Kafka versions older than 0.9 store offsets in ZooKeeper.

Kafka versions 0.9 and later store offsets on the Kafka brokers, in an internal topic named __consumer_offsets. This removes the dependency on ZooKeeper for offset storage. Kafka also handles the load more easily than ZooKeeper when a lot of consumers are committing offsets to the cluster.
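On the brokers, committed offsets live in an internal topic named `__consumer_offsets`, where each commit is keyed by consumer group, topic and partition. The Python sketch below models that idea as an append-only list of commits in which the latest entry per key wins. It is a simplified illustration: the topic name `orders`, the tuple layout and the `latest_offsets` helper are assumptions, and the real record format is not shown.

```python
def latest_offsets(commit_log):
    """Replay an append-only commit log; the most recent commit per key wins."""
    latest = {}
    for key, offset in commit_log:
        latest[key] = offset
    return latest


# Each entry: ((group, topic, partition), committed offset).
commit_log = [
    (("group-1", "orders", 1), 5),
    (("group-2", "orders", 1), 20),
    (("group-1", "orders", 1), 10),
    (("group-2", "orders", 1), 51),
]

current = latest_offsets(commit_log)

for key, offset in current.items():
    print(key, "->", offset)
```

Only the newest commit for each group and partition matters when a consumer resumes, which is why older entries for the same key can be discarded.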

