How to read multiple files into a single RDD or DataFrame in Spark?
May 10, 2021What is the difference between explode and posexplode functions in Hive?
May 14, 2021This is a very common use case. Consumer offset is used to track the messages that are consumed by consumers in a consumer group. A topic can be consumed by many consumer groups and each consumer group will have many consumers. A topic is divided into multiple partitions.
A consumer in a consumer group is assigned to a partition. Only one consumer is assigned to a partition. A consumer can be assigned to consume multiple partitions.
Consumer offset is managed at the partition level per consumer group.
Check our post titled “What is consumer offset and the purpose of consumer offset in Kafka?” to learn more about consumer offset
Why change consumer offset?
In some scenarios, consumers which consumed the messages from a Kafka partition could have resulted in errors and the consumption would have been incomplete. In such cases of consumption failures you may have a need to re-consume the messages which were previously consumed. In such instances you would have to reset the consumer offset to an earlier offset.
How to find the current consumer offset?
Use the kafka-consumer-groups along with the consumer group id followed by a describe.
kafka-consumer-groups.sh --bootstrap-server <kafkahost:port> --group <group_id> --describe
You will see 2 entries related to offsets – CURRENT-OFFSET and LOG-END-OFFSET for the partitions in the topic for that consumer group. CURRENT-OFFSET is the current offset for the partition in the consumer group.
How to change consumer offset?
Use the kafka-consumer-groups.sh to change or reset the offset. You would have to specify the topic, consumer group and use the –reset-offsets flag to change the offset.
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group my-group --reset-offsets --to-earliest --all-topics --execute
Consumer offset reset options
–reset-offsets has several options and you can pick the correct one based on your needs.
–shift-by <number-of-offsets>
Use shift-by to move the offset ahead or behind. It can take both +ve or -ve number.
Reset the offset by incrementing the current offset position by 10
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group my-group --reset-offsets --shift-by 10 --topic sales_topic --execute
Reset the offset by decrementing the current offset position by 10
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group my-group --reset-offsets --shift-by -10 --topic sales_topic --execute
–to-datetime <String>
Reset offsets to offset from datetime. Format: ‘YYYY-MM-DDTHH:mm:SS.sss’
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group my-group --reset-offsets --to-datetime 2020-11-01T00:00:00Z --topic sales_topic --execute
–to-earliest
Reset offsets to earliest (oldest) offset available in the topic.
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group my-group --reset-offsets --to-earliest --topic sales_topic --execute
–to-latest
Reset offsets to latest (recent) offset available in the topic.
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group my-group --reset-offsets --to-latest --topic sales_topic --execute
2 Comments
[…] + Read More […]
[…] How do you reset Kafka topic offset? […]