Can Kafka be used as a distributed work queue
I’m considering using Kafka as a distributed work queue from which multiple workers can retrieve tasks. My original design looks like this:
Work Producer ---> Kafka topic ---+--> worker 1
                                  +--> worker 2
                                  ...
                                  +--> worker n
The problem with this design is:
If a worker gets a task from the topic and immediately commits the offset, the task will not be reprocessed if the worker fails.
If a worker takes a task from the topic and commits the offset only when finished, another worker might also pick up the same task and work on it. If tasks are long-running, almost all workers could end up processing the same task, defeating the purpose of distributing the work.
I’m looking for a way to “mark” a task in the queue as “in progress” so that it doesn’t get used by anyone else, but doesn’t commit the offset (as it might fail and need to be reprocessed). Is it achievable?
If some worker takes a task from the topic and immediately commits the offset, then in case of failure the task may not be reprocessed.
In this case, I recommend setting the consumer configuration enable.auto.commit to false, which disables automatic offset commits, and committing the offset manually only after the task has been processed successfully. If the worker fails before committing, the task is redelivered.
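A minimal sketch of this pattern, assuming a kafka-python style consumer created with enable_auto_commit=False (the helper name process_tasks and the topic/group names in the comment are illustrative, not from the original post). The consumer is passed in as a parameter so the commit logic can be shown on its own:

```python
# Sketch: manual offset commits so a task is only marked done after processing.
# Assumes a kafka-python style consumer created with enable_auto_commit=False.

def process_tasks(consumer, handle):
    """Poll for task records and commit offsets only after successful processing."""
    records = consumer.poll(timeout_ms=1000)  # {TopicPartition: [messages]}
    for tp, messages in records.items():
        for msg in messages:
            # If handle() raises, we never reach commit(), so the offset stays
            # uncommitted and the task is redelivered after restart/rebalance.
            handle(msg)
    consumer.commit()  # commit only once the whole polled batch succeeded

# A real consumer would be created roughly like this (names illustrative):
# consumer = KafkaConsumer("tasks", group_id="workers", enable_auto_commit=False)
# process_tasks(consumer, my_task_handler)
```

The trade-off is at-least-once delivery: a crash between handle() and commit() means the same task is processed again, so task handlers should be idempotent.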
If some worker takes a task from the topic and commits the offset only on finish, then other workers may also take this task and process it. If the task is long-running, almost all workers will take the same task and process it, completely inhibiting the distributed nature.
You can handle this situation by designing your topic's partitioning and putting all consumers in the same consumer group. In Kafka, each partition is read by at most one consumer thread within a consumer group.
This means that as long as all of your consumers (or "workers") belong to the same consumer group, no two workers are ever assigned the same partition, so there will never be a situation where two workers read and process the same message. Note that the number of partitions caps your parallelism: with n partitions, at most n workers in the group are active at once.
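To make the exclusivity concrete, here is an illustrative simulation of round-robin partition assignment within one group (this is plain Python, not Kafka's actual assignor code; the worker names are hypothetical):

```python
# Illustrative simulation: assign each partition to exactly one consumer in a
# group, round-robin. Not Kafka's real assignor, just the guarantee it provides.

def assign_partitions(partitions, consumers):
    """Map each consumer to a disjoint list of partitions."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

assignment = assign_partitions(range(6), ["worker-1", "worker-2", "worker-3"])
# Each of the 6 partitions appears under exactly one worker, 2 per worker,
# so no two workers can ever consume the same message.
```

Because a task lands in exactly one partition and each partition belongs to exactly one group member, long-running tasks no longer get duplicated across workers.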