When do I need Queue.join()?
The Python 3 documentation shows an example of a worker thread using queues ( https://docs.python.org/3/library/queue.html):
def worker(): while True: item = q.get() if item is None: break do_work(item) q.task_done() q = queue. Queue() threads =  for i in range(num_worker_threads): t = threading. Thread(target=worker) t.start() threads.append(t) for item in source(): q.put(item) # block until all tasks are done q.join() # stop workers for i in range(num_worker_threads): q.put(None) for t in threads: t.join()
In this example, why
do you need q.join()? Do subsequent
q.put(None) and t.join() operations complete the same operation that blocks the main thread until the worker thread completes?
That’s how I understand this example.
Each worker loops indefinitely, always looking for something new from the queue. If the item it gets is
None, it breaks and returns control to main.
So, first we leave the program waiting queue empty. Each call
to q.task_done() marks a new project as completed. The code is stuck below, so we make sure each item is marked as complete.
# block until all tasks are done q.join()
Then, below, we add
None items with the same number of workers to the queue (so we make sure that each worker gets one.) )
for i in range(num_worker_threads): q.put(None)
Next, we join all threads. Because we assign a
None entry to each worker through the queue, they all break. We want to get stuck here before they all crash and regain control.
for t in threads: t.join()
In doing so, we help avoid orphaned processes by ensuring that every item in the queue is processed, interrupting every worker thread when the queue is empty, and shutting down each worker thread before we continue with our code.