How to cleanly terminate Producer-Consumer threads from Python's main thread?

I have a Producer and a Consumer thread (threading.Thread), which share a queue of type Queue.

Producer run:

while self.running:
    product = produced() ### I/O operations
    queue.put(product)

Consumer run:

while self.running or not queue.empty():
    product = queue.get()
    time.sleep(several_seconds) ###
    consume(product)

Now I need to terminate both threads from the main thread, with the requirement that the queue must be empty (all products consumed) before terminating.

Currently I'm using code like below to terminate these two threads:

main thread stop:

producer.running = False
producer.join()
consumer.running = False
consumer.join()

But I guess it's unsafe if there are more consumers.

In addition, I'm not sure whether the sleep yields the CPU to the producer so that it can produce more products. In fact, I find that the producer keeps "starving", but I'm not sure whether this is the root cause.

Is there a decent way to deal with this case?


Edit 2:

a) Your consumers take so much time because your loop runs continuously even when you have no data.

b) I added code at the bottom that shows how to handle this.

If I understood you correctly, the producer/consumer is a continuous process, i.e. it is acceptable to delay the shutdown until you exit the current blocking I/O and process the data you received from that.

In that case, to shut down your producer and consumer in an orderly fashion, I would add communication from the main thread to the producer thread to invoke a shutdown. In the most general case, this could be a queue that the main thread can use to queue a "shutdown" code, but in the simple case of a single producer that is to be stopped and never restarted, it could simply be a global shutdown flag.

Your producer should check this shutdown condition (queue or flag) in its main loop right before it would start a blocking I/O operation (e.g. after you have finished sending other data to the consumer queue). If the flag is set, then it should put a special end-of-data code (that does not look like your normal data) on the queue to tell the consumer that a shut down is occurring, and then the producer should return (terminate itself).

The consumer should be modified to check for this end-of-data code whenever it pulls data out of the queue. If the end-of-data code is found, it should do an orderly shutdown and return (terminating itself).
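The general case described above, where the main thread queues a "shutdown" code for the producer, might look roughly like the sketch below. It assumes Python 3's queue module; the names SHUTDOWN, END_OF_DATA, control and run_control_demo are made up for illustration, and the short sleep merely stands in for the blocking I/O.

```python
import queue
import threading
import time

END_OF_DATA = None     # assumed end-of-data code; must never occur as real data
SHUTDOWN = "shutdown"  # hypothetical command sent by the main thread


def producer(control, data, produced):
    n = 0
    while True:
        # Check for a shutdown request right before the next blocking operation.
        try:
            command = control.get_nowait()
        except queue.Empty:
            command = None
        if command == SHUTDOWN:
            data.put(END_OF_DATA)  # tell the consumer that no more data will come
            return
        time.sleep(0.005)          # stand-in for the blocking I/O
        data.put(n)
        produced.append(n)
        n += 1


def consumer(data, consumed):
    while True:
        item = data.get()          # blocks until something is available
        if item is END_OF_DATA:
            return                 # orderly shutdown
        consumed.append(item)


def run_control_demo(run_for=0.1):
    control, data = queue.Queue(), queue.Queue()
    produced, consumed = [], []
    p = threading.Thread(target=producer, args=(control, data, produced))
    c = threading.Thread(target=consumer, args=(data, consumed))
    p.start()
    c.start()
    time.sleep(run_for)
    control.put(SHUTDOWN)          # main thread requests the shutdown
    p.join()
    c.join()
    return produced, consumed, data.empty()
```

Because the end-of-data code is queued behind all real products, the consumer only stops after everything produced has been consumed.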

If there are multiple consumers, then the producer could queue multiple end-of-data messages -- one for each consumer -- before it shuts down. Since the consumers stop consuming after they read the message, they will all eventually shut down.

Alternatively, if you do not know up-front how many consumers there are, then part of the orderly shut down of the consumer could be re-queueing the end-of-data code.

This will ensure that all consumers eventually see the end-of-data code and shut down, and when all are done, there will be one remaining item in the queue -- the end-of-data code queued by the last consumer.
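A minimal sketch of the re-queueing variant, assuming Python 3's queue module and None as the end-of-data code. The doubling is only a placeholder for real work, and run_requeue_demo is a hypothetical helper name.

```python
import queue
import threading

SENTINEL = None  # assumed end-of-data code; must never occur as real data


def consumer(q, results):
    while True:
        item = q.get()
        if item is SENTINEL:
            q.put(SENTINEL)       # pass the code on so the next consumer also stops
            break
        results.append(item * 2)  # placeholder for the real work


def run_requeue_demo(num_consumers=3, num_items=10):
    q = queue.Queue()
    results = []
    workers = [threading.Thread(target=consumer, args=(q, results))
               for _ in range(num_consumers)]
    for w in workers:
        w.start()
    for i in range(num_items):
        q.put(i)
    q.put(SENTINEL)               # one code is enough, whatever the consumer count
    for w in workers:
        w.join()
    # Exactly one item, the re-queued code, should be left in the queue.
    return sorted(results), q.qsize()
```

The producer side does not need to know the number of consumers here; the price is the single leftover item in the queue after shutdown.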

EDIT:

The correct way to represent your end-of-data code is highly application dependent, but in many cases a simple None works very well. Since None is a singleton, the consumer can use the very efficient if data is None construct to deal with the end case.

Another possibility that can be even more efficient in some cases is to set up a try/except outside your main consumer loop, arranged so that the exception is raised only when you try to unpack the end-of-data code; unpacking always works for normal data.
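As a rough illustration of the try/except idea, assume every normal item is a (key, value) tuple and the end-of-data code is None. Unpacking None raises TypeError, which ends the loop. The function name consume_pairs and the summing are invented for this sketch.

```python
import queue


def consume_pairs(q):
    """Sum values per key until the end-of-data code (None) arrives."""
    totals = {}
    try:
        while True:
            key, value = q.get()  # unpacking None raises TypeError
            totals[key] = totals.get(key, 0) + value
    except TypeError:
        pass                      # reached the end-of-data code
    return totals
```

Note that a TypeError raised by the processing itself would also end the loop silently, so this only fits when the loop body cannot raise that exception for normal data.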

EDIT 2:

Combining these concepts with your initial code, now the producer does this:

while self.running:
    product = produced() ### I/O operations
    queue.put(product)
for x in range(number_of_consumers):
    queue.put(None)  # Termination code

Each consumer does this:

while True:
    product = queue.get()
    if product is None:
        break
    consume(product)

The main program can then just do this:

producer.running = False
producer.join()
for consumer in consumers:
    consumer.join()
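Put together as something runnable, the three pieces could look like the sketch below. It assumes Python 3 (queue instead of Queue); the counter and the short sleep stand in for produced() and its I/O, appending to a list stands in for consume(), and run_pipeline is a hypothetical helper.

```python
import queue
import threading
import time


class Producer(threading.Thread):
    def __init__(self, q, number_of_consumers):
        super().__init__()
        self.q = q
        self.number_of_consumers = number_of_consumers
        self.running = True
        self.produced = []

    def run(self):
        n = 0
        while self.running:
            time.sleep(0.01)      # stand-in for the blocking I/O in produced()
            self.q.put(n)
            self.produced.append(n)
            n += 1
        for _ in range(self.number_of_consumers):
            self.q.put(None)      # termination code, one per consumer


class Consumer(threading.Thread):
    def __init__(self, q):
        super().__init__()
        self.q = q
        self.consumed = []

    def run(self):
        while True:
            product = self.q.get()  # blocks while the queue is empty
            if product is None:
                break
            self.consumed.append(product)  # stand-in for consume(product)


def run_pipeline(number_of_consumers=2, run_for=0.2):
    q = queue.Queue()
    consumers = [Consumer(q) for _ in range(number_of_consumers)]
    producer = Producer(q, number_of_consumers)
    for c in consumers:
        c.start()
    producer.start()
    time.sleep(run_for)
    producer.running = False      # ask the producer to stop
    producer.join()
    for c in consumers:
        c.join()
    consumed = sorted(x for c in consumers for x in c.consumed)
    return producer.produced, consumed, q.empty()
```

After the joins, every product has been consumed and the queue is empty, because each consumer removes exactly one termination code.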