


Re: multiprocessing, pool, queue length

From Michael Welle <mwe012008@gmx.net>
Newsgroups comp.lang.python
Subject Re: multiprocessing, pool, queue length
Date 2016-03-21 20:46 +0100
Organization MB-NET.NET for Open-News-Network e.V.
Message-ID <olv5scxgnt.ln2@news.c0t0d0s0.de>
References <0qu4scx89u.ln2@news.c0t0d0s0.de> <mailman.456.1458587577.12893.python-list@python.org>



Hello,

Ian Kelly <ian.g.kelly@gmail.com> writes:

> On Mon, Mar 21, 2016 at 4:25 AM, Michael Welle <mwe012008@gmx.net> wrote:
>> Hello,
>>
>> I use a multiprocessing pool. My producer calls pool.map_async()
>> to fill the pool's job queue. It can do that quite fast, while the
>> consumer processes need much more time to empty the job queue. Since the
>> producer can create a lot of jobs, I thought about asking the pool for
>> the number of jobs it has in its queue and then only produce more jobs
>> if the current value is below a threshold. It seems like the pool
>> doesn't want to tell me the level of the queue, does it? What is a
>> better strategy to solve this problem? Implementing a pool around
>> multiprocessing's Process and Queue?
>
> A simple solution would be to have a shared multiprocessing.Value that
> tracks how many items are in the pool. Whenever the producer produces
> items it increments the Value, and whenever a consumer finishes a job
> it decrements the Value.
I thought about that, but it doesn't feel 'right'.


> An alternative solution that doesn't require adding a small amount of
> work to every job would be to have the producer add a sentinel task
> that does nothing at or near the end of the batch, and either wait on
> the result or check it periodically. When it's done, then the pool is
> low enough to add more jobs.
Does 'wait on the result' mean setting a multiprocessing.Event when one
of the consumers picks up the sentinel task, and waiting for that event
in the producer? Hmm, that might be better than incrementing a counter.
But still, it couples the consumers and the producer more tightly than
I would like.
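
A minimal sketch of the sentinel idea as I read it, without an explicit
Event: the AsyncResult returned by apply_async() can itself be waited
on. The names (_work, _sentinel, submit_batches) are made up, and the
sketch assumes the pool hands out jobs in submission order, so that a
finished sentinel means the jobs submitted before it have at least been
picked up by a worker.

```python
import multiprocessing


def _work(x):
    """Hypothetical job."""
    return x + 1


def _sentinel():
    """Does nothing; its completion marks the end of a batch."""
    return None


def submit_batches(pool, batches):
    """Submit one batch at a time, waiting on a sentinel after each batch."""
    results = []
    for batch in batches:
        results.extend(pool.apply_async(_work, (item,)) for item in batch)
        marker = pool.apply_async(_sentinel)
        marker.wait()  # returns once a worker has run the sentinel
    return [r.get() for r in results]


if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        print(submit_batches(pool, [[1, 2, 3], [4, 5, 6]]))
```

Instead of blocking in marker.wait(), the producer could poll
marker.ready() from time to time and do other things in between.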

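Another idea I had is to use map() instead of map_async() and to put
the producer in its own process. Since map() blocks until the whole
batch has been processed, the producer could never get more than one
batch ahead of the consumers. That should work if job creation is fast.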
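
Roughly like the following sketch. The names (_work, producer,
make_pool) are made up for illustration. It assumes that the producer
process creates and owns the pool, since a Pool object cannot be handed
over to another process, and that results travel back to the main
process through a multiprocessing.Queue.

```python
import multiprocessing


def _work(x):
    """Hypothetical job."""
    return x * 2


def producer(batches, out_queue, make_pool=multiprocessing.Pool):
    """Feed the pool one batch at a time; map() blocks until it is done."""
    with make_pool(2) as pool:
        for batch in batches:
            out_queue.put(pool.map(_work, batch))
    out_queue.put(None)  # tells the reader that no more results follow


if __name__ == "__main__":
    results = multiprocessing.Queue()
    batches = [[1, 2, 3], [4, 5, 6]]
    proc = multiprocessing.Process(target=producer, args=(batches, results))
    proc.start()
    # Drain the queue until the end marker arrives, then join the producer.
    for batch_result in iter(results.get, None):
        print(batch_result)
    proc.join()
```

The batch size then takes over the role of the threshold: a larger
batch keeps the workers busy for longer, a smaller one keeps fewer jobs
queued at any time.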

Regards
hmw



Thread

multiprocessing, pool, queue length Michael Welle <mwe012008@gmx.net> - 2016-03-21 11:25 +0100
  Re: multiprocessing, pool, queue length Ian Kelly <ian.g.kelly@gmail.com> - 2016-03-21 13:12 -0600
    Re: multiprocessing, pool, queue length Michael Welle <mwe012008@gmx.net> - 2016-03-21 20:46 +0100
      Re: multiprocessing, pool, queue length Ian Kelly <ian.g.kelly@gmail.com> - 2016-03-21 15:24 -0600
        Re: multiprocessing, pool, queue length Michael Welle <mwe012008@gmx.net> - 2016-03-22 07:19 +0100
