Re: multiprocessing, queue

From Michael Welle <mwe012008@gmx.net>
Newsgroups comp.lang.python
Subject Re: multiprocessing, queue
Date 2015-05-08 16:31 +0200
Organization MB-NET.NET for Open-News-Network e.V.
Message-ID <autu1cxa2m.ln2@news.c0t0d0s0.de>
References <gheu1cx2pd.ln2@news.c0t0d0s0.de> <mailman.251.1431092090.12865.python-list@python.org>


Hello,

Chris Angelico <rosuav@gmail.com> writes:

> On Fri, May 8, 2015 at 8:08 PM, Michael Welle <mwe012008@gmx.net> wrote:
>> Hello,
>>
>> what's wrong with [0]? As num_tasks gets higher proc.join() seems to
>> block forever. First I thought the magical frontier is around 32k tasks,
>> but then it seemed to work with 40k tasks. Now I'm stuck around 7k
>> tasks. I think I do something fundamentally wrong, but I can't find it.
>>
>> Regards
>> hmw
>>
>> [0] http://pastebin.com/adfBYgY9
>
> Your code's small enough to include inline, so I'm doing that:
[...]
> First thing I'd look at is the default queue size. If your result
> queue fills up, all processes will block until something starts
> retrieving results.
I tried creating the queues with a size of 100k; that did not change the
behaviour.


> If you really want to have all your results stay
> in the queue like that, you may need to specify a huge queue size,
> which may cost you a lot of memory; much better would be to have each
> job post something on the result queue when it's done, and then you
> wait till they're all done:
>
> from multiprocessing import Process, Queue
>
> def foo(task_queue, result_queue):
>     while True:
>         n = task_queue.get()
>         if n is None: break
>         result_queue.put(1)
>     # Make sure None is not a possible actual result
>     # Otherwise, create an object() to use as a flag.
>     result_queue.put(None)
[...]
>     while num_procs:
>         result = results.get()
>         if result is None: num_procs -= 1
>         else: print('Result: {}'.format(result))
>
>     for proc in procs:
>         print("join")
>         proc.join()
>
> if __name__ == '__main__':
>     main()
>
>
> I've also made a few other changes (for instance, no need to subclass
> Process just to pass args), but the most important parts are a
> result_queue.put() just before the process ends,
So there is no inherent need for that last put() operation; it is only
there because of the way you evaluate the results in the while loop?


> and switching the
> order of the result-queue-pump and process-join loops.
That seems to decide whether my code blocks or not. Why do you do it that
way ;)? The Queue documentation contains the following:


| Warning
| 
| As mentioned above, if a child process has put items on a queue (and
| it has not used JoinableQueue.cancel_join_thread), then that process
| will not terminate until all buffered items have been flushed to the
| pipe. 
| 
| This means that if you try joining that process you may get a deadlock
| unless you are sure that all items which have been put on the queue
| have been consumed. Similarly, if the child process is non-daemonic
| then the parent process may hang on exit when it tries to join all its
| non-daemonic children. 
| 
| Note that a queue created using a manager does not have this
| issue. See Programming guidelines. 

I guess that's what's biting me.
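
For illustration, here is a minimal sketch of what the warning implies:
drain the result queue first and only join afterwards. It is not the
pastebin code; the names worker and run, and the doubling "work", are made
up for the example. The "fork" start method is pinned as an assumption so
the snippet behaves the same however it is launched on a POSIX system; on
other platforms the default context and the __main__ guard would be needed
instead.

```python
import multiprocessing as mp

# Assumption: POSIX "fork" start method. Elsewhere, use the default
# context and keep everything behind the __main__ guard.
ctx = mp.get_context("fork")


def worker(task_queue, result_queue):
    """Consume tasks until None arrives, then report completion with None."""
    while True:
        n = task_queue.get()
        if n is None:
            break
        result_queue.put(n * 2)  # stand-in for real work
    # None marks "this worker is done"; it must never be a real result.
    result_queue.put(None)


def run(num_tasks, num_procs):
    tasks = ctx.Queue()
    results = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(tasks, results))
             for _ in range(num_procs)]
    for proc in procs:
        proc.start()

    for n in range(num_tasks):
        tasks.put(n)
    for _ in procs:
        tasks.put(None)  # one stop marker per worker

    # Drain first: a child cannot terminate while items it has put are
    # still buffered and unread.
    collected = []
    running = num_procs
    while running:
        result = results.get()
        if result is None:
            running -= 1
        else:
            collected.append(result)

    # Every worker's final None has been read, so its buffer is flushed
    # and join() can return.
    for proc in procs:
        proc.join()
    return collected


if __name__ == "__main__":
    print(len(run(20000, 4)))
```

Swapping the drain loop and the join loop in run() recreates the situation
the warning describes: with enough results the children sit waiting to
flush their buffers while the parent sits in join().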


[...]
> As a general rule, queues need to have both ends operating
> simultaneously, otherwise you're likely to have them blocking. In
> theory, your code should all work with ridiculously low queue sizes;
> the only cost will be concurrency (since you'd forever be waiting for
> the queue, so your tasks will all be taking turns). I tested this by
> changing the Queue() calls to Queue(1), and the code took about twice
> as long to complete. :)
;) I know. As you might guess, it's not a real-world example. It's just
to explore the multiprocessing module.
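
A hedged sketch of the "both ends operating simultaneously" rule with
Queue(1): a helper thread in the parent feeds tasks while the main thread
pumps results, so neither bounded queue can stall the other. The names
square_worker, feed and run_bounded are hypothetical, and using a feeder
thread is one possible arrangement, not necessarily what Chris tested.
As an assumption, the "fork" start method is pinned, so this is POSIX-only
as written.

```python
import multiprocessing as mp
import threading

# Assumption: POSIX "fork" start method, pinned for reproducibility.
ctx = mp.get_context("fork")


def square_worker(task_queue, result_queue):
    """Square each task; None on the task queue means stop."""
    while True:
        n = task_queue.get()
        if n is None:
            break
        result_queue.put(n * n)  # blocks while the single slot is taken
    result_queue.put(None)       # completion marker for the parent


def feed(task_queue, num_tasks, num_procs):
    """Runs in a parent thread so that put() may block without harm."""
    for n in range(num_tasks):
        task_queue.put(n)
    for _ in range(num_procs):
        task_queue.put(None)


def run_bounded(num_tasks, num_procs, maxsize=1):
    tasks = ctx.Queue(maxsize)
    results = ctx.Queue(maxsize)
    procs = [ctx.Process(target=square_worker, args=(tasks, results))
             for _ in range(num_procs)]
    for proc in procs:
        proc.start()  # start children before creating the helper thread

    feeder = threading.Thread(target=feed,
                              args=(tasks, num_tasks, num_procs))
    feeder.start()

    # The main thread keeps the result end moving while the feeder
    # thread keeps the task end moving.
    total = 0
    running = num_procs
    while running:
        result = results.get()
        if result is None:
            running -= 1
        else:
            total += result

    feeder.join()
    for proc in procs:
        proc.join()
    return total


if __name__ == "__main__":
    print(run_bounded(1000, 4))
```

If the parent instead put every task before reading any result, both
size-1 queues could fill up and everything would wait on everything else.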


Regards
hmw

-- 
biff4emacsen - A biff-like tool for (X)Emacs
http://www.c0t0d0s0.de/biff4emacsen/biff4emacsen.html
Flood - Your friendly network packet generator
http://www.c0t0d0s0.de/flood/flood.html
