
Re: parallel computations: subprocess.Popen(...).communicate()[0] does not work with multiprocessing.Pool

From Chris Torek <nospam@torek.net>
Newsgroups comp.lang.python
Subject Re: parallel computations: subprocess.Popen(...).communicate()[0] does not work with multiprocessing.Pool
Date 2011-06-12 22:00 +0000
Organization None of the Above
Message-ID <it3cuq01l04@news6.newsguy.com>
References <mailman.105.1307737402.11593.python-list@python.org>


In article <mailman.105.1307737402.11593.python-list@python.org>
Hseu-Ming Chen  <hseuming@gmail.com> wrote:
>I am having an issue when making a shell call from within a
>multiprocessing.Process().  Here is the story: i tried to parallelize
>the computations in 800-ish Matlab scripts and then save the results
>to MySQL.   The non-parallel/serial version has been running fine for
>about 2 years.  However, in the parallel version via multiprocessing
>that i'm working on, it appears that the Matlab scripts have never
>been kicked off and nothing happened with subprocess.Popen.  The debug
>printing below does not show up either.

I obviously do not have your code, and have not even tried this as
an experiment in a simplified environment, but:

>import subprocess
>from multiprocessing import Pool
>
>def worker(DBrow,config):
>   #  run one Matlab script
>   cmd1 = "/usr/local/bin/matlab  ...  myMatlab.1.m"
>   subprocess.Popen([cmd1], shell=True, stdout=subprocess.PIPE).communicate()[0]
>   print "this does not get printed"
 ...
># kick off parallel processing
>pool = Pool()
>for DBrow in DBrows: pool.apply_async(worker,(DBrow,config))
>pool.close()
>pool.join()

The multiprocessing code uses pipes to communicate between the
worker processes it creates.  Each Matlab subprocess inherits copies
of those pipe descriptors, and I suspect these "extra" copies are
what is interfering: pool.close() only stops new work from being
submitted, but pool.join() then waits on the workers, and it can get
stuck while Matlab still holds its copy of the pipes.  To make the
subprocess module close them, so that Matlab never has them in the
first place and pool.join() cannot get stuck there, add
"close_fds=True" to the Popen() call.
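
As a minimal sketch of that change (untested against Matlab; written
for Python 3, whereas the quoted code is Python 2), the call might
look like the following.  The helper name run_command is made up for
illustration, and the Python interpreter stands in for the Matlab
command line, which is elided in the original post.  On Python 3.2
and later, close_fds=True is already the default on POSIX systems, so
passing it explicitly mostly matters on older versions.

```python
import subprocess
import sys


def run_command(argv):
    """Run argv as a child process and return its stdout as bytes.

    run_command is a hypothetical helper, not part of the original post.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        close_fds=True,  # do not pass the parent's extra descriptors to the child
    )
    out, _ = proc.communicate()  # wait for the child and collect its output
    return out


if __name__ == "__main__":
    # Stand-in for the Matlab invocation: a trivial child process.
    print(run_command([sys.executable, "-c", "print('hello')"]))
```

Passing the command as an argument list, without shell=True, also
avoids putting an extra shell process between the worker and Matlab.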

There could still be issues with competing wait() and/or waitpid()
calls (assuming you are using a Unix-like system, or whatever the
equivalent is for Windows) "eating" the wrong subprocess completion
notifications, but that one is harder to solve in general :-)  So
if close_fds fixes things, it was just the pipes.  If it does not,
you will probably need to defer the pool.close() step until after
all the subprocesses complete.
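
One way to defer the pool.close() step, sketched under some
assumptions: keep the AsyncResult objects that apply_async() returns
and call get() on each before closing the pool.  The names worker and
run_all are hypothetical stand-ins for the real per-row job, the code
is Python 3, and the "fork" start method is requested explicitly on
the assumption of a Unix-like system (drop get_context to use the
platform default).  A side benefit is that get() re-raises any
exception raised inside a worker, which apply_async() alone would
otherwise swallow silently.

```python
import multiprocessing


def worker(n):
    # Hypothetical stand-in for "run one Matlab script and store the result".
    return n * n


def run_all(items):
    """Run worker() on every item and close the pool only after all finish."""
    # Assumption: a Unix-like system where the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(2)
    try:
        pending = [pool.apply_async(worker, (item,)) for item in items]
        # get() blocks until that task is done and re-raises worker exceptions.
        results = [p.get() for p in pending]
    finally:
        pool.close()  # no more work will be submitted
        pool.join()   # wait for the worker processes to exit
    return results


if __name__ == "__main__":
    print(run_all([1, 2, 3]))
```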
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html
