Re: Advice regarding multiprocessing module

Date Mon, 11 Mar 2013 07:57:00 -0700
From Abhinav M Kulkarni <amkulkar@uci.edu>
To Jean-Michel Pichavant <jeanmichel@sequans.com>
Subject Re: Advice regarding multiprocessing module
Cc python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.3188.1363013840.2939.python-list@python.org>

Hi Jean,

Below is the code where I am creating multiple processes:

import time
from multiprocessing import Process, Queue

# list_sgf_files, read_boards, Param, maxItr and TrainGoCRFIsingGibbs
# are defined elsewhere in the program.

if __name__ == '__main__':
    # List all files in the games directory
    files = list_sgf_files()

    # Read board configurations
    (intermediateBoards, finalizedBoards) = read_boards(files)

    # Initialize parameters
    param = Param()

    # Run maxItr iterations of gradient descent
    for itr in range(maxItr):
        # Each process analyzes a single data point and puts its
        # gradient calculation on queue q.
        # multiprocessing.Queue is process-safe.
        start_time = time.time()
        q = Queue()
        jobs = []
        # Create a process for each game board
        for i in range(len(files)):
            p = Process(target=TrainGoCRFIsingGibbs,
                        args=(intermediateBoards[i], finalizedBoards[i], param, q))
            p.start()
            jobs.append(p)
        # Blocking wait for each process to finish
        for p in jobs:
            p.join()
        elapsed_time = time.time() - start_time
        print 'Iteration: ', itr, '\tElapsed time: ', elapsed_time
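
For comparison, here is a minimal sketch of the same loop using a fixed pool of worker processes that is created once and reused for every iteration, rather than one new process per board per iteration. It is written for Python 3 (the code above is Python 2), and it makes several assumptions: compute_gradient is a hypothetical stand-in for TrainGoCRFIsingGibbs that returns its result instead of putting it on a queue, the boards are placeholder lists, param is a plain float, and the pool size of 4 and chunksize of 8 are illustrative values only. Whether this removes the slowdown depends on where the time is actually going.

```python
import multiprocessing as mp
import time


def compute_gradient(task):
    """Hypothetical stand-in for TrainGoCRFIsingGibbs.

    Returns a value instead of writing to a shared queue.
    """
    intermediate_board, finalized_board, param = task
    # Placeholder arithmetic; the real function would run Gibbs sampling here.
    return param * (sum(intermediate_board) - sum(finalized_board))


def build_tasks(intermediate_boards, finalized_boards, param):
    """Bundle the arguments for each board into one picklable tuple."""
    return [(a, b, param) for a, b in zip(intermediate_boards, finalized_boards)]


def run_iteration(pool, tasks):
    """Compute one gradient per board on the pool's existing workers."""
    # chunksize sends several boards per message to reduce inter-process traffic.
    return pool.map(compute_gradient, tasks, chunksize=8)


if __name__ == '__main__':
    # Placeholder data: 106 tiny "boards" (assumption, for illustration only).
    intermediate_boards = [[i, i + 1] for i in range(106)]
    finalized_boards = [[i, i] for i in range(106)]
    param = 0.5

    # The pool is created once, outside the gradient-descent loop.
    with mp.Pool(processes=4) as pool:
        for itr in range(3):
            start_time = time.time()
            tasks = build_tasks(intermediate_boards, finalized_boards, param)
            gradients = run_iteration(pool, tasks)
            elapsed_time = time.time() - start_time
            print('Iteration:', itr, '\tElapsed time:', elapsed_time)
```

Because pool.map returns the results directly, no Queue is needed. That also sidesteps a pitfall the multiprocessing documentation warns about: joining a process that has put items on a queue can block until those items have been consumed.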

As you recommended, I'll use the profiler to see which part of the code 
is slow.
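
A minimal sketch of that profiling step, written for Python 3 with the standard cProfile and pstats modules. cProfile only measures the process it runs in, so the sketch profiles the per-board work in a single process rather than the parent that spawns workers. The function gibbs_sweeps is a hypothetical placeholder for the real per-board computation, and profile_call is a helper name introduced here for illustration.

```python
import cProfile
import io
import pstats


def gibbs_sweeps(n_sweeps, n_vars=81):
    """Hypothetical stand-in for the per-board work (81 variables per sweep)."""
    total = 0
    for _ in range(n_sweeps):
        for v in range(n_vars):
            total += v % 2
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show the ten most expensive entries.
    stats.sort_stats('cumulative').print_stats(10)
    return result, stream.getvalue()


if __name__ == '__main__':
    result, report = profile_call(gibbs_sweeps, 1000)
    print(report)
```

If the report shows most of the time inside the sampling loop itself, the cost is in the per-board computation rather than in process creation.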

Thanks,
Abhinav

On 03/11/2013 04:14 AM, Jean-Michel Pichavant wrote:
> ----- Original Message -----
>
>> Dear all,
>> I need some advice regarding use of the multiprocessing module.
>> Following is the scenario:
>> * I am running gradient descent to estimate parameters of a pairwise
>> grid CRF (or a grid-based graphical model). There are 106 data
>> points. Each data point can be analyzed in parallel.
>> * To calculate the gradient for each data point, I need to perform
>> approximate inference since this is a loopy model. I am using Gibbs
>> sampling.
>> * My grid is 9x9 so there are 81 variables that I am sampling in one
>> sweep of Gibbs sampling. I perform 1000 iterations of Gibbs
>> sampling.
>> * My laptop has a quad-core Intel i5 processor, so I thought that by
>> using the multiprocessing module I could parallelize my code
>> (basically calculate gradients in parallel on multiple cores).
>> * I did not use the threading library because of the GIL, which does
>> not allow multiple threads to execute Python bytecode at the same time.
>> * As a result I end up creating a process for each data point
>> (instead of a thread, which I would ideally prefer, so as to avoid
>> process creation overhead).
>> * I am using basic NumPy array functionality.
>> Previously I was running this code in MATLAB. It runs much faster
>> there: one iteration of gradient descent takes around 14 sec in MATLAB
>> using a parfor loop (a parallel loop; the data points are analyzed
>> within it). However, the same program takes almost 215 sec in Python.
>> I am quite amazed at the slowness of the multiprocessing module. Is
>> this because of process creation overhead for each data point?
>> Please keep my email in the replies as I am not a member of this
>> mailing list.
>> Thanks,
>> Abhinav
> Hi,
>
> Can you post some code, especially the part where you create and run the processes? If it's not too big, the process function as well.
>
> Either multiprocessing is slow, as you stated, or you did something wrong.
>
> Alternatively, if posting code is an issue, you can profile your Python code. It's very easy and effective at finding which part of the code is slowing everything down.
> http://docs.python.org/2/library/profile.html
>
> Cheers,
>
> JM
