Advice regarding multiprocessing module

Date 2013-03-10 22:57 -0700
From Abhinav M Kulkarni <amkulkar@uci.edu>
Subject Advice regarding multiprocessing module
References <513D6FEB.9040706@uci.edu>
Newsgroups comp.lang.python
Message-ID <mailman.3176.1362981490.2939.python-list@python.org> (permalink)

Dear all,

I need some advice regarding use of the multiprocessing module. 
Following is the scenario:

  * I am running gradient descent to estimate the parameters of a
    pairwise grid CRF (a grid-based graphical model). There are 106
    data points. Each data point can be analyzed in parallel.
  * To calculate the gradient for each data point, I need to perform
    approximate inference, since this is a loopy model. I am using Gibbs
    sampling.
  * My grid is 9x9 so there are 81 variables that I am sampling in one
    sweep of Gibbs sampling. I perform 1000 iterations of Gibbs sampling.
  * My laptop has a quad-core Intel i5 processor, so I thought I could
    use the multiprocessing module to parallelize my code (basically,
    calculate the gradients on multiple cores simultaneously).
  * I did not use the threading module because of the GIL: it prevents
    more than one thread from executing Python bytecode at a time.
  * As a result, I end up creating a process for each data point
    (instead of a thread, which I would ideally prefer in order to
    avoid process creation overhead).
  * I am using only basic NumPy array functionality.

Previously I was running this code in MATLAB, where it runs much faster: 
one iteration of gradient descent takes around 14 sec in MATLAB using a 
parfor loop (a parallel loop; the data points are analyzed inside it). 
However, the same program takes almost 215 sec in Python.

I am quite amazed at the slowness of multiprocessing module. Is this 
because of process creation overhead for each data point?
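
One way to test that hypothesis would be to reuse a small pool of worker
processes rather than starting one process per data point. Below is a
minimal sketch, not my actual code: `gradient_for_point`, `gradients_serial`
and `gradients_pooled` are hypothetical names, the per-point work is a toy
stand-in for the Gibbs sampler, and the standard `random` module is used in
place of NumPy so that the example is self-contained.

```python
import multiprocessing as mp
import random


def gradient_for_point(seed):
    """Hypothetical stand-in for the per-data-point gradient.

    The real version would run Gibbs sampling on the 9x9 grid; this one
    only averages random draws so the example needs nothing but stdlib.
    """
    rng = random.Random(seed)   # per-task RNG, so results are reproducible
    n_sweeps, n_vars = 100, 81  # toy sizes; the post uses 1000 sweeps
    total = 0.0
    for _ in range(n_sweeps):
        for _ in range(n_vars):
            total += rng.random()
    return total / (n_sweeps * n_vars)


def gradients_serial(seeds):
    # Baseline: every data point handled in the parent process.
    return [gradient_for_point(s) for s in seeds]


def gradients_pooled(seeds, n_workers=4):
    # The worker processes are created once and reused for all data
    # points, so process start-up is paid n_workers times, not per point.
    with mp.Pool(processes=n_workers) as pool:
        return pool.map(gradient_for_point, seeds)


if __name__ == "__main__":
    seeds = list(range(106))    # one task per data point
    grads = gradients_pooled(seeds)
    print(len(grads), sum(grads) / len(grads))
```

Timing `gradients_serial` against `gradients_pooled` on the real workload
should show whether process creation or the per-point computation itself
dominates the 215 sec.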

Please keep my email in the replies as I am not a member of this mailing 
list.

Thanks,
Abhinav


