| Started by | Abhinav M Kulkarni <amkulkar@uci.edu> |
|---|---|
| First post | 2013-03-10 22:57 -0700 |
| Last post | 2013-03-10 22:57 -0700 |
| Articles | 1 — 1 participant |
Advice regarding multiprocessing module Abhinav M Kulkarni <amkulkar@uci.edu> - 2013-03-10 22:57 -0700
| From | Abhinav M Kulkarni <amkulkar@uci.edu> |
|---|---|
| Date | 2013-03-10 22:57 -0700 |
| Subject | Advice regarding multiprocessing module |
| Message-ID | <mailman.3176.1362981490.2939.python-list@python.org> |
Dear all,
I need some advice regarding use of the multiprocessing module.
Following is the scenario:
* I am running gradient descent to estimate the parameters of a pairwise
grid CRF (or a grid-based graphical model). There are 106 data
points. Each data point can be analyzed in parallel.
* To calculate the gradient for each data point, I need to perform
approximate inference since this is a loopy model. I am using Gibbs
sampling.
* My grid is 9x9 so there are 81 variables that I am sampling in one
sweep of Gibbs sampling. I perform 1000 iterations of Gibbs sampling.
* My laptop has a quad-core Intel i5 processor, so I thought that by using
the multiprocessing module I could parallelize my code (basically,
calculate the gradient in parallel on multiple cores simultaneously).
* I did not use the threading library because of GIL issues: the GIL
does not allow multiple threads to run Python code at the same time.
* As a result, I end up creating a process for each data point (instead
of a thread, which I would ideally prefer, so as to avoid process
creation overhead).
* I am using basic NumPy array functionality.
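To make the per-data-point workload concrete, here is a minimal sketch of the kind of Gibbs sampler described above. It is not the actual code from this project: it assumes binary (+1/-1) variables, an Ising-type model with a single shared coupling weight, and no burn-in period, and the names `gibbs_sweep` and `estimate_marginals` are made up for illustration. It uses only the standard library so that it runs anywhere; a NumPy version would hold the grid in an array.

```python
import math
import random

GRID = 9          # 9x9 grid, i.e. 81 variables per sweep
N_SWEEPS = 1000   # Gibbs iterations per data point


def gibbs_sweep(x, unary, coupling, rng):
    """Resample each of the GRID*GRID sites once, in raster order, in place."""
    for i in range(GRID):
        for j in range(GRID):
            # Sum the states of the (up to four) grid neighbours.
            s = 0
            if i > 0:
                s += x[i - 1][j]
            if i < GRID - 1:
                s += x[i + 1][j]
            if j > 0:
                s += x[i][j - 1]
            if j < GRID - 1:
                s += x[i][j + 1]
            # Conditional P(x_ij = +1 | neighbours) under the assumed model.
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * (unary[i][j] + coupling * s)))
            x[i][j] = 1 if rng.random() < p_plus else -1


def estimate_marginals(unary, coupling, n_sweeps=N_SWEEPS, seed=0):
    """Estimate P(x_ij = +1) at every site by averaging over Gibbs sweeps."""
    rng = random.Random(seed)
    x = [[rng.choice((-1, 1)) for _ in range(GRID)] for _ in range(GRID)]
    counts = [[0] * GRID for _ in range(GRID)]
    for _ in range(n_sweeps):
        gibbs_sweep(x, unary, coupling, rng)
        for i in range(GRID):
            for j in range(GRID):
                if x[i][j] == 1:
                    counts[i][j] += 1
    return [[c / n_sweeps for c in row] for row in counts]
```

With these numbers, one data point costs 81 x 1000 = 81,000 single-site updates, each executed as interpreted Python in the inner loop.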
Previously I was running this code in MATLAB. It runs much faster there: one
iteration of gradient descent takes around 14 sec in MATLAB using a parfor
loop (a parallel loop; the data points are analyzed within the parallel loop).
However, the same program takes almost 215 sec in Python.
I am quite amazed at the slowness of the multiprocessing module. Is this
because of process creation overhead for each data point?
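One way to check whether process creation is the bottleneck is to compare a single serial run against a fixed pool of workers that is reused for all data points, instead of one process per data point. The sketch below is only an illustration under assumptions: `gradient_for_point` is a hypothetical stand-in for the real per-point gradient computation, and `total_gradient` is a made-up helper name. `multiprocessing.Pool` and its `map` method with the `chunksize` argument are the standard library calls.

```python
import math
import multiprocessing as mp


def gradient_for_point(point):
    """Hypothetical stand-in for the real per-data-point gradient."""
    return sum(math.sin(point * k) for k in range(1000))


def total_gradient(points, workers=None):
    """Sum per-point gradients using a fixed pool of worker processes.

    workers=1 runs serially in the current process, which gives a baseline
    for timing. workers=None lets Pool use one worker per CPU.
    """
    if workers == 1:
        return sum(map(gradient_for_point, points))
    n_workers = workers or mp.cpu_count()
    # Batch several points per task so that fewer messages cross
    # process boundaries.
    chunk = max(1, len(points) // (4 * n_workers))
    with mp.Pool(processes=n_workers) as pool:
        return sum(pool.map(gradient_for_point, points, chunksize=chunk))
```

Timing `total_gradient(points, workers=1)` against `total_gradient(points, workers=4)` should indicate how much of the 215 sec is process overhead and how much is the computation itself. On platforms that start workers by spawning a fresh interpreter, the parallel call needs to sit under an `if __name__ == "__main__":` guard.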
Please keep my email in the replies as I am not a member of this mailing
list.
Thanks,
Abhinav