


Multiprocessing pool with custom process class

Date Tue, 17 Dec 2013 09:09:34 -0800
Subject Multiprocessing pool with custom process class
From Sergey Fedorov <sergey.y.fedorov@gmail.com>
To python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.4307.1387310259.18130.python-list@python.org>


Hi All,

I have a web service that needs to handle a bunch of work requests. Each
job involves IO calls (DB, external web services to fetch some data), so
part of the time is spent on blocking IO. On the other hand, after getting
the data, the job has a computational part (using numpy/pandas on time
series dataframes).
The service runs on a multicore machine, so I want to use as much
parallelism as possible (especially considering Python's GIL), and because
of the significant amount of IO, I want to use multiple threads inside each
process so that none of the CPUs will stall due to IO delays.

The best scenario would be to use a pool of processes, each with its own
thread pool (because each worker will need to keep some state, like DB
connections). I already have my own thread pool implementation that uses
some load-balancing and fair-scheduling techniques specific to my problem
domain.

I'm curious if there is any process pool implementation that I missed and
which I can reuse. As it turned out, the one in the multiprocessing module
doesn't support a custom Process class (if it did, I would be able to derive
from it and add the functionality I need) (
http://stackoverflow.com/questions/740844/python-multiprocessing-pool-of-custom-processes).
Is there any alternative module that I can reuse?

If not, what's the best way to notify the caller that a task has finished
its execution (i.e. the behavior of multiprocessing.Pool's apply()
function)? Which primitives are best suited for that purpose (in case I have
to go with my own implementation of a multiprocessing pool)? Any reference
to a good blog/educational resource would be highly appreciated!
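
For reference, one possible shape of such a notification mechanism is
sketched below: workers push (task id, status, payload) tuples onto a result
queue, and a dispatcher thread in the parent resolves a Future per task, so
the caller can block on it or attach a callback. This is an assumption-laden
illustration, not a finished design: TinyPool and worker_loop are
hypothetical names, concurrent.futures.Future is constructed directly (the
documentation reserves that for executor-like code and tests), and there is
no handling of crashed workers or unpicklable results.

```python
import itertools
import multiprocessing as mp
import threading
from concurrent.futures import Future


def worker_loop(tasks, results):
    """Runs in each worker: execute tasks until a None sentinel arrives."""
    for task_id, func, args in iter(tasks.get, None):
        try:
            results.put((task_id, True, func(*args)))
        except Exception as exc:  # report the failure back to the caller
            results.put((task_id, False, exc))


class TinyPool:
    """Hypothetical minimal pool; tasks and results must be picklable."""

    def __init__(self, processes=2):
        self._tasks = mp.Queue()
        self._results = mp.Queue()
        self._pending = {}  # task id -> Future awaiting a result
        self._lock = threading.Lock()
        self._ids = itertools.count()
        self._workers = [
            mp.Process(target=worker_loop, args=(self._tasks, self._results))
            for _ in range(processes)
        ]
        for worker in self._workers:
            worker.start()
        self._dispatcher = threading.Thread(target=self._dispatch, daemon=True)
        self._dispatcher.start()

    def _dispatch(self):
        """Parent-side thread: turn queued results into resolved Futures."""
        for task_id, ok, payload in iter(self._results.get, None):
            with self._lock:
                future = self._pending.pop(task_id)
            if ok:
                future.set_result(payload)
            else:
                future.set_exception(payload)

    def submit(self, func, *args):
        """Queue a task and return a Future the caller can wait on."""
        future = Future()
        task_id = next(self._ids)
        with self._lock:
            self._pending[task_id] = future
        self._tasks.put((task_id, func, args))
        return future

    def apply(self, func, *args):
        """Blocking call, similar in spirit to multiprocessing.Pool.apply()."""
        return self.submit(func, *args).result()

    def close(self):
        """Stop the workers, then stop the dispatcher thread."""
        for _ in self._workers:
            self._tasks.put(None)
        for worker in self._workers:
            worker.join()
        self._results.put(None)
        self._dispatcher.join()
```

With this sketch, pool.apply(pow, 2, 5) blocks until a worker returns 32,
while pool.submit(...).add_done_callback(...) gives a non-blocking
notification. The pool should be created under an
if __name__ == "__main__": guard so that it also works with start methods
other than fork.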

If you believe that my solution is not optimal and have a better/easier
solution (I hope I described my problem well enough), please share your
thoughts.

Thanks in advance!
