Re: client-server parallellised number crunching

References <8ikj88-bs1.ln1@svn.schaathun.net>
Date 2011-04-26 13:31 -0700
Subject Re: client-server parallellised number crunching
From Dan Stromberg <drsalists@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.873.1303849865.9059.python-list@python.org>


On Tue, Apr 26, 2011 at 12:55 PM, Hans Georg Schaathun
<georg@schaathun.net> wrote:
> I wonder if anyone has any experience with this ...
>
> I am trying to set up a simple client-server system to do some number
> crunching, using a simple ad hoc protocol over TCP/IP.  I use
> two Queue objects on the server side to manage the input and the output
> of the client process.  A basic system running seemingly fine on a single
> quad-core box was surprisingly simple to set up, and it seems to give
> me a reasonable speed-up of a factor of around 3-3.5 using four client
> processes in addition to the master process.  (If anyone wants more
> details, please ask.)
>
> Now, I would like to use remote hosts as well, more precisely, student
> lab boxen which are rather unreliable.  From experience I'd expect to
> lose roughly 4-5 jobs in 100 CPU hours on average.  Thus I need some
> way of detecting lost connections and requeuing unfinished tasks,
> avoiding any serious delays in this detection.  What is the best way to
> do this in Python?
>
> It is, of course, possible for the master thread, upon processing the
> results, to requeue the tasks for any missing results, but it seems
> to me that it would be a cleaner solution if I could detect disconnects
> and requeue the tasks from the networking threads.  Is that possible
> using Python sockets?
>
> Somebody will probably ask why I am not using one of the multiprocessing
> libraries.  I have tried at least two, and got trapped by the overhead
> of passing complex pickled objects across.  Doing it myself has at least
> helped me clarify what can be parallelised effectively.  Now,
> understanding the parallelisable subproblems better, I could try again,
> if I can trust that these libraries can robustly handle lost clients.
> I don't know whether I can.

You probably should assign a unique identifier to each piece of work,
and implement two timeouts - one on your socket, using select or poll
or similar, and one for the pieces of work based on the identifier.

http://gengnosis.blogspot.com/2007/01/level-triggered-and-edge-triggered.html
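
A rough sketch of the two timeouts suggested above, under some assumptions: the names WorkTracker and recv_with_timeout are made up for illustration and are not from the original poster's code, and the wire protocol is left out entirely. The first helper waits on a socket with select so a silent client does not block the server forever; the tracker gives each piece of work a unique identifier and a deadline, and puts overdue work back on the queue.

```python
import select
import socket
import time
from collections import deque


class WorkTracker:
    """Hypothetical helper: tracks work by unique id and requeues overdue items."""

    def __init__(self, work_timeout):
        self.work_timeout = work_timeout   # seconds allowed per piece of work
        self.pending = deque()             # (work_id, payload) not yet handed out
        self.in_flight = {}                # work_id -> (payload, deadline)
        self.next_id = 0

    def add(self, payload):
        """Queue a new piece of work and return its unique identifier."""
        work_id = self.next_id
        self.next_id += 1
        self.pending.append((work_id, payload))
        return work_id

    def checkout(self, now=None):
        """Hand out the next piece of work and start its deadline clock."""
        now = time.monotonic() if now is None else now
        work_id, payload = self.pending.popleft()
        self.in_flight[work_id] = (payload, now + self.work_timeout)
        return work_id, payload

    def complete(self, work_id):
        """Record a result; returns False for unknown or duplicate ids."""
        return self.in_flight.pop(work_id, None) is not None

    def requeue_expired(self, now=None):
        """Move every overdue item back to the pending queue; return their ids."""
        now = time.monotonic() if now is None else now
        expired = [wid for wid, (_, deadline) in self.in_flight.items()
                   if deadline <= now]
        for wid in expired:
            payload, _ = self.in_flight.pop(wid)
            self.pending.append((wid, payload))
        return expired


def recv_with_timeout(sock, nbytes, timeout):
    """Socket-level timeout using select.

    Returns the received bytes, b'' if the peer closed the connection,
    or None if nothing arrived within `timeout` seconds.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(nbytes)
```

A networking thread could call recv_with_timeout in its loop and treat b'' (or a socket error) as a lost client, while the master periodically calls requeue_expired() to catch hosts that hang without closing the connection. Because a slow client may still deliver a result after its work was requeued, complete() returning False is the signal to discard a duplicate.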
