| References | <8ikj88-bs1.ln1@svn.schaathun.net> |
|---|---|
| Date | 2011-04-26 13:31 -0700 |
| Subject | Re: client-server parallellised number crunching |
| From | Dan Stromberg <drsalists@gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.873.1303849865.9059.python-list@python.org> |
On Tue, Apr 26, 2011 at 12:55 PM, Hans Georg Schaathun <georg@schaathun.net> wrote:
> I wonder if anyone has any experience with this ...
>
> I try to set up a simple client-server system to do some number
> crunching, using a simple ad hoc protocol over TCP/IP. I use
> two Queue objects on the server side to manage the input and the output
> of the client process. A basic system running seemingly fine on a single
> quad-core box was surprisingly simple to set up, and it seems to give
> me a reasonable speed-up of a factor of around 3-3.5 using four client
> processes in addition to the master process. (If anyone wants more
> details, please ask.)
>
> Now, I would like to use remote hosts as well, more precisely, student
> lab boxen which are rather unreliable. By experience I'd expect to
> lose roughly 4-5 jobs in 100 CPU hours on average. Thus I need some
> way of detecting lost connections and requeue unfinished tasks,
> avoiding any serious delays in this detection. What is the best way to
> do this in python?
>
> It is, of course, possible for the master thread upon processing the
> results, to requeue the tasks for any missing results, but it seems
> to me to be a cleaner solution if I could detect disconnects and
> requeue the tasks from the networking threads. Is that possible
> using python sockets?
>
> Somebody will probably ask why I am not using one of the multiprocessing
> libraries. I have tried at least two, and got trapped by the overhead
> of passing complex pickled objects across. Doing it myself has at least
> helped me clarify what can be parallelised effectively. Now,
> understanding the parallelisable subproblems better, I could try again,
> if I can trust that these libraries can robustly handle lost clients.
> That I don't know if I can.

You probably should assign a unique identifier to each piece of work, and implement two timeouts - one on your socket, using select or poll or similar, and one for the pieces of work based on the identifier.
http://gengnosis.blogspot.com/2007/01/level-triggered-and-edge-triggered.html
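
To make the two-timeout suggestion concrete, here is a rough sketch, not a drop-in solution. Everything named in it is hypothetical and not from the original post: the `TaskTracker` class, the `wait_for_data` helper, the 4096-byte read size, and the use of `uuid` for the identifiers. It assumes one owner label per client connection and leaves the wire protocol out entirely.

```python
import select
import socket
import time
import uuid
from collections import deque


class TaskTracker:
    """Hypothetical bookkeeping: work is tracked by unique id so it can be requeued."""

    def __init__(self, task_timeout):
        self.task_timeout = task_timeout
        self.pending = deque()  # (task_id, payload) waiting for a client
        self.in_flight = {}     # task_id -> (payload, owner, deadline)

    def submit(self, payload):
        task_id = uuid.uuid4().hex
        self.pending.append((task_id, payload))
        return task_id

    def checkout(self, owner, now=None):
        # Hand the oldest pending task to a client and start its deadline.
        now = time.monotonic() if now is None else now
        task_id, payload = self.pending.popleft()
        self.in_flight[task_id] = (payload, owner, now + self.task_timeout)
        return task_id, payload

    def complete(self, task_id):
        # False means the id is not in flight (late or duplicate result).
        return self.in_flight.pop(task_id, None) is not None

    def requeue_owner(self, owner):
        # Call this from the networking thread when a connection is lost.
        lost = [t for t, (_, o, _) in self.in_flight.items() if o == owner]
        self._requeue(lost)
        return lost

    def requeue_expired(self, now=None):
        # Second timeout: tasks whose deadline passed, whatever the socket says.
        now = time.monotonic() if now is None else now
        late = [t for t, (_, _, d) in self.in_flight.items() if d <= now]
        self._requeue(late)
        return late

    def _requeue(self, task_ids):
        for t in task_ids:
            payload, _, _ = self.in_flight.pop(t)
            self.pending.append((t, payload))


def wait_for_data(sock, timeout):
    """First timeout: return bytes, b"" if the peer closed, None if nothing arrived."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    try:
        return sock.recv(4096)
    except ConnectionError:
        return b""
```

A networking thread could call `wait_for_data` in a loop and treat `b""` as a disconnect, calling `requeue_owner` for that client. What to do on `None` (silence) is a policy choice: a long computation can legitimately be quiet, which is why the per-task deadline in `requeue_expired` is kept separate from the socket timeout.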