

Groups > comp.lang.python > #18688 > unrolled thread

Re: Parallel Processing

Started by David Hoese <dhoese@gmail.com>
First post 2012-01-08 18:02 -0600
Last post 2012-01-08 21:46 -0500
Articles 3 — 3 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Parallel Processing David Hoese <dhoese@gmail.com> - 2012-01-08 18:02 -0600
    Re: Parallel Processing Yigit Turgut <y.turgut@gmail.com> - 2012-01-08 17:46 -0800
      Re: Parallel Processing Dave Angel <d@davea.name> - 2012-01-08 21:46 -0500

#18688 — Re: Parallel Processing

From David Hoese <dhoese@gmail.com>
Date 2012-01-08 18:02 -0600
Subject Re: Parallel Processing
Message-ID <mailman.4536.1326067374.27778.python-list@python.org>
On 1/8/12 1:45 PM, Yigit Turgut <y.turgut@gmail.com> wrote:
> There are no imports other than those in the script, which are:
>
> import pygame
> import sys
> import time
> import math
> import pp
>
> You are correct that I was trying to pass two functions, with the
> second one in the place where a tuple of arguments is supposed to be.
> But what if these functions don't take any arguments? I tested
> test1() and test2() separately; they work. Once I figure out how to
> run them simultaneously, I will add an argument to test2 and go on
> from there. My main goal is to run two functions simultaneously; one
> of them takes one argument and the other takes none. To get familiar
> with parallel processing I am experimenting without arguments for
> now, and then I will embed the code in my application. I am
> experimenting with the following:
>
> import pygame
> import sys
> import time
> import math
> import pp
>
> screen = pygame.display.set_mode((0, 0), pygame.FULLSCREEN)
> timer = pygame.time.Clock()
> white = True
> start = time.time()
> end = time.time() - start
>
> def test1():
>    global end
>    global white
>    while(end<5):
>      end = time.time() - start
>      timer.tick(4) #FPS
>      screen.fill((255,255,255) if white else (0, 0, 0))
>      white = not white
>      pygame.display.update()
>
> def test2():
>    global end
>    while(end<5):
>      end = time.time() - start
>      print end
>
> ppservers = ()
> job_server = pp.Server(ppservers=ppservers)
> print "Starting pp with", job_server.get_ncpus(), "workers"
>
> job1 = job_server.submit(test1())
> job2 = job_server.submit(test2())
> result = job1()
> result2 = job2()
>
> print "Counting...", result2
>
> job_server.print_stats()
>
> test1() works as expected (job1) but test2() doesn't work and I get
> the following traceback ;
>
> Traceback (most recent call last):
>    File "fl.py", line 33, in <module>
>      job1 = job_server.submit(test1())
>    File "/usr/lib/python2.6/site-packages/pp.py", line 458, in submit
>      sfunc = self.__dumpsfunc((func, ) + depfuncs, modules)
>    File "/usr/lib/python2.6/site-packages/pp.py", line 629, in __dumpsfunc
>      sources = [self.__get_source(func) for func in funcs]
>    File "/usr/lib/python2.6/site-packages/pp.py", line 696, in __get_source
>      sourcelines = inspect.getsourcelines(func)[0]
>    File "/usr/lib/python2.6/inspect.py", line 678, in getsourcelines
>      lines, lnum = findsource(object)
>    File "/usr/lib/python2.6/inspect.py", line 519, in findsource
>      file = getsourcefile(object) or getfile(object)
>    File "/usr/lib/python2.6/inspect.py", line 441, in getsourcefile
>      filename = getfile(object)
>    File "/usr/lib/python2.6/inspect.py", line 418, in getfile
>      raise TypeError('arg is not a module, class, method, '
> TypeError: arg is not a module, class, method, function, traceback, frame, or code object
>
> Error is related to test1 not having an argument.  When I leave it
> empty as following ;
>
> job1 = job_server.submit(test1,())
>
> test1 doesn't run. When I do ;
>
> job1 = job_server.submit(test1())
>
> Display works but I get;
>
> TypeError: arg is not a module, class, method, function, traceback,
> frame, or code object (complete traceback same as above).
>
> And test2 doesn't work also. But when I do;
>
> job1 = job_server.submit(test1,())
> job2 = job_server.submit(test2())
>
> I get test2 working but test1 not working. Obviously related to
> argument arrangement in submit.
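
A note on the traceback quoted above: one likely reading is that writing submit(test1()) calls test1 immediately in the main process (which would explain why the display works) and then hands its return value, None, to submit, which cannot find source code for None. The sketch below reproduces that with the standard library only. It is written for current Python 3 rather than the 2.6 used in the thread, and fake_submit is a hypothetical stand-in for illustration, not pp's API.

```python
import inspect


def test1():
    """Stand-in for the poster's test1; like the original, it returns None."""
    return None


def fake_submit(func, args=()):
    """Hypothetical stand-in for a job queue's submit(): keep the callable
    and its argument tuple so the call can be made later."""
    if not callable(func):
        raise TypeError("expected a callable, got %r" % (func,))
    return lambda: func(*args)


# Passing the function object (no parentheses) defers the call.
job = fake_submit(test1, ())
job()

# Writing test1() runs the function right away and passes on its return
# value, None. inspect cannot look up source code for None and raises
# TypeError, the same exception class shown in the quoted traceback.
try:
    inspect.getsourcelines(test1())
except TypeError as exc:
    print("TypeError:", exc)
```

Under that reading, a function without arguments is submitted as submit(test1, ()) or simply submit(test1), never as submit(test1()).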
Hi,

I've never used pygame or Parallel Python, but I played around with the 
code you provided and used one of my favorite debugging techniques: I 
printed things out and read the output.

One thing I did was print the globals before the function definitions 
and again inside test1().  The first print shows what I expect from 
calling "print globals()", but inside test1() I only get functions, 
modules, and a few other things.  So I checked the pp documentation and 
found this about the globals keyword:

     globals - dictionary from which all modules, functions and classes

It also handles imports oddly: I tried "from pprint import pprint" and 
it couldn't find it properly even though that's a function (it couldn't 
find a class that the function uses).  So I think you'll have to pass 
things in as arguments or as dependency functions, as others have 
suggested.  There is also a 'modules' keyword that takes the names of 
modules to import, which might help.  And is there a reason you need 
Parallel Python and can't use something simpler like Python's 
"multiprocessing" or the classic "os.fork()"?  I understand that 
Parallel Python can run on remote servers in parallel...but how 
complicated is your program going to be?
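
For comparison, here is a minimal sketch of the "multiprocessing" route mentioned above, assuming the goal is only to run two functions at the same time on one machine. It targets current Python 3, not the 2.6 from the thread. The names ticker, countdown and run_both are made up for illustration, the workers are simplified stand-ins for test1 and test2 (no pygame), and preferring the "fork" start method is an assumption that mirrors the os.fork() suggestion.

```python
import multiprocessing as mp
import time

# Assumption: use "fork" where the platform offers it, otherwise fall back
# to the platform's default start method.
ctx = mp.get_context("fork" if "fork" in mp.get_all_start_methods() else None)


def ticker(out):
    """Stand-in for test1: four ticks, 0.05 s apart, then report the count."""
    ticks = 0
    for _ in range(4):
        time.sleep(0.05)
        ticks += 1
    out.put(("ticker", ticks))


def countdown(out, limit):
    """Stand-in for test2: poll the clock until `limit` seconds have passed."""
    start = time.time()  # local state instead of a module-level global
    elapsed = 0.0
    while elapsed < limit:
        time.sleep(0.01)
        elapsed = time.time() - start
    out.put(("countdown", elapsed))


def run_both(limit=0.2):
    """Start both workers, collect one result from each, then join them."""
    out = ctx.Queue()
    workers = [
        ctx.Process(target=ticker, args=(out,)),           # no extra argument
        ctx.Process(target=countdown, args=(out, limit)),  # one extra argument
    ]
    for worker in workers:
        worker.start()
    results = dict(out.get(timeout=10) for _ in workers)  # drain before join
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_both())
```

Results come back through a queue because each worker runs in its own process, so a plain return value would not reach the parent.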

I got the following to work (not sure if it's what you want):
###
import pp

def test1():
     # Everything that used to be a global is now created inside the function.
     start = time.time()
     end = time.time() - start
     screen = pygame.display.set_mode((0, 0), pygame.FULLSCREEN)
     timer = pygame.time.Clock()
     white = True
     while(end<5):
         end = time.time() - start
         timer.tick(4) #FPS
         screen.fill((255,255,255) if white else (0, 0, 0))
         white = not white
         pygame.display.update()

def test2():
     start = time.time()
     end2= time.time() - start
     while(end2<5):
         end2 = time.time() - start
         print end2

ppservers = ()
job_server = pp.Server(ppservers=ppservers)

job1 = job_server.submit(test1, modules=("pygame","time"))
job2 = job_server.submit(test2, modules=("time",))
result = job1()
result2 = job2()

print  result2

job_server.print_stats()
###

However, I don't know if this will always work the way you want it to, 
depending on how you set up your Parallel Python servers.  By that I 
mean, if you run this on any machine that isn't local, I think it will 
try to connect to that remote machine's display when getting the pygame 
"screen".  But again, I've never used pygame.  This was also thrown 
together quickly, so you could probably pass in "start" and the like so 
that it doesn't have to be calculated in both functions.

Summary: Parallel Python doesn't handle global variables in the normal 
way (it doesn't like things that aren't functions, modules, or classes), 
so...don't use globals.  Let me know if any of that didn't make sense.
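
As a rough illustration of "pass things in instead of using globals": the sketch below rewrites the test2 idea so that "start" and the time limit are parameters and the readings are returned rather than printed. The name elapsed_samples and the step parameter are invented for this example, and it uses Python 3 syntax.

```python
import time


def elapsed_samples(start, limit, step=0.01):
    """Return the elapsed-time readings taken until `limit` seconds after `start`.

    Nothing here depends on module-level state, so the function behaves the
    same whether it is called directly or shipped to a worker process.
    """
    samples = []
    elapsed = time.time() - start
    while elapsed < limit:
        samples.append(elapsed)
        time.sleep(step)
        elapsed = time.time() - start
    return samples


if __name__ == "__main__":
    shared_start = time.time()  # computed once, then handed to each caller
    print(len(elapsed_samples(shared_start, 0.05)), "readings")
```

With pp, the same function would presumably be submitted with its arguments in the args tuple, which sidesteps the question of which globals get copied.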

-Dave

P.S. If anyone has any other results I would be curious to hear.



#18691

From Yigit Turgut <y.turgut@gmail.com>
Date 2012-01-08 17:46 -0800
Message-ID <baff2c26-0cdf-4b1b-b7ee-1a9ad02d22f9@dp8g2000vbb.googlegroups.com>
In reply to #18688
On Jan 9, 12:02 am, Dave Angel <d...@davea.name> wrote:
> On 01/08/2012 11:39 AM, Yigit Turgut wrote:
>
> > screen = pygame.display.set_mode((0, 0), pygame.FULLSCREEN)
> > timer = pygame.time.Clock()
> > white = True
> > start = time.time()
> > end = time.time() - start
> > end2= time.time() - start
>
> > def test1():
> >    global end
> >    global white
> >    while(end<5):
> >      end = time.time() - start
> >      timer.tick(4) #FPS
> >      screen.fill((255,255,255) if white else (0, 0, 0))
> >      white = not white
> >      pygame.display.update()
>
> > def test2():
> >    global end2
> >    while(end2<5):
> >      end2 = time.time() - start
> >      print end2
>
> > ppservers = ()
> > job_server = pp.Server(ppservers=ppservers)
>
> > job1 = job_server.submit(test1, (), globals=globals())
> > job2 = job_server.submit(test2, (), globals=globals())
> > result = job1()
> > result2 = job2()
>
> > print  result2
>
> > job_server.print_stats()
>
> > This is *supposed to* print the values of 'end' while simultaneously
> > executing test1. Even though I set the globals parameter and nothing
> > seems to be wrong, this code generates the following traceback:
>
> > Starting pp with 2 workers
> > An error has occured during the function execution
> > Traceback (most recent call last):
> >    File "/usr/lib/python2.6/site-packages/ppworker.py", line 90, in run
> >      __result = __f(*__args)
> >    File "<string>", line 4, in test1
> > NameError: global name 'end' is not defined
> > An error has occured during the function execution
> > Traceback (most recent call last):
> >    File "/usr/lib/python2.6/site-packages/ppworker.py", line 90, in run
> >      __result = __f(*__args)
> >    File "<string>", line 3, in test2
> > NameError: global name 'end2' is not defined
>
> > How can this be, what am I missing ?
>
> I don't see anything on the http://www.parallelpython.com website that
> indicates how it handles globals.  Remember this is creating a separate
> process, so it can't literally share the globals you have.  I would have
> expected it to pickle them when you say globals=globals(), but I dunno.
> In any case, I can't see any value in making end global with the
> "global" statement.  I'd move the end= line inside the function, and
> forget about making it global.

globals=globals() works, but not for every variable (lol). Some of them
get recognized and some don't, for a reason I haven't figured out yet.
But that's the main scheme.
>
> The other thing you don't supply is a list of functions that might be
> called by your function.  See the depfuncs argument.  It probably
> handles all the system libraries, but I can't see how it'd be expected
> to handle pygame.

depfuncs passes the dependent functions that will or might be used
while the called function executes. PP doesn't require you to list
modules for basic I/O, sys, etc., only 3rd party ones like numpy, scipy
and so on.
>
> With the limited information supplied by the website, I'd experiment
> first with simpler things.  Make two functions that are self-contained,
> and try them first.  No global statements, and no calls to pygame.
> After that much worked, then I'd try adding arguments, and then return
> values.
>

That's what I did. After investigating similar approaches to the task,
I had unconsciously formed the idea that 'this is not going to be easy'
and approached it with that perception. Now I realize that it's much
simpler than I thought. The work it does is complex, but it takes very
little effort to get it working.

> Then i'd try calling separate functions (declaring them in depfuncs).
> And finally I'd try some 3rd party library.

I don't think I will try another package for the same task. I am now
moving on to PP + PyCUDA to harness the GPU and the available CPU cores.

Thank you for the guidance.

On Jan 9, 2:02 am, David Hoese <dho...@gmail.com> wrote:
> <SNIP>

Hi,

That's correct, Parallel Python has an unusual way of handling
functions, but it's actually very simple compared to the work it's
doing. It *may* also be somewhat buggy, because it doesn't pick up
globals() frequently, so one should call globals().update at the
necessary lines. Also, weirdly, it doesn't require pointing it to
pygame. The following works as expected:

ppservers = ()
job_server = pp.Server(ppservers=ppservers)
job1 = job_server.submit(test1, args=(), globals=globals())
job2 = job_server.submit(test2, args=(), globals=globals())
result = job1()
result2 = job2()

Thank you for the productive reply.



#18694

From Dave Angel <d@davea.name>
Date 2012-01-08 21:46 -0500
Message-ID <mailman.4538.1326077174.27778.python-list@python.org>
In reply to #18691
On 01/08/2012 08:46 PM, Yigit Turgut wrote:
> On Jan 9, 12:02 am, Dave Angel <d...@davea.name> wrote:
>> <SNIP>
>> Then i'd try calling separate functions (declaring them in depfuncs).
>> And finally I'd try some 3rd party library.
> Don't think will try another package for the same task. I am now
> moving on to PP + PyCUDA to harness GPU available CPU cores.
>
> Thank you for the guidance.
>
Actually, I wasn't suggesting an alternative to pp, but rather 
introducing references to libraries like numpy.  I don't know what pp's 
methodology is, but I can guess which parts are trivial, and which parts 
tend to be trickier.  Once things are in a separate process, it's best 
not to assume any shared state between the processes.  Thus I'd expect 
globals() to be copied, not shared.  And the error messages had very low 
line numbers, which I take to mean that pp didn't include your imports 
and other module-level code.
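
The "copied, not shared" expectation can be checked with the standard library alone. The sketch below uses multiprocessing rather than pp (whose internals are not documented in this thread), so it only illustrates the general behaviour of separate processes. The names counter, bump and demo are invented for the example; it is Python 3 and prefers the "fork" start method where available.

```python
import multiprocessing as mp

# Assumption: use "fork" where the platform offers it, otherwise the default.
ctx = mp.get_context("fork" if "fork" in mp.get_all_start_methods() else None)

counter = 0  # module-level global, playing the role of `end` in the thread


def bump(out):
    """Change the global inside the child and report the child's view of it."""
    global counter
    counter += 100
    out.put(counter)


def demo():
    """Return (value seen in the child, value still seen in the parent)."""
    out = ctx.Queue()
    child = ctx.Process(target=bump, args=(out,))
    child.start()
    seen_in_child = out.get(timeout=10)
    child.join()
    return seen_in_child, counter


if __name__ == "__main__":
    print(demo())  # (100, 0): the child changed only its own copy
```

The parent's counter stays at 0, which is why anything a worker computes has to come back explicitly as a return value or through a queue.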

Point is, when I start having trouble with code that's inadequately 
documented, I try the simplest things, and work up to the complex ones.



-- 

DaveA
