Groups > comp.lang.python > #52059 > unrolled thread

Using Pool map with a method of a class and a list

Started by: Luca Cerone <luca.cerone@gmail.com>
First post: 2013-08-06 10:12 -0700
Last post: 2013-08-06 12:37 -0700
Articles: 15 (4 participants)


Contents

  Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-06 10:12 -0700
    Re: Using Pool map with a method of a class and a list Chris Angelico <rosuav@gmail.com> - 2013-08-06 18:38 +0100
      Re: Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-06 12:42 -0700
        Re: Using Pool map with a method of a class and a list Joshua Landau <joshua@landau.ws> - 2013-08-07 07:48 +0100
          Re: Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-07 01:33 -0700
            Re: Using Pool map with a method of a class and a list Joshua Landau <joshua@landau.ws> - 2013-08-07 10:47 +0100
              Re: Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-07 03:10 -0700
                Re: Using Pool map with a method of a class and a list Joshua Landau <joshua@landau.ws> - 2013-08-07 12:53 +0100
                  Re: Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-07 15:26 -0700
                    Re: Using Pool map with a method of a class and a list Joshua Landau <joshua@landau.ws> - 2013-08-07 23:49 +0100
                Re: Using Pool map with a method of a class and a list Peter Otten <__peter__@web.de> - 2013-08-07 16:46 +0200
                Re: Using Pool map with a method of a class and a list Joshua Landau <joshua@landau.ws> - 2013-08-07 16:52 +0100
                Re: Using Pool map with a method of a class and a list Peter Otten <__peter__@web.de> - 2013-08-07 18:15 +0200
                  Re: Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-07 16:31 -0700
    Re: Using Pool map with a method of a class and a list Luca Cerone <luca.cerone@gmail.com> - 2013-08-06 12:37 -0700

#52059 — Using Pool map with a method of a class and a list

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-06 10:12 -0700
Subject: Using Pool map with a method of a class and a list
Message-ID: <96c575da-7601-4023-aa91-e80664f90333@googlegroups.com>

Hi guys,
I would like to apply the Pool.map method to a member of a class.

Here is a small example that shows what I would like to do:

from multiprocessing import Pool

class A(object):
   def __init__(self,x):
       self.value = x
   def fun(self,x):
       return self.value**x


l = range(10)

p = Pool(4)

op = p.map(A.fun,l)

#using this with the normal map doesn't cause any problem

This fails because it says that the methods can't be pickled.
(I assume it has something to do with the note in the documentation: "functionality within this package requires that the __main__ module be importable by the children.", which is obscure to me).

I would like to understand two things: why does my code fail (and when can I expect it to fail), and what is a possible workaround?

Thanks a lot in advance to everybody for the help!

Cheers,
Luca


#52061

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-08-06 18:38 +0100
Message-ID: <mailman.268.1375810736.1251.python-list@python.org>
In reply to: #52059

On Tue, Aug 6, 2013 at 6:12 PM, Luca Cerone <luca.cerone@gmail.com> wrote:
> from multiprocessing import Pool
>
> class A(object):
>    def __init__(self,x):
>        self.value = x
>    def fun(self,x):
>        return self.value**x
>
>
> l = range(10)
>
> p = Pool(4)
>
> op = p.map(A.fun,l)

Do you ever instantiate any A() objects? You're attempting to call an
unbound method without passing it a 'self'.

You may find the results completely different in Python 2 vs Python 3,
and between bound and unbound methods. In Python 3, an unbound method
is simply a function. In both versions, a bound method carries its
first argument around, so it has to be something different. Play
around with it a bit.

ChrisA
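
A minimal sketch of the distinction Chris describes, assuming Python 3 (where `A.fun` is an ordinary function). The names `plain`, `bound`, `via_class` and `via_instance` are illustrative only and do not come from the thread.

```python
import types


class A(object):
    def __init__(self, x):
        self.value = x

    def fun(self, x):
        return self.value ** x


plain = A.fun      # Python 3: an ordinary function; 'self' must be passed explicitly
bound = A(3).fun   # bound method: carries the instance along as __self__

# Calling through the class needs an instance as the first argument.
via_class = [plain(A(3), x) for x in range(4)]

# The bound method already knows its instance, so map() can call it directly.
via_instance = list(map(bound, range(4)))
```

In Python 2 the first form is an "unbound method" object instead, which is why calling it with an int as the first argument raises a TypeError there.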


#52064

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-06 12:42 -0700
Message-ID: <4cff0d5e-33ab-42cd-b6d4-2b4fe235a274@googlegroups.com>
In reply to: #52061

Hi Chris, thanks

> Do you ever instantiate any A() objects? You're attempting to call an
> unbound method without passing it a 'self'.

I have tried a lot of variations: instantiating the object, creating lambda functions that use the unbound version of fun (A.fun.__func__), and so on.
I played around with it quite a bit before posting.

As far as I understand, the problem is that Pool pickles the function and copies it to the various worker processes.
But since methods cannot be pickled, this fails.

The same example I posted won't run in Python 3.2 either (I am mostly interested in a solution for Python 2.7, sorry I forgot to mention that).

Thanks in any case for the help, hopefully there will be some other advice in the ML :)

Cheers,
Luca


#52114

From: Joshua Landau <joshua@landau.ws>
Date: 2013-08-07 07:48 +0100
Message-ID: <mailman.301.1375858161.1251.python-list@python.org>
In reply to: #52064

On 6 August 2013 20:42, Luca Cerone <luca.cerone@gmail.com> wrote:
> Hi Chris, thanks
>
>> Do you ever instantiate any A() objects? You're attempting to call an
>>
>> unbound method without passing it a 'self'.
>
> I have tried a lot of variations, instantiating the object, creating lambda functions that use the unbound version of fun (A.fun.__func__) etc etc..
> I have played around it quite a bit before posting.
>
> As far as I have understood the problem is due to the fact that Pool pickle the function and copy it in the various pools..
> But since the methods cannot be pickled this fails..
>
> The same example I posted won't run in Python 3.2 neither (I am mostly interested in a solution for Python 2.7, sorry I forgot to mention that).
>
> Thanks in any case for the help, hopefully there will be some other advice in the ML :)


I think you might not understand what Chris said.

Currently this does *not* work with Python 2.7 as you suggested it would.

>>> op = map(A.fun,l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unbound method fun() must be called with A instance as first argument (got int instance instead)

This, however, does:

>>> op = map(A(3).fun,l)
>>> op
[1, 3, 9, 27, 81, 243, 729, 2187, 6561, 19683]


Chris might have also been confused because once you fix that it works
in Python 3.

You will find that
http://stackoverflow.com/questions/1816958/cant-pickle-type-instancemethod-when-using-pythons-multiprocessing-pool-ma
explains the problem in more detail than I understand. I suggest
reading it and relaying further questions back to us. Or use Python 3
;).
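
As a rough illustration of the "use Python 3" remark: the sketch below assumes a recent Python 3 release, where a bound method of a module-level class survives a pickle round trip. Pool.map has to serialise the callable in much the same way before sending it to a worker, so this is the step that fails in Python 2.7. The names `method` and `restored` are illustrative only.

```python
import pickle


class A(object):
    def __init__(self, x):
        self.value = x

    def fun(self, x):
        return self.value ** x


method = A(3).fun

# Serialise the bound method and rebuild it, roughly what has to happen
# before a worker process can call it.
restored = pickle.loads(pickle.dumps(method))

# The rebuilt method is bound to a copy of the original instance.
results = [restored(x) for x in range(4)]
```

Running the same round trip under Python 2.7 is expected to raise the "can't pickle instancemethod" error discussed in the Stack Overflow link.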


#52120

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-07 01:33 -0700
Message-ID: <021dfe24-af83-4307-856e-441cf35cb93a@googlegroups.com>
In reply to: #52114

Hi Joshua thanks!

> I think you might not understand what Chris said.
> Currently this does *not* work with Python 2.7 as you suggested it would.
> >>> op = map(A.fun,l)

Yeah, actually that wouldn't work even in Python 3, since the value attribute used by fun has not been set.
It was my mistake in the example, but it is not the source of the problem.

> This, however, does:
> >>> op = map(A(3).fun,l)
> >>> op
> [1, 3, 9, 27, 81, 243, 729, 2187, 6561, 19683]

This works fine (and I knew that).. but is not what I want...

You are using the map() function that comes with Python. I want
to use the map() method of the Pool class (available in the multiprocessing module).

And there are differences between map() and Pool.map() apparently, so that if something works fine with map() it may not work with Pool.map() (as in my case).

To correct my example:

from multiprocessing import Pool

class A(object):
    def __init__(self,x):
        self.value = x
    def fun(self,x):
        return self.value**x

l = range(100)
p = Pool(4)
op = p.map(A(3).fun, l)

works neither in Python 2.7 nor in 3.2 (by the way, I can't use Python 3 for my application).

> You will find that
> http://stackoverflow.com/questions/1816958/cant-pickle-type-instancemethod-when-using-pythons-multiprocessing-pool-ma
> explains the problem in more detail than I understand. I suggest
> reading it and relaying further questions back to us. Or use Python 3

:) Thanks, but of course I googled and found this link before posting. I don't understand much of the details either, that's why I posted here.

Anyway, thanks for the attempt :)

Luca


#52127

From: Joshua Landau <joshua@landau.ws>
Date: 2013-08-07 10:47 +0100
Message-ID: <mailman.311.1375868873.1251.python-list@python.org>
In reply to: #52120

On 7 August 2013 09:33, Luca Cerone <luca.cerone@gmail.com> wrote:
> To correct my example:
>
> from multiprocessing import Pool
>
> class A(object):
>     def __init__(self,x):
>         self.value = x
>     def fun(self,x):
>         return self.value**x
>
> l = range(100)
> p = Pool(4)
> op = p.map(A(3).fun, l)
>
> doesn't work neither in Python 2.7, nor 3.2 (by the way I can't use Python 3 for my application).

Are you using Windows? Over here on 3.3 on Linux it does. Not on 2.7 though.

>> You will find that
>> http://stackoverflow.com/questions/1816958/cant-pickle-type-instancemethod-when-using-pythons-multiprocessing-pool-ma
>> explains the problem in more detail than I understand. I suggest
>> reading it and relaying further questions back to us. Or use Python 3
>
> :) Thanks, but of course I googled and found this link before posting. I don't understand much of the details as well, that's why I posted here.
>
> Anyway, thanks for the attempt :)

Reading there, the simplest method seems to be, in effect:

from multiprocessing import Pool
from functools import partial

class A(object):
    def __init__(self,x):
        self.value = x
    def fun(self,x):
        return self.value**x

def _getattr_proxy_partialable(instance, name, arg):
    return getattr(instance, name)(arg)

def getattr_proxy(instance, name):
    """
    A version of getattr that returns a proxy function that can
    be pickled. Only function calls will work on the proxy.
    """
    return partial(_getattr_proxy_partialable, instance, name)

l = range(100)
p = Pool(4)
op = p.map(getattr_proxy(A(3), "fun"), l)
print(op)


#52128

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-07 03:10 -0700
Message-ID: <13807c2e-7f9f-45dd-b36e-4cdc7cde6709@googlegroups.com>
In reply to: #52127

> > doesn't work neither in Python 2.7, nor 3.2 (by the way I can't use Python 3 for my application).
>  
> Are you using Windows? Over here on 3.3 on Linux it does. Not on 2.7 though.

No I am using Ubuntu (12.04, 64 bit).. maybe things changed from 3.2 to 3.3?
 
> from multiprocessing import Pool
> from functools import partial
>
> class A(object):
>     def __init__(self,x):
>         self.value = x
>     def fun(self,x):
>         return self.value**x
>
> def _getattr_proxy_partialable(instance, name, arg):
>     return getattr(instance, name)(arg)
>
> def getattr_proxy(instance, name):
>     """
>     A version of getattr that returns a proxy function that can
>     be pickled. Only function calls will work on the proxy.
>     """
>     return partial(_getattr_proxy_partialable, instance, name)
>
> l = range(100)
> p = Pool(4)
> op = p.map(getattr_proxy(A(3), "fun"), l)
> print(op)

I can't try it now, I'll let you know later if it works!
(Though just by reading I can't really understand what the code does).

Thanks for the help,
Luca


#52131

From: Joshua Landau <joshua@landau.ws>
Date: 2013-08-07 12:53 +0100
Message-ID: <mailman.314.1375876463.1251.python-list@python.org>
In reply to: #52128

On 7 August 2013 11:10, Luca Cerone <luca.cerone@gmail.com> wrote:
> I can't try it now, I'll let you know later if it works!
> (Though just by reading I can't really understand what the code does).

Well,

>> from multiprocessing import Pool
>> from functools import partial
>>
>> class A(object):
>>     def __init__(self,x):
>>         self.value = x
>>     def fun(self,x):
>>         return self.value**x

This is all the same, as with

>> l = range(100)
>> p = Pool(4)

You then wanted to do:

> op = p.map(A(3).fun, l)

but bound methods can't be pickled, it seems.

However, A(3) *can* be pickled. So what we want is a function:

    def proxy(arg):
        return A(3).fun(arg)

so we can write:

> op = p.map(proxy, l)

To generalise you might be tempted to write:

    def generic_proxy(instance, name):
        def proxy(arg):
            # Equiv. of instance.name(arg)
            return getattr(instance, name)(arg)
        return proxy

but the inner function won't work as functions-in-functions can't be
pickled either.

So we use:

>> def _getattr_proxy_partialable(instance, name, arg):
>>     return getattr(instance, name)(arg)

Which takes all three of instance, name and arg. Of course we only want our
function to take arg, so we partial it:

>> def getattr_proxy(instance, name):
>>     """
>>     A version of getattr that returns a proxy function that can
>>     be pickled. Only function calls will work on the proxy.
>>     """
>>     return partial(_getattr_proxy_partialable, instance, name)

partial objects are picklable, btw.

>> op = p.map(getattr_proxy(A(3), "fun"), l)
>> print(op)

:)


#52152

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-07 15:26 -0700
Message-ID: <f290cfce-132c-4c0f-a2ac-b62c4337a73f@googlegroups.com>
In reply to: #52131

Thanks for the post.
I actually don't know exactly what can and can't be pickled,
nor what partialing a function means.
Could you link me to some resources?
 
I still can't understand all the details in your code :)


#52155

From: Joshua Landau <joshua@landau.ws>
Date: 2013-08-07 23:49 +0100
Message-ID: <mailman.331.1375915830.1251.python-list@python.org>
In reply to: #52152

On 7 August 2013 23:26, Luca Cerone <luca.cerone@gmail.com> wrote:
> Thanks for the post.
> I actually don't know exactly what can and can't be pickled..

I just try it and see what works ;).

The general idea is that if it is module-level it can be pickled and
if it is defined inside of something else it cannot. It depends
though.

> nor what partialing a function means..

"partial" takes a function and returns it with arguments "filled in":

    from functools import partial

    def add(a, b):
        return a + b

    add5 = partial(add, 5)

    print(add5(10)) # Prints 15 == 5 + 10

> Maybe can you link me to some resources?

http://docs.python.org/2/library/functools.html#functools.partial


> I still can't understand all the details in your code :)

Never mind that, though, as Peter Otten's code (with my very minor
suggested modifications) is by far the cleaner method of the two and
is arguably more correct too.
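
The "module-level can be pickled, nested cannot" rule of thumb can be tried out directly. This is a sketch assuming Python 3; `module_level`, `make_nested` and `can_pickle` are hypothetical names introduced only for the illustration.

```python
import pickle


def module_level(x):
    return x + 1


def make_nested():
    def nested(x):
        return x + 1
    return nested


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies between Python versions.
        return False


top_level_ok = can_pickle(module_level)    # found again by its module-level name
nested_ok = can_pickle(make_nested())      # has no importable name
lambda_ok = can_pickle(lambda x: x + 1)    # same problem as a nested function
```

Functions are pickled by reference (module plus name), not by value, which is why anything that cannot be looked up again by name on the other side is rejected.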


#52134

From: Peter Otten <__peter__@web.de>
Date: 2013-08-07 16:46 +0200
Message-ID: <mailman.315.1375886779.1251.python-list@python.org>
In reply to: #52128

Joshua Landau wrote:

> On 7 August 2013 11:10, Luca Cerone <luca.cerone@gmail.com> wrote:
>> I can't try it now, I'll let you know later if it works!
>> (Though just by reading I can't really understand what the code does).
> 
> Well,
> 
>>> from multiprocessing import Pool
>>> from functools import partial
>>>
>>> class A(object):
>>>     def __init__(self,x):
>>>         self.value = x
>>>     def fun(self,x):
>>>         return self.value**x
> 
> This is all the same, as with
> 
>>> l = range(100)
>>> p = Pool(4)
> 
> You then wanted to do:
> 
>> op = p.map(A(3).fun, l)
> 
> but bound methods can't be pickled, it seems.
> 
> However, A(3) *can* be pickled. So what we want is a function:
> 
>     def proxy(arg):
>         return A(3).fun(arg)
> 
> so we can write:
> 
>> op = p.map(proxy, l)
> 
> To generalise you might be tempted to write:
> 
>     def generic_proxy(instance, name):
>         def proxy(arg):
>             # Equiv. of instance.name(arg)
>             return getattr(instance, name)(arg)
>         return proxy
> 
> but the inner function won't work as functions-in-functions can't be
> pickled either.
> 
> So we use:
> 
>>> def _getattr_proxy_partialable(instance, name, arg):
>>>     return getattr(instance, name)(arg)
> 
> Which takes all instance, name and arg. Of course we only want our
> function to take arg, so we partial it:
> 
>>> def getattr_proxy(instance, name):
>>>     """
>>>     A version of getattr that returns a proxy function that can
>>>     be pickled. Only function calls will work on the proxy.
>>>     """
>>>     return partial(_getattr_proxy_partialable, instance, name)
> 
> partial objects are picklable, btw.
> 
>>> op = p.map(getattr_proxy(A(3), "fun"), l)
>>> print(op)
> 
> :)


There is also the copy_reg module. Adapting

<http://mail.python.org/pipermail/python-list/2008-July/469164.html>

you get:

import copy_reg
import multiprocessing
import new

def make_instancemethod(inst, methodname):
    return getattr(inst, methodname)

def pickle_instancemethod(method):
    return make_instancemethod, (method.im_self, method.im_func.__name__)

copy_reg.pickle(
    new.instancemethod, pickle_instancemethod, make_instancemethod)

class A(object):
   def __init__(self, a):
       self.a = a
   def fun(self, b):
       return self.a**b

if __name__ == "__main__":
    items = range(10)
    pool = multiprocessing.Pool(4)
    print pool.map(A(3).fun, items)


#52135

From: Joshua Landau <joshua@landau.ws>
Date: 2013-08-07 16:52 +0100
Message-ID: <mailman.316.1375891216.1251.python-list@python.org>
In reply to: #52128

On 7 August 2013 15:46, Peter Otten <__peter__@web.de> wrote:
> import copy_reg
> import multiprocessing
> import new

"new" is deprecated from 2.6+; use types.MethodType instead of
new.instancemethod.

> def make_instancemethod(inst, methodname):
>     return getattr(inst, methodname)

This is just getattr -- you can replace the two uses of
make_instancemethod with getattr and delete this ;).

> def pickle_instancemethod(method):
>     return make_instancemethod, (method.im_self, method.im_func.__name__)
>
> copy_reg.pickle(
>     new.instancemethod, pickle_instancemethod, make_instancemethod)
>
> class A(object):
>    def __init__(self, a):
>        self.a = a
>    def fun(self, b):
>        return self.a**b
>
> if __name__ == "__main__":
>     items = range(10)
>     pool = multiprocessing.Pool(4)
>     print pool.map(A(3).fun, items)

Well that was easy. The Stackoverflow link made that look *hard*. -1
to my hack, +1 to this.

You can do this in one statement:

# imports needed by the statement below (Python 2)
import copy_reg
import types

copy_reg.pickle(
    types.MethodType,
    lambda method: (getattr, (method.im_self, method.im_func.__name__)),
    getattr
)
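
For readers on Python 3, the same registration can be sketched with the renamed `copyreg` module and the `__self__` / `__func__` attributes that replaced `im_self` / `im_func`. This is an illustration of the mechanism only: recent Python 3 releases can already pickle bound methods without it. The function name `reduce_method` is hypothetical.

```python
import copyreg
import pickle
import types


class A(object):
    def __init__(self, a):
        self.a = a

    def fun(self, b):
        return self.a ** b


def reduce_method(method):
    # Tell pickle how to rebuild a bound method: on unpickling it will
    # call getattr(instance, "fun"), which produces a fresh bound method.
    return getattr, (method.__self__, method.__func__.__name__)


# Register the reducer for all bound-method objects.
copyreg.pickle(types.MethodType, reduce_method)

# What the reducer returns for one method: a callable and its arguments.
rebuild, args = reduce_method(A(3).fun)

# Full round trip through pickle using the registered reducer.
restored = pickle.loads(pickle.dumps(A(3).fun))
```

The reducer never serialises the method's code; it stores the instance and the method name, and lets `getattr` rebind them on the receiving side.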


#52136

From: Peter Otten <__peter__@web.de>
Date: 2013-08-07 18:15 +0200
Message-ID: <mailman.317.1375892130.1251.python-list@python.org>
In reply to: #52128

Joshua Landau wrote:

> On 7 August 2013 15:46, Peter Otten <__peter__@web.de> wrote:

>> def make_instancemethod(inst, methodname):
>>     return getattr(inst, methodname)
> 
> This is just getattr -- you can replace the two uses of
> make_instancemethod with getattr and delete this ;).

D'oh ;)


#52156

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-07 16:31 -0700
Message-ID: <de159c12-7e2b-4858-8ec2-4c2c1fe28fcf@googlegroups.com>
In reply to: #52136

Thanks for the help Peter!

> >> def make_instancemethod(inst, methodname):
> >>     return getattr(inst, methodname)
> >
> > This is just getattr -- you can replace the two uses of
> > make_instancemethod with getattr and delete this ;).
>
> D'oh ;)


#52063

From: Luca Cerone <luca.cerone@gmail.com>
Date: 2013-08-06 12:37 -0700
Message-ID: <b39d3961-9766-45d7-a4af-8db145c27fcd@googlegroups.com>
In reply to: #52059

On Tuesday, 6 August 2013 18:12:26 UTC+1, Luca Cerone  wrote:
> Hi guys,
> I would like to apply the Pool.map method to a member of a class.
>
> Here is a small example that shows what I would like to do:
>
> from multiprocessing import Pool
>
> class A(object):
>    def __init__(self,x):
>        self.value = x
>    def fun(self,x):
>        return self.value**x
>
>
> l = range(10)
>
> p = Pool(4)
>
> op = p.map(A.fun,l)
>
> #using this with the normal map doesn't cause any problem
>
> This fails because it says that the methods can't be pickled.
> (I assume it has something to do with the note in the documentation: "functionality within this package requires that the __main__ module be importable by the children.", which is obscure to me).
>
> I would like to understand two things: why my code fails and when I can expect it to fail? what is a possible workaround?
>
> Thanks a lot in advance to everybody for the help!
>
> Cheers,
> Luca




