Multiprocessing, shared memory vs. pickled copies

From John Ladasky <ladasky@my-deja.com>
Newsgroups comp.lang.python
Subject Multiprocessing, shared memory vs. pickled copies
Date 2011-04-04 13:20 -0700
Organization http://groups.google.com
Message-ID <6ace38dc-33c6-44ab-a17a-084d62d666cb@w9g2000prg.googlegroups.com> (permalink)


Hi folks,

I'm developing some custom neural network code.  I'm using Python 2.6,
Numpy 1.5, and Ubuntu Linux 10.10.  I have an AMD 1090T six-core CPU,
and I want to take full advantage of it.  I love to hear my CPU fan
running, and watch my results come back faster.

When I'm training a neural network, I pass two numpy.ndarray objects
to a function called evaluate.  One array contains the weights for the
neural network, and the other array contains the input data.  The
evaluate function returns an array of output data.

I have been playing with multiprocessing for a while now, and I have
some familiarity with Pool.  Apparently, arguments passed to a Pool
subprocess must be picklable.  Pickling is still a pretty vague
process to me, but I can see that you have to write custom
__reduce__ and __setstate__ methods for your objects.  An example of
code which creates a pickle-friendly ndarray subclass is here:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg02446.html
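
For what it's worth, a plain numpy.ndarray pickles without any extra
work; the custom methods only become necessary once a subclass carries
attributes of its own.  Below is a minimal sketch of that pattern,
written for a current Python 3 and NumPy rather than Python 2.6.  It is
not the code from the linked message: the class name NetWeights and its
label attribute are hypothetical stand-ins for a network's metadata.

```python
import pickle

import numpy as np


class NetWeights(np.ndarray):
    """Hypothetical ndarray subclass carrying one extra attribute."""

    def __new__(cls, data, label=""):
        obj = np.asarray(data, dtype=np.float64).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; propagate the metadata if present.
        if obj is None:
            return
        self.label = getattr(obj, "label", "")

    def __reduce__(self):
        # Take ndarray's own pickle recipe and append our attribute
        # to the state tuple it produces.
        reconstruct, args, state = super().__reduce__()
        return reconstruct, args, state + (self.label,)

    def __setstate__(self, state):
        # Peel off our attribute, then hand the rest back to ndarray.
        self.label = state[-1]
        super().__setstate__(state[:-1])


weights = NetWeights([[1.0, 2.0], [3.0, 4.0]], label="layer-1")
restored = pickle.loads(pickle.dumps(weights))
print(type(restored).__name__, restored.label)
```

Without the two overrides, the array data would still survive the
round trip, but label would be lost on the receiving side.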

Now, I don't know that I actually HAVE to pass my neural network and
input data as copies -- they're both READ-ONLY objects for the
duration of an evaluate function (which can go on for quite a while).
So, I have also started to investigate shared-memory approaches.  I
don't know yet how a subprocess references a shared-memory object,
but presumably you pass a reference to the object rather than the
whole object.  It also appears that a subprocess acquires a temporary
lock on a shared-memory object, so one process may well spend time
waiting for another (might individual CPU caches sidestep this
problem?).  Anyway, an implementation of a shared-memory ndarray is
here:

https://bitbucket.org/cleemesser/numpy-sharedmem/src/3fa526d11578/shmarray.py

I've added a few lines to this code that allow subclassing the
shared-memory array, which I need (because my neural net objects are
more than just the array; they also contain metadata).  But I've run
into some trouble doing the actual sharing part.  The shmarray class
CANNOT be pickled.  I think that my understanding of multiprocessing
needs to evolve beyond the use of Pool, but I'm not sure yet.  This
post suggests as much:

http://mail.scipy.org/pipermail/scipy-user/2009-February/019696.html
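
One way this could look without Pool is sketched below, assuming
Python 3: each multiprocessing.Process receives the shared buffer as a
constructor argument, and only small task and result messages travel
through queues.  The worker and run functions are hypothetical, and the
row sum is just a placeholder for a real evaluation.

```python
import multiprocessing as mp

import numpy as np


def worker(raw, shape, tasks, results):
    # Wrap the shared buffer as an ndarray; nothing is copied here.
    weights = np.frombuffer(raw, dtype=np.float64).reshape(shape)
    # Pull row indices until the None sentinel arrives.
    for row in iter(tasks.get, None):
        results.put((row, float(weights[row].sum())))


def run(weights, n_workers=2):
    raw = mp.RawArray("d", weights.size)
    np.frombuffer(raw, dtype=np.float64)[:] = weights.ravel()
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker,
                        args=(raw, weights.shape, tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    n_rows = weights.shape[0]
    for row in range(n_rows):
        tasks.put(row)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    # Collect results before joining so the queues can drain.
    out = dict(results.get() for _ in range(n_rows))
    for p in procs:
        p.join()
    return out


if __name__ == "__main__":
    print(run(np.arange(6, dtype=np.float64).reshape(2, 3)))
```

The shared array is passed only when each process is created, which is
the one point at which multiprocessing allows such objects to be handed
to a child.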

I don't believe that my questions are specific to numpy, which is why
I'm posting here, in a more general Python forum.

When should one pickle and copy?  When should one implement an object
in shared memory?  Why is pickling apparently such a non-trivial
process anyway?  And, given that multi-core CPUs are apparently here
to stay, should it be so difficult to make use of them?


Thread

Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-04 13:20 -0700
  Re: Multiprocessing, shared memory vs. pickled copies Philip Semanchuk <philip@semanchuk.com> - 2011-04-04 19:34 -0400
    Re: Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-05 09:58 -0700
      Re: Multiprocessing, shared memory vs. pickled copies Philip Semanchuk <philip@semanchuk.com> - 2011-04-05 13:43 -0400
        Re: Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-06 23:40 -0700
          Re: Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-07 00:41 -0700
            Re: Multiprocessing, shared memory vs. pickled copies Philip Semanchuk <philip@semanchuk.com> - 2011-04-07 09:23 -0400
          Re: Multiprocessing, shared memory vs. pickled copies Robert Kern <robert.kern@gmail.com> - 2011-04-07 12:44 -0500
            Re: Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-07 11:39 -0700
              Re: Multiprocessing, shared memory vs. pickled copies Robert Kern <robert.kern@gmail.com> - 2011-04-07 15:01 -0500
  Re: Multiprocessing, shared memory vs. pickled copies Robert Kern <robert.kern@gmail.com> - 2011-04-04 19:05 -0500
    Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-07 16:39 -0700
  Re: Multiprocessing, shared memory vs. pickled copies Philip Semanchuk <philip@semanchuk.com> - 2011-04-04 21:16 -0400
  Re: Multiprocessing, shared memory vs. pickled copies Robert Kern <robert.kern@gmail.com> - 2011-04-05 10:47 -0500
  Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-07 17:03 -0700
    Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-07 17:38 -0700
      Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-07 18:10 -0700
        Re: Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-09 00:36 -0700
          Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-09 10:15 -0700
            Re: Multiprocessing, shared memory vs. pickled copies John Ladasky <ladasky@my-deja.com> - 2011-04-09 13:18 -0700
              Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-10 08:01 -0700
        Re: Multiprocessing, shared memory vs. pickled copies sturlamolden <sturlamolden@yahoo.no> - 2011-04-10 15:35 -0700
