Groups > comp.lang.python > #4009 > unrolled thread

Simple map/reduce utility function for data analysis

Started by: Raymond Hettinger <python@rcn.com>
First post: 2011-04-25 16:48 -0700
Last post: 2011-04-26 02:52 +0000
Articles: 4 (3 participants)


Contents

  Simple map/reduce utility function for data analysis Raymond Hettinger <python@rcn.com> - 2011-04-25 16:48 -0700
    Re: Simple map/reduce utility function for data analysis Paul Rubin <no.email@nospam.invalid> - 2011-04-25 19:42 -0700
      Re: Simple map/reduce utility function for data analysis Raymond Hettinger <python@rcn.com> - 2011-04-26 11:12 -0700
    Re: Simple map/reduce utility function for data analysis Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-04-26 02:52 +0000

#4009 — Simple map/reduce utility function for data analysis

From: Raymond Hettinger <python@rcn.com>
Date: 2011-04-25 16:48 -0700
Subject: Simple map/reduce utility function for data analysis
Message-ID: <967b61ce-5641-484a-bd76-5a09bd4247e0@w9g2000prg.googlegroups.com>

Here's a handy utility function for you guys to play with:

    http://code.activestate.com/recipes/577676/


Raymond
twitter: @raymondh


#4013

From: Paul Rubin <no.email@nospam.invalid>
Date: 2011-04-25 19:42 -0700
Message-ID: <7xei4pohrq.fsf@ruckus.brouhaha.com>
In reply to: #4009

Raymond Hettinger <python@rcn.com> writes:
> Here's a handy utility function for you guys to play with:
>     http://code.activestate.com/recipes/577676/

Cute, but why not use collections.defaultdict for the return dict?
Untested:

   d = defaultdict(list)
   for key,value in ifilter(bool,imap(mapper, data)):
      d[key].append(value)
   ...
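
For readers on Python 3, where itertools.imap and itertools.ifilter no longer exist (the built-in map and filter are lazy instead), the snippet above might be fleshed out roughly as follows. This is only a sketch: the name map_reduce_dd is made up for illustration, and the even_odd mapper is an assumption modelled on the sample output quoted elsewhere in this thread, not a copy of the recipe.

```python
from collections import defaultdict


def map_reduce_dd(data, mapper):
    """Group mapper results by key.

    The mapper is assumed to return a (key, value) pair,
    or None for items that should be skipped.
    """
    d = defaultdict(list)
    # filter(bool, ...) drops falsy results such as None.
    for key, value in filter(bool, map(mapper, data)):
        d[key].append(value)
    return d


def even_odd(elem):
    # Hypothetical mapper: keep 10..20 and key each element by parity.
    if 10 <= elem <= 20:
        return elem % 2, elem


result = map_reduce_dd(range(30), even_odd)
print(dict(result))
```

Note that the function returns the defaultdict itself, so callers see defaultdict behaviour on the result.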


#4061

From: Raymond Hettinger <python@rcn.com>
Date: 2011-04-26 11:12 -0700
Message-ID: <71335df7-4f4a-431d-9e2c-4ba1e98c3f7d@d19g2000prh.googlegroups.com>
In reply to: #4013

On Apr 25, 7:42 pm, Paul Rubin <no.em...@nospam.invalid> wrote:
> Raymond Hettinger <pyt...@rcn.com> writes:
> > Here's a handy utility function for you guys to play with:
> >    http://code.activestate.com/recipes/577676/
>
> Cute, but why not use collections.defaultdict for the return dict?
> Untested:

My first draft had a defaultdict, but that implementation detail would
get exposed to the user unless the return value was first coerced to a
regular dict.  Also, I avoided modern Python features so the code
would run well on Psyco and so that it would make sense to beginning
users.
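
To illustrate the "exposed implementation detail" point, here is a small sketch of how a returned defaultdict behaves differently from a regular dict, and how coercing with dict() hides that. The keys used are arbitrary placeholders.

```python
from collections import defaultdict

d = defaultdict(list)
d["a"].append(1)

# Merely reading a missing key on a defaultdict silently creates it.
_ = d["typo"]
leaked_keys = sorted(d)

# Coercing to a plain dict restores the usual KeyError on missing keys.
plain = dict(d)
try:
    plain["missing"]
    raised = False
except KeyError:
    raised = True

print(leaked_keys, type(plain).__name__, raised)
```

The coercion is a shallow copy, so it costs one extra pass over the keys.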


> Untested:
>   d = defaultdict(list)
>   for key,value in ifilter(bool,imap(mapper, data)):
>      d[key].append(value)
>   ...

Nice use of itertools.  FWIW, ifilter() will accept None for the first
argument -- that's a bit faster than using bool().
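
A quick sketch of that equivalence, written for Python 3 where the built-in filter() takes the place of ifilter(). The sample list stands in for mapper output and is purely illustrative.

```python
# Mapper results: (key, value) pairs, with None for skipped items.
results = [(0, 10), None, (1, 11), None, (0, 12)]

# Passing bool as the predicate calls bool() on every item.
with_bool = list(filter(bool, results))

# Passing None tests each item's truth value directly.
with_none = list(filter(None, results))

print(with_bool == with_none)
```

Either form also drops any other falsy result (an empty tuple, for instance), which only matters if a mapper can return one.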


Raymond


#4014

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2011-04-26 02:52 +0000
Message-ID: <4db63361$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to: #4009

On Mon, 25 Apr 2011 16:48:42 -0700, Raymond Hettinger wrote:

> Here's a handy utility function for you guys to play with:
> 
>     http://code.activestate.com/recipes/577676/

Nice. 

That's similar to itertools.groupby except that it consolidates all the 
equal key results into one list, instead of in consecutive runs.

Also groupby returns iterators instead of lists, which makes it a PITA to 
work with. map_reduce is much more straightforward to use.

Example given in the code:

>>> map_reduce(range(30), even_odd)
{0: [10, 12, 14, 16, 18, 20], 1: [11, 13, 15, 17, 19]}

>>> [(key[0], list(group)) for key,group in groupby(range(30), even_odd)
...  if key is not None]
[(0, [10]), (1, [11]), (0, [12]), (1, [13]), (0, [14]), (1, [15]),
 (0, [16]), (1, [17]), (0, [18]), (1, [19]), (0, [20])]
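
The "consecutive runs" behaviour can be seen in isolation with a plain parity key. The sketch below is not the recipe's code; the parity helper and the 10..20 range are assumptions chosen to mirror the example above. Sorting by the key first is the usual way to get groupby to consolidate equal keys.

```python
from itertools import groupby


def parity(n):
    # Hypothetical key function: 0 for even, 1 for odd.
    return n % 2


data = range(10, 21)

# Unsorted input: parity alternates, so every run has one element.
runs = [(key, list(group)) for key, group in groupby(data, parity)]

# Sorting by the key first (sorted() is stable) consolidates equal keys.
consolidated = {
    key: list(group)
    for key, group in groupby(sorted(data, key=parity), key=parity)
}

print(len(runs), consolidated)
```

The sort adds an O(n log n) step and needs orderable keys, which a dict-of-lists approach avoids.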


So... when can we expect map_reduce in the functools module? :)



-- 
Steven
