Groups > comp.lang.python > #39975 > unrolled thread

Re: groupby behaviour

Started by: andrea crotti <andrea.crotti.0@gmail.com>
First post: 2013-02-26 17:09 +0000
Last post: 2013-02-26 09:35 -0800
Articles: 2 (2 participants)

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: groupby behaviour andrea crotti <andrea.crotti.0@gmail.com> - 2013-02-26 17:09 +0000
    Re: groupby behaviour Paul Rubin <no.email@nospam.invalid> - 2013-02-26 09:35 -0800

#39975 — Re: groupby behaviour

From: andrea crotti <andrea.crotti.0@gmail.com>
Date: 2013-02-26 17:09 +0000
Subject: Re: groupby behaviour
Message-ID: <mailman.2557.1361898573.2939.python-list@python.org>
2013/2/26 Ian Kelly <ian.g.kelly@gmail.com>:
> On Tue, Feb 26, 2013 at 9:27 AM, andrea crotti
> <andrea.crotti.0@gmail.com> wrote:
>> So I was trying to use groupby (which I used in the past), but I
>> noticed a very strange thing if using list on
>> the result:
>
> As stated in the docs:
>
> """
> The returned group is itself an iterator that shares the underlying
> iterable with groupby(). Because the source is shared, when the
> groupby() object is advanced, the previous group is no longer visible.
> So, if that data is needed later, it should be stored as a list:
> """


I should have read more carefully, sorry. I was in the funny situation
where it would have actually worked in the production code, but it was
failing in the unit tests (because I was using list only there).
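
A minimal sketch of that failure mode, assuming a small hypothetical list `data` (it is not from the original post): calling list() on the groupby object itself advances it past every group before any group has been read, so the stored group iterators are already stale.

```python
from itertools import groupby

# Hypothetical sample data, already sorted so equal items are adjacent.
data = [1, 1, 2, 2, 3]

# list() on the groupby object walks through every group before any of
# the group iterators has been consumed.
pairs = list(groupby(data))
keys = [k for k, _ in pairs]

# The first group's iterator is stale by now and yields nothing.
first_group = list(pairs[0][1])

# Reading each group while it is still the current one works as expected.
in_order = [(k, list(g)) for k, g in groupby(data)]
```

Here `keys` still holds every key, but `first_group` comes back empty, while `in_order` holds the full groups. The sketch only inspects the first stale group.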

This sharing is very weird, though, and it still doesn't really look
right. Is it done just for performance reasons?


#39978

From: Paul Rubin <no.email@nospam.invalid>
Date: 2013-02-26 09:35 -0800
Message-ID: <7xhakzvuzp.fsf@ruckus.brouhaha.com>
In reply to: #39975
andrea crotti <andrea.crotti.0@gmail.com> writes:
> It's very weird though this sharing and still doesn't really look
> right, is it done just for performance reasons?

Without the sharing, groupby would have to buffer each group's items,
which could consume unbounded amounts of memory, e.g. if there are
millions of items in a group.
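
As a rough illustration of the memory point, here is a sketch that consumes each group lazily and keeps only a running count. The helper name `run_lengths` and the sample `stream` are assumptions made up for this example, not anything from the thread.

```python
from itertools import groupby

def run_lengths(items):
    """Yield (key, count) for each run of equal items.

    Each group is consumed lazily, so no group is ever held in memory
    as a whole, however long it is.
    """
    for key, group in groupby(items):
        yield key, sum(1 for _ in group)

# Hypothetical input: a generator producing 5 runs of 1000 equal keys.
stream = (n // 1000 for n in range(5000))
result = list(run_lengths(stream))
```

Only one item of the stream is alive at a time here, which is what the shared iterator makes possible.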

If you're not worried about that situation it's simplest to just
convert each group to a list for processing it:

   from itertools import groupby

   for k, g in groupby(...):
       gs = list(g)  # copy the group out before groupby advances
       # do stuff with gs

Calling list(g) reads all the elements in the group, which advances the
shared underlying iterator up to the start of the next group.  Since
you've copied out all the group elements, you don't have to worry about
the sharing effect.
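
A small self-contained sketch of the same pattern, assuming a hypothetical list `words` grouped by first letter (both the data and the key function are made up for illustration):

```python
from itertools import groupby

# Hypothetical input, sorted so words with the same first letter are adjacent.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = {}
for key, group in groupby(words, key=lambda w: w[0]):
    # Copy the group into a list while it is still the current group.
    groups[key] = list(group)
```

Because every group is copied before the loop moves on, `groups` can be used freely after iteration has finished.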
