Path: csiph.com!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <9d0cd072-3cf7-4156-8e84-884faeef7048@googlegroups.com>
References: <9d0cd072-3cf7-4156-8e84-884faeef7048@googlegroups.com>
From: Joshua Landau <joshua.landau.ws@gmail.com>
Date: Mon, 8 Jul 2013 22:24:09 +0100
Subject: Re: make sublists of a list broken at nth certain list items
To: python-list <python-list@python.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.4401.1373318697.3114.python-list@python.org>
Lines: 64
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:50175

On 8 July 2013 21:52, CM <cmpython@gmail.com> wrote:
> I'm looking for a Pythonic way to do the following:
>
> I have data in the form of a long list of tuples.  I would like to break
> that list into four sub-lists.  The break points would be based on the nth
> occasion of a particular tuple.  (The list represents behavioral data
> trials; the particular tuple represents the break between trials; I want to
> collect 20 trials at a time, so every 20th break between trials, start a new
> sublist).

I would do this like so:

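from collections import deque
from itertools import chain

# Fast and hacky -- just how I like it
exhaust_iterable = deque(maxlen=0).extend

def chunk_of(data, *, length):
    count = 0
    for datum in data:
        # A tuple whose first item is 1 marks the start of a trial
        count += datum[0] == 1

        yield datum

        if count == length:
            break

def chunked(data):
    data = iter(data)
    # Pull one item first so the loop stops once data runs out
    for first in data:
        chunk = chunk_of(chain([first], data), length=20)
        yield chunk
        # Skip whatever the caller left unread in this chunk
        exhaust_iterable(chunk)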

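Call "chunked(data)" and iterate over the chunks it yields. Each chunk is
a generator reading from the same underlying iterator, so moving on to the
next chunk throws away whatever you haven't read from the current one.
Convert a chunk to a permanent form (e.g. list(chunk)) before advancing if
you need to keep it.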
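
If the lazy behaviour is more trouble than it's worth, an eager version
that just returns lists avoids the problem. This is a minimal sketch, not
the generator approach above: the name split_every_n_trials and the
is_trial_start parameter are made up for illustration, and it assumes a
tuple whose first item equals 1 marks the start of a trial. Note that it
starts a new sublist at the marker that opens trial n+1, whereas the
generator above ends a chunk right after yielding the marker that brings
the count up to the limit.

```python
def split_every_n_trials(data, n=20, is_trial_start=lambda tup: tup[0] == 1):
    """Return a list of lists, each holding up to n whole trials."""
    chunks = []
    current = []
    trials_in_current = 0
    for tup in data:
        if is_trial_start(tup):
            # Close the current sublist once it already holds n trials
            if trials_in_current == n:
                chunks.append(current)
                current = []
                trials_in_current = 0
            trials_in_current += 1
        current.append(tup)
    # Keep the final, possibly shorter, sublist
    if current:
        chunks.append(current)
    return chunks

# Tiny made-up data set: (1, ...) marks the start of a trial
data = [(1, 'a'), (0, 'b'), (1, 'c'), (0, 'd'), (0, 'e'), (1, 'f'), (0, 'g')]
result = split_every_n_trials(data, n=2)
```

With n=2 the made-up data above comes back as two sublists: the first
holds the first two trials (five tuples), the second holds the third trial.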

Looking at your code:

> for tup in data_list:
>     if tup[0] == 1.0: #Therefore the start of a new trial
>
>         #We have a match!  Therefore get the index in the data_list
>         data_list_index = data_list.index(tup)

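This is no good (ninja'd by Fábio): list.index() returns the position of
the first equal tuple, so repeated tuples give the wrong index, and it
rescans the list every time. The proper way to keep an index is by: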

for index, tup in enumerate(data_list):
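
To make that concrete, here is a rough sketch of the break-index loop
rewritten around enumerate(). The function name trial_break_indexes and
the "every" parameter are invented for the example, and it assumes the
count is checked before it is incremented, so that breaks land on trial
starts number 0, 20, 40, ... (counting from zero). Anything before the
first trial start is not covered by the slices at the end.

```python
def trial_break_indexes(data_list, every=20):
    """Indexes of the tuples that start every `every`-th trial."""
    breaks = []
    trial_count = 0
    for index, tup in enumerate(data_list):
        if tup[0] == 1.0:  # start of a new trial
            if trial_count % every == 0:
                breaks.append(index)
            trial_count += 1
    return breaks

# Made-up data with repeated tuples, which is where list.index() goes wrong
data_list = [(1.0, 'x'), (0.5, 'y'), (1.0, 'x'), (0.5, 'y'), (1.0, 'x'), (0.5, 'z')]
breaks = trial_break_indexes(data_list, every=2)

# Slice the list between consecutive break indexes
ends = breaks[1:] + [len(data_list)]
sublists = [data_list[start:stop] for start, stop in zip(breaks, ends)]
```

Here data_list.index((1.0, 'x')) is 0 no matter which of the three
identical tuples the loop is looking at, while enumerate() reports the
real positions.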

>         trial_count += 1  #update the trial count.
>
>         if trial_count % 20 == 0:  #this will match on 0, 20, 40, 60, 80
>             trial_break_indexes_list.append(data_list_index)
>
> print 'This is trial_break_indexes_list: ', trial_break_indexes_list
...
> I sense there is a way more elegant/simpler/Pythonic way to approach this,
> let alone one that is actually correct, but I don't know of it.
> Suggestions appreciated!

Yup.