From: Joshua Landau
Date: Mon, 8 Jul 2013 22:24:09 +0100
Subject: Re: make sublists of a list broken at nth certain list items
To: python-list
Newsgroups: comp.lang.python

On 8 July 2013 21:52, CM wrote:
> I'm looking for a Pythonic way to do the following:
>
> I have data in the form of a long list of tuples.  I would like to break that list into four sub-lists.  The break points would be based on the nth occasion of a particular tuple.  (The list represents behavioral data trials; the particular tuple represents the break between trials; I want to collect 20 trials at a time, so every 20th break between trials, start a new sublist).

I would do this like so:

    from collections import deque
    from itertools import chain

    # Fast and hacky -- just how I like it
    exhaust_iterable = deque(maxlen=0).extend

    def chunk_of(data, *, length):
        count = 0
        for datum in data:
            # True counts as 1, so this tallies the trial-start tuples
            count += datum[0] == 1
            yield datum
            if count == length:
                break

    def chunked(data):
        data = iter(data)
        # Pulling the first item here ends the loop once data runs out
        for first in data:
            chunk = chunk_of(chain([first], data), length=20)
            yield chunk
            # Skip whatever the caller left unread in this chunk
            exhaust_iterable(chunk)

You use "chunked(data)" and iterate over the 'chunks' in that. Each chunk ends on the tuple that takes the count to "length". If you go to the next chunk before finishing the one you're on, the rest of the previous chunk will be lost, so convert it to a permanent form (a list, say) first.

Looking at your code:

> for tup in data_list:
>     if tup[0] == 1.0:  #Therefore the start of a new trial
>
>         #We have a match!  Therefore get the index in the data_list
>         data_list_index = data_list.index(tup)

This is no good (ninja'd by Fábio): data_list.index(tup) returns the position of the first tuple equal to tup, which is not necessarily the one you're on, and it rescans the list every time. The proper way to keep an index is by:

    for index, tup in enumerate(data_list):

>         trial_count += 1  #update the trial count.
>
>         if trial_count % 20 == 0:  #this will match on 0, 20, 40, 60, 80
>             trial_break_indexes_list.append(data_list_index)
>
> print 'This is trial_break_indexes_list: ', trial_break_indexes_list
...
> I sense there is a way more elegant/simpler/Pythonic way to approach this, let alone one that is actually correct, but I don't know of it.  Suggestions appreciated!

Yup.
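
If you want the sublists themselves rather than the break indexes, here is a minimal eager sketch. It assumes, as in the quoted code, that a tuple whose first element equals 1.0 marks the start of a trial. The names split_at_nth_marker, n and is_marker are made up for this illustration, and a new sublist begins at the marker tuple itself, which may or may not be the boundary you want.

```python
def split_at_nth_marker(data, n=20, is_marker=lambda tup: tup[0] == 1.0):
    """Return a list of sublists, each holding up to n trials.

    Assumption: is_marker(tup) is true for the tuple that starts a trial.
    """
    chunks = []
    markers_seen = 0
    for tup in data:
        if is_marker(tup):
            # Markers number 0, n, 2n, ... each open a new sublist.
            if markers_seen % n == 0:
                chunks.append([])
            markers_seen += 1
        elif not chunks:
            # Anything before the first marker gets a sublist of its own.
            chunks.append([])
        chunks[-1].append(tup)
    return chunks


# Tiny demonstration with two trials per sublist instead of twenty.
demo = [(1, 'a'), (0, 'b'), (1, 'c'), (0, 'd'), (1, 'e'), (0, 'f')]
demo_chunks = split_at_nth_marker(demo, n=2)
# demo_chunks == [[(1, 'a'), (0, 'b'), (1, 'c'), (0, 'd')], [(1, 'e'), (0, 'f')]]
```

This builds every sublist in memory, so nothing is lost if you skip ahead, at the cost of not being lazy.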