From: Ian Kelly
Date: Wed, 5 Dec 2012 09:16:09 -0700
Subject: Re: Good use for itertools.dropwhile and itertools.takewhile
Newsgroups: comp.lang.python

On Wed, Dec 5, 2012 at 6:45 AM, Chris Angelico wrote:
> On Wed, Dec 5, 2012 at 12:17 PM, Nick Mellor wrote:
>>
>> takewhile mines for gold at the start of a sequence, dropwhile drops the dross at the start of a sequence.
>
> When you're using both over the same sequence and with the same
> condition, it seems odd that you need to iterate over it twice.
> Perhaps a partitioning iterator would be cleaner - something like
> this:
>
> def partitionwhile(predicate, iterable):
>     iterable = iter(iterable)
>     while True:
>         val = next(iterable)
>         if not predicate(val): break
>         yield val
>     raise StopIteration # Signal the end of Phase 1
>     for val in iterable: yield val # or just "yield from iterable", I think
>
> Only the cold hard boot of reality just stomped out the spark of an
> idea. Once StopIteration has been raised, that's it, there's no
> "resuming" the iterator. Is there a way around that? Is there a clean
> way to say "Done for now, but next time you ask, there'll be more"?

Return two separate iterators, with the contract that the second
iterator can't be used until the first has completed.
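
The contract can be sketched on its own, before bringing groupby into it. This is only an illustration under assumptions: `split_while`, `head`, `tail` and the `state` dict are made-up names, not anything from the thread or from itertools. Both generators share one underlying iterator, and the first failing item is stashed so the second generator can emit it.

```python
def split_while(predicate, iterable):
    """Hypothetical helper: return (head, tail) generators over one shared pass."""
    it = iter(iterable)
    state = {"head_done": False, "pending": []}

    def head():
        for val in it:
            if not predicate(val):
                # Remember the first failing item so tail() can emit it.
                state["pending"].append(val)
                break
            yield val
        state["head_done"] = True

    def tail():
        # Enforce the contract: head must be exhausted before tail is used.
        if not state["head_done"]:
            raise TypeError("tail used before head was exhausted")
        yield from state["pending"]
        yield from it

    return head(), tail()


head, tail = split_while(str.isupper, "CAPSICUM RED fresh from QLD".split())
head_items = list(head)  # ['CAPSICUM', 'RED']
tail_items = list(tail)  # ['fresh', 'from', 'QLD']
```

Because these are generators, the TypeError surfaces on the first next() of the tail, not when split_while itself is called.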

Combined with Neil's groupby suggestion, we end up with something like
this:

    import itertools

    def partitionwhile(predicate, iterable):
        it = itertools.groupby(iterable, lambda x: bool(predicate(x)))
        pushback = missing = object()

        def first():
            nonlocal pushback
            # The default covers an empty iterable; a bare next(it) would raise
            # StopIteration inside the generator (a RuntimeError since Python 3.7).
            pred, subit = next(it, (False, ()))
            if pred:
                yield from subit
                pushback = None
            else:
                pushback = subit

        def second():
            if pushback is missing:
                raise TypeError("can't yield from second iterator before first iterator completes")
            elif pushback is not None:
                yield from pushback
            yield from itertools.chain.from_iterable(subit for key, subit in it)

        return first(), second()

>>> list(map(' '.join, partitionwhile(lambda x: x.upper() == x, "CAPSICUM RED fresh from QLD".split())))
['CAPSICUM RED', 'fresh from QLD']
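
For comparison, here is the takewhile/dropwhile pairing the thread started from, as a rough sketch (the `is_upper` helper and the variable names are mine, chosen for illustration). Over a list it gives the same result in two passes. Over a one-shot iterator it silently loses the first item that fails the predicate, which is the case a single-pass partition is meant to handle.

```python
from itertools import dropwhile, takewhile

words = "CAPSICUM RED fresh from QLD".split()


def is_upper(word):
    return word.upper() == word


# Two passes over a list: correct, but the leading items are tested twice.
two_pass = [
    ' '.join(takewhile(is_upper, words)),
    ' '.join(dropwhile(is_upper, words)),
]

# Same calls over a one-shot iterator: takewhile reads "fresh" to find
# out that it should stop, and that item is then gone for dropwhile.
one_shot = iter(words)
lossy = [
    ' '.join(takewhile(is_upper, one_shot)),
    ' '.join(dropwhile(is_upper, one_shot)),
]

print(two_pass)  # ['CAPSICUM RED', 'fresh from QLD']
print(lossy)     # ['CAPSICUM RED', 'from QLD']
```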