From: Jussi Piitulainen
Newsgroups: comp.lang.python
Subject: Re: The Cost of Dynamism (was Re: Pyhon 2.x or 3.x, which is faster?)
Date: Tue, 22 Mar 2016 14:52:47 +0200

BartC writes:

> Not everything fits into a for-loop you know! Why, take my own
> readtoken() function:
>
>     symbol = anything_other_than_skip_sym
>
>     while symbol != skip_sym:
>         symbol = readnextsymbol()
>
> Of course, a repeat-until or repeat-while would suit this better (but
> I don't know how it fits into Python syntax). So there's a case here
> for increasing the number of loop statements not reducing them.

Not sure why nobody seems to respond to this part. Perhaps I just
missed it?

It's true that while has its uses, or at least I think I've used it in
Python once or twice. But there's more fun to be had by turning your
data into a stream-like object.

    stream = iter(' /* this is C! */')  # <-- produces a character at a time

Now you can ask for the next item that satisfies a condition using a
generator expression:

    next(symbol for symbol in stream if not symbol.isspace())
    ---> '/'
    next(symbol for symbol in stream if not symbol.isspace())
    ---> '*'

Or collect the remaining items:

    list(symbol for symbol in stream if not symbol.isspace())
    ---> ['t', 'h', 'i', 's', 'i', 's', 'C', '!', '*', '/']

You could also say:

    for symbol in stream:
        if symbol.isspace():
            continue
        ...

But this particular stream is empty by now.

I work with long streams of tokenized and annotated sentences (which
for me are streams of tokens) that sometimes come packed in streams of
paragraphs packed in streams of texts. I build whatever stream I happen
to want by nesting generator functions and generator expressions and
some related machinery. (You could build on a character stream or a
byte stream that you obtain by opening a file for reading; I tend to
read line by line through itertools.groupby, because that's what I do.)

These things compose well.
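
To make the nesting concrete, here is a minimal sketch of reading lines through itertools.groupby and layering generators on top. The names paragraphs and tokens, the blank-line paragraph separator, and the sample lines are assumptions made up for illustration, not the actual pipeline described above.

```python
from itertools import groupby


def paragraphs(lines):
    """Yield each run of consecutive non-blank lines as a list of stripped lines.

    Assumption for this sketch: paragraphs are separated by blank lines.
    """
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            yield [line.strip() for line in group]


def tokens(paragraph):
    """Return a generator over the whitespace-separated tokens of one paragraph."""
    return (token for line in paragraph for token in line.split())


# Hypothetical sample input; an open text file would work the same way,
# since iterating over a file also yields one line at a time.
text = ["the cat sat\n", "on the mat\n", "\n", "\n", "a dog barked\n"]

# A stream of paragraphs, each turned into a list of tokens.
result = [list(tokens(p)) for p in paragraphs(iter(text))]
print(result)
```

Each layer only pulls from the layer below when asked for its next item, so nothing reads further ahead than it needs to.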