


Idioms combining 'next(items)' and 'for item in items:'

Started by: Terry Reedy <tjreedy@udel.edu>
First post: 2011-09-10 15:36 -0400
Last post: 2011-09-12 15:24 -0400
Articles: 3 (2 participants)




#13078 — Idioms combining 'next(items)' and 'for item in items:'

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-09-10 15:36 -0400
Subject: Idioms combining 'next(items)' and 'for item in items:'
Message-ID: <mailman.944.1315683474.27778.python-list@python.org>

Python's iterator protocol for an iterator 'items' allows combinations 
of explicit "next(items)" calls with the implicit calls of a "for item 
in items:" loop. There are at least three situations in which this can 
be useful. (While the code posted here is not testable, being incomplete 
or having pseudocode lines, I have tested full examples of each idiom 
with 3.2.)


1. Process first item of an iterable separately.

A traditional solution is a flag variable that is tested for each item.

first = True
<other setup>
for item in iterable:
   if first:
     <process first>
     first = False
   else:
     <process non-first>

(I have seen code like this posted on this list several times, including 
today.)

Better, to me, is to remove the first item *before* the loop.

items = iter(iterable)
<set up with next(items)>
for item in items:
   <process non-first>

Sometimes <other setup> and <process first> can be combined in <set up 
with next(items)>. For instance, "result = []" followed by 
"result.append(process_first(item))" becomes "result = 
[process_first(next(items))]".
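
As a runnable illustration of the idiom above, here is a minimal sketch. The function name capitalize_first and the choice of string methods are made up for the example; only the iter()/next()/for structure comes from the post.

```python
def capitalize_first(words):
    """Capitalize the first word and lowercase the rest (illustrative only)."""
    items = iter(words)
    # Setup and first-item processing combined in one statement.
    # next() raises StopIteration here if 'words' is empty.
    result = [next(items).capitalize()]
    for item in items:              # the loop only sees the remaining items
        result.append(item.lower())
    return result
```

Because the for loop resumes the same iterator that next() advanced, no flag needs to be tested on each pass.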

The statement containing the explicit next(items) call can optionally be 
wrapped to explicitly handle the case of an empty iterable in whatever 
manner is desired.

try:
     <set up with next(items)>
except StopIteration:
     raise ValueError("iterable cannot be empty")
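
A concrete version of the wrapped call might look like the sketch below. The function name product is hypothetical, and the "from None" clause is an addition that needs Python 3.3 or later (it hides the StopIteration from the traceback); drop it on 3.2.

```python
def product(iterable):
    """Multiply all items together, seeding the result with the first item."""
    items = iter(iterable)
    try:
        result = next(items)        # set up with the first item
    except StopIteration:
        # Translate the protocol-level exception into one meaningful to callers.
        raise ValueError("iterable cannot be empty") from None
    for item in items:
        result *= item
    return result
```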


For an iterable known to have a small number of items, the first item 
can be removed, with only a small copying penalty, with a simple assignment.

first, *rest = iterable

The older and harder to write version of this requires the iterable to 
be a sequence:

first, rest = seq[0], seq[1:]
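
A small sketch of how the two spellings differ; the sample values are arbitrary. Starred assignment accepts any iterable and always binds a list, whereas slicing needs a sequence and keeps its type.

```python
# Starred assignment works on any iterable, including a generator.
first, *rest = (n * n for n in range(1, 5))
# 'rest' is always a list, whatever the source was.

# Slicing requires a sequence, and the remainder keeps the sequence's type.
seq = 'abcd'
head, tail = seq[0], seq[1:]

# The same string unpacked with a star gives a list of characters instead.
head2, *tail2 = seq
```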


2. Process the last item of an iterable differently. As far as I know, 
this cannot be done for a generic iterable with a flag. It requires a 
look ahead.

items = iter(iterable)
current = next(items)
for item in items:
   <process non-last current>
   current = item
<process last current>

To treat both first and last differently, pull off the first before 
binding 'current'. Wrap next() calls as desired.
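
To make the look-ahead concrete, here is a hedged sketch. The function name punctuate and the ';' and '.' terminators are invented for illustration; returning '' for empty input is just one possible choice.

```python
def punctuate(iterable):
    """Join strings, ending non-last items with ';' and the last with '.'."""
    items = iter(iterable)
    try:
        current = next(items)
    except StopIteration:
        return ''                    # assumption: empty input gives ''
    out = []
    for item in items:
        out.append(current + ';')    # 'current' is known not to be last
        current = item
    out.append(current + '.')        # the loop ended, so 'current' is last
    return ' '.join(out)
```

The loop always runs one item behind, so an item is only processed once it is known whether another follows it.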


3. Process the items of an iterable in pairs.

items = iter(iterable)
for first in items:
     second = next(items)
     <process first and second>

This time, StopIteration is raised for an odd number of items. Catch and 
process as desired. One possibility is to raise ValueError("Iterable 
must have an even number of items").
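
One way to spell that out, as a sketch only: the function name pairs and the decision to collect tuples in a list are assumptions for the example, and "from None" again needs Python 3.3 or later.

```python
def pairs(iterable):
    """Return a list of (first, second) tuples from consecutive items."""
    items = iter(iterable)
    result = []
    for first in items:
        try:
            second = next(items)     # consume the partner of 'first'
        except StopIteration:
            raise ValueError(
                "Iterable must have an even number of items") from None
        result.append((first, second))
    return result
```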

A useful example of pairing only some of the items is iterating a (unicode) 
string by code points ('characters') rather than by code units. This 
requires combining the surrogate pairs used for supplementary characters 
when the units are 16 bits.

chars = iter(string)
for char in chars:
     if is_first_surrogate(char):
         char += next(chars)
     yield char
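
The fragment above depends on an is_first_surrogate helper that is not shown. The sketch below is a self-contained variant that works on 16-bit code units given as integers, so it behaves the same on any Python 3 build; the names code_points and utf16_units are hypothetical. One thing worth knowing: since Python 3.7 a StopIteration that escapes a generator body is turned into RuntimeError, so the next() call is wrapped explicitly.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of 'text' as a tuple of ints."""
    data = text.encode('utf-16-le')
    return struct.unpack('<%dH' % (len(data) // 2), data)

def code_points(units):
    """Yield one str per code point from an iterable of UTF-16 code units."""
    units = iter(units)
    for unit in units:
        if 0xD800 <= unit <= 0xDBFF:         # first (high) surrogate
            try:
                low = next(units)            # pull in the second half
            except StopIteration:
                raise ValueError("truncated surrogate pair") from None
            # Combine the pair; this sketch does not check that 'low'
            # really is a low surrogate (0xDC00-0xDFFF).
            unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
        yield chr(unit)
```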

-- 
Terry Jan Reedy



#13176

From: Duncan Booth <duncan.booth@invalid.invalid>
Date: 2011-09-12 13:06 +0000
Message-ID: <Xns9F5E8F6211D03duncanbooth@127.0.0.1>
In reply to: #13078

Terry Reedy <tjreedy@udel.edu> wrote:

> The statement containing the explicit next(items) call can optionally be 
> wrapped to explicitly handle the case of an empty iterable in whatever 
> manner is desired.
> 
> try:
>      <set up with next(items)>
> except StopIteration:
>      raise ValueError("iterable cannot be empty")
> 
> 
Alternatively, if all you want is for an empty iterable to do nothing, you 
could write it like this:

    items = iter(iterable)
    for first in items:
        <process first>
        break
    for item in items:
        <process non-first>
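
Filled in with placeholder processing, the two-loop form could look like this sketch. The function name describe and the 'header'/'row' labels are invented for the example.

```python
def describe(iterable):
    """Label the first item as a header and the rest as rows."""
    lines = []
    items = iter(iterable)
    for first in items:
        lines.append('header: %s' % first)
        break                        # leave after exactly one item
    for item in items:               # resumes where the first loop stopped
        lines.append('row: %s' % item)
    return lines
```

With an empty iterable neither loop body runs, so the function quietly returns an empty list.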

However, the issue I have with any of this pulling the first element out of 
the loop is that if you want special processing for the first element you 
are likely to also want it for the last, and if it is a single item you need 
to process that item with both bits of special code. I don't see how that works 
unless you have all elements within the single loop and test for first/last.

> 2. Process the last item of an iterable differently. As far as I know, 
> this cannot be done for a generic iterable with a flag. It requires a 
> look ahead.

I think that must be correct, e.g. if reading from an interactive prompt you 
cannot detect end of input until you fail to read any more.

See my answer to http://stackoverflow.com/questions/7365372/is-there-a-pythonic-way-of-knowing-when-the-first-and-last-loop-in-a-for-is-being/7365552#7365552
for a generator that wraps the lookahead.
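
The linked answer is not reproduced in this thread, so the following is not that code, only a minimal sketch of what a lookahead-wrapping generator can look like. The name with_position and the (item, is_first, is_last) tuple shape are assumptions.

```python
def with_position(iterable):
    """Yield (item, is_first, is_last) for each item of any iterable."""
    items = iter(iterable)
    try:
        current = next(items)
    except StopIteration:
        return                       # empty input: yield nothing
    is_first = True
    for item in items:
        # Another item exists, so 'current' cannot be the last one.
        yield current, is_first, False
        current, is_first = item, False
    yield current, is_first, True    # a single item is both first and last
```

A caller can then keep every item inside one loop and test the two flags there, which covers the single-item case.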

-- 
Duncan Booth http://kupuguy.blogspot.com



#13195

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-09-12 15:24 -0400
Message-ID: <mailman.1045.1315855471.27778.python-list@python.org>
In reply to: #13176

On 9/12/2011 9:06 AM, Duncan Booth wrote:
> Terry Reedy<tjreedy@udel.edu>  wrote:
>
>> The statement containing the explicit next(items) call can optionally be
>> wrapped to explicitly handle the case of an empty iterable in whatever
>> manner is desired.
>>
>> try:
>>       <set up with next(items)>
>> except StopIteration:
>>       raise ValueError("iterable cannot be empty")
>>
>>
> Alternatively, if all you want is for an empty iterable to do nothing,

To do nothing, just use 'pass' in the except clause above. If the function 
does nothing, it returns None. In the fix_title function, it should return 
'', not None.

> you could write it like this:
>
>      items = iter(iterable)
>      for first in items:
>          <process first>
>          break

I could, but I doubt I would ;-). Try...except StopIteration: pass is 
more explicit and less roundabout.

>      for item in items:
>          <process non-first>
>
> However, the issue I have with any of this pulling the first element out of
> the loop is that if you want special processing for the first element you
> are likely to also want it for the last,

Likely? I would say occasionally. Sentences have first words; files have 
headers. Special processing for last items is only an issue if it 
*replaces* the normal processing, rather than following the normal 
processing.

> and if it is a single item you need
> to process that item with both bits of special code. I don't see how that works
> unless you have all elements within the single loop and test for first/last.

Like so, with tests:

def first_last_special(iterable):
     print("\nIterable is",repr(iterable))
     items = iter(iterable)
     try:
         first = next(items)
     except StopIteration:
         print('Nothing')
         return
     print(first, 'is the first item')
     try:
         current = next(items)
     except StopIteration:
         current = first
     else:
         for item in items:
             print(current, 'is a middle item')
             current = item
     print(current, 'is the last item')

first_last_special('')
first_last_special('1')
first_last_special('12')
first_last_special('123')
first_last_special('12345')

-- 
Terry Jan Reedy


