Re: There must be a better way

References (3 earlier) <mailman.869.1366506610.3114.python-list@python.org> <kl0opb$pcr$1@theodyn.ncf.ca> <atl0i2Fto6uU2@mid.individual.net> <kl3stb$5ck$1@theodyn.ncf.ca> <atnh2jFgv8iU1@mid.individual.net>
From Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date 2013-04-23 15:15 +0100
Subject Re: There must be a better way
Newsgroups comp.lang.python
Message-ID <mailman.972.1366726549.3114.python-list@python.org>


On 23 April 2013 14:36, Neil Cerutti <neilc@norwich.edu> wrote:
> On 2013-04-22, Colin J. Williams <cjw@ncf.ca> wrote:
>> Since I'm only interested in one or two columns, the simpler
>> approach is probably better.
>
> Here's a sketch of how one of my projects handles that situation.
> I think the index variables are invaluable documentation, and
> make it a bit more robust. (Python 3, so not every bit is
> relevant to you).
>
> with open("today.csv", encoding='UTF-8', newline='') as today_file:
>     reader = csv.reader(today_file)
>     header = next(reader)

I once had a bug that took a long time to track down and was caused by
using next() without an enclosing try/except StopIteration (or the
optional default argument to next).

This is a sketch of how you can get the bug that I had:

$ cat next.py
#!/usr/bin/env python

def join(iterables):
    '''Join iterable of iterables, stripping first item'''
    for iterable in iterables:
        iterator = iter(iterable)
        header = next(iterator)  # Here's the problem
        for val in iterator:
            yield val

data = [
    ['foo', 1, 2, 3],
    ['bar', 4, 5, 6],
    [], # Whoops! Who put this empty iterable here?
    ['baz', 7, 8, 9],
]

for x in join(data):
    print(x)

$ ./next.py
1
2
3
4
5
6

The values 7, 8 and 9 are not printed but no error message is shown.
This is because calling next on the iterator over the empty list
raises a StopIteration that is not caught in the join generator. The
StopIteration is then "caught" by the for loop that iterates over
join() causing the loop to terminate prematurely. Since the exception
is caught and cleared by the for loop there's no practical way to get
a debugger to hook into the event that causes it.
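
To illustrate the mechanism: a for loop is roughly a while loop that calls
next() and treats any StopIteration as "finished", so it cannot tell a stray
one from a genuine end of iteration. Below is a minimal sketch of that
equivalence. The names manual_for and FirstItems are made up for this
example. Note also that since Python 3.7 (PEP 479) a StopIteration escaping
a generator body is converted to a RuntimeError, so on current versions the
silent truncation shows up in hand-written __next__ methods like the one
here rather than in generators.

```python
def manual_for(iterable, body):
    """Roughly what `for x in iterable: body(x)` does (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        try:
            x = next(iterator)
        except StopIteration:
            # Any StopIteration ends the loop, whatever raised it.
            break
        body(x)


class FirstItems:
    """Iterate over the first item of each iterable (hypothetical example)."""

    def __init__(self, iterables):
        self._iterables = iter(iterables)

    def __iter__(self):
        return self

    def __next__(self):
        # A StopIteration here is the genuine end of the outer iteration.
        iterable = next(self._iterables)
        # A StopIteration here is the stray one: the inner iterable is empty.
        return next(iter(iterable))


seen = []
manual_for(FirstItems([['foo', 1], ['bar', 2], [], ['baz', 3]]), seen.append)
print(seen)  # 'baz' is silently dropped
```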

In my case this happened somewhere in the middle of a long-running
process. It was difficult to pin down the cause, as the iteration was
over non-constant data and I didn't know what I was looking for. As a
result of the time spent fixing this, I'm now always very cautious
about calling next() and think about what a StopIteration would do in
context.

In this case a StopIteration is raised when reading from an empty csv file:

>>> import csv
>>> with open('test.csv', 'w'): pass
...
>>> with open('test.csv') as csvfile:
...     reader = csv.reader(csvfile)
...     header = next(reader)
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
StopIteration

If that code were called from a generator then it would most likely be
susceptible to the problem I'm describing. The fix is to use
next(reader, None) or try/except StopIteration.
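
As a sketch of both fixes, assuming that an empty iterable should simply be
skipped and that an empty csv file should yield no header (either might
instead deserve an explicit error in real code): join() below uses
try/except StopIteration, and read_header(), a hypothetical helper, uses
the default argument to next(). io.StringIO stands in for a real file so
the example is self-contained.

```python
import csv
import io


def join(iterables):
    '''Join iterable of iterables, stripping first item'''
    for iterable in iterables:
        iterator = iter(iterable)
        try:
            next(iterator)  # discard the header
        except StopIteration:
            # Assumption: an empty iterable is skipped rather than an error.
            continue
        for val in iterator:
            yield val


def read_header(csvfile):
    """Return the header row, or None if the file is empty (hypothetical)."""
    reader = csv.reader(csvfile)
    return next(reader, None)


data = [
    ['foo', 1, 2, 3],
    ['bar', 4, 5, 6],
    [],
    ['baz', 7, 8, 9],
]

joined = list(join(data))
print(joined)

empty_header = read_header(io.StringIO(''))
header = read_header(io.StringIO('name,value\r\nfoo,1\r\n'))
print(empty_header, header)
```

With the guard in place the values 7, 8 and 9 are no longer lost, and the
caller of read_header() has to decide explicitly what a None header means.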


Oscar
