Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #27194

set and dict iteration

From Aaron Brady <castironpi@gmail.com>
Newsgroups comp.lang.python
Subject set and dict iteration
Date 2012-08-16 11:00 -0700
Organization http://groups.google.com
Message-ID <b8dd3aca-2a87-4124-ad6e-66a8720af99a@googlegroups.com> (permalink)

Show all headers | View raw


Hello,

I observed an inconsistency in the behavior of 'set' and 'dict' iterators.  It is "by design" according to the docs.

'''
http://docs.python.org/dev/library/stdtypes.html#dict-views

iter(dictview).  Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
'''

The 'set' has the same behavior.  Iteration might also complete successfully.

The inconsistency is, if we remove an element from a set and add another during iteration, the new element might appear later in the iteration, and might not, depending on the hash code; therefore comparing the size of the set between iterations isn't adequate.  Example:

http://home.comcast.net/~castironpi-misc/clpy-0062%20set%20iterators.py  

'''
# py: { 'ver': '3' }

set0= set( ( 1, 2 ) )

iter0= iter( set0 )
print( next( iter0 ) )

set0.add( 3 )
set0.remove( 2 )
print( next( iter0 ) )


print( )

set0= set( ( 6, 7 ) )

iter0= iter( set0 )
print( next( iter0 ) )

set0.add( 8 )
set0.remove( 7 )
print( next( iter0 ) )
'''

Output:

'''
1
3

6
Traceback (most recent call last):
  File [...] line 22, in <module>
    print( next( iter0 ) )
StopIteration
'''

Iteration should behave the same regardless of the contents of the set.  Continuing iteration over sets and dicts after a modification isn't defined; it should unconditionally raise an error.

What's going on, is '8' is added before the position of the iterator due to hashing in the second part, but the size doesn't change, so the iterator reaches the end of the set after '7' is removed.

The inconsistency isn't easily solved.  One possibility is to use a timestamp or other serial index in the object and iterators, and compare them on every iteration to determine if a modification has occurred.

Another possibility which the author prefers, is to maintain a secondary collection of the iterators of an object, and invalidate them upon modification.  The applicable collection structure is a doubly-linked linked list, informally depicted:

http://home.comcast.net/~castironpi-misc/clpy-0062%20set%20iterators.png

Upon modification, the set traverses its iterators, setting an 'invalid' flag on each; and subsequent calls to any of them raise an 'IterationError'.  Adding and removing iterators to and from the secondary list is performed in O( 1 ) time with no penalty.

The above example depicted a 'Set'.  'Dicts' have the same anomaly, but the solution is ambiguous, since dict values can be changed meaningfully without altering the structure of the object.  In the author's opinion, the dict should not raise an 'IterationError' on value changes, only key changes like the set, but the argument isn't conclusive.

Back to comp.lang.python | Previous | NextNext in thread | Find similar | Unroll thread


Thread

set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-16 11:00 -0700
  Re: set and dict iteration Dave Angel <d@davea.name> - 2012-08-16 15:49 -0400
    Re: set and dict iteration Paul Rubin <no.email@nospam.invalid> - 2012-08-16 14:26 -0700
      Re: set and dict iteration Dave Angel <davea@dejaviewphoto.com> - 2012-08-16 19:11 -0400
        Re: set and dict iteration Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-17 02:24 +0000
          Re: set and dict iteration Paul Rubin <no.email@nospam.invalid> - 2012-08-16 19:30 -0700
            Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-17 11:11 -0700
          Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-17 11:37 -0700
            Re: set and dict iteration Chris Angelico <rosuav@gmail.com> - 2012-08-18 07:57 +1000
              Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-18 13:29 -0700
              Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-18 13:29 -0700
                Re: set and dict iteration MRAB <python@mrabarnett.plus.com> - 2012-08-18 23:14 +0100
                Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-18 19:28 -0700
                Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-23 09:49 -0700
                Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-23 09:49 -0700
                Re: set and dict iteration Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-08-23 18:11 +0000
                Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-18 19:28 -0700
      Re: set and dict iteration Ian Kelly <ian.g.kelly@gmail.com> - 2012-08-16 17:43 -0600
        Re: set and dict iteration Paul Rubin <no.email@nospam.invalid> - 2012-08-16 18:01 -0700
          Re: set and dict iteration Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2012-08-17 13:16 +0200
          Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-17 11:03 -0700
  Re: set and dict iteration Ian Kelly <ian.g.kelly@gmail.com> - 2012-08-16 16:55 -0600
  Re: set and dict iteration Ian Kelly <ian.g.kelly@gmail.com> - 2012-08-16 17:07 -0600
    Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-17 10:47 -0700
    Re: set and dict iteration Aaron Brady <castironpi@gmail.com> - 2012-08-17 10:47 -0700

csiph-web