


For-each behavior while modifying a collection

Started by: Valentin Zahnd <v.zahnd@gmail.com>
First post: 2013-11-28 16:49 +0100
Last post: 2013-11-29 11:37 -0500
Articles: 3, 3 participants



Contents

  For-each behavior while modifying a collection Valentin Zahnd <v.zahnd@gmail.com> - 2013-11-28 16:49 +0100
    Re: For-each behavior while modifying a collection Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-29 03:22 +0000
      Re: For-each behavior while modifying a collection Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-11-29 11:37 -0500

#60697 — For-each behavior while modifying a collection

From: Valentin Zahnd <v.zahnd@gmail.com>
Date: 2013-11-28 16:49 +0100
Subject: For-each behavior while modifying a collection
Message-ID: <mailman.3360.1385653859.18130.python-list@python.org>

Hello

For-each does not iterate over all entries of a collection if one
removes elements during the iteration.

Example code:

def keepByValue(self, key=None, value=[]):
    for row in self.flows:
        if not row[key] in value:
            self.flows.remove(row)

It is clear why it behaves that way. Every time one removes an
element, the length of the collection decreases by one while the
counter of the for-each statement does not.

The questions are:
1. Why does the interpreter not use a copy of the collection to
iterate over it? Are there performance reasons?
2. Why is the counter for the iteration not modified?

Valentin



#60755

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-11-29 03:22 +0000
Message-ID: <52980885$0$29993$c3e8da3$5496439d@news.astraweb.com>
In reply to: #60697

On Thu, 28 Nov 2013 16:49:21 +0100, Valentin Zahnd wrote:

> It is clear why it behaves that way. Every time one removes an
> element, the length of the collection decreases by one while the counter
> of the for-each statement does not. The questions are:
> 1. Why does the interpreter not use a copy of the collection to iterate
> over it? Are there performance reasons?

Of course. Taking a copy of the loop sequence takes time, possibly a 
*lot* of time depending on the size of the list, and that is a total 
waste of both time and memory if you don't modify the loop sequence. And 
Python cannot determine whether or not you modify the sequence. Consider 
this:

data = some_list_of_something
for item in data:
    func(item)


Does func modify the global variable data? How can you tell? Without 
whole-of-program semantic analysis, you cannot tell whether data is 
modified or not. Consider this one:

def func(obj):
    stuff = globals()['DA'.lower() + 'ta']
    eval("stuff.remove(obj)")


Do you expect Python to analyse the code in sufficient detail to realise 
that in this case, it needs to make a copy of the loop sequence? I don't. 
It is much better to have the basic principle that Python will not make a 
copy of anything unless you ask it to. You, the programmer, are in the 
best position to realise whether you are modifying the loop sequence and 
can decide whether to make a shallow copy or a deep copy.

It is a basic principle in programming that you shouldn't modify objects 
that you are traversing over unless you are very, very careful. Given 
that, Python does the right thing here.
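
As a minimal sketch of what "make a copy yourself" can look like: the Flows class, the method names and the sample rows below are hypothetical, and the code assumes self.flows is a plain list of dicts, which the original post does not confirm.

```python
class Flows:
    """Hypothetical container; assumes flows is a plain list of dicts."""

    def __init__(self, flows):
        self.flows = list(flows)

    def keep_by_value_copy(self, key, values):
        # Iterate over a shallow copy so removals do not disturb the loop.
        for row in self.flows[:]:
            if row[key] not in values:
                self.flows.remove(row)

    def keep_by_value_rebuild(self, key, values):
        # Build the filtered list in one pass; slice assignment keeps the
        # identity of the existing list object.
        self.flows[:] = [row for row in self.flows if row[key] in values]


rows = [{"proto": "tcp"}, {"proto": "udp"}, {"proto": "udp"}, {"proto": "icmp"}]

a = Flows(rows)
a.keep_by_value_copy("proto", ["tcp"])

b = Flows(rows)
b.keep_by_value_rebuild("proto", ["tcp"])

print(a.flows)
print(b.flows)
```

Both variants leave only the "tcp" row. The rebuild variant avoids the repeated remove() calls, each of which has to search the list, so it is usually the better choice for long lists.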


> 2. Why is the counter for the iteration not modified?

What counter? There is no counter. You are iterating over an iterator, 
not running a C or Pascal "for i := 1 to 20" style loop.
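
To make that concrete, here is a rough sketch of what a for loop does, spelled out with iter() and next(). The function name and sample data are made up for illustration. In CPython the list iterator keeps track of its own position internally, so the for statement itself has nothing it could adjust when the list shrinks.

```python
def remove_odds_in_place(items):
    """Roughly what `for x in items: ...` does, written out by hand."""
    visited = []
    it = iter(items)          # the for statement asks the object for an iterator
    while True:
        try:
            x = next(it)      # the iterator alone decides what comes next
        except StopIteration:
            break
        visited.append(x)
        if x % 2:
            items.remove(x)   # mutating the list the iterator is walking
    return visited


data = [1, 3, 5, 7, 2]
visited = remove_odds_in_place(data)
print(visited)  # [1, 5, 2]
print(data)     # [3, 7, 2]
```

The loop never sees 3 or 7: each removal shifts the remaining items one place to the left, and the iterator then moves on past the item that slid into the vacated slot.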

Even if there was a counter, how should it be modified? The code you show 
was this:

def keepByValue(self, key=None, value=[]):
    for row in self.flows:
        if not row[key] in value:
            self.flows.remove(row)


What exactly does the remove() method do? How do you know?

self.flows could be *any object at all*, it won't be known until 
runtime. The remove method could do *anything*, that won't be known until 
runtime either. Just because you, the programmer, expect that self.flows 
will be a list, and that remove() will remove at most one item, doesn't 
mean that Python can possibly know that. Perhaps self.flows returns a 
subclass of list, and remove() will remove all of the matching items, not 
just one. Perhaps it is some other object, and rather than removing 
anything, in fact it actually inserts extra items in the middle of the 
sequence. (There is no law that says that methods must do what they say 
they do.)

You are expecting Python to know more about your program than you do. 
That is not the case.
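
A small sketch of the list-subclass case, to show how little the name remove() guarantees. GreedyList is a hypothetical class invented for this example, not something from the original code.

```python
class GreedyList(list):
    """Hypothetical list subclass whose remove() drops every match."""

    def remove(self, value):
        if value not in self:
            raise ValueError("value not in list")
        # Replace the contents in place with everything that does not match.
        self[:] = [item for item in self if item != value]


plain = [1, 2, 1, 3]
greedy = GreedyList([1, 2, 1, 3])

plain.remove(1)    # built-in behaviour: only the first match goes
greedy.remove(1)   # overridden behaviour: all matches go

print(plain)   # [2, 1, 3]
print(greedy)  # [2, 3]
```

The same call shrinks one sequence by one item and the other by two, so no fixed adjustment to a loop position could be correct for both.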


-- 
Steven



#60771

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2013-11-29 11:37 -0500
Message-ID: <mailman.3401.1385743058.18130.python-list@python.org>
In reply to: #60755

On 29 Nov 2013 03:22:45 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> declaimed the following:


>Even if there was a counter, how should it be modified? The code you show 
>was this:
>
>def keepByValue(self, key=None, value=[]):
>    for row in self.flows:
>        if not row[key] in value:
>            self.flows.remove(row)
>
>
>What exactly does the remove() method do? How do you know?
>
>self.flows could be *any object at all*, it won't be known until 
>runtime. The remove method could do *anything*, that won't be known until 
>runtime either. Just because you, the programmer, expect that self.flows 
>will be a list, and that remove() will remove at most one item, doesn't 
>mean that Python can possibly know that. Perhaps self.flows returns a 
>subclass of list, and remove() will remove all of the matching items, not 
>just one. Perhaps it is some other object, and rather than removing 
>anything, in fact it actually inserts extra items in the middle of the 
>sequence. (There is no law that says that methods must do what they say 
>they do.)
>

	Let's really confuse matters...

	Say "self.flows" is something derived from a database cursor/result
set...

	Then "self.flows.remove(row)" could: 1) remove the row from the
cursor/result set [iterating over a cursor tends, in my experience, to use
up the items]; 2) execute a query to remove the matching row from the
database itself; 3) do both...
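
A short sketch of both effects using the standard sqlite3 module with an in-memory database. The table name and rows are invented for illustration, and other database drivers may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (id INTEGER PRIMARY KEY, proto TEXT)")
conn.executemany(
    "INSERT INTO flows (proto) VALUES (?)",
    [("tcp",), ("udp",), ("icmp",)],
)

cursor = conn.execute("SELECT proto FROM flows ORDER BY id")
first_pass = [row[0] for row in cursor]    # iterating consumes the result set
second_pass = [row[0] for row in cursor]   # nothing left to yield

# A hypothetical remove() could instead (or also) delete from the table itself.
conn.execute("DELETE FROM flows WHERE proto = ?", ("udp",))
remaining = [
    row[0] for row in conn.execute("SELECT proto FROM flows ORDER BY id")
]
conn.close()

print(first_pass)   # ['tcp', 'udp', 'icmp']
print(second_pass)  # []
print(remaining)    # ['tcp', 'icmp']
```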
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/




