| Started by | Fabien <fabien.maussion@gmail.com> |
|---|---|
| First post | 2015-06-12 17:00 +0200 |
| Last post | 2015-06-13 16:16 +0000 |
| Articles | 16 — 10 participants |
zip as iterator and bad/good practices Fabien <fabien.maussion@gmail.com> - 2015-06-12 17:00 +0200
Re: zip as iterator and bad/good practices Fabien <fabien.maussion@gmail.com> - 2015-06-12 17:05 +0200
Re: zip as iterator and bad/good practices Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-12 09:26 -0600
Re: zip as iterator and bad/good practices Fabien <fabien.maussion@gmail.com> - 2015-06-12 17:34 +0200
Re: zip as iterator and bad/good practices Fabien <fabien.maussion@gmail.com> - 2015-06-12 17:59 +0200
Re: zip as iterator and bad/good practices Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-06-12 17:22 +0100
Re: zip as iterator and bad/good practices Laura Creighton <lac@openend.se> - 2015-06-12 22:34 +0200
Re: zip as iterator and bad/good practices Terry Reedy <tjreedy@udel.edu> - 2015-06-12 19:27 -0400
Re: zip as iterator and bad/good practices Terry Reedy <tjreedy@udel.edu> - 2015-06-12 19:43 -0400
Re: zip as iterator and bad/good practices sohcahtoa82@gmail.com - 2015-06-12 17:02 -0700
Re: zip as iterator and bad/good practices Chris Angelico <rosuav@gmail.com> - 2015-06-13 10:26 +1000
Re: zip as iterator and bad/good practices sohcahtoa82@gmail.com - 2015-06-12 17:39 -0700
Re: zip as iterator and bad/good practices jimages <jimages123@gmail.com> - 2015-06-13 13:32 +0800
Re: zip as iterator and bad/good practices Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-06-13 07:17 +0000
Re: zip as iterator and bad/good practices Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-06-13 13:48 +0100
Re: zip as iterator and bad/good practices Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-06-13 16:16 +0000
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2015-06-12 17:00 +0200 |
| Subject | zip as iterator and bad/good practices |
| Message-ID | <mles66$sk2$1@speranza.aioe.org> |
Folks,

I am developing a program which I'd like to be python 2 and 3 compatible. I am still relatively new to python and I use primarily py3 for development. Every once in a while I use a py2 interpreter to see if my tests pass through.

I just spent several hours tracking down a bug which was related to the fact that zip is an iterator in py3 but not in py2. Of course I did not know about that difference. I've found the izip() function which should do what I want, but that awful bug made me wonder: is it a bad practice to interactively modify the list you are iterating over?

I am computing mass fluxes along glacier branches ordered by hydrological order, i.e. branch i is guaranteed to flow into a branch later in that list. Branches are objects which have a pointer to the object they are flowing into.

In pseudo code:

for stuff, branch in zip(stuffs, branches):
    # compute flux
    ...
    # add to the downstream branch
    id_branch = branches.index(branch.flows_to)
    branches[id_branch].property.append(stuff_i_computed)

So, all downstream branches in python2 were missing information from their tributaries. It is quite dangerous code but I can't find a more elegant solution.

Thanks!

Fabien
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2015-06-12 17:05 +0200 |
| Message-ID | <mlesfd$sk2$2@speranza.aioe.org> |
| In reply to | #92530 |
On 06/12/2015 05:00 PM, Fabien wrote:
> I've found the izip() function which should do what I want

I've just come across a stackoverflow post where they recommend:

from future_builtins import zip

which is OK since I don't want to support versions <= 2.6
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2015-06-12 09:26 -0600 |
| Message-ID | <mailman.428.1434122838.13271.python-list@python.org> |
| In reply to | #92530 |
On Fri, Jun 12, 2015 at 9:00 AM, Fabien <fabien.maussion@gmail.com> wrote:
> Folks,
>
> I am developing a program which I'd like to be python 2 and 3 compatible. I
> am still relatively new to python and I use primarily py3 for development.
> Every once in a while I use a py2 interpreter to see if my tests pass
> through.
>
> I just spent several hours tracking down a bug which was related to the fact
> that zip is an iterator in py3 but not in py2. Of course I did not know
> about that difference. I've found the izip() function which should do what I
> want
If you're supporting both 2 and 3, you may want to look into using the
third-party "six" library, which provides utilities for writing
cross-compatible code. Using the correct zip() function with six is
just:
from six.moves import zip
> but that awful bug made me wonder: is it a bad practice to
> interactively modify the list you are iterating over?
Generally speaking, yes, it's bad practice to add or remove items
because this may result in items being visited more than once or not
at all. Modifying or replacing items however is usually not an issue.
> I am computing mass fluxes along glacier branches ordered by hydrological
> order, i.e. branch i is guaranteed to flow in a branch later in that list.
> Branches are objects which have a pointer to the object they are flowing
> into.
>
> In pseudo code:
>
> for stuff, branch in zip(stuffs, branches):
>     # compute flux
>     ...
>     # add to the downstream branch
>     id_branch = branches.index(branch.flows_to)
>     branches[id_branch].property.append(stuff_i_computed)
Er, I don't see the problem here. The branch object in the zip list
and the branch object in branches should be the *same* object, so the
downstream branch update should be reflected when you visit it later
in the iteration, regardless of whether zip returns a list or an iterator.
Tangentially, unless you're using id_branch for something else that
isn't shown here, is it really necessary to search the list for the
downstream branch when it looks like you already have a reference to
it? Could the above simply be replaced with:
branch.flows_to.property.append(stuff_i_computed)
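The direct-reference version suggested above can be sketched as follows. This is only an illustration under assumptions: the `Branch` class, its `inflow` attribute (standing in for `property` in the pseudo code) and the flux numbers are hypothetical and do not come from the original program.

```python
class Branch:
    """Hypothetical stand-in for a glacier branch object."""

    def __init__(self, name):
        self.name = name
        self.flows_to = None   # reference to the downstream Branch, if any
        self.inflow = []       # contributions received from tributaries


# Two tributaries (a, b) flow into c; the list is in hydrological order.
a, b, c = Branch("a"), Branch("b"), Branch("c")
a.flows_to = c
b.flows_to = c
branches = [a, b, c]
own_fluxes = [1.0, 2.0, 0.5]   # made-up numbers

totals = []
for own, branch in zip(own_fluxes, branches):
    total = own + sum(branch.inflow)
    totals.append(total)
    # No index lookup needed: the reference already points at the
    # same object that is stored in `branches`.
    if branch.flows_to is not None:
        branch.flows_to.inflow.append(total)
```

Because only attributes of existing objects are mutated and the list itself is neither resized nor reassigned, this behaves the same whether zip returns a list or an iterator.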
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2015-06-12 17:34 +0200 |
| Message-ID | <mleu5a$1pf$1@speranza.aioe.org> |
| In reply to | #92532 |
On 06/12/2015 05:26 PM, Ian Kelly wrote:
>> for stuff, branch in zip(stuffs, branches):
>>     # compute flux
>>     ...
>>     # add to the downstream branch
>>     id_branch = branches.index(branch.flows_to)
>>     branches[id_branch].property.append(stuff_i_computed)
> Er, I don't see the problem here. The branch object in the zip list
> and the branch object in branches should be the *same* object, so the
> downstream branch update should be reflected when you visit it later
> in the iteration, regardless of whether zip returns a list or an iterator.
>
> Tangentially, unless you're using id_branch for something else that
> isn't shown here, is it really necessary to search the list for the
> downstream branch when it looks like you already have a reference to
> it? Could the above simply be replaced with:
>
> branch.flows_to.property.append(stuff_i_computed)

Thanks a lot for your careful reading! I overly simplified my example and indeed this line works fine. I was adding things to "stuffs" too, which is a list of lists... Sorry for the confusion!
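The real code is not shown in the thread, so the following is only a guess at how such a 2-versus-3 difference can arise: if an entry of `stuffs` is *replaced* during the loop, a lazy zip picks up the new entry while an eager (list-building) zip keeps handing out the old one. The function name `run`, the data and the `eager` flag are made up for illustration; `list(zip(...))` is used to imitate Python 2's zip on Python 3.

```python
def run(eager):
    """Return what the loop body saw for each position."""
    stuffs = [[1], [], []]
    branches = ["a", "b", "c"]

    pairs = zip(stuffs, branches)
    if eager:
        # Imitates Python 2, where zip() builds the whole list up front.
        pairs = list(pairs)

    seen = []
    for i, (stuff, branch) in enumerate(pairs):
        seen.append(list(stuff))
        if i + 1 < len(stuffs):
            # Replace (rather than mutate) the downstream entry.
            stuffs[i + 1] = stuff + [i + 10]
    return seen


lazy_result = run(eager=False)   # Python 3 style zip
eager_result = run(eager=True)   # Python 2 style zip
```

With the lazy zip each step sees the replaced entry; with the eager one the loop keeps seeing the original empty lists, which would look like "missing information from the tributaries".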
| From | Fabien <fabien.maussion@gmail.com> |
|---|---|
| Date | 2015-06-12 17:59 +0200 |
| Message-ID | <mlevkm$4nt$1@speranza.aioe.org> |
| In reply to | #92532 |
On 06/12/2015 05:26 PM, Ian Kelly wrote:
>> but that awful bug made me wonder: is it a bad practice to
>> interactively modify the list you are iterating over?
> Generally speaking, yes, it's bad practice to add or remove items
> because this may result in items being visited more than once or not
> at all. Modifying or replacing items however is usually not an issue.
>
Thanks. In that case I was modifying items and needed them to be updated
during the loop. I kept the solution as is and my tests pass in 2 and 3.
I will consider using six. Currently all my modules begin with:
from __future__ import division
try:
    from itertools import izip as zip
except ImportError:
    pass
Which might even become longer if I find other bugs ;-)
Fabien
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-06-12 17:22 +0100 |
| Message-ID | <mailman.430.1434126164.13271.python-list@python.org> |
| In reply to | #92530 |
On 12/06/2015 16:00, Fabien wrote:
> Folks,
>
> I am developing a program which I'd like to be python 2 and 3
> compatible. I am still relatively new to python and I use primarily py3
> for development. Every once in a while I use a py2 interpreter to see if
> my tests pass through.
>
> I just spent several hours tracking down a bug which was related to the
> fact that zip is an iterator in py3 but not in py2. Of course I did not
> know about that difference. I've found the izip() function which should
> do what I want, but that awful bug made me wonder: is it a bad practice
> to interactively modify the list you are iterating over?
>
> I am computing mass fluxes along glacier branches ordered by
> hydrological order, i.e. branch i is guaranteed to flow in a branch
> later in that list. Branches are objects which have a pointer to the
> object they are flowing into.
>
> In pseudo code:
>
> for stuff, branch in zip(stuffs, branches):
>     # compute flux
>     ...
>     # add to the downstream branch
>     id_branch = branches.index(branch.flows_to)
>     branches[id_branch].property.append(stuff_i_computed)
>
> So, all downstream branches in python2 where missing information from
> their tributaries. It is quite a dangerous code but I can't find a more
> elegant solution.
>
> Thanks!
>
> Fabien
>

Start here https://docs.python.org/3/howto/pyporting.html

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-06-12 22:34 +0200 |
| Message-ID | <mailman.439.1434141281.13271.python-list@python.org> |
| In reply to | #92530 |
The real problem is removing things from lists when you are iterating
over them, not adding things to the end of lists.

Python 2.7.9 (default, Mar 1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> mylist = [1,2,3]
>>> for i in mylist:
...     print i
...     mylist.remove(i)
...
1
3
>>> mylist
[2]

Most people expect 1 2 and 3 to get printed, and mylist to be empty at the
end of this loop.

Laura
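The same effect can be reproduced on Python 3 with a small sketch that records values instead of printing them; the variable names are illustrative. The second loop iterates over a slice copy, which is one common way to get the result most people expect.

```python
# Removing while iterating: after 1 is removed, the remaining items shift
# left, so the iterator's next position (index 1) now holds 3 and 2 is skipped.
mylist = [1, 2, 3]
printed = []
for i in mylist:
    printed.append(i)
    mylist.remove(i)

# Iterating over a copy leaves the iteration unaffected by the removals.
safe = [1, 2, 3]
visited = []
for i in safe[:]:
    visited.append(i)
    safe.remove(i)
```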
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-06-12 19:27 -0400 |
| Message-ID | <mailman.446.1434151715.13271.python-list@python.org> |
| In reply to | #92530 |
On 6/12/2015 11:00 AM, Fabien wrote:
> is it a bad practice
> to interactively modify the list you are iterating over?

One needs care. Appending to the end of the list is OK, unless you append a billion items or so ;-)

Appending to the end of a queue while *removing* items from the front of the queue, where the queue resizes itself at the front as needed, is standard for breadth-first search. A collections.deque can be used for this.

Depth-first search appends to and deletes from the end (or top) of a stack, but this is NOT forward-iteration as implemented by Python iterators.

--
Terry Jan Reedy
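The queue pattern described above might look like the following minimal sketch. The function name `bfs_order` and the example graph are assumptions made for illustration; only `collections.deque` with its `append` and `popleft` methods comes from the standard library.

```python
from collections import deque


def bfs_order(graph, start):
    """Return nodes in breadth-first order from `start`.

    `graph` is assumed to map each node to a list of its neighbours.
    """
    seen = {start}
    queue = deque([start])
    order = []
    while queue:                      # loop on the queue, not with `for`
        node = queue.popleft()        # remove from the front
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)   # add to the end
    return order


example_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
visit_order = bfs_order(example_graph, "a")
```

Note that the loop is a `while queue:` rather than a `for` over the deque: a deque raises RuntimeError if it is mutated while being iterated over.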
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-06-12 19:43 -0400 |
| Message-ID | <mailman.447.1434152635.13271.python-list@python.org> |
| In reply to | #92530 |
On 6/12/2015 4:34 PM, Laura Creighton wrote:
> The real problem is removing things from lists when you are iterating
> over them, not adding things to the end of lists.

One needs to iterate backwards.

>>> ints = [0, 1, 2, 2, 1, 4, 6, 5, 5]
>>> for i in range(len(ints)-1, -1, -1):
        if ints[i] % 2:
            del ints[i]

>>> ints
[0, 2, 2, 4, 6]

But using a list comp and, if necessary, copying the result back into
the original list is much easier.

>>> ints = [0, 1, 2, 2, 1, 4, 6, 5, 5]
>>> ints[:] = [i for i in ints if not i % 2]
>>> ints
[0, 2, 2, 4, 6]

--
Terry Jan Reedy
| From | sohcahtoa82@gmail.com |
|---|---|
| Date | 2015-06-12 17:02 -0700 |
| Message-ID | <e5d750bd-d7e3-4d93-9794-e9f16a4b40bd@googlegroups.com> |
| In reply to | #92567 |
On Friday, June 12, 2015 at 4:44:08 PM UTC-7, Terry Reedy wrote:
> On 6/12/2015 4:34 PM, Laura Creighton wrote:
> > The real problem is removing things from lists when you are iterating
> > over them, not adding things to the end of lists.
>
> One needs to iterate backwards.
>
> >>> ints = [0, 1, 2, 2, 1, 4, 6, 5, 5]
>
> >>> for i in range(len(ints)-1, -1, -1):
>         if ints[i] % 2:
>             del ints[i]
>
> >>> ints
> [0, 2, 2, 4, 6]
>
> But using a list comp and, if necessary, copying the result back into
> the original list is much easier.
>
> >>> ints = [0, 1, 2, 2, 1, 4, 6, 5, 5]
> >>> ints[:] = [i for i in ints if not i % 2]
> >>> ints
> [0, 2, 2, 4, 6]
>
>
> --
> Terry Jan Reedy

On the second line of your final solution, is there any reason you're using `ints[:]` rather than just `ints`?
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-06-13 10:26 +1000 |
| Message-ID | <mailman.450.1434155212.13271.python-list@python.org> |
| In reply to | #92569 |
On Sat, Jun 13, 2015 at 10:02 AM, <sohcahtoa82@gmail.com> wrote:
>> >>> ints = [0, 1, 2, 2, 1, 4, 6, 5, 5]
>> >>> ints[:] = [i for i in ints if not i % 2]
>> >>> ints
>> [0, 2, 2, 4, 6]
>>
>>
>> --
>> Terry Jan Reedy
>
> On the second line of your final solution, is there any reason you're using `ints[:]` rather than just `ints`?
If you use "ints = [...]", it rebinds the name ints to the new list.
If you use "ints[:] = [...]", it replaces the entire contents of the
list with the new list. The two are fairly similar if there are no
other references to that list, but the replacement matches the
mutation behaviour of remove().
def just_some(ints):
    ints[:] = [i for i in ints if not i % 2]
ChrisA
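The difference between rebinding and slice assignment shows up as soon as a second reference to the list exists, for example the caller's variable. In this sketch the function names `rebind_evens` and `mutate_evens` are made up for illustration.

```python
def rebind_evens(ints):
    # Rebinds the local name only; the caller's list is untouched.
    ints = [i for i in ints if not i % 2]
    return ints


def mutate_evens(ints):
    # Replaces the contents of the very list object the caller passed in.
    ints[:] = [i for i in ints if not i % 2]


first = [0, 1, 2, 3, 4]
returned = rebind_evens(first)    # `first` still holds all five items

second = [0, 1, 2, 3, 4]
alias = second                    # another reference to the same list
mutate_evens(second)              # both names now see the filtered list
```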
| From | sohcahtoa82@gmail.com |
|---|---|
| Date | 2015-06-12 17:39 -0700 |
| Message-ID | <df28e29a-cbc9-48b9-bf30-8b1f311848c2@googlegroups.com> |
| In reply to | #92572 |
On Friday, June 12, 2015 at 5:27:21 PM UTC-7, Chris Angelico wrote:
> On Sat, Jun 13, 2015 at 10:02 AM, <sohcahtoa82@gmail.com> wrote:
> >> >>> ints = [0, 1, 2, 2, 1, 4, 6, 5, 5]
> >> >>> ints[:] = [i for i in ints if not i % 2]
> >> >>> ints
> >> [0, 2, 2, 4, 6]
> >>
> >>
> >> --
> >> Terry Jan Reedy
> >
> > On the second line of your final solution, is there any reason you're using `ints[:]` rather than just `ints`?
>
> If you use "ints = [...]", it rebinds the name ints to the new list.
> If you use "ints[:] = [...]", it replaces the entire contents of the
> list with the new list. The two are fairly similar if there are no
> other references to that list, but the replacement matches the
> mutation behaviour of remove().
>
> def just_some(ints):
>     ints[:] = [i for i in ints if not i % 2]
>
> ChrisA

Ah that makes sense. Thanks.
| From | jimages <jimages123@gmail.com> |
|---|---|
| Date | 2015-06-13 13:32 +0800 |
| Message-ID | <mailman.452.1434175078.13271.python-list@python.org> |
| In reply to | #92530 |
> On Jun 12, 2015, at 11:00 PM, Fabien <fabien.maussion@gmail.com> wrote:
> but that awful bug made me wonder: is it a bad practice to interactively modify the list you are iterating over?
Yes.
I am a newbie. I was also confused when I read the tutorial, which recommends making a copy before looping. Then I tried this:
#--------------------------
Test = [1, 2]
for i in Test:
    Test.append(i)
#--------------------------
But when I execute it, the script does not end. I knew something must be wrong, so I launched a debugger and observed the list after each loop.
And I see:
Loop 1: [ 1, 2, 1]
Loop 2: [ 1, 2, 1, 2]
Loop 3: [ 1, 2, 1, 2, 1]
Loop 4: [ 1, 2, 1, 2, 1, 2]
......
So you can see that the loop will *never* end.
So I think you can regard 'i' as a pointer. After each pass through the loop the pointer moves to the next element, but at the same time you are appending an element, so the pointer will *never* reach the last element.
How to solve it?
Change the code to this:
#--------------------------
Test = [1, 2]
for i in Test[:]:
    Test.append(i)
#--------------------------
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-06-13 07:17 +0000 |
| Message-ID | <557bd903$0$11125$c3e8da3@news.astraweb.com> |
| In reply to | #92582 |
On Sat, 13 Jun 2015 13:32:59 +0800, jimages wrote:
> I am a newbie. I also have been confused when I read the tutorial. It
> recommends make a copy before looping. Then I try.
> #--------------------------
> Test = [1, 2]
> for i in Test:
>     Test.append(i)
> #--------------------------
You don't make a copy of Test here. You could try this instead:
Test = [1, 2]
copy_test = Test[:] # [:] makes a slice copy of the whole list
for i in copy_test: # iterate over the copy
    Test.append(i) # and append to the original
print(Test)
But an easier way is:
Test = [1, 2]
Test.extend(Test)
print(Test)
> But when i execute. The script does not end. I know there must something
> wrong. So I launch debugger and deserve the list after each loop. And I
> see:
> Loop 1: [ 1, 2, 1]
> Loop 2: [ 1, 2, 1, 2]
> Loop 3: [ 1, 2, 1, 2, 1]
> Loop 4: [ 1, 2, 1, 2, 1, 2]
> ......
> So you can see that loop will *never* end. So I think you regard the 'i'
> as a pointer.
i is not a pointer. It is just a variable that gets a value from the
list, the same as:
# first time through the loop
i = Test[0]
# second time through the loop
i = Test[1] # the second item
The for loop statement:
for item in seq: ...
understands sequences, lists, and other iterables, not "item". item is
just an ordinary variable, nothing special about it. The for statement
takes the items in seq, one at a time, and assigns them to the variable
"item". In English:
for each item in seq ...
or to put it another way:
get the first item of seq
assign it to "item"
process the block
get the second item of seq
assign it to "item"
process the block
get the third item of seq
assign it to "item"
process the block
...
and so on, until seq runs out of items. But if you keep appending items
to the end, it will never run out.
> Change code to this
> #--------------------------
> Test = [1, 2]
> for i in Test[:]:
>     Test.append(i)
> #--------------------------
Yes, this will work.
--
Steve
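The step-by-step description above corresponds roughly to the following hand-written loop. It is a sketch of the iteration protocol rather than the interpreter's actual implementation; the helper name `for_each` and the length cap of 6 are assumptions added so that the example terminates.

```python
def for_each(seq, body):
    """Roughly what `for item in seq: body(item)` does."""
    it = iter(seq)               # ask the sequence for an iterator once
    while True:
        try:
            item = next(it)      # get the next item, assign it to `item`
        except StopIteration:    # the sequence ran out of items
            break
        body(item)               # process the block


Test = [1, 2]


def append_capped(item):
    # Appending during iteration, but capped so the loop can finish.
    if len(Test) < 6:
        Test.append(item)


for_each(Test, append_capped)
```

The list iterator looks up the next position each time it is asked, so items appended during the loop are visited too; without the cap the loop would never reach the end.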
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2015-06-13 13:48 +0100 |
| Message-ID | <mailman.455.1434199754.13271.python-list@python.org> |
| In reply to | #92583 |
On 13 June 2015 at 08:17, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Sat, 13 Jun 2015 13:32:59 +0800, jimages wrote:
>
>> I am a newbie. I also have been confused when I read the tutorial. It
>> recommends make a copy before looping. Then I try.
>> #--------------------------
>> Test = [1, 2]
>> for i in Test:
>>     Test.append(i)
>> #--------------------------
>
> You don't make a copy of Test here. You could try this instead:
>
> Test = [1, 2]
> copy_test = Test[:] # [:] makes a slice copy of the whole list
> for i in copy_test: # iterate over the copy
>     Test.append(i) # and append to the original
>
> print(Test)
>
>
> But an easier way is:
>
> Test = [1, 2]
> Test.extend(Test)
> print(Test)
I can't see anything in the docs that specify the behaviour that
occurs here. If I change it to
Test.extend(iter(Test))
then it borks my system in 1s after consuming 8GB of RAM (I recovered
with killall python in the tty).
According to the docs:
"""
list.extend(L)
Extend the list by appending all the items in the given list;
equivalent to a[len(a):] = L.
"""
https://docs.python.org/2/tutorial/datastructures.html#more-on-lists
The alternate form
Test[len(Test):] = Test
is equivalent but
Test[len(Test):] = iter(Test)
is not since it doesn't bork my system.
I looked here:
https://docs.python.org/2/library/stdtypes.html#mutable-sequence-types
but I don't see anything that specifies how self-referential slice
assignment should behave.
I checked under pypy and all behaviour is the same but I'm not sure if
this shouldn't be considered implementation-defined or undefined
behaviour. It's not hard to see how a rearrangement of the list.extend
method would lead to a change of behaviour and I can't see that the
current behaviour is really guaranteed by the language and in fact
it's inconsistent with the docs for list.extend.
As an aside they say that pypy is fast but it took about 10 times
longer than cpython to bork my system. :)
--
Oscar
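For reference, the three forms reported above as terminating can be put side by side. The results below are what CPython produces; as the post points out, the documentation does not clearly guarantee them, so treat this as observed behaviour rather than a language promise. The variable names are illustrative.

```python
# extend() with the list itself
a = [1, 2]
a.extend(a)

# slice assignment at the end, with the list itself
b = [1, 2]
b[len(b):] = b

# slice assignment at the end, with an iterator over the list;
# CPython consumes the iterator fully before it changes the list
c = [1, 2]
c[len(c):] = iter(c)

# Deliberately not run here: a.extend(iter(a)) keeps growing the list
# until memory runs out, as described in the post.
```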
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-06-13 16:16 +0000 |
| Message-ID | <557c575d$0$11128$c3e8da3@news.astraweb.com> |
| In reply to | #92587 |
On Sat, 13 Jun 2015 13:48:45 +0100, Oscar Benjamin wrote:
> On 13 June 2015 at 08:17, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> But an easier way is:
>>
>> Test = [1, 2]
>> Test.extend(Test)
>> print(Test)
>
> I can't see anything in the docs that specify the behaviour that occurs
> here.
Neither do I, but there is a test for it:
a.extend(a)
self.assertEqual(a, self.type2test([0, 0, 1, 0, 0, 1]))
https://hg.python.org/cpython/file/a985b6455fde/Lib/test/list_tests.py
> If I change it to
>
> Test.extend(iter(Test))
>
> then it borks my system in 1s after consuming 8GB of RAM (I recovered
> with killall python in the tty).
The reason that fails should be obvious: as new items keep getting added
to Test, the iterator likewise sees more items to iterate over. I don't
know if this is documented, but you can see what happens here:
py> L = [10, 20]
py> it = iter(L)
py> L.append(next(it)); print L
[10, 20, 10]
py> L.append(next(it)); print L
[10, 20, 10, 20]
py> L.append(next(it)); print L
[10, 20, 10, 20, 10]
py> L.append(next(it)); print L
[10, 20, 10, 20, 10, 20]
So as Test.extend tries to iterate over iter(Test), it just keeps growing
as more items are added to Test.
--
Steven D'Aprano
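If an iterator over the list has to be used, the growth described above can be avoided either by bounding the iterator or by taking a snapshot first. A minimal sketch, with illustrative variable names; `itertools.islice` is a standard library function.

```python
from itertools import islice

# Bound the iteration to the original length, so the items appended
# by extend() itself are never reached.
bounded = [10, 20]
bounded.extend(islice(bounded, len(bounded)))

# Or iterate over a snapshot that cannot grow.
snapshot = [10, 20]
snapshot.extend(list(snapshot))
```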