Groups > comp.lang.python > #19676 > unrolled thread
| Started by | Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> |
|---|---|
| First post | 2012-02-01 01:39 +0000 |
| Last post | 2012-02-01 12:28 +0100 |
| Articles | 14 — 8 participants |
Iterate from 2nd element of a huge list Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> - 2012-02-01 01:39 +0000
Re: Iterate from 2nd element of a huge list Cameron Simpson <cs@zip.com.au> - 2012-02-01 13:02 +1100
Re: Iterate from 2nd element of a huge list duncan smith <buzzard@urubu.freeserve.co.uk> - 2012-02-01 02:31 +0000
Re: Iterate from 2nd element of a huge list Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-02-01 02:43 +0000
Re: Iterate from 2nd element of a huge list Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> - 2012-02-01 03:16 +0000
Re: Iterate from 2nd element of a huge list Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> - 2012-02-01 03:34 +0000
Re: Iterate from 2nd element of a huge list Cameron Simpson <cs@zip.com.au> - 2012-02-01 15:55 +1100
Re: Iterate from 2nd element of a huge list Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> - 2012-02-02 07:23 +0000
Re: Iterate from 2nd element of a huge list Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-02-02 07:38 +0000
Re: Iterate from 2nd element of a huge list Arnaud Delobelle <arnodel@gmail.com> - 2012-02-01 07:09 +0000
Re: Iterate from 2nd element of a huge list Peter Otten <__peter__@web.de> - 2012-02-01 09:11 +0100
Re: Iterate from 2nd element of a huge list Arnaud Delobelle <arnodel@gmail.com> - 2012-02-01 10:54 +0000
Re: Iterate from 2nd element of a huge list Paul Rubin <no.email@nospam.invalid> - 2012-02-01 01:25 -0800
Re: Iterate from 2nd element of a huge list Stefan Behnel <stefan_ml@behnel.de> - 2012-02-01 12:28 +0100
| From | Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> |
|---|---|
| Date | 2012-02-01 01:39 +0000 |
| Subject | Iterate from 2nd element of a huge list |
| Message-ID | <jga54q$vqg$1@speranza.aioe.org> |
Hi!

What is the best way to iterate thru a huge list having the 1st element
a different process? I.e.:

process1(mylist[0])
for el in mylist[1:]:
    process2(el)

This way mylist is almost duplicated, isn't it?

Thanks.
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2012-02-01 13:02 +1100 |
| Message-ID | <mailman.5280.1328061789.27778.python-list@python.org> |
| In reply to | #19676 |
On 01Feb2012 01:39, Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> wrote:
| What is the best way to iterate thru a huge list having the 1st element
| a different process? I.e.:
|
| process1(mylist[0])
| for el in mylist[1:]:
|     process2(el)
|
| This way mylist is almost duplicated, isn't it?

Yep. What about (untested):

process1(mylist[0])
for i in xrange(1,len(mylist)):
    process2(mylist[i])

Cheers,
--
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

The truth is, I will not give myself the trouble to write sense long, for I
would as soon please fools as wise men; because fools are more numerous, and
every prudent man will go with the majority. - Hugh Henry Brackenridge
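For readers on Python 3, a minimal sketch of the same index-based idea, where the lazy `range` stands in for `xrange`. The bodies of `process1` and `process2` and the name `handle_by_index` are placeholders assumed for illustration; the thread never defines them.

```python
# Placeholder handlers (assumed); the thread never shows process1/process2.
def process1(item):
    return ("first", item)


def process2(item):
    return ("rest", item)


def handle_by_index(mylist):
    """Treat element 0 specially and walk the rest by index, without slicing."""
    if not mylist:
        return
    yield process1(mylist[0])
    # range() is lazy in Python 3, so no list of indices is built.
    for i in range(1, len(mylist)):
        yield process2(mylist[i])


results = list(handle_by_index([10, 20, 30]))
```

Indexing works for lists and other sequences, but not for arbitrary iterables such as generators.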
| From | duncan smith <buzzard@urubu.freeserve.co.uk> |
|---|---|
| Date | 2012-02-01 02:31 +0000 |
| Message-ID | <kr1Wq.10224$kH7.1639@newsfe12.ams2> |
| In reply to | #19676 |
On 01/02/12 01:39, Paulo da Silva wrote:
> Hi!
>
> What is the best way to iterate thru a huge list having the 1st element
> a different process? I.e.:
>
> process1(mylist[0])
> for el in mylist[1:]:
>     process2(el)
>
> This way mylist is almost duplicated, isn't it?
>
> Thanks.
Maybe (untested),
it = iter(mylist)
process1(it.next())
for el in it:
    process2(el)
Duncan
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-02-01 02:43 +0000 |
| Message-ID | <4f28a6da$0$29895$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #19676 |
On Wed, 01 Feb 2012 01:39:38 +0000, Paulo da Silva wrote:
> Hi!
>
> What is the best way to iterate thru a huge list having the 1st element
> a different process? I.e.:
>
> process1(mylist[0])
> for el in mylist[1:]:
>     process2(el)
>
> This way mylist is almost duplicated, isn't it?
Yes. But don't be too concerned: what you consider a huge list and what
Python considers a huge list are unlikely to be the same. In my
experience, many people consider 10,000 items to be huge, but that's only
about 45K of memory. Copying it will be fast. On my laptop:
steve@runes:~$ python -m timeit -s "L = range(10000)" "L2 = L[1:]"
10000 loops, best of 3: 57.1 usec per loop
But if you have tens of millions of items, or think that you might
someday have to deal with tens of millions of items, here's an easy
technique to use:
it = iter(mylist)
process1(next(it)) # In Python 2.5, use it.next() instead.
for el in it:
    process2(el)
No copying is performed.
For tiny lists, the iterator overhead will mean this is a smidgen slower,
but for tiny lists, who cares if it takes 2 nanoseconds instead of 1?
--
Steven
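A minimal sketch of the `iter()` + `next()` technique that also copes with an empty list, which the snippets in this thread do not address. The `process1`/`process2` bodies, the `handle_with_iterator` name, and the sentinel default are assumptions added for illustration.

```python
# Placeholder handlers (assumed); the thread never shows process1/process2.
def process1(item):
    return ("first", item)


def process2(item):
    return ("rest", item)


def handle_with_iterator(mylist):
    """Consume the first element with next(), then loop over the remainder."""
    it = iter(mylist)
    missing = object()
    # Passing a default keeps next() from raising StopIteration on empty input.
    first = next(it, missing)
    if first is missing:
        return
    yield process1(first)
    # The for loop resumes where next() left off; no copy of the list is made.
    for el in it:
        yield process2(el)


results = list(handle_with_iterator(["a", "b", "c"]))
```

Because it relies only on the iterator protocol, the same function should also accept generators and other one-shot iterables.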
| From | Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> |
|---|---|
| Date | 2012-02-01 03:16 +0000 |
| Message-ID | <jgaaqr$abr$1@speranza.aioe.org> |
| In reply to | #19676 |
On 01-02-2012 01:39, Paulo da Silva wrote:
> Hi!
>
> What is the best way to iterate thru a huge list having the 1st element
> a different process? I.e.:
>
> process1(mylist[0])
> for el in mylist[1:]:
>     process2(el)
>
> This way mylist is almost duplicated, isn't it?
>
> Thanks.

I think iter is nice for what I need.
Thank you very much to all who responded.
| From | Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> |
|---|---|
| Date | 2012-02-01 03:34 +0000 |
| Message-ID | <jgabs9$btl$1@speranza.aioe.org> |
| In reply to | #19682 |
On 01-02-2012 03:16, Paulo da Silva wrote:
> On 01-02-2012 01:39, Paulo da Silva wrote:
>> Hi!
>>
>> What is the best way to iterate thru a huge list having the 1st element
>> a different process? I.e.:
>>
>> process1(mylist[0])
>> for el in mylist[1:]:
>>     process2(el)
>>
>> This way mylist is almost duplicated, isn't it?
>>
>> Thanks.
>
>
> I think iter is nice for what I need.
> Thank you very much to all who responded.

BTW, iter seems faster than iterating thru mylist[1:]!
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2012-02-01 15:55 +1100 |
| Message-ID | <mailman.5284.1328072129.27778.python-list@python.org> |
| In reply to | #19683 |
On 01Feb2012 03:34, Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> wrote:
| On 01-02-2012 03:16, Paulo da Silva wrote:
| > I think iter is nice for what I need.
| > Thank you very much to all who responded.
|
| BTW, iter seems faster than iterating thru mylist[1:]!

I would hope the difference can be attributed to the cost of copying
mylist[1:]. Do your timings suggest this? (Remembering also that for
most benchmarking you need to run things many times unless the effect
is quite large).

Cheers,
--
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Any company large enough to have a research lab
is large enough not to listen to it. - Alan Kay
| From | Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> |
|---|---|
| Date | 2012-02-02 07:23 +0000 |
| Message-ID | <jgddkn$58n$1@speranza.aioe.org> |
| In reply to | #19687 |
On 01-02-2012 04:55, Cameron Simpson wrote:
> On 01Feb2012 03:34, Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> wrote:

> | BTW, iter seems faster than iterating thru mylist[1:]!
>
> I would hope the difference can be attributed to the cost of copying
> mylist[1:].

I don't think so. I tried several times and the differences were almost
always consistent.

I put mylist1=mylist[1:] outside the time control. iter still seems a
little bit faster. Running both programs several times (10000000
elements list) I only got iter being slower once!

But, of course, most of the difference comes from the copy.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-02-02 07:38 +0000 |
| Message-ID | <4f2a3d8d$0$29895$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #19774 |
On Thu, 02 Feb 2012 07:23:04 +0000, Paulo da Silva wrote:

> On 01-02-2012 04:55, Cameron Simpson wrote:
>> On 01Feb2012 03:34, Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt>
>> wrote:
>
>> | BTW, iter seems faster than iterating thru mylist[1:]!
>>
>> I would hope the difference can be attributed to the cost of copying
>> mylist[1:].

> I don't think so. I tried several times and the differences were almost
> always consistent.

Yes, actually iterating over a list-iterator appears to be trivially
faster (although this may not apply to arbitrary iterators):

steve@runes:~$ python -m timeit -s "L=range(10000)" "for x in L: pass"
1000 loops, best of 3: 280 usec per loop
steve@runes:~$ python -m timeit -s "L=range(10000)" "for x in iter(L): pass"
1000 loops, best of 3: 274 usec per loop

The difference of 6 microseconds would be lost in the noise if the loops
actually did something useful.

Also keep in mind that for tiny lists, the overhead of creating the
iterator is probably much greater than the time of iterating over the
list:

steve@runes:~$ python -m timeit -s "L=range(3)" "for x in L: pass"
1000000 loops, best of 3: 0.238 usec per loop
steve@runes:~$ python -m timeit -s "L=range(3)" "for x in iter(L): pass"
1000000 loops, best of 3: 0.393 usec per loop

But of course the difference is only relatively significant, in absolute
terms nobody is going to notice an extra 0.1 or 0.2 microseconds.

--
Steven
| From | Arnaud Delobelle <arnodel@gmail.com> |
|---|---|
| Date | 2012-02-01 07:09 +0000 |
| Message-ID | <mailman.5288.1328080180.27778.python-list@python.org> |
| In reply to | #19682 |
On 1 February 2012 03:16, Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> wrote:
> On 01-02-2012 01:39, Paulo da Silva wrote:
>> Hi!
>>
>> What is the best way to iterate thru a huge list having the 1st element
>> a different process? I.e.:
>>
>> process1(mylist[0])
>> for el in mylist[1:]:
>>     process2(el)
>>
>> This way mylist is almost duplicated, isn't it?
>>
>> Thanks.
>
>
> I think iter is nice for what I need.
> Thank you very much to all who responded.
Nobody mentioned itertools.islice, which can be handy, especially if
you weren't interested in the first element of the list:
from itertools import islice:
for el in islice(mylist, 1):
    process2(el)
--
Arnaud
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2012-02-01 09:11 +0100 |
| Message-ID | <mailman.5290.1328083912.27778.python-list@python.org> |
| In reply to | #19682 |
Arnaud Delobelle wrote:

>> On 01-02-2012 01:39, Paulo da Silva wrote:

>>> What is the best way to iterate thru a huge list having the 1st element
>>> a different process? I.e.:

> Nobody mentioned itertools.islice, which can be handy, especially if
> you weren't interested in the first element of the list:

Also, skipping two or seven or ... items is just as easy.
The example should be

> from itertools import islice:

for el in islice(mylist, 1, None):

>     process2(el)
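To make the correction concrete, here is a small sketch of how the two call forms of `itertools.islice` differ. The sample list and variable names are made up for illustration.

```python
from itertools import islice

mylist = ["a", "b", "c", "d"]  # stand-in for the huge list

# With a single bound, islice treats it as "stop": only the first item comes out.
only_first = list(islice(mylist, 1))

# With two bounds they mean (start, stop); stop=None runs to the end,
# so this skips the first item and yields everything after it.
skip_one = list(islice(mylist, 1, None))

# Skipping more items only changes the start value.
skip_two = list(islice(mylist, 2, None))

print(only_first, skip_one, skip_two)
```

The `list()` calls are there only to display the results; in a real loop over a huge list you would iterate over the `islice` object directly so nothing is copied.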
| From | Arnaud Delobelle <arnodel@gmail.com> |
|---|---|
| Date | 2012-02-01 10:54 +0000 |
| Message-ID | <mailman.5297.1328093699.27778.python-list@python.org> |
| In reply to | #19682 |
On 1 February 2012 08:11, Peter Otten <__peter__@web.de> wrote:
> Arnaud Delobelle wrote:
> The example should be
>
>> from itertools import islice:
>
> for el in islice(mylist, 1, None):
>>     process2(el)

Oops!

--
Arnaud
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2012-02-01 01:25 -0800 |
| Message-ID | <7xy5sm6h3i.fsf@ruckus.brouhaha.com> |
| In reply to | #19676 |
Paulo da Silva <p_s_d_a_s_i_l_v_a@netcabo.pt> writes:
> process1(mylist[0])
> for el in mylist[1:]:
>     process2(el)
>
> This way mylist is almost duplicated, isn't it?
I think it's cleanest to use itertools.islice to get the big sublist
(not tested):
from itertools import islice
process1 (mylist[0])
for el in islice(mylist, 1, None):
    process2 (el)
The islice has a small, constant amount of storage overhead instead of
duplicating almost the whole list.
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2012-02-01 12:28 +0100 |
| Message-ID | <mailman.5299.1328095735.27778.python-list@python.org> |
| In reply to | #19698 |
Paul Rubin, 01.02.2012 10:25:
> Paulo da Silva writes:
>> process1(mylist[0])
>> for el in mylist[1:]:
>>     process2(el)
>>
>> This way mylist is almost duplicated, isn't it?
>
> I think it's cleanest to use itertools.islice to get the big sublist
> (not tested):
>
> from itertools import islice
>
> process1 (mylist[0])
> for el in islice(mylist, 1, None):
>     process2 (el)
>
> The islice has a small, constant amount of storage overhead instead of
> duplicating almost the whole list.
It also has a tiny runtime overhead, though. So, if your code is totally
performance critical and you really just want to strip off the first
element and then run through all the rest, it may still be better to go the
iter() + next() route.
python3.3 -m timeit -s 'l=list(range(100000))' \
'it = iter(l); next(it); all(it)'
1000 loops, best of 3: 935 usec per loop
python3.3 -m timeit -s 'l=list(range(100000))' \
-s 'from itertools import islice' \
'all(islice(l, 1, None))'
1000 loops, best of 3: 1.63 msec per loop
Stefan
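For anyone who wants to repeat this comparison from a script rather than the command line, a rough sketch using the standard `timeit` module follows. The function names and the repeat counts are arbitrary choices made here, the slice-copy variant is added only for reference, and the numbers will differ by machine and Python version.

```python
import timeit
from itertools import islice

data = list(range(100_000))  # same size as in the command-line runs above


def with_iter():
    it = iter(data)
    next(it)          # drop the first element
    return all(it)    # all() drains the remaining (truthy) items


def with_islice():
    return all(islice(data, 1, None))


def with_slice():
    return all(data[1:])  # copies almost the whole list first


LOOPS = 20  # arbitrary; raise it for steadier numbers
timings = {
    func.__name__: min(timeit.repeat(func, number=LOOPS, repeat=3))
    for func in (with_iter, with_islice, with_slice)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds / LOOPS * 1e6:.0f} usec per loop")
```

Taking the minimum of several repeats follows the usual `timeit` advice, since slower runs mostly reflect interference from other processes.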