Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #101891 > unrolled thread
| Started by | Arshpreet Singh <arsh840@gmail.com> |
|---|---|
| First post | 2016-01-18 23:27 -0800 |
| Last post | 2016-01-19 15:25 -0500 |
| Articles | 8 — 4 participants |
Back to article view | Back to comp.lang.python
Why generators take long time? Arshpreet Singh <arsh840@gmail.com> - 2016-01-18 23:27 -0800
Re: Why generators take long time? Arshpreet Singh <arsh840@gmail.com> - 2016-01-19 00:06 -0800
Re: Why generators take long time? Steven D'Aprano <steve@pearwood.info> - 2016-01-19 21:11 +1100
Re: Why generators take long time? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-01-19 10:24 +0000
Re: Why generators take long time? Steven D'Aprano <steve@pearwood.info> - 2016-01-20 10:58 +1100
Re: Why generators take long time? Arshpreet Singh <arsh840@gmail.com> - 2016-01-19 11:35 -0800
Re: Why generators take long time? Jason Swails <jason.swails@gmail.com> - 2016-01-19 15:19 -0500
Re: Why generators take long time? Jason Swails <jason.swails@gmail.com> - 2016-01-19 15:25 -0500
| From | Arshpreet Singh <arsh840@gmail.com> |
|---|---|
| Date | 2016-01-18 23:27 -0800 |
| Subject | Why generators take long time? |
| Message-ID | <1e092318-5052-4383-aabe-dd979d72645d@googlegroups.com> |
I was playing with Generators and found that using Generators time is bit more than list-comprehensions or I am doing it wrong?
Function with List comprehensions:
def sum_text(number_range):
return sum([i*i for i in xrange(number_range)])
%timeit sum_text(100000000)
1 loops, best of 3: 14.8 s per loop
Using generator Expressions:
def sum_text(number_range):
return sum((i*i for i in xrange(number_range)))
%timeit sum_text(100000000)
1 loops, best of 3: 16.4 s per loop
[toc] | [next] | [standalone]
| From | Arshpreet Singh <arsh840@gmail.com> |
|---|---|
| Date | 2016-01-19 00:06 -0800 |
| Message-ID | <aed75e44-e6b4-4c0b-b9f0-bfcd26d5ba00@googlegroups.com> |
| In reply to | #101891 |
On Tuesday, 19 January 2016 12:58:28 UTC+5:30, Arshpreet Singh wrote:
> I was playing with Generators and found that using Generators time is bit more than list-comprehensions or I am doing it wrong?
>
>
> Function with List comprehensions:
>
> def sum_text(number_range):
> return sum([i*i for i in xrange(number_range)])
>
> %timeit sum_text(100000000)
> 1 loops, best of 3: 14.8 s per loop
>
> Using generator Expressions:
>
> def sum_text(number_range):
>
> return sum((i*i for i in xrange(number_range)))
>
> %timeit sum_text(100000000)
>
> 1 loops, best of 3: 16.4 s per loop
Or even when I use generator function:
def sum_text(number_range):
yield sum((i*i for i in xrange(number_range)))
another function to get value:
def get_sum():
for i in sum_text(100000000):
return i
%timeit get_sum()
1 loops, best of 3: 16.1 s per loop
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-01-19 21:11 +1100 |
| Message-ID | <569e0beb$0$1621$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101891 |
On Tue, 19 Jan 2016 06:27 pm, Arshpreet Singh wrote:
>
> I was playing with Generators and found that using Generators time is bit
> more than list-comprehensions or I am doing it wrong?
Generators may have slightly more overhead than iterating over a list. They
save memory, not time.
On my machine, I see a small difference of about 15% between summing a list
comp or a generator expression:
[steve@ando ~]$ python -m timeit "sum([i for i in xrange(1000)])"
10000 loops, best of 3: 102 usec per loop
[steve@ando ~]$ python -m timeit "sum(i for i in xrange(1000))"
10000 loops, best of 3: 117 usec per loop
But both are MUCH slower than summing the xrange object itself:
[steve@ando ~]$ python -m timeit "sum(xrange(1000))"
10000 loops, best of 3: 21.4 usec per loop
The generator has to pause execution of the code, then start it up again,
over and over again, while the list comprehension can run without
interruption, then iterate over that list.
This is, I think, the fastest way to iterate over the objects, which
eliminates as much of the overhead as possible:
[steve@ando ~]$ python -m timeit -s "from collections import deque"
-s "it = iter([i for i in xrange(1000)])" "deque(it, maxlen=0)"
1000000 loops, best of 3: 0.913 usec per loop
[steve@ando ~]$ python -m timeit -s "from collections import deque"
-s "it = (i for i in xrange(1000))" "deque(it, maxlen=0)"
1000000 loops, best of 3: 0.965 usec per loop
which is about a 6% difference. So if you eliminate everything else, and
just iterate over a pre-populated list as fast as possible, versus a simple
generator expression, the list is about 6% faster.
BUT there is a lot of variation in these timings. For the list comprehension
version, I ran it four times and got these four results:
1000000 loops, best of 3: 0.913 usec per loop
1000000 loops, best of 3: 0.911 usec per loop
1000000 loops, best of 3: 0.977 usec per loop
1000000 loops, best of 3: 0.938 usec per loop
and for the generator expression:
1000000 loops, best of 3: 0.965 usec per loop
1000000 loops, best of 3: 0.914 usec per loop
1000000 loops, best of 3: 0.981 usec per loop
1000000 loops, best of 3: 0.931 usec per loop
So there's plenty of random variation in the times.
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2016-01-19 10:24 +0000 |
| Message-ID | <mailman.105.1453199051.15297.python-list@python.org> |
| In reply to | #101897 |
On 19 Jan 2016 10:16, "Steven D'Aprano" <steve@pearwood.info> wrote: > > [steve@ando ~]$ python -m timeit -s "from collections import deque" > -s "it = iter([i for i in xrange(1000)])" "deque(it, maxlen=0)" > 1000000 loops, best of 3: 0.913 usec per loop > > > [steve@ando ~]$ python -m timeit -s "from collections import deque" > -s "it = (i for i in xrange(1000))" "deque(it, maxlen=0)" > 1000000 loops, best of 3: 0.965 usec per loop Surely the iterator in the above benchmarks is consumed during the first of 1000000 loops. Subsequent loops are just calling next on an exhausted iterator. -- Oscar
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-01-20 10:58 +1100 |
| Message-ID | <569ecdb4$0$1600$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101899 |
On Tue, 19 Jan 2016 09:24 pm, Oscar Benjamin wrote: > On 19 Jan 2016 10:16, "Steven D'Aprano" <steve@pearwood.info> wrote: >> >> [steve@ando ~]$ python -m timeit -s "from collections import deque" >> -s "it = iter([i for i in xrange(1000)])" "deque(it, maxlen=0)" >> 1000000 loops, best of 3: 0.913 usec per loop >> >> >> [steve@ando ~]$ python -m timeit -s "from collections import deque" >> -s "it = (i for i in xrange(1000))" "deque(it, maxlen=0)" >> 1000000 loops, best of 3: 0.965 usec per loop > > Surely the iterator in the above benchmarks is consumed during the first > of 1000000 loops. Subsequent loops are just calling next on an exhausted > iterator. You know, I think you're right. Damn. I keep getting bitten by that aspects of iterators and timeit. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Arshpreet Singh <arsh840@gmail.com> |
|---|---|
| Date | 2016-01-19 11:35 -0800 |
| Message-ID | <8aca5094-8690-41db-8fa6-92243ed39712@googlegroups.com> |
| In reply to | #101897 |
On Tuesday, 19 January 2016 15:42:16 UTC+5:30, Steven D'Aprano wrote: > [steve@ando ~]$ python -m timeit -s "from collections import deque" > -s "it = iter([i for i in xrange(1000)])" "deque(it, maxlen=0)" > 1000000 loops, best of 3: 0.913 usec per loop > > > [steve@ando ~]$ python -m timeit -s "from collections import deque" > -s "it = (i for i in xrange(1000))" "deque(it, maxlen=0)" > 1000000 loops, best of 3: 0.965 usec per loop Thanks Steven, it was really helpful answer.
[toc] | [prev] | [next] | [standalone]
| From | Jason Swails <jason.swails@gmail.com> |
|---|---|
| Date | 2016-01-19 15:19 -0500 |
| Message-ID | <mailman.111.1453234780.15297.python-list@python.org> |
| In reply to | #101891 |
On Tue, Jan 19, 2016 at 2:27 AM, Arshpreet Singh <arsh840@gmail.com> wrote: > > I was playing with Generators and found that using Generators time is bit > more than list-comprehensions or I am doing it wrong? > > > Function with List comprehensions: > > def sum_text(number_range): > return sum([i*i for i in xrange(number_range)]) > > %timeit sum_text(100000000) > 1 loops, best of 3: 14.8 s per loop > > Using generator Expressions: > > def sum_text(number_range): > > return sum((i*i for i in xrange(number_range))) > > %timeit sum_text(100000000) > > 1 loops, best of 3: 16.4 s per loop > Steven already pointed out the additional overhead in a generator expression vs. a list comprehension. In addition to the memory savings you get via generator expressions, though, you can also get significant time savings when generator expressions have the ability to short-circuit. For instance, have a look at the following: In [1]: import random In [2]: %timeit all(random.random() < 0.5 for i in range(1000)) The slowest run took 4.85 times longer than the fastest. This could mean that an intermediate result is being cached 100000 loops, best of 3: 3.57 µs per loop In [3]: %timeit all([random.random() < 0.5 for i in range(1000)]) 1000 loops, best of 3: 422 µs per loop In [4]: %timeit any(random.random() < 0.5 for i in range(1000)) 100000 loops, best of 3: 3.18 µs per loop In [5]: %timeit any([random.random() < 0.5 for i in range(1000)]) 1000 loops, best of 3: 408 µs per loop This is using IPython with Python 3.5. The difference here is that for functions that short-circuit (like any and all), the generator expression does not have to exhaust all of its elements (particularly since for each element there's a 50-50 chance of being True or False in each case). In this case, the difference is a couple orders of magnitude. The larger the range argument is, the bigger this difference. Also, in Python 2, the generator expression does not leak into the global namespace, while the list comprehension does: Python 2.7.10 (default, Jul 14 2015, 19:46:27) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> list(i for i in range(10)) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> i Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'i' is not defined >>> [i for i in range(10)] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> i 9 Python 3 does not leak the iterator variable in either case. However, it would be madness to have code actually relying on this behavior :). At the end of the day, I use list comprehensions in the following circumstances: - I *know* I won't blow memory with a too-large list - I want to iterate over the object multiple times or I want/may want non-sequential access - I know I want all the elements I'm creating (i.e., no chance of short-circuiting) I use generator expressions when - I *might* want to All the best, Jason P.S. There is a "cross-over" point where the memory requirements of the list comp passes the generator overhead. For instance: In [17]: %timeit sum(i for i in range(10000000)) 1 loops, best of 3: 2.08 s per loop In [18]: %timeit sum([i for i in range(10000000)]) 1 loops, best of 3: 1.86 s per loop In [19]: %timeit sum(i for i in range(100000000)) 1 loops, best of 3: 21.8 s per loop In [20]: %timeit sum([i for i in range(100000000)]) 1 loops, best of 3: 26.1 s per loop -- Jason M. Swails BioMaPS, Rutgers University Postdoctoral Researcher
[toc] | [prev] | [next] | [standalone]
| From | Jason Swails <jason.swails@gmail.com> |
|---|---|
| Date | 2016-01-19 15:25 -0500 |
| Message-ID | <mailman.112.1453235154.15297.python-list@python.org> |
| In reply to | #101891 |
On Tue, Jan 19, 2016 at 3:19 PM, Jason Swails <jason.swails@gmail.com> wrote: > > I use generator expressions when > > - I *might* want to > I forgot to finish my thought here. I use generator expressions when I don't want to worry about memory, there's a decent chance of short-circuiting, or I just want to do some simple iteration where the iterator will not be the bottleneck (which is almost always). -- Jason M. Swails BioMaPS, Rutgers University Postdoctoral Researcher
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web