Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #51405 > unrolled thread
| Started by | Roy Smith <roy@panix.com> |
|---|---|
| First post | 2013-07-28 15:59 -0400 |
| Last post | 2013-07-30 16:04 +0300 |
| Articles | 12 — 6 participants |
Back to article view | Back to comp.lang.python
collections.Counter surprisingly slow Roy Smith <roy@panix.com> - 2013-07-28 15:59 -0400
Re: collections.Counter surprisingly slow Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-28 20:51 +0000
Re: collections.Counter surprisingly slow Roy Smith <roy@panix.com> - 2013-07-28 17:57 -0400
Re: collections.Counter surprisingly slow Stefan Behnel <stefan_ml@behnel.de> - 2013-07-29 13:46 +0200
Re: collections.Counter surprisingly slow Joshua Landau <joshua@landau.ws> - 2013-07-29 13:07 +0100
Re: collections.Counter surprisingly slow Serhiy Storchaka <storchaka@gmail.com> - 2013-07-29 09:25 +0300
Re: collections.Counter surprisingly slow Joshua Landau <joshua@landau.ws> - 2013-07-29 12:49 +0100
Re: collections.Counter surprisingly slow Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-29 11:19 -0600
Re: collections.Counter surprisingly slow Serhiy Storchaka <storchaka@gmail.com> - 2013-07-29 22:37 +0300
Re: collections.Counter surprisingly slow Stefan Behnel <stefan_ml@behnel.de> - 2013-07-30 08:39 +0200
Re: collections.Counter surprisingly slow Stefan Behnel <stefan_ml@behnel.de> - 2013-07-30 08:51 +0200
Re: collections.Counter surprisingly slow Serhiy Storchaka <storchaka@gmail.com> - 2013-07-30 16:04 +0300
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-07-28 15:59 -0400 |
| Subject | collections.Counter surprisingly slow |
| Message-ID | <roy-8C60F5.15590428072013@news.panix.com> |
I've been doing an informal "intro to Python" lunchtime series for some
co-workers (who are all experienced programmers, in other languages).
This week I was going to cover list comprehensions, exceptions, and
profiling. So, I did a little demo showing different ways to build a
dictionary counting how many times a string appears in some input:
test() implements a "look before you leap" python loop
exception() uses a try/catch construct in a similar python loop
default() uses a defaultdict
count() uses a Counter
I profiled it, to show how the profiler works. The full code is below.
The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
unique strings). Python 2.7.3 on Ubuntu Precise.
As I expected, test() is slower than exception(), which is slower than
default(). I'm rather shocked to discover that count() is the slowest
of all! I expected it to be the fastest. Or, certainly, no slower
than default().
The full profiler dump is at the end of this message, but the gist of
it is:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.322 0.322 ./stations.py:42(count)
1 0.159 0.159 0.159 0.159 ./stations.py:17(test)
1 0.114 0.114 0.114 0.114 ./stations.py:27(exception)
1 0.097 0.097 0.097 0.097 ./stations.py:36(default)
Why is count() [i.e. collections.Counter] so slow?
-------------------------------------------------------------
from collections import defaultdict, Counter
def main():
lines = open('stations.txt').readlines()
d1 = test(lines)
d2 = exception(lines)
d3 = default(lines)
d4 = count(lines)
print d1 == d2
print d1 == d3
print d1 == d4
def test(lines):
d = {}
for station in lines:
if station in d:
d[station] += 1
else:
d[station] = 1
return d
def exception(lines):
d = {}
for station in lines:
try:
d[station] += 1
except KeyError:
d[station] = 1
return d
def default(lines):
d = defaultdict(int)
for station in lines:
d[station] += 1
return d
def count(lines):
d = Counter(lines)
return d
if __name__ == '__main__':
import cProfile
import pstats
cProfile.run('main()', 'stations.stats')
p = pstats.Stats('stations.stats')
p.sort_stats('cumulative').print_stats()
-------------------------------------------------------------
570335 function calls (570332 primitive calls) in 0.776 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.023 0.023 0.776 0.776 <string>:1(<module>)
1 0.006 0.006 0.753 0.753 ./stations.py:5(main)
1 0.000 0.000 0.322 0.322 ./stations.py:42(count)
1 0.000 0.000 0.322 0.322 /usr/lib/python2.7/collections.py:407(__init__)
1 0.242 0.242 0.322 0.322 /usr/lib/python2.7/collections.py:470(update)
1 0.159 0.159 0.159 0.159 ./stations.py:17(test)
1 0.114 0.114 0.114 0.114 ./stations.py:27(exception)
1 0.097 0.097 0.097 0.097 ./stations.py:36(default)
570285 0.080 0.000 0.080 0.000 {method 'get' of 'dict' objects}
1 0.055 0.055 0.055 0.055 {method 'readlines' of 'file' objects}
1 0.000 0.000 0.000 0.000 {isinstance}
1 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/abc.py:128(__instancecheck__)
2/1 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/abc.py:148(__subclasscheck__)
3/1 0.000 0.000 0.000 0.000 {issubclass}
4 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:58(__iter__)
1 0.000 0.000 0.000 0.000 {open}
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:26(__exit__)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:20(__enter__)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:36(__init__)
3 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:68(__contains__)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:52(_commit_removals)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:81(add)
3 0.000 0.000 0.000 0.000 {getattr}
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:16(__init__)
2 0.000 0.000 0.000 0.000 {method 'remove' of 'set' objects}
2 0.000 0.000 0.000 0.000 {method '__subclasses__' of 'type' objects}
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_abcoll.py:97(__subclasshook__)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
4 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
[toc] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-07-28 20:51 +0000 |
| Message-ID | <51f5843f$0$29971$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #51405 |
On Sun, 28 Jul 2013 15:59:04 -0400, Roy Smith wrote:
[...]
> I'm rather shocked to discover that count() is the slowest
> of all! I expected it to be the fastest. Or, certainly, no slower than
> default().
>
> The full profiler dump is at the end of this message, but the gist of it
> is:
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 1 0.000 0.000 0.322 0.322 ./stations.py:42(count)
> 1 0.159 0.159 0.159 0.159 ./stations.py:17(test)
> 1 0.114 0.114 0.114 0.114 ./stations.py:27(exception)
> 1 0.097 0.097 0.097 0.097 ./stations.py:36(default)
>
> Why is count() [i.e. collections.Counter] so slow?
It's within a factor of 2 of test, and 3 of exception or default (give or
take). I don't think that's surprisingly slow. In 2.7, Counter is written
in Python, while defaultdict has an accelerated C version. I expect that
has something to do with it.
Calling Counter ends up calling essentially this code:
for elem in iterable:
self[elem] = self.get(elem, 0) + 1
(although micro-optimized), where "iterable" is your data (lines).
Calling the get method has higher overhead than dict[key], that will also
contribute.
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-07-28 17:57 -0400 |
| Message-ID | <roy-4B9DE0.17571228072013@news.panix.com> |
| In reply to | #51407 |
In article <51f5843f$0$29971$c3e8da3$5496439d@news.astraweb.com>, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > > Why is count() [i.e. collections.Counter] so slow? > > It's within a factor of 2 of test, and 3 of exception or default (give or > take). I don't think that's surprisingly slow. It is for a module which describes itself as "High-performance container datatypes" :-)
[toc] | [prev] | [next] | [standalone]
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2013-07-29 13:46 +0200 |
| Message-ID | <mailman.5225.1375098437.3114.python-list@python.org> |
| In reply to | #51407 |
Steven D'Aprano, 28.07.2013 22:51: > Calling Counter ends up calling essentially this code: > > for elem in iterable: > self[elem] = self.get(elem, 0) + 1 > > (although micro-optimized), where "iterable" is your data (lines). > Calling the get method has higher overhead than dict[key], that will also > contribute. It comes with a C accelerator (at least in Py3.4dev), but it seems like that stumbles a bit over its own feet. The accelerator function special cases the (exact) dict type, but the Counter class is a subtype of dict and thus takes the generic path, which makes it benefit a bit less than possible. Look for _count_elements() in http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c Nevertheless, even the generic C code path looks fast enough in general. I think the problem is just that the OP used Python 2.7, which doesn't have this accelerator function. Stefan
[toc] | [prev] | [next] | [standalone]
| From | Joshua Landau <joshua@landau.ws> |
|---|---|
| Date | 2013-07-29 13:07 +0100 |
| Message-ID | <mailman.5228.1375099712.3114.python-list@python.org> |
| In reply to | #51407 |
[Multipart message — attachments visible in raw view] — view raw
On 29 July 2013 12:46, Stefan Behnel <stefan_ml@behnel.de> wrote:
> Steven D'Aprano, 28.07.2013 22:51:
> > Calling Counter ends up calling essentially this code:
> >
> > for elem in iterable:
> > self[elem] = self.get(elem, 0) + 1
> >
> > (although micro-optimized), where "iterable" is your data (lines).
> > Calling the get method has higher overhead than dict[key], that will also
> > contribute.
>
> It comes with a C accelerator (at least in Py3.4dev), but it seems like
> that stumbles a bit over its own feet. The accelerator function special
> cases the (exact) dict type, but the Counter class is a subtype of dict and
> thus takes the generic path, which makes it benefit a bit less than
> possible.
>
> Look for _count_elements() in
>
> http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c
>
> Nevertheless, even the generic C code path looks fast enough in general. I
> think the problem is just that the OP used Python 2.7, which doesn't have
> this accelerator function.
>
# _count_elements({}, items), _count_elements(dict_subclass(), items),
Counter(items), defaultdict(int) loop with exception handling
# "items" is always 1m long with varying levels of repetition
>>> for items in randoms:
... helper.timeit(1), helper_subclass.timeit(1), counter.timeit(1),
default.timeit(1)
...
(0.18816172199876746, 0.4679023139997298, 0.9684444869999425,
0.33518486200046027)
(0.2936601179990248, 0.6056111739999324, 1.1316078849995392,
0.46283868699902087)
(0.35396358400066674, 0.685048443998312, 1.2120939880005608,
0.5497965239992482)
(0.5337620789996436, 0.8658702100001392, 1.4507492869997805,
0.7772859329998028)
(0.745282343999861, 1.1455801379997865, 2.116569702000561,
1.3293145009993168)
:(
I have the helper but Counter is still slow. Is it not getting used for
some reason? It's not even as fast as helper on a dict's (direct, no
overridden methods) subclass.
[toc] | [prev] | [next] | [standalone]
| From | Serhiy Storchaka <storchaka@gmail.com> |
|---|---|
| Date | 2013-07-29 09:25 +0300 |
| Message-ID | <mailman.5210.1375079167.3114.python-list@python.org> |
| In reply to | #51405 |
28.07.13 22:59, Roy Smith написав(ла): > The input is an 8.8 Mbyte file containing about 570,000 lines (11,000 > unique strings). Repeat you tests with totally unique lines. > The full profiler dump is at the end of this message, but the gist of > it is: Profiler affects execution time. In particular it slowdown Counter implementation which uses more function calls. For real world measurement use different approach. > Why is count() [i.e. collections.Counter] so slow? Feel free to contribute a patch which fixes this "wart". Note that Counter shouldn't be slowdowned on mostly unique data.
[toc] | [prev] | [next] | [standalone]
| From | Joshua Landau <joshua@landau.ws> |
|---|---|
| Date | 2013-07-29 12:49 +0100 |
| Message-ID | <mailman.5226.1375098641.3114.python-list@python.org> |
| In reply to | #51405 |
[Multipart message — attachments visible in raw view] — view raw
On 29 July 2013 07:25, Serhiy Storchaka <storchaka@gmail.com> wrote: > 28.07.13 22:59, Roy Smith написав(ла): > > The input is an 8.8 Mbyte file containing about 570,000 lines (11,000 >> unique strings). >> > > Repeat you tests with totally unique lines. Counter is about ½ the speed of defaultdict in that case (as opposed to ⅓). > The full profiler dump is at the end of this message, but the gist of >> it is: >> > > Profiler affects execution time. In particular it slowdown Counter > implementation which uses more function calls. For real world measurement > use different approach. Doing some re-times, it seems that his originals for defaultdict, exception and Counter were about right. I haven't timed the other. > Why is count() [i.e. collections.Counter] so slow? >> > > Feel free to contribute a patch which fixes this "wart". Note that Counter > shouldn't be slowdowned on mostly unique data. I find it hard to agree that counter should be optimised for the unique-data case, as surely it's much more oft used when there's a point to counting? Also, couldn't Counter just extend from defaultdict?
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2013-07-29 11:19 -0600 |
| Message-ID | <mailman.5249.1375118411.3114.python-list@python.org> |
| In reply to | #51405 |
On Mon, Jul 29, 2013 at 5:49 AM, Joshua Landau <joshua@landau.ws> wrote: > Also, couldn't Counter just extend from defaultdict? It could, but I expect the C helper function in 3.4 will be faster since it doesn't even need to call __missing__ in the first place. And the cost (both in terms of maintenance and run-time) of adding defaultdict to the Counter MRO just to reuse one tiny __missing__ method seems questionable, especially after taking into account that the constructor signature is incompatible.
[toc] | [prev] | [next] | [standalone]
| From | Serhiy Storchaka <storchaka@gmail.com> |
|---|---|
| Date | 2013-07-29 22:37 +0300 |
| Message-ID | <mailman.5256.1375126808.3114.python-list@python.org> |
| In reply to | #51405 |
29.07.13 20:19, Ian Kelly написав(ла): > On Mon, Jul 29, 2013 at 5:49 AM, Joshua Landau <joshua@landau.ws> wrote: >> Also, couldn't Counter just extend from defaultdict? > > It could, but I expect the C helper function in 3.4 will be faster > since it doesn't even need to call __missing__ in the first place. I'm surprised, but the Counter constructor with commented out import of this accelerator is faster (at least for some data).
[toc] | [prev] | [next] | [standalone]
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2013-07-30 08:39 +0200 |
| Message-ID | <mailman.5302.1375166352.3114.python-list@python.org> |
| In reply to | #51405 |
Serhiy Storchaka, 29.07.2013 21:37: > 29.07.13 20:19, Ian Kelly написав(ла): >> On Mon, Jul 29, 2013 at 5:49 AM, Joshua Landau wrote: >>> Also, couldn't Counter just extend from defaultdict? >> >> It could, but I expect the C helper function in 3.4 will be faster >> since it doesn't even need to call __missing__ in the first place. > > I'm surprised, but the Counter constructor with commented out import of > this accelerator is faster (at least for some data). Read my post. The accelerator doesn't take the fast path for dicts as Counter is only a subtype of dict, not exactly a dict. That means that it raises and catches a KeyError exception for each new value that it finds, and that is apparently more costly than the overhead of calling get(). So, my expectation is that it's faster for highly repetitive data and slower for mostly unique data. Maybe a "fast_dict_lookup" option for the accelerator that forces the fast path would fix this. The Counter class, just like many (most?) other subtypes of dict, definitely doesn't need the fallback behaviour. Stefan
[toc] | [prev] | [next] | [standalone]
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2013-07-30 08:51 +0200 |
| Message-ID | <mailman.5303.1375167121.3114.python-list@python.org> |
| In reply to | #51405 |
Stefan Behnel, 30.07.2013 08:39: > Serhiy Storchaka, 29.07.2013 21:37: >> 29.07.13 20:19, Ian Kelly написав(ла): >>> On Mon, Jul 29, 2013 at 5:49 AM, Joshua Landau wrote: >>>> Also, couldn't Counter just extend from defaultdict? >>> >>> It could, but I expect the C helper function in 3.4 will be faster >>> since it doesn't even need to call __missing__ in the first place. >> >> I'm surprised, but the Counter constructor with commented out import of >> this accelerator is faster (at least for some data). > > Read my post. The accelerator doesn't take the fast path for dicts as > Counter is only a subtype of dict, not exactly a dict. That means that it > raises and catches a KeyError exception for each new value that it finds, > and that is apparently more costly than the overhead of calling get(). > > So, my expectation is that it's faster for highly repetitive data and > slower for mostly unique data. > > Maybe a "fast_dict_lookup" option for the accelerator that forces the fast > path would fix this. The Counter class, just like many (most?) other > subtypes of dict, definitely doesn't need the fallback behaviour. Or rather drop the fallback path completely. It's not worth having code duplication if it's not predictable up-front (before looking at the data) if it will help or not. http://bugs.python.org/issue18594 Stefan
[toc] | [prev] | [next] | [standalone]
| From | Serhiy Storchaka <storchaka@gmail.com> |
|---|---|
| Date | 2013-07-30 16:04 +0300 |
| Message-ID | <mailman.5305.1375189475.3114.python-list@python.org> |
| In reply to | #51405 |
29.07.13 14:49, Joshua Landau написав(ла): > I find it hard to agree that counter should be optimised for the > unique-data case, as surely it's much more oft used when there's a point > to counting? Different methods are faster for different data. LBYL approach is best for the mostly unique data case, while EAFP approach is best for the mostly repeated data case. In general case a performance of particular method is a function of its performances in this two extreme cases. When it slow for one of extreme case it can be slow in a number of intermediate cases. > Also, couldn't Counter just extend from defaultdict? Unfortunately this only will slowdown it.
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web