Re: collections.Counter surprisingly slow

References <roy-8C60F5.15590428072013@news.panix.com> <51f5843f$0$29971$c3e8da3$5496439d@news.astraweb.com> <kt5knb$9tj$1@ger.gmane.org>
From Joshua Landau <joshua@landau.ws>
Date 2013-07-29 13:07 +0100
Subject Re: collections.Counter surprisingly slow
Newsgroups comp.lang.python
Message-ID <mailman.5228.1375099712.3114.python-list@python.org> (permalink)

On 29 July 2013 12:46, Stefan Behnel <stefan_ml@behnel.de> wrote:

> Steven D'Aprano, 28.07.2013 22:51:
> > Calling Counter ends up calling essentially this code:
> >
> > for elem in iterable:
> >     self[elem] = self.get(elem, 0) + 1
> >
> > (although micro-optimized), where "iterable" is your data (lines).
> > Calling the get method has higher overhead than dict[key], that will also
> > contribute.
>
> It comes with a C accelerator (at least in Py3.4dev), but it seems like
> that stumbles a bit over its own feet. The accelerator function special
> cases the (exact) dict type, but the Counter class is a subtype of dict and
> thus takes the generic path, which makes it benefit a bit less than
> possible.
>
> Look for _count_elements() in
>
> http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c
>
> Nevertheless, even the generic C code path looks fast enough in general. I
> think the problem is just that the OP used Python 2.7, which doesn't have
> this accelerator function.
>

# Columns: _count_elements({}, items), _count_elements(dict_subclass(), items),
#          Counter(items), defaultdict(int) loop with exception handling
# "items" is always 1m long with varying levels of repetition

>>> for items in randoms:
...     helper.timeit(1), helper_subclass.timeit(1), counter.timeit(1), default.timeit(1)
...
(0.18816172199876746, 0.4679023139997298, 0.9684444869999425, 0.33518486200046027)
(0.2936601179990248, 0.6056111739999324, 1.1316078849995392, 0.46283868699902087)
(0.35396358400066674, 0.685048443998312, 1.2120939880005608, 0.5497965239992482)
(0.5337620789996436, 0.8658702100001392, 1.4507492869997805, 0.7772859329998028)
(0.745282343999861, 1.1455801379997865, 2.116569702000561, 1.3293145009993168)

:(

I have the helper, but Counter is still slow. Is the helper not being used
for some reason? Counter isn't even as fast as the helper called on a direct
dict subclass with no overridden methods.
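
For anyone wanting to reproduce a comparison along these lines, here is a
minimal sketch. The setup code behind the timings above was not posted, so
everything in it is an assumption: the names DictSubclass, count_with_helper,
count_with_defaultdict, make_items and bench are made up for illustration, the
input sizes are much smaller than 1m, and the defaultdict variant is a plain
counting loop rather than whatever "with exception handling" referred to.
_count_elements is a private function in the collections module (C-accelerated
on CPython 3.3 and later), so importing it directly is for experimentation
only.

```python
import random
import timeit
from collections import Counter, defaultdict
from collections import _count_elements  # private helper, may change between versions


class DictSubclass(dict):
    """Direct dict subclass with no overridden methods (hypothetical name)."""


def count_with_helper(items, mapping_type=dict):
    # Call the helper directly on a fresh mapping of the given type.
    counts = mapping_type()
    _count_elements(counts, items)
    return counts


def count_with_defaultdict(items):
    # Plain pure-Python counting loop, used here as a baseline.
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return counts


def make_items(n, distinct, seed=0):
    # Fewer distinct values means more repetition in the data.
    rng = random.Random(seed)
    return [rng.randrange(distinct) for _ in range(n)]


def bench(items):
    # Time each approach once on the same input, like timeit(1) in the post.
    candidates = {
        "helper on dict": lambda: count_with_helper(items),
        "helper on dict subclass": lambda: count_with_helper(items, DictSubclass),
        "Counter": lambda: Counter(items),
        "defaultdict loop": lambda: count_with_defaultdict(items),
    }
    return {name: timeit.timeit(fn, number=1) for name, fn in candidates.items()}


if __name__ == "__main__":
    for distinct in (10, 1000, 100000):
        items = make_items(200000, distinct)
        print(distinct, bench(items))
```

Absolute numbers will differ by machine and Python version; only the relative
ordering of the four columns is worth comparing against the figures above.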

