


Re: collections.Counter surprisingly slow

From Joshua Landau <joshua@landau.ws>
Date Mon, 29 Jul 2013 12:49:53 +0100
Subject Re: collections.Counter surprisingly slow
To Serhiy Storchaka <storchaka@gmail.com>
Cc python-list <python-list@python.org>
Newsgroups comp.lang.python
Message-ID <mailman.5226.1375098641.3114.python-list@python.org>


On 29 July 2013 07:25, Serhiy Storchaka <storchaka@gmail.com> wrote:

> 28.07.13 22:59, Roy Smith wrote:
>
>> The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
>> unique strings).
>
> Repeat your tests with totally unique lines.


Counter is about ½ the speed of defaultdict in that case (as opposed to ⅓).
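For anyone who wants to repeat the comparison, here is a rough sketch of the kind of test I mean. The function names, the synthetic inputs and their sizes are my own assumptions (I don't have Roy's file or his exact code), and the ratios you see will depend on your Python version and machine.

```python
from collections import Counter, defaultdict
import timeit


def count_defaultdict(lines):
    # defaultdict(int) creates a 0 entry the first time a key is seen.
    counts = defaultdict(int)
    for line in lines:
        counts[line] += 1
    return counts


def count_exception(lines):
    # Plain dict, falling back to KeyError for the first occurrence.
    counts = {}
    for line in lines:
        try:
            counts[line] += 1
        except KeyError:
            counts[line] = 1
    return counts


def count_counter(lines):
    # Counter incremented one item at a time (assumed to match the original test).
    counts = Counter()
    for line in lines:
        counts[line] += 1
    return counts


# Synthetic inputs (assumed sizes, far smaller than the 570,000-line file).
N = 20_000
unique_lines = [f"line {i}" for i in range(N)]
repeated_lines = [f"line {i % 400}" for i in range(N)]

for label, data in (("unique", unique_lines), ("repeated", repeated_lines)):
    for func in (count_defaultdict, count_exception, count_counter):
        # Best of three batches of five calls each.
        best = min(timeit.repeat(lambda: func(data), number=5, repeat=3))
        print(f"{label:8s} {func.__name__:18s} {best:.4f}s")
```

All three functions produce the same counts, so only the timings differ between the "unique" and "repeated" rows.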


>> The full profiler dump is at the end of this message, but the gist of
>> it is:
>
> The profiler affects execution time. In particular, it slows down the Counter
> implementation, which uses more function calls. For real-world measurements,
> use a different approach.


After re-timing, it seems that his original figures for defaultdict, exception
and Counter were about right. I haven't timed the other.
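A minimal sketch of one such "different approach", assuming a per-item Counter loop like the one under discussion: time the call with timeit (best of several runs) and, for comparison, time the same call while cProfile is instrumenting it. The data and the function name are made up for illustration, and I make no claim about how large the gap will be on your setup.

```python
from collections import Counter
import cProfile
import time
import timeit


def count_counter(lines):
    # Per-item increment on a Counter (assumed shape of the test being timed).
    counts = Counter()
    for line in lines:
        counts[line] += 1
    return counts


# Synthetic input: 50,000 lines drawn from 1,000 distinct strings (assumed sizes).
lines = [f"line {i % 1000}" for i in range(50_000)]

# Uninstrumented timing: take the best of five single runs.
plain = min(timeit.repeat(lambda: count_counter(lines), number=1, repeat=5))

# The same call while the profiler is recording every function call.
profiler = cProfile.Profile()
start = time.perf_counter()
profiler.runcall(count_counter, lines)
profiled = time.perf_counter() - start

print(f"timeit, best of 5: {plain:.4f}s")
print(f"under cProfile:    {profiled:.4f}s")
```

Taking the minimum of several runs reduces noise from whatever else the machine is doing; the profiled figure includes the profiler's own bookkeeping.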


>> Why is count() [i.e. collections.Counter] so slow?
>
> Feel free to contribute a patch which fixes this "wart". Note that Counter
> shouldn't be slowed down on mostly unique data.


I find it hard to agree that Counter should be optimised for the
unique-data case, as surely it's much more often used when there's a point to
counting?

Also, couldn't Counter just extend from defaultdict?
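To make the question concrete, here is a hypothetical sketch of what I have in mind. DefaultDictCounter is an invented name, not anything in the stdlib, and it only covers construction and most_common(). It also shows one behavioural difference to be aware of: looking up a missing key on a Counter returns 0 without storing it, whereas a defaultdict stores the default.

```python
from collections import Counter, defaultdict


class DefaultDictCounter(defaultdict):
    """Hypothetical Counter-like class built on defaultdict(int)."""

    def __init__(self, iterable=()):
        super().__init__(int)
        for item in iterable:
            self[item] += 1

    def most_common(self, n=None):
        # Sort by count, highest first; ties keep insertion order.
        ranked = sorted(self.items(), key=lambda kv: kv[1], reverse=True)
        return ranked if n is None else ranked[:n]


words = ["a", "b", "a", "c", "a", "b"]
ddc = DefaultDictCounter(words)
std = Counter(words)

# Both classes agree on the counts themselves.
same_counts = dict(ddc) == dict(std)
print(same_counts, ddc.most_common(2))

# Looking up a key that was never counted:
std["zzz"]  # Counter returns 0 and leaves the mapping unchanged
ddc["zzz"]  # defaultdict returns 0 and inserts the key
print("zzz" in std, "zzz" in ddc)
```

So a defaultdict-based version would grow on every failed lookup, which a real implementation would have to work around.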



Thread

collections.Counter surprisingly slow Roy Smith <roy@panix.com> - 2013-07-28 15:59 -0400
  Re: collections.Counter surprisingly slow Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-28 20:51 +0000
    Re: collections.Counter surprisingly slow Roy Smith <roy@panix.com> - 2013-07-28 17:57 -0400
    Re: collections.Counter surprisingly slow Stefan Behnel <stefan_ml@behnel.de> - 2013-07-29 13:46 +0200
    Re: collections.Counter surprisingly slow Joshua Landau <joshua@landau.ws> - 2013-07-29 13:07 +0100
  Re: collections.Counter surprisingly slow Serhiy Storchaka <storchaka@gmail.com> - 2013-07-29 09:25 +0300
  Re: collections.Counter surprisingly slow Joshua Landau <joshua@landau.ws> - 2013-07-29 12:49 +0100
  Re: collections.Counter surprisingly slow Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-29 11:19 -0600
  Re: collections.Counter surprisingly slow Serhiy Storchaka <storchaka@gmail.com> - 2013-07-29 22:37 +0300
  Re: collections.Counter surprisingly slow Stefan Behnel <stefan_ml@behnel.de> - 2013-07-30 08:39 +0200
  Re: collections.Counter surprisingly slow Stefan Behnel <stefan_ml@behnel.de> - 2013-07-30 08:51 +0200
  Re: collections.Counter surprisingly slow Serhiy Storchaka <storchaka@gmail.com> - 2013-07-30 16:04 +0300
