From: Peter Otten <__peter__@web.de>
Subject: Re: List Count
Date: Mon, 22 Apr 2013 15:22:20 +0200
To: python-list@python.org
Newsgroups: comp.lang.python

Blind Anagram wrote:

> I would be grateful for any advice people can offer on the fastest way
> to count items in a sub-sequence of a large list.
>
> I have a list of boolean values that can contain many hundreds of
> millions of elements for which I want to count the number of True values
> in a sub-sequence, one from the start up to some value (say hi).
>
> I am currently using:
>
>    sieve[:hi].count(True)
>
> but I believe this may be costly because it copies a possibly large part
> of the sieve.
>
> Ideally I would like to be able to use:
>
>    sieve.count(True, hi)
>
> where 'hi' sets the end of the count but this function is, sadly, not
> available for lists.
>
> The use of a bytearray with a memoryview object instead of a list solves
> this particular problem but it is not a solution for me as it creates
> more problems than it solves in other aspects of the program.
>
> Can I assume that one possible solution would be to sub-class list and
> create a C based extension to provide list.count(value, limit)?
>
> Are there any other solutions that will avoid copying a large part of
> the list?

If the list doesn't change often you can convert it to a string:

>>> items = [True, False, False] * 10
>>> sitems = "".join("FT"[i] for i in items)
>>> sitems
'TFFTFFTFFTFFTFFTFFTFFTFFTFFTFF'
>>> sitems.count("T", 3, 10)
3
>>> sitems.count("F", 3, 10)
4

Or you use

a[3:10].sum()

on a boolean numpy array. Its slices are views rather than copies:

>>> import numpy
>>> a = numpy.array([True, False, False]*10)
>>> a[3:10].sum()
3
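
If the data has to stay in a plain list, one more option that avoids the copy made by sieve[:hi] is to iterate lazily with itertools.islice and sum the result. This is only a sketch: the helper name count_true_upto is made up for illustration, and it assumes the list holds nothing but True/False values (so that summing them counts the True entries). It still walks the first hi elements in Python-level iteration, so it saves memory rather than guaranteeing a speed-up.

```python
from itertools import islice


def count_true_upto(sieve, hi):
    """Count the True values in sieve[0:hi] without building a slice copy.

    Hypothetical helper for illustration; assumes every element is a bool,
    so that True adds 1 and False adds 0 to the sum.
    """
    # islice yields the first `hi` items lazily instead of copying them.
    return sum(islice(sieve, hi))


items = [True, False, False] * 10
# Should agree with the slice-and-count idiom from the question.
print(count_true_upto(items, 10) == items[:10].count(True))
```

Whether this beats the slicing version for very large lists would have to be measured; the string and numpy approaches above do the counting in C.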