Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #44108

Re: List Count

Path csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!feeder.erje.net!eu.feeder.erje.net!xlned.com!feeder5.xlned.com!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Return-Path <oscar.j.benjamin@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.029
X-Spam-Evidence '*H*': 0.94; '*S*': 0.00; 'indicating': 0.07; 'indices': 0.07; 'plenty': 0.07; 'sys': 0.07; 'counting': 0.09; 'integers': 0.09; 'linear': 0.09; 'cc:addr:python-list': 0.11; 'python': 0.11; 'stored': 0.12; '(and,': 0.16; '2.7.3': 0.16; 'algorithmic': 0.16; 'benjamin': 0.16; 'bisect': 0.16; 'integers.': 0.16; 'interval.': 0.16; 'say.': 0.16; 'storing': 0.16; 'wrote:': 0.18; 'bit': 0.19; 'trying': 0.19; '>>>': 0.22; 'memory': 0.22; 'import': 0.22; 'cc:addr:python.org': 0.22; 'question': 0.24; 'cc:2**0': 0.24; 'cc:no real name:2**0': 0.24; 'certain': 0.27; 'values': 0.27; 'header:In-Reply-To:1': 0.27; 'correct': 0.29; 'array': 0.29; 'compared': 0.30; 'message- id:@mail.gmail.com': 0.30; 'gives': 0.31; '(on': 0.31; 'sep': 0.31; 'values.': 0.31; 'class': 0.32; 'themselves': 0.32; '(e.g.': 0.33; 'implemented': 0.33; 'not.': 0.33; 'sense': 0.34; "can't": 0.35; 'but': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'module.': 0.36; 'subject:List': 0.36; 'possible': 0.36; 'application': 0.37; 'two': 0.37; 'list': 0.37; 'saves': 0.38; 'rather': 0.38; 'realize': 0.39; 'enough': 0.39; 'how': 0.40; 'numbers': 0.61; 'range': 0.61; 'simple': 0.61; "you're": 0.61; 'further': 0.61; 'first': 0.61; 'save': 0.62; 'more': 0.64; 'total': 0.65; 'different': 0.65; 'within': 0.65; 'here': 0.66; 'between': 0.67; 'obvious': 0.74; 'prime': 0.74; 'potentially': 0.81; 'savings': 0.81; 'discover': 0.82; 'billion.': 0.84; 'density': 0.84; 'faster.': 0.84; 'oscar': 0.84; 'approach.': 0.91; '2013': 0.98
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=jyqV+i/J0UHQG7RFrC/Dnsz5C4Zr0Y6OVaKn6rL+/w8=; b=OoTRuU2O3w/RXW4unBKAnvPCGS+hjKXRnZtCwsYrqDhldmfWcitxQ2nunjJ2s5s4FI bCRvhf47NASXwnZVQjMsG2XwuPQRZG77BYCpOLaRElSr2X2YGoiTymME/wzHpKC8YGbv 9B0uG05QI3aa9j7+ox4sm0IuKZTYNlJCr6Y13IYOWBSh7Lz2dOfmkGR+iYEsafgXi6UK Zq+9zoHL804sB96ANZGc9MYEMSb7t0JuhBWDrBfjOGGdIiZfjmkRMngN8edQ5we+fA7C TwhxX5tlpiIisuAveQE88OXeIyCERJ1WIzxr1OXNtz2Y098FbPazbDxzMr2izy9Wv3L3 Vtqg==
X-Received by 10.58.144.133 with SMTP id sm5mr20323722veb.23.1366661915973; Mon, 22 Apr 2013 13:18:35 -0700 (PDT)
MIME-Version 1.0
In-Reply-To <51756769.20206@nowhere.org>
References <sridnffI6YhxuOjMnZ2dnUVZ7tSdnZ2d@brightview.co.uk> <5175377f$0$29977$c3e8da3$5496439d@news.astraweb.com> <517545F7.5090209@nowhere.org> <mailman.927.1366644147.3114.python-list@python.org> <51755C38.4000204@nowhere.org> <mailman.929.1366646795.3114.python-list@python.org> <51756769.20206@nowhere.org>
From Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date Mon, 22 Apr 2013 21:18:15 +0100
Subject Re: List Count
To Blind Anagram <blindanagram@nowhere.org>
Content-Type text/plain; charset=ISO-8859-1
Cc python-list@python.org
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.15
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.936.1366661918.3114.python-list@python.org> (permalink)
Lines 57
NNTP-Posting-Host 2001:888:2000:d::a6
X-Trace 1366661918 news.xs4all.nl 2296 [2001:888:2000:d::a6]:53737
X-Complaints-To abuse@xs4all.nl
Xref csiph.com comp.lang.python:44108

Show key headers only | View raw


On 22 April 2013 17:38, Blind Anagram <blindanagram@nowhere.org> wrote:
> On 22/04/2013 17:06, Oscar Benjamin wrote:
>
>> I don't know what your application is but I would say that my first
>> port of call here would be to consider a different algorithmic
>> approach. An obvious question would be about the sparsity of this data
>> structure. How frequent are the values that you are trying to count?
>> Would it make more sense to store a list of their indices?
>
> Actually it is no more than a simple prime sieve implemented as a Python
> class (and, yes, I realize that there are plenty of these around).

If I understand correctly, you have a list of roughly a billion
True/False values indicating which integers are prime and which are
not. You would like to discover how many prime numbers there are
between two numbers a and b. You currently do this by counting the
number of True values in your list between the indices a and b.

If my description is correct then I would definitely consider using a
different algorithmic approach. The density of primes from 1 to 1
billlion is about 5%. Storing the prime numbers themselves in a sorted
list would save memory and allow a potentially more efficient way of
counting the number of primes within some interval.

To see how it saves memory (on a 64 bit system):

$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> a = ([True] + [False]*19) * 50000
>>> len(a)
1000000
>>> sys.getsizeof(a)
8000072
>>> a = list(range(50000))
>>> sys.getsizeof(a)
450120
>>> sum(sys.getsizeof(x) for x in a)
1200000

So you're using about 1/5th of the memory with a list of primes
compared to a list of True/False values. Further savings would be
possible if you used an array to store the primes as 64 bit integers.
In this case it would take about 400MB to store all the primes up to 1
billion.

The more efficient way of counting the primes would then be to use the
bisect module. This gives you a way of counting the primes between a
and b with a cost that is logarithmic in the total number of primes
stored rather than linear in the size of the range (e.g. b-a). For
large enough primes/ranges this is certain to be faster. Whether it
actually works that way for your numbers I can't say.


Oscar

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 12:58 +0100
  Re: List Count Dave Angel <davea@davea.name> - 2013-04-22 08:51 -0400
    Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 14:03 +0100
    Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 14:03 +0100
  Re: List Count Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-22 13:13 +0000
    Re: List Count Skip Montanaro <skip@pobox.com> - 2013-04-22 08:57 -0500
    Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 15:15 +0100
      Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-22 16:14 +0100
        Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 16:50 +0100
          Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-22 17:06 +0100
            Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 17:38 +0100
              Re: List Count Skip Montanaro <skip@pobox.com> - 2013-04-22 12:48 -0500
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 20:22 +0100
              Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-22 21:18 +0100
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 22:25 +0100
                Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-23 00:06 +0100
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 07:45 +0100
                Re: List Count Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-22 23:28 +0000
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 08:00 +0100
              Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-22 22:03 +0100
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 22:32 +0100
                Re: List Count Dave Angel <davea@davea.name> - 2013-04-22 21:47 -0400
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 08:02 +0100
            Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 17:38 +0100
        Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-22 16:50 +0100
      Re: List Count Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-22 23:01 +0000
        Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 08:05 +0100
          Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-23 12:08 +0100
            Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 12:45 +0100
              Re: List Count Terry Jan Reedy <tjreedy@udel.edu> - 2013-04-23 15:01 -0400
          Re: List Count Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-23 14:49 +0000
            Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 17:57 +0100
              Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-23 18:45 +0100
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 19:30 +0100
                Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-23 20:16 +0100
              Re: List Count Terry Jan Reedy <tjreedy@udel.edu> - 2013-04-23 16:00 -0400
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-23 21:41 +0100
              Re: List Count Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-04-23 21:38 +0100
              Re: List Count Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-24 01:59 +0000
                Re: List Count Blind Anagram <blindanagram@nowhere.org> - 2013-04-24 10:01 +0100
  Re: List Count Peter Otten <__peter__@web.de> - 2013-04-22 15:22 +0200
  Re: List Count 88888 Dihedral <dihedral88888@googlemail.com> - 2013-04-22 06:36 -0700

csiph-web