Re: Large list in memory slows Python

From Peter Otten <__peter__@web.de>
Subject Re: Large list in memory slows Python
Date Tue, 03 Jan 2012 23:59:14 +0100
Newsgroups comp.lang.python
Message-ID <mailman.4392.1325631562.27778.python-list@python.org>

Benoit Thiell wrote:

> I am experiencing a puzzling problem with both Python 2.4 and Python
> 2.6 on CentOS 5. I'm looking for an explanation of the problem and
> possible solutions. Here is what I did:
> 
> Python 2.4.3 (#1, Sep 21 2011, 19:55:41)
> IPython 0.8.4 -- An enhanced Interactive Python.
> 
> In [1]: def test():
>    ...:     return [(i,) for i in range(10**6)]
> 
> In [2]: %time x = test()
> CPU times: user 0.82 s, sys: 0.04 s, total: 0.86 s
> Wall time: 0.86 s
> 
> In [4]: big_list = range(50 * 10**6)
> 
> In [5]: %time y = test()
> CPU times: user 9.11 s, sys: 0.03 s, total: 9.14 s
> Wall time: 9.15 s
> 
> As you can see, after creating a list of 50 million integers, creating
> the same list of 1 million tuples takes about 10 times longer than the
> first time.
> 
> I ran these tests on a machine with 144GB of memory and it is not
> swapping. Before creating the big list of integers, IPython used 111MB
> of memory; After the creation, it used 1664MB of memory.

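In older Pythons the heuristic that decides when to run the cyclic garbage
collector is not well suited to creating many objects in a row: every full
collection has to traverse all container objects that are already alive,
including your 50-million-element list.
Try switching it off temporarily with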

import gc
gc.disable()
# create many objects that are here to stay
gc.enable()

You may also incorporate that into your test function:

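import gc

def test():
    gc.disable()  # suspend the cyclic collector while the list is built
    try:
        return [...]
    finally:
        gc.enable()  # runs even if building the list raises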
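
One caveat with the function above: it turns the collector back on even if
the caller had deliberately switched it off earlier. If that matters to you,
a small context manager can remember the previous state. The following is
only a sketch; gc_paused is a name I made up, not something in the standard
library, and the list comprehension is the one from your own test().

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Temporarily disable the cyclic garbage collector.

    Hypothetical helper for illustration, not part of the standard library.
    """
    was_enabled = gc.isenabled()  # remember the caller's setting
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:  # re-enable only if it was on before
            gc.enable()


def test(n=10**6):
    # Same list of one-element tuples as in the quoted question.
    with gc_paused():
        return [(i,) for i in range(n)]
```

Reference counting still frees objects while the collector is paused; only
the detection of reference cycles is postponed, so keep the paused region
short.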
