


Re: Large list in memory slows Python

References <CAOCWG6p+JowpgT6dHYeUs4eBkDZsVtGCmSSyGOkY=haHTGXjjA@mail.gmail.com> <je017m$ar6$1@dough.gmane.org>
Date 2012-01-04 09:57 -0500
Subject Re: Large list in memory slows Python
From Benoit Thiell <bthiell@cfa.harvard.edu>
Newsgroups comp.lang.python
Message-ID <mailman.4411.1325689078.27778.python-list@python.org> (permalink)



On Tue, Jan 3, 2012 at 5:59 PM, Peter Otten <__peter__@web.de> wrote:
> Benoit Thiell wrote:
>
>> I am experiencing a puzzling problem with both Python 2.4 and Python
>> 2.6 on CentOS 5. I'm looking for an explanation of the problem and
>> possible solutions. Here is what I did:
>>
>> Python 2.4.3 (#1, Sep 21 2011, 19:55:41)
>> IPython 0.8.4 -- An enhanced Interactive Python.
>>
>> In [1]: def test():
>>    ...:     return [(i,) for i in range(10**6)]
>>
>> In [2]: %time x = test()
>> CPU times: user 0.82 s, sys: 0.04 s, total: 0.86 s
>> Wall time: 0.86 s
>>
>> In [4]: big_list = range(50 * 10**6)
>>
>> In [5]: %time y = test()
>> CPU times: user 9.11 s, sys: 0.03 s, total: 9.14 s
>> Wall time: 9.15 s
>>
>> As you can see, after creating a list of 50 million integers, creating
>> the same list of 1 million tuples takes about 10 times longer than the
>> first time.
>>
>> I ran these tests on a machine with 144GB of memory and it is not
>> swapping. Before creating the big list of integers, IPython used 111MB
>> of memory; after the creation, it used 1664MB of memory.
>
> In older Pythons the heuristic used to decide when to run the cyclic garbage
> collection is not well suited for the creation of many objects in a row.
> Try switching it off temporarily with
>
> import gc
> gc.disable()
> # create many objects that are here to stay
> gc.enable()
>
> You may also incorporate that into your test function:
>
> def test():
>    gc.disable()
>    try:
>        return [...]
>    finally:
>        gc.enable()

Thanks Peter, this is very helpful. Modifying my test according to
your directions produced much more consistent results.
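
For reference, here is a minimal sketch of what such a modified test might
look like. The function name build_tuples, its parameter n, and the
was_enabled check are illustrative assumptions rather than part of Peter's
snippet; the check simply avoids re-enabling the collector if the caller had
already turned it off.

```python
import gc


def build_tuples(n):
    """Build a list of n one-element tuples with the cyclic GC paused.

    Hypothetical helper: the name and the 'n' parameter are assumptions
    made for this sketch.
    """
    was_enabled = gc.isenabled()  # remember the caller's GC state
    gc.disable()                  # pause cyclic collection during the bulk allocation
    try:
        return [(i,) for i in range(n)]
    finally:
        # Restore the collector only if it was on when we were called.
        if was_enabled:
            gc.enable()


x = build_tuples(10**6)
```

Disabling the collector only pauses cycle detection; reference counting still
frees objects as usual, so this is reasonable for a short burst of
allocations that are meant to stay alive.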

Benoit.

-- 
Benoit Thiell
The SAO/NASA Astrophysics Data System
http://adswww.harvard.edu/
