
Re: Memory usage per top 10x usage per heapy

Date 2012-09-24 21:21 -0700
From Junkshops <junkshops@gmail.com>
Subject Re: Memory usage per top 10x usage per heapy
References <983c532f-3ff6-4bd2-bb48-07cf4d065a4b@googlegroups.com> <5061056F.6080702@davea.name>
Newsgroups comp.lang.python
Message-ID <mailman.1267.1348546870.27098.python-list@python.org>



> Just curious;  which is it, two million lines, or half a million bytes?
I have, in fact, this very afternoon, invented a means of writing a 
carriage return character using only 2 bits of information. I am 
prepared to sell licenses to this revolutionary technology for the low 
price of $29.95 plus tax.

Sorry, that should've been a 500MB, 2M-line file.

> which machine is 2gb, the Windows machine, or the VM?
VM. Winders is 4gb.

> ...but I would point out that just because
> you free up the memory from the Python doesn't mean it gets released
> back to the system.  The C runtime manages its own heap, and is pretty
> persistent about hanging onto memory once obtained.  It's not normally a
> problem, since most small blocks are reused.  But it can get
> fragmented.  And i have no idea how well Virtual Box maps the Linux
> memory map into the Windows one.
Right, I understand that - but what's confusing me is that, given the 
memory use is (I assume) monotonically increasing, the code should never 
use more than what's reported by heapy once all the data is loaded into 
memory, given that memory released by the code to the Python runtime is 
reused. To the best of my ability to tell I'm not storing anything I 
shouldn't, so the only thing I can think of is that all the object 
creation and destruction is, for some reason, preventing reuse of 
memory. I'm at a bit of a loss regarding what to try next.
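
One way to see the gap described above is to measure the same workload from both sides: what the Python allocator reports as live, and what the OS reports as the process's peak resident size (the figure top is based on). The following is only a rough sketch under several assumptions: it uses Python 3's tracemalloc in place of heapy (tracemalloc does not exist on the Python 2.7 used in this thread), it relies on the Unix-only resource module, and churn() is a hypothetical stand-in for the real parser, not the actual code.

```python
import resource
import sys
import tracemalloc


def peak_rss_mib():
    """Peak resident set size of this process in MiB, as the OS sees it."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def churn(lines=50000):
    """Hypothetical stand-in for the parser (an assumption, not the real code).

    Each iteration builds several temporary strings and keeps only one.
    """
    kept = []
    for i in range(lines):
        scratch = [str(i + j) for j in range(10)]  # discarded every iteration
        kept.append(scratch[0])
    return kept


def compare(build):
    """Run build() and return Python-level and OS-level memory figures in MiB."""
    tracemalloc.start()
    data = build()  # hold the result so it counts as live memory
    live, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del data
    return {
        "python_live_mib": live / 2.0 ** 20,
        "python_peak_mib": peak / 2.0 ** 20,
        "os_peak_rss_mib": peak_rss_mib(),
    }


report = compare(churn)
for name in sorted(report):
    print("%-16s %8.1f MiB" % (name, report[name]))
```

The OS figure also covers the interpreter itself, loaded modules and any heap the C runtime is holding on to, so it will normally exceed the Python-level numbers. A small constant difference is expected; a difference that keeps growing with input size is the part worth investigating.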

Cheers, MrsE

On 9/24/2012 6:14 PM, Dave Angel wrote:
> On 09/24/2012 05:59 PM, MrsEntity wrote:
>> Hi all,
>>
>> I'm working on some code that parses a 500kb, 2M line file
> Just curious;  which is it, two million lines, or half a million bytes?
>
>> line by line and saves, per line, some derived strings into various data structures. I thus expect that memory use should monotonically increase. Currently, the program is taking up so much memory - even on 1/2 sized files - that on 2GB machine
> which machine is 2gb, the Windows machine, or the VM?  You could get
> thrashing at either level.
>
>> I'm thrashing swap. What's strange is that heapy (http://guppy-pe.sourceforge.net/) is showing that the code uses about 10x less memory than reported by top, and the heapy data seems consistent with what I was expecting based on the objects the code stores. I tried using memory_profiler (http://pypi.python.org/pypi/memory_profiler) but it didn't really provide any illuminating information. The code does create and discard a number of objects per line of the file, but they should not be stored anywhere, and heapy seems to confirm that. So, my questions are:
>>
>> 1) For those of you kind enough to help me figure out what's going on, what additional data would you like? I didn't want to swamp everyone with the code and heapy/memory_profiler output but I can do so if it's valuable.
>> 2) How can I diagnose (and hopefully fix) what's causing the massive memory usage when it appears, from heapy, that the code is performing reasonably?
>>
>> Specs: Ubuntu 12.04 in Virtualbox on Win7/64, Python 2.7/64
>>
>> Thanks very much.
> Tim raised most of my concerns, but I would point out that just because
> you free up the memory from the Python doesn't mean it gets released
> back to the system.  The C runtime manages its own heap, and is pretty
> persistent about hanging onto memory once obtained.  It's not normally a
> problem, since most small blocks are reused.  But it can get
> fragmented.  And i have no idea how well Virtual Box maps the Linux
> memory map into the Windows one.
>
>
>


