Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #29987

Re: Memory usage per top 10x usage per heapy

Date 2012-09-24 21:14 -0400
From Dave Angel <d@davea.name>
Subject Re: Memory usage per top 10x usage per heapy
References <983c532f-3ff6-4bd2-bb48-07cf4d065a4b@googlegroups.com>
Newsgroups comp.lang.python
Message-ID <mailman.1260.1348535702.27098.python-list@python.org> (permalink)

Show all headers | View raw


On 09/24/2012 05:59 PM, MrsEntity wrote:
> Hi all,
>
> I'm working on some code that parses a 500kb, 2M line file 

Just curious;  which is it, two million lines, or half a million bytes?

> line by line and saves, per line, some derived strings into various data structures. I thus expect that memory use should monotonically increase. Currently, the program is taking up so much memory - even on 1/2 sized files - that on 2GB machine 

which machine is 2gb, the Windows machine, or the VM?  You could get
thrashing at either level.

> I'm thrashing swap. What's strange is that heapy (http://guppy-pe.sourceforge.net/) is showing that the code uses about 10x less memory than reported by top, and the heapy data seems consistent with what I was expecting based on the objects the code stores. I tried using memory_profiler (http://pypi.python.org/pypi/memory_profiler) but it didn't really provide any illuminating information. The code does create and discard a number of objects per line of the file, but they should not be stored anywhere, and heapy seems to confirm that. So, my questions are:
>
> 1) For those of you kind enough to help me figure out what's going on, what additional data would you like? I didn't want swamp everyone with the code and heapy/memory_profiler output but I can do so if it's valuable.
> 2) How can I diagnose (and hopefully fix) what's causing the massive memory usage when it appears, from heapy, that the code is performing reasonably?
>
> Specs: Ubuntu 12.04 in Virtualbox on Win7/64, Python 2.7/64
>
> Thanks very much.

Tim raised most of my concerns, but I would point out that just because
you free up the memory from the Python doesn't mean it gets released
back to the system.  The C runtime manages its own heap, and is pretty
persistent about hanging onto memory once obtained.  It's not normally a
problem, since most small blocks are reused.  But it can get
fragmented.  And i have no idea how well Virtual Box maps the Linux
memory map into the Windows one.



-- 

DaveA

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Memory usage per top 10x usage per heapy MrsEntity <junkshops@gmail.com> - 2012-09-24 14:59 -0700
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-24 18:22 -0500
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-24 16:58 -0700
    Re: Memory usage per top 10x usage per heapy bryanjugglercryptographer@yahoo.com - 2012-09-27 01:00 -0700
    Re: Memory usage per top 10x usage per heapy bryanjugglercryptographer@yahoo.com - 2012-09-27 01:00 -0700
  Re: Memory usage per top 10x usage per heapy Dave Angel <d@davea.name> - 2012-09-24 21:14 -0400
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-24 21:21 -0700
  Re: Memory usage per top 10x usage per heapy Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-09-25 00:41 -0400
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-25 05:51 -0500
  Re: Memory usage per top 10x usage per heapy Dave Angel <d@davea.name> - 2012-09-25 07:06 -0400
  Re: Memory usage per top 10x usage per heapy Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-09-25 12:10 +0100
  Re: gracious responses (was: Memory usage per top 10x usage per heapy) Tim Chase <python.list@tim.thechases.com> - 2012-09-25 06:40 -0500
    Re: gracious responses (was: Memory usage per top 10x usage per heapy) alex23 <wuwei23@gmail.com> - 2012-09-25 05:44 -0700
      Re: gracious responses Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-09-25 13:53 +0100
  Re: gracious responses Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-09-25 12:54 +0100
    Re: gracious responses Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-25 15:17 +0000
  Re: Memory usage per top 10x usage per heapy Dave Angel <d@davea.name> - 2012-09-25 14:50 -0400
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-25 14:02 -0700
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-25 14:35 -0700
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-25 17:10 -0500
  Re: Memory usage per top 10x usage per heapy Ian Kelly <ian.g.kelly@gmail.com> - 2012-09-25 16:09 -0600
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-25 18:35 -0500

csiph-web