


Re: Memory usage per top 10x usage per heapy

Date Mon, 24 Sep 2012 21:14:23 -0400
From Dave Angel <d@davea.name>
To MrsEntity <junkshops@gmail.com>
Subject Re: Memory usage per top 10x usage per heapy
Cc python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.1260.1348535702.27098.python-list@python.org>



On 09/24/2012 05:59 PM, MrsEntity wrote:
> Hi all,
>
> I'm working on some code that parses a 500kb, 2M line file 

Just curious: which is it, two million lines or half a million bytes?

> line by line and saves, per line, some derived strings into various data structures. I thus expect that memory use should monotonically increase. Currently, the program is taking up so much memory - even on 1/2 sized files - that on a 2GB machine

Which machine has 2 GB, the Windows host or the Linux VM?  You could get
thrashing at either level.

> I'm thrashing swap. What's strange is that heapy (http://guppy-pe.sourceforge.net/) is showing that the code uses about 10x less memory than reported by top, and the heapy data seems consistent with what I was expecting based on the objects the code stores. I tried using memory_profiler (http://pypi.python.org/pypi/memory_profiler) but it didn't really provide any illuminating information. The code does create and discard a number of objects per line of the file, but they should not be stored anywhere, and heapy seems to confirm that. So, my questions are:
>
> 1) For those of you kind enough to help me figure out what's going on, what additional data would you like? I didn't want to swamp everyone with the code and heapy/memory_profiler output but I can do so if it's valuable.
> 2) How can I diagnose (and hopefully fix) what's causing the massive memory usage when it appears, from heapy, that the code is performing reasonably?
>
> Specs: Ubuntu 12.04 in Virtualbox on Win7/64, Python 2.7/64
>
> Thanks very much.

Tim raised most of my concerns, but I would point out that just because
you free memory in Python doesn't mean it gets released back to the
system.  The C runtime manages its own heap, and is pretty persistent
about hanging onto memory once obtained.  That's not normally a problem,
since most small blocks are reused, but the heap can get fragmented.
And I have no idea how well VirtualBox maps the Linux memory map onto
the Windows one.
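
One way to see the gap between what Python has freed and what the process still holds is to watch the resident set size around an allocation.  The following is only a minimal sketch, assuming a Unix-like system (it reads /proc/self/status on Linux and falls back to the peak figure from the resource module elsewhere).  The helper names rss_kb and measure, and the sample strings, are made up for illustration and are not from the original program.

```python
import resource
import sys


def rss_kb():
    """Return this process's resident set size in kB (current on Linux, peak elsewhere)."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])  # the kernel reports this value in kB
    except (IOError, OSError):
        pass  # no /proc here (assumption: a non-Linux Unix), so use the fallback
    # Fallback: peak RSS. ru_maxrss is in bytes on macOS and in kB on Linux.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak // 1024 if sys.platform == "darwin" else peak


def measure(count=200000):
    """Allocate `count` small strings, drop them, and report RSS at each step."""
    before = rss_kb()
    blocks = ["line %d of a hypothetical input file" % i for i in range(count)]
    peak = rss_kb()
    del blocks  # the Python-level objects are gone after this point
    after = rss_kb()
    return {"before": before, "peak": peak, "after": after}


if __name__ == "__main__":
    stats = measure()
    print("RSS before: %(before)d kB, with data: %(peak)d kB, "
          "after del: %(after)d kB" % stats)
```

Whether the "after" figure drops back toward "before" depends on the Python version, the C library's allocator, and how fragmented the heap is, so no particular output is claimed here.  If it stays well above "before" while heapy reports few live objects, that would be consistent with memory being held by the allocator rather than by your data structures.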



-- 

DaveA
