Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #29958

Re: Memory usage per top 10x usage per heapy

Date 2012-09-24 18:22 -0500
From Tim Chase <python.list@tim.thechases.com>
Subject Re: Memory usage per top 10x usage per heapy
References <983c532f-3ff6-4bd2-bb48-07cf4d065a4b@googlegroups.com>
Newsgroups comp.lang.python
Message-ID <mailman.1241.1348528879.27098.python-list@python.org> (permalink)

Show all headers | View raw


On 09/24/12 16:59, MrsEntity wrote:
> I'm working on some code that parses a 500kb, 2M line file line
> by line and saves, per line, some derived strings into various
> data structures. I thus expect that memory use should
> monotonically increase. Currently, the program is taking up so
> much memory - even on 1/2 sized files - that on 2GB machine I'm
> thrashing swap.

It might help to know what comprises the "into various data
structures".  I do a lot of ETL work on far larger files,
with similar machine specs, and rarely touch swap.

> 2) How can I diagnose (and hopefully fix) what's causing the
> massive memory usage when it appears, from heapy, that the code
> is performing reasonably?

I seem to recall that Python holds on to memory that the VM
releases, but that it *should* reuse it later.  So you'd get
the symptom of the memory-usage always increasing, never
decreasing.

Things that occur to me:

- check how you're reading the data:  are you iterating over
  the lines a row at a time, or are you using
  .read()/.readlines() to pull in the whole file and then
  operate on that?

- check how you're storing them:  are you holding onto more
  than you think you are?  Would it hurt to switch from a
  dict to store your data (I'm assuming here) to using the
  anydbm module to temporarily persist the large quantity of
  data out to disk in order to keep memory usage lower?

Without actual code, it's hard to do a more detailed
analysis.

-tkc

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Memory usage per top 10x usage per heapy MrsEntity <junkshops@gmail.com> - 2012-09-24 14:59 -0700
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-24 18:22 -0500
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-24 16:58 -0700
    Re: Memory usage per top 10x usage per heapy bryanjugglercryptographer@yahoo.com - 2012-09-27 01:00 -0700
    Re: Memory usage per top 10x usage per heapy bryanjugglercryptographer@yahoo.com - 2012-09-27 01:00 -0700
  Re: Memory usage per top 10x usage per heapy Dave Angel <d@davea.name> - 2012-09-24 21:14 -0400
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-24 21:21 -0700
  Re: Memory usage per top 10x usage per heapy Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-09-25 00:41 -0400
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-25 05:51 -0500
  Re: Memory usage per top 10x usage per heapy Dave Angel <d@davea.name> - 2012-09-25 07:06 -0400
  Re: Memory usage per top 10x usage per heapy Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-09-25 12:10 +0100
  Re: gracious responses (was: Memory usage per top 10x usage per heapy) Tim Chase <python.list@tim.thechases.com> - 2012-09-25 06:40 -0500
    Re: gracious responses (was: Memory usage per top 10x usage per heapy) alex23 <wuwei23@gmail.com> - 2012-09-25 05:44 -0700
      Re: gracious responses Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-09-25 13:53 +0100
  Re: gracious responses Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-09-25 12:54 +0100
    Re: gracious responses Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-25 15:17 +0000
  Re: Memory usage per top 10x usage per heapy Dave Angel <d@davea.name> - 2012-09-25 14:50 -0400
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-25 14:02 -0700
  Re: Memory usage per top 10x usage per heapy Junkshops <junkshops@gmail.com> - 2012-09-25 14:35 -0700
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-25 17:10 -0500
  Re: Memory usage per top 10x usage per heapy Ian Kelly <ian.g.kelly@gmail.com> - 2012-09-25 16:09 -0600
  Re: Memory usage per top 10x usage per heapy Tim Chase <python.list@tim.thechases.com> - 2012-09-25 18:35 -0500

csiph-web