Newsgroups: comp.lang.python
Date: Thu, 27 Sep 2012 01:00:51 -0700 (PDT)
Subject: Re: Memory usage per top 10x usage per heapy
From: bryanjugglercryptographer@yahoo.com

MrsEntity wrote:
> Based on heapy, a db based solution would be serious overkill.

I've embraced overkill and my life is better for it. Don't confuse overkill with cost. Overkill is your friend.

The facts of the case: You need to save some derived strings for each of 2M input lines. Even half the input runs over the 2GB RAM in your (virtual) machine. You're using Ubuntu 12.04 in VirtualBox on Win7/64, Python 2.7/64.

That screams "sqlite3". It's overkill, in a good way. It's already there for the importing.

Other approaches? You could try to keep everything in RAM, but use less. Tim Chase pointed out the memory-efficiency of named tuples. You could save some more by switching to Win7/32, Python 2.7/32; VirtualBox makes trying such alternatives quick and easy.

Or you could add memory. Compared to good old 32-bit, 64-bit operation consumes significantly more memory and supports vastly more memory. There's a bit of a mismatch in a 64-bit system with just 2GB of RAM. I know, sounds weird, "just" two billion bytes of RAM. I'll rephrase: just ten dollars worth of RAM. Less if you buy it where I do.

I don't know why the memory profiling tools are misleading you. I can think of plausible explanations, but they'd just be guesses.
There's nothing all that surprising in running out of RAM, given what you've explained. A couple K per line is easy to burn.

-Bryan
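
P.S. For what it's worth, here is a minimal sketch of the sqlite3 route. I haven't seen the actual processing code, so the table layout, the derive() function, and the batch size are all placeholders made up for illustration; only the sqlite3 calls themselves are the standard library's. The idea is just that derived strings go to the database in batches instead of piling up in RAM.

```python
import sqlite3


def derive(line):
    # Hypothetical stand-in for the real per-line processing.
    return line.strip().upper()


def store_derived(lines, db_path=":memory:", batch_size=10000):
    """Write one derived string per input line, holding only one batch in RAM."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS derived "
        "(lineno INTEGER PRIMARY KEY, value TEXT)"
    )
    batch = []
    for lineno, line in enumerate(lines, 1):
        batch.append((lineno, derive(line)))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO derived VALUES (?, ?)", batch)
            batch = []
    if batch:
        # Flush whatever is left over after the last full batch.
        conn.executemany("INSERT INTO derived VALUES (?, ?)", batch)
    conn.commit()
    return conn


# Tiny demonstration with three lines and an in-memory database.
conn = store_derived(["alpha\n", "beta\n", "gamma\n"])
count = conn.execute("SELECT COUNT(*) FROM derived").fetchone()[0]
second = conn.execute(
    "SELECT value FROM derived WHERE lineno = ?", (2,)
).fetchone()[0]
```

For the real 2M-line input you would pass the open file object as `lines` and a file name as `db_path`, so that the data lands on disk rather than in memory.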