From: Tim Chase
Date: Tue, 25 Sep 2012 05:51:02 -0500
To: Dennis Lee Bieber
Subject: Re: Memory usage per top 10x usage per heapy
References: <983c532f-3ff6-4bd2-bb48-07cf4d065a4b@googlegroups.com>
Cc: python-list@python.org
Newsgroups: comp.lang.python

On 09/24/12 23:41, Dennis Lee Bieber wrote:
> On Mon, 24 Sep 2012 14:59:47 -0700 (PDT), MrsEntity
> declaimed the following in
> gmane.comp.python.general:
>
>> Hi all,
>>
>> I'm working on some code that parses a 500kb, 2M line file line by line and saves, per line, some derived strings
>
> Pardon? A 2million line file will contain, at the minimum 2million
> line-end characters. That four times 500kB just in the line-ends,
> ignoring any data.

As corrected later in the thread, MrsEntity writes

"""
I have, in fact, this very afternoon, invented a means of writing a
carriage return character using only 2 bits of information. I am
prepared to sell licenses to this revolutionary technology for the
low price of $29.95 plus tax.

Sorry, that should've been a 500Mb, 2M line file.
"""

If only other unnamed persons on the list were so gracious rather
than turning the flame-dial to 11.

I hope that when people come to the list, *this* is what they see,
laugh, and want to participate.

Although, MrsEntity could be zombie David A. Huffman, whose encoding
scheme actually *can* store 2M lines in 500kb :-)

-tkc
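
P.S. For anyone checking the arithmetic behind the joke, here is a rough sketch. It assumes "500kb" means 500 kB in decimal units (500,000 bytes) and that each line ends in a single one-byte newline; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the numbers quoted in the thread.
# Assumptions (not stated in the original post): "500kb" is read as
# 500,000 bytes, and a line end costs at least one byte.

FILE_SIZE_BYTES = 500 * 1000   # the mistyped "500kb" file size
LINE_COUNT = 2_000_000         # "2M line file"


def line_end_overhead_ratio(file_size_bytes, line_count, bytes_per_line_end=1):
    """How many times larger the line ends alone are than the claimed file size."""
    return (line_count * bytes_per_line_end) / file_size_bytes


def bits_available_per_line(file_size_bytes, line_count):
    """Bits the claimed file size leaves for each line, line end included."""
    return (file_size_bytes * 8) / line_count


ratio = line_end_overhead_ratio(FILE_SIZE_BYTES, LINE_COUNT)
bits = bits_available_per_line(FILE_SIZE_BYTES, LINE_COUNT)
print(f"line ends alone: {ratio:g}x the claimed file size")
print(f"budget per line: {bits:g} bits")
```

Under those assumptions the line ends come to four times the claimed size, and the budget works out to 2 bits per line, which is where the 2-bit carriage return comes from.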