To: python-list@python.org
From: Dennis Lee Bieber
Subject: Re: Finding size of Variable
Date: Tue, 04 Feb 2014 09:06:48 -0500
Newsgroups: comp.lang.python

On Tue, 4 Feb 2014 05:19:48 -0800 (PST), Ayushi Dalmia declaimed the following:

>I need to chunk out the outputs otherwise it will give Memory Error. I need to do some postprocessing on the data read from the file too. If I do not stop before memory error, I won't be able to perform any more operations on it.

10 200MB files is only 2GB... Most any 64-bit processor these days can handle that. Even some 32-bit systems could handle it (WinXP booted with the server option gives 3GB to user processes -- if the 4GB was installed in the machine).

However, you speak of an n-way merge. The traditional merge operation only reads one record from each file at a time, examines them for "first", writes that "first", reads the next record from the file "first" came from, and then reassesses the set.

You mention needing to chunk the data -- that implies performing a merge sort in which you read a few records (say r) from each of the n files into memory, sort them, and write them out to newFile1; then read the same number of records from each file, sort, and write them to newFile2, and so on up to however many files you intend to work with -- at that point you go back and append the next chunk to newFile1. When done, each new file contains sorted chunks of n*r records.
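As a rough sketch of the two steps described so far (not a tested solution for your data): the function names n_way_merge and distribute_runs, and the parameter r, are made up for illustration. It assumes one record per line, that every line ends with a newline, that plain string comparison is the order you want, and that each input to n_way_merge is already sorted.

```python
import heapq
from itertools import islice


def n_way_merge(inputs, output):
    """Traditional merge: keep only one record (line) per input in memory.

    Assumes every input file is already sorted.
    """
    heap = []
    for idx, f in enumerate(inputs):
        line = f.readline()
        if line:                      # skip inputs that are empty
            heap.append((line, idx))
    heapq.heapify(heap)
    while heap:
        line, idx = heapq.heappop(heap)   # the current "first" record
        output.write(line)
        nxt = inputs[idx].readline()      # refill from the file it came from
        if nxt:
            heapq.heappush(heap, (nxt, idx))


def distribute_runs(inputs, outputs, r):
    """Chunked variant: read up to r records from each input, sort them in
    memory, and append the sorted chunk to the output files in rotation."""
    target = 0
    while True:
        chunk = []
        for f in inputs:
            chunk.extend(islice(f, r))    # at most r lines from this file
        if not chunk:                     # every input is exhausted
            break
        chunk.sort()
        outputs[target].writelines(chunk)
        target = (target + 1) % len(outputs)
```

Both functions take already-opened file objects, so memory use is bounded by one line per input in the first case and by n*r lines in the second. If the inputs are already sorted, the standard library's heapq.merge(*inputs) yields the same ordering as n_way_merge without the hand-written loop.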
You now make the newFilex files the inputs: read/merge the records from their first chunks, writing the output to another file1. When you reach the end of the first chunk in each input file, you read/merge the second chunks into another file2. You repeat this process until you end up with only one chunk in one file.

-- 
Wulfraed                 Dennis Lee Bieber         AF6VN
wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/