From: Peter Otten <__peter__@web.de>
To: python-list@python.org
Subject: Re: Finding size of Variable
Date: Tue, 04 Feb 2014 12:40:25 +0100
Newsgroups: comp.lang.python

Ayushi Dalmia wrote:

> I have 10 files and I need to merge them (using K way merging). The size
> of each file is around 200 MB. Now suppose I am keeping the merged data in
> a variable named mergedData, I had thought of checking the size of
> mergedData using sys.getsizeof() but it somehow doesn't give the actual
> value of the memory occupied.
>
> For example, if a file in my file system occupies 4 KB of data, if I read
> all the lines in a list, the size of the list is around 2100 bytes only.
>
> Where am I going wrong? What are the alternatives I can try?

sys.getsizeof() gives you the size of the list object only, that is, its
header plus one pointer per item; to complete the picture you have to add
the sizes of the line strings the list refers to.

However, why do you want to keep track of the actual memory used by
variables in your script? You should instead concentrate on the algorithm,
and as long as either the size of the dataset is manageable or you can
limit the amount of data accessed at a given time you are golden.
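
To illustrate both points, here is a minimal sketch, not a measurement of
what the process really uses. The helper names shallow_and_deep_size() and
merge_sorted() are made up for this example. It assumes that every line is
a separate str object and that each input to the merge is already sorted.
heapq.merge() from the standard library is one way to do a K-way merge
lazily; it is not necessarily what the original script uses.

```python
import heapq
import sys


def shallow_and_deep_size(lines):
    """Return (size of the list object alone, list plus its line strings)."""
    # getsizeof() on a list counts the list header and its item pointers only.
    shallow = sys.getsizeof(lines)
    # Assumption: each line is a distinct str object. A string referenced
    # more than once would be counted once per reference here.
    deep = shallow + sum(sys.getsizeof(line) for line in lines)
    return shallow, deep


def merge_sorted(sources):
    """Lazily merge already-sorted iterables of lines into one sorted stream.

    heapq.merge() pulls items on demand, so only one pending item per
    source is held by the merge at any time.
    """
    return heapq.merge(*sources)


# Size of the list alone versus the list plus the strings it refers to.
lines = ["line %d\n" % i for i in range(1000)]
shallow, deep = shallow_and_deep_size(lines)
print("list only:", shallow, "bytes; list plus lines:", deep, "bytes")

# Small in-memory stand-ins for sorted files.
merged = list(merge_sorted([["a\n", "d\n"], ["b\n", "e\n"], ["c\n"]]))
print(merged)
```

With real files you could pass the open file objects to merge_sorted()
directly, since iterating over a file yields one line at a time, and write
each merged line out as it arrives instead of collecting everything in
mergedData.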