What is heating the memory here? hashlib?

From Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt>
Newsgroups comp.lang.python
Subject What is heating the memory here? hashlib?
Date Sat, 13 Feb 2016 19:29:35 +0000
Message-ID <n9o06t$1hjo$1@gioia.aioe.org>


Hello all.

I'm running into a very strange (for me at least) problem.

	def getHash(self):
		bfsz=File.blksz
		h=hashlib.sha256()
		hu=h.update
		with open(self.getPath(),'rb') as f:
			f.seek(File.hdrsz)	# Skip header
			b=f.read(bfsz)
			while len(b)>0:
				hu(b)
				b=f.read(bfsz)
		fhash=h.digest()
		return fhash

hdrsz is always 4K here. All files are larger than 4K.

If I use a 40MB bfsz, this takes all my memory very quickly. After a few
hundred files it begins to swap, and the program ends up being
killed (BTW, I'm using Linux, Kubuntu 14.04).

If I reduce bfsz to 1MB, it successfully completes my full test (~100000
files), reaching about 6GB of memory.

If I reduce bfsz further to 16KB, there is no noticeable memory use!
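
To put numbers on the block-size effect described above, here is a minimal
sketch using the standard tracemalloc module (Python 3.4+). The function name
peak_traced_bytes and its parameters are hypothetical, not part of my program.
It only reports memory allocated through Python's allocators while one file is
hashed, so it may not match what top shows for the whole process:

```python
import hashlib
import tracemalloc


def peak_traced_bytes(path, block_size, header_size=4096):
    """Hash one file and return the peak number of bytes traced by tracemalloc.

    Hypothetical helper: header_size mirrors the 4K header mentioned above.
    """
    tracemalloc.start()
    try:
        h = hashlib.sha256()
        with open(path, 'rb') as f:
            f.seek(header_size)          # skip the header
            b = f.read(block_size)
            while b:
                h.update(b)
                b = f.read(block_size)   # the old block is still referenced here
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

Calling it on the same file with 16KB, 1MB and 40MB block sizes should show
whether the peak for a single file simply follows the block size, or whether
something else is holding on to memory between files.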

I have tried the following code, but it didn't fix the problem:

	def getHash(self):
		bfsz=File.blksz
		h=hashlib.sha256()
		hu=h.update
		with open(self.getPath(),'rb') as f:
			husz=8192
			f.seek(File.hdrsz)	# Skip header
			b=f.read(bfsz)
			while len(b)>0:
				for i in range(0,len(b),husz):
					hu(b[i:i+husz])
				b=f.read(bfsz)
		fhash=h.digest()
		return fhash
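
Another variant that might be worth testing is sketched below. It assumes the
repeated allocation of large read buffers is involved, which is only a guess.
It reads into one preallocated bytearray with readinto(), so a single buffer
is reused for the whole file. The name hash_file and its default values are
hypothetical:

```python
import hashlib


def hash_file(path, header_size=4096, block_size=1024 * 1024):
    """Return the SHA-256 digest of a file, skipping the first header_size bytes.

    Hypothetical standalone version of getHash that reuses one buffer.
    """
    h = hashlib.sha256()
    buf = bytearray(block_size)      # allocated once, reused for every read
    view = memoryview(buf)           # slicing a memoryview does not copy
    with open(path, 'rb') as f:
        f.seek(header_size)          # skip the header
        while True:
            n = f.readinto(buf)      # number of bytes actually read
            if not n:
                break
            h.update(view[:n])       # hash only the bytes that were read
    return h.digest()
```

If memory still grows with this version, the read buffers are probably not the
cause and the problem lies elsewhere in the program.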

What is wrong here?!

Thanks for any help/comments.
Paulo
