What is heating the memory here? hashlib?

From Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt>
Newsgroups comp.lang.python
Subject What is heating the memory here? hashlib?
Date 2016-02-13 19:29 +0000
Organization Aioe.org NNTP Server
Message-ID <n9o06t$1hjo$1@gioia.aioe.org>


Hello all.

I'm running into a very strange (for me at least) problem.

	def getHash(self):
		bfsz=File.blksz
		h=hashlib.sha256()
		hu=h.update
		with open(self.getPath(),'rb') as f:
			f.seek(File.hdrsz)	# Skip header
			b=f.read(bfsz)
			while len(b)>0:
				hu(b)
				b=f.read(bfsz)
		fhash=h.digest()
		return fhash

hdrsz is always 4K here. All files are greater than 4K.
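
For readers who want to reproduce the behaviour outside the original class, the method above can be sketched as a standalone function. This is a minimal sketch, not the poster's actual code: the function name `hash_after_header` and its default arguments are hypothetical, with `hdrsz` and `bfsz` standing in for `File.hdrsz` and `File.blksz`.

```python
import hashlib

def hash_after_header(path, hdrsz=4096, bfsz=16 * 1024):
    """Return the SHA-256 digest of a file, skipping its first hdrsz bytes.

    hdrsz and bfsz are stand-ins for File.hdrsz and File.blksz in the post.
    """
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        f.seek(hdrsz)              # skip the header
        while True:
            b = f.read(bfsz)       # a new bytes object of up to bfsz bytes
            if not b:              # empty bytes means end of file
                break
            h.update(b)
    return h.digest()
```

Calling it with different `bfsz` values on the same file should return the same digest, so only the memory behaviour differs between runs.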

If I use a 40MB bfsz, this takes up all my memory very quickly. After a few
hundred files it begins to swap, ending up with the program being
killed (BTW, I'm using Linux, Kubuntu 14.04).

If I reduce bfsz to 1MB, it successfully completes my full test (~100000
files), reaching about 6GB of memory.

If I further reduce bfsz to 16KB, there is no noticeable memory use!
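
One way to narrow down observations like these is to measure how much memory Python itself allocates for each block size. The sketch below uses the standard `tracemalloc` module; the helper names `hash_file` and `peak_traced_bytes` are hypothetical. Note the assumption: `tracemalloc` only sees allocations made through Python's own allocators, so it can show whether the read loop holds on to large objects, but it does not report the process size that the operating system sees.

```python
import hashlib
import tracemalloc

def hash_file(path, hdrsz, bfsz):
    """Same read loop as in the post, as a plain function (hypothetical name)."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        f.seek(hdrsz)              # skip the header
        b = f.read(bfsz)
        while b:
            h.update(b)
            b = f.read(bfsz)
    return h.digest()

def peak_traced_bytes(path, bfsz, hdrsz=4096):
    """Return the peak number of bytes tracemalloc saw while hashing one file."""
    tracemalloc.start()
    try:
        hash_file(path, hdrsz, bfsz)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()         # also discards the collected traces
    return peak
```

Running `peak_traced_bytes` on one sample file with block sizes of 16KB, 1MB and 40MB, and comparing the results with what `top` reports for the whole run, would indicate whether the growth comes from live Python objects or from somewhere below the interpreter.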

I have tried the following code, but it didn't fix the problem:

	def getHash(self):
		bfsz=File.blksz
		h=hashlib.sha256()
		hu=h.update
		with open(self.getPath(),'rb') as f:
			husz=8192
			f.seek(File.hdrsz)	# Skip header
			b=f.read(bfsz)
			while len(b)>0:
				for i in range(0,len(b),husz):
					hu(b[i:i+husz])
				b=f.read(bfsz)
		fhash=h.digest()
		return fhash

What is wrong here?!
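
Both versions above create a fresh bytes object of up to bfsz bytes on every `f.read(bfsz)`; the second one only changes how that object is fed to the hash. A variant that avoids the repeated large allocations is to read into one preallocated buffer with `readinto`. This is a sketch of an alternative technique, not something from the original post, and whether it changes the memory behaviour described here is not established. The function name `hash_after_header_readinto` and its defaults are hypothetical.

```python
import hashlib

def hash_after_header_readinto(path, hdrsz=4096, bfsz=1024 * 1024):
    """Hash a file past its header, reusing a single read buffer."""
    h = hashlib.sha256()
    buf = bytearray(bfsz)          # allocated once, reused for every read
    view = memoryview(buf)         # lets us slice the buffer without copying
    with open(path, 'rb') as f:
        f.seek(hdrsz)              # skip the header
        while True:
            n = f.readinto(buf)    # number of bytes actually read
            if not n:              # 0 means end of file
                break
            h.update(view[:n])     # hash only the bytes filled by this read
    return h.digest()
```

The digest should match the one produced by the `read`-based loop for the same file and header size.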

Thanks for any help/comments.
Paulo



Thread

What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-13 19:29 +0000
  Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-13 22:26 +0000
    Re: What is heating the memory here? hashlib? Chris Angelico <rosuav@gmail.com> - 2016-02-14 09:45 +1100
      Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-14 01:44 +0000
        Re: What is heating the memory here? hashlib? Chris Angelico <rosuav@gmail.com> - 2016-02-14 13:01 +1100
  Re: What is heating the memory here? hashlib? Steven D'Aprano <steve@pearwood.info> - 2016-02-14 13:21 +1100
    Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 08:05 +0000
  Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-14 07:04 +0000
    Re: What is heating the memory here? hashlib? INADA Naoki <songofacandy@gmail.com> - 2016-02-14 18:49 +0900
      Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 07:38 +0000
    Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 02:21 +0000
      Re: What is heating the memory here? hashlib? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2016-02-15 09:12 +0100
        Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 17:29 +0000
