Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #102898
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: What is heating the memory here? hashlib? |
| Date | 2016-02-14 13:01 +1100 |
| Message-ID | <mailman.102.1455415322.22075.python-list@python.org> (permalink) |
| References | <n9o06t$1hjo$1@gioia.aioe.org> <n9oaja$39n$1@gioia.aioe.org> <mailman.99.1455403508.22075.python-list@python.org> <n9om6l$hkt$1@gioia.aioe.org> |
On Sun, Feb 14, 2016 at 12:44 PM, Paulo da Silva
<p_s_d_a_s_i_l_v_a_ns@netcabo.pt> wrote:
>> What happens if, after hashing each file (and returning from this
>> function), you call gc.collect()? If that reduces your RAM usage, you
>> have reference cycles somewhere.
>>
> I have used gc and del. No luck.
>
> The most probable cause seems to be hashlib not correctly handling big
> buffers updates. I am working in a computer and testing in another. For
> the second part may be somehow I forgot to transfer the change to the
> other computer. Unlikely but possible.
I'd like to see the problem boiled down to just the hashlib calls.
Something like this:
import hashlib
data = b"*" * 4*1024*1024
lastdig = None
while "simulating files":
h = hashlib.sha256()
hu = h.update
for chunk in range(100):
hu(data)
dig = h.hexdigest()
if lastdig is None:
lastdig = dig
print("Digest:",dig)
else:
if lastdig != dig:
print("Digest fail!")
Running this on my system (Python 3.6 on Debian Linux) produces a
long-running process with stable memory usage, which is exactly what
I'd expect. Even using different data doesn't change that:
import hashlib
import itertools
byte = itertools.count()
data = b"*" * 4*1024*1024
while "simulating files":
h = hashlib.sha256()
hu = h.update
for chunk in range(100):
hu(data + bytes([next(byte)&255]))
dig = h.hexdigest()
print("Digest:",dig)
Somewhere between my code and yours is something that consumes all
that memory. Can you neuter the actual disk reading (replacing it with
constants, like this) and make a complete and shareable program that
leaks all that memory?
ChrisA
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-13 19:29 +0000
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-13 22:26 +0000
Re: What is heating the memory here? hashlib? Chris Angelico <rosuav@gmail.com> - 2016-02-14 09:45 +1100
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-14 01:44 +0000
Re: What is heating the memory here? hashlib? Chris Angelico <rosuav@gmail.com> - 2016-02-14 13:01 +1100
Re: What is heating the memory here? hashlib? Steven D'Aprano <steve@pearwood.info> - 2016-02-14 13:21 +1100
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 08:05 +0000
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-14 07:04 +0000
Re: What is heating the memory here? hashlib? INADA Naoki <songofacandy@gmail.com> - 2016-02-14 18:49 +0900
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 07:38 +0000
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 02:21 +0000
Re: What is heating the memory here? hashlib? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2016-02-15 09:12 +0100
Re: What is heating the memory here? hashlib? Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-15 17:29 +0000
csiph-web