| From | Adam Funk <a24061@ducksburg.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | hashing strings to integers (was: hashing strings to integers for sqlite3 keys) |
| Date | 2014-05-23 11:27 +0100 |
| Organization | $CABAL |
| Message-ID | <sck35bxdc5.ln2@news.ducksburg.com> |
| References | (1 earlier) <mailman.10220.1400764235.18130.python-list@python.org> <05c15bxrpj.ln2@news.ducksburg.com> <mailman.10223.1400768058.18130.python-list@python.org> <k9f15bxoql.ln2@news.ducksburg.com> <mailman.10225.1400772863.18130.python-list@python.org> |
On 2014-05-22, Peter Otten wrote:
> Adam Funk wrote:
>> Well, J*v* returns a byte array, so I used to do this:
>>
>> digester = MessageDigest.getInstance("MD5");
>> ...
>> digester.reset();
>> byte[] digest = digester.digest(bytes);
>> return new BigInteger(+1, digest);
>
> In Python 3 there's int.from_bytes()
>
> >>> h = hashlib.sha1(b"Hello world")
> >>> int.from_bytes(h.digest(), "little")
> 538059071683667711846616050503420899184350089339
Excellent, thanks for pointing that out. I've just recently started
using Python 3 instead of 2, & appreciate pointers to new things like
that. The only thing that really bugs me in Python 3 is that execfile
has been removed (I find it useful for testing things interactively).
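
For comparison with the Java snippet quoted above, here is a minimal sketch of the same digest-to-integer step in Python 3. The function name digest_to_int is made up for illustration. It uses "big" byte order because Java's BigInteger(signum, magnitude) constructor reads the magnitude as big-endian; Peter's example uses "little", which is just as usable as a hash provided the same order is used everywhere.

```python
import hashlib


def digest_to_int(data: bytes) -> int:
    """Return the SHA-1 digest of data as a non-negative integer.

    digest_to_int is a hypothetical helper name, not a library function.
    """
    digest = hashlib.sha1(data).digest()
    # "big" matches Java's new BigInteger(+1, digest): unsigned, big-endian.
    return int.from_bytes(digest, "big")


if __name__ == "__main__":
    n = digest_to_int(b"Hello world")
    # With big-endian order the result equals the hex digest read as base 16.
    print(n == int(hashlib.sha1(b"Hello world").hexdigest(), 16))
```

Python integers are arbitrary precision, so the full 160-bit digest fits without any special big-number type.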
>> I dunno why language designers don't make it easy to get a single big
>> number directly out of these things.
>
> You hardly ever need to manipulate the numerical value of the digest. And on
> its way into the database it will be re-serialized anyway.
I now agree that my original plan to hash strings for the SQLite3
table was pointless, so I've changed the subject header. :-)
I have had good reason to use int hashes in the past, however. I was
doing some experiments with Andrei Broder's "sketches of shingles"
technique for finding partial duplication between documents, & you
need integers for that so you can do modulo arithmetic.
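
A rough sketch of why integers matter there, under several assumptions: this is a simplified variant that keeps only the shingle hashes divisible by a modulus, not necessarily the exact scheme from Broder's work, and the shingle size, the modulus, the use of SHA-1 truncated to 64 bits, and all function names are illustrative choices.

```python
import hashlib


def shingles(text, k=4):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.split()
    count = max(len(words) - k + 1, 1)
    return {" ".join(words[i:i + k]) for i in range(count)}


def shingle_hash(shingle):
    """Map a shingle to a 64-bit integer (assumption: first 8 bytes of SHA-1)."""
    digest = hashlib.sha1(shingle.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(text, k=4, modulus=4):
    """Keep only the shingle hashes that are 0 modulo `modulus`.

    This is the step that needs integers: the modulo test samples roughly
    1/modulus of the shingles, and the same shingle is always either kept
    or dropped regardless of which document it appears in.
    """
    return {h for h in map(shingle_hash, shingles(text, k)) if h % modulus == 0}


def resemblance(sketch_a, sketch_b):
    """Estimate document overlap as the Jaccard ratio of two sketches."""
    union = sketch_a | sketch_b
    return len(sketch_a & sketch_b) / len(union) if union else 0.0


if __name__ == "__main__":
    a = sketch("the quick brown fox jumps over the lazy dog", modulus=1)
    b = sketch("the quick brown fox jumps over the sleepy cat", modulus=1)
    print(resemblance(a, b))
```

With modulus=1 every shingle is kept, which is handy for checking the code on short texts; larger values shrink the sketch at the cost of a noisier estimate.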
I've also used hashes of strings for other things involving
deduplication or fast lookups (because integer equality is faster than
string equality). I guess if it's just for deduplication, though, a
set of byte arrays is as good as a set of int?
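
To illustrate the deduplication case, a small sketch that stores raw digests in a set without converting them to integers. One caveat worth stating: in Python 3 the immutable bytes type (which is what digest() returns) is hashable and can go in a set, whereas a mutable bytearray cannot. The function name dedupe is made up for this example.

```python
import hashlib


def dedupe(strings):
    """Return the strings in first-seen order, dropping repeats.

    Membership is tracked with a set of raw SHA-1 digests (bytes objects),
    so no integer conversion is needed.
    """
    seen = set()
    unique = []
    for s in strings:
        key = hashlib.sha1(s.encode("utf-8")).digest()  # bytes, hashable
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique


if __name__ == "__main__":
    print(dedupe(["spam", "eggs", "spam", "ham", "eggs"]))
```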
--
Classical Greek lent itself to the promulgation of a rich culture,
indeed, to Western civilization. Computer languages bring us
doorbells that chime with thirty-two tunes, alt.sex.bestiality, and
Tetris clones. (Stoll 1995)