| Path | csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!weretis.net!feeder4.news.weretis.net!news.musoftware.de!wum.musoftware.de!feeder.erje.net!news.internetdienste.de!news.tu-darmstadt.de!news.belwue.de!rz.uni-karlsruhe.de!feed.news.schlund.de!schlund.de!news.online.de!not-for-mail |
|---|---|
| From | Lothar Kimmeringer <news200709@kimmeringer.de> |
| Newsgroups | comp.lang.java.programmer, comp.programming, comp.lang.java.databases |
| Subject | Re: Storing large strings for future equality checks |
| Followup-To | comp.lang.java.programmer |
| Date | Wed, 8 Jun 2011 20:28:11 +0200 |
| Organization | Organization?! Only chaos here! |
| Lines | 42 |
| Message-ID | <171dpt2926br2.dlg@kimmeringer.de> (permalink) |
| References | <iso8cm$a80$1@speranza.aioe.org> |
| Reply-To | news@kimmeringer.de |
| NNTP-Posting-Host | mnch-4d044605.pool.mediaways.net |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset="us-ascii" |
| Content-Transfer-Encoding | 7bit |
| X-Trace | online.de 1307557691 32691 77.4.70.5 (8 Jun 2011 18:28:11 GMT) |
| X-Complaints-To | abuse@einsundeins.com |
| NNTP-Posting-Date | Wed, 8 Jun 2011 18:28:11 +0000 (UTC) |
| User-Agent | 40tude_Dialog/2.0.15.1de |
| Xref | x330-a1.tempe.blueboxinc.net comp.lang.java.programmer:5128 comp.programming:444 comp.lang.java.databases:466 |
Cross-posted to 3 groups.
Followups directed to: comp.lang.java.programmer
F'up to cljp
Abu Yahya wrote:
> I considered using an SHA-512 hash of these strings and storing them in
> the database. However, while these will save on storage space, it will
> take time to do the hashing before comparing an incoming string. So I'm
> still wasting time. (Collisions due to hashing will not be a problem,
> since an occasional false positive will not be fatal for my application).
If you write seldom and read often, why not use two columns:
string_hashcode
sha1_hashcode
If the first one matches, you calculate the SHA-1 hash of the string
to be checked, and if that matches as well, you can consider the
strings equal. I expect it to be very, very unlikely that both
hashes collide at the same time (which is why I switched the other
algorithm to SHA-1; it should be considerably faster than SHA-512).
So the more expensive algorithm is only calculated when storing to
the database and when checking a string whose String.hashCode()
matches a stored row, which in practice mostly means a string that is
already in the database. Even if that case occurs very often, you
might still get better performance with String.hashCode() plus SHA-1
than with SHA-512 alone.
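To make the idea concrete, here is a rough sketch in Java. It is not
tested against a real database: a HashMap stands in for the table, with
the key playing the role of string_hashcode and the list holding the
sha1_hashcode values of the rows that share it. The class name, the
method names, the UTF-8 encoding and the computation counter are my own
assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a table with the two columns
// string_hashcode and sha1_hashcode.
class TwoStageStringStore {
    // Key: string_hashcode; value: sha1_hashcode of every row with that key.
    private final Map<Integer, List<String>> rows = new HashMap<>();
    // Counts SHA-1 runs so the saving is visible (illustration only).
    private int sha1Computations = 0;

    // Storing: both hashes are calculated once per stored string.
    void add(String s) {
        rows.computeIfAbsent(s.hashCode(), k -> new ArrayList<>()).add(sha1Hex(s));
    }

    // Checking: SHA-1 is only calculated if the cheap hash already matched.
    // A true result means "equal with very high probability", not a proof.
    boolean probablyContains(String s) {
        List<String> candidates = rows.get(s.hashCode());
        if (candidates == null) {
            return false; // cheap hash differs, no SHA-1 needed
        }
        return candidates.contains(sha1Hex(s));
    }

    int sha1Computations() {
        return sha1Computations;
    }

    private String sha1Hex(String s) {
        sha1Computations++;
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

"Aa" and "BB" are a handy test pair because they have the same
String.hashCode() value: after add("Aa"), checking "BB" has to fall
through to SHA-1 and is rejected there, while checking "Ab" is rejected
by the cheap hash alone. In a real table you would probably put the
index on string_hashcode.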
> What would be the best approach?
There is no single best approach, only one that is optimal for your
case. Which one that is depends on what makes one way better than
another for you (performance, storage space, collision rate, etc.).
Regards, Lothar
--
Lothar Kimmeringer E-Mail: spamfang@kimmeringer.de
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong
questions!