


Re: Storing large strings for future equality checks

From Joshua Maurice <joshuamaurice@gmail.com>
Newsgroups comp.lang.java.programmer, comp.programming, comp.lang.java.databases
Subject Re: Storing large strings for future equality checks
Date Thu, 9 Jun 2011 15:01:27 -0700 (PDT)
Message-ID <21013c4d-3ae9-4e81-8999-d8c18e620e5c@f31g2000pri.googlegroups.com>
References <iso8cm$a80$1@speranza.aioe.org>


On Jun 8, 9:35 am, Abu Yahya <abu_ya...@invalid.com> wrote:
> A small application that I'm making requires me to store very long
> strings (>1000 characters) in a database.
>
> I will need to use these strings later to compare for equality to
> incoming strings from another application. I will also want to add some
> of the incoming strings to the storage, if they meet certain criteria.
>
> For my application, I get a feeling that storing these strings in my
> table will be a waste of space, and will impact performance due to
> retrieval and storage times, as well as comparison times.
>
> I considered using an SHA-512 hash of these strings and storing them in
> the database. However, while these will save on storage space, it will
> take time to do the hashing before comparing an incoming string. So I'm
> still wasting time. (Collisions due to hashing will not be a problem,
> since an occasional false positive will not be fatal for my application).
>
> What would be the best approach?

If performance matters enough that you're asking, measure first to see
whether it's actually a problem. If you're still concerned that it will
be, code a number of reasonable alternatives and measure them.
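
As a rough illustration of "measure first", here is a minimal sketch that
times SHA-512 hashing of a long string using the JDK's MessageDigest. The
class name HashTiming, the 2000-character sample, and the iteration count
are assumptions made up for this example. A wall-clock loop like this is
only a crude estimate; a benchmark harness such as JMH would give more
trustworthy numbers.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not from the original post.
class HashTiming {

    // Returns the 64-byte SHA-512 digest of the string's UTF-8 encoding.
    static byte[] sha512(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            return md.digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Assumption: the running JDK ships a SHA-512 provider.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    // Crude average cost of one hash in nanoseconds, after a warm-up pass.
    static long averageNanosPerHash(String sample, int iterations) {
        int sink = 0; // consume results so the loop is not optimized away
        for (int i = 0; i < iterations; i++) {
            sink += sha512(sample)[0]; // warm-up, not timed
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += sha512(sample)[0];
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Integer.MIN_VALUE) {
            System.out.println(sink); // practically never taken; keeps sink live
        }
        return elapsed / iterations;
    }

    public static void main(String[] args) {
        // Build a sample string of 2000 characters (assumed typical size).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append((char) ('a' + i % 26));
        }
        long nanos = averageNanosPerHash(sb.toString(), 10_000);
        System.out.println("approx. ns per SHA-512 hash: " + nanos);
    }
}
```

The same loop shape can be reused to time the alternatives (plain
String.equals, a database round trip, and so on) so that the numbers are
at least comparable with each other.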

Presumably you need to do a Map lookup on the incoming strings. I
thought about some interning scheme, but that won't work if you're
receiving a lot of new incoming strings. Storing hashes could work. Do
you need to store the strings in a database? If you can store them
locally, maybe a trie?
http://en.wikipedia.org/wiki/Trie
I somewhat doubt (maybe?) that you're going to get much better lookup
performance than a trie (but of course I would measure that too).
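
To make the trie suggestion concrete, here is a minimal in-memory sketch
for exact-match lookups. The class name StringTrie and its add/contains
methods are made up for this example, and the HashMap-per-node layout is
just one simple choice; it is not tuned for memory use, which may matter
with many long strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal trie for exact string membership; illustration only.
class StringTrie {

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored string ends at this node
    }

    private final Node root = new Node();

    // Stores the string; returns true if it was not already present.
    boolean add(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.computeIfAbsent(s.charAt(i), c -> new Node());
        }
        boolean added = !node.terminal;
        node.terminal = true;
        return added;
    }

    // Exact-match lookup; stops at the first character with no matching branch.
    boolean contains(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.get(s.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.terminal;
    }

    public static void main(String[] args) {
        StringTrie trie = new StringTrie();
        trie.add("a very long incoming string");
        System.out.println(trie.contains("a very long incoming string")); // true
        System.out.println(trie.contains("a very long"));                 // false
    }
}
```

A lookup walks at most one node per character and gives up at the first
mismatch, so a string that differs early from everything stored is
rejected without reading the rest of it.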


