Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #14852

Re: Patricia trie vs binary search.

From Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Newsgroups comp.lang.java.programmer
Subject Re: Patricia trie vs binary search.
References <jpmev4$63l$1@dont-email.me> <uAewr.45671$On2.38133@newsfe16.iad> <s0m5s7t5gt33plp5m6cn07g4ovehvedfb7@4ax.com>
Message-ID <ADDwr.2826$9Q6.1871@newsfe18.iad> (permalink)
Date 2012-05-27 22:00 -0700

Show all headers | View raw


On 5/27/12 6:44 PM, Gene Wirchenko wrote:
> On Sat, 26 May 2012 17:30:17 -0700, Daniel Pitts
> <newsgroup.nospam@virtualinfinity.net>  wrote:
>
> [snip]
>
>> I tend to use a Deterministic Finite State Automata for this.  You can
>> load the entire English dictionary fairly easily with that scheme. Yes,
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> you use a bit of memory, but unless you're doing this on an embedded
>> device, its probably not enough memory to be concerned about.
>
>       Including all affixes?
I suppose it depends on the particular dictionary, but we're only 
talking a few hundred thousand entries, at least with the Moby 
word-lists as a base:

cat compound-words.txt often-mispelled.txt english-most-frequent.txt 
male-names.txt female-names.txt common-names.txt common-dictionary.txt 
official-scrabble-* | sort | uniq | wc -lc
   388997 4801599

That's just over 4MB of strings, if stored as naively as possible. 
Certainly no problem for a modern day computer. It can actually be quite 
compressed when converted it to a DFSA, assuming reasonably optimized 
implementations.
>
> [snip]
>
> Sincerely,
>
> Gene Wirchenko

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Patricia trie vs binary search. markspace <-@.> - 2012-05-24 16:07 -0700
  Re: Patricia trie vs binary search. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-05-24 23:39 +0000
    Re: Patricia trie vs binary search. markspace <-@.> - 2012-05-24 17:56 -0700
      Password quality (Was: Patricia trie vs binary search.) Lew <lewbloch@gmail.com> - 2012-05-25 09:41 -0700
        Re: Password quality (Was: Patricia trie vs binary search.) markspace <-@.> - 2012-05-25 12:17 -0700
  Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-26 17:30 -0700
    Re: Patricia trie vs binary search. markspace <-@.> - 2012-05-26 18:17 -0700
    Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-27 18:44 -0700
      Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-27 22:00 -0700
        Re: Patricia trie vs binary search. markspace <-@.> - 2012-05-28 08:20 -0700
          Re: Patricia trie vs binary search. markspace <-@.> - 2012-05-28 14:38 -0700
        Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-28 09:20 -0700
          Re: Patricia trie vs binary search. Lew <noone@lewscanon.com> - 2012-05-28 21:54 -0700
            Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-29 09:14 -0700
              Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 09:55 -0700
                Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-29 11:17 -0700
                Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 11:22 -0700
                Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-29 14:44 -0700
                Re: Patricia trie vs binary search. Lew <lewbloch@gmail.com> - 2012-05-29 14:03 -0700
                Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-29 14:49 -0700
                Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 15:23 -0700
                Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 15:39 -0700
                Re: Patricia trie vs binary search. Gene Wirchenko <genew@ocis.net> - 2012-05-29 16:08 -0700
                Re: Patricia trie vs binary search. Lew <lewbloch@gmail.com> - 2012-05-29 18:25 -0700
            Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 09:16 -0700
              Re: Patricia trie vs binary search. Jeff Higgins <jeff@invalid.invalid> - 2012-05-29 13:37 -0400
                Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 10:49 -0700
                Re: Patricia trie vs binary search. Jeff Higgins <jeff@invalid.invalid> - 2012-05-29 13:58 -0400
                Re: Patricia trie vs binary search. Jeff Higgins <jeff@invalid.invalid> - 2012-05-29 14:20 -0400
                Re: Patricia trie vs binary search. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-29 11:21 -0700
                Re: Patricia trie vs binary search. Jeff Higgins <jeff@invalid.invalid> - 2012-05-29 14:29 -0400
                Re: Patricia trie vs binary search. Jeff Higgins <jeff@invalid.invalid> - 2012-05-29 15:00 -0400
  Re: Patricia trie vs binary search. Jeff Higgins <jeff@invalid.invalid> - 2012-05-29 09:24 -0400

csiph-web