Path: csiph.com!usenet.pasdenom.info!gegeweb.org!eternal-september.org!feeder.eternal-september.org!mx04.eternal-september.org!.POSTED!not-for-mail From: Jeff Higgins Newsgroups: comp.lang.java.programmer Subject: Re: Patricia trie vs binary search. Date: Tue, 29 May 2012 09:24:16 -0400 Organization: A noiseless patient Spider Lines: 19 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Injection-Date: Tue, 29 May 2012 13:24:27 +0000 (UTC) Injection-Info: mx04.eternal-september.org; posting-host="lN5NBsxb/Hl+dKZczfqfVQ"; logging-data="15473"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/L7LUg/3rrAco+/oNfVichfAmTePRV1Sc=" User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.16) Gecko/20120507 Icedove/3.0.11 In-Reply-To: Cancel-Lock: sha1:/m3MZ7rb8MRsuxC2M+mtKUTUNbE= Xref: csiph.com comp.lang.java.programmer:14882 On 05/24/2012 07:07 PM, markspace wrote: > For some reason I was thinking about sub-string searches today. I > looked up Patricia tries (a kind of radix tree) to see if they would > help. While interesting, the radix tree seems to have a lot of overhead > for large numbers of entries. > > The radix tree uses a bucket at each level to hold all children (and > there could be quite a lot of children). Each child if present requires > a pointer (an object in Java) to hold it. For the example given, this > could be as much as one object per character in each string, plus the > bucket to hold it and its siblings. If the number strings is very large, > this could really result in an explosion of memory usage. > > > > So is there a better way for large data sets? > Different. Better?