Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.java.programmer > #10808
| From | Tom Anderson <twic@urchin.earth.li> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: Fast String search algorithm |
| Date | 2011-12-16 18:58 +0000 |
| Organization | Stack Usenet News Service |
| Message-ID | <alpine.DEB.2.00.1112161852030.13913@urchin.earth.li> (permalink) |
| References | <i02ne7ptvvl8ofv9uracdu2jm3oq61v2h3@4ax.com> |
On Fri, 16 Dec 2011, Roedy Green wrote: > Let's say I had a million records each with a text field. I wanted to > find all records that contained a given substring. Are there fast > algorithms to do that or do you have to scan the whole thing linearly? http://en.wikipedia.org/wiki/Inverted_index > I imagine you could build an index of words. Then maybe you could > figure out which words contain the string, and narrow your search to > records with those words. Exactly that! > Similarly, I was wondering how you might build an index to all the files > on a computer so that you could find one just by giving a wild card or > partial filename. There is a standard utility on unix that does just that: http://en.wikipedia.org/wiki/Locate_%28Unix%29 The source is in C, but might be informative (this is for the 'mlocate' version): https://fedorahosted.org/mlocate/browser/src tom -- Get a fucking hobby that isn't breathing, browsing 4chan, or fapping. -- The Well Cultured Anonymous, on Manners
Back to comp.lang.java.programmer | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Fast String search algorithm Roedy Green <see_website@mindprod.com.invalid> - 2011-12-16 10:08 -0800
Re: Fast String search algorithm Tom Anderson <twic@urchin.earth.li> - 2011-12-16 18:58 +0000
Re: Fast String search algorithm markspace <-@.> - 2011-12-17 09:26 -0800
Re: Fast String search algorithm Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-12-17 21:07 -0800
Re: Fast String search algorithm Tassilo Horn <tassilo@member.fsf.org> - 2011-12-16 20:07 +0100
Re: Fast String search algorithm Roedy Green <see_website@mindprod.com.invalid> - 2011-12-16 13:32 -0800
Re: Fast String search algorithm Gene Wirchenko <genew@ocis.net> - 2011-12-16 12:12 -0800
Re: Fast String search algorithm glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-12-16 21:22 +0000
Re: Fast String search algorithm Arne Vajhøj <arne@vajhoej.dk> - 2011-12-16 22:20 -0500
Re: Fast String search algorithm Lew <lewbloch@gmail.com> - 2011-12-16 21:39 -0800
Re: Fast String search algorithm Wanja Gayk <brixomatic@yahoo.com> - 2011-12-17 16:15 +0100
Re: Fast String search algorithm Travers Naran <tnaran@gmail.com> - 2011-12-18 09:17 -0800
csiph-web