Groups | Search | Server Info | Login | Register
Groups > comp.compilers > #162
| From | glen herrmannsfeldt <gah@ugcs.caltech.edu> |
|---|---|
| Newsgroups | comp.compilers |
| Subject | Re: Matching very large patterns, was Re: Inverse grep |
| Date | 2011-06-20 04:52 +0000 |
| Organization | A noiseless patient Spider |
| Message-ID | <11-06-036@comp.compilers> (permalink) |
| References | <11-06-015@comp.compilers> <11-06-022@comp.compilers> <11-06-033@comp.compilers> |
Chris F Clark <cfc@shell01.theworld.com> wrote: > Our moderator asked: >> How serious is the explosion problem in practice, now that compilers >> no longer have to fit into 64k? We use lex scanners with a thousand >> tokens, and spamassassin feeds about 2000 patterns to re2c if you use >> the compiled patterns feature. None of them seem to strain our storage >> systems. -John > For compilers on mainline chips this is generally not a problem. > However, this is such a useful tool, it gets used in areas where the > problems are much-much larger (or the device is smaller). The > Aho-Corasick algorithm is one of the key algorithms in matching a set > of strings to a large set of incoming data. The two places where I've > seen it used and where size is a problem is networking and > computational biology. The favorite for computational biology for many years was dynamic programming, though often it is too slow. The dynamic programming algorithm used for diff originated in the one Needleman-Wunsch used for protein sequences. (I believe that it is referenced in the diff documentation.) More recently, the Burrows-Wheeler transform, commonly used in compression, has become popular for computational biology. BLAST is based on a DFA similar to the first part of Aho-Corasick, but dynamic programming is used instead of the usual second part. -- glen
Back to comp.compilers | Previous | Next — Previous in thread | Next in thread | Find similar
Inverse grep glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-06-08 23:01 +0000
Re: Inverse grep Chris F Clark <cfc@shell01.TheWorld.com> - 2011-06-12 14:16 -0400
Re: Inverse grep Tony Finch <dot@dotat.at> - 2011-06-13 17:48 +0100
Re: Inverse grep torbenm@diku.dk (Torben Ægidius Mogensen) - 2011-06-14 11:16 +0200
Re: Inverse grep anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-06-15 12:28 +0000
Matching very large patterns, was Re: Inverse grep Chris F Clark <cfc@shell01.TheWorld.com> - 2011-06-19 21:45 -0400
Re: Matching very large patterns, was Re: Inverse grep glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-06-20 04:52 +0000
Re: Inverse grep Paul Wankadia <junyer@gmail.com> - 2012-05-28 07:50 -0700
csiph-web