Groups | Search | Server Info | Login | Register
Groups > comp.compilers > #150
| From | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
|---|---|
| Newsgroups | comp.compilers |
| Subject | Re: Inverse grep |
| Date | 2011-06-15 12:28 +0000 |
| Organization | Institut fuer Computersprachen, Technische Universitaet Wien |
| Message-ID | <11-06-024@comp.compilers> (permalink) |
| References | <11-06-015@comp.compilers> <11-06-022@comp.compilers> |
torbenm@diku.dk (Torben Fgidius Mogensen) writes: >glen herrmannsfeldt <gah@ugcs.caltech.edu> writes: > >> I suppose this is a strange question, but I was wondering if >> there was ever something like an inverse grep. That is, >> match a string against a file full of regular expressions. ... >[How serious is the explosion problem in practice, now that compilers >no longer have to fit into 64k? We use lex scanners with a thousand >tokens, and spamassassin feeds about 2000 patterns to re2c if you use >the compiled patterns feature. None of them seem to strain our storage >systems. -John] Even if it was a problem, the solution has been around a long time (although I am not aware of an academic paper about it): Generate the DFA on-demand, only for those transitions that are actually used. Then the worst-case memory demands are O(n) where n is the length of the input string (and in the usual case the memory usage grows much slower). In tree-parsing automata (i.e., not for regexps) I have had cases where the size of the statically generated automaton was a problem <https://www.complang.tuwien.ac.at/anton/lazyburg/gforth9.burg>: We could not generate the automata for the largest tree grammars in practical time and even had problems compiling some of the code that we could generate. OTOH, on-demand generation of automaton states worked even for the largest tree grammars, and memory consumption was on the order of 2MB IIRC. The numbers will be different for regexps, but I expect the effect to be the same: on-demand automata will work in many cases where static automata take too long to generate or are too big to run. The paper about this work is <http://www.complang.tuwien.ac.at/papers/ertl+06pldi.ps.gz> - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/
Back to comp.compilers | Previous | Next — Previous in thread | Next in thread | Find similar
Inverse grep glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-06-08 23:01 +0000
Re: Inverse grep Chris F Clark <cfc@shell01.TheWorld.com> - 2011-06-12 14:16 -0400
Re: Inverse grep Tony Finch <dot@dotat.at> - 2011-06-13 17:48 +0100
Re: Inverse grep torbenm@diku.dk (Torben Ægidius Mogensen) - 2011-06-14 11:16 +0200
Re: Inverse grep anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-06-15 12:28 +0000
Matching very large patterns, was Re: Inverse grep Chris F Clark <cfc@shell01.TheWorld.com> - 2011-06-19 21:45 -0400
Re: Matching very large patterns, was Re: Inverse grep glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-06-20 04:52 +0000
csiph-web