Path: csiph.com!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!npeer02.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!nx02.iad01.newshosting.com!newshosting.com!novia!news-out.readnews.com!news-xxxfer.readnews.com!news.misty.com!news.iecc.com!nerds-end From: Paul Wankadia Newsgroups: comp.compilers Subject: Re: Inverse grep Date: Mon, 28 May 2012 07:50:26 -0700 (PDT) Organization: Compilers Central Lines: 20 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <12-05-026@comp.compilers> References: <11-06-015@comp.compilers> NNTP-Posting-Host: news.iecc.com X-Trace: leila.iecc.com 1338230984 73349 64.57.183.58 (28 May 2012 18:49:44 GMT) X-Complaints-To: abuse@iecc.com NNTP-Posting-Date: Mon, 28 May 2012 18:49:44 +0000 (UTC) Keywords: lex Posted-Date: 28 May 2012 14:49:44 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com X-Received-Bytes: 1777 Xref: csiph.com comp.compilers:655 On Jun 9 2011, 9:01 am, glen herrmannsfeldt wrote: > I suppose this is a strange question, but I was wondering if > there was ever something like an inverse grep. That is, > match a string against a file full of regular expressions. > > Now, one could just read the file, compile the regex one at > a time, and do the match, but maybe there is another way. The best way to match lots of regular expressions is not to match lots of regular expressions. :) In a nutshell, extract the fixed substrings of the regular expressions, identify which of the fixed substrings (if any) are present in the input string and determine which of the regular expressions (if any) are worth trying to match. The RE2 library includes this functionality, but you will need to BYO Aho-Corasick implementation. Refer to http://code.google.com/p/re2/source/browse/re2/filtered_re2.h.