Path: csiph.com!3.us.feeder.erje.net!feeder.erje.net!news.alt.net!feeder.usenetexpress.com!tr1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Christopher F Clark Newsgroups: comp.compilers Subject: How make multifinished DFA for merged regexps? Date: Fri, 20 Dec 2019 06:15:46 -0500 Organization: Compilers Central Lines: 40 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <19-12-007@comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="51403"; mail-complaints-to="abuse@iecc.com" Keywords: lex, DFA Posted-Date: 20 Dec 2019 10:53:52 EST X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:2388 Andy wrote: > I can create DFA direct from regexp. > But for language lexer I must have DFA for couple regexp. > One solution is crating DFA with multi finished states. example below > How to check if r0 and r1 are disjoint? > For example > r0 = ab ab(F0) > r1 = ac ac(F1) > | 0 | 1 > a | 1 | > b | | 2(F) 2(F0) > c | | 3(F) 3(F1) There are two solutions. 1, Don't just blindly merge your REs, keep the original regex name as part of the final step. See, how I indicated it in my tweak to your example. 2. Don't merge the REs at all, build the final DFA from the REs in parallel. The easiest way to see how to do that is to do roughly what building the LR(0) machine does when doing SLR/LALR parsing. That's a variant of the subset construction algorithm that does essentially what you want. That's how we made Yacc++ do ELR grammars. -- ****************************************************************************** Chris Clark email: christopher.f.clark@compiler-resources.com Compiler Resources, Inc. Web Site: http://world.std.com/~compres 23 Bailey Rd voice: (508) 435-5016 Berlin, MA 01503 USA twitter: @intel_chris ------------------------------------------------------------------------------