From: Matt Timmermans
Newsgroups: comp.compilers
Subject: Re: How make multifinished DFA for merged regexps?
Date: Mon, 23 Dec 2019 22:29:57 -0800 (PST)
Organization: Compilers Central
Message-ID: <19-12-028@comp.compilers>
References: <19-12-005@comp.compilers>
Keywords: lex, DFA
Posted-Date: 25 Dec 2019 21:22:49 EST

On Thursday, 19 December 2019 23:01:24 UTC-5, Andy wrote:
> I can create DFA direct from regexp.
> But for language lexer I must have DFA for couple regexp.
> One solution is crating DFA with multi finished states.
> For example
> r0 = ab
> r1 = ac
>
>   | 0 | 1
> a | 1 |
> b |   | 2(F)
> c |   | 3(F)
>
> How to check if r0 and r1 are disjoint?

Build a single NFA for all the rules, with a distinct kind of accepting
state for each rule. When you build the DFA with subset construction,
each DFA state corresponds to a set of NFA states, and therefore each
accepting DFA state corresponds to a *set* of rules. The rules are all
disjoint if every one of those sets is a singleton.

If you do Hopcroft minimization, start from an initial partition that
puts the states for each distinct set of accepted rules in their own
block, so that states accepting different rules are never merged.

I have an open source project that does this, if it helps:
https://github.com/mtimmerm/dfalex
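
A minimal sketch of the idea in Python, not taken from dfalex. To avoid a full regexp parser it assumes each rule is just a sequence of character classes (so "ab" is the class "a" followed by the class "b"). The names build_nfa, subset_construction, rules_are_disjoint and initial_partition are made up for this illustration.

```python
from collections import deque

def build_nfa(rules):
    """Combine all rules into one NFA. Each rule is a sequence of character
    classes (an assumption of this sketch, standing in for real regexps)."""
    transitions = {}      # (nfa_state, char) -> set of nfa_states
    accepting = {}        # nfa_state -> index of the rule it accepts
    start_states = set()  # plays the role of the epsilon-closure of one start
    next_state = 0
    for rule_index, rule in enumerate(rules):
        state = next_state
        next_state += 1
        start_states.add(state)
        for char_class in rule:
            target = next_state
            next_state += 1
            for ch in char_class:
                transitions.setdefault((state, ch), set()).add(target)
            state = target
        accepting[state] = rule_index  # accepting state tagged with its rule
    return frozenset(start_states), transitions, accepting

def subset_construction(start, transitions, accepting):
    """Return (dfa_transitions, accepted_rules). Each DFA state is a frozenset
    of NFA states; accepted_rules maps it to the set of rules it accepts."""
    alphabet = {ch for (_, ch) in transitions}
    dfa_transitions = {}
    accepted_rules = {}
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        accepted_rules[current] = frozenset(
            accepting[s] for s in current if s in accepting)
        for ch in alphabet:
            target = frozenset(
                t for s in current for t in transitions.get((s, ch), ()))
            if not target:
                continue  # no transition on this character (dead state omitted)
            dfa_transitions[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_transitions, accepted_rules

def rules_are_disjoint(rules):
    """True if no DFA state accepts more than one rule."""
    start, transitions, accepting = build_nfa(rules)
    _, accepted_rules = subset_construction(start, transitions, accepting)
    return all(len(rule_set) <= 1 for rule_set in accepted_rules.values())

def initial_partition(accepted_rules):
    """Starting blocks for Hopcroft minimization: one block per distinct set
    of accepted rules (the empty set groups the non-accepting states)."""
    blocks = {}
    for dfa_state, rule_set in accepted_rules.items():
        blocks.setdefault(rule_set, set()).add(dfa_state)
    return list(blocks.values())
```

With the rules from the question, rules_are_disjoint(["ab", "ac"]) holds and the DFA has the four states shown in the table. A pair such as [["a", "b"], ["a", "bc"]] is not disjoint, because the DFA state reached on "ab" accepts both rules.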