Path: csiph.com!xmission!news.snarked.org!news.linkpendium.com!news.linkpendium.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: Matt Timmermans <matt.timmermans@gmail.com>
Newsgroups: comp.compilers
Subject: Re: How make multifinished DFA for merged regexps?
Date: Mon, 23 Dec 2019 22:29:57 -0800 (PST)
Organization: Compilers Central
Lines: 26
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <19-12-028@comp.compilers>
References: <19-12-005@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="26605"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, DFA
Posted-Date: 25 Dec 2019 21:22:49 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <19-12-005@comp.compilers>
Xref: csiph.com comp.compilers:2409

On Thursday, 19 December 2019 23:01:24 UTC-5, Andy  wrote:
> I can create DFA direct from regexp.
> But for language lexer I must have DFA for couple regexp.
> One solution is crating DFA with multi finished states.
> For example
> r0 = ab
> r1 = ac
>
>   | 0 | 1
> a | 1 |
> b |   | 2(F)
> c |   | 3(F)
>
> How to check if r0 and r1 are disjoint?

You build a single NFA for all the rules, with a different kind of
accepting state for each rule.  When you build the DFA with subset
construction, each DFA state will correspond to a set of NFA states,
and therefore each accepting DFA state will correspond to a *set* of
rules.

The rules are pairwise disjoint exactly when all those sets are singletons.
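
As a rough illustration of that check, here is a minimal Python sketch.
It assumes the rules are plain literal strings (like r0 = ab and r1 = ac
from the question) so that no regexp parser is needed. The function names
literal_nfa, subset_construction and overlapping_rule_sets are made up for
this example and are not taken from dfalex.

```python
from collections import deque

def literal_nfa(rules):
    """Build one NFA for several literal strings, all sharing start state 0.

    Returns (delta, accept): delta maps (state, symbol) -> set of states,
    accept maps an accepting NFA state -> index of the rule it accepts.
    (Hypothetical helper; real rules would come from a regexp compiler.)
    """
    delta, accept, next_state = {}, {}, 1
    for rule_index, text in enumerate(rules):
        current = 0
        for symbol in text:
            delta.setdefault((current, symbol), set()).add(next_state)
            current = next_state
            next_state += 1
        accept[current] = rule_index
    return delta, accept

def subset_construction(delta, accept):
    """Return (dfa_delta, dfa_rules); DFA states are frozensets of NFA states.

    dfa_rules maps each accepting DFA state to the set of rules it accepts.
    """
    start = frozenset([0])
    dfa_delta, dfa_rules = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        subset = queue.popleft()
        rules_here = frozenset(accept[s] for s in subset if s in accept)
        if rules_here:
            dfa_rules[subset] = rules_here
        # Collect, per input symbol, every NFA state reachable from this subset.
        moves = {}
        for (state, symbol), targets in delta.items():
            if state in subset:
                moves.setdefault(symbol, set()).update(targets)
        for symbol, targets in moves.items():
            target = frozenset(targets)
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, dfa_rules

def overlapping_rule_sets(dfa_rules):
    """Rule sets with more than one member; empty means the rules are disjoint."""
    return {rules for rules in dfa_rules.values() if len(rules) > 1}

delta, accept = literal_nfa(["ab", "ac"])
_, dfa_rules = subset_construction(delta, accept)
print(overlapping_rule_sets(dfa_rules))  # set()
```

Any non-singleton set that comes back names a group of rules that all
match some common input, which is also where a lexer would need a
priority rule to pick a winner.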

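If you do Hopcroft minimization, then your initial partition puts the
states for each distinct set of accepted rules in their own block (with
the non-accepting states in one more block), instead of just splitting
accepting from non-accepting states.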
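
To make the starting partition concrete, here is a small Python sketch.
The names initial_partition and refine are invented for this example.
Note that refine is a naive Moore-style refinement loop standing in for
Hopcroft's worklist algorithm, which is faster but starts from the same
initial partition. The DFA is the four-state one from the question.

```python
from collections import defaultdict

def initial_partition(states, rules_of):
    """Group DFA states by the exact set of rules they accept.

    rules_of maps an accepting state to a frozenset of rule indices;
    states missing from it are non-accepting and share the empty set.
    """
    blocks = defaultdict(set)
    for state in states:
        blocks[rules_of.get(state, frozenset())].add(state)
    return list(blocks.values())

def refine(partition, delta, alphabet):
    """Naive Moore-style refinement (NOT Hopcroft's worklist algorithm).

    Splits blocks until states in the same block agree, for every symbol,
    on which block they move to (None stands for a missing transition).
    """
    while True:
        block_of = {s: i for i, block in enumerate(partition) for s in block}
        groups = defaultdict(set)
        for state in block_of:
            signature = (block_of[state],) + tuple(
                block_of.get(delta.get((state, a))) for a in alphabet)
            groups[signature].add(state)
        if len(groups) == len(partition):
            return list(groups.values())
        partition = list(groups.values())

# DFA from the question: 0 -a-> 1, 1 -b-> 2 (accepts r0), 1 -c-> 3 (accepts r1)
delta = {(0, "a"): 1, (1, "b"): 2, (1, "c"): 3}
rules_of = {2: frozenset([0]), 3: frozenset([1])}
blocks = refine(initial_partition(range(4), rules_of), delta, "abc")
print(len(blocks))  # 4
```

If states 2 and 3 were both tagged simply as "accepting", they would
start in the same block and be merged, and the minimized DFA could no
longer tell you which rule matched.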

I have an open source project that does this, if it helps: https://github.com/mtimmerm/dfalex