Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.compilers > #2414
| Path | csiph.com!tncsrv06.tnetconsulting.net!news.snarked.org!news.linkpendium.com!news.linkpendium.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end |
|---|---|
| From | rockbrentwood@gmail.com |
| Newsgroups | comp.compilers |
| Subject | Re: How make multifinished DFA for merged regexps? |
| Date | Sun, 29 Dec 2019 20:56:21 -0800 (PST) |
| Organization | Compilers Central |
| Lines | 77 |
| Sender | news@iecc.com |
| Approved | comp.compilers@iecc.com |
| Message-ID | <19-12-033@comp.compilers> (permalink) |
| References | <19-12-005@comp.compilers> |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset="UTF-8" |
| Content-Transfer-Encoding | 8bit |
| Injection-Info | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="7825"; mail-complaints-to="abuse@iecc.com" |
| Keywords | lex, DFA |
| Posted-Date | 30 Dec 2019 11:30:03 EST |
| X-submission-address | compilers@iecc.com |
| X-moderator-address | compilers-request@iecc.com |
| X-FAQ-and-archives | http://compilers.iecc.com |
| In-Reply-To | <19-12-005@comp.compilers> |
| Xref | csiph.com comp.compilers:2414 |
Show key headers only | View raw
On Thursday, December 19, 2019 at 10:01:24 PM UTC-6, Andy wrote: > I can create DFA direct from regexp. > But for language lexer I must have DFA for couple regexp. > One solution is crating DFA with multi finished states. > For example > r0 = ab > r1 = ac > > | 0 | 1 > a | 1 | > b | | 2(F) > c | | 3(F) > > How to check if r0 and r1 are disjoint? In a day or so, I'll put back up on GitHub our regex utilities that had once been up on the comp.compilers archive. The "DFA" program can process boolean operations - including intersection and relative compliment. (The "NFA" program produces efficient linear-sized near-deterministic FA, "REX" is like GREP except it can process boolean combinations too, and "FSC" is a "finite state classifier"). I'll post a link when it's up on GitHub. Denote intersection by &. The regular expressions ab, ac are the least fixed-point solutions to the following systems [0] = ab >= a[1] [1] = b >= b[2] [2] = 1 [3] = ac >= a[4] [4] = c >= c[5] [5] = 1 The system for the intersection is obtained by distributivity: [0]&[3] >= a([1]&[4]) [1]&[4] >= (b[2] | c0) & (b0 | c[2]) = b([2]&0) | c(0&[2]) = b0 | c0 = 0 The least fixed-point solution is [1]&[4] = 0, [0]&[3] = a0 = 0. So, there's 0 intersection. A more interesting example with relative compliment "-": (a|b)* - (a|b)*(aa|bb)(a|b)* The first term (a|b)* has the following right-linear system [0] = (a|b)* = 1 | (a|b)(a|b)* = 1 | a(a|b)* | b(a|b)* >= 1 | a[0] | b[0] The second term (a|b)*(aa|bb)(a|b)* has the following: [1] = (a|b)*(aa|bb)(a|b)* = (aa|bb)(a|b)* | a[1] | b[1] = a(a(a|b)* | [1]) | b(b(a|b)* | [1]) >= a[2] | b[3] [2] = a(a|b)* | [1] = a(a|b)* | 1 | a[1] | b[1] = 1 | a((a|b)* | [1]) | b[1] >= 1 | a[4] | b[1] [3] = b(a|b)* | [1] = b(a|b)* | 1 | a[1] | b[1] = 1 | a[1] | b((a|b)* | [1]) >= 1 | a[1] | b[4] [4] = (a|b)* | [1] = 1 | (a|b)(a|b)* | 1 | a[1] | b[1] = 1 | 1 | a((a|b)* | [1]) | b((a|b)* | [1]) >= 1 | a[4] | b[4] The system for the relative compliment can be found by distributivity: [0]-[1] >= (1|a[0]|b[0]) - (0|a[2]|b[3]) = 1-0 | a([0]-[2]) | b([0]-[3]) = 1 | a([0]-[2]) | b([0]-[3]) [0]-[2] >= (1-1) | a([0]-[4]) | b([0]-[1]) = a([0]-[4]) | b([0]-[1]) [0]-[3] >= (1-1) | a([0]-[1]) | b([0]-[4]) = a([0]-[1]) | b([0]-[4]) [0]-[4] >= (1-1) | a([0]-[4]) | b([0]-[4]) = a([0]-[4]) | b([0]-[4]) The least fixed-point solution x >= ax | bx is x = 0, so for the last item, it is [0]-[4] = 0. For the remaining items, the system therefore reduces to [0]-[1] >= 1 | a([0]-[2]) | b([0]-[3]) [0]-[2] >= a0 | b([0]-[1]) = b([0]-[1]) [0]-[3] >= a([0]-[1]) | b0 = a([0]-[1]) which upon substitution reduces to a single inequality [0]-[1] >= 1 | ab([0]-[1]) | ba([0]-[1]) = 1 | (ab|ba)([0]-[1]). The least fixed point solution to x >= 1 | cx is x = c*. Thus, for [0]-[1], it is [0]-[1] = (ab|ba)* Therefore, the relative compliment evaluates to: (a|b)* - (a|b)*(aa|bb)(a|b)* = (ab|ba)*.
Back to comp.compilers | Previous | Next — Previous in thread | Find similar
How make multifinished DFA for merged regexps? Andy <borucki.andrzej@gmail.com> - 2019-12-19 18:19 -0800
Re: How make multifinished DFA for merged regexps? Kaz Kylheku <493-878-3164@kylheku.com> - 2019-12-20 05:54 +0000
Re: How make multifinished DFA for merged regexps? Andy <borucki.andrzej@gmail.com> - 2019-12-20 16:29 -0800
Re: How make multifinished DFA for merged regexps? Kaz Kylheku <493-878-3164@kylheku.com> - 2019-12-21 04:04 +0000
Re: How make multifinished DFA for merged regexps? Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-12-24 02:15 +0100
Re: How make multifinished DFA for merged regexps? Matt Timmermans <matt.timmermans@gmail.com> - 2019-12-23 22:29 -0800
Re: How make multifinished DFA for merged regexps? rockbrentwood@gmail.com - 2019-12-29 20:56 -0800
csiph-web