From: Andy
Newsgroups: comp.compilers
Subject: Re: How make multifinished DFA for merged regexps?
Date: Fri, 20 Dec 2019 16:29:01 -0800 (PST)
Organization: Compilers Central
Message-ID: <19-12-010@comp.compilers>
References: <19-12-005@comp.compilers>
Keywords: lex, DFA

Greedy algorithms match the longest regexp. Examples: the operators "+" and "++", or the integer "123" and the float "123.456e3". After "123" the DFA is in an accepting (finished) state for an integer. On '.' it leaves that accepting state, but it is still inside the automaton for a float.

But the input can be wrong: after '.' there may be an 'a'. Must we then backtrack to the last accepting state? I want to avoid backtracking. Maybe after backtracking we would have to re-read the characters from an auxiliary token buffer instead of from the stream, back to the previous position? But this complicates parsing.
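
To make the question concrete, here is a minimal sketch of the auxiliary-buffer idea, not a definitive scanner. Everything in it is an assumption for illustration: the state names, the `TRANSITIONS` table, the token kinds, and the choice to report an unmatched character as a one-character `ERROR` token. It remembers the last accepting state, and when the DFA gets stuck it pushes the overrun characters into a `pending` queue that is read before the stream.

```python
from collections import deque

# Hypothetical merged DFA for: PLUS "+", INC "++", INT [0-9]+,
# FLOAT [0-9]+ "." [0-9]+ ("e" [0-9]+)?
TRANSITIONS = {
    ("start", "+"): "plus",
    ("plus", "+"): "inc",
    ("start", "d"): "int",
    ("int", "d"): "int",
    ("int", "."): "dot",        # leaves the INT accepting state
    ("dot", "d"): "frac",
    ("frac", "d"): "frac",
    ("frac", "e"): "exp_mark",  # leaves the FLOAT accepting state
    ("exp_mark", "d"): "exp",
    ("exp", "d"): "exp",
}
ACCEPTING = {"plus": "PLUS", "inc": "INC", "int": "INT",
             "frac": "FLOAT", "exp": "FLOAT"}


def char_class(c):
    """Map every ASCII digit to the class "d"; other characters stand for themselves."""
    return "d" if c in "0123456789" else c


def tokenize(stream):
    """Yield (kind, text) pairs using longest match with a pushback buffer."""
    source = iter(stream)
    pending = deque()  # characters read past the last accepting state

    def next_char():
        # Pushed-back characters are re-read before touching the stream.
        if pending:
            return pending.popleft()
        return next(source, None)

    while True:
        state, text = "start", []
        last_kind, last_len = None, 0
        while True:
            c = next_char()
            if c is None:
                break
            nxt = TRANSITIONS.get((state, char_class(c)))
            if nxt is None:
                pending.appendleft(c)  # the stuck character is not consumed
                break
            state = nxt
            text.append(c)
            if state in ACCEPTING:
                # Remember the most recent accepting position.
                last_kind, last_len = ACCEPTING[state], len(text)
        # Push back whatever was read after the last accepting position,
        # keeping the original order in front of the stuck character.
        pending.extendleft(reversed(text[last_len:]))
        if last_kind is not None:
            yield last_kind, "".join(text[:last_len])
        elif pending:
            yield "ERROR", pending.popleft()  # no token starts here
        else:
            return  # end of input
```

With this table, `list(tokenize("123.a"))` gives `[("INT", "123"), ("ERROR", "."), ("ERROR", "a")]`: the '.' is read once from the stream and once more from `pending`. In this sketch the buffer only ever holds the characters read since the last accepting state, so the stream itself is never rewound.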