Re: How make multifinished DFA for merged regexps?

From Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Newsgroups comp.compilers
Subject Re: How make multifinished DFA for merged regexps?
Date Tue, 24 Dec 2019 02:15:40 +0100
Organization Compilers Central
Message-ID <19-12-026@comp.compilers>
References <19-12-005@comp.compilers> <19-12-010@comp.compilers>


On 21.12.2019 at 01:29, Andy wrote:
> Greedy algorithms match the longest regexp, for example the operators "+" and
> "++", or the int number "123" and the float number "123.456e3".
> On '.' we leave the accepting state for the int number, but we are now inside
> the automaton for the float number. Errors are possible: after '.' comes 'a'.
> Must we backtrack to the last accepting state?

Why should "123." not form a valid float number? In C, "123." is a valid
floating constant, and the trailing dot is the usual way to force what
would otherwise be an int constant into a float.
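
To illustrate the point, here is a minimal Python sketch of a number scanner
whose token definition (my assumption, not Andy's actual regexps) is
digits, optionally followed by '.' and zero or more digits. The name
scan_number is hypothetical. With this definition every state reached after
the first digit is accepting, so the scanner never has to back up:

```python
DIGITS = "0123456789"

def scan_number(text, pos):
    """Scan digits+ ('.' digits*)? starting at pos.

    Returns (kind, lexeme, end_pos). Exponents are left out to keep
    the sketch short.
    """
    i = pos
    while i < len(text) and text[i] in DIGITS:
        i += 1
    if i == pos:
        raise ValueError("no digit at position %d" % pos)
    kind = "int"
    if i < len(text) and text[i] == ".":
        # The dot alone already completes a float, as with "123." in C.
        kind = "float"
        i += 1
        while i < len(text) and text[i] in DIGITS:
            i += 1
    return kind, text[pos:i], i
```

For the input "123.a" this yields the float "123." and leaves 'a' for the
next token, so the "error after the dot" case from the question does not
arise under this definition.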

If your lexer requires backtracking, e.g. because it is LR(n), then
backtracking to the last accepting state is the only solution. Unlike
parsers, which may work based on shift/reduce actions, a scanner should be
kept simpler.
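
If the token definitions do force backing up, the usual technique is to
remember the last accepting state and its input position, and to return to
it when the automaton gets stuck. The following Python sketch assumes a
hand-written merged DFA in which a float needs at least one digit after the
dot; the table, the state names and next_token are all illustrative, not
taken from the thread:

```python
DIGITS = "0123456789"

# Hypothetical merged DFA for "+", "++", int and float (digits '.' digits).
TRANS = {
    ("start", "d"): "int",
    ("int", "d"): "int",
    ("int", "."): "dot",        # not accepting: "123." alone is no token here
    ("dot", "d"): "frac",
    ("frac", "d"): "frac",
    ("start", "+"): "plus",
    ("plus", "+"): "plusplus",
}
# Several accepting states, each tagged with the token it finishes.
ACCEPT = {"int": "INT", "frac": "FLOAT", "plus": "PLUS", "plusplus": "INCR"}

def char_class(c):
    return "d" if c in DIGITS else c

def next_token(text, pos):
    """Longest match: run the DFA as far as possible, then fall back
    to the most recent accepting position."""
    state = "start"
    last_kind, last_end = None, pos
    i = pos
    while i < len(text):
        state = TRANS.get((state, char_class(text[i])))
        if state is None:       # stuck: stop and use what was remembered
            break
        i += 1
        if state in ACCEPT:
            last_kind, last_end = ACCEPT[state], i
    if last_kind is None:
        raise ValueError("no token at position %d" % pos)
    return last_kind, text[pos:last_end], last_end
```

On "123.a" this returns the INT "123" and resumes at the dot, which is the
backtracking step asked about. With an in-memory string the fallback is just
an index reset, so no separate token buffer is needed.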

> I want to avoid backtracking. Maybe after backtracking we must read chars
> from an auxiliary token buffer instead of the stream, up to the previous
> position? But this complicates parsing.

Parsers require a lookahead of at least one token. Scanners should likewise
implement a lookahead of at least one character, and possibly more,
depending on the complexity or weirdness of the language definition.
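
A one-character lookahead over a stream can be sketched as a tiny wrapper
that holds at most one pending character. PeekReader and scan_plus are
hypothetical names, and the only tokens handled are "+" and "++" from the
question; this is an illustration under those assumptions, not a complete
scanner:

```python
class PeekReader:
    """Wraps any iterable of characters and adds one character of lookahead."""

    def __init__(self, chars):
        self._it = iter(chars)
        self._buf = []          # holds at most one pending character

    def peek(self):
        # Returns the next character without consuming it, "" at end of input.
        if not self._buf:
            self._buf.append(next(self._it, ""))
        return self._buf[0]

    def read(self):
        # Consumes and returns the next character, "" at end of input.
        c = self.peek()
        self._buf.clear()
        return c

def scan_plus(reader):
    """Scan "+" or "++" using a single character of lookahead."""
    if reader.read() != "+":
        raise ValueError("expected '+'")
    if reader.peek() == "+":
        reader.read()
        return "++"
    return "+"
```

Because the decision between "+" and "++" only needs the next character, the
stream is never rewound; peek() looks, and read() consumes once the choice is
made.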

DoDi
