Re: State-of-the-art algorithms for lexical analysis?

From gah4 <gah4@u.washington.edu>
Newsgroups comp.compilers
Subject Re: State-of-the-art algorithms for lexical analysis?
Date 2022-06-05 16:05 -0700
Organization Compilers Central
Message-ID <22-06-007@comp.compilers> (permalink)
References <22-06-006@comp.compilers>


On Sunday, June 5, 2022 at 2:08:12 PM UTC-7, Roger L Costello wrote:

(snip)

> Are regular expressions still the best way to specify tokens?

Some years ago, I used to work with a company that sold hardware
search processors to a certain three letter agency that we are not
supposed to mention, but everyone knows.

Those search processors have a completely different PSL (Pattern
Specification Language), much more powerful than the usual regular
expressions.

Both the standard and extended regular expressions are nice, in that
we are used to using them, especially with grep, without thinking too
much about them.

I suspect, though, that if they hadn't previously been defined, we
might come up with something different today.

Among other things, PSL has the ability to define approximate matches,
such as a word with one or more misspellings, that is, insertions,
deletions, or substitutions. The usual REs don't have that ability.
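
To make "approximate match" concrete, here is a minimal Python sketch
that accepts a word if it is within a given number of insertions,
deletions, or substitutions of a pattern. This is not PSL syntax and
not how the hardware did it; the names edit_distance, approx_match,
and max_errors are made up for illustration, and the method is the
ordinary Levenshtein dynamic program.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn string a into string b (Levenshtein distance)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = cur
    return prev[-1]


def approx_match(pattern: str, word: str, max_errors: int = 1) -> bool:
    """True if word is within max_errors edits of pattern (hypothetical helper)."""
    return edit_distance(pattern, word) <= max_errors


if __name__ == "__main__":
    for candidate in ["lexer", "lexr", "lexar", "lexxer", "parser"]:
        print(candidate, approx_match("lexer", candidate))
```

A plain RE can only express this by listing every misspelling as an
alternative, which grows quickly with the word length and the number
of allowed errors.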

There are also PSL expressions for ranges of numbers.
You can often do that with a very complicated RE that enumerates
all of the possibilities.  PSL processes those possibilities
automatically.  (Some can expand to complicated code.)
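
As a rough illustration of what "considering all of the possibilities"
means for a number range, the following Python sketch expands an
integer range into an equivalent RE. It says nothing about how PSL
does this internally; the function names are hypothetical, and the
approach is the common trick of cutting the range at values ending in
9s so that each piece can be written digit by digit. It assumes
non-negative integers written without leading zeros.

```python
import re


def _split_points(lo: int, hi: int) -> list[int]:
    """Upper ends of sub-ranges that can each be written digit by digit."""
    stops = {hi}
    # Climbing from lo: replace its last k digits with 9s (e.g. 37 -> 39, 99, 999).
    k = 1
    stop = int(str(lo)[:-k] + "9" * k)
    while lo <= stop <= hi:
        stops.add(stop)
        k += 1
        stop = int(str(lo)[:-k] + "9" * k)
    # Descending from hi: round hi + 1 down to a multiple of 10**k, minus one.
    k = 1
    stop = (hi + 1) - (hi + 1) % 10**k - 1
    while lo < stop <= hi:
        stops.add(stop)
        k += 1
        stop = (hi + 1) - (hi + 1) % 10**k - 1
    return sorted(stops)


def _piece(start: int, stop: int) -> str:
    """Pattern for one sub-range whose bounds have the same number of digits."""
    if start == stop:
        return str(start)
    return "".join(s if s == e else f"[{s}-{e}]"
                   for s, e in zip(str(start), str(stop)))


def range_to_regex(lo: int, hi: int) -> str:
    """RE source matching the decimal integers lo..hi inclusive (0 <= lo <= hi)."""
    pieces, start = [], lo
    for stop in _split_points(lo, hi):
        pieces.append(_piece(start, stop))
        start = stop + 1
    return "(?:" + "|".join(pieces) + ")"


if __name__ == "__main__":
    pattern = range_to_regex(0, 255)
    print(pattern)
    print(bool(re.fullmatch(pattern, "249")), bool(re.fullmatch(pattern, "256")))
```

Even a small range such as 0 to 255 needs five alternatives, which is
the kind of expansion one would rather have the tool generate.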

I suspect that in many cases the usual RE is not optimal for
lexical analysis, other than being well known.

But as noted, DFAs are likely the best way to implement them.

Though that could change with changes in computer hardware.
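
For readers who have not seen one, a DFA-based lexer can be sketched
in a few lines of Python. This is only a toy under stated assumptions:
the token kinds (IDENT, NUMBER), the state names, and the char_class
helper are invented for the example, and a generated lexer would use
dense tables rather than a dictionary.

```python
def char_class(c: str) -> str:
    """Map a character to the input class used by the transition table."""
    if c.isalpha() or c == "_":
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"


# Transition table of the DFA: (state, input class) -> next state.
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("start", "digit"): "number",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("number", "digit"): "number",
}

# Accepting states and the token kind each one reports.
ACCEPTING = {"ident": "IDENT", "number": "NUMBER"}


def tokenize(text: str) -> list[tuple[str, str]]:
    """Split text into (kind, lexeme) pairs using longest-match scanning."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state, i, last_accept = "start", pos, None
        # Run the DFA as far as it will go, remembering the last accepting point.
        while i < len(text):
            nxt = TRANSITIONS.get((state, char_class(text[i])))
            if nxt is None:
                break
            state, i = nxt, i + 1
            if state in ACCEPTING:
                last_accept = (ACCEPTING[state], i)
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, end = last_accept
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens


if __name__ == "__main__":
    print(tokenize("x1 42 foo_bar"))
```

Each input character costs one table lookup, regardless of how
complicated the token patterns were, which is why the DFA is hard to
beat on conventional hardware.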


