Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #2959

Re: Parallel lexers by chunking the input

From anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups comp.compilers
Subject Re: Parallel lexers by chunking the input
Date 2022-03-25 08:04 +0000
Organization Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID <22-03-065@comp.compilers> (permalink)
References <22-03-058@comp.compilers> <22-03-064@comp.compilers>

Show all headers | View raw


Christopher F Clark <christopher.f.clark@compiler-resources.com> writes:
>But, the real issue is strings, comments, and markdown like features.
>They all suffer from the same issue, which is that they are generally
>unbounded strings that start and stop with very short sequences.  And,
>by picking some arbitrary point in the future to start scanning, you
>cannot be sure whether you are inside or outside one of those
>features.

Yes.  But you can represent this uncertainty in the state of the DFA.
Essentially you can take the original DFA, and make all states of that
(nondeterministic) start states of our continuation DFA.  Process the
pieces of the input in parallel with the continuation automaton.

Now you know the end state of the original automaton the
first piece, and the end state of the continuation automaton of the
second piece; you can look up the end state of the original automaton
for the second piece from these informations.  And so on for the
further pieces.  This is a sequential component, but it is fast.

Now you can reprocess the second-to-last pieces in parallel using the
original automaton and performing the actions (e.g., converting
numbers into binary representation).

Not sure how to interface that to the parser, but if we have a similar
idea for parallelizing the parser, even the lex/flex interface might
be ok.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
[Experiments would be useful here.  Yes, you can do this, but do you
end up throwing away so much work that it's not worth it? -John]

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

How can the speed of a scanner be independent of the number of rules? Roger L Costello <costello@mitre.org> - 2022-03-23 18:49 +0000
  Re: How can the speed of a scanner be independent of the number of rules? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-03-23 21:27 +0000
    Re: How can the speed of a scanner be independent of the number of rules? gah4 <gah4@u.washington.edu> - 2022-03-23 19:57 -0700
    Re: How can the speed of a scanner be independent of the number of rules? Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-03-24 13:12 +0200
    Re: How can the speed of a scanner be independent of the number of rules? Roger L Costello <costello@mitre.org> - 2022-03-24 11:53 +0000
      Re: How can the speed of a scanner be independent of the number of rules? Jan Ziak <0xe2.0x9a.0x9b@gmail.com> - 2022-03-24 11:56 -0700
        Re: How can the speed of a scanner be independent of the number of rules? gah4 <gah4@u.washington.edu> - 2022-03-24 13:08 -0700
  Re: How can the speed of a scanner be independent of the number of rules? Jan Ziak <0xe2.0x9a.0x9b@gmail.com> - 2022-03-24 02:01 -0700
    Re: Parallel scanning, was How can the speed of a scanner be independent of the number of rules? Jan Ziak <0xe2.0x9a.0x9b@gmail.com> - 2022-03-24 11:18 -0700
      Parallel lexers by chunking the input Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-03-24 22:32 +0200
        Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2022-03-25 08:04 +0000
          Re: Parallel lexers by chunking the input gah4 <gah4@u.washington.edu> - 2022-03-25 07:58 -0700
          Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2022-03-25 17:47 +0000
            Parallel lexers by chunking the input. Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-03-26 00:36 +0200
            Parallel lexers by chunking the input Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-03-26 01:00 +0200
            Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2022-03-26 09:27 +0000

csiph-web