Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #729

Re: lexer speed, was Bison

From Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups comp.compilers
Subject Re: lexer speed, was Bison
Date 2012-08-21 07:40 +0100
Organization Compilers Central
Message-ID <12-08-012@comp.compilers> (permalink)
References <12-08-005@comp.compilers> <12-08-006@comp.compilers> <12-08-008@comp.compilers> <12-08-011@comp.compilers>

Show all headers | View raw


BGB schrieb:

> 1: actually, my C parser uses a pretty big hash for type-name lookup,
> namely 16k entry IIRC, whereas for most other things 1024 or 4096 entry
> is plenty sufficient (for a chain-hash).

The global (top level) symbol table can become very big, in detail in C
compilers. A separate table for type names is not required.

> the main reason for this is the
> large number of typedefs in headers like "windows.h", which can easily
> saturate a 1024 or 4096 entry hash.

Right. AFAIR I added capabilities to skip already processed header
files, at least when a guard is detected. Such (not fully conforming)
behaviour can influence the time spent with loading the same files
moreoften.

> I have used bigger hash tables though, namely for things like interning
> strings (64k) and dynamic+interface method-dispatch (256k, but this one
> is an open-address hash).
>
>
>> DoDi
>> [The benchmarks I did were a while ago, but they showed a large
>> fraction of time in the lexer.  I wouldn't disagree that building the
>> symbol table is slow, but figure out some estimate of the ratio of
>> the number of characters in a source file to the number of tokens,
>> and that is a rough estimate of how much slower the lexer will be
>> than the parser. I agree that some current analyses would be useful.
>> -John]
>>
>
> yep.
>
> can't compare exactly, as my parsers tend to be recursive-descent and
> build ASTs directly.

The same for my compilers. My C to Pascal converter uses an LL(1)
parser, with only one exception where LL(2) lookahead is required.
Expressions are parsed differently, using an table-driven parser for
operator precedence and associativity. The preprocessor is implemented
in a couple of filter stages between the lexer and parser.


P.S.: When the time for loading source files can be separated from other
lexer tasks, I'd expect that this operation takes most of the time.

DoDi

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support hsad005@gmail.com - 2012-08-17 11:22 -0700
  Re: Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-08-18 10:13 +0100
    Re: lexer speed, was Bison Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-08-20 01:01 +0100
      Re: lexer speed, was Bison Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-08-20 16:14 +0100
      Re: lexer speed, was Bison BGB <cr88192@hotmail.com> - 2012-08-20 14:14 -0500
        Re: lexer speed, was Bison Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-08-21 07:40 +0100
      Re: lexer speed, was Bison "BartC" <bc@freeuk.com> - 2012-08-21 17:39 +0100
    Re: Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-08-20 13:35 +0000
      Re: Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support BGB <cr88192@hotmail.com> - 2012-08-21 14:45 -0500
        Re: Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support "BartC" <bc@freeuk.com> - 2012-08-22 14:04 +0100
          Re: Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support BGB <cr88192@hotmail.com> - 2012-08-26 19:37 -0500
            Bison deterministic LALR parser for Java/C++ "BartC" <bc@freeuk.com> - 2012-08-29 22:03 +0100
              speeding up C recompilation, was Re: Bison deterministic LALR BGB <cr88192@hotmail.com> - 2012-09-04 13:45 -0500
              Re: C include handling, was Bison deterministic LALR Marco van de Voort <marcov@toad.stack.nl> - 2012-09-05 08:40 +0000
  Re: Bison determinis​tic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support hsad005@gmail.com - 2012-08-18 02:09 -0700

csiph-web