Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #3456

Re: Deep expression chain performance

From Kaz Kylheku <864-117-4973@kylheku.com>
Newsgroups comp.compilers
Subject Re: Deep expression chain performance
Date 2023-04-27 00:28 +0000
Organization A noiseless patient Spider
Message-ID <23-04-015@comp.compilers> (permalink)
References <23-04-012@comp.compilers>

Show all headers | View raw


On 2023-04-25, Andy <borucki.andrzej@gmail.com> wrote:
> I am writing C grammar.
> Grammar speed may be down caused by deep expression chain.
> For example, simple "n=0" has 20 levels:
> assignmentExpression
> conditionalExpression
> logicalOrExpression
> logicalAndExpression
> inclusiveOrExpression
> exclusiveOrExpression
> andExpression
> eqaulityExpression
> relationalExpression
> shiftExpression
> additiveExpression
> multiplicativeExpression
> castExpression
> unaryExpression
> postfixExpression
> postfixExpressionLeft
> atom
> literal
> IntegerConstant
> DeciamlConstant

ISO C specifies its grammar in an obtusely verbose way, similar to the
above. The reason is that it makes the grammar unambiguous, meaning
that it does not require any annotation or external remarks about
ambiguities having to be resolved by a certain associativity or
precedence.

If you have some grammar processor which literally follows all those
productions, doing all of their reductions, then it will likely
not perform well.

I suspect even a LALR(1) compiled grammar will have an issue
there; I'm not aware that Yacc-like parser generators do any
collapsing of cascaded reductions which do nothing but { $$ = $1; }.

Here is a concrete example. Attached below is over 200 lines of
y.output from a Yacc implementation (Bison). In that y.output
we can trace the actions of the state machine and see what happens
when a the simple token LIT is reduced to an expression.

In State 0, when the parser sees a LIT, it will shift the token
and go to State 1. There, a reduction takes place via Rule 9,
and so we have a parser_expression on the top of the stack.

Then, back to State 0, where now have a primary_expression,
and so we go to state 8.

State 8, similarly to State 1, performs a reduction:
primary expression to mult_expression.

Back to State 0, where we have to dispatch a similar
thing again for multi_expression, then additive_expression,
then expression and finally top.

Thanks for raising this issue; I suspect it's a real performance
problem. Not only is a factored-out grammar hard to read but the
performance is bad due to useless back and forth transitions between
superfluous states to propagate reductions.

Above-referenced material follows:

Grammar

    0 $accept: top $end

    1 top: expression ';'
    2    | expression ';' expression

    3 expression: additive_expression

    4 additive_expression: mult_expression
    5                    | mult_expression '+' additive_expression

    6 mult_expression: primary_expression
    7                | mult_expression '*' primary_expression

    8 primary_expression: '(' expression ')'
    9                   | LIT
   10                   | IDENT


Terminals, with rules where they appear

$end (0) 0
'(' (40) 8
')' (41) 8
'*' (42) 7
'+' (43) 5
';' (59) 1 2
error (256)
LIT (258) 9
IDENT (259) 10


Nonterminals, with rules where they appear

$accept (10)
    on left: 0
top (11)
    on left: 1 2, on right: 0
expression (12)
    on left: 3, on right: 1 2 8
additive_expression (13)
    on left: 4 5, on right: 3 5
mult_expression (14)
    on left: 6 7, on right: 4 5 7
primary_expression (15)
    on left: 8 9 10, on right: 6 7


state 0

    0 $accept: . top $end

    LIT    shift, and go to state 1
    IDENT  shift, and go to state 2
    '('    shift, and go to state 3

    top                  go to state 4
    expression           go to state 5
    additive_expression  go to state 6
    mult_expression      go to state 7
    primary_expression   go to state 8


state 1

    9 primary_expression: LIT .

    $default  reduce using rule 9 (primary_expression)


state 2

   10 primary_expression: IDENT .

    $default  reduce using rule 10 (primary_expression)


state 3

    8 primary_expression: '(' . expression ')'

    LIT    shift, and go to state 1
    IDENT  shift, and go to state 2
    '('    shift, and go to state 3

    expression           go to state 9
    additive_expression  go to state 6
    mult_expression      go to state 7
    primary_expression   go to state 8


state 4

    0 $accept: top . $end

    $end  shift, and go to state 10


state 5

    1 top: expression . ';'
    2    | expression . ';' expression

    ';'  shift, and go to state 11


state 6

    3 expression: additive_expression .

    $default  reduce using rule 3 (expression)


state 7

    4 additive_expression: mult_expression .
    5                    | mult_expression . '+' additive_expression
    7 mult_expression: mult_expression . '*' primary_expression

    '+'  shift, and go to state 12
    '*'  shift, and go to state 13

    $default  reduce using rule 4 (additive_expression)


state 8

    6 mult_expression: primary_expression .

    $default  reduce using rule 6 (mult_expression)


state 9

    8 primary_expression: '(' expression . ')'

    ')'  shift, and go to state 14


state 10

    0 $accept: top $end .

    $default  accept


state 11

    1 top: expression ';' .
    2    | expression ';' . expression

    LIT    shift, and go to state 1
    IDENT  shift, and go to state 2
    '('    shift, and go to state 3

    $default  reduce using rule 1 (top)

    expression           go to state 15
    additive_expression  go to state 6
    mult_expression      go to state 7
    primary_expression   go to state 8


state 12

    5 additive_expression: mult_expression '+' . additive_expression

    LIT    shift, and go to state 1
    IDENT  shift, and go to state 2
    '('    shift, and go to state 3

    additive_expression  go to state 16
    mult_expression      go to state 7
    primary_expression   go to state 8


state 13

    7 mult_expression: mult_expression '*' . primary_expression

    LIT    shift, and go to state 1
    IDENT  shift, and go to state 2
    '('    shift, and go to state 3

    primary_expression  go to state 17


state 14

    8 primary_expression: '(' expression ')' .

    $default  reduce using rule 8 (primary_expression)


state 15

    2 top: expression ';' expression .

    $default  reduce using rule 2 (top)


state 16

    5 additive_expression: mult_expression '+' additive_expression .

    $default  reduce using rule 5 (additive_expression)


state 17

    7 mult_expression: mult_expression '*' primary_expression .

    $default  reduce using rule 7 (mult_expression)


--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Deep expression chain performance Andy <borucki.andrzej@gmail.com> - 2023-04-25 06:51 -0700
  Re: Deep expression chain performance anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2023-04-26 16:33 +0000
  Re: Deep expression chain performance gah4 <gah4@u.washington.edu> - 2023-04-26 13:06 -0700
  Re: Deep expression chain performance Kaz Kylheku <864-117-4973@kylheku.com> - 2023-04-27 00:28 +0000
    Re: Deep expression chain performance anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2023-04-27 06:17 +0000

csiph-web