Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Andy Newsgroups: comp.compilers Subject: Deep expression chain performance Date: Tue, 25 Apr 2023 06:51:55 -0700 (PDT) Organization: Compilers Central Sender: johnl@iecc.com Approved: comp.compilers@iecc.com Message-ID: <23-04-012@comp.compilers> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="39482"; mail-complaints-to="abuse@iecc.com" Keywords: C, parse, question, comment Posted-Date: 25 Apr 2023 12:21:40 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:3453 I am writing C grammar. Grammar speed may be down caused by deep expression chain. For example, simple "n=0" has 20 levels: assignmentExpression conditionalExpression logicalOrExpression logicalAndExpression inclusiveOrExpression exclusiveOrExpression andExpression eqaulityExpression relationalExpression shiftExpression additiveExpression multiplicativeExpression castExpression unaryExpression postfixExpression postfixExpressionLeft atom literal IntegerConstant DeciamlConstant In other hand, Rust grammar for Antlr4 https://github.com/antlr/grammars-v4/blob/master/rust/RustParser.g4 has precedence definition: expression : outerAttribute+ expression # AttributedExpression // technical, remove left recursive | literalExpression # LiteralExpression_ | pathExpression # PathExpression_ | expression '.' pathExprSegment '(' callParams? ')' # MethodCallExpression // 8.2.10 | expression '.' identifier # FieldExpression // 8.2.11 | expression '.' tupleIndex # TupleIndexingExpression // 8.2.7 | expression '.' 'await' # AwaitExpression // 8.2.18 .............................. If I do similar in C grammar: expr : atom # atom_ | '__extension__'? '(' compoundStatement ')' # groupedExpression | expr '(' argumentExpressionList? ')' # postfixExpression | expr '.' Identifier # postfixExpression | expr '->' Identifier # postfixExpression | expr '[' expr ']' # postfixExpression parsing time increased by orders of magnitude: short .c files was processed longer a long file I didn't wait for it to finish (probably increasing nonlinear) My ask: form expr: expr operator expr is always ambiguous and waste time or I something bad have written? How is possible to reduce depth of chain? Parsing time with Antlr is much longer than parsing + semantic actions with GCC. Probably big part of this time is long depth of expressions. [If your parser is taking a lot of time, something is wrong. In the compilers I know, the pareer is always a tiny part of the work, which is dominated by the lexer which has to look at each character in the input, and optimizers that have to make multiple passes over the whole intermediate representation. -John]