Re: Parsing C#-like generics

From "Ben L. Titzer" <ben.titzer@gmail.com>
Newsgroups comp.compilers
Subject Re: Parsing C#-like generics
Date 2011-07-13 10:19 -0700
Organization Compilers Central
Message-ID <11-07-023@comp.compilers>
References <11-07-019@comp.compilers>


On Jul 11, 11:22 am, "Harold Aptroot" <harold.aptr...@gmail.com> wrote:
> I'm having some trouble parsing generics when mixed with comparisons. The
> way I try to do it, there is an ambiguity between LessThan and a "list of
> types between angle brackets".
> For example, x<x>(x<x) should be syntactically OK, and it should be parsed
> to a function call x with a type parameter list < x > and a single argument
> which is the expression x<x (ok not really, I threw in semantics here to
> make it clearer, the actual result should just be an AST).
> My parser generator (GOLD parsing system) complains about a shift-reduce
> error, and the parser it produces doesn't want to parse any expression with
> a LessThan in it because it believes that to be an incomplete type list
> (lacking a closing > )
>
> I know it is actually inherently ambiguous, because t<t2>(t3) could mean
> two things:
> - LessThan(t, BiggerThan(t2, t3))
> - invoke t<t2> with argument t3
> In that case I want to pick option two.
> For t<t2>t3 I want to pick option one, not report "missing ( "
>
> Can this be done with an LALR parser at all? If so, how?


One trick I've used in the past is to lex the '<' that introduces a
type parameter list as part of the identifier:

"foo" would lex as a single IDENT token.
and
"foo<" would lex as a single PARAMETERIZED_IDENT token.
and
"foo <" would lex as IDENT followed by LESS_THAN
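
The three cases above can be sketched as a small regex-based lexer. This is only an illustration in Python, not the code I actually used: the names IDENT, PARAMETERIZED_IDENT and LESS_THAN are from this post, while GREATER_THAN, LPAREN, RPAREN, COMMA and the tokenize function are assumed for the example.

```python
import re

# Order matters: PARAMETERIZED_IDENT must be tried before IDENT so that
# "foo<" (no whitespace) becomes one token.
# Token names other than IDENT, PARAMETERIZED_IDENT and LESS_THAN are assumed.
TOKEN_SPEC = [
    ("PARAMETERIZED_IDENT", r"[A-Za-z_]\w*<"),  # identifier glued to '<'
    ("IDENT",               r"[A-Za-z_]\w*"),
    ("LESS_THAN",           r"<"),
    ("GREATER_THAN",        r">"),
    ("LPAREN",              r"\("),
    ("RPAREN",              r"\)"),
    ("COMMA",               r","),
    ("SKIP",                r"\s+"),             # whitespace, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, text) pairs for the input string."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("foo"))    # [('IDENT', 'foo')]
print(tokenize("foo<"))   # [('PARAMETERIZED_IDENT', 'foo<')]
print(tokenize("foo <"))  # [('IDENT', 'foo'), ('LESS_THAN', '<')]
```

The whole decision is made by whether whitespace separates the identifier from the '<', so no lookahead past the '<' is needed.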

You can then use the IDENT and PARAMETERIZED_IDENT tokens in various
places in the grammar, with PARAMETERIZED_IDENT being followed by a
type list and a '>' token.
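
To show how the two token kinds are consumed, here is a rough sketch using a hand-written recursive-descent parser in Python rather than an LALR grammar for GOLD. The grammar, the tuple-shaped AST, the left-associative comparison operators, and every token name other than IDENT, PARAMETERIZED_IDENT and LESS_THAN are assumptions made for the example; tokens are written out by hand as (kind, text) pairs.

```python
# Assumed grammar:
#   expr    := primary ((LESS_THAN | GREATER_THAN) primary)*
#   primary := IDENT
#            | PARAMETERIZED_IDENT type_list GREATER_THAN LPAREN expr_list RPAREN
#   type    := IDENT | PARAMETERIZED_IDENT type_list GREATER_THAN
class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def parse_list(self, parse_item, closer):
        # item (COMMA item)* closer   (at least one item, to keep the sketch short)
        items = [parse_item()]
        while self.peek() == "COMMA":
            self.expect("COMMA")
            items.append(parse_item())
        self.expect(closer)
        return items

    def parse_type(self):
        if self.peek() == "PARAMETERIZED_IDENT":
            name = self.expect("PARAMETERIZED_IDENT")[:-1]  # drop the trailing '<'
            return ("type", name, self.parse_list(self.parse_type, "GREATER_THAN"))
        return ("type", self.expect("IDENT"), [])

    def parse_primary(self):
        if self.peek() == "PARAMETERIZED_IDENT":
            # No ambiguity here: this token can only start a type parameter list.
            name = self.expect("PARAMETERIZED_IDENT")[:-1]
            type_args = self.parse_list(self.parse_type, "GREATER_THAN")
            self.expect("LPAREN")
            args = self.parse_list(self.parse_expr, "RPAREN")
            return ("call", name, type_args, args)
        return ("var", self.expect("IDENT"))

    def parse_expr(self):
        left = self.parse_primary()
        while self.peek() in ("LESS_THAN", "GREATER_THAN"):
            op = self.expect(self.peek())
            left = (op, left, self.parse_primary())
        return left


def parse(tokens):
    parser = Parser(tokens)
    tree = parser.parse_expr()
    if parser.peek() != "EOF":
        raise SyntaxError(f"unexpected {parser.peek()} after expression")
    return tree


# Tokens for "x<x>(x <x)": the first "x<" is glued, the second '<' is spaced.
call_tokens = [
    ("PARAMETERIZED_IDENT", "x<"), ("IDENT", "x"), ("GREATER_THAN", ">"),
    ("LPAREN", "("), ("IDENT", "x"), ("LESS_THAN", "<"), ("IDENT", "x"),
    ("RPAREN", ")"),
]
print(parse(call_tokens))
# ('call', 'x', [('type', 'x', [])], [('<', ('var', 'x'), ('var', 'x'))])
```

Because the parser sees PARAMETERIZED_IDENT before it sees anything else, it never has to choose between "comparison" and "type list" after reading a plain identifier, which is the choice that produced the shift-reduce conflict.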

This then requires any use of the '<' operator that follows an
identifier to have intervening whitespace. It also requires that any
parameterization of an identifier not have intervening whitespace.
For example, your x<x>(x<x) would have to be written x<x>(x <x). I
think it's a decent tradeoff if you are defining the language
yourself, but it won't work for languages with more complex rules for
resolving the ambiguity.
