From: luser droog
Newsgroups: comp.compilers
Subject: Re: Supporting multiple input syntaxes
Date: Thu, 13 Aug 2020 21:36:13 -0700 (PDT)
Message-ID: <20-08-006@comp.compilers>
References: <20-08-002@comp.compilers> <20-08-004@comp.compilers>
Keywords: C, parse

On Thursday, August 13, 2020 at 5:22:51 PM UTC-5, Hans-Peter Diettrich wrote:
> On 13.08.2020 at 00:20, luser droog wrote:
> > I've got my project successfully parsing the circa-1975 C syntax
> > from that old manual. I'd like to add parsers for K&R1 and c90
> > syntaxes.
> >
> > How separate should these be? Should they be complete
> > separate grammars, or more piecewise selection?
>
> IMO this depends widely on the usage of the parser output (diagnostics,
> backend...). C90 is much stricter than K&R, requires more checks. Do you
> need extensive error diagnostics, or do you assume that all source code
> is free of errors?
>
> > https://github.com/luser-dr00g/pcomb/blob/master/pc9syn.c
>
> You seem to implement an LL(1) parser? My C98 Parser is LL(2), i.e. an
> LL(1) parser with one or two locations where more lookahead is required.
> Also identifiers are classified as typenames and others prior to their
> usage.

Yes, it's basically LL(1) with backtracking.
There's one part of the grammar I'm using that's left-recursive and I
still need to work that out.

> For real-world testing (recommended!) a preprocessor is required and a
> copy of the standard libraries of existing compiler(s).
>
> Your test_syntax() source misses "=" from the variable declarations
> (initializers). What about pointer syntax/semantics? If you add these
> (and other) syntax differences conditionally (version specific) to your
> code, which way would look better to you? Which way will be safer to
> maintain?

That's actually correct for the 1975 dialect: no '=' to initialize
variables. I think it's pretty ugly without it, but it could be removed
anyway for the AST.

> Nice code BTW :-)

Thanks! I think I need to sidetrack a bit and work up some primitives
for pattern matching and decomposition to make the backend easier.
I'll report back if/when it can do more tricks.
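
On the left-recursive rule mentioned above: one common way out in an LL-style parser is to rewrite `expr := expr op term | term` as `expr := term (op term)*` and fold the results in a loop, which keeps left associativity. The sketch below is only an illustration of that rewrite. It is hand-written recursive descent, not the pcomb combinator API, and every name in it (`cursor`, `parse_term`, `parse_expr`, `eval_expr`) is made up for the example. The toy grammar (integers with `+` and `-`) is also an assumption, since the post does not say which rule is the left-recursive one.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical input cursor for this sketch; not taken from pcomb. */
typedef struct { const char *p; } cursor;

static void skip_ws(cursor *c) {
    while (isspace((unsigned char)*c->p)) c->p++;
}

/* term := digit+   Returns 1 on success and stores the value in *out. */
static int parse_term(cursor *c, long *out) {
    long v = 0;
    skip_ws(c);
    if (!isdigit((unsigned char)*c->p)) return 0;
    while (isdigit((unsigned char)*c->p)) v = v * 10 + (*c->p++ - '0');
    *out = v;
    return 1;
}

/* Left-recursive original:  expr := expr ('+' | '-') term | term
 * Rewritten for LL parsing: expr := term (('+' | '-') term)*
 * Folding into acc inside the loop keeps the operators left-associative. */
static int parse_expr(cursor *c, long *out) {
    long acc, rhs;
    if (!parse_term(c, &acc)) return 0;
    for (;;) {
        cursor save = *c;            /* backtrack point for this iteration */
        char op;
        skip_ws(c);
        op = *c->p;
        if (op != '+' && op != '-') { *c = save; break; }
        c->p++;
        if (!parse_term(c, &rhs)) { *c = save; break; }  /* undo the operator */
        acc = (op == '+') ? acc + rhs : acc - rhs;
    }
    *out = acc;
    return 1;
}

/* Returns 1 only if the whole string is a valid expr; the value goes in *out. */
int eval_expr(const char *src, long *out) {
    cursor c;
    assert(src != NULL);
    c.p = src;
    if (!parse_expr(&c, out)) return 0;
    skip_ws(&c);
    return *c.p == '\0';
}
```

Calling `eval_expr("10 - 3 - 2", &v)` groups as `(10 - 3) - 2`, which is the grouping the left-recursive rule would have given. In a combinator setting the same shape would presumably be a `term` followed by a repetition of `op term` with a left fold over the results, but how that maps onto pcomb's primitives is not something this sketch claims to know.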