From: antispam@math.uni.wroc.pl
Newsgroups: comp.compilers
Subject: Re: Supporting multiple input syntaxes
Date: Tue, 23 Feb 2021 23:28:16 +0000 (UTC)
Organization: Politechnika Wroclawska
Message-ID: <21-02-008@comp.compilers>
References: <20-08-002@comp.compilers> <21-02-004@comp.compilers> <21-02-005@comp.compilers>
Keywords: parse

Elijah Stone wrote:
> On Thu, 11 Feb 2021, antispam@math.uni.wroc.pl wrote:
>
> > My impression is that variation in Pascal dialects is larger than in C
> > dialects, so case for unified parser in C IMHO
>
> Pascal is more fragmented, but it's also much easier to parse than C. I
> think it's a wash.

I did a C parser, and it was not hard at all. In C (as in standard
Pascal) there are conflicts, but those conflicts can be resolved easily
using semantic info. Alternatively, for C one can use 2 tokens of
lookahead. The Turbo Pascal folks introduced an "interesting" difficulty
with caret constants. Frank Heckenbach worked out how to handle them,
and his analysis indicates that correct handling of Turbo Pascal needs,
IIRC, 6 tokens of lookahead. Note that for both Pascal and C, with
1 token of lookahead semantic info is available when it is needed to
disambiguate parsing. Once you have more than 1 token of lookahead,
semantic info sometimes arrives too late, and in effect the parser must
work purely syntactically.

> (I also think the whole idea is horrifying and ought not to be pursued;
> but.)

What do you mean by "whole idea"?
Do you think that creating a compiler that can correctly handle
multiple dialects (of Pascal or of another language) is wrong?

-- 
Waldek Hebisch
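
As an illustration of the point about resolving C's conflicts with
semantic info, here is a minimal sketch in Python. It is not taken from
any real C parser: `classify_statements`, `BUILTIN_TYPES` and the tiny
token set are all hypothetical, and the statement grammar is reduced to
"tokens up to a semicolon". It only shows the idea that a table of
typedef names, kept up to date while parsing, is enough to decide from
the first token whether `T * x;` is a declaration or an expression.

```python
import re

# Hypothetical, deliberately tiny token set: identifiers, numbers, '*', '=', ';'.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[*=;])")
BUILTIN_TYPES = {"int", "char", "long", "double"}


def tokenize(src):
    """Split the source text into a flat list of token strings."""
    src = src.rstrip()
    pos, tokens = 0, []
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def classify_statements(src):
    """Label each ';'-terminated statement using a typedef table.

    The decision is made from the first token of the statement only;
    the rest of the statement is just skipped up to the semicolon.
    """
    typedef_names = set()  # the "semantic info" consulted by the parser
    tokens = tokenize(src)
    labels = []
    i = 0
    while i < len(tokens):
        if ";" not in tokens[i:]:
            raise SyntaxError("missing ';'")
        end = tokens.index(";", i)
        stmt = tokens[i:end]
        if not stmt:
            labels.append("empty")
        elif stmt[0] == "typedef":
            # Assumed shape: typedef <type> <name>;
            # The name is recorded before the next statement is looked at.
            typedef_names.add(stmt[-1])
            labels.append("typedef")
        elif stmt[0] in BUILTIN_TYPES or stmt[0] in typedef_names:
            labels.append("declaration")
        else:
            labels.append("expression")
        i = end + 1
    return labels
```

For example, `classify_statements("typedef int T; T * x; y * x;")`
labels the second statement a declaration and the third an expression,
although both have the same token shape. The table is already updated
when the first token of the next statement is examined, which is why a
single token of lookahead suffices in this sketch.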