From: gah4
Newsgroups: comp.compilers
Subject: Re: How do you create a grammar for a multi-language language?
Date: Sun, 6 Mar 2022 16:50:08 -0800 (PST)
Organization: Compilers Central
Message-ID: <22-03-013@comp.compilers>
References: <22-03-004@comp.compilers> <22-03-006@comp.compilers> <22-03-010@comp.compilers> <22-03-011@comp.compilers>
Keywords: C, history

On Sunday, March 6, 2022 at 3:22:57 PM UTC-8, gah4 wrote:

(snip)
> I don't believe I ever tried preprocessor statements inside C string
> constants, but as far as I know, it works.

(snip)
> [PHP effectively treats material between ?> and to print the material as if it were a quoted string. I suppose it works that way.
> C says that the input is tokenized before it does the preprocessor
> phase, so it does not look inside quoted strings. The # and ##
> preprocessor operators allow some preprocessor time creation of quoted
> strings. -John]

It seems that the traditional C compiler, sometimes used with other
languages, such as Fortran, has different parsing and tokenizing rules.

    gcc -E --traditional quote.c

will process files with a preprocessor statement inside quotes.

On the other hand, the result is likely not what was wanted, at least
not for C. Among others, it puts out lines like:

    # 6 "quote.c" 2

which then end up inside the string.

On the other hand, if your language uses ' for something other than
strings, or for two different purposes, then it should work.

One of the stranger things that I have known for about 50 years
is direct access I/O in IBM Fortran IV. There are statements like:

      WRITE(1'N) X, Y, Z

where N is the record in the direct access file. The compiler also
accepts string (Hollerith) constants with apostrophes. (But not in
direct access I/O statements.)

In any case, the --traditional C preprocessor is commonly used with
Fortran, where I suspect C tokenizing could cause problems.

Some C preprocessors, but not the current ISO C version, and it seems
not gcc -E --traditional, will substitute preprocessor symbols inside
quoted strings.

As the OP notes, mixed parsing can have surprising effects!
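
For what it is worth, here is a small sketch of the ISO C side of this, assuming a conforming compiler. The macro names (GREETING, STR, XSTR) and the function names are made up for illustration and do not come from any post in the thread. It shows that a macro name inside a quoted string is left alone, and that the # operator is the usual way to get a macro's value into a string at preprocessor time.

```c
/* Hypothetical names throughout; a sketch, not anyone's actual code. */
#define GREETING hello   /* an ordinary object-like macro */
#define STR(x)   #x      /* stringify the argument exactly as written */
#define XSTR(x)  STR(x)  /* expand the argument first, then stringify */

/* ISO C tokenizes before macro replacement, so the name GREETING
   inside a string literal is not substituted. */
const char *in_literal(void) { return "GREETING"; }

/* # applied directly sees the unexpanded spelling of its argument. */
const char *stringified(void) { return STR(GREETING); }

/* The two-level form expands GREETING before # is applied. */
const char *expanded(void) { return XSTR(GREETING); }

/* Adjacent string literals are concatenated after preprocessing,
   which splices the macro's value into a longer string. */
const char *spliced(void) { return "say " XSTR(GREETING); }
```

Calling puts(expanded()) from a main() should print hello, while puts(in_literal()) prints GREETING unchanged. A preprocessor that substitutes inside quoted strings would presumably give a different answer for the second one, which is the kind of surprise mentioned above.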