From: gah4
Newsgroups: comp.compilers
Subject: Re: Basic Lexing Question
Date: Wed, 29 Jun 2022 16:27:06 -0700 (PDT)
Organization: Compilers Central
Message-ID: <22-06-087@comp.compilers>
References: <22-06-086@comp.compilers>
Keywords: lex, macros

On Wednesday, June 29, 2022 at 2:02:08 PM UTC-7, nob...@gmail.com wrote:
> The following line is from a makefile accepted by gmake:
>
>     onefile: $(AVAR)
>
> I'm wondering what the ramifications are of lexing what's on the right of the
> colon as a single string and then breaking it apart later, as opposed to
> returning a more detailed sequence of tokens, such as DOLLAR LPAREN NAME
> RPAREN. I suspect that the question is more complicated than it looks.

Well, first, you might look at the gmake manual, and especially here:

https://www.gnu.org/software/make/manual/html_node/Flavors.html#Flavors

Often in interpreted languages, and also in languages that use a preprocessor, you have to consider that things might be parsed more than once.

As far as I know, in processing that line gmake searches the line for $, (mostly) without looking at the rest of the line. (Beyond that, I am not sure how string constants are handled.) So variables are replaced, and then the line is executed.

Except when it isn't.
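To make the "search the line for $" idea concrete, here is a minimal sketch in Python. It is not how gmake is actually implemented; the name expand_once and the regular expression are assumptions made up for illustration, and the pattern only covers simple names. It treats the line as one string, finds each $ reference, and leaves everything else untouched.

```python
import re

# Hypothetical pattern (an assumption, not gmake's real rules):
# $(NAME), ${NAME}, $$ for a literal dollar, or a single character after $.
_REF = re.compile(r"\$(?:\((\w+)\)|\{(\w+)\}|(\$)|(\w))")


def expand_once(line, variables):
    """Replace each variable reference in line with its value, in one pass."""
    def substitute(match):
        paren, brace, dollar, single = match.groups()
        if dollar:
            return "$"  # $$ stands for a literal dollar sign
        name = paren or brace or single
        # In this sketch, undefined variables expand to the empty string.
        return variables.get(name, "")
    return _REF.sub(substitute, line)


print(expand_once("onefile: $(AVAR)", {"AVAR": "a.c b.c"}))
# Without parentheses only the first character is taken as the name.
print(expand_once("$AVAR", {"A": "x", "AVAR": "y"}))
```

Nothing to the right of the colon is tokenized here; the scanner only cares about $ and what immediately follows it, which is the single-string approach the question describes.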
It seems that in a variable assignment:

    bvar = $(AVAR)

the variable isn't expanded yet; the text $(AVAR) itself is the value of bvar. Then, later, when there is a $(bvar), $(AVAR) is substituted, and then the value of AVAR is substituted. gmake also has:

    cvar ::= $(AVAR)

where $(AVAR) is expanded at the time of the assignment.

I first thought about this for PHP, which is a preprocessor (meant for) HTML. The processor doesn't know about HTML at all, but looks for <?php ... ?> sections, which are processed by PHP, with the result sent out by the server for the web browser to process. I am not sure of the exact rules, so it might be that it is processed differently in quoted strings, but I suspect not.

The gmake manual has this example, which they recommend not using:

    foo = c
    prog.o : prog.$(foo)
            $(foo)$(foo) -$(foo) prog.$(foo)

Note that the $(foo)$(foo) is replaced by cc to run the C compiler.

Some of the more interesting parsing examples come with TeX, which allows one to change, while it is running, which characters are letters. Only letters can be used in control-sequence names longer than one character. (Note that, unlike in many languages, digits are not allowed ... unless they are letters!)

TeX also has \expandafter, which allows for delaying the expansion of something until after what follows it has been expanded.

In any case, when input is parsed more than once, often by parsers with different rules, the exact order of processing is very important!
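As a rough illustration of why the order matters, here is a small Python sketch of the two assignment flavors discussed above. It is only a model under stated assumptions: the expand function and the variables dictionary are hypothetical names, only the $(NAME) form is handled, and a self-referencing variable would recurse forever.

```python
import re

# Only the $(NAME) form is handled, for brevity (an assumption of this sketch).
_REF = re.compile(r"\$\((\w+)\)")


def expand(text, variables):
    """Expand $(NAME) references, re-expanding each value as it is substituted."""
    return _REF.sub(
        lambda m: expand(variables.get(m.group(1), ""), variables), text
    )


variables = {}

# Deferred, like `bvar = $(AVAR)`: store the text unexpanded.
variables["bvar"] = "$(AVAR)"
variables["AVAR"] = "first"

# Immediate, like `cvar ::= $(AVAR)`: expand now and store the result.
variables["cvar"] = expand("$(AVAR)", variables)

# Change AVAR after both assignments have been made.
variables["AVAR"] = "second"

print(expand("$(bvar)", variables))  # second
print(expand("$(cvar)", variables))  # first

# The example from the manual: references can be glued together freely.
print(expand("$(foo)$(foo) -$(foo) prog.$(foo)", {"foo": "c"}))  # cc -c prog.c
```

The same right-hand side gives different results depending on when it is expanded, which is the part a detailed token stream from the lexer would not capture by itself.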