From: luser droog
Newsgroups: comp.compilers
Subject: Re: Learning only one lexer made me blind to its hidden assumptions
Date: Tue, 12 Jul 2022 19:49:31 -0700 (PDT)
Organization: Compilers Central
Message-ID: <22-07-007@comp.compilers>
References: <22-07-006@comp.compilers>
Keywords: lex, comment

On Monday, July 11, 2022 at 7:26:08 PM UTC-5, Roger L Costello wrote:
> Hi Folks,
>
> For months I have been immersed in learning and using Flex. Great fun indeed.
>
> But recently I have been reading a book, Crafting a Compiler with C, and
> reading its chapter on lexers. The chapter describes two lexer-generators:
> ScanGen and Lex. Oh my! Learning ScanGen opened my eyes to the hidden
> assumptions in Lex/Flex. Without learning ScanGen I would have continued to
> think that the way things are done in Lex/Flex way is the only way.
>
> Below I have documented some of the differences between Lex/Flex and ScanGen.

[snip]

> Difference:
> - Flex regexes use juxtaposition for specifying concatenation.
> - ScanGen uses '.' to specify concatenation. And oh by the way, ScanGen calls
> it 'catenation' not 'concatenation'

I think this difference in word choice may have some etymological
significance. Both words come from "catenary", which is the shape a rope or
cord makes when you drape it over some spokes or frames or hooks or whatever.
So, to *catenate* is to hoist the string or rope up onto some hooks or poles
so it makes that dangling *garland* kind of curve. So, it's focused on the
*rope* as an entity.

*Concatenate* adds the prefix "con" meaning "with". I interpret this as
embellishing the rope with beads or light bulbs or something. So now we're
stringing up a bunch of beads *together*, focusing on the hanging objects.

The original APL book uses "catenate" in a way that I think is consistent
with my interpretation here. But I could also be wrong. I have not actually
researched this beyond having run into it a few times and attempted to come
up with a plausible reason.

[Lots of people agree with that etymology. Where do you think the Unix "cat"
command came from? -John]