Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah4
Newsgroups: comp.compilers
Subject: Re: Looking for Unix lex for modern systems
Date: Fri, 7 Jan 2022 15:36:44 -0800 (PST)
Organization: Compilers Central
Lines: 45
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-030@comp.compilers>
References: <22-01-023@comp.compilers> <22-01-024@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="31744"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, history, comment
Posted-Date: 07 Jan 2022 20:28:31 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-024@comp.compilers>
Xref: csiph.com comp.compilers:2806

(snip, our moderator wrote)

> [Flex can take the same input as lex but its internals are totally different.
>
> Bell Labs long ago released the code to early Unix systems. The source
> for lex is here:
> https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/lex or on
> the 4.2BSD src archive at
> https://www.tuhs.org/Archive/Distributions/UCB/4.2BSD/
> I tried to compile the 4.2BSD version on FreeBSD and the errors were
> ugly. -John]

It seems that real lex knows about RATFOR, and I suspect that actual
flex doesn't. Is that a good test for which source you have?

In any case, with gcc -std=c89 -Dunix there are not many hard errors
(as opposed to warnings). The warnings come from conversions, either
between mismatched pointer types or between integer and pointer.
I am not so sure how well current systems do the latter.
(That seems to be usual for C from those years.)

After fixing the actual errors, which included removing the
initialization of *errorf with stdout and not declaring calloc,
it compiles and (with the -t option) runs.
It then stops with:

  (Error) output table overflow
  5/1000 nodes(%e), 10/2500 positions(%p), 3/500 (%n), 254 transitions,
  2/1000 packed char classes(%k), 3/2000 packed transitions(%a),
  0/0 output slots(%o)

(I have the sample file from the Wikipedia page for input.)

Reminds me, in the days of OS/2 1.0, I was compiling the GNU utilities,
and especially grep and diff, for OS/2. In many cases, they would mix
integer and (char*), especially in function arguments. Replacing 0 with
(char*)0 fixed those, but I also complained to the GNU people.

The reply was that, pretty much, any system with sizeof(int) not equal
to sizeof(char*) was broken, and it wasn't their problem to fix.

[If the comments in the source code say "written by Eric Schmidt", it's
lex, otherwise, it's flex. Yes, that Eric Schmidt. -John]