From: Philipp Klaus Krause
Newsgroups: comp.compilers
Subject: What happens at the end of the file for lex?
Date: Wed, 3 Jun 2020 10:04:50 +0200
Organization: solani.org
Message-ID: <20-06-001@comp.compilers>
Keywords: lex, question, comment
Posted-Date: 03 Jun 2020 19:57:44 EDT

I wonder what is supposed to happen when a lex lexer reaches the end of
the input while calling input().

The only information I found in the flex 2.6.4 manual states:

    If 'input()' encounters an end-of-file the normal 'yywrap()'
    processing is done. A "real" end-of-file is returned by 'input()'
    as 'EOF'.

What is the difference between an "end-of-file" and a "'real'
end-of-file"?

I did a quick test using this .lex file:

    %%
    .   {for(int i = 0; i < 8; i++) {int ch = input(); printf("%d\n", ch);}}
    %%
    main()
    {
        yylex();
    }

And using a single-character input file, I see:

    philipp@notebook5:/tmp$ ./a.out < test.c
    10
    0
    0
    0
    0
    0
    0
    0

So apparently input() just returns 0 (and keeps doing so).

Is input() supposed to always return 0 at the end? Could input() return
0 in some other situation? When would input() return EOF?

Philipp

[The convention in lex and flex is that input() returns 0 at the end of
input. You can use a <<EOF>> rule if you want your lexer to do something
other than return when it gets to EOF. The yywrap() routine is used for
file switching if your lexer handles multiple input files in a single
run.
-John]
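
The points above can be put together in a small flex sketch. It is an illustration under stated assumptions, not the way flex requires it to be done: the action that reads with input() treats both 0 and EOF as end of input, because the manual text quoted above and the observed behaviour disagree, and an <<EOF>> rule runs an action when the scanner itself reaches the end. The comment-skipping rule and the messages are made up for the example, and %option noyywrap is used so that no yywrap() has to be supplied.

```lex
%option noyywrap
%{
#include <stdio.h>
%}
%%
"/*"    {
            /* Skip a C-style comment by hand, reading with input(). */
            int c, prev = 0;
            for (;;) {
                c = input();
                /* Assumption: accept either convention for end of input. */
                if (c == 0 || c == EOF) {
                    fprintf(stderr, "unterminated comment\n");
                    break;
                }
                if (prev == '*' && c == '/')
                    break;          /* found the closing delimiter */
                prev = c;
            }
        }
<<EOF>> { puts("end of input"); yyterminate(); }
.|\n    { /* ignore everything else */ }
%%
int main(void)
{
    return yylex();
}
```

Checking for 0 alone follows the convention described in the moderator's note. The extra EOF test is only a precaution for a scanner generator that follows the manual's wording instead. A literal NUL byte in the input would presumably also be seen as 0 by this loop, so the sketch is only suitable for text input.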