From: Kaz Kylheku <864-117-4973@kylheku.com>
Newsgroups: comp.compilers
Subject: Re: Parsing repetitive tokens across multiple patterns
Date: Fri, 16 Jun 2023 00:31:40 -0000
Organization: Compilers Central
Message-ID: <23-06-005@comp.compilers>
References: <23-06-003@comp.compilers>
Keywords: parse, design

On 2023-06-15, Archana Deshmukh wrote:
> I need to parse following patterns using bison and flex and retrieve the data and store to list.
> There are total 9 patterns, so there will 9 lists to store the corresponding values. I am able to parse all patterns.
> However, I need to add lots of flags as tokens are repetitive and used across patterns.
> Also within patterns same token is repeated multiple times. I think there can be a better way to do this.

Your descriptions do not constitute a coherent problem statement.

What are the inputs to your system, and the corresponding, expected
outputs? What is the smallest input set? Next smallest? Corner cases?

Tokens are often repeated in programming and data languages. Why do you
believe this is significant in your problem?

> e.g. for token INTEGER, the code in bison parser file is
>
> INTEGER:
>   if(pattern1)
>   {
>     if(flag1)
>     {
>     }
>     if(flag2)
>     {
>     }
>     .
>     .
>     .
>   }

What does this mean?
We are in the middle of parsing a pattern, and we already know which one
due to earlier parsing actions, such that the pattern1 variable (and
others) inform us? And so when an INTEGER occurs, it has to be treated
differently based on which pattern we are in? Why, and how?

There can be reasons to treat tokens in a context-sensitive way. Usually
that happens when there are multiple sub-languages integrated into one
language. Your different patterns don't look like different languages;
why would a token like 5 have to be treated differently based on which
one of 9 similar patterns it occurs in?

I sense an X/Y problem here. The real problem being solved is some Y
that you have not revealed, and you have invested in some chosen
approach X that you're trying to debug into working, and are asking
questions about. There may be a better way, but that way may be a
complete replacement for X to solve the Y.

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca