Path: csiph.com!eternal-september.org!feeder3.eternal-september.org!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: John R Levine Newsgroups: comp.compilers Subject: paper: Towards Automatic Error Recovery in Parsing Expression Date: Tue, 08 Jul 2025 20:59:25 -0400 Organization: Compilers Central Sender: johnl%iecc.com Approved: comp.compilers@iecc.com Message-ID: <25-07-004@comp.compilers> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="89646"; mail-complaints-to="abuse@iecc.com" Keywords: parse, errors Posted-Date: 08 Jul 2025 21:40:25 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:3671 Lots of compilers tried to do error recovery back in the batch era, but it fell out of favor until IDEs tried to build parse trees out of less than perfect source code. Abstract Error recovery is an essential feature for a parser that should be plugged in Integrated Development Environments (IDEs), which must build Abstract Syntax Trees (ASTs) even for syntactically invalid programs in order to offer features such as automated refactoring and code completion. Parsing Expressions Grammars (PEGs) are a formalism that naturally describes recursive top-down parsers using a restricted form of backtracking. Labeled failures are a conservative extension of PEGs that adds an error reporting mechanism for PEG parsers, and these labels can also be associated with recovery expressions to also be an error recovery mechanism. These expressions can use the full expressivity of PEGs to recover from syntactic errors. Manually annotating a large grammar with labels and recovery expressions can be difficult. In this work, we present an algorithm that automatically annotates a PEG with labels, and builds their corresponding recovery expressions. We evaluate this algorithm by adding error recovery to the parser of the Titan programming language. The results shown that with a small amount of manual intervention our algorithm can be used to produce error recovering parsers for PEGs where most of the alternatives are disjoint. https://arxiv.org/abs/2507.03629 Regards, John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY Please consider the environment before reading this e-mail. https://jl.ly