Path: csiph.com!weretis.net!feeder9.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Kaz Kylheku <643-408-1753@kylheku.com> Newsgroups: comp.compilers Subject: Nice trick for "superfix" relational operators operator precedence parsing. Date: Wed, 07 May 2025 02:44:45 -0000 Organization: Compilers Central Sender: johnl%iecc.com Approved: comp.compilers@iecc.com Message-ID: <25-05-002@comp.compilers> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="55436"; mail-complaints-to="abuse@iecc.com" Keywords: syntax, history, comment Posted-Date: 07 May 2025 12:23:49 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:3645 I wanted to have compounding relational operators (which I call "superfix") in an operator precedence parser. This means that syntax like: 1 < j < k <= N behaves as one comparison operation meaning (1 < j) and (j < k) and (k <= N), except that the j and k terms are not evaluated twice. There may be short-circuiting of the logical and. We want parentheses to override this, so that 1 < (j < k) <= N is treated as 1 < T < N where T is a single term consisting of (j < k). Anyway, I came up with a nice solution for bringing this about without complicating the Shunting-Yard-based operator precedence parser. 1. I put all the relational operators into a single level of precedence, and made them right-associative. 2. A pass is made through the resulting syntax tree, whic recognizes the right-associative tree patterns involving these operators and rewrites them to the "and" terms. We can see this live: 1> (parse-infix '(1 < i < j <= N)) (< 1 (< i (<= j N))) 2> (sys:finish-infix *1) (and (< 1 i j) (<= j N)) (In my actual implementation, the < function is n-ary so we reduce to a single (< 1 i j) term. Because this is a strictly evaluated function call, it means that even if (< 1 i) is false, j is still evaluated, unlike with (and (< 1 i) (< i j))). Tree pattern matching matchin and rewriting is used to handle patterns like (relop1 A (relop2 B C)) into (and (relop1 A B) (relop2 B C)). Oh, about multiple evaluation, let's replace i with (inc i), an expression wth a side-effect of pre-incrementing i (returning the new value) which must happen once: 1> (parse-infix '(1 < (inc i) < j <= N)) (< 1 (< (inc i) (<= j N))) 2> (sys:finish-infix *1) (let (#:g0014) (and (< 1 (set #:g0014 (inc i))) (< #:g0014 j) (<= j N))) Code is emitted to bind a temporary hidden variable, which receives the result of the first (inc i). The subsequent term that would be an occurrence of (inc i) under the < function instead refers to that variable. The strategy both preserves left-to-right evaluation of all terms, and also depends on it or correctness. Other than adjusting the declarations of associativity and precedence, nothing had to be done in the parser; all the work in in the separate "finish infix" pass. It is then easy to document that way, too; the parsing model is very clear, requiring no concepts beyond binary operator precedence, and the heuristics of the "superfix" transformation are easy to clearly document also. -- TXR Programming Language: http://nongnu.org/txr Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal Mastodon: @Kazinator@mstdn.ca [COBOL had roughly that shortcut comparison syntax in 1960 so a lot of people must have looked at this. -John]