Groups | Search | Server Info | Keyboard shortcuts | Login | Register


Groups > comp.compilers > #3560

Figuring out grammars from examples

Path csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From John R Levine <johnl@taugh.com>
Newsgroups comp.compilers
Subject Figuring out grammars from examples
Date Fri, 12 Apr 2024 18:39:51 -0400
Organization Compilers Central
Sender johnl%iecc.com
Approved comp.compilers@iecc.com
Message-ID <24-04-001@comp.compilers> (permalink)
MIME-Version 1.0
Content-Type text/plain; charset="UTF-8"
Injection-Info gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="68172"; mail-complaints-to="abuse@iecc.com"
Keywords parse
Posted-Date 12 Apr 2024 18:48:31 EDT
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:3560

Show key headers only | View raw


There's been a surprising amount of work on taking language strings and
feeding them to a box that infers a grammar for them.  One of the harder
bits is figuring out nesting constructs, which in the past has often
required hints.

In this paper, a Visibly Pushdown Grammar is large but more tractable
subset of context free grammars.

V-Star: Learning Visibly Pushdown Grammars from Program Inputs
Xiaodong Jia, Gang Tan

Accurate description of program inputs remains a critical challenge in the
field of programming languages. Active learning, as a well-established
field, achieves exact learning for regular languages. We offer an
innovative grammar inference tool, V-Star, based on the active learning of
visibly pushdown automata. V-Star deduces nesting structures of program
input languages from sample inputs, employing a novel inference mechanism
based on nested patterns. This mechanism identifies token boundaries and
converts languages such as XML documents into VPLs. We then adapted
Angluin's L-Star, an exact learning algorithm, for VPA learning, which
improves the precision of our tool. Our evaluation demonstrates that
V-Star effectively and efficiently learns a variety of practical grammars,
including S-Expressions, JSON, and XML, and outperforms other
state-of-the-art tools.

https://arxiv.org/abs/2404.04201v1

Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly

Back to comp.compilers | Previous | NextNext in thread | Find similar


Thread

Figuring out grammars from examples John R Levine <johnl@taugh.com> - 2024-04-12 18:39 -0400
  Re: Figuring out grammars from examples Derek <derek@shape-of-code.com> - 2024-04-13 12:06 +0100
    Re: Figuring out grammars from examples Derek <derek@shape-of-code.com> - 2024-04-15 02:17 +0100

csiph-web