Re: Parser Reversed

From "Matt P. Dziubinski" <matdzb@gmail.com>
Newsgroups comp.compilers
Subject Re: Parser Reversed
Date 2018-03-11 15:08 +0100
Organization http://www.wit.edu.pl
Message-ID <18-03-040@comp.compilers>
References <18-03-038@comp.compilers>


On 3/11/2018 08:32, Hans-Peter Diettrich wrote:
> A grammar can be used to *check* for valid sentences of a language, but
> it also can be used to *create* valid sentences. For a pretty printer or
> decompiler test I need a sentence generator for logical expressions. For
> now the language can be restricted to AND, OR, variables and (kind of)
> parentheses. Later on NOT and XOR can be added. RPN is one alternative
> for the "kind of parentheses", eliminating the need for a specific
> operator precedence.
>
> Now I'm looking for possible implementations of such a generator, in
> addition to my own ideas. So far the output can be anything, e.g. source
> code or machine code, or some tree (AST...).
>
> Any ideas or references to such projects?

Hi!

Csmith comes to mind: https://embed.cs.utah.edu/csmith/

Reference: Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. PLDI
2011. "Finding and Understanding Bugs in C Compilers"
Paper: http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf
LtU post: http://lambda-the-ultimate.org/node/4241

Summary (from the paper): "The shape of a program generated by Csmith is
governed by a grammar for a subset of C. A program is a collection of
type, variable, and function definitions; a function body is a block; a
block contains a list of declarations and a list of statements; and a
statement is an expression, control-flow construct (e.g., `if`,
`return`, `goto`, or `for`), assignment, or block. Assignments are
modeled as statements -- not expressions -- which reflects the most common
idiom for assignments in C code. We leverage our grammar to produce
other idiomatic code as well: in particular, we include a statement kind
that represents a loop iterating over an array. The grammar is
implemented by a collection of hand-coded C++ classes."
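
For the restricted AND/OR language in the question, the same grammar-driven idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Csmith is implemented: the grammar (expr ::= var | expr AND expr | expr OR expr), the tuple-based tree, the 0.3 leaf probability, and the names gen, to_infix, to_rpn, evaluate, and eval_rpn are all made up for the example.

```python
import random

VARIABLES = ("a", "b", "c")  # assumed variable names, purely illustrative


def gen(rng, depth, variables=VARIABLES):
    """Randomly expand expr ::= var | expr AND expr | expr OR expr.

    The depth bound forces termination; the result is a nested tuple
    (op, lhs, rhs) or a variable name.
    """
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(variables)
    op = rng.choice(("AND", "OR"))
    return (op, gen(rng, depth - 1, variables), gen(rng, depth - 1, variables))


def to_infix(tree):
    """Print fully parenthesized infix, so no precedence rules are needed."""
    if isinstance(tree, str):
        return tree
    op, lhs, rhs = tree
    return f"({to_infix(lhs)} {op} {to_infix(rhs)})"


def to_rpn(tree):
    """Print the same tree in RPN (postfix), which needs no parentheses."""
    if isinstance(tree, str):
        return tree
    op, lhs, rhs = tree
    return f"{to_rpn(lhs)} {to_rpn(rhs)} {op}"


def evaluate(tree, env):
    """Reference evaluation of the tree under a variable assignment."""
    if isinstance(tree, str):
        return env[tree]
    op, lhs, rhs = tree
    left, right = evaluate(lhs, env), evaluate(rhs, env)
    return (left and right) if op == "AND" else (left or right)


def eval_rpn(text, env):
    """Evaluate an RPN string with a stack, as a check on the printer."""
    stack = []
    for tok in text.split():
        if tok in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            stack.append((left and right) if tok == "AND" else (left or right))
        else:
            stack.append(env[tok])
    return stack.pop()


if __name__ == "__main__":
    rng = random.Random(2018)  # fixed seed so a failing case can be replayed
    env = {"a": True, "b": False, "c": True}
    for _ in range(5):
        tree = gen(rng, depth=3)
        assert evaluate(tree, env) == eval_rpn(to_rpn(tree), env)
        print(to_infix(tree), "|", to_rpn(tree))
```

Generating a tree first and printing it afterwards keeps the output format open (infix source, RPN, or the AST itself), and the tree doubles as an oracle: whatever the pretty printer or decompiler produces can be evaluated and compared against evaluate() on the original tree.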

You may also want to take a look at the following:

* "Effect-Driven QuickChecking of Compilers" (notably, this goes
substantially further than relying solely on the grammar by making use
of the type system -- more in the paper):

Code (Effect-Driven Compiler Tester): https://github.com/jmid/efftester
Paper: http://janmidtgaard.dk/papers/Midtgaard-al%3AICFP17-full.pdf
Talk: https://podcasts.ox.ac.uk/effect-driven-quickchecking-compilers

* "Structure-aware fuzzing for Clang and LLVM with libprotobuf-mutator"
- Kostya Serebryany, Vitaly Buka and Matt Morehouse - 2017 LLVM
Developers' Meeting
https://www.youtube.com/watch?v=U60hC16HEDY
https://llvm.org/devmtg/2017-10/#talk8

See: https://llvm.org/docs/FuzzingLLVM.html
In particular:
https://github.com/llvm-mirror/clang/tree/master/tools/clang-fuzzer

"This directory contains two utilities for fuzzing Clang: clang-fuzzer
and clang-proto-fuzzer.  Both use libFuzzer to generate inputs to clang
via coverage-guided mutation.

The two utilities differ, however, in how they structure inputs to
Clang. clang-fuzzer makes no attempt to generate valid C++ programs and
is therefore primarily useful for stressing the surface layers of Clang
(i.e. lexer, parser). clang-proto-fuzzer uses a protobuf class to
describe a subset of the C++ language and then uses libprotobuf-mutator
to mutate instantiations of that class, producing valid C++ programs in
the process.  As a result, clang-proto-fuzzer is better at stressing
deeper layers of Clang and LLVM."
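
To make that contrast concrete for the small AND/OR language from the question, here is a toy Python analogue. It does not use libFuzzer or libprotobuf-mutator and says nothing about their APIs; the tuple-based tree and the names mutate_text, mutate_tree, and well_formed are assumptions for the sake of the sketch.

```python
import random
import string

VARIABLES = ("a", "b", "c")  # assumed variable names, purely illustrative


def mutate_text(rng, text):
    """Unstructured mutation: overwrite one character with a random one.

    The result has the same length but is usually no longer a valid
    sentence, so it mostly exercises the lexer and parser error paths.
    """
    i = rng.randrange(len(text))
    return text[:i] + rng.choice(string.printable) + text[i + 1:]


def mutate_tree(rng, tree):
    """Structure-aware mutation: change one node, keep the tree well formed."""
    if isinstance(tree, str):
        return rng.choice(VARIABLES)  # swap a leaf for some variable
    op, lhs, rhs = tree
    choice = rng.randrange(3)
    if choice == 0:
        return ("OR" if op == "AND" else "AND", lhs, rhs)  # flip the operator
    if choice == 1:
        return (op, mutate_tree(rng, lhs), rhs)  # descend into the left child
    return (op, lhs, mutate_tree(rng, rhs))  # descend into the right child


def well_formed(tree):
    """Check that a value is a valid tree of the toy language."""
    if isinstance(tree, str):
        return tree in VARIABLES
    return (
        isinstance(tree, tuple)
        and len(tree) == 3
        and tree[0] in ("AND", "OR")
        and well_formed(tree[1])
        and well_formed(tree[2])
    )


if __name__ == "__main__":
    rng = random.Random(7)
    print(mutate_text(rng, "(a AND (b OR c))"))
    tree = ("AND", "a", ("OR", "b", "c"))
    for _ in range(3):
        tree = mutate_tree(rng, tree)
        print(tree, well_formed(tree))
```

Every output of mutate_tree is still a valid sentence, so it reaches whatever sits behind the parser, whereas mutate_text mostly produces inputs that are rejected early.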

For further reference, perhaps the following compiler correctness
resources (literature & software) can also be of help:
https://github.com/MattPD/cpplinks/blob/master/compilers.correctness.md

Best,

Matt P. Dziubinski


