Re: binary search debugging of compilers

From	Kaz Kylheku <864-117-4973@kylheku.com>
Newsgroups	comp.compilers
Subject	Re: binary search debugging of compilers
Date	2023-05-15 21:35 +0000
Organization	A noiseless patient Spider
Message-ID	<23-05-010@comp.compilers> (permalink)
References	<23-05-003@comp.compilers> <23-05-007@comp.compilers>

Show all headers | View raw

On 2023-05-14, Thomas Koenig <tkoenig@netcologne.de> wrote:
> Russ Cox <rsc@swtch.com> schrieb:
>
>> As I mentioned at the top, I am interested to hear about earlier uses
>> of approaches like this, or any other ideas for situations where the
>> approach might be applicable. Clearly the general problem has overlaps
>> with group testing [1], and in particular Hwang's binary search-based
>> approach (1972) [2].
>
> For GCC regression-hunting, two techniques are routinely used.
>
> One of them is the use of "gcc bisect", described at
> https://git-scm.com/docs/git-bisect .  If there is a failing test
> case, this can (at the cost of some CPU time) usually pinpoint
> offending commit.

The technique described by Russ Cox is applicable when you know what
the offending commit is. E.g. you made some changes, say, to the
optimizer, and something is going wrong when you boostrap the compiler.
Obviously, the optimization changes are breaking something; the prior
commit works. But you don't understand what is breaking where.

>
> For reducing test cases, at least for C and C++, cvise
> https://github.com/marxin/cvise is now the mehod of choice; it is
> brutally efficient at reducing test cases to an absolute minimum.
> Because it can transform C and C++ programs, it can make
> transformations of the program which are language-aware.

This is only applicable if you have an external test case which
reproduces some compiler problem (like a crash) and would like
to make it smaller.

The binary search technique described by Cox is uniquely applicable
in the situation that something goes wrong with the bootstrapping
of a compiler.

1. A bug has been introduced into the compiler source.
   You already know exactly which commit did that, and
   what change is responsible, due to having done the git bisect
   or whatever. Maybe this is just new, uncommited work that is
   breaking compared to the working HEAD.

2. Consequently, the compiler miscompiles something in its own
   source code (anywhere in its source: in the compiler proiper, or
   standard library support routines of that language which are
   used by the compiler.)

3. Consequently, the compiled compiler misbehaves somehow.
   (Or possibly, the boostrapping even fails: compiled code is
   loaded into the compiler's image, and it cannot continue
   the boostrapping process.)

Characterizing what is wrong in (3) is not too difficult (though it can
be). The problem is that the issues exhibited in 3 are in good parts of
the compiler. So they are red herring issues.  If you look at the source
code for those parts, it is fine and has not been touched. Rather, the
object code is incorrect due to the introduced compiler problem.

You might know the commit which introduces the problem, yet be stumped
as to how, due to the complexity of all the pieces and how they
interact, and the difficulty of some of the algorithms. (Plus hidden
problems, like the new work actually being correct, but exposing a bug
in existing code; so staring at just the new code until you are blue
does no good.)

Cox's trick lets you turn on the new, known-bad code only for a
seletected subset of functions (or whatever logical units of the image).

You can iteratively shrink the subset until it contains only one
mistranslated unit.

Then you know---aha!--the bad optimizer change is, say, mistranslating
the partition function in stdlib/quicksort.mylang. If just that function
is treated with the buggy new change, and nothing else, the show grinds
to a halt.

So now you know you can focus your investigation on what the buggy
compiler change is doing to the partition function in
stdlib/quicksort.mylang.

You can write test cases exercising that function to understand
how its behavior has gone wrong, and study the object code to
see how it's organized to bring about the wrong behavior, and how
that relates the bad new code in the compiler.

You can tweak the code, and build the image with just that partition
function being processed with the new code to see whether the regression
has gone away, and iterate on that. Then try boostrapping the entire
image with the repaired code.

Etc.

The bug can show up as a mistranslation of anything, which is why I
picked the partition helper of quicksort example. That could wreck the
compiler's behavior, if some logic in it somemwhere depends on
sorting something. You could really be on a wild goose chase trying
to debug *that*, rather than it completely unrelated root cause.

> Both are tools external to the compiler.

Right; so not applicable to this internal technique where basically we
subdivide the compiler's own image into "parts compiled with the
known-good compiler" and "parts compiled with the new, known-bad change
in effect", where the parts are on a finer granularity than files.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

Back to comp.compilers | Previous | Next — Previous in thread | Find similar

Thread

binary search debugging of compilers Russ Cox <rsc@swtch.com> - 2023-05-12 13:59 -0400
  Re: binary search debugging of compilers Kaz Kylheku <864-117-4973@kylheku.com> - 2023-05-13 03:20 +0000
  Re: binary search debugging of compilers Fernando <pronesto@gmail.com> - 2023-05-13 04:47 -0700
    Re: binary search debugging of compilers Kaz Kylheku <864-117-4973@kylheku.com> - 2023-05-14 02:49 +0000
      Re: binary search debugging of compilers gah4 <gah4@u.washington.edu> - 2023-05-14 13:38 -0700
        Re: binary search debugging of compilers Kaz Kylheku <864-117-4973@kylheku.com> - 2023-05-15 21:52 +0000
          Re: binary search debugging of compilers gah4 <gah4@u.washington.edu> - 2023-05-16 23:52 -0700
            Re: binary search debugging of compilers Kaz Kylheku <864-117-4973@kylheku.com> - 2023-05-17 18:28 +0000
              Re: binary search debugging of compilers gah4 <gah4@u.washington.edu> - 2023-05-17 15:23 -0700
                Re: binary search debugging of compilers Kaz Kylheku <864-117-4973@kylheku.com> - 2023-05-19 03:21 +0000
                Re: binary search debugging of compilers Thomas Koenig <tkoenig@netcologne.de> - 2023-05-19 21:59 +0000
                Re: binary search debugging of compilers gah4 <gah4@u.washington.edu> - 2023-05-20 20:20 -0700
                Re: Old C compilers, binary search debugging of compilers Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2023-05-22 09:05 +0200
              binary search debugging of compilers Max B <tekk.nolagi@gmail.com> - 2023-05-19 16:31 -0500
  Re: binary search debugging of compilers Thomas Koenig <tkoenig@netcologne.de> - 2023-05-14 19:59 +0000
    Re: binary search debugging of compilers Cameron McInally <cameron.mcinally@nyu.edu> - 2023-05-14 23:28 -0400
    Re: binary search debugging of compilers Kaz Kylheku <864-117-4973@kylheku.com> - 2023-05-15 21:35 +0000

csiph-web