Groups | Search | Server Info | Login | Register
Groups > comp.compilers > #3475
| From | Russ Cox <rsc@swtch.com> |
|---|---|
| Newsgroups | comp.compilers |
| Subject | Re: binary search debugging of compilers |
| Date | 2023-05-18 15:28 -0400 |
| Organization | Compilers Central |
| Message-ID | <23-05-019@comp.compilers> (permalink) |
| References | <CADSkJJVN7RGqhgVDZaz_K5be6uEDaMofMu5OF3RR4Y5-fDu00Q@mail.gmail.com> |
> Any partial ordering can be extended to a total ordering. Why can't you > just do that instead of using 'the longest "straight" line' ? A precondition for binary search is that you are searching a function f over some domain [0, n) with the property that f(i) == true implies f(j) == true for all j >= i. That is, once the condition "flips", it stays flipped. Otherwise binary search is not a correct algorithm. The assumption in tools like git bisect is that some commit introduces the bug, and then the bug is detectable in all future commits as well, establishing the precondition. The test f(i), testing at commit i, can be viewed as asking whether the bad commit is in the range [0, i] (the full history of that commit). With that definition, clearly f(i) implies f(j) for all j >= i, because the commit history [0, j] is a superset of the commit history [0, i]. If instead you squash unrelated commit lines together in a total order, the precondition is no longer true: if i is from one line but j > i is from a different unmerged line, then testing j does not include some history that testing i does. The bug can disappear as you cross from one line to another. Focusing on the longest straight line in the commit history is one way to establish the necessary precondition. Another possibility, which is what git bisect does, is to maintain a set of commits not yet decided and then find some "good" midpoint along the longest line of commits remaining. You may in general end up with multiple disconnected lines in your set of commits not yet decided, and you have to handle each one separately until you find the bug. What git bisect does, then, is a more complex algorithm that uses binary search over longest lines as a subroutine. Restricted to a single line of commits the precondition holds and binary search works. (I am simplifying a bit. The actual implementation is perhaps not this principled, but this is the essence of what it is doing and why it works.) The binary search I described to start this thread is also not exactly the same binary search algorithm, because the underlying input problem is not a single-input function with domain [0, n). Instead the input problem (in the simplest form) is a function taking a set of possible changes [lo, hi) that reports whether the bug is in that set. We recognize it as binary search because it's still the same code and idea, and every binary search does maintain this progressively narrower range, but normal binary searches don't tell f the range. Best, Russ
Back to comp.compilers | Previous | Next | Find similar
Re: binary search debugging of compilers Russ Cox <rsc@swtch.com> - 2023-05-18 15:28 -0400
csiph-web