netbsd-11 gcc bug

Started by: Manuel Bouyer <bouyer@antioche.eu.org>
First post: 2026-05-23 10:59 +0200
Last post: 2026-05-26 10:02 -0400
Articles 20 on this page of 27 — 13 participants

Contents

  netbsd-11 gcc bug Manuel Bouyer <bouyer@antioche.eu.org> - 2026-05-23 10:59 +0200
    Re: netbsd-11 gcc bug Manuel Bouyer <bouyer@antioche.eu.org> - 2026-05-23 12:55 +0200
      Re: netbsd-11 gcc bug Manuel Bouyer <bouyer@antioche.eu.org> - 2026-05-23 20:45 +0200
    Re: netbsd-11 gcc bug Valery Ushakov <uwe@stderr.spb.ru> - 2026-05-23 14:22 +0300
      Re: netbsd-11 gcc bug Manuel Bouyer <bouyer@antioche.eu.org> - 2026-05-23 13:28 +0200
    Re: netbsd-11 gcc bug Roland Illig <roland.illig@gmx.de> - 2026-05-23 20:38 +0200
      Re: netbsd-11 gcc bug Manuel Bouyer <bouyer@antioche.eu.org> - 2026-05-23 20:47 +0200
        Re: netbsd-11 gcc bug Robert Elz <kre@munnari.OZ.AU> - 2026-05-24 02:29 +0700
          Re: netbsd-11 gcc bug Mouse <mouse@Rodents-Montreal.ORG> - 2026-05-23 19:11 -0400
            Re: netbsd-11 gcc bug Martin Husemann <martin@duskware.de> - 2026-05-24 10:44 +0200
              Re: netbsd-11 gcc bug Mouse <mouse@Rodents-Montreal.ORG> - 2026-05-24 08:33 -0400
                Re: netbsd-11 gcc bug Jason Thorpe <thorpej@me.com> - 2026-05-24 11:38 -0400
                  Re: netbsd-11 gcc bug "Greg A. Woods" <woods@planix.ca> - 2026-05-24 16:53 -0700
                Re: netbsd-11 gcc bug David Holland <dholland-tech@netbsd.org> - 2026-05-25 00:02 +0000
                  Re: netbsd-11 gcc bug Mouse <mouse@Rodents-Montreal.ORG> - 2026-05-25 08:23 -0400
                    C compiler (over-)optimization (was: netbsd-11 gcc bug) Edgar Fuß <ef@math.uni-bonn.de> - 2026-05-29 19:12 +0200
                      Re: C compiler (over-)optimization (was: netbsd-11 gcc bug) Mouse <mouse@Rodents-Montreal.ORG> - 2026-05-29 14:21 -0400
                        Re: C compiler (over-)optimization (was: netbsd-11 gcc bug) "Greg A. Woods" <woods@planix.ca> - 2026-05-29 14:27 -0700
                          Re: C compiler (over-)optimization Anders Magnusson <ragge@tethuvudet.se> - 2026-05-30 12:30 +0200
                          Re: C compiler (over-)optimization (was: netbsd-11 gcc bug) Andrew Cagney <andrew.cagney@gmail.com> - 2026-05-31 13:02 -0400
                            Re: C compiler (over-)optimization (was: netbsd-11 gcc bug) Jason Thorpe <thorpej@me.com> - 2026-05-31 14:29 -0500
                  Re: netbsd-11 gcc bug "Greg A. Woods" <woods@planix.ca> - 2026-05-28 14:40 -0700
                    Re: netbsd-11 gcc bug Mouse <mouse@Rodents-Montreal.ORG> - 2026-05-28 18:36 -0400
                    Re: netbsd-11 gcc bug Jason Thorpe <thorpej@me.com> - 2026-05-28 19:04 -0400
                      Re: netbsd-11 gcc bug "Greg A. Woods" <woods@planix.ca> - 2026-05-29 14:31 -0700
        Re: netbsd-11 gcc bug Jörg Sonnenberger <joerg@bec.de> - 2026-05-26 13:52 +0200
          Re: netbsd-11 gcc bug Jason Thorpe <thorpej@me.com> - 2026-05-26 10:02 -0400

#3593 — netbsd-11 gcc bug

From: Manuel Bouyer <bouyer@antioche.eu.org>
Date: 2026-05-23 10:59 +0200
Subject: netbsd-11 gcc bug
Message-ID: <ahFseBt1-iumbm_G@antioche.eu.org>

Hello,
I tried upgrading a server to netbsd-11 and quickly got a panic
in ipf:
[ 150.0240120] fatal page fault in supervisor mode
[ 150.0586271] trap type 6 code 0 rip 0xffffffff8056dffa cs 0x8 rflags 0x10286 cr2 0xec ilevel 0x4 rsp 0xffff870268a63a50
[ 150.1225901] curlwp 0xffff869617c47400 pid 0.3 lowest kstack 0xffff870268a5f2c0
[ 150.1657501] panic: trap
[ 150.1803103] cpu0: Begin traceback...
[ 150.2016313] vpanic() at netbsd:vpanic+0x171
[ 150.2257081] panic() at netbsd:panic+0x3c
[ 150.2473911] trap() at netbsd:trap+0xb43
[ 150.2743808] --- trap (number 6) ---
[ 150.2951811] ipf_fastroute() at netbsd:ipf_fastroute+0x6ea
[ 150.3266935] ipf_send_ip() at netbsd:ipf_send_ip+0x127
[ 150.3544099] ipf_check() at netbsd:ipf_check+0xfd5
[ 150.3859226] pfil_run_hooks() at netbsd:pfil_run_hooks+0x11e
[ 150.4164994] ipintr() at netbsd:ipintr+0x21e
[ 150.4451003] softint_dispatch() at netbsd:softint_dispatch+0x112

ipf_fastroute+0x6ea points to external/bsd/ipf/netinet/ip_fil_netbsd.c
line 1220:
                if (!fr || !(fr->fr_flags & FR_RETMASK)) {

0xec matches the offset of fr_flags in struct frentry_t
This code shouldn't dereference fr_flags if fr is NULL.
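
Why a fault address of 0xec points at a NULL fr can be sketched as follows. The struct here is a hypothetical stand-in, not the real frentry_t; only the 0xec offset is taken from the report, and the helper name is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for frentry_t: the padding places the flags member
 * at offset 0xec, the offset quoted in the report. The real layout differs. */
struct fake_frentry {
	unsigned char	pad[0xec];
	uint32_t	fr_flags;
};

/* Address the CPU would try to read for base->fr_flags, computed with
 * integer arithmetic so that no pointer is ever dereferenced. */
static uintptr_t
flags_access_address(uintptr_t base)
{
	return base + offsetof(struct fake_frentry, fr_flags);
}
```

With a base of 0 the helper yields 0xec, which is the cr2 value in the trap message above.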

The assembly code matching this part of ipf_fastroute() is:
1219                    fin->fin_fr = NULL;
   0xffffffff8056dfeb <+1755>:  movq   $0x0,0x10(%r15)

1220                    if (!fr || !(fr->fr_flags & FR_RETMASK)) {
   0xffffffff8056dff3 <+1763>:  mov    -0xd8(%rbp),%r8
   0xffffffff8056dffa <+1770>:  testl  $0x3000,0xec(%r8)
   0xffffffff8056e005 <+1781>:  mov    -0xf0(%rbp),%r9
   0xffffffff8056e00c <+1788>:  je     0xffffffff8056e326 <ipf_fastroute+2582>
   0xffffffff8056e012 <+1794>:  mov    %r9,-0xd8(%rbp)

1224                    }

ipf_fastroute+2582 does the call to ipf_state_check() and jumps back to +1794
 
The compiler seems to assume that fr cannot be NULL here, but I can't find
on what basis. Any idea how I could force a NULL check here?

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-admin@muc.de



#3594

From: Manuel Bouyer <bouyer@antioche.eu.org>
Date: 2026-05-23 12:55 +0200
Message-ID: <ahGHlgSY-CcScwUG@antioche.eu.org>
In reply to: #3593

[Multipart message — attachments visible in raw view] — view raw

On Sat, May 23, 2026 at 10:59:36AM +0200, Manuel Bouyer wrote:
> Hello,
> I tried upgrading a server to netbsd-11 and quickly got a panic
> in ipf:
> [ 150.0240120] fatal page fault in supervisor mode
> [ 150.0586271] trap type 6 code 0 rip 0xffffffff8056dffa cs 0x8 rflags 0x10286 cr2 0xec ilevel 0x4 rsp 0xffff870268a63a50
> [ 150.1225901] curlwp 0xffff869617c47400 pid 0.3 lowest kstack 0xffff870268a5f2c0
> [ 150.1657501] panic: trap
> [ 150.1803103] cpu0: Begin traceback...
> [ 150.2016313] vpanic() at netbsd:vpanic+0x171
> [ 150.2257081] panic() at netbsd:panic+0x3c
> [ 150.2473911] trap() at netbsd:trap+0xb43
> [ 150.2743808] --- trap (number 6) ---
> [ 150.2951811] ipf_fastroute() at netbsd:ipf_fastroute+0x6ea
> [ 150.3266935] ipf_send_ip() at netbsd:ipf_send_ip+0x127
> [ 150.3544099] ipf_check() at netbsd:ipf_check+0xfd5
> [ 150.3859226] pfil_run_hooks() at netbsd:pfil_run_hooks+0x11e
> [ 150.4164994] ipintr() at netbsd:ipintr+0x21e
> [ 150.4451003] softint_dispatch() at netbsd:softint_dispatch+0x112
> 
> ipf_fastroute+0x6ea points to external/bsd/ipf/netinet/ip_fil_netbsd.c
> line 1200:
>                 if (!fr || !(fr->fr_flags & FR_RETMASK)) {
> 
> 0xec matches the offset of fr_flags in struct frentry_t
> This code shouldn't dereference fr_flags if fr is NULL.
> 
> The assembly code matching this part of ipf_fastroute() is:
> 1219                    fin->fin_fr = NULL;
>    0xffffffff8056dfeb <+1755>:  movq   $0x0,0x10(%r15)
> 
> 1220                    if (!fr || !(fr->fr_flags & FR_RETMASK)) {
>    0xffffffff8056dff3 <+1763>:  mov    -0xd8(%rbp),%r8
>    0xffffffff8056dffa <+1770>:  testl  $0x3000,0xec(%r8)
>    0xffffffff8056e005 <+1781>:  mov    -0xf0(%rbp),%r9
>    0xffffffff8056e00c <+1788>:  je     0xffffffff8056e326 <ipf_fastroute+2582>
>    0xffffffff8056e012 <+1794>:  mov    %r9,-0xd8(%rbp)
> 
> 1224                    }
> 
> ipf_fastroute+2582 does the call to ipf_state_check() and jumps back to +1794
>  
> But it seems to assume that fr cannot be NULL here but I can't find
> on which basis. Any idea how I could force a NULL check here ?

One more data point:
on netbsd-10, the same code is compiled as:
1217                    if (!fr || !(fr->fr_flags & FR_RETMASK)) {
   0xffffffff8056aaa0 <+1811>:  test   %r14,%r14
   0xffffffff8056aaa3 <+1814>:  je     0xffffffff8056aae2 <ipf_fastroute+1877>
   0xffffffff8056aaa5 <+1816>:  testl  $0x3000,0xec(%r14)
   0xffffffff8056aab0 <+1827>:  je     0xffffffff8056aae2 <ipf_fastroute+1877>

1221                    }

On netbsd-11, building ipf_fastroute() with -O0 or -O1 makes the NULL test
show up in the assembly. With -O2 it is not present.

I'm now running a kernel with the attached patch; let's see how it works.
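
The attached patch is not reproduced in this archive. As a rough sketch only, one way to confine a lowered optimization level to a single function with GCC is the push_options / optimize / pop_options pragma sequence. The struct and function below are hypothetical stand-ins, and the 0x3000 mask is inferred from the testl instruction in the disassembly.

```c
#include <stddef.h>

#define RETMASK 0x3000u		/* inferred from "testl $0x3000" above */

struct rule {			/* hypothetical stand-in for frentry_t */
	unsigned int flags;
};

#pragma GCC push_options
#pragma GCC optimize ("O1")	/* applies to functions defined until pop_options */

/* Returns 1 when there is no rule, or the rule has none of the RETMASK bits. */
static int
needs_state_check(const struct rule *fr)
{
	return fr == NULL || (fr->flags & RETMASK) == 0;
}

#pragma GCC pop_options
```

GCC's documentation describes the optimize pragma and attribute as debugging aids rather than production tools, so this is better read as a way to narrow down a miscompilation than as a permanent fix.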

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



#3598

From: Manuel Bouyer <bouyer@antioche.eu.org>
Date: 2026-05-23 20:45 +0200
Message-ID: <ahH14aSWZYZuO9Uw@antioche.eu.org>
In reply to: #3594

It's easy to reproduce, see PR toolchain/60289

Hopefully we don't have many places with this kind of construct in our
sources.
Do we have someone in touch with the gcc folks here?

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



#3595

From: Valery Ushakov <uwe@stderr.spb.ru>
Date: 2026-05-23 14:22 +0300
Message-ID: <ahGN7po8yS38pkwo@snips.stderr.spb.ru>
In reply to: #3593

On Sat, May 23, 2026 at 10:59:36 +0200, Manuel Bouyer wrote:

> But it seems to assume that fr cannot be NULL here but I can't find
> on which basis. Any idea how I could force a NULL check here ?

-fno-delete-null-pointer-checks

-uwe



#3596

From: Manuel Bouyer <bouyer@antioche.eu.org>
Date: 2026-05-23 13:28 +0200
Message-ID: <ahGPX78t_ujoLPGi@antioche.eu.org>
In reply to: #3595

On Sat, May 23, 2026 at 02:22:22PM +0300, Valery Ushakov wrote:
> On Sat, May 23, 2026 at 10:59:36 +0200, Manuel Bouyer wrote:
> 
> > But it seems to assume that fr cannot be NULL here but I can't find
> > on which basis. Any idea how I could force a NULL check here ?
> 
> -fno-delete-null-pointer-checks

thanks, but it doesn't seem to do the job here. With
#pragma GCC optimize ("no-delete-null-pointer-checks")
instead of
#pragma GCC optimize ("O1")

I get:

1221                    if (!fr || !(fr->fr_flags & FR_RETMASK)) {
   0xffffffff8056dff3 <+1763>:  mov    -0xd8(%rbp),%r8
   0xffffffff8056dffa <+1770>:  mov    -0xf0(%rbp),%r9
   0xffffffff8056e001 <+1777>:  testl  $0x3000,0xec(%r8)
   0xffffffff8056e00c <+1788>:  je     0xffffffff8056e326 <ipf_fastroute+2582>
   0xffffffff8056e012 <+1794>:  mov    %r9,-0xd8(%rbp)

1225                    }

> 
> -uwe
> 

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



#3597

From: Roland Illig <roland.illig@gmx.de>
Date: 2026-05-23 20:38 +0200
Message-ID: <430a61e7-c38b-49f1-907e-7a9c14fbc305@gmx.de>
In reply to: #3593

Am 23.05.2026 um 10:59 schrieb Manuel Bouyer:
> Hello,
> I tried upgrading a server to netbsd-11 and quickly got a panic
> in ipf:
> [ 150.0240120] fatal page fault in supervisor mode
> [ 150.0586271] trap type 6 code 0 rip 0xffffffff8056dffa cs 0x8 rflags 0x10286 cr2 0xec ilevel 0x4 rsp 0xffff870268a63a50
> [ 150.1225901] curlwp 0xffff869617c47400 pid 0.3 lowest kstack 0xffff870268a5f2c0
> [ 150.1657501] panic: trap
> [ 150.1803103] cpu0: Begin traceback...
> [ 150.2016313] vpanic() at netbsd:vpanic+0x171
> [ 150.2257081] panic() at netbsd:panic+0x3c
> [ 150.2473911] trap() at netbsd:trap+0xb43
> [ 150.2743808] --- trap (number 6) ---
> [ 150.2951811] ipf_fastroute() at netbsd:ipf_fastroute+0x6ea
> [ 150.3266935] ipf_send_ip() at netbsd:ipf_send_ip+0x127
> [ 150.3544099] ipf_check() at netbsd:ipf_check+0xfd5
> [ 150.3859226] pfil_run_hooks() at netbsd:pfil_run_hooks+0x11e
> [ 150.4164994] ipintr() at netbsd:ipintr+0x21e
> [ 150.4451003] softint_dispatch() at netbsd:softint_dispatch+0x112
> 
> ipf_fastroute+0x6ea points to external/bsd/ipf/netinet/ip_fil_netbsd.c
> line 1200:
>                 if (!fr || !(fr->fr_flags & FR_RETMASK)) {

line 1214 says:
> 	if ((fdp != &fr->fr_dif) && (fin->fin_out == 0)) {

After this point, fr is guaranteed to be non-null, as the expression
&fr->fr_dif would invoke undefined behavior, even though no memory near
the null pointer would be accessed.

C99 6.5.3.2 doesn't explicitly mention taking the address of
nullptr->member, but it may be possible to construct a valid argument
using that section.

Then, in line 1220, fr cannot be null anymore since that variable is not
re-assigned anywhere nearby.
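
The shape of the code being described can be sketched like this. The types and names are hypothetical stand-ins for the ipf ones, and the 0x3000 mask is inferred from the disassembly; the point is only the order of operations, with the member address formed before the NULL test.

```c
#include <stddef.h>

#define RETMASK 0x3000u		/* inferred from "testl $0x3000" */

struct dest { int unused; };	/* hypothetical stand-in for the fr_dif type */
struct rule {			/* hypothetical stand-in for frentry_t */
	unsigned int	flags;
	struct dest	dif;
};

/* Bit 0: fdp differs from the rule's own destination.
 * Bit 1: the later "no rule, or no RETMASK bits" branch is taken.
 * Calling this with fr == NULL is undefined behaviour. */
static int
late_null_test(const struct rule *fr, const struct dest *fdp)
{
	int result = 0;

	if (fdp != &fr->dif)		/* compiler may infer fr != NULL here */
		result |= 1;
	if (!fr || !(fr->flags & RETMASK))	/* so the !fr half may be dropped */
		result |= 2;
	return result;
}
```

For non-NULL fr both forms behave the same; the difference only appears when fr is NULL, which is exactly the case the optimizer is entitled to ignore under this reading of the standard.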

Roland




#3599

From: Manuel Bouyer <bouyer@antioche.eu.org>
Date: 2026-05-23 20:47 +0200
Message-ID: <ahH2NFJ58JQpjOBi@antioche.eu.org>
In reply to: #3597

On Sat, May 23, 2026 at 08:38:03PM +0200, Roland Illig wrote:
> Am 23.05.2026 um 10:59 schrieb Manuel Bouyer:
> > Hello,
> > I tried upgrading a server to netbsd-11 and quickly got a panic
> > in ipf:
> > [ 150.0240120] fatal page fault in supervisor mode
> > [ 150.0586271] trap type 6 code 0 rip 0xffffffff8056dffa cs 0x8 rflags 0x10286 cr2 0xec ilevel 0x4 rsp 0xffff870268a63a50
> > [ 150.1225901] curlwp 0xffff869617c47400 pid 0.3 lowest kstack 0xffff870268a5f2c0
> > [ 150.1657501] panic: trap
> > [ 150.1803103] cpu0: Begin traceback...
> > [ 150.2016313] vpanic() at netbsd:vpanic+0x171
> > [ 150.2257081] panic() at netbsd:panic+0x3c
> > [ 150.2473911] trap() at netbsd:trap+0xb43
> > [ 150.2743808] --- trap (number 6) ---
> > [ 150.2951811] ipf_fastroute() at netbsd:ipf_fastroute+0x6ea
> > [ 150.3266935] ipf_send_ip() at netbsd:ipf_send_ip+0x127
> > [ 150.3544099] ipf_check() at netbsd:ipf_check+0xfd5
> > [ 150.3859226] pfil_run_hooks() at netbsd:pfil_run_hooks+0x11e
> > [ 150.4164994] ipintr() at netbsd:ipintr+0x21e
> > [ 150.4451003] softint_dispatch() at netbsd:softint_dispatch+0x112
> > 
> > ipf_fastroute+0x6ea points to external/bsd/ipf/netinet/ip_fil_netbsd.c
> > line 1200:
> >                 if (!fr || !(fr->fr_flags & FR_RETMASK)) {
> 
> line 1214 says:
> > 	if ((fdp != &fr->fr_dif) && (fin->fin_out == 0)) {
> 
> After this point, fr is guaranteed to be non-null, as the expression
> &fr->fr_dif would invoke undefined behavior, even though no memory near
> the null pointer would be accessed.

We're just computing the address here; we're not saying anything
about its validity.

> 
> C99 6.5.3.2 doesn't explicitly mention taking the address of
> nullptr->member, but it may be possible to construct a valid argument
> using that section.

I guess there are still variants of offsetof() around which use this.

So what are you suggesting? Changing line 1214 to the following?
if ((fdp != &fr->fr_dif || fr == NULL) && (fin->fin_out == 0)) {

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



#3600

From: Robert Elz <kre@munnari.OZ.AU>
Date: 2026-05-24 02:29 +0700
Message-ID: <9186.1779564551@jacaranda.noi.kre.to>
In reply to: #3599

    Date:        Sat, 23 May 2026 20:47:16 +0200
    From:        Manuel Bouyer <bouyer@antioche.eu.org>
    Message-ID:  <ahH2NFJ58JQpjOBi@antioche.eu.org>

  | We're just taking computing the address here, we're not telling anything
  | about its validity.

I believe the argument is that you're computing an address based upon
a NULL pointer, which has unspecified representation, hence the result
is undefined (we know NULL is 0, but C doesn't).   One of the many
stupidities inflicted upon the C language by the compiler people who
want to be able to use every trick they can to be able to generate
faster code.

  | I guess there's still variants of offsetof() around which uses this.

Perhaps.

  | So what are you suggesting ? changing line 1214 to
  | if ((fdp != &fr->fr_dif || fr == NULL) && (fin->fin_out == 0)) {

It would need to be

	if ((fr == NULL || fdp != &fr->fr_dif) && fin->fin_out == 0) {

(Removing the parens around the == test is just style.)  The way you
wrote it, the compiler would just omit the fr == NULL test, since by doing
the address comparison first you've already promised that fr != NULL.
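
A minimal sketch of the reordered condition, using hypothetical stand-in types rather than the real ipf structures: because || short-circuits, the member address is only formed once fr is known to be non-NULL.

```c
#include <stddef.h>

struct dest { int unused; };	/* hypothetical stand-in for the fr_dif type */
struct rule {			/* hypothetical stand-in for frentry_t */
	unsigned int	flags;
	struct dest	dif;
};

/* Same condition as the suggested line 1214, with fin->fin_out passed in
 * as a plain int. For fr == NULL the right-hand side of || is never
 * evaluated, so &fr->dif is never formed from a NULL pointer. */
static int
early_null_test(const struct rule *fr, const struct dest *fdp, int fin_out)
{
	return (fr == NULL || fdp != &fr->dif) && fin_out == 0;
}
```

With this ordering a NULL fr is a defined input, so the compiler has no grounds to drop a later !fr test.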

kre





#3601

From: Mouse <mouse@Rodents-Montreal.ORG>
Date: 2026-05-23 19:11 -0400
Message-ID: <202605232311.TAA18437@Stone.Rodents-Montreal.ORG>
In reply to: #3600

> One of the many stupidities inflicted upon the C language by the
> compiler people who want to be able to use every trick they can to be
> able to generate faster code.

About all it means is that compilers that insist on doing that (ie,
lacking an option to turn off such over-aggressive inferences) are
broken, not suitable for use cases other than generating impressive
benchmark stats.  Compilers exist to be useful, after all, and this is
a significant impairment to their usefulness.

I'd say this should be filed as a bug with the gcc people; if they
can't/won't fix it, NetBSD should roll back to a previous gcc and/or
drop gcc as the system compiler, as then it'd be clearly unsuitable.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B



#3602

From: Martin Husemann <martin@duskware.de>
Date: 2026-05-24 10:44 +0200
Message-ID: <ahK6gS0miWFUSfxs@big-apple.aprisoft.de>
In reply to: #3601

On Sat, May 23, 2026 at 07:11:18PM -0400, Mouse wrote:
> I'd say this should be filed as a bug with the gcc people; if they
> can't/won't fix it, NetBSD should roll back to a previous gcc and/or
> drop gcc as the system compiler, as then it'd be clearly unsuitable.

It is not about gcc or clang, it is about the C standard, which is
created by an open committee, and you are welcome to send defect reports
and/or ask for clarifications. (Doing so is a slightly obfuscated process
but it actually works and you do get answers.)

The compiler using all the strength the standard gives it is just a logical
consequence of the user asking it to optimize. As Manuel showed earlier
in this thread, lowering the optimization level for this file does avoid
the issue.

But fixing the code to avoid "undefined behaviour" is the proper solution
and fits NetBSD's clean code mission well. This sometimes hits in unexpected
places, but on the other hand we have got better warnings and tools that
helped us eliminate lots of ancient bugs.

Martin



#3603

From: Mouse <mouse@Rodents-Montreal.ORG>
Date: 2026-05-24 08:33 -0400
Message-ID: <202605241233.IAA21053@Stone.Rodents-Montreal.ORG>
In reply to: #3602

>> I'd say this should be filed as a bug with the gcc people; if they
>> can't/won't fix it, NetBSD should roll back to a previous gcc and/or
>> drop gcc as the system compiler, as then it'd be clearly unsuitable.

(I perhaps should have added "as the self-hosting compiler for
NetBSD".)

> It is not about gcc or clang, it is about the C standard, which is
> created by an open comittee and you are welcome to send defect
> reports and/or ask for clarifictaions.

Which is fair - but compilers do not have to take advantage of all the
leeway the standard gives them.  "Undefined behaviour" can, after all,
include "exactly what the code author intended".

> The compiler using all strength the standard gives it is just a
> logical consequence of the user asking it to optimize.

Then, for the sake of being useful, it needs a setting that is "do most
optimizations, but don't make such unreasonable inferences about
formally-undefined behaviour".

Perhaps the gcc people are able-and-willing to do that.  But, if not, I
still maintain that it is thereby unsuitable for the use case of
building NetBSD.

The standard is an attempt to satisfy everyone; the result is a
compromise, and gives too much leeway for some use cases and not enough
for others - repeated, of course, for each point on which there is
potential leeway.  If &p->member is undefined behaviour when p is nil,
the correct thing for a *useful* compiler to do is not to simply assume
p is non-nil, at least not unless specifically told to; it is to test
for nil and take some suitable action - indirect through it, call a
(possibly invocation-dependent) routine, *something* better than
silently propagating the inference.  Especially in the presence of an
explicit test for nil later!

> But fixing the code to avoid "undefined behaviour" is the proper
> solution and fits NetBSD's clean code mission well.

Well, none of this is my call to make.  But I still think it excessive
for the compiler to cause formally-undefined behaviour to elide
explicit tests, especially when told to run freestanding.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B



#3604

From: Jason Thorpe <thorpej@me.com>
Date: 2026-05-24 11:38 -0400
Message-ID: <FAEA5318-0F20-4AF2-A3F2-9B99D2621C3F@me.com>
In reply to: #3603

Sorry for the top-post, reply below…

-- thorpej
Sent from my iPhone.

> On May 24, 2026, at 8:33 AM, Mouse <mouse@rodents-montreal.org> wrote:
> 
> Perhaps the gcc people are able-and-willing to do that.  But, if not, I
> still maintain that it is thereby unsuitable for the use case of
> building NetBSD.

I mean, I also find “compiler people” annoying in this regard, but this doesn’t render the compiler “unsuitable”.  Just fix the code and be done with it.  I do wish the compiler at least emitted a warning, however, when a NULL-check is elided for this reason.


#3605

From: "Greg A. Woods" <woods@planix.ca>
Date: 2026-05-24 16:53 -0700
Message-ID: <m1wRId5-00Mo5fC@more.local>
In reply to: #3604

[Multipart message — attachments visible in raw view] — view raw

At Sun, 24 May 2026 11:38:59 -0400, Jason Thorpe <thorpej@me.com> wrote:
Subject: Re: netbsd-11 gcc bug
> 
> > On May 24, 2026, at 8:33 AM, Mouse <mouse@rodents-montreal.org>
> > wrote:
> >  Perhaps the gcc people are able-and-willing to do that.  But,
> > if not, I still maintain that it is thereby unsuitable for the
> > use case of building NetBSD.
> 
> I mean, I also find "compiler people" annoying in this regard, but
> this doesn't render the compiler "unsuitable".  Just fix the code
> and be done with it.  I do wish the compiler at least emitted a
> warning, however, when a NULL-check is elided for this reason.

Indeed, especially for the warnings!  (Does UBSAN help?)

The ultimate fix may be to convince the C standards committee to
eliminate all cases of "undefined behaviour" from the standard.

Converting most cases to "implementation defined behaviour" should be
one option.  Some others may require whacking some sense into some
vendors of esoteric systems.

There are apparently some attempts afoot to do this (according to Robert
C. Seacord, the current convenor of the committee if I understand
correctly).


-- 
					Greg A. Woods <gwoods@acm.org>

Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>



#3606

From: David Holland <dholland-tech@netbsd.org>
Date: 2026-05-25 00:02 +0000
Message-ID: <ahORiW66-m7o7YK4@netbsd.org>
In reply to: #3603

On Sun, May 24, 2026 at 08:33:19AM -0400, Mouse wrote:
 > > It is not about gcc or clang, it is about the C standard, which is
 > > created by an open comittee and you are welcome to send defect
 > > reports and/or ask for clarifictaions.
 > 
 > Which is fair - but compilers do not have to take advantage of all the
 > leeway the standard gives them.  "Undefined behaviour" can, after all,
 > include "exactly what the code author intended".
 >
 > [...]

No, formally it can't, because "undefined" means "meaningless".

The root of these problems is that long ago a bunch of the undefined
and unspecified behavior in C was understood as being that way to
leave room for exotic hardware, and there was a corresponding
expectation that the compiler would do the obvious thing and the
exotic or not-so-exotic behavior of your actual hardware would
determine the behavior.

Correspondingly, people would write code assuming they were never
going to run it on a 286, or a one's complement machine, in the
presence of Multics segments, or whatever, and it worked, and nobody
worried about it.

This understanding had already disappeared entirely upstream more than
twenty years ago now, and it's a lost cause. You can't write machine-
dependent C that way any more; every available compiler will mangle
it.

There are many reasons for this. One is that most of the exotic
hardware that room was left for is dead and forgotten. Why are there
holes in the standard related to signed shifts and signed integer
overflow? There's no reason whatsoever now. It's been a long time
since anyone bothered with hardware that didn't have a signed right
shift instruction, and longer still since anyone had real hardware
whose signed integers weren't two's complement. Normal people don't
consider DS9000s irrelevant; they've never even got as far as having
heard of the idea.
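
As an illustration of what the signed-overflow hole means for code today, here is a minimal sketch of an addition that avoids the undefined case by testing the operands first. The function name is made up for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *sum and returns true, or returns false (leaving *sum
 * untouched) when the mathematical result does not fit in an int.
 * The range test is done on the operands, so the function itself never
 * performs an overflowing signed addition. */
static bool
checked_add(int a, int b, int *sum)
{
	if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
		return false;
	*sum = a + b;
	return true;
}
```

Testing after the fact, for example with a + b < a, does not work as a substitute: the overflow has already happened by then, and the compiler may assume it never does.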

Another is that the existence of C++ has prevented fixing anything in
C; thus it hasn't kept up with compiler technology. The places where
certain bits of undefined and unspecified behavior line up exactly
with things that mattered in a compiler written forty years ago are no
longer obvious, and so what's left is a vague general belief that the
UB in the C standard is meant to be room for optimizers.

A third is that few people in the programming languages and compilers
world, especially the academic parts of it, have any appreciation for
(or real understanding of) C and think of it as a design bug with no
redeeming features. This (at best) makes exploiting undefined behavior
into a game, like writing demos for old weird hardware. Not only is
there no understanding of what the affordances might have been
intended for, there's not even any awareness that this is even a point
to consider.

You can try pushing back, but you aren't going to get anywhere.

 > If &p->member is undefined behaviour when p is nil,
 > the correct thing for a *useful* compiler to do is not to simply assume
 > p is non-nil, at least not unless specifically told to; it is to test
 > for nil and take some suitable action - indirect through it, call a
 > (possibly invocation-dependent) routine, *something* better than
 > silently propagating the inference.  Especially in the presence of an
 > explicit test for nil later!

...no. Inserting extra code is never going to fly, all that'll ever
result from that is people thinking your compiler is broken. The
correct thing here is to warn that the null test downstream is
redundant at the same time as removing it. There are a few cases where
it's hard to generate meaningful warnings from UB inference, and
therefore at least sort of excusable not to, but this isn't one of
them.

A smarter compiler or program checker might instead note the explicit
null checks elsewhere, conclude that it's intended to be possible for
the pointer to be null, and warn about the missing null test at the
point that does &p->x. That requires marginally more work than just
tracking whether a pointer can be null (instead you need to track
whether we know it can't be null, whether we know it can, or if we
aren't sure) but isn't exactly difficult.
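
The three-way tracking described above can be sketched as a tiny state machine. This is a toy model with made-up names, not how gcc or any real checker represents the information.

```c
#include <stdbool.h>

/* What the analysis believes about one pointer at one program point. */
enum nullness {
	UNSURE,			/* nothing learned yet */
	CAN_BE_NULL,		/* an explicit NULL test elsewhere says NULL is expected */
	CANNOT_BE_NULL		/* a dominating check has already ruled NULL out */
};

/* Fact recorded when the code elsewhere tests the pointer against NULL. */
static enum nullness
saw_null_test_elsewhere(enum nullness n)
{
	return n == CANNOT_BE_NULL ? n : CAN_BE_NULL;
}

/* Should forming &p->x at this point be reported as lacking a NULL test? */
static bool
warn_at_member_address(enum nullness n)
{
	return n == CAN_BE_NULL;
}
```

With only two states (null possible or not) the UNSURE case would have to be folded into one of the others, which either floods the user with warnings or hides the interesting ones.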

-- 
David A. Holland
dholland@netbsd.org



#3607

From: Mouse <mouse@Rodents-Montreal.ORG>
Date: 2026-05-25 08:23 -0400
Message-ID: <202605251223.IAA08565@Stone.Rodents-Montreal.ORG>
In reply to: #3606

>> Which is fair - but compilers do not have to take advantage of all
>> the leeway the standard gives them.  "Undefined behaviour" can,
>> after all, include "exactly what the code author intended".

> No, formally it can't, because "undefined" means "meaningless".

No, it means the compiler can do anything whatever.  As C99 puts it,

       3.4.3
       [#1] undefined behavior
       behavior,  upon  use  of  a nonportable or erroneous program
       construct or of erroneous data, for which this International
       Standard imposes no requirements

This can include generating code based on interpreting the construct in
question in any way whatever, including the one the code author
intended.

The construct may in some sense be formally meaningless, in that there
is no meaning formally defined for it, but that does not mean the
compiler can't choose to give it meaning.

As a simple example involving gcc, nested functions are formally UB (at
least as of C99), but gcc chooses to give them meaning.

> The root of these problems is that long ago a bunch of the undefined
> and unspecified behavior in C was understood as being that way to
> leave room for exotic hardware, and there was a corresponding
> expectation that the compiler would do the obvious thing and the
> exotic or not-so-exotic behavior of your actual hardware would
> determine the behavior.

Yes.  It was intended to be _useful_.

That is, after all, what compilers are for: to be useful.

> This understanding had already disappeared entirely upstream more
> than twenty years ago now,

Not entirely.  You can cast between pointers and integers and get
something useful - though that's IB rather than UB, so the compiler is
_somewhat_ constrained, but it would be formally fine (and likely
faster) for all casts from integers to pointers to produce nil
pointers.  Why is that not done?  Fundamentally, because it would make
the compiler less useful.  But somehow this, which also makes the
compiler less useful, is considered acceptable.  I don't get it.
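
A minimal sketch of the pointer/integer conversion mentioned above, assuming the implementation provides uintptr_t (it is an optional type). The function name is made up for this example.

```c
#include <stdint.h>

/* Converts a pointer to an integer and back. The integer value itself is
 * implementation-defined, but where uintptr_t exists the standard requires
 * the round-tripped pointer to compare equal to the original. */
static void *
through_integer(void *p)
{
	uintptr_t bits = (uintptr_t)p;

	return (void *)bits;
}
```

The guarantee only covers a value that came from a valid pointer in the first place; converting an arbitrary integer to a pointer is where the implementation-defined latitude discussed above applies.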

> A third is that few people in the programming languages and compilers
> world, especially the academic parts of it, have any appreciation for
> (or real understanding of) C and think of it as a design bug with no
> redeeming features.

Such people have no business implementing C compilers, any more than I
have any business implementing an OCAML compiler.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B



#3613 — C compiler (over-)optimization (was: netbsd-11 gcc bug)

From: Edgar Fuß <ef@math.uni-bonn.de>
Date: 2026-05-29 19:12 +0200
Subject: C compiler (over-)optimization (was: netbsd-11 gcc bug)
Message-ID: <ahnJE9DCIaDoAEHf@trav.math.uni-bonn.de>
In reply to: #3607

> That is, after all, what compilers are for: to be useful.
This makes me believe the problem is much simpler, worse, and lies deeper.
The question is: useful for whom?

Some of the people programming in C, especially system programmers (for
whom the language was originally designed), obviously (or so I think)
regard C as a kind of abstract portable assembler (which, as far as I
understand, B and K&R C were sort of intended to be), are knowledgeable
in assembly language(s), and read
	*p++ = q->foo_ch;
as
	MOV.B 37(A5), (A6)+

Then, because C was so widely available (because it was used to write OSes
and without an OS, there would be no point in writing applications),
application programmers started to use C to write, well, applications
(I'm not talking of Unix userland, which was probably mostly written by
system programmers).
And they read the above C code as
	abstract_assignment(
	    source_type: abstract_char,
	    destination_type: abstract_char,
	    destination: abstract_pointer_dereference(p),
	    source: abstract_struct_type_fetch(type: char,
	        base: abstract_pointer_dereference(q), field: "ch"));
	abstract_pointer_increment(type: abstract_char, pointer: p);
	sequence_point;
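
Both readings describe the same statement. A minimal sketch, with a hypothetical struct foo standing in for whatever q points to and made-up function names, spells out the one-liner next to its step-by-step equivalent:

```c
/* Hypothetical structure; only the foo_ch member matters here. */
struct foo {
    int  foo_len;
    char foo_ch;
};

/* The one-liner from the text: store the byte, then advance p. */
char *put_terse(char *p, const struct foo *q)
{
    *p++ = q->foo_ch;
    return p;
}

/* The same operation written as separate steps. */
char *put_verbose(char *p, const struct foo *q)
{
    char c = q->foo_ch;  /* fetch the member through q */
    *p = c;              /* store through p */
    p = p + 1;           /* advance p by one char */
    return p;
}
```

Either function stores the same byte and returns the same advanced pointer; the disagreement is about how one thinks of the statement, not about what it does here.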

Parallel to that, two things happened to real computers:
1. sign-magnitude, one's complement, 6-bit-chars and 36-bit-words disappeared
2. SMP, out-of-order-execution, speculative execution etc. appeared

And then, C, the gory details of which used to be de-facto-defined by the
existing compiler(s), written by system programmers in a way the language
was useful to them, was standardized (still supporting
6-bit-one's-complement-chars and 36-bit longs), trying to abstractly define
what was common sense for the semantics before. On top of that, compiler
authors who weren't (or didn't care for) system programmers tried to make
the language "useful" (e.g. 0.03% faster in edge cases) for application
programmers, obeying the words of the standard.

That can't work.

I'm afraid the only way out would be to abandon C as a system programming 
language, invent a new one that, e.g., forbids optimizing away null pointer 
tests because 42 lines above, there's an expression that is formally UB 
if NULL is 0x12345678.
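
The kind of code at issue looks roughly like the sketch below (struct dev and both function names are invented for illustration). In read_flags_risky the dereference comes before the test, so a compiler is allowed to assume d is non-null and drop the test; gcc's -fno-delete-null-pointer-checks option exists to switch off that particular inference. read_flags_checked tests first and has no such problem.

```c
#include <stddef.h>

struct dev {
    int flags;
};

/* Dereference first, test later: if d is NULL the first line is already
 * undefined, so an optimizer may remove the test below it. */
int read_flags_risky(const struct dev *d)
{
    int flags = d->flags;
    if (d == NULL)
        return -1;
    return flags;
}

/* Test first, then dereference: the check has to stay. */
int read_flags_checked(const struct dev *d)
{
    if (d == NULL)
        return -1;
    return d->flags;
}
```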

Of course that's not going to happen.



#3614 — Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)

From: Mouse <mouse@Rodents-Montreal.ORG>
Date: 2026-05-29 14:21 -0400
Subject: Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)
Message-ID: <202605291821.OAA24702@Stone.Rodents-Montreal.ORG>
In reply to: #3613

>> That is, after all, what compilers are for: to be useful.
> This makes me believe the problem is much simpler, worse, and lies
> deeper.  The question is: useful for whom?

Well, yes, that is an important addendum.

At one point, "their vendors" would have been the obvious answer.  But
today the biggest two options are both open source, so, while that
answer is still valid to an extent (modulo s/vendor/supplier/), it's a
great deal fuzzier what the exact utility metric is.

> One part of people programming in C, especially system programmers
> (for whom the language has originally been designed) obviously (or so
> I think) regard C as a kind of abstract portable assembler (which, as
> far as I understand, B and K&R C were sort of intended to be), [...]

> Then, because C was so widely available (because it was used to write
> OSes and without an OS, there would be no point in writing
> applications), application programmers started to use C to write,
> well, applications [...]

Yes.  That matches my own understanding.

> Parallel to that, two things happened to real computers:
> 1. sign-magnitude, one's complement, 6-bit-chars and 36-bit-words disappeared
> 2. SMP, out-of-order-execution, speculative execution etc. appeared

Well, out-of-order and speculative execution are not all that relevant
here.  They don't affect the C abstract machine, not even today's C
abstract machine.  In theory, they don't even affect the
machine-language abstract machine.

SMP is a little more relevant, but not much.

> I'm afraid the only way out would be to abandon C as a system
> programming language, invent a new one that, e.g., forbids optimizing
> away null pointer tests because 42 lines above, there's an expression
> that is formally UB if NULL is 0x12345678.

I don't think so.

I think you/we just need two flavours of C compilers: ones for
applications and ones for OS implementations (or possibly three, with
the third one being for benchmarks).  Formally-undefined behaviour is
not a problem, after all, if you can count on having a compiler that
turns it into what you actually want.  A compiler can define, document,
and support semantics for something that is formally UB, after all;
that is entirely within the leeway the spec permits.
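
Signed integer overflow is one case where this already happens: ISO C leaves it undefined, while gcc accepts -fwrapv, which defines it as two's-complement wraparound. The sketch below (function names are hypothetical) puts a test that relies on wrapping next to one that needs no special flag.

```c
#include <limits.h>

/* Relies on wraparound: only meaningful when the compiler defines
 * signed overflow (for example, when built with -fwrapv).  Without
 * that, the comparison may be folded to 0. */
int next_overflows_wrapping(int x)
{
    return x + 1 < x;
}

/* Portable version: never performs the overflowing addition. */
int next_overflows_portable(int x)
{
    return x == INT_MAX;
}
```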

What you really need is a compiler supplier that understands that
usefulness is the true metric of goodness, as opposed to benchmarks or
even strict spec conformance, that "but it conforms to the spec" is not
an appropriate response to a bug report such as might have come out of
the example that started this discussion.  Well, understands that and
considers the use cases you care about important enough to supply a
compiler for.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B



#3615 — Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)

From: "Greg A. Woods" <woods@planix.ca>
Date: 2026-05-29 14:27 -0700
Subject: Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)
Message-ID: <m1wT4jz-00Mo5fC@more.local>
In reply to: #3614

At Fri, 29 May 2026 14:21:18 -0400 (EDT), Mouse <mouse@Rodents-Montreal.ORG> wrote:
Subject: Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)
>
> I think you/we just need two flavours of C compilers: ones for
> applications and ones for OS implementations (or possibly three, with
> the third one being for benchmarks).  Formally-undefined behaviour is
> not a problem, after all, if you can count on having a compiler that
> turns it into what you actually want.  A compiler can define, document,
> and support semantics for something that is formally UB, after all;
> that is entirely within the leeway the spec permits.
>
> What you really need is a compiler supplier that understands that
> usefulness is the true metric of goodness, as opposed to benchmarks or
> even strict spec conformance, that "but it conforms to the spec" is not
> an appropriate response to a bug report such as might have come out of
> the example that started this discussion.  Well, understands that and
> considers the use cases you care about important enough to supply a
> compiler for.

The problem in the narrower context of open-source operating systems is
that we have basically two compiler "vendors" that egg each other on and
play "keep up with the Joneses" with each other, and that includes
"abusing" UB to make optimisations and do other "unexpected" things that
basically break legacy code.  I guess it could be worse and there could
be only one.

So what you're basically asking for is more or less what Edgar proposed:
A new systems programming language (that is effectively a variant of
"Standard C") -- and one that forbids being initially implemented as a
front-end for either GCC or LLVM where too much happens under the hood
(i.e. in the backend, including LTO).

Maybe we go back to PCC and make sure it evolves to conform to a variant
of C where there is no UB -- only IDB, i.e. as you said, the compiler
explicitly defines semantics for everything that the C Lawyers call UB.

And then maybe if that works well enough and is popular enough it could
be extended as a semi-formal new variant of C that the other compiler
vendors could implement.  Then the whole slew/slough of
"-fno-break-this-construct" flags for those other vendors' compilers
would turn into just one:  "-std=NewC" (or maybe we call it "L", as I
propose in my notes on C).
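
Flags of that sort do exist today: -fno-strict-aliasing, -fwrapv and -fno-delete-null-pointer-checks are each accepted by gcc. As a sketch of what one of them papers over (the function names are made up, and a 32-bit IEEE 754 float is assumed), here is the classic type pun that needs -fno-strict-aliasing to be dependable, next to a memcpy version that needs no flag.

```c
#include <stdint.h>
#include <string.h>

/* Reads a float through a uint32_t lvalue: this violates the aliasing
 * rules, so it is only dependable with -fno-strict-aliasing. */
uint32_t float_bits_pun(const float *f)
{
    return *(const uint32_t *)f;
}

/* Copies the bytes instead: defined behaviour without any flag,
 * assuming float and uint32_t have the same size. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```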

--
					Greg A. Woods <gwoods@acm.org>

Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>


#3617 — Re: C compiler (over-)optimization

From: Anders Magnusson <ragge@tethuvudet.se>
Date: 2026-05-30 12:30 +0200
Subject: Re: C compiler (over-)optimization
Message-ID: <6baa057d-3ba2-40bf-aef6-4bdaa78e1d48@tethuvudet.se>
In reply to: #3615

On 2026-05-29 at 23:27, Greg A. Woods wrote:
> At Fri, 29 May 2026 14:21:18 -0400 (EDT), Mouse <mouse@Rodents-Montreal.ORG> wrote:
> So what you're basically asking for is more or less what Edgar proposed:
> A new systems programming language (that is effectively a variant of
> "Standard C") -- and one that forbids being initially implemented as a
> front-end for either GCC or LLVM where too much happens under the hood
> (i.e. in the backend, including LTO).
> 
This all depends on which approach you take.

A looong time ago I was talking to some people from HP about their
system compiler in HP-UX, since IIRC there was one system C compiler and
another ANSI C compiler.
They stated that it was a strategic decision to use the old stable
compiler for the system itself to really avoid surprises, since the
system itself is not that performance sensitive.
That way, HP was able to use a compiler with all the fancy new features
and optimizations without breaking anything else.

So, there is prior art for having a special system compiler.

-- R



#3618 — Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)

From: Andrew Cagney <andrew.cagney@gmail.com>
Date: 2026-05-31 13:02 -0400
Subject: Re: C compiler (over-)optimization (was: netbsd-11 gcc bug)
Message-ID: <CAJeAr6srT0w2HN+MDoPUiPcfw2tFs+4ds+NiN5AaT-Xk3eoo2Q@mail.gmail.com>
In reply to: #3615

> Maybe we go back to PCC and make sure it evolves to conform to a variant
> of C where there is no UB -- only IDB, i.e. as you said, the compiler
> explicitly defines semantics for everything that the C Lawyers call UB.

That runs the risk of compiler and language lock-in.

I'm more worried about things like this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125524


