From: David Brown
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour in C23
Date: Sat, 23 Aug 2025 16:55:13 +0200
Organization: Compilers Central
Message-ID: <25-08-012@comp.compilers>
In-Reply-To: <25-08-011@comp.compilers>
Keywords: C, standards

On 23/08/2025 00:11, Keith Thompson wrote:
> comp.lang.c would probably be a better place for this discussion,
> but cross-posting between moderated and unmoderated newsgroups is
> likely to cause problems.

Yes - but some comments have also wandered slightly from being just
applicable to C.  Still, it is not really a compiler discussion.

[FYI, cross-posting to comp.compilers and other groups works because
your moderator's scripts know how to handle it. -John]

> David Brown writes:
>> On 21/08/2025 21:53, Keith Thompson wrote:
> [...]
>> If you declare and call a function "foo" that is written in fully
>> portable C code, but not part of the current translation unit being
>> compiled (perhaps it has been separately compiled or included in a
>> library), then it would be UB by the section 4 definition (since the C
>> standards don't say anything about what "foo" does, nor does your code).
>
> If the translation unit that defined "foo" is part of your program, then
> your code *does* define its behavior.  Linking multiple translation
> units into a program is specified by the C standard; it's translation
> phase 8.

No.  The C standard does not define how this linking or combining is
done - it only covers certain specific aspects of the linking that
relate directly to C.  The behaviour of the function "foo" here is not
defined in the C standards, and if the source code is not available when
translating a different translation unit, the behaviour of "foo" is
undefined.

>> But the code that calls "foo" is portable and not erroneous, so it is
>> not UB by the section 3 definition.
>
> If "foo" is defined by your program, either in the current
> translation unit or in another one, the call is well defined
> (assuming "foo" doesn't do something silly like dividing by zero).
> If "foo" is defined outside your program, the C standard has nothing
> to say about it.  It could even be implemented in a language other
> than C.
>
> The *behavior* of such a call is not portable.  (And the execution of
> such a call is definitely undefined behavior if the visible declaration
> is inconsistent with the definition.)

The C code being translated has code to call the function - the call is
defined (assuming declarations and definitions are consistent), but the
effect of the call is not defined - it is therefore UB.

>
> The section 3 definition of "undefined behavior" is a bit informal.

And yet it is in the section labelled "Terms, definitions and symbols".

> It's not clear what it means by "erroneous", for example.  Section
> 4 is more precise, and states that UB can be indicated "by the
> omission of any explicit definition of behavior" (in the standard).
> The standard omits any definition of the behavior of foo().
>

I agree that the definitions are somewhat vague and missing details, but
I also think they are somewhat inconsistent.

> [...]
>
>> Add to that, the C standard has a specific term for features that are
>> non-portable but not undefined behaviour - "implementation-defined
>> behaviour".  Code that relies on "int" being 32-bit is not portable, but
>> it is not UB when compiled on implementations for which "int" /is/ 32-bit.
>
> That's not what "implementation-defined behavior" means in C.
> Cases of implementation-defined behavior are explicitly called out in
> the standard, and an implementation must document how it treats each
> instance of implementation-defined behavior.  Each implementation
> must document the range of int.  There is no such requirement for
> the behavior of "foo" defined in some non-standard header.

Yes, exactly - implementation-defined behaviours are things that are not
portable, but are not undefined behaviour, because they must be defined
by the implementation.  (The C standard usually also gives some specific
options or minimum requirements for those definitions.)

>
>>> No, a bug in your code is not necessarily undefined behavior.  It could
>>> easily be code whose behavior is well defined by the language standard,
>>> but that behavior isn't what the programmer intended.
>>
>> When I write code, /I/ define what the behaviour of the code should be.
>> A bug in the code means it is not acting according to my definitions -
>> it is UB.  It may still be acting according to the definitions of the C
>> abstract machine given in the C standards (you are correct there).  Even
>> if it has C-standard UB, it will still be acting according to the
>> definitions of the target machine's instruction set.  Behaviour is
>> defined on multiple levels, only one of which is the C standard.
>
> "Undefined behavior" is a technical term defined by the C standard.
> It's not just behavior that is not defined.

Section 4 says precisely that behaviour that is not defined by the C
standard is "undefined behaviour" in exactly the same way as things that
are explicitly labelled "undefined behaviour" in the standard.

And that, I think, is the root of the problem - the C standard is on the
one hand trying to classify, define and describe things as "undefined
behaviour" as a technical term in the C standard, while on the other
hand it is also trying to say these are things that have no definition
or description of their behaviour.

> It is behavior that
> is not defined *by the C standard*.  If I write printf("goodbye\n")
> when I meant to write printf("hello\n"), that's incorrect behavior,
> but it's not undefined behavior.
>

I agree that it is not C undefined behaviour, yes.  But it can be
undefined behaviour at a higher level in the design and specification of
the program.  As I see it, programming is the process of taking a
higher-level specification of a task down through layers until you have
something that is executable on a computer - a bug is when the code at a
layer is not following the defined behaviour it should be following.

I was perhaps not as clear as I should have been that I was not talking
only about C-level "undefined behaviour", as the term is defined
(approximately) in the C standards.
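
To make the layers concrete, here is a minimal sketch in C.  It is only
an illustration: the names greeting(), SPEC_GREETING,
greeting_meets_spec() and int_has_at_least_32_bits() are hypothetical
and come from neither the standard nor the earlier posts.  The
static_assert turns a reliance on an implementation-defined property
(the range of int) into a compile-time check, and greeting() is fully
defined at the C level while still violating the program's own
specification.

```c
#include <assert.h>   /* static_assert (C11 and later) */
#include <limits.h>   /* INT_MAX */
#include <string.h>   /* strcmp */

/* Implementation-defined, not undefined: each implementation documents
   the range of int.  This sketch assumes at least 32 bits and refuses
   to compile where that assumption does not hold. */
static_assert(INT_MAX >= 2147483647,
              "this sketch assumes int has at least 32 bits");

/* Hypothetical program-level specification: the greeting is "hello". */
const char *const SPEC_GREETING = "hello";

/* Well defined by the C standard, but a bug against the specification
   above: "goodbye" was written where "hello" was meant. */
const char *greeting(void)
{
    return "goodbye";
}

/* Returns 1 if the code matches the program-level specification,
   0 otherwise.  No C-level undefined behaviour occurs either way. */
int greeting_meets_spec(void)
{
    return strcmp(greeting(), SPEC_GREETING) == 0;
}

/* Reports the implementation-defined property checked above; it always
   returns 1 once the static_assert has passed. */
int int_has_at_least_32_bits(void)
{
    return INT_MAX >= 2147483647;
}
```

A caller would find that greeting_meets_spec() returns 0: the behaviour
is defined at the C level, and wrong at the specification level.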