From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.lang.c
Subject: Re: Safety of casting from 'long' to 'int'
Date: Tue, 5 May 2026 13:02:38 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <10tcppe$4ll$1@reader1.panix.com>
References: <10su8cn$am9i$1@dont-email.me> <10tb2pv$3v9t1$2@dont-email.me> <10tbeo8$gk1$1@reader1.panix.com> <10tbh2v$3khv$1@dont-email.me>

In article <10tbh2v$3khv$1@dont-email.me>, Bart wrote:
>On 05/05/2026 01:48, Dan Cross wrote:
>> In article <10tb2pv$3v9t1$2@dont-email.me>, Bart wrote:
>>> On 04/05/2026 22:14, Chris M. Thomasson wrote:
>>>> On 5/3/2026 9:39 AM, Bart wrote:
>>>> [...]
>>>>
>>>> A compiler vendor can take an UB and say, well, let's define it for fun,
>>>> or whatever. It can say, if you do 1.0 / 0.0, we can make it 1.0 /
>>>> 0.00000042 for shits and giggles. If you want pure std C, use this
>>>> setting... If not, well, your UB will ride the train into the universe
>>>> and beyond... ;^)
>>>
>>> I don't know about UB for floating point.
>>>
>>> For the one under discussion which is arithmetic between integers,
>>> suppose there was a choice of just two compilers:
>>>
>>> (1) On overflow, it just loses the top significant bits of the result
>>>
>>> (2) On overflow, it will set your computer on fire and burn your house down
>>>
>>> There are no options to change that behaviour.
>>
>> This is silly and reductive.
>>
>> The point about saying that something is "UB" originally came
>> from different machines doing things differently; C chose to
>> kick that can down the road by simply declaring that the result
>> was "undefined." "Nasal demons" were never anything more than a
>> euphemism.
>>
>> The computing world is, in so many ways, simpler now than it was
>> then: we've got 2's complement machines everywhere, bytes are
>> pretty much uniformly sized at 8 bits, multi-byte values have
>> power of two widths, and so on.
>
>That was happening at about the time I came in, around the late 70s,
>after starting off on word-based mainframes. C however has taken decades
>to adapt, and it still hasn't fully.

Because, as I said, the incentives changed. They went from, "oh
gee, all these machines are different; we can't possibly define
what this means" to, "oh gee, we can use the undefined nature of
$X to make code faster."

They'd probably argue that they've adapted just fine; just not
in the direction you wanted them to.

>> So now, "UB" has come to mean
>> that a compiler can make assumptions about code to introduce
>> aggressive optimizations: if it is undefined what happens when
>> signed integer arithmetic overflows, then the compiler can
>> choose to use a saturating instruction instead of a regular
>> arithmetic instruction that wraps; or it can trap; or elide the
>> instruction entirely.
>>
>> All of those options are correct, because as far as the language
>> is concerned, what the compiler does is undefined, so anything
>> it does is thus definitionally "correct."
>
>This is where I get lost. The language could easily have narrowed down
>the possibilities.

See above.

>> This differs from "implementation defined" in that IB _is_ well
>> defined; it's just that the actual behavior depends on the
>> implementation. For example, the integer value of the code
>> point corresponding to the character constant 'A' is IB: in
>> ASCII it is 65 dec, but in EBCDIC it is 193.
>> It is perfectly well-defined to assign that to an `int`, but the
>> actual value depends on the implementation.
>>
>>> Which would you choose? Personally, I would shoot the person who
>>> mandated random, unchecked behaviour in the first place.
>>
>> No, you wouldn't. You are not going to shoot anybody. It is
>> weird when people say things like this.
>>
>>> And then choose (1).
>>
>> Why that and not saturating arithmetic?
>
>Because nearly all hardware, many languages using machine types, even
>most C implementations by default, and all my own stuff, has wraparound
>behaviour.

Don't care about your stuff. It's not relevant.

The question was about what instruction a compiler emits for a
given operation. The point is that, in the face of UB, compiler
authors have greater latitude here than they otherwise might if
the behavior were well-defined.

Consider ARM, which has a saturating add instruction, `QADD`.
Now consider `int foo(int a) { return a + 1; }` compiled with a
compiler following the EABI. Since signed integer overflow is
UB, and in that environment `int` is 32 bits wide, the compiler
is perfectly free to implement this using `QADD`, so that
`foo(INT_MAX)` returns `INT_MAX`.

Your objection seems to be, basically, "why would anyone ever do
that?" To which I answer, there could be any number of reasons:
perhaps they're targeting silicon where that's cheaper than a
normal `ADD`. Perhaps the compiler is doing whole-program
optimization, targeting a standalone environment, and thus sees
the entire call graph, and determines that `foo` is always
followed by a sequence that ensures saturating behavior, so it's
more efficient to elide that code and use `QADD`. I don't know,
but neither do you, and that's the point.

>Also, in C, it matches the behaviour of unsigned arithmetic.
>
>It would be bizarre to go for something different /at this level of
>language/. At higher levels you can get more sophisticated, but you
>probably still wouldn't go for saturated.

See above.

	- Dan C.
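
P.S. To make the `foo` example concrete, here is a minimal
sketch of two results a compiler could legitimately produce for
`a + 1` when `a` is `INT_MAX`. The helper names `add1_wrapping`
and `add1_saturating` are made up for illustration; they model
the observable result of a wrapping `ADD` and of a saturating
add like `QADD`, and are not a claim about what any particular
compiler emits. Both are written so that they contain no UB
themselves.

```c
#include <limits.h>

/* Hypothetical helper: the result of `a + 1` if the compiler
 * picks a wrapping add. The addition is done in unsigned
 * arithmetic, which is well-defined modulo UINT_MAX + 1.
 * Converting an out-of-range result back to int is
 * implementation-defined, not undefined. */
int add1_wrapping(int a)
{
    return (int)((unsigned int)a + 1u);
}

/* Hypothetical helper: the result of `a + 1` if the compiler
 * picks a saturating add. The result is clamped at INT_MAX
 * instead of overflowing. */
int add1_saturating(int a)
{
    return (a == INT_MAX) ? INT_MAX : a + 1;
}
```

For every input other than `INT_MAX` the two agree. At `INT_MAX`,
`add1_saturating` returns `INT_MAX`, while `add1_wrapping`
returns whatever the implementation defines for the conversion
(`INT_MIN` on the usual 2's complement implementations). Since
the original `foo(INT_MAX)` is UB, either answer is "correct" as
far as the language is concerned.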