From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Subject: Re: Undefined Behavior Optimizations in C
Date: Sun, 22 Jan 2023 09:56:22 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <23-01-071@comp.compilers>
References: <23-01-027@comp.compilers> <23-01-031@comp.compilers> <23-01-041@comp.compilers> <23-01-062@comp.compilers> <23-01-065@comp.compilers> <23-01-067@comp.compilers> <23-01-069@comp.compilers>
Keywords: C, optimize

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>AMD64 specifies zero-extension for both signed
>and unsigned ints (and has instructions that generate zero-extended
>results).

Looking at , I find no such specification. However, compilers
certainly behave in that way. E.g., for

    int add (int a, int b)
    {
      return a+b;
    }

gcc generates:

       0:   8d 04 37                lea    (%rdi,%rsi,1),%eax
       3:   c3                      retq

which zero-extends the result. This certainly rules out an ABI that
requires sign-extension for signed integers.

One interesting case is:

    long add (unsigned a, long b)
    {
      return a+b;
    }

which gcc compiles into

       0:   89 ff                   mov    %edi,%edi
       2:   48 8d 04 37             lea    (%rdi,%rsi,1),%rax
       6:   c3                      retq

What's the point of the MOV instruction here? It performs a
32->64-bit zero extension of %rdi. So gcc apparently assumes that
passed operands are garbage-extended on AMD64. Or maybe gcc is just
cautious here.
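To make the role of that MOV concrete, here is a minimal sketch in C that models the two argument registers as plain 64-bit integers. The function names add_trusting and add_cautious and the input value are hypothetical, chosen only for illustration; this is a model of the register contents, not what gcc does internally.

```c
#include <stdint.h>

/* Hypothetical model: rdi carries an `unsigned` argument in its low
   32 bits; its upper 32 bits are whatever the caller left there.
   rsi carries the 64-bit second argument. */

/* Callee that trusts the caller to have zero-extended the argument:
   it uses all 64 bits of rdi as they are. */
uint64_t add_trusting(uint64_t rdi, uint64_t rsi)
{
    return rdi + rsi;
}

/* Callee that assumes the upper bits may be garbage: it zero-extends
   the low 32 bits first, which is the effect of `mov %edi,%edi`. */
uint64_t add_cautious(uint64_t rdi, uint64_t rsi)
{
    rdi = (uint32_t)rdi;   /* keep the low 32 bits, clear the rest */
    return rdi + rsi;
}
```

With the hypothetical register value 0xdeadbeef00000005 (argument 5, garbage in the upper half) and b = 10, add_cautious returns 15, while add_trusting returns 0xdeadbeef0000000f. If the caller had passed a clean 5, both would return 15, so the extra MOV only matters when the upper half is not guaranteed to be zero.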

Another test:

    unsigned bar(int x);
    unsigned long foo(long x)
    {
      return bar(x);
    }

gcc -O compiles this to:

       0:   48 83 ec 08             sub    $0x8,%rsp
       4:   e8 00 00 00 00          callq  9
       9:   89 c0                   mov    %eax,%eax
       b:   48 83 c4 08             add    $0x8,%rsp
       f:   c3                      retq

There is no zero or sign-extension on passing x to bar(), so the
value is passed garbage-extended. There is a zero extension for
converting the return value to unsigned long, so gcc assumes that the
return value of bar is not necessarily zero-extended.

Conclusion: In the System V ABI for AMD64, values are passed around
garbage-extended (in the general case).

- anton
-- 
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/