Re: arm-gcc, Cortex-M0+, uint64_t and alignment

From	David Brown <david.brown@hesbynett.no>
Newsgroups	comp.arch.embedded
Subject	Re: arm-gcc, Cortex-M0+, uint64_t and alignment
Date	2026-01-22 10:03 +0100
Organization	A noiseless patient Spider
Message-ID	<10ksp4q$2s8q3$1@dont-email.me>
References	(5 earlier) <10kq1m7$1un41$1@dont-email.me> <10kq4mi$1vcdq$1@dont-email.me> <10kqpg9$26oa0$1@dont-email.me> <10kqu02$280ts$1@dont-email.me> <10kr0gf$29a7t$1@dont-email.me>

On 21/01/2026 17:57, pozz wrote:
> Il 21/01/2026 17:13, David Brown ha scritto:
>> On 21/01/2026 15:58, pozz wrote:
>>> Il 21/01/2026 10:02, David Brown ha scritto:
>>>> On 21/01/2026 09:11, pozz wrote:

<snip for brevity>

> Most probably I can't explain what I want to say. I don't want to use an 
> *improper* alignment (different from the one that gcc really is using). 
> I want to know what happens when I *instruct* the compiler to use a 
> 4-bytes alignment for uint64_t in the context of Cortex-M0+ core only.
> 

I think we may have been talking slightly past each other, so that 
re-wording was helpful.

> In other words, is it completely safe to use, as you suggested,
> 
>     typedef uint64_t __attribute__((aligned(4))) uint64_a;
> 
> ???

Barring compiler bugs, yes, that is completely safe.  When the compiler 
lets you make such a type, and use it, it is the compiler's 
responsibility to get the details right.  You should never see issues 
from the compiler's knowledge and assumptions of alignments, and it 
should generate instructions that work on the target (for example, if 
the target hardware required 8-byte alignment for 64-bit loads and 
stores, then the compiler would generate two 32-bit accesses instead).

And for the Cortex M series, 4-byte alignment is the maximum needed for 
working code (though there might be efficiency differences on some of 
the biggest M cores that have 64-bit buses internally, or when data 
caches are used).

> 
> From what you wrote, I think yes. Maybe just a very small optimization 
> penalty.
> 

Yes.  And I think that is a "missed optimisation opportunity" bug.  I 
suspect (or speculate), but have not looked at the compiler code to be 
sure, that the code generator generally accesses the low half of 64-bit 
data first.  And then it may have specific optimisations ("peephole" 
optimisations) for re-ordering the accesses for "long long" types in 
certain circumstances, saving a register and an instruction.  However, 
that would apply only to the specific type - and while "uint64_a" works 
a lot like "unsigned long long", it is not that exact type, and won't 
trigger the same optimisation.

> 
>>> From the Godbolt link that you shared in another post, it seems 
>>> there's a penalty of a single instruction (it's strange, it seems the 
>>> compiler needs to save the struct pointer to r3, before loading the 
>>> two halves of the word, but only if uint64_t is aligned to 4-bytes).
>>>
>>
>> The compiler is not perfect here - there is definitely an extra 
>> instruction because it is reading the low word first.  (clang reads 
>> the low word first for uint64_t as well, meaning it gives worse code 
>> for A and B as well.)  In real code, rather than a brief test snippet, 
>> other factors could mean this does not happen - it's only because the 
>> pointer happens to be in r0 that you see it here.
>>
>> But there's no harm in filing a gcc bug on this, looking for an 
>> obvious improvement.
>>
>>>
>>>> I would be extremely surprised to find code that fails to work on an 
>>>> M0+ because of a uint64_t pointer that is 4-byte aligned but not 
>>>> 8-byte aligned.  But if /I/ want to use 64-bit integers with 4-byte 
>>>> alignments, I'd use the typedef'd aligned type for the object type 
>>>> and for any relevant pointers.
>>>
>>> Yes, of course. Even if I don't understand why the compiler isn't 
>>> able to align at 4-bytes address the uint64_t member in struct B.
>>>
>>
>> It can't align the uint64_t member because the EABI says uint64_t (or, 
>> rather, unsigned long long) is 8 bytes aligned.  gcc didn't make those 
>> rules - ARM did.
> 
> But struct B is defined with correct alignment attribute for uint64_t 
> member. I tried also:
> 
> struct B {
>      uint32_t x;
>      uint64_t y __attribute__((aligned(4)));
> };
> 
> The struct size is always 16, so y is placed at offset 8 and not 4. It 
> seems to me gcc isn't able to respect the aligned attribute of 4 bytes 
> when it is specified inside the struct definition.

That is my conclusion too.  (I tried the "aligned" attribute in every 
place I could.)  The only place it worked was on a typedef for the new 
"uint64_a" type.

> 
> I don't see many differences with:
> 
> typedef __attribute__((aligned(4))) uint64_t uint64_a;
> 
> struct C {
>      uint32_t x;
>      uint64_a y;
> };
> 

It certainly seems inconsistent to me that it works on the typedef, and 
not directly in the struct definition.  After all, a typedef does not 
actually define a new type (it's a silly name) - it merely defines an 
alias or shortcut name for a type.  So it would seem logical that using 
"uint64_a" or "__attribute__((aligned(4))) uint64_t" in the struct 
definition would mean exactly the same thing.  But apparently not.  gcc 
attributes are not part of the normal C grammar, so there's no standard 
to fall back on here.

>>
>> As I briefly mentioned before, there are a number of very poor choices 
>> in the EABI (and the 32-bit ARM ABI used for Linux).  This is far from 
>> the worst.
>>
>>>>> However, what really changes in the binary output?
>>>>>
>>>>> In some cases, the address of uint64_t can change from 8-bytes to 
>>>>> 4-bytes aligned address (because we instructed it to do so). What 
>>>>> about the code that accesses uint64_t aligned to 4-bytes? Is it 
>>>>> identical between 4- and 8-bytes alignment requirement? I think so, 
>>>>> because in both cases, the compiler should add two load/store 
>>>>> 4-bytes instructions.
>>>>>
>>>>
>>>
>>
>
