
arm-gcc, Cortex-M0+, uint64_t and alignment

Started by	pozz <pozzugno@gmail.com>
First post	2026-01-20 13:26 +0100
Last post	2026-01-20 22:24 +0100
Articles	19 — 3 participants


  arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-20 13:26 +0100
    Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 17:07 +0100
      Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-20 16:41 +0000
        Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-20 18:09 +0100
          Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 18:44 +0100
            Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-21 09:11 +0100
              Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-21 10:02 +0100
                Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-21 15:58 +0100
                  Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-21 17:13 +0100
                    Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-21 17:57 +0100
                      Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-22 10:03 +0100
          Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-20 17:48 +0000
        Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 18:41 +0100
          Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-20 18:10 +0000
            Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 22:32 +0100
              Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-21 03:38 +0000
                Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-21 08:54 +0100
      Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-20 17:55 +0100
        Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 22:24 +0100

#32461 — arm-gcc, Cortex-M0+, uint64_t and alignment

From	pozz <pozzugno@gmail.com>
Date	2026-01-20 13:26 +0100
Subject	arm-gcc, Cortex-M0+, uint64_t and alignment
Message-ID	<10kns7l$1733k$1@dont-email.me>

I just discovered that my arm-gcc assigns an alignment of 8 to a struct 
with a uint64_t member.

First of all: I can't explain why. Cortex-M0+ shouldn't have any special 
load/store instructions for 64-bit data. I think a uint64_t variable 
is *always* accessed with two separate instructions.

Second thing. Is it safe to force the alignment of such structs to 4 
with __attribute__((aligned(4)))?

I have big arrays of structs that contain uint64_t members, so I'm 
thinking about how to save some space.


#32462

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-20 17:07 +0100
Message-ID	<10ko98i$1bptj$1@dont-email.me>
In reply to	#32461

On 20/01/2026 13:26, pozz wrote:
> I just discovered that my arm-gcc assigns an alignment of 8 to a struct 
> with uint64_t member.
> 
> First of all: I can't explain why. Cortex-M0+ shouldn't have any special 
> load/store instructions for 64-bits data. I think the uint64_t variable 
> is *always* accessed with two separate instructions.
> 

There are other Cortex-M devices that /can/ access 64 bit data with a 
single instruction (though not always as an atomic operation).

Compilers use family-wide ABIs, not ABIs tuned for specific 
devices.  The EABI for 32-bit ARM says long longs are 8 byte aligned, 
so that's what is used for all targets that use the EABI.  (There's a 
lot to dislike about the EABI - this is not the worst thing.)

> Second thing. Is it safe to force the alignment of such structs to 4 
> with __attribute__((aligned(4)))?
> 

You can't reduce the alignment of a struct or its elements by adding an 
"aligned" attribute to the struct itself or any of its fields.  The 
best you can do on the struct itself is __attribute__((packed)).  But 
that comes with disadvantages, including less efficient access.

> I have big arrays of structs that contains uint64_t members, so I'm 
> thinking how to save some space.

The best way is to organise the fields so that they are naturally 
aligned, and don't have padding for alignment.  I like "-Wpadded" to 
tell me if there is unexpected padding.
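
A minimal sketch of the reordering idea, with made-up struct and field names (sample_bad, sample_good and bytes_saved are purely illustrative, not from anyone's real code). It assumes an ABI where uint64_t is 8 byte aligned, as on 32-bit ARM EABI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: fields in "logical" order leave padding holes. */
struct sample_bad {
    uint8_t  flags;      /* offset 0, followed by 7 bytes of padding */
    uint64_t timestamp;  /* offset 8 */
    uint16_t value;      /* offset 16, followed by 6 bytes of tail padding */
};

/* Same fields, largest first: only tail padding remains. */
struct sample_good {
    uint64_t timestamp;  /* offset 0 */
    uint16_t value;      /* offset 8 */
    uint8_t  flags;      /* offset 10, followed by 5 bytes of tail padding */
};

/* Bytes saved across an array of n records by reordering the fields. */
static size_t bytes_saved(size_t n)
{
    return n * (sizeof(struct sample_bad) - sizeof(struct sample_good));
}
```

Under that assumption the first layout takes 24 bytes per element and the second 16. Compiling with -Wpadded should flag the holes in the first one; it may still mention the tail padding in the second.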

What you /can/ do, however, is define a type that is 64 bits, but 4 byte 
alignment:

typedef uint64_t __attribute__((aligned(4))) uint64_a;

Now you can use "uint64_a" instead of "uint64_t", and it will have 4 
byte alignment.
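
A small sketch of what that buys you in a struct, assuming gcc or clang (the attribute is a compiler extension) and an ABI where plain uint64_t is 8 byte aligned. The struct names with_u64 and with_u64a are invented for the illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit unsigned integer with 4 byte alignment (gcc/clang extension). */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Hypothetical layouts: same fields, different 64-bit member type. */
struct with_u64  { uint32_t x; uint64_t y; };  /* 4 bytes of padding after x */
struct with_u64a { uint32_t x; uint64_a y; };  /* y directly follows x */

/* Compile-time checks of the layout we expect for the typedef'd type. */
_Static_assert(_Alignof(uint64_a) == 4, "uint64_a should be 4 byte aligned");
_Static_assert(offsetof(struct with_u64a, y) == 4, "no padding before y");
_Static_assert(sizeof(struct with_u64a) == 12, "12 bytes per element");
```

With those assumptions each array element shrinks from 16 to 12 bytes, so a large array saves a quarter of its space.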


#32463

From	Grant Edwards <invalid@invalid.invalid>
Date	2026-01-20 16:41 +0000
Message-ID	<10kob7m$qel$1@reader2.panix.com>
In reply to	#32462

On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:

> You can't reduce the alignment of a struct or its elements by adding an 
> __aligned_ attribute to the struct itself or any of its fields.  The 
> best you can do on the struct itself is __attribute__((packed)).  But 
> that can come with disadvantages, and inefficient use.

Yep making a structure aligned is an excellent way to introduce subtle
bugs that happen when somebody, somewhere passes a pointer to one of
those structure fields to some library function. Somebody I used to
work with was very fond of making all of his structures aligned (for
no apparent reason). Then he would test his code on an X86 desktop
machine. It worked fine because the X86 supports unaligned
accesses. Then he would move to an ARM target, and it would
fail. Inevitably the cry "The compiler's broken!" would be heard, and
I would have to explain to him for the Nth time about misaligned
accesses on different ARM targets.  Some of our targets generate a bus
fault, some just silently read/write only part of the data.

That same guy once insisted that with the 32-bit GCC compiler we were
using "unsigned long variables work, but unsigned variables don't". So
he was busily changing all of his "unsigned" variables to "unsigned
long".  I printed out the assembly generated for both cases showing
that it was identical. He then insisted that the linker must be doing
something to break unsigned integers.

And then there was the time he decided that cross compiling on a
single-core Linux host worked but compiling on a dual-core
didn't. [Both cases using a single-threaded "make".]

And the time he decided that he needed to upgrade a bunch of the Ubuntu
X11 libraries on the X86 host machine to fix a problem in the ARM
target.

--
Grant


#32465

From	pozz <pozzugno@gmail.com>
Date	2026-01-20 18:09 +0100
Message-ID	<10kocr4$1d65l$1@dont-email.me>
In reply to	#32463

On 20/01/2026 17:41, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
> 
>> You can't reduce the alignment of a struct or its elements by adding an
>> __aligned_ attribute to the struct itself or any of its fields.  The
>> best you can do on the struct itself is __attribute__((packed)).  But
>> that can come with disadvantages, and inefficient use.
> 
> Yep making a structure aligned is an excellent way to introduce subtle
> bugs that happen when somebody, somewhere passes a pointer to one of
> those structure fields to some library function. 

However, as long as the application runs on Cortex-M0+, the aligned 
version shouldn't introduce issues, should it?


#32467

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-20 18:44 +0100
Message-ID	<10koet6$1dlne$2@dont-email.me>
In reply to	#32465

On 20/01/2026 18:09, pozz wrote:
> On 20/01/2026 17:41, Grant Edwards wrote:
>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>
>>> You can't reduce the alignment of a struct or its elements by adding an
>>> __aligned_ attribute to the struct itself or any of its fields.  The
>>> best you can do on the struct itself is __attribute__((packed)).  But
>>> that can come with disadvantages, and inefficient use.
>>
>> Yep making a structure aligned is an excellent way to introduce subtle
>> bugs that happen when somebody, somewhere passes a pointer to one of
>> those structure fields to some library function. 
> 
> However, as long as the application runs on Cortex-M0+, the aligned 
> version shouldn't introduce issues, should it?
> 
> 

Correctly aligned data is never a problem.  /Misaligned/ data is a problem.

The Cortex-M0+ cannot access misaligned data directly.  But if the 
compiler knows that it is misaligned - by "packed" struct, or "aligned" 
attribute on the typedef - it should break apart the accesses into bytes 
or 16-bit half-words as necessary.  (Aligning a uint64_t to 4 byte 
alignment will not be a problem.)
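
To illustrate the difference, here is a rough sketch with an invented packed struct (wire_rec and both helper names are hypothetical). It assumes gcc or clang, since "packed" is a compiler extension:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire-format record; packed, so "value" sits at offset 1. */
struct __attribute__((packed)) wire_rec {
    uint8_t  tag;
    uint32_t value;
};

/* Fine: the compiler knows rec->value may be misaligned, so on a core
 * without unaligned access it splits the load into smaller pieces. */
static uint32_t get_value(const struct wire_rec *rec)
{
    return rec->value;
}

/* Also fine: copy the bytes out without ever forming a uint32_t pointer
 * to the misaligned member, then pass the aligned copy to other code. */
static uint32_t get_value_copy(const struct wire_rec *rec)
{
    uint32_t tmp;
    memcpy(&tmp, (const unsigned char *)rec + offsetof(struct wire_rec, value),
           sizeof tmp);
    return tmp;
}

/* Not fine: "uint32_t *p = &rec->value;" followed by "*p" somewhere else.
 * The code using p treats it as an ordinarily aligned uint32_t pointer. */
```

The failure case is the one where the pointer escapes to code that no longer knows about the packing.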


#32474

From	pozz <pozzugno@gmail.com>
Date	2026-01-21 09:11 +0100
Message-ID	<10kq1m7$1un41$1@dont-email.me>
In reply to	#32467

On 20/01/2026 18:44, David Brown wrote:
> On 20/01/2026 18:09, pozz wrote:
>> On 20/01/2026 17:41, Grant Edwards wrote:
>>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>>
>>>> You can't reduce the alignment of a struct or its elements by adding an
>>>> __aligned_ attribute to the struct itself or any of its fields.  The
>>>> best you can do on the struct itself is __attribute__((packed)).  But
>>>> that can come with disadvantages, and inefficient use.
>>>
>>> Yep making a structure aligned is an excellent way to introduce subtle
>>> bugs that happen when somebody, somewhere passes a pointer to one of
>>> those structure fields to some library function. 
>>
>> However, as long as the application runs on Cortex-M0+, the aligned 
>> version shouldn't introduce issues, should it?
>>
>>
> 
> Correctly aligned data is never a problem.  /Misaligned/ data is a problem.
> 
> The Cortex-M0+ cannot access misaligned data directly.  But if the 
> compiler knows that it is misaligned - by "packed" struct, or "aligned" 
> attribute on the typedef - it should break apart the accesses into bytes 
> or 16-bit half-words as necessary.  (Aligning a uint64_t to 4 byte 
> alignment will not be a problem.)

However, for Cortex-M0+ a uint64_t aligned at 4 bytes is:
- aligned for the core (two 4-byte aligned accesses are required)
- misaligned for the ABI and the compiler

We agree that forcing the gcc compiler to consider 4 bytes as the 
required alignment of uint64_t (using the aligned attribute) is always safe. 
However, what really changes in the binary output?

In some cases, the address of a uint64_t can change from an 8-byte to a 
4-byte aligned address (because we instructed it to do so). What about 
the code that accesses a uint64_t aligned to 4 bytes? Is it identical 
for the 4- and 8-byte alignment requirements? I think so, because in 
both cases the compiler should emit two 4-byte load/store instructions.


#32475

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-21 10:02 +0100
Message-ID	<10kq4mi$1vcdq$1@dont-email.me>
In reply to	#32474

On 21/01/2026 09:11, pozz wrote:
> On 20/01/2026 18:44, David Brown wrote:
>> On 20/01/2026 18:09, pozz wrote:
>>> On 20/01/2026 17:41, Grant Edwards wrote:
>>>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>>>
>>>>> You can't reduce the alignment of a struct or its elements by 
>>>>> adding an
>>>>> __aligned_ attribute to the struct itself or any of its fields.  The
>>>>> best you can do on the struct itself is __attribute__((packed)).  But
>>>>> that can come with disadvantages, and inefficient use.
>>>>
>>>> Yep making a structure aligned is an excellent way to introduce subtle
>>>> bugs that happen when somebody, somewhere passes a pointer to one of
>>>> those structure fields to some library function. 
>>>
>>> However, as long as the application runs on Cortex-M0+, the aligned 
>>> version shouldn't introduce issues, should it?
>>>
>>>
>>
>> Correctly aligned data is never a problem.  /Misaligned/ data is a 
>> problem.
>>
>> The Cortex-M0+ cannot access misaligned data directly.  But if the 
>> compiler knows that it is misaligned - by "packed" struct, or 
>> "aligned" attribute on the typedef - it should break apart the 
>> accesses into bytes or 16-bit half-words as necessary.  (Aligning a 
>> uint64_t to 4 byte alignment will not be a problem.)
> 
> However for Cortex-M0+ uint64_t aligned at 4 bytes is:
> - aligned for the core (two 4-bytes aligned accesses are required)

Yes.  As far as I know, the M0+ core does not need any alignment greater 
than 4 for any purpose.  (But I might not know everything about the 
core!)  There can be alignment requirements for other things, such as DMA.

> - misaligned for the ABI and the compiler

Yes.

> 
> We agree that forcing the gcc compiler to consider 4-bytes as the 
> required alignment of uint64_t (using aligned attribute) is always safe. 

No.

It will almost always be safe, but you don't have any guarantees.  The 
compiler knows that if "p" is of type "uint64_t *", then "(uintptr_t) p 
& 0x07" will always be zero.  Is it likely that you would have anything 
in your code where that is relevant, and also that the compiler would 
generate code that relies on that assumption?  No, it is very unlikely.

But there is a general principle that you should not lie to your 
compiler - don't write code that executes UB, breaks ABIs, or is 
otherwise breaking the contract you have with the compiler unless you 
are using compiler features that let you keep everything honest.

Part of that is that code you are writing now for the M0+ might be 
copied or adapted to a different target at a different time.  Maybe on a 
different core, the same data will be read using some kind of SIMD or 
vector instruction that /does/ require 8-byte alignment.  Don't mess 
with these things without telling your compiler.  And don't mess with them 
without telling future maintainers and programmers using the code 
(including your future self).

I would be extremely surprised to find code that fails to work on an M0+ 
because of a uint64_t pointer that is 4-byte aligned but not 8-byte 
aligned.  But if /I/ want to use 64-bit integers with 4-byte alignments, 
I'd use the typedef'd aligned type for the object type and for any 
relevant pointers.
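
For the cases where the alignment of some storage really is unknown, one way to stay honest is to make no alignment promise at all. This is only a sketch: is_u64_aligned and load_u64 are invented helper names, and the address test assumes a flat address space where the integer value of a pointer reflects its alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if p meets the alignment the ABI promises for a plain uint64_t. */
static bool is_u64_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint64_t)) == 0;
}

/* Read a 64-bit value from storage of unknown alignment.  memcpy takes
 * a void pointer, so nothing is claimed about the alignment of p. */
static uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A check like is_u64_aligned belongs in an assert before a cast to "uint64_t *"; load_u64 is the fallback when the check could fail.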

> However, what really changes in the binary output?
> 
> In some cases, the address of uint64_t can change from 8-bytes to 
> 4-bytes aligned address (because we instructed it to do so). What about 
> the code that accesses uint64_t aligned to 4-bytes? Is it identical 
> between 4- and 8-bytes alignment requirement? I think so, because in 
> both case, the compiler should add two load/store 4-bytes instructions.
>


#32476

From	pozz <pozzugno@gmail.com>
Date	2026-01-21 15:58 +0100
Message-ID	<10kqpg9$26oa0$1@dont-email.me>
In reply to	#32475

On 21/01/2026 10:02, David Brown wrote:
> On 21/01/2026 09:11, pozz wrote:
>> On 20/01/2026 18:44, David Brown wrote:
>>> On 20/01/2026 18:09, pozz wrote:
>>>> On 20/01/2026 17:41, Grant Edwards wrote:
>>>>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>>>>
>>>>>> You can't reduce the alignment of a struct or its elements by 
>>>>>> adding an
>>>>>> __aligned_ attribute to the struct itself or any of its fields.  The
>>>>>> best you can do on the struct itself is __attribute__((packed)).  But
>>>>>> that can come with disadvantages, and inefficient use.
>>>>>
>>>>> Yep making a structure aligned is an excellent way to introduce subtle
>>>>> bugs that happen when somebody, somewhere passes a pointer to one of
>>>>> those structure fields to some library function. 
>>>>
>>>> However, as long as the application runs on Cortex-M0+, the aligned 
>>>> version shouldn't introduce issues, should it?
>>>>
>>>>
>>>
>>> Correctly aligned data is never a problem.  /Misaligned/ data is a 
>>> problem.
>>>
>>> The Cortex-M0+ cannot access misaligned data directly.  But if the 
>>> compiler knows that it is misaligned - by "packed" struct, or 
>>> "aligned" attribute on the typedef - it should break apart the 
>>> accesses into bytes or 16-bit half-words as necessary.  (Aligning a 
>>> uint64_t to 4 byte alignment will not be a problem.)
>>
>> However for Cortex-M0+ uint64_t aligned at 4 bytes is:
>> - aligned for the core (two 4-bytes aligned accesses are required)
> 
> Yes.  As far as I know, the M0+ core does not need any alignment greater 
> than 4 for any purpose.  (But I might not know everything about the 
> core!)  There can be alignment requirements for other things, such as DMA.
> 
>> - misaligned for the ABI and the compiler
> 
> Yes.
> 
>>
>> We agree that forcing the gcc compiler to consider 4-bytes as the 
>> required alignment of uint64_t (using aligned attribute) is always safe. 
> 
> No.
> 
> It will almost always be safe, but you don't have any guarantees.  The 
> compiler knows that if "p" is of type "uint64_t *", then "(uintptr_t) p 
> & 0x07" will always be zero.  Is it likely that you would have anything 
> in your code where that is relevant, and also that the compiler would 
> generate code that relies on that assumption?  No, it is very unlikely.
> 
> But there is a general principle that you should not lie to your 
> compiler - don't write code that executes UB, breaks ABIs, or is 
> otherwise breaking the contract you have with the compiler unless you 
> are using compiler features that let you keep everything honest.
> 
> Part of that is that code you are writing now for the M0+ might be 
> copied or adapted to a different target at a different time.  Maybe on a 
> different core, the same data will be read using some kind of SIMD or 
> vector instruction that /does/ require 8-byte alignment.  Don't mess 
> these things without telling your compiler.  And don't mess with them 
> without telling future maintainers and programmers using the code 
> (including your future self).

But it is exactly what I wanted to do: explicitly tell the compiler to 
align uint64_t at a 4-byte address (as I wrote, with the aligned attribute). 
I didn't mean to lie to my best friend, the compiler.

What I wanted to know is whether there are other issues or drawbacks, such as 
a penalty of extra instructions. From the godbolt link that you shared in 
another post, it seems there's a penalty of a single instruction (it's 
strange: the compiler seems to need to save the struct pointer to r3 
before loading the two halves of the value, but only if uint64_t is 
aligned to 4 bytes).


> I would be extremely surprised to find code that fails to work on an M0+ 
> because of a uint64_t pointer that is 4-byte aligned but not 8-byte 
> aligned.  But if /I/ want to use 64-bit integers with 4-byte alignments, 
> I'd use the typedef'd aligned type for the object type and for any 
> relevant pointers.

Yes, of course. Though I still don't understand why the compiler can't 
place the uint64_t member of struct B at a 4-byte aligned address.

>> However, what really changes in the binary output?
>>
>> In some cases, the address of uint64_t can change from 8-bytes to 
>> 4-bytes aligned address (because we instructed it to do so). What 
>> about the code that accesses uint64_t aligned to 4-bytes? Is it 
>> identical between 4- and 8-bytes alignment requirement? I think so, 
>> because in both case, the compiler should add two load/store 4-bytes 
>> instructions.
>>
>


#32477

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-21 17:13 +0100
Message-ID	<10kqu02$280ts$1@dont-email.me>
In reply to	#32476

On 21/01/2026 15:58, pozz wrote:
> On 21/01/2026 10:02, David Brown wrote:
>> On 21/01/2026 09:11, pozz wrote:
>>> On 20/01/2026 18:44, David Brown wrote:
>>>> On 20/01/2026 18:09, pozz wrote:
>>>>> On 20/01/2026 17:41, Grant Edwards wrote:
>>>>>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>>>>>
>>>>>>> You can't reduce the alignment of a struct or its elements by 
>>>>>>> adding an
>>>>>>> __aligned_ attribute to the struct itself or any of its fields.  The
>>>>>>> best you can do on the struct itself is __attribute__((packed)).  
>>>>>>> But
>>>>>>> that can come with disadvantages, and inefficient use.
>>>>>>
>>>>>> Yep making a structure aligned is an excellent way to introduce 
>>>>>> subtle
>>>>>> bugs that happen when somebody, somewhere passes a pointer to one of
>>>>>> those structure fields to some library function. 
>>>>>
>>>>> However, as long as the application runs on Cortex-M0+, the aligned 
>>>>> version shouldn't introduce issues, should it?
>>>>>
>>>>>
>>>>
>>>> Correctly aligned data is never a problem.  /Misaligned/ data is a 
>>>> problem.
>>>>
>>>> The Cortex-M0+ cannot access misaligned data directly.  But if the 
>>>> compiler knows that it is misaligned - by "packed" struct, or 
>>>> "aligned" attribute on the typedef - it should break apart the 
>>>> accesses into bytes or 16-bit half-words as necessary.  (Aligning a 
>>>> uint64_t to 4 byte alignment will not be a problem.)
>>>
>>> However for Cortex-M0+ uint64_t aligned at 4 bytes is:
>>> - aligned for the core (two 4-bytes aligned accesses are required)
>>
>> Yes.  As far as I know, the M0+ core does not need any alignment 
>> greater than 4 for any purpose.  (But I might not know everything 
>> about the core!)  There can be alignment requirements for other 
>> things, such as DMA.
>>
>>> - misaligned for the ABI and the compiler
>>
>> Yes.
>>
>>>
>>> We agree that forcing the gcc compiler to consider 4-bytes as the 
>>> required alignment of uint64_t (using aligned attribute) is always safe. 
>>
>> No.
>>
>> It will almost always be safe, but you don't have any guarantees.  The 
>> compiler knows that if "p" is of type "uint64_t *", then "(uintptr_t) 
>> p & 0x07" will always be zero.  Is it likely that you would have 
>> anything in your code where that is relevant, and also that the 
>> compiler would generate code that relies on that assumption?  No, it 
>> is very unlikely.
>>
>> But there is a general principle that you should not lie to your 
>> compiler - don't write code that executes UB, breaks ABIs, or is 
>> otherwise breaking the contract you have with the compiler unless you 
>> are using compiler features that let you keep everything honest.
>>
>> Part of that is that code you are writing now for the M0+ might be 
>> copied or adapted to a different target at a different time.  Maybe on 
>> a different core, the same data will be read using some kind of SIMD 
>> or vector instruction that /does/ require 8-byte alignment.  Don't 
>> mess these things without telling your compiler.  And don't mess with 
>> them without telling future maintainers and programmers using the code 
>> (including your future self).
> 
> But it is exactly what I wanted to do: explicitly tell the compiler to 
> align uint64_t at a 4-bytes address (as I wrote, with attribute align). 
> I didn't think to lie my best friend compiler.
> 

uint64_t on 32-bit EABI ARM has an alignment of 8 bytes.  That's set in 
stone, and you cannot change it (short of adding a new ABI to the 
toolchain).  If you try to use uint64_t objects that are not 8-byte 
aligned, or try to use pointers that are not 8-byte aligned to access 
uint64_t types, you are lying to your compiler.

If you make a new type that is like a uint64_t but with an "aligned(4)" 
attribute, you have a /new/ type.  And that type will work just like you 
want - it is an 8 byte unsigned integer with a 4 byte alignment.  As 
long as you use that consistently, you'll be fine.
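
"Consistently" here includes pointer types. A sketch, assuming gcc or clang and using invented names (struct rec, add_to, sum_totals):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit unsigned integer with 4 byte alignment (gcc/clang extension). */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Hypothetical record type; 12 bytes where uint64_t is 8 byte aligned. */
struct rec {
    uint32_t id;
    uint64_a total;
};

/* The parameter is "uint64_a *", not "uint64_t *", so the compiler is
 * only promised 4 byte alignment for the pointed-to object. */
static void add_to(uint64_a *dst, uint32_t amount)
{
    *dst += amount;
}

/* Member access through the struct needs no special care: the member's
 * declared type already carries the reduced alignment. */
static uint64_t sum_totals(const struct rec *recs, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += recs[i].total;
    }
    return sum;
}
```

Copying a value out into a plain uint64_t local, as sum_totals does, is fine; what should be avoided is converting "&recs[i].total" to "uint64_t *".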

> What I wanted to know is if there were other issues or drawback, such as 
> more instructions penalty. 

The drawback from trying to use an object of a type with an improper 
alignment is that you have UB.  What more reasons do you want for not 
doing it?

> From the godbolt link that you shared in 
> another post, it seems there's a penalty of a single instruction (it's 
> strange, it seems the compiler needs to save the struct pointer to r3, 
> before loading the two halves of the word, but only if uint64_t is 
> aligned to 4-bytes).
> 

The compiler is not perfect here - there is definitely an extra 
instruction because it is reading the low word first.  (clang reads the 
low word first for uint64_t as well, meaning it gives worse code for A 
and B as well.)  In real code, rather than a brief test snippet, other 
factors could mean this does not happen - it's only because the pointer 
happens to be in r0 that you see it here.

But there's no harm in filing a gcc bug on this, looking for an obvious 
improvement.

> 
>> I would be extremely surprised to find code that fails to work on an 
>> M0+ because of a uint64_t pointer that is 4-byte aligned but not 
>> 8-byte aligned.  But if /I/ want to use 64-bit integers with 4-byte 
>> alignments, I'd use the typedef'd aligned type for the object type and 
>> for any relevant pointers.
> 
> Yes, of course. Even if I don't understand why the compiler isn't able 
> to align at 4-bytes address the uint64_t member in struct B.
> 

It can't align the uint64_t member because the EABI says uint64_t (or, 
rather, unsigned long long) is 8 bytes aligned.  gcc didn't make those 
rules - ARM did.

As I briefly mentioned before, there are a number of very poor choices 
in the EABI (and the 32-bit ARM ABI used for Linux).  This is far from 
the worst.

>>> However, what really changes in the binary output?
>>>
>>> In some cases, the address of uint64_t can change from 8-bytes to 
>>> 4-bytes aligned address (because we instructed it to do so). What 
>>> about the code that accesses uint64_t aligned to 4-bytes? Is it 
>>> identical between 4- and 8-bytes alignment requirement? I think so, 
>>> because in both case, the compiler should add two load/store 4-bytes 
>>> instructions.
>>>
>>
>


#32478

From	pozz <pozzugno@gmail.com>
Date	2026-01-21 17:57 +0100
Message-ID	<10kr0gf$29a7t$1@dont-email.me>
In reply to	#32477

On 21/01/2026 17:13, David Brown wrote:
> On 21/01/2026 15:58, pozz wrote:
>> On 21/01/2026 10:02, David Brown wrote:
>>> On 21/01/2026 09:11, pozz wrote:
>>>> On 20/01/2026 18:44, David Brown wrote:
>>>>> On 20/01/2026 18:09, pozz wrote:
>>>>>> On 20/01/2026 17:41, Grant Edwards wrote:
>>>>>>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>>>>>>
>>>>>>>> You can't reduce the alignment of a struct or its elements by 
>>>>>>>> adding an
>>>>>>>> __aligned_ attribute to the struct itself or any of its fields.  
>>>>>>>> The
>>>>>>>> best you can do on the struct itself is __attribute__((packed)). 
>>>>>>>> But
>>>>>>>> that can come with disadvantages, and inefficient use.
>>>>>>>
>>>>>>> Yep making a structure aligned is an excellent way to introduce 
>>>>>>> subtle
>>>>>>> bugs that happen when somebody, somewhere passes a pointer to one of
>>>>>>> those structure fields to some library function. 
>>>>>>
>>>>>> However, as long as the application runs on Cortex-M0+, the 
>>>>>> aligned version shouldn't introduce issues, should it?
>>>>>>
>>>>>>
>>>>>
>>>>> Correctly aligned data is never a problem.  /Misaligned/ data is a 
>>>>> problem.
>>>>>
>>>>> The Cortex-M0+ cannot access misaligned data directly.  But if the 
>>>>> compiler knows that it is misaligned - by "packed" struct, or 
>>>>> "aligned" attribute on the typedef - it should break apart the 
>>>>> accesses into bytes or 16-bit half-words as necessary.  (Aligning a 
>>>>> uint64_t to 4 byte alignment will not be a problem.)
>>>>
>>>> However for Cortex-M0+ uint64_t aligned at 4 bytes is:
>>>> - aligned for the core (two 4-bytes aligned accesses are required)
>>>
>>> Yes.  As far as I know, the M0+ core does not need any alignment 
>>> greater than 4 for any purpose.  (But I might not know everything 
>>> about the core!)  There can be alignment requirements for other 
>>> things, such as DMA.
>>>
>>>> - misaligned for the ABI and the compiler
>>>
>>> Yes.
>>>
>>>>
>>>> We agree that forcing the gcc compiler to consider 4-bytes as the 
>>>> required alignment of uint64_t (using aligned attribute) is always 
>>>> safe. 
>>>
>>> No.
>>>
>>> It will almost always be safe, but you don't have any guarantees.  
>>> The compiler knows that if "p" is of type "uint64_t *", then 
>>> "(uintptr_t) p & 0x07" will always be zero.  Is it likely that you 
>>> would have anything in your code where that is relevant, and also 
>>> that the compiler would generate code that relies on that 
>>> assumption?  No, it is very unlikely.
>>>
>>> But there is a general principle that you should not lie to your 
>>> compiler - don't write code that executes UB, breaks ABIs, or is 
>>> otherwise breaking the contract you have with the compiler unless you 
>>> are using compiler features that let you keep everything honest.
>>>
>>> Part of that is that code you are writing now for the M0+ might be 
>>> copied or adapted to a different target at a different time.  Maybe 
>>> on a different core, the same data will be read using some kind of 
>>> SIMD or vector instruction that /does/ require 8-byte alignment.  
>>> Don't mess these things without telling your compiler.  And don't 
>>> mess with them without telling future maintainers and programmers 
>>> using the code (including your future self).
>>
>> But it is exactly what I wanted to do: explicitly tell the compiler to 
>> align uint64_t at a 4-bytes address (as I wrote, with attribute 
>> align). I didn't think to lie my best friend compiler.
>>
> 
> uint64_t on 32-bit EABI ARM has an alignment of 8 bytes.  That's cut in 
> stone, and you cannot change it (short of adding a new ABI to the 
> toolchain).  If you try to use uint64_t objects that are not 8-byte 
> aligned, or try to use pointers that are not 8-byte aligned to access 
> uint64_t types, you are lying to your compiler.
> 
> If you make a new type that is like a uint64_t but with an "aligned(4)" 
> attribute, you have a /new/ type.  And that type will work just like you 
> want - it is an 8 byte unsigned integer with a 4 byte alignment.  As 
> long as you use that consistently, you'll be fine.
> 
>> What I wanted to know is if there were other issues or drawback, such 
>> as more instructions penalty. 
> 
> The drawback from trying to use an object of a type with an improper 
> alignment is that you have UB.  What more reasons do you want for not 
> doing it?

I probably haven't explained clearly what I mean. I don't want to use an 
*improper* alignment (different from the one that gcc is really using). 
I want to know what happens when I *instruct* the compiler to use a 
4-byte alignment for uint64_t, in the context of the Cortex-M0+ core only.

In other words, is it completely safe to use, as you suggested,

    typedef uint64_t __attribute__((aligned(4))) uint64_a;

???

From what you wrote, I think so. Maybe with just a very small optimization 
penalty.


>> From the godbolt link that you shared in another post, it seems there's 
>> a penalty of a single instruction (it's strange, it seems the compiler 
>> needs to save the struct pointer to r3, before loading the two halves 
>> of the word, but only if uint64_t is aligned to 4-bytes).
>>
> 
> The compiler is not perfect here - there is definitely an extra 
> instruction because it is reading the low word first.  (clang reads the 
> low word first for uint64_t as well, meaning it gives worse code for A 
> and B as well.)  In real code, rather than a brief test snippet, other 
> factors could mean this does not happen - it's only because the pointer 
> happens to be in r0 that you see it here.
> 
> But there's no harm in filing a gcc bug on this, looking for an obvious 
> improvement.
> 
>>
>>> I would be extremely surprised to find code that fails to work on an 
>>> M0+ because of a uint64_t pointer that is 4-byte aligned but not 
>>> 8-byte aligned.  But if /I/ want to use 64-bit integers with 4-byte 
>>> alignments, I'd use the typedef'd aligned type for the object type 
>>> and for any relevant pointers.
>>
>> Yes, of course. Even if I don't understand why the compiler isn't able 
>> to align at 4-bytes address the uint64_t member in struct B.
>>
> 
> It can't align the uint64_t member because the EABI says uint64_t (or, 
> rather, unsigned long long) is 8 bytes aligned.  gcc didn't make those 
> rules - ARM did.

But struct B is defined with correct alignment attribute for uint64_t 
member. I tried also:

struct B {
     uint32_t x;
     uint64_t y __attribute__((aligned(4)));
};

The struct size is always 16, so y is placed at offset 8 and not 4. It 
seems to me gcc isn't able to respect the aligned attribute of 4 bytes 
when it is specified inside the struct definition.

I don't see many differences with:

typedef __attribute__((aligned(4))) uint64_t uint64_a;

struct C {
     uint32_t x;
     uint64_a y;
};

> 
> As I briefly mentioned before, there are a number of very poor choices 
> in the EABI (and the 32-bit ARM ABI used for Linux).  This is far from 
> the worst.
> 
>>>> However, what really changes in the binary output?
>>>>
>>>> In some cases, the address of uint64_t can change from 8-bytes to 
>>>> 4-bytes aligned address (because we instructed it to do so). What 
>>>> about the code that accesses uint64_t aligned to 4-bytes? Is it 
>>>> identical between 4- and 8-bytes alignment requirement? I think so, 
>>>> because in both case, the compiler should add two load/store 4-bytes 
>>>> instructions.
>>>>
>>>
>>
>


#32479

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-22 10:03 +0100
Message-ID	<10ksp4q$2s8q3$1@dont-email.me>
In reply to	#32478

On 21/01/2026 17:57, pozz wrote:
> Il 21/01/2026 17:13, David Brown ha scritto:
>> On 21/01/2026 15:58, pozz wrote:
>>> Il 21/01/2026 10:02, David Brown ha scritto:
>>>> On 21/01/2026 09:11, pozz wrote:

<snip for brevity>

> Most probably I can't explain what I want to say. I don't want to use an 
> *improper* alignment (different from the one that gcc really is using). 
> I want to know what happens when I *instruct* the compiler to use a 
> 4-bytes alignment for uint64_t in the context of Cortex-M0+ core only.
> 

I think we may have been talking slightly past each other, so that 
re-wording was helpful.

> In other words, is it completely safe to use, as you suggested,
> 
>     typedef uint64_t __attribute__((aligned(4))) uint64_a;
> 
> ???

Barring compiler bugs, yes, that is completely safe.  When the compiler 
lets you make such a type, and use it, it is the compiler's 
responsibility to get the details right.  You should never see issues 
from the compiler's knowledge and assumptions of alignments, and it 
should generate instructions that work on the target (for example, if 
the target hardware required 8-byte alignment for 64-bit loads and 
stores, then the compiler would generate two 32-bit accesses instead).

And for the Cortex M series, 4-byte alignment is the maximum needed for 
working code (though there might be efficiency differences on some of 
the biggest M cores that have 64-bit buses internally, or when data 
caches are used).

> 
>  From what you wrote, I think yes. Maybe just a very small optimization 
> penalty.
> 

Yes.  And I think that is a "missed optimisation opportunity" bug.  I 
suspect (or speculate), but have not looked at the compiler code to be 
sure, that the code generator generally accesses the low half of 64-bit 
data first.  And then it may have specific optimisations ("peephole" 
optimisations) for re-ordering the accesses for "long long" types in 
certain circumstances, saving a register and an instruction.  However, 
that would apply only to the specific type - and while "uint64_a" works 
a lot like "unsigned long long", it is not that exact type, and won't 
trigger the same optimisation.

> 
>>> From the godbolt link that you shared in another post, it seems 
>>> there's a penalty of a single instruction (it's strange, it seems the 
>>> compiler needs to save the struct pointer to r3, before loading the 
>>> two halves of the word, but only if uint64_t is aligned to 4-bytes).
>>>
>>
>> The compiler is not perfect here - there is definitely an extra 
>> instruction because it is reading the low word first.  (clang reads 
>> the low word first for uint64_t as well, meaning it gives worse code 
>> for A and B as well.)  In real code, rather than a brief test snippet, 
>> other factors could mean this does not happen - it's only because the 
>> pointer happens to be in r0 that you see it here.
>>
>> But there's no harm in filing a gcc bug on this, looking for an 
>> obvious improvement.
>>
>>>
>>>> I would be extremely surprised to find code that fails to work on an 
>>>> M0+ because of a uint64_t pointer that is 4-byte aligned but not 
>>>> 8-byte aligned.  But if /I/ want to use 64-bit integers with 4-byte 
>>>> alignments, I'd use the typedef'd aligned type for the object type 
>>>> and for any relevant pointers.
>>>
>>> Yes, of course. Even if I don't understand why the compiler isn't 
>>> able to align at 4-bytes address the uint64_t member in struct B.
>>>
>>
>> It can't align the uint64_t member because the EABI says uint64_t (or, 
>> rather, unsigned long long) is 8 bytes aligned.  gcc didn't make those 
>> rules - ARM did.
> 
> But struct B is defined with correct alignment attribute for uint64_t 
> member. I tried also:
> 
> struct B {
>      uint32_t x;
>      uint64_t y __attribute__((aligned(4)));
> };
> 
> The struct size is always 16, so y is placed at offset 8 and not 4. It 
> seems to me gcc isn't able to respect the aligned attribute of 4 bytes 
> when it is specified inside the struct definition.

That is my conclusion too.  (I tried the "aligned" attribute in every 
place I could.)  The only place it worked was on a typedef for the new 
"uint64_a" type.
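For anyone who wants to check the layouts themselves, here is a minimal sketch.  The struct names B and C follow the ones in this thread; the two helper functions are invented purely for illustration.  It assumes gcc or clang, and it relies on the behaviour described in the gcc manual: on a struct member, "aligned" can only increase the alignment unless "packed" is also given, whereas on a typedef it can decrease it as well.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* 64-bit unsigned integer with only 4-byte alignment (gcc/clang extension). */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

struct B {
    uint32_t x;
    /* On a member, aligned(4) cannot lower the alignment of uint64_t. */
    uint64_t y __attribute__((aligned(4)));
};

struct C {
    uint32_t x;
    uint64_a y;   /* the typedef'd type really is 4-byte aligned */
};

/* Hypothetical helpers so the offsets can be inspected at run time. */
size_t offset_of_y_in_B(void) { return offsetof(struct B, y); }
size_t offset_of_y_in_C(void) { return offsetof(struct C, y); }

/* These hold wherever the typedef trick is honoured. */
static_assert(sizeof(uint64_a) == 8, "uint64_a is still a 64-bit integer");
static_assert(offsetof(struct C, y) == 4, "no padding before y in struct C");
static_assert(sizeof(struct C) == 12, "struct C should need only 12 bytes");
```

On an ABI where uint64_t is 8-byte aligned (such as the ARM EABI), struct B should still come out at 16 bytes with y at offset 8, matching what pozz observed.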

> 
> I don't see many differences with:
> 
> typedef __attribute__((aligned(4))) uint64_t uint64_a;
> 
> struct C {
>      uint32_t x;
>      uint64_a y;
> };
> 

It certainly seems inconsistent to me that it works on the typedef, and 
not directly in the struct definition.  After all, a typedef does not 
actually define a new type (it's a silly name) - it merely defines an 
alias or shortcut name for a type.  So it would seem logical that using 
"uint64_a" or "__attribute__((aligned(4))) uint64_t" in the struct 
definition would mean exactly the same thing.  But apparently not.  gcc 
attributes are not part of the normal C grammar, so there's no standard 
to fall back on here.

>>
>> As I briefly mentioned before, there are a number of very poor choices 
>> in the EABI (and the 32-bit ARM ABI used for Linux).  This is far from 
>> the worst.
>>
>>>>> However, what really changes in the binary output?
>>>>>
>>>>> In some cases, the address of uint64_t can change from 8-bytes to 
>>>>> 4-bytes aligned address (because we instructed it to do so). What 
>>>>> about the code that accesses uint64_t aligned to 4-bytes? Is it 
>>>>> identical between 4- and 8-bytes alignment requirement? I think so, 
>>>>> because in both case, the compiler should add two load/store 
>>>>> 4-bytes instructions.
>>>>>
>>>>
>>>
>>
>


#32468

From	Grant Edwards <invalid@invalid.invalid>
Date	2026-01-20 17:48 +0000
Message-ID	<10kof5q$929$1@reader2.panix.com>
In reply to	#32465

On 2026-01-20, pozz <pozzugno@gmail.com> wrote:
> Il 20/01/2026 17:41, Grant Edwards ha scritto:
>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>> 
>>> You can't reduce the alignment of a struct or its elements by adding an
>>> "aligned" attribute to the struct itself or any of its fields.  The
>>> best you can do on the struct itself is __attribute__((packed)).  But
>>> that can come with disadvantages, and inefficient use.
>> 
>> Yep making a structure aligned is an excellent way to introduce subtle
>> bugs that happen when somebody, somewhere passes a pointer to one of
>> those structure fields to some library function. 

Aargh, my bad. I meant that making a structure _packed_ is an
excellent way to introduce subtle bugs that happen when somebody,
somewhere passes a pointer to one of those structure fields to some
library function.

> However, as long as the application runs on Cortex-M0+, the aligned 
> version shouldn't introduce issues, should it?

A non-packed structure should always be OK.

A packed structure will work fine as long as it's being accessed by
code that "knows" it's packed.  You can pass a pointer to packed
struct to a function as long as it's declared in that function as a
pointer to a packed struct: the compiler will generate extra code to
deal with accesses to values that are misaligned due to the
packing. However, passing a pointer to a field of a packed structure
(e.g. to a uint64_t) to a function where it is declared as a normal
"uint64_t *p" can cause failures on ARM targets. It will work OK on
X86. I think it used to work OK on m68k also. IIRC SPARC failed in
similar ways to ARM.
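
To make that concrete, here is a minimal sketch.  The struct layout and all function names are made up for illustration.  The idea is to access a packed field either through the packed struct type itself, or by copying the bytes out with memcpy (which assumes no alignment), rather than handing its address to code that expects an ordinary "uint64_t *".

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Hypothetical packed record: 'value' ends up at offset 1. */
struct __attribute__((packed)) record {
    uint8_t  tag;
    uint64_t value;
};

static_assert(offsetof(struct record, value) == 1, "value is misaligned");

/* Load a 64-bit value from any address, aligned or not. */
uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* OK: the access goes through the packed type, so the compiler knows
 * the field may be misaligned and generates suitable code. */
uint64_t record_value(const struct record *r)
{
    return r->value;
}

/* Also OK: copy the bytes out without ever forming a "uint64_t *". */
uint64_t record_value_via_bytes(const struct record *r)
{
    return load_u64((const unsigned char *)r + offsetof(struct record, value));
}

/* NOT OK on targets that cannot do unaligned accesses:
 *
 *     uint64_t *p = &r->value;   // gcc 9+ and clang warn here with
 *     library_function(p);       // -Waddress-of-packed-member
 */
```

The commented-out fragment at the end is the pattern that works on a desktop and then fails on the target.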

--
Grant


#32466

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-20 18:41 +0100
Message-ID	<10koep6$1dlne$1@dont-email.me>
In reply to	#32463

On 20/01/2026 17:41, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
> 
>> You can't reduce the alignment of a struct or its elements by adding an
>> "aligned" attribute to the struct itself or any of its fields.  The
>> best you can do on the struct itself is __attribute__((packed)).  But
>> that can come with disadvantages, and inefficient use.
> 
> Yep making a structure aligned is an excellent way to introduce subtle

To be clear - you mean making the structure /packed/, not /aligned/, or 
in some other way allowing objects to be placed at a smaller alignment 
than the ABI says.

> bugs that happen when somebody, somewhere passes a pointer to one of
> those structure fields to some library function. Somebody I used to
> work with was very fond of making all of his structures aligned (for
> no apparent reason). Then he would test his code on an X86 desktop
> machine. It worked fine because the X86 supports unaligned
> accesses. Then he would move to an ARM target, and it would
> fail. Inevitably the cry "The compiler's broken!" would be heard, and
> I would have to explain to him for the Nth time about misaligned
> accesses on different ARM targets.  Some of our targets generate a bus
> fault, some just silently read/write only part of the data.
> 

Cortex M3 and bigger all handle misaligned accesses without problem 
(albeit possibly at a performance penalty).  Cortex M0, M0+ and M1 do 
not support misaligned accesses.  On an M4, the compiler should generate 
normal 32-bit loads and stores for a "packed" struct with 32-bit fields, 
but it should generate byte-by-byte accesses on a Cortex M0.

It's fine to have packed structs, or types with smaller than normal 
alignment, as long as the compiler knows that's the case.   So you don't 
take pointers to fields in a "packed" struct, and if you use something 
like the "uint64_a" type I suggested, it should be accessed by pointers 
to its real type, not pointers to "uint64_t".  (In practice, I would 
expect that pointers to uint64_t would work, because the accesses will 
be 32-bit anyway, but you should never lie to your compiler!)
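
A minimal sketch of what I mean by using the real type consistently.  The struct and function names here are invented for the example, and it assumes gcc or clang:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>

/* 64-bit unsigned integer with 4-byte alignment. */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Hypothetical record type: 12 bytes instead of 16. */
struct sample {
    uint32_t id;
    uint64_a timestamp;
};

static_assert(sizeof(struct sample) == 12, "no padding expected");

/* The parameter is a pointer to the reduced-alignment type, so the
 * compiler assumes only 4-byte alignment when it dereferences it. */
uint64_t read_timestamp(const uint64_a *p)
{
    return *p;
}

/* Return the largest timestamp in an array of samples (0 if n == 0). */
uint64_t newest(const struct sample *s, size_t n)
{
    uint64_t best = 0;
    for (size_t i = 0; i < n; i++) {
        /* &s[i].timestamp has type "const uint64_a *", which matches. */
        uint64_t t = read_timestamp(&s[i].timestamp);
        if (t > best) {
            best = t;
        }
    }
    return best;
}
```

Had read_timestamp() been declared with a "const uint64_t *" parameter instead, the call would pass a pointer that may be only 4-byte aligned to code that has been told it is 8-byte aligned, and that is the lie to avoid.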

> That same guy once insisted that with the 32-bit GCC compiler we were
> using "unsigned long variables work, but unsigned variables don't". So
> he was busily changing all of his "unsigned" variables to "unsigned
> long".  I printed out the assembly generated for both cases showing
> that it was identical. He then insisted that the linker must be doing
> something to break unsigned integers.

That is, shall we say, a strange idea from that guy.  Perhaps he was 
confused by uint32_t being "unsigned long" on EABI 32-bit ARM, rather 
than "unsigned int" (as it is on 32-bit ARM Linux, and on Windows) ?

I have often seen people think that "unsigned int" and "unsigned long" 
are the same type on 32-bit ARM, just because they are both 32-bit, and 
get confused when there are compiler complaints about incompatible 
pointers when they are mixed.

> 
> And then there was the time he decided that cross compiling on a
> single-core Linux host worked but compiling on a dual-core
> didn't. [Both cases using a single-threaded "make".]
> 
> And the time he decided that he needed to upgrade a bunch of the Ubuntu
> X11 libraries on the X86 host machine to fix a problem in the ARM
> target.
> 

Someone is a little confused :-)


#32469

From	Grant Edwards <invalid@invalid.invalid>
Date	2026-01-20 18:10 +0000
Message-ID	<10kogeb$n48$1@reader2.panix.com>
In reply to	#32466

On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:

> Cortex M3 and bigger all handle misaligned accesses without problem
> (albeit possibly at a performance penalty).

FWIW, the M3 can be configured to generate a fault on unaligned
accesses, so whether it works or not depends on your low-level init
code. I believe that unaligned-fault-enable feature is disabled by
default at reset.  Also, the M3 only supports non-word-aligned
accesses for normal single load/store instructions.  LDM/STM and
LDRD/STRD will fault on non-word-aligned access.

--
Grant


#32471

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-20 22:32 +0100
Message-ID	<10kos8h$1ivol$2@dont-email.me>
In reply to	#32469

On 20/01/2026 19:10, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
> 
>> Cortex M3 and bigger all handle misaligned accesses without problem
>> (albeit possibly at a performance penalty).
> 
> FWIW, the M3 can be configured to generate a fault on unaligned
> accesses, so whether it works or not depends on your low-level init
> code. I believe that unaligned-fault-enable feature is disabled by
> default at reset.  

I did not know that.

> Also, the M3 only supports non-word-aligned
> accesses for normal single load/store instructions.  LDM/STM and
> LDRD/STRD will fault on non-word-aligned access.
> 

Yes.  Of course, the LDM/STM are primarily used for pushing and popping 
registers on the stack, so you are always going to be aligned there.

In the godbolt.org link I posted in a reply to Pozz, we can see that 
when the compiler knows the uint64_t is aligned at least to 4 bytes, it 
uses LDRD, but when it does not know that it is 4 bytes aligned, it uses 
two LDR instructions.

(As an aside, I find it annoying that STRD can be interrupted in the 
middle - it means you don't have an atomic 64-bit store.  LDRD can also 
be interrupted in the middle, but as it is restarted, it gives you a 
64-bit atomic read.)


#32472

From	Grant Edwards <invalid@invalid.invalid>
Date	2026-01-21 03:38 +0000
Message-ID	<10kphnl$gvp$1@reader2.panix.com>
In reply to	#32471

On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:

> (As an aside, I find it annoying that STRD can be interrupted in the 
> middle - it means you don't have an atomic 64-bit store.  LDRD can also 
> be interrupted in the middle, but as it is restarted, it gives you a 
> 64-bit atomic read.)

Yes, I just noticed that in the manual the other day, and it seemed
like an odd decision.

--
Grant


#32473

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-21 08:54 +0100
Message-ID	<10kq0nq$1u1ai$1@dont-email.me>
In reply to	#32472

On 21/01/2026 04:38, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
> 
>> (As an aside, I find it annoying that STRD can be interrupted in the
>> middle - it means you don't have an atomic 64-bit store.  LDRD can also
>> be interrupted in the middle, but as it is restarted, it gives you a
>> 64-bit atomic read.)
> 
> Yes, I just noticed that in the manual the other day, and it seemed
> like an odd decision.
> 

It's not odd from the implementation viewpoint, but disappointing from 
the user viewpoint.  The double loads and stores are implemented as a 
sort of combination of two instructions, or at least two actions. 
Disabling interrupts in the middle of the instructions would mean 
additional hardware logic.  (I think all longer-running instructions, 
like divisions, are interruptible.)  When the interrupt returns, the 
instructions are simply restarted.

That gives an atomic 64-bit load, so it lets you safely read 64-bit data 
that is changed by an interrupt or higher-priority thread - unlike using 
two separate 32-bit load instructions.  (Using a volatile read appears 
to force the use of LDRD on gcc for M3 and above, while non-volatile 
reads might be split and re-arranged depending on the surrounding code.)

An interrupted double store is, obviously, a very different matter - 
your interrupt routines or pre-empting threads see half-written data.

My guess as to the decision process is that making these instructions 
non-interruptible would have taken more hardware, and weakened 
guarantees on interrupt latency.  But if they had asked /me/, I'd have 
chosen to make STRD non-interruptible :-)


#32464

From	pozz <pozzugno@gmail.com>
Date	2026-01-20 17:55 +0100
Message-ID	<10koc0h$1cr4j$1@dont-email.me>
In reply to	#32462

Il 20/01/2026 17:07, David Brown ha scritto:
> On 20/01/2026 13:26, pozz wrote:
>> I just discovered that my arm-gcc assigns an alignment of 8 to a 
>> struct with uint64_t member.
>>
>> First of all: I can't explain why. Cortex-M0+ shouldn't have any 
>> special load/store instructions for 64-bits data. I think the uint64_t 
>> variable is *always* accessed with two separate instructions.
>>
> 
> There are other Cortex-M devices that /can/ access 64 bit data with a 
> single instruction (though not always as an atomic function).
> 
> Compilers use family ABI's, not ABI's specifically tuned for exact 
> devices.  The EABI for 32-bit ARM says long long's are 8 byte aligned, 
> so that's what is used for all targets that use the EABI.  (There's a 
> lot to dislike about the EABI - this is not the worst thing.)

So the ABI used by arm gcc is the EABI, which covers a whole list of Cortex-M 
devices, a few of which require 8-byte alignment for 64-bit integers.


>> Second thing. Is it safe to force the alignment of such structs to 4 
>> with __attribute__((aligned(4)))?
> 
> You can't reduce the alignment of a struct or its elements by adding an 
> "aligned" attribute to the struct itself or any of its fields.  The 
> best you can do on the struct itself is __attribute__((packed)).  But 
> that can come with disadvantages, and inefficient use.

But this is the opposite of what you write below!


>> I have big arrays of structs that contains uint64_t members, so I'm 
>> thinking how to save some space.
> 
> The best way is to organise the fields so that they are naturally 
> aligned, and don't have padding for alignment.  I like "-Wpadded" to 
> tell me if there is unexpected padding.
> 
> 
> What you /can/ do, however, is define a type that is 64 bits, but 4 byte 
> alignment:
> 
> typedef uint64_t __attribute__((aligned(4))) uint64_a;
> 
> Now you can use "uint64_a" instead of "uint64_t", and it will have 4 
> byte alignment.

Earlier you wrote that it's impossible to reduce the alignment from 8 to 4 
with __attribute__((aligned(4))), but now you write that it is possible.


#32470

From	David Brown <david.brown@hesbynett.no>
Date	2026-01-20 22:24 +0100
Message-ID	<10korqk$1ivol$1@dont-email.me>
In reply to	#32464

On 20/01/2026 17:55, pozz wrote:
> Il 20/01/2026 17:07, David Brown ha scritto:
>> On 20/01/2026 13:26, pozz wrote:
>>> I just discovered that my arm-gcc assigns an alignment of 8 to a 
>>> struct with uint64_t member.
>>>
>>> First of all: I can't explain why. Cortex-M0+ shouldn't have any 
>>> special load/store instructions for 64-bits data. I think the 
>>> uint64_t variable is *always* accessed with two separate instructions.
>>>
>>
>> There are other Cortex-M devices that /can/ access 64 bit data with a 
>> single instruction (though not always as an atomic function).
>>
>> Compilers use family ABI's, not ABI's specifically tuned for exact 
>> devices.  The EABI for 32-bit ARM says long long's are 8 byte aligned, 
>> so that's what is used for all targets that use the EABI.  (There's a 
>> lot to dislike about the EABI - this is not the worst thing.)
> 
> So the ABI used by arm gcc is EABI that is valid for a list of Cortex-M 
> devices, a few of these that require 8-byte alignment of 64-bits integers.
> 

I don't think any of the 32-bit Cortex-M cores actually need 8 byte 
alignment in the hardware - it could have been for compatibility with 
64-bit devices.

> 
>>> Second thing. Is it safe to force the alignment of such structs to 4 
>>> with __attribute__((aligned(4)))?
>>
>> You can't reduce the alignment of a struct or its elements by adding 
>> an "aligned" attribute to the struct itself or any of its fields.  
>> The best you can do on the struct itself is __attribute__((packed)).  
>> But that can come with disadvantages, and inefficient use.
> 
> But this is the opposite of what you write below!

No, but I might have been unclear.

Adding the "aligned" attribute to the /struct/, or to the field members 
/directly/, does not help you here.  Adding it to a new typedef does.

> 
> 
>>> I have big arrays of structs that contains uint64_t members, so I'm 
>>> thinking how to save some space.
>>
>> The best way is to organise the fields so that they are naturally 
>> aligned, and don't have padding for alignment.  I like "-Wpadded" to 
>> tell me if there is unexpected padding.
>>
>>
>> What you /can/ do, however, is define a type that is 64 bits, but 4 
>> byte alignment:
>>
>> typedef uint64_t __attribute__((aligned(4))) uint64_a;
>>
>> Now you can use "uint64_a" instead of "uint64_t", and it will have 4 
>> byte alignment.
> 
> Before you wrote it's impossible to reduce the alignment from 8 to 4 
> with __attribute__((aligned(4))), but now you write it is possible.
> 

Putting it in a typedef lets you change the alignment.

See <https://godbolt.org/z/3fac7n7Yo>, and look at the code generated 
for the M0+ and the M4 to see how "packed" and "aligned" affects things.
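
On the earlier point about organising the fields so that no padding is needed, here is a small sketch.  Both struct definitions are hypothetical, and the sizes in the comments assume an ABI where uint64_t is 8-byte aligned, as it is under the ARM EABI.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Badly ordered fields: 4 bytes of padding after 'a' so that 'b' is
 * 8-byte aligned, and 4 bytes of tail padding after 'c' so that the
 * struct size is a multiple of 8.  Compiling with -Wpadded reports both. */
struct padded {
    uint32_t a;
    uint64_t b;
    uint32_t c;
};

/* Same fields with the largest first: everything is naturally aligned
 * and no padding is needed. */
struct reordered {
    uint64_t b;
    uint32_t a;
    uint32_t c;
};

static_assert(sizeof(struct reordered) == 16, "expected no padding");
```

With 8-byte alignment for uint64_t, struct padded takes 24 bytes against 16 for struct reordered, which soon adds up in a big array.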
