Groups > comp.arch.embedded > #32461 > unrolled thread
| Started by | pozz <pozzugno@gmail.com> |
|---|---|
| First post | 2026-01-20 13:26 +0100 |
| Last post | 2026-01-20 22:24 +0100 |
| Articles | 19 — 3 participants |
arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-20 13:26 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 17:07 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-20 16:41 +0000
Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-20 18:09 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 18:44 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-21 09:11 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-21 10:02 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-21 15:58 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-21 17:13 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-21 17:57 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-22 10:03 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-20 17:48 +0000
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 18:41 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-20 18:10 +0000
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 22:32 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment Grant Edwards <invalid@invalid.invalid> - 2026-01-21 03:38 +0000
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-21 08:54 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment pozz <pozzugno@gmail.com> - 2026-01-20 17:55 +0100
Re: arm-gcc, Cortex-M0+, uint64_t and alignment David Brown <david.brown@hesbynett.no> - 2026-01-20 22:24 +0100
| From | pozz <pozzugno@gmail.com> |
|---|---|
| Date | 2026-01-20 13:26 +0100 |
| Subject | arm-gcc, Cortex-M0+, uint64_t and alignment |
| Message-ID | <10kns7l$1733k$1@dont-email.me> |
I just discovered that my arm-gcc assigns an alignment of 8 to a struct with a uint64_t member.

First of all: I can't explain why. Cortex-M0+ shouldn't have any special load/store instructions for 64-bit data. I think a uint64_t variable is *always* accessed with two separate instructions.

Second thing: is it safe to force the alignment of such structs to 4 with __attribute__((aligned(4)))?

I have big arrays of structs that contain uint64_t members, so I'm thinking about how to save some space.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-20 17:07 +0100 |
| Message-ID | <10ko98i$1bptj$1@dont-email.me> |
| In reply to | #32461 |
On 20/01/2026 13:26, pozz wrote:
> I just discovered that my arm-gcc assigns an alignment of 8 to a struct
> with uint64_t member.
>
> First of all: I can't explain why. Cortex-M0+ shouldn't have any special
> load/store instructions for 64-bits data. I think the uint64_t variable
> is *always* accessed with two separate instructions.
>

There are other Cortex-M devices that /can/ access 64 bit data with a single instruction (though not always as an atomic function). Compilers use family ABI's, not ABI's specifically tuned for exact devices. The EABI for 32-bit ARM says long long's are 8 byte aligned, so that's what is used for all targets that use the EABI.

(There's a lot to dislike about the EABI - this is not the worst thing.)

> Second thing. Is it safe to force the alignment of such structs to 4
> with __attribute__((aligned(4)))?
>

You can't reduce the alignment of a struct or its elements by adding an __aligned_ attribute to the struct itself or any of its fields. The best you can do on the struct itself is __attribute__((packed)). But that can come with disadvantages, and inefficient use.

> I have big arrays of structs that contains uint64_t members, so I'm
> thinking how to save some space.

The best way is to organise the fields so that they are naturally aligned, and don't have padding for alignment. I like "-Wpadded" to tell me if there is unexpected padding.

What you /can/ do, however, is define a type that is 64 bits, but 4 byte alignment:

    typedef uint64_t __attribute__((aligned(4))) uint64_a;

Now you can use "uint64_a" instead of "uint64_t", and it will have 4 byte alignment.
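As an illustration of the typedef suggested above, here is a minimal sketch that compares two layouts. The struct names rec_plain and rec_a4 are hypothetical, and the 4-byte saving assumes a target where plain uint64_t has 8-byte ABI alignment (ARM EABI, and typically x86-64 hosts). It relies on the GCC/Clang "aligned" attribute and C11 static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit unsigned integer that only asks for 4-byte alignment
   (GCC/Clang extension; the name uint64_a follows the post above). */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Hypothetical record using the plain type: where uint64_t needs
   8-byte alignment, 4 bytes of padding follow x. */
struct rec_plain {
    uint32_t x;
    uint64_t y;
};

/* Same fields with the reduced-alignment type: no padding expected. */
struct rec_a4 {
    uint32_t x;
    uint64_a y;
};

/* Compile-time checks on the reduced-alignment layout. */
static_assert(sizeof(uint64_a) == 8, "uint64_a is still 64 bits wide");
static_assert(_Alignof(uint64_a) == 4, "uint64_a should be 4-byte aligned");
static_assert(offsetof(struct rec_a4, y) == 4, "y should follow x directly");
static_assert(sizeof(struct rec_a4) == 12, "no padding expected");

/* Bytes saved per array element by switching member types. */
static size_t bytes_saved_per_element(void)
{
    return sizeof(struct rec_plain) - sizeof(struct rec_a4);
}
```

Building with -Wpadded, as mentioned in the post, should report the padding in rec_plain; reordering fields so the 64-bit members come first is the alternative that needs no attribute at all.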
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2026-01-20 16:41 +0000 |
| Message-ID | <10kob7m$qel$1@reader2.panix.com> |
| In reply to | #32462 |
On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:

> You can't reduce the alignment of a struct or its elements by adding an
> __aligned_ attribute to the struct itself or any of its fields. The
> best you can do on the struct itself is __attribute__((packed)). But
> that can come with disadvantages, and inefficient use.

Yep making a structure aligned is an excellent way to introduce subtle bugs that happen when somebody, somewhere passes a pointer to one of those structure fields to some library function.

Somebody I used to work with was very fond of making all of his structures aligned (for no apparent reason). Then he would test his code on an X86 desktop machine. It worked fine because the X86 supports unaligned accesses. Then he would move to an ARM target, and it would fail.

Inevitably the cry "The compiler's broken!" would be heard, and I would have to explain to him for the Nth time about misaligned accesses on different ARM targets. Some of our targets generate a bus fault, some just silently read/write only part of the data.

That same guy once insisted that with the 32-bit GCC compiler we were using "unsigned long variables work, but unsigned variables don't". So he was busily changing all of his "unsigned" variables to "unsigned long". I printed out the assembly generated for both cases showing that it was identical. He then insisted that the linker must be doing something to break unsigned integers.

And then there was the time he decided that cross compiling on a single-core Linux host worked but compiling on a dual-core didn't. [Both cases using a single-threaded "make".]

And the time he decided that he needed to upgrade a bunch of the Ubuntu X11 libraries on the X86 host machine to fix a problem in the ARM target.

--
Grant
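The pointer hazard described above can be sketched as follows, assuming a packed structure is what leaves a field at an odd address. The names sample, twice and twice_value are hypothetical, and twice stands in for any library routine that expects a naturally aligned pointer. Reading the member directly lets the compiler pick a safe access sequence; only the address of an aligned local copy is passed on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wire-format record; packed, so value sits at offset 1. */
struct __attribute__((packed)) sample {
    uint8_t  tag;
    uint32_t value;
};

static_assert(sizeof(struct sample) == 5, "packed layout has no padding");

/* Stand-in for a library routine that assumes an aligned pointer. */
static uint32_t twice(const uint32_t *p)
{
    return *p * 2u;
}

/* Safe pattern: read the member by name (the compiler knows it may be
   misaligned), then hand out the address of an aligned local copy. */
static uint32_t twice_value(const struct sample *s)
{
    uint32_t tmp = s->value;
    return twice(&tmp);
}
```

Calling twice(&s->value) instead would compile, possibly with a warning, but the callee would dereference a pointer that may not be 4-byte aligned, which is the kind of failure the anecdote describes.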
| From | pozz <pozzugno@gmail.com> |
|---|---|
| Date | 2026-01-20 18:09 +0100 |
| Message-ID | <10kocr4$1d65l$1@dont-email.me> |
| In reply to | #32463 |
On 20/01/2026 17:41, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>
>> You can't reduce the alignment of a struct or its elements by adding an
>> __aligned_ attribute to the struct itself or any of its fields. The
>> best you can do on the struct itself is __attribute__((packed)). But
>> that can come with disadvantages, and inefficient use.
>
> Yep making a structure aligned is an excellent way to introduce subtle
> bugs that happen when somebody, somewhere passes a pointer to one of
> those structure fields to some library function.

However, as long as the application runs on Cortex-M0+, the aligned version shouldn't introduce issues, should it?
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-20 18:44 +0100 |
| Message-ID | <10koet6$1dlne$2@dont-email.me> |
| In reply to | #32465 |
On 20/01/2026 18:09, pozz wrote:
> On 20/01/2026 17:41, Grant Edwards wrote:
>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>
>>> You can't reduce the alignment of a struct or its elements by adding an
>>> __aligned_ attribute to the struct itself or any of its fields. The
>>> best you can do on the struct itself is __attribute__((packed)). But
>>> that can come with disadvantages, and inefficient use.
>>
>> Yep making a structure aligned is an excellent way to introduce subtle
>> bugs that happen when somebody, somewhere passes a pointer to one of
>> those structure fields to some library function.
>
> However, as long as the application runs on Cortex-M0+, the aligned
> version shouldn't introduce issues, should it?

Correctly aligned data is never a problem. /Misaligned/ data is a problem.

The Cortex-M0+ cannot access misaligned data directly. But if the compiler knows that it is misaligned - by "packed" struct, or "aligned" attribute on the typedef - it should break apart the accesses into bytes or 16-bit half-words as necessary. (Aligning a uint64_t to 4 byte alignment will not be a problem.)
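When the data can sit at an arbitrary byte address (for example inside a receive buffer), a portable alternative to the attributes discussed above is to go through memcpy, which has no alignment requirement. This is not something proposed in the thread, only a minimal sketch of a commonly used technique; load_u64 and store_u64 are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint64_t) == 8, "helpers assume an 8-byte uint64_t");

/* Read a 64-bit value from any byte address. The compiler is free to
   turn the copy into whatever access widths the target allows. */
static uint64_t load_u64(const unsigned char *src)
{
    uint64_t v;
    memcpy(&v, src, sizeof v);
    return v;
}

/* Write a 64-bit value to any byte address. */
static void store_u64(unsigned char *dst, uint64_t v)
{
    memcpy(dst, &v, sizeof v);
}
```

The values keep the host byte order, so this covers alignment only, not endianness conversion.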
| From | pozz <pozzugno@gmail.com> |
|---|---|
| Date | 2026-01-21 09:11 +0100 |
| Message-ID | <10kq1m7$1un41$1@dont-email.me> |
| In reply to | #32467 |
On 20/01/2026 18:44, David Brown wrote:
> On 20/01/2026 18:09, pozz wrote:
>
> <snip>
>
>> However, as long as the application runs on Cortex-M0+, the aligned
>> version shouldn't introduce issues, should it?
>
> Correctly aligned data is never a problem. /Misaligned/ data is a problem.
>
> The Cortex-M0+ cannot access misaligned data directly. But if the
> compiler knows that it is misaligned - by "packed" struct, or "aligned"
> attribute on the typedef - it should break apart the accesses into bytes
> or 16-bit half-words as necessary. (Aligning a uint64_t to 4 byte
> alignment will not be a problem.)

However, for Cortex-M0+ a uint64_t aligned at 4 bytes is:

- aligned for the core (two 4-byte aligned accesses are required)
- misaligned for the ABI and the compiler

We agree that forcing the gcc compiler to consider 4 bytes as the required alignment of uint64_t (using the aligned attribute) is always safe.

However, what really changes in the binary output?

In some cases, the address of a uint64_t can change from an 8-byte to a 4-byte aligned address (because we instructed the compiler to do so). What about the code that accesses a uint64_t aligned to 4 bytes? Is it identical for the 4-byte and 8-byte alignment requirements? I think so, because in both cases the compiler should emit two 4-byte load/store instructions.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-21 10:02 +0100 |
| Message-ID | <10kq4mi$1vcdq$1@dont-email.me> |
| In reply to | #32474 |
On 21/01/2026 09:11, pozz wrote:

<snip>

> However for Cortex-M0+ uint64_t aligned at 4 bytes is:
> - aligned for the core (two 4-bytes aligned accesses are required)

Yes. As far as I know, the M0+ core does not need any alignment greater than 4 for any purpose. (But I might not know everything about the core!) There can be alignment requirements for other things, such as DMA.

> - misaligned for the ABI and the compiler

Yes.

> We agree that forcing the gcc compiler to consider 4-bytes as the
> required alignment of uint64_t (using aligned attribute) is always safe.

No.

It will almost always be safe, but you don't have any guarantees. The compiler knows that if "p" is of type "uint64_t *", then "(uintptr_t) p & 0x07" will always be zero. Is it likely that you would have anything in your code where that is relevant, and also that the compiler would generate code that relies on that assumption? No, it is very unlikely.

But there is a general principle that you should not lie to your compiler - don't write code that executes UB, breaks ABIs, or is otherwise breaking the contract you have with the compiler unless you are using compiler features that let you keep everything honest.

Part of that is that code you are writing now for the M0+ might be copied or adapted to a different target at a different time. Maybe on a different core, the same data will be read using some kind of SIMD or vector instruction that /does/ require 8-byte alignment. Don't mess with these things without telling your compiler. And don't mess with them without telling future maintainers and programmers using the code (including your future self).

I would be extremely surprised to find code that fails to work on an M0+ because of a uint64_t pointer that is 4-byte aligned but not 8-byte aligned. But if /I/ want to use 64-bit integers with 4-byte alignments, I'd use the typedef'd aligned type for the object type and for any relevant pointers.
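The advice to use the typedef'd type "for the object type and for any relevant pointers" might look like the sketch below. The struct reading and the functions bump and latest are hypothetical names; the point is only that every pointer to the member is declared as uint64_a *, so the reduced alignment stays visible in the type and the compiler has no reason to assume the low three address bits are zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4-byte aligned 64-bit type, as suggested earlier in the thread. */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Hypothetical record; timestamp lands at offset 4. */
struct reading {
    uint32_t id;
    uint64_a timestamp;
};

static_assert(offsetof(struct reading, timestamp) == 4, "no padding expected");

/* The parameter type carries the 4-byte alignment. */
static void bump(uint64_a *t, uint64_t delta)
{
    *t += delta;
}

/* Largest timestamp in an array of records. */
static uint64_t latest(const struct reading *r, size_t n)
{
    uint64_t best = 0;
    for (size_t i = 0; i < n; i++) {
        const uint64_a *t = &r[i].timestamp;  /* not const uint64_t * */
        if (*t > best) {
            best = *t;
        }
    }
    return best;
}
```

Values copied out of the member, such as best and delta here, can use plain uint64_t, since they are ordinary objects with their normal alignment.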
| From | pozz <pozzugno@gmail.com> |
|---|---|
| Date | 2026-01-21 15:58 +0100 |
| Message-ID | <10kqpg9$26oa0$1@dont-email.me> |
| In reply to | #32475 |
On 21/01/2026 10:02, David Brown wrote:

<snip>

> But there is a general principle that you should not lie to your
> compiler - don't write code that executes UB, breaks ABIs, or is
> otherwise breaking the contract you have with the compiler unless you
> are using compiler features that let you keep everything honest.
>
> Part of that is that code you are writing now for the M0+ might be
> copied or adapted to a different target at a different time. Maybe on a
> different core, the same data will be read using some kind of SIMD or
> vector instruction that /does/ require 8-byte alignment. Don't mess
> these things without telling your compiler. And don't mess with them
> without telling future maintainers and programmers using the code
> (including your future self).

But that is exactly what I wanted to do: explicitly tell the compiler to align uint64_t at a 4-byte address (as I wrote, with the aligned attribute). I didn't mean to lie to my best friend, the compiler.

What I wanted to know is whether there are other issues or drawbacks, such as a penalty of extra instructions. From the godbolt link that you shared in another post, it seems there's a penalty of a single instruction (strangely, it seems the compiler needs to save the struct pointer to r3 before loading the two halves of the word, but only if the uint64_t is aligned to 4 bytes).

> I would be extremely surprised to find code that fails to work on an M0+
> because of a uint64_t pointer that is 4-byte aligned but not 8-byte
> aligned. But if /I/ want to use 64-bit integers with 4-byte alignments,
> I'd use the typedef'd aligned type for the object type and for any
> relevant pointers.

Yes, of course. Still, I don't understand why the compiler isn't able to place the uint64_t member of struct B at a 4-byte aligned address.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-21 17:13 +0100 |
| Message-ID | <10kqu02$280ts$1@dont-email.me> |
| In reply to | #32476 |
On 21/01/2026 15:58, pozz wrote:

<snip>

> But it is exactly what I wanted to do: explicitly tell the compiler to
> align uint64_t at a 4-bytes address (as I wrote, with attribute align).
> I didn't think to lie to my best friend compiler.

uint64_t on 32-bit EABI ARM has an alignment of 8 bytes. That's cut in stone, and you cannot change it (short of adding a new ABI to the toolchain). If you try to use uint64_t objects that are not 8-byte aligned, or try to use pointers that are not 8-byte aligned to access uint64_t types, you are lying to your compiler.

If you make a new type that is like a uint64_t but with an "aligned(4)" attribute, you have a /new/ type. And that type will work just like you want - it is an 8 byte unsigned integer with a 4 byte alignment. As long as you use that consistently, you'll be fine.

> What I wanted to know is if there were other issues or drawback, such as
> more instructions penalty.

The drawback from trying to use an object of a type with an improper alignment is that you have UB. What more reasons do you want for not doing it?

> From the godbolt link that you share in
> another post, it seems there's a penalty of a single instruction (it's
> strange, it seems the compiler needs to save the struct pointer to r3,
> before loading the two halves of the word, but only if uint64_t is
> aligned to 4-bytes).

The compiler is not perfect here - there is definitely an extra instruction because it is reading the low word first. (clang reads the low word first for uint64_t as well, meaning it gives worse code for A and B as well.) In real code, rather than a brief test snippet, other factors could mean this does not happen - it's only because the pointer happens to be in r0 that you see it here.

But there's no harm in filing a gcc bug on this, looking for an obvious improvement.

>> I would be extremely surprised to find code that fails to work on an
>> M0+ because of a uint64_t pointer that is 4-byte aligned but not
>> 8-byte aligned. But if /I/ want to use 64-bit integers with 4-byte
>> alignments, I'd use the typedef'd aligned type for the object type and
>> for any relevant pointers.
>
> Yes, of course. Even if I don't understand why the compiler isn't able
> to align at 4-bytes address the uint64_t member in struct B.

It can't align the uint64_t member because the EABI says uint64_t (or, rather, unsigned long long) is 8 bytes aligned. gcc didn't make those rules - ARM did.

As I briefly mentioned before, there are a number of very poor choices in the EABI (and the 32-bit ARM ABI used for Linux). This is far from the worst.
| From | pozz <pozzugno@gmail.com> |
|---|---|
| Date | 2026-01-21 17:57 +0100 |
| Message-ID | <10kr0gf$29a7t$1@dont-email.me> |
| In reply to | #32477 |
On 21/01/2026 17:13, David Brown wrote:
> On 21/01/2026 15:58, pozz wrote:
>
> <snip>
>
>> But it is exactly what I wanted to do: explicitly tell the compiler to
>> align uint64_t at a 4-bytes address (as I wrote, with attribute
>> align). I didn't think to lie to my best friend compiler.
>>
>
> uint64_t on 32-bit EABI ARM has an alignment of 8 bytes. That's cut in
> stone, and you cannot change it (short of adding a new ABI to the
> toolchain). If you try to use uint64_t objects that are not 8-byte
> aligned, or try to use pointers that are not 8-byte aligned to access
> uint64_t types, you are lying to your compiler.
>
> If you make a new type that is like a uint64_t but with an "aligned(4)"
> attribute, you have a /new/ type. And that type will work just like you
> want - it is an 8 byte unsigned integer with a 4 byte alignment. As
> long as you use that consistently, you'll be fine.
>
>> What I wanted to know is if there were other issues or drawback, such
>> as more instructions penalty.
>
> The drawback from trying to use an object of a type with an improper
> alignment is that you have UB. What more reasons do you want for not
> doing it?
Most probably I can't explain what I mean. I don't want to use an
*improper* alignment (different from the one that gcc is really using).
I want to know what happens when I *instruct* the compiler to use a
4-byte alignment for uint64_t in the context of the Cortex-M0+ core only.

In other words, is it completely safe to use, as you suggested,

    typedef uint64_t __attribute__((aligned(4))) uint64_a;

???

From what you wrote, I think yes. Maybe just a very small optimization
penalty.

>> From the godbolt link that you share in another post, it seems there's
>> a penalty of a single instruction (it's strange, it seems the compiler
>> needs to save the struct pointer to r3, before loading the two halves
>> of the word, but only if uint64_t is aligned to 4-bytes).
>>
>
> The compiler is not perfect here - there is definitely an extra
> instruction because it is reading the low word first. (clang reads the
> low word first for uint64_t as well, meaning it gives worse code for A
> and B as well.) In real code, rather than a brief test snippet, other
> factors could mean this does not happen - it's only because the pointer
> happens to be in r0 that you see it here.
>
> But there's no harm in filing a gcc bug on this, looking for an obvious
> improvement.
>
>>
>>> I would be extremely surprised to find code that fails to work on an
>>> M0+ because of a uint64_t pointer that is 4-byte aligned but not
>>> 8-byte aligned. But if /I/ want to use 64-bit integers with 4-byte
>>> alignments, I'd use the typedef'd aligned type for the object type
>>> and for any relevant pointers.
>>
>> Yes, of course. Even if I don't understand why the compiler isn't able
>> to align at 4-bytes address the uint64_t member in struct B.
>>
>
> It can't align the uint64_t member because the EABI says uint64_t (or,
> rather, unsigned long long) is 8 bytes aligned. gcc didn't make those
> rules - ARM did.
But struct B is defined with correct alignment attribute for uint64_t
member. I tried also:
struct B {
uint32_t x;
uint64_t y __attribute__((aligned(4)));
};
The struct size is still 16, so y is placed at offset 8 and not 4. It
seems to me gcc doesn't honour a 4-byte aligned attribute when it is
specified on the member inside the struct definition.
I don't see many differences with:
typedef __attribute__((aligned(4))) uint64_t uint64_a;
struct C {
uint32_t x;
uint64_a y;
};
>
> As I briefly mentioned before, there are a number of very poor choices
> in the EABI (and the 32-bit ARM ABI used for Linux). This is far from
> the worst.
>
>>>> However, what really changes in the binary output?
>>>>
>>>> In some cases, the address of uint64_t can change from 8-bytes to
>>>> 4-bytes aligned address (because we instructed it to do so). What
>>>> about the code that accesses uint64_t aligned to 4-bytes? Is it
>>>> identical between 4- and 8-bytes alignment requirement? I think so,
>>>> because in both case, the compiler should add two load/store 4-bytes
>>>> instructions.
>>>>
>>>
>>
>
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-22 10:03 +0100 |
| Message-ID | <10ksp4q$2s8q3$1@dont-email.me> |
| In reply to | #32478 |
On 21/01/2026 17:57, pozz wrote:
> Il 21/01/2026 17:13, David Brown ha scritto:
>> On 21/01/2026 15:58, pozz wrote:
>>> Il 21/01/2026 10:02, David Brown ha scritto:
>>>> On 21/01/2026 09:11, pozz wrote:
<snip for brevity>
> Most probably I can't explain what I want to say. I don't want to use an
> *improper* alignment (different from the one that gcc really is using).
> I want to know what happens when I *instruct* the compiler to use a
> 4-byte alignment for uint64_t, in the context of the Cortex-M0+ core only.
>
I think we may have been talking slightly past each other, so that
re-wording was helpful.
> In other words, is it completely safe to use, as you suggested,
>
> typedef uint64_t __attribute__((aligned(4))) uint64_a;
>
> ???
Barring compiler bugs, yes, that is completely safe. When the compiler
lets you make such a type, and use it, it is the compiler's
responsibility to get the details right. You should never see issues
from the compiler's knowledge and assumptions of alignments, and it
should generate instructions that work on the target (for example, if
the target hardware required 8-byte alignment for 64-bit loads and
stores, then the compiler would generate two 32-bit accesses instead).
And for the Cortex M series, 4-byte alignment is the maximum needed for
working code (though there might be efficiency differences on some of
the biggest M cores that have 64-bit buses internally, or when data
caches are used).
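A minimal sketch of what "use it consistently" could look like: the reduced-alignment typedef is used for the object and for every pointer to it. The names `sample` and `read_stamp` are made up for illustration; only `uint64_a` comes from the thread, and the attribute is a gcc/clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* 8-byte unsigned integer with 4-byte alignment (gcc/clang extension). */
typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Hypothetical record: the 64-bit member may sit at a 4-byte offset. */
struct sample {
    uint32_t id;
    uint64_a stamp;
};

/* The pointer type carries the reduced alignment, so the compiler
 * never assumes more than 4-byte alignment when dereferencing it. */
static uint64_t read_stamp(const uint64_a *p)
{
    return *p;
}
```

Calling `read_stamp(&s.stamp)` on a `struct sample s` keeps the compiler fully informed; casting the same address to a plain `uint64_t *` is the "lying to your compiler" case.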
>
> From what you wrote, I think yes. Maybe just a very small optimization
> penalty.
>
Yes. And I think that is a "missed optimisation opportunity" bug. I
suspect (or speculate), but have not looked at the compiler code to be
sure, that the code generator generally accesses the low half of 64-bit
data first. And then it may have specific optimisations ("peephole"
optimisations) for re-ordering the accesses for "long long" types in
certain circumstances, saving a register and an instruction. However,
that would apply only to the specific type - and while "uint64_a" works
a lot like "unsigned long long", it is not that exact type, and won't
trigger the same optimisation.
>
>>> From the godbolt link that you shared in another post, it seems
>>> there's a penalty of a single instruction (it's strange, it seems the
>>> compiler needs to save the struct pointer to r3, before loading the
>>> two halves of the word, but only if uint64_t is aligned to 4 bytes).
>>>
>>
>> The compiler is not perfect here - there is definitely an extra
>> instruction because it is reading the low word first. (clang reads
>> the low word first for uint64_t as well, meaning it gives worse code
>> for A and B as well.) In real code, rather than a brief test snippet,
>> other factors could mean this does not happen - it's only because the
>> pointer happens to be in r0 that you see it here.
>>
>> But there's no harm in filing a gcc bug on this, looking for an
>> obvious improvement.
>>
>>>
>>>> I would be extremely surprised to find code that fails to work on an
>>>> M0+ because of a uint64_t pointer that is 4-byte aligned but not
>>>> 8-byte aligned. But if /I/ want to use 64-bit integers with 4-byte
>>>> alignments, I'd use the typedef'd aligned type for the object type
>>>> and for any relevant pointers.
>>>
>>> Yes, of course. Even if I don't understand why the compiler isn't
>>> able to align at 4-bytes address the uint64_t member in struct B.
>>>
>>
>> It can't align the uint64_t member because the EABI says uint64_t (or,
>> rather, unsigned long long) is 8 bytes aligned. gcc didn't make those
>> rules - ARM did.
>
> But struct B is defined with correct alignment attribute for uint64_t
> member. I tried also:
>
> struct B {
> uint32_t x;
> uint64_t y __attribute__((aligned(4)));
> };
>
> The struct size is still 16, so y is placed at offset 8 and not 4. It
> seems to me gcc doesn't honour a 4-byte aligned attribute when it is
> specified on the member inside the struct definition.
That is my conclusion too. (I tried the "aligned" attribute in every
place I could.) The only place it worked was on a typedef for the new
"uint64_a" type.
>
> I don't see many differences with:
>
> typedef __attribute__((aligned(4))) uint64_t uint64_a;
>
> struct C {
> uint32_t x;
> uint64_a y;
> };
>
It certainly seems inconsistent to me that it works on the typedef, and
not directly in the struct definition. After all, a typedef does not
actually define a new type (it's a silly name) - it merely defines an
alias or shortcut name for a type. So it would seem logical that using
"uint64_a" or "__attribute__((aligned(4))) uint64_t" in the struct
definition would mean exactly the same thing. But apparently not. gcc
attributes are not part of the normal C grammar, so there's no standard
to fall back on here.
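For what it's worth, the gcc manual does describe this split: on a struct or a struct member, "aligned" can only increase alignment unless "packed" is given as well, while on a typedef it may also decrease it. A small sketch to inspect the two layouts follows; the helper function names are invented for illustration, and the exact numbers depend on the target ABI (the thread reports offset 8 and size 16 for struct B on ARM EABI).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t __attribute__((aligned(4))) uint64_a;

/* Attribute on the member: can only raise the field's alignment
 * unless "packed" is specified too, so it has no effect here. */
struct B {
    uint32_t x;
    uint64_t y __attribute__((aligned(4)));
};

/* Attribute on a typedef: allowed to lower the alignment. */
struct C {
    uint32_t x;
    uint64_a y;
};

/* Helpers so the layout can be inspected at run time. */
static size_t offset_of_y_in_B(void) { return offsetof(struct B, y); }
static size_t offset_of_y_in_C(void) { return offsetof(struct C, y); }
```

Printing or asserting these offsets on the real target is a cheap way to confirm which of the two spellings the compiler actually honours.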
>>
>> As I briefly mentioned before, there are a number of very poor choices
>> in the EABI (and the 32-bit ARM ABI used for Linux). This is far from
>> the worst.
>>
>>>>> However, what really changes in the binary output?
>>>>>
>>>>> In some cases, the address of uint64_t can change from 8-bytes to
>>>>> 4-bytes aligned address (because we instructed it to do so). What
>>>>> about the code that accesses uint64_t aligned to 4-bytes? Is it
>>>>> identical between 4- and 8-bytes alignment requirement? I think so,
>>>>> because in both case, the compiler should add two load/store
>>>>> 4-bytes instructions.
>>>>>
>>>>
>>>
>>
>
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2026-01-20 17:48 +0000 |
| Message-ID | <10kof5q$929$1@reader2.panix.com> |
| In reply to | #32465 |
On 2026-01-20, pozz <pozzugno@gmail.com> wrote:
> On 20/01/2026 17:41, Grant Edwards wrote:
>> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>>
>>> You can't reduce the alignment of a struct or its elements by adding an
>>> __aligned__ attribute to the struct itself or any of its fields. The
>>> best you can do on the struct itself is __attribute__((packed)). But
>>> that can come with disadvantages, and inefficient use.
>>
>> Yep making a structure aligned is an excellent way to introduce subtle
>> bugs that happen when somebody, somewhere passes a pointer to one of
>> those structure fields to some library function.
Aargh, my bad. I meant that making a structure _packed_ is an excellent
way to introduce subtle bugs that happen when somebody, somewhere passes
a pointer to one of those structure fields to some library function.
> However, as long as the application runs on Cortex-M0+, the aligned
> version shouldn't introduce issues, should it?
A non-packed structure should always be OK.

A packed structure will work fine as long as it's being accessed by
code that "knows" it's packed. You can pass a pointer to a packed struct
to a function as long as it's declared in that function as a pointer to
a packed struct: the compiler will generate extra code to deal with
accesses to values that are misaligned due to the packing.

However, passing a pointer to a field of a packed structure (e.g. to a
uint64_t) to a function where it is declared as a normal "uint64_t *p"
can cause failures on ARM targets. It will work OK on X86. I think it
used to work OK on m68k also. IIRC SPARC failed in similar ways to ARM.
--
Grant
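A small sketch of the safe patterns described above, assuming gcc/clang `packed` syntax. The struct `record` and both function names are hypothetical. The pattern to avoid is taking `&r->value` and handing it to code that expects an ordinary `uint64_t *`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire-format record; "packed" removes all padding, so
 * "value" sits at offset 1 and is misaligned. */
struct __attribute__((packed)) record {
    uint8_t  tag;
    uint64_t value;
};

/* Safe: the access goes through the packed struct type, so the
 * compiler emits whatever byte-wise code the target needs. */
static uint64_t record_value(const struct record *r)
{
    return r->value;
}

/* Also safe: copy the bytes into an ordinary, naturally aligned
 * uint64_t before passing it to code that knows nothing of packing. */
static uint64_t record_value_copy(const struct record *r)
{
    uint64_t v;
    memcpy(&v, (const unsigned char *)r + offsetof(struct record, value),
           sizeof v);
    return v;
}
```

Either way, library code only ever sees a properly aligned copy, never a pointer into the packed struct.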
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-20 18:41 +0100 |
| Message-ID | <10koep6$1dlne$1@dont-email.me> |
| In reply to | #32463 |
On 20/01/2026 17:41, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>
>> You can't reduce the alignment of a struct or its elements by adding an
>> __aligned__ attribute to the struct itself or any of its fields. The
>> best you can do on the struct itself is __attribute__((packed)). But
>> that can come with disadvantages, and inefficient use.
>
> Yep making a structure aligned is an excellent way to introduce subtle
To be clear - you mean making the structure /packed/, not /aligned/, or
in some other way allowing objects to be placed at a smaller alignment
than the ABI says.
> bugs that happen when somebody, somewhere passes a pointer to one of
> those structure fields to some library function. Somebody I used to
> work with was very fond of making all of his structures aligned (for
> no apparent reason). Then he would test his code on an X86 desktop
> machine. It worked fine because the X86 supports unaligned
> accesses. Then he would move to an ARM target, and it would
> fail. Inevitably the cry "The compiler's broken!" would be heard, and
> I would have to explain to him for the Nth time about misaligned
> accesses on different ARM targets. Some of our targets generate a bus
> fault, some just silently read/write only part of the data.
>
Cortex M3 and bigger all handle misaligned accesses without problem
(albeit possibly at a performance penalty). Cortex M0, M0+ and M1 do
not support misaligned accesses. On an M4, the compiler should generate
normal 32-bit loads and stores for a "packed" struct with 32-bit fields,
but it should generate byte-by-byte accesses on a Cortex M0.

It's fine to have packed structs, or types with smaller than normal
alignment, as long as the compiler knows that's the case. So you don't
take pointers to fields in a "packed" struct, and if you use something
like the "uint64_a" type I suggested, it should be accessed by pointers
to its real type, not pointers to "uint64_t".

(In practice, I would expect that pointers to uint64_t would work,
because the accesses will be 32-bit anyway, but you should never lie to
your compiler!)
> That same guy once insisted that with the 32-bit GCC compiler we were
> using "unsigned long variables work, but unsigned variables don't". So
> he was busily changing all of his "unsigned" variables to "unsigned
> long". I printed out the assembly generated for both cases showing
> that it was identical. He then insisted that the linker must be doing
> something to break unsigned integers.
That is, shall we say, a strange idea from that guy. Perhaps he was
confused by uint32_t being "unsigned long" on EABI 32-bit ARM, rather
than "unsigned int" (as it is on 32-bit ARM Linux, and on Windows)?

I have often seen people think that "unsigned int" and "unsigned long"
are the same type on 32-bit ARM, just because they are both 32-bit, and
get confused when there are compiler complaints about incompatible
pointers when they are mixed.
>
> And then there was the time he decided that cross compiling on a
> single-core Linux host worked but compiling on a dual-core
> didn't. [Both cases using a single-threaded "make".]
>
> And the time he decided that he needed to upgrade a bunch of the Ubuntu
> X11 libraries on the X86 host machine to fix a problem in the ARM
> target.
>
Someone is a little confused :-)
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2026-01-20 18:10 +0000 |
| Message-ID | <10kogeb$n48$1@reader2.panix.com> |
| In reply to | #32466 |
On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
> Cortex M3 and bigger all handle misaligned accesses without problem
> (albeit possibly at a performance penalty).
FWIW, the M3 can be configured to generate a fault on unaligned
accesses, so whether it works or not depends on your low-level init
code. I believe that unaligned-fault-enable feature is disabled by
default at reset.

Also, the M3 only supports non-word-aligned accesses for normal single
store/load instructions. LDM/STM and LDRD/STRD will fault on
non-word-aligned access.
--
Grant
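A hedged sketch of the configuration bit in question: on ARMv7-M (M3 and up) this is UNALIGN_TRP, bit 3 of the Configuration and Control Register (SCB->CCR at 0xE000ED14). The function name is invented, and the register address is passed in as a parameter purely so the sketch can also be exercised on a host; real projects would normally use the CMSIS `SCB->CCR` definition.

```c
#include <assert.h>
#include <stdint.h>

/* UNALIGN_TRP is bit 3 of the Configuration and Control Register
 * (SCB->CCR, address 0xE000ED14 on ARMv7-M). */
#define CCR_UNALIGN_TRP (1u << 3)

/* Hypothetical helper. On a real M3 pass (volatile uint32_t *)0xE000ED14
 * or &SCB->CCR; on a host any uint32_t variable will do. */
static void set_unaligned_trap(volatile uint32_t *ccr, int enable)
{
    if (enable)
        *ccr |= CCR_UNALIGN_TRP;   /* unaligned accesses now fault */
    else
        *ccr &= ~CCR_UNALIGN_TRP;  /* unaligned LDR/STR are tolerated */
}
```

Enabling the trap during development is one way to flush out accidental misaligned accesses on an M3 or M4 before the same code is moved to an M0+.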
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-20 22:32 +0100 |
| Message-ID | <10kos8h$1ivol$2@dont-email.me> |
| In reply to | #32469 |
On 20/01/2026 19:10, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>
>> Cortex M3 and bigger all handle misaligned accesses without problem
>> (albeit possibly at a performance penalty).
>
> FWIW, the M3 can be configured to generate a fault on unaligned
> accesses, so whether it works or not depends on your low-level init
> code. I believe that unaligned-fault-enable feature is disabled by
> default at reset.
I did not know that.
> Also, the M3 only supports non-word-aligned
> accesses for normal single store/load instructions. LDM/STM and
> LDRD/STRD will fault on non-word-aligned access.
>
Yes. Of course, the LDM/STM are primarily used for pushing and popping
registers on the stack, so you are always going to be aligned there.

In the godbolt.org link I posted in a reply to Pozz, we can see that
when the compiler knows the uint64_t is aligned at least to 4 bytes, it
uses LDRD, but when it does not know that it is 4 bytes aligned, it uses
two LDR instructions.

(As an aside, I find it annoying that STRD can be interrupted in the
middle - it means you don't have an atomic 64-bit store. LDRD can also
be interrupted in the middle, but as it is restarted, it gives you a
64-bit atomic read.)
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2026-01-21 03:38 +0000 |
| Message-ID | <10kphnl$gvp$1@reader2.panix.com> |
| In reply to | #32471 |
On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
> (As an aside, I find it annoying that STRD can be interrupted in the
> middle - it means you don't have an atomic 64-bit store. LDRD can also
> be interrupted in the middle, but as it is restarted, it gives you a
> 64-bit atomic read.)
Yes, I just noticed that in the manual the other day, and it seemed
like an odd decision.
--
Grant
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-21 08:54 +0100 |
| Message-ID | <10kq0nq$1u1ai$1@dont-email.me> |
| In reply to | #32472 |
On 21/01/2026 04:38, Grant Edwards wrote:
> On 2026-01-20, David Brown <david.brown@hesbynett.no> wrote:
>
>> (As an aside, I find it annoying that STRD can be interrupted in the
>> middle - it means you don't have an atomic 64-bit store. LDRD can also
>> be interrupted in the middle, but as it is restarted, it gives you a
>> 64-bit atomic read.)
>
> Yes, I just noticed that in the manual the other day, and it seemed
> like an odd decision.
>
It's not odd from the implementation viewpoint, but disappointing from
the user viewpoint.

The double loads and stores are implemented as a sort of combination of
two instructions, or at least two actions. Disabling interrupts in the
middle of the instructions would mean additional hardware logic. (I
think all longer-running instructions, like divisions, are
interruptible.) When the interrupt returns, the instructions are simply
restarted. That gives an atomic 64-bit load, so it lets you safely read
64-bit data that is changed by an interrupt or higher-priority thread -
unlike using two separate 32-bit load instructions. (Using a volatile
read appears to force the use of LDRD on gcc for M3 and above, while
non-volatile reads might be split and re-arranged depending on the
surrounding code.)

An interrupted double store is, obviously, a very different matter -
your interrupt routines or pre-empting threads see half-written data.

My guess as to the decision process is that making these instructions
non-interruptible would have taken more hardware, and weakened
guarantees on interrupt latency. But if they had asked /me/, I'd have
chosen to make STRD non-interruptible :-)
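On a core with no LDRD at all (ARMv6-M, so the M0+ from the start of the thread), a 64-bit read is always two loads. One common workaround, which is a different technique from the LDRD restart described above, is a high/low/high retry loop. The sketch below assumes the value is a monotonically increasing counter written only by an interrupt handler; the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tick counter kept as two 32-bit halves. Assumption:
 * only an interrupt handler writes it, and it only ever counts up. */
struct tick64 {
    volatile uint32_t lo;
    volatile uint32_t hi;
};

/* Read hi, then lo, then hi again. If hi did not change, no carry
 * from lo into hi happened in between, so (hi, lo) belong together. */
static uint64_t tick64_read(const struct tick64 *t)
{
    uint32_t hi, lo, hi_again;
    do {
        hi       = t->hi;
        lo       = t->lo;
        hi_again = t->hi;
    } while (hi != hi_again);
    return ((uint64_t)hi << 32) | lo;
}
```

For values that are not monotonic counters this loop is not sufficient, and briefly masking interrupts around the two loads is the more usual choice.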
| From | pozz <pozzugno@gmail.com> |
|---|---|
| Date | 2026-01-20 17:55 +0100 |
| Message-ID | <10koc0h$1cr4j$1@dont-email.me> |
| In reply to | #32462 |
On 20/01/2026 17:07, David Brown wrote:
> On 20/01/2026 13:26, pozz wrote:
>> I just discovered that my arm-gcc assigns an alignment of 8 to a
>> struct with uint64_t member.
>>
>> First of all: I can't explain why. Cortex-M0+ shouldn't have any
>> special load/store instructions for 64-bits data. I think the uint64_t
>> variable is *always* accessed with two separate instructions.
>>
>
> There are other Cortex-M devices that /can/ access 64 bit data with a
> single instruction (though not always as an atomic function).
>
> Compilers use family ABI's, not ABI's specifically tuned for exact
> devices. The EABI for 32-bit ARM says long long's are 8 byte aligned,
> so that's what is used for all targets that use the EABI. (There's a
> lot to dislike about the EABI - this is not the worst thing.)
So the ABI used by arm-gcc is the EABI, which is valid for a whole list
of Cortex-M devices, a few of which require 8-byte alignment of 64-bit
integers.
>> Second thing. Is it safe to force the alignment of such structs to 4
>> with __attribute__((aligned(4)))?
>
> You can't reduce the alignment of a struct or its elements by adding an
> __aligned__ attribute to the struct itself or any of its fields. The
> best you can do on the struct itself is __attribute__((packed)). But
> that can come with disadvantages, and inefficient use.
But this is the opposite of what you write below!
>> I have big arrays of structs that contains uint64_t members, so I'm
>> thinking how to save some space.
>
> The best way is to organise the fields so that they are naturally
> aligned, and don't have padding for alignment. I like "-Wpadded" to
> tell me if there is unexpected padding.
>
>
> What you /can/ do, however, is define a type that is 64 bits, but 4 byte
> alignment:
>
> typedef uint64_t __attribute__((aligned(4))) uint64_a;
>
> Now you can use "uint64_a" instead of "uint64_t", and it will have 4
> byte alignment.
Earlier you wrote that it's impossible to reduce the alignment from 8 to
4 with __attribute__((aligned(4))), but now you write that it is possible.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-01-20 22:24 +0100 |
| Message-ID | <10korqk$1ivol$1@dont-email.me> |
| In reply to | #32464 |
On 20/01/2026 17:55, pozz wrote:
> On 20/01/2026 17:07, David Brown wrote:
>> On 20/01/2026 13:26, pozz wrote:
>>> I just discovered that my arm-gcc assigns an alignment of 8 to a
>>> struct with uint64_t member.
>>>
>>> First of all: I can't explain why. Cortex-M0+ shouldn't have any
>>> special load/store instructions for 64-bits data. I think the
>>> uint64_t variable is *always* accessed with two separate instructions.
>>>
>>
>> There are other Cortex-M devices that /can/ access 64 bit data with a
>> single instruction (though not always as an atomic function).
>>
>> Compilers use family ABI's, not ABI's specifically tuned for exact
>> devices. The EABI for 32-bit ARM says long long's are 8 byte aligned,
>> so that's what is used for all targets that use the EABI. (There's a
>> lot to dislike about the EABI - this is not the worst thing.)
>
> So the ABI used by arm-gcc is the EABI, which is valid for a whole list
> of Cortex-M devices, a few of which require 8-byte alignment of 64-bit
> integers.
>
I don't think any of the 32-bit Cortex-M cores actually need 8 byte
alignment in the hardware - it could have been for compatibility with
64-bit devices.
>
>>> Second thing. Is it safe to force the alignment of such structs to 4
>>> with __attribute__((aligned(4)))?
>>
>> You can't reduce the alignment of a struct or its elements by adding
>> an __aligned__ attribute to the struct itself or any of its fields.
>> The best you can do on the struct itself is __attribute__((packed)).
>> But that can come with disadvantages, and inefficient use.
>
> But this is the opposite of what you write below!
No, but I might have been unclear.

Adding the "aligned" attribute to the /struct/, or to the field members
/directly/, does not help you here. Adding it to a new typedef does.
>
>
>>> I have big arrays of structs that contains uint64_t members, so I'm
>>> thinking how to save some space.
>>
>> The best way is to organise the fields so that they are naturally
>> aligned, and don't have padding for alignment. I like "-Wpadded" to
>> tell me if there is unexpected padding.
>>
>>
>> What you /can/ do, however, is define a type that is 64 bits, but 4
>> byte alignment:
>>
>> typedef uint64_t __attribute__((aligned(4))) uint64_a;
>>
>> Now you can use "uint64_a" instead of "uint64_t", and it will have 4
>> byte alignment.
>
> Earlier you wrote that it's impossible to reduce the alignment from 8 to
> 4 with __attribute__((aligned(4))), but now you write that it is possible.
>
Putting it in a typedef lets you change the alignment.

See <https://godbolt.org/z/3fac7n7Yo>, and look at the code generated
for the M0+ and the M4 to see how "packed" and "aligned" affects things.
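The "organise the fields" advice quoted above can be sketched with two hypothetical structs holding the same members. The exact size of the first one depends on the ABI; where uint64_t is 8-byte aligned (ARM EABI included) it needs padding both before `b` and at the end, which is the kind of thing "-Wpadded" would flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout with the 64-bit member in the middle: padding
 * is needed before "b" and again at the end wherever uint64_t is
 * 8-byte aligned. */
struct padded {
    uint32_t a;
    uint64_t b;
    uint32_t c;
};

/* Same members, largest first: the two 32-bit fields fill one
 * 8-byte slot, so no padding is needed. */
struct reordered {
    uint64_t b;
    uint32_t a;
    uint32_t c;
};
```

For big arrays of such structs, reordering gets the space back without any attributes, so every pointer and library call keeps working with plain uint64_t.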