| Started by | Lester Thorpe <lt@gnu.rocks> |
|---|---|
| First post | 2025-09-10 12:33 +0000 |
| Last post | 2025-09-12 00:15 -0400 |
| Articles | 16 — 8 participants |
Hint For GNU/Linux Progrmmers Lester Thorpe <lt@gnu.rocks> - 2025-09-10 12:33 +0000
Re: Hint For GNU/Linux Progrmmers John McCue <jmclnx@gmail.com.invalid> - 2025-09-11 15:11 +0000
Re: Hint For GNU/Linux Progrmmers John Ames <commodorejohn@gmail.com> - 2025-09-11 09:30 -0700
Re: Hint For GNU/Linux Progrmmers c186282 <c186282@nnada.net> - 2025-09-12 00:25 -0400
Re: Hint For GNU/Linux Progrmmers Lester Thorpe <lt@gnu.rocks> - 2025-09-11 18:42 +0000
Re: Hint For GNU/Linux Progrmmers Lawrence D’Oliveiro <ldo@nz.invalid> - 2025-09-11 22:38 +0000
Re: Hint For GNU/Linux Progrmmers c186282 <c186282@nnada.net> - 2025-09-12 03:02 -0400
Re: Hint For GNU/Linux Progrmmers Lester Thorpe <lt@gnu.rocks> - 2025-09-12 08:19 +0000
Re: Hint For GNU/Linux Progrmmers Stéphane CARPENTIER <sc@fiat-linux.fr> - 2025-09-13 13:44 +0000
Re: Hint For GNU/Linux Progrmmers Farley Flud <fsquared@fsquared.linux> - 2025-09-12 10:17 +0000
Re: Hint For GNU/Linux Progrmmers c186282 <c186282@nnada.net> - 2025-09-12 06:51 -0400
Re: Hint For GNU/Linux Progrmmers The Natural Philosopher <tnp@invalid.invalid> - 2025-09-12 15:45 +0100
Re: Hint For GNU/Linux Progrmmers Farley Flud <fsquared@fsquared.linux> - 2025-09-12 15:20 +0000
Re: Hint For GNU/Linux Progrmmers The Natural Philosopher <tnp@invalid.invalid> - 2025-09-13 07:57 +0100
Re: Hint For GNU/Linux Progrmmers John Ames <commodorejohn@gmail.com> - 2025-09-12 10:11 -0700
Re: Hint For GNU/Linux Progrmmers c186282 <c186282@nnada.net> - 2025-09-12 00:15 -0400
| From | Lester Thorpe <lt@gnu.rocks> |
|---|---|
| Date | 2025-09-10 12:33 +0000 |
| Subject | Hint For GNU/Linux Progrmmers |
| Message-ID | <pan$88697$563f5c11$d06bbf9a$5d033190@gnu.rocks> |
Program optimization is essential, yet it is difficult to arrive at a best method. For example, unrolling all loops can either improve or degrade performance.

To get the best optimization, the user therefore has to experiment, through profiling or trial and error. This can be prohibitive for many users.

I propose that GNU/Linux programmers should determine the best options and then publish these recommendations in the source tree to guide the interested user.

When I create programs I always determine the best optimization, but these programs are only for my own use. They are never published.

--
Gentoo: the only road to GNU/Linux perfection.
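The trial-and-error search described above can be scripted. What follows is only a minimal sketch, not a recommendation: it assumes a C compiler reachable as `cc` that accepts GCC-style flags, and the candidate flag sets and the helper names (`build_command`, `median_runtime`, `sweep`) are hypothetical. It builds one binary per flag set and records the median wall-clock time of each.

```python
import statistics
import subprocess
import time

# Hypothetical candidate flag sets; each flag is a standard GCC option.
CANDIDATES = [
    ["-O1"],
    ["-O2"],
    ["-O3"],
    ["-O2", "-funroll-loops"],
    ["-O3", "-march=native"],
]


def build_command(source, flags, output, compiler="cc"):
    """Return the argv list that compiles `source` with `flags`."""
    return [compiler, *flags, "-o", output, source]


def median_runtime(argv, runs=5):
    """Run a command `runs` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def sweep(source, candidates=CANDIDATES):
    """Build `source` once per flag set and time each resulting binary."""
    results = {}
    for index, flags in enumerate(candidates):
        binary = f"./bench_{index}"
        subprocess.run(build_command(source, flags, binary), check=True)
        results[" ".join(flags)] = median_runtime([binary])
    return results
```

Calling `sweep("prog.c")`, where `prog.c` is a placeholder for the program being tuned, returns a dictionary mapping each flag string to its median runtime. The timings hold only for the machine and input they were measured on.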
| From | John McCue <jmclnx@gmail.com.invalid> |
|---|---|
| Date | 2025-09-11 15:11 +0000 |
| Message-ID | <109uoqs$2n3c4$1@dont-email.me> |
| In reply to | #73831 |
Follow-ups trimmed to comp.os.linux.misc
In comp.os.linux.misc Lester Thorpe <lt@gnu.rocks> wrote:
> Program optimization is essential, but yet it is difficult
> to arrive at a best method.
<snip>
> I propose that GNU/Linux programmers should determine
> the best options and then publish these recommendations
> in the source tree to guide the interested user.
I find O1 is good enough for all programs I create.
To me, testing and retesting different optimizations is a
huge waste of time and at most you might save 1 second :)
For programs created by others, I keep whatever setting
they use since they know much better than me.
--
csh(1) - "An elegant shell, for a more... civilized age."
- Paraphrasing Star Wars
| From | John Ames <commodorejohn@gmail.com> |
|---|---|
| Date | 2025-09-11 09:30 -0700 |
| Message-ID | <20250911093036.00006328@gmail.com> |
| In reply to | #73897 |
On Thu, 11 Sep 2025 15:11:24 -0000 (UTC)
John McCue <jmclnx@gmail.com.invalid> wrote:

> I find O1 is good enough for all programs I create.
>
> To me, testing and retesting different optimizations is a
> huge waste of time and at most you might save 1 second :)

Even if you're an optimization freak (and there's nothing wrong with that,) the efficacy of tweaks like loop unrolling is highly dependent on machine particulars (cache size, etc.) - it's difficult if not impossible to establish a one-size-fits-all recipe for True Optimum Performance that could be handed out to non-freaks, as is being suggested here.

Some level of tweaking may be warranted (e.g. unrolling loops in a way that suits the particular algorithm,) but there's little point trying to generalize deep grease-monkey fine-tuning across *all target systems* even for a single distro, let alone The World At Large.
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2025-09-12 00:25 -0400 |
| Message-ID | <8JednfQnLaxBPV71nZ2dnZfqnPWdnZ2d@giganews.com> |
| In reply to | #73898 |
On 9/11/25 12:30 PM, John Ames wrote:
> On Thu, 11 Sep 2025 15:11:24 -0000 (UTC)
> John McCue <jmclnx@gmail.com.invalid> wrote:
>
>> I find O1 is good enough for all programs I create.
>>
>> To me, testing and retesting different optimizations is a
>> huge waste of time and at most you might save 1 second :)
>
> Even if you're an optimization freak (and there's nothing wrong with
> that,) the efficacy of tweaks like loop unrolling is highly dependent
> on machine particulars (cache size, etc.) - it's difficult if not
> impossible to establish a one-size-fits-all recipe for True Optimum
> Performance that could be handed out to non-freaks, as is being
> suggested here. Some level of tweaking may be warranted (e.g. unrolling
> loops in a way that suits the particular algorithm,) but there's little
> point trying to generalize deep grease-monkey fine-tuning across *all
> target systems* even for a single distro, let alone The World At Large.

Best tack - proto, then re-write a few weeks later. The 2nd take will be smarter, tighter.

Compiler options ... only deliver slight improvements. Best used if you need SMALLER, not faster.

Had one microcontroller app I kept tweaking for five or six generations. Each time I could zap unnecessary steps. Got it down nearly 50% from the original - saved power (it was a solar-powered field app so that was kinda important).

New "AI" code-writing ... don't count on much "optimization". The AI won't really "get it". It may work - but be kinda messy. If it's a popular app, figure the power/time consumption of 'messy' for millions/billions of users.
| From | Lester Thorpe <lt@gnu.rocks> |
|---|---|
| Date | 2025-09-11 18:42 +0000 |
| Message-ID | <pan$f36d4$163cb056$190affba$333a3f12@gnu.rocks> |
| In reply to | #73897 |
On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:

> To me, testing and retesting different optimizations is a
> huge waste of time and at most you might save 1 second :)

That was my original point and the reason I suggest that programmers should do the dirty work for the user.

But seconds can quickly add up. For audio/video encoding and math/physics simulations, optimization can mean the difference between 20 minutes and 1 hour, which is highly significant.

There is an "ancient" program called "paranoia" which evaluates a machine's floating-point accuracy:

<https://netlib.sandia.gov/paranoia/paranoia.c>

Using your "-O1" to compile would lead to erroneous results. In this case, "-O0" is required.

Granted, this program precedes Linux and GCC, but other, more recent programs may behave in similar ways regarding optimization.

Therefore, it should be the programmer's responsibility to indicate the correct optimization.

> For programs created by others, I keep whatever setting
> they use since they know much better than me.

--
Gentoo: the only road to GNU/Linux perfection.
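One general reason a numerical program can be sensitive to how its arithmetic is compiled is that floating-point addition is not associative, so any transformation that regroups a sum can change the rounded result. By default GCC does not regroup floating-point sums (that needs flags such as `-ffast-math`), so this is not necessarily what affects paranoia at "-O1". The sketch below, written in Python for brevity, only shows the underlying arithmetic effect.

```python
# Floating-point addition is not associative: regrouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)   # False
print(left, right)     # 0.6000000000000001 0.6
```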
| From | Lawrence D’Oliveiro <ldo@nz.invalid> |
|---|---|
| Date | 2025-09-11 22:38 +0000 |
| Message-ID | <109vj1l$2vup8$6@dont-email.me> |
| In reply to | #73897 |
On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:

> To me, testing and retesting different optimizations is a huge waste of
> time and at most you might save 1 second :)

I was once hired to build an app in MATLAB for decoding and displaying multiple channels of EEG data, using its built-in GUI tools (momentary shudder as the PTSD kicks in), in real time. One of the original researchers had already written some stream-decoding code to start with; I had a go at doing it in different ways, and was able to achieve close to a 2:1 speedup on the DEC Alpha I was using for testing.

Then I ran the same code on the Windows NT box which was going to be used as the actual deployment platform ... and most of the speedup went away.
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2025-09-12 03:02 -0400 |
| Message-ID | <XPedndWqAP33WF71nZ2dnZfqnPednZ2d@giganews.com> |
| In reply to | #73911 |
On 9/11/25 6:38 PM, Lawrence D’Oliveiro wrote:
> On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:
>
>> To me, testing and retesting different optimizations is a huge waste of
>> time and at most you might save 1 second :)
>
> I was once hired to build an app in MATLAB for decoding and displaying
> multiple channels of EEG data, using its built-in GUI tools (momentary
> shudder as the PTSD kicks in), in real time. One of the original
> researchers had already written some stream-decoding code to start with; I
> had a go at doing it in different ways, and was able to achieve close to a
> 2:1 speedup on the DEC Alpha I was using for testing.
>
> Then I ran the same code on the Windows NT box which was going to be used
> as the actual deployment platform ... and most of the speedup went away.

THE best op is to proto, look/think for a few weeks, then re-write.

That will do FAR more than any compiler tweaks.

I like to proto in Python, then re-write in Pascal or maybe K&R 'C' depending.

New - "AI" generated code. The "AI" does NOT "get it". Its code will be MESSY - 'Lego'. Maybe not so bad for random utilities, but if the app is meant for millions/billions then shitty code sucks a LOT more CPU cycles and energy.

"AI" ... at present it's gonna suck maybe 25% of the entire global energy output just so it can pretend to be idiot people. BIZ loves it because they think it can replace all those annoying HUMANS. Alas, disemployed humans can't BUY their stuff so .......

Can't get there from here. Sorry.
| From | Lester Thorpe <lt@gnu.rocks> |
|---|---|
| Date | 2025-09-12 08:19 +0000 |
| Message-ID | <pan$7e885$b476cfd9$150b5649$705f5b01@gnu.rocks> |
| In reply to | #73940 |
On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:

> THE best op is to proto, look/think for a few weeks,
> then re-write.
>
> That will do FAR more than any compiler tweaks.

Everyone is missing the main point.

I am referring to optimizing code that is already published and available, e.g. the average GNU/Linux package.

This code cannot be (easily) rewritten by the user and the only way to optimize is during build time, which can be quite effective.

I have experienced up to 40% performance increase using just compiler options.

But finding the best options can at times be difficult and that's why the code author should provide guidance.

--
Gentoo: the only road to GNU/Linux perfection.
| From | Stéphane CARPENTIER <sc@fiat-linux.fr> |
|---|---|
| Date | 2025-09-13 13:44 +0000 |
| Message-ID | <68c57549$0$3363$426a34cc@news.free.fr> |
| In reply to | #73947 |
On 12-09-2025, Lester Thorpe <lt@gnu.rocks> wrote:
> On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:
>
>> THE best op is to proto, look/think for a few weeks,
>> then re-write.
>>
>> That will do FAR more than any compiler tweaks.
>
> Everyone is missing the main point.

Which one?
- That you are a fraud? Nope: I know it.
- That you don't know how to optimize compilation? Nope: I know it.
- That you can only copy/paste code? Nope: I know it.
- That you are a distro lackey? Nope: I know it.
- The fact that the more you speak about something, the less you know about it? Nope: I know it.
- That you are a Windows fanboy trying to make Linux users look like morons? Nope: I know it.

> I am referring to optimizing code that is already published
> and available, e.g. the average GNU/Linux package.

You mean that the guys who wrote and published the code know how to compile it? Or do you mean that the people competent enough to write code for a great tool are too stupid to know how to compile it?

Do you really understand how your sentence is, at the same time, stupid and inconsistent? You explain at the same time that they know what they are doing and that they don't know what they are doing. You just explained you need random people to help you find a general way to sort out what experts do well and what they don't.

> I have experienced up to 40% performance increase using just compiler
> options.

I don't believe that. And your last video proves that it's a lie.

> But finding the best options can at times be difficult

Agreed. But I don't believe you can find them. And I believe the distro managers, helped by the people who provided the code, can do it.

In any case, it would take me hours to find better options than what's provided by the distro managers, helped by package producers, to get noticeable results on my own computer. Another way to state it: spending hours to win a few seconds each month is a waste of my precious time.
--
If you have time to waste: https://scarpet42.gitlab.io
| From | Farley Flud <fsquared@fsquared.linux> |
|---|---|
| Date | 2025-09-12 10:17 +0000 |
| Message-ID | <186481985309dd34$12201$2237616$802601b3@news.usenetexpress.com> |
| In reply to | #73940 |
On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:

> THE best op is to proto, look/think for a few weeks,
> then re-write.
>
> That will do FAR more than any compiler tweaks.

Not necessarily.

Consider the Automatically Tuned Linear Algebra Software (ATLAS):

<https://math-atlas.sourceforge.net/>

Linear algebra (i.e. matrix operations) software is used as a standard benchmark for all supercomputers.

The ATLAS program will automatically tune itself, using compiler options, for the best performance on a particular machine.

ATLAS has some pre-determined options for certain CPUs but if a CPU is not on the list ATLAS will then undergo an automatic tuning wherein different options are tried and compared.

Compiler tweaks can make a big difference.

The original point of this thread is that all software should emulate ATLAS to some extent.

--
Hail Linux! Hail FOSS! Hail Stallman!
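The try-and-compare tuning described above can be sketched generically. This is not ATLAS's actual code or search strategy; it is a minimal illustration, assuming a hypothetical `pick_fastest` helper and a toy workload (summing a list in chunks of different sizes) standing in for real kernel variants or compiler options.

```python
import time


def chunked_sum(values, chunk):
    """Toy kernel: sum `values` in slices of `chunk` elements."""
    total = 0
    for start in range(0, len(values), chunk):
        total += sum(values[start:start + chunk])
    return total


def pick_fastest(candidates, run, repeats=3, timer=time.perf_counter):
    """Time run(candidate) for each candidate; return (best, timings).

    The best-of-`repeats` time is kept per candidate to reduce noise.
    """
    timings = {}
    for candidate in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = timer()
            run(candidate)
            best = min(best, timer() - start)
        timings[candidate] = best
    return min(timings, key=timings.get), timings


# Hypothetical tuning parameter: the chunk size used by the toy kernel.
data = list(range(200_000))
best_chunk, timings = pick_fastest(
    [16, 256, 4096, 65536],
    lambda chunk: chunked_sum(data, chunk),
)
print("fastest chunk size on this machine:", best_chunk)
```

The winner can differ from one machine to the next, which is why the measurement has to happen on the machine that will run the code.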
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2025-09-12 06:51 -0400 |
| Message-ID | <IfCdnU0dLJHfZl71nZ2dnZfqnPudnZ2d@giganews.com> |
| In reply to | #73951 |
On 9/12/25 6:17 AM, Farley Flud wrote:
> On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:
>
>> THE best op is to proto, look/think for a few weeks,
>> then re-write.
>>
>> That will do FAR more than any compiler tweaks.
>
> Not necessarily.

Should ......... always did for me .......

> Consider the Automatically Tuned Linear Algebra Software (ATLAS):
>
> <https://math-atlas.sourceforge.net/>

Ugh .... Gimme 'pure' 'C' or Pascal or FORTRAN. Linear algebra is not the best solution to everything.

> Linear algebra (i.e. matrix operations) software is used as a standard
> benchmark for all supercomputers.

Not interested in that sort of benchmark.

> The ATLAS program will automatically tune itself, using compiler options,
> for the best performance on a particular machine.

But we're not really talking 'compiler options' here but good/better/best source code. If your source is messy then no compiler can help you much.

> ATLAS has some pre-determined options for certain CPUs but if a CPU
> is not on the list ATLAS will then undergo an automatic tuning wherein
> different options are tried and compared.
>
> Compiler tweaks can make a big difference.

Depends. Garbage IN = Garbage OUT.

> The original point of this thread is that all software should
> emulate ATLAS to some extent.

Ummm ... maybe in deep theory ...... but that's not how the 99% will do it. "Hello World" does NOT need this approach.
| From | The Natural Philosopher <tnp@invalid.invalid> |
|---|---|
| Date | 2025-09-12 15:45 +0100 |
| Message-ID | <10a1bmp$3i30k$10@dont-email.me> |
| In reply to | #73951 |
On 12/09/2025 11:17, Farley Flud wrote:
> The ATLAS program will automatically tune itself, using compiler options,
> for the best performance on a particular machine.

How does it know what machine is the target?

--
There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.
Mark Twain
| From | Farley Flud <fsquared@fsquared.linux> |
|---|---|
| Date | 2025-09-12 15:20 +0000 |
| Message-ID | <1864922473a43b41$102$2557511$802601b3@news.usenetexpress.com> |
| In reply to | #73972 |
On Fri, 12 Sep 2025 15:45:45 +0100, The Natural Philosopher wrote:

> On 12/09/2025 11:17, Farley Flud wrote:
>> The ATLAS program will automatically tune itself, using compiler options,
>> for the best performance on a particular machine.
>
> How does it know what machine is the target?

The tuning occurs during build time. The "target" is the machine upon which it is being built.

No binaries are distributed. Only the source code is available.

However, some GNU/Linux distros will include binary Atlas packages but these are necessarily sub-optimal builds. Check out the blurb from Fedora:

https://www.rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/a/atlas-3.10.3-30.fc43.x86_64.html

--
Hail Linux! Hail FOSS! Hail Stallman!
| From | The Natural Philosopher <tnp@invalid.invalid> |
|---|---|
| Date | 2025-09-13 07:57 +0100 |
| Message-ID | <10a34kk$7bv3$1@dont-email.me> |
| In reply to | #73976 |
On 12/09/2025 16:20, Farley Flud wrote:
> On Fri, 12 Sep 2025 15:45:45 +0100, The Natural Philosopher wrote:
>
>> On 12/09/2025 11:17, Farley Flud wrote:
>>> The ATLAS program will automatically tune itself, using compiler options,
>>> for the best performance on a particular machine.
>>
>> How does it know what machine is the target?
>
> The tuning occurs during build time. The "target" is the machine upon which
> it is being built.

That will do really nicely when I am compiling for an ARM 2040 on my *86 machine, then...

> No binaries are distributed. Only the source code is available.
>
> However, some GNU/Linux distros will include binary Atlas packages
> but these are necessarily sub-optimal builds. Check out the blurb
> from Fedora:
>
> https://www.rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/a/atlas-3.10.3-30.fc43.x86_64.html

--
“Politics is the art of looking for trouble, finding it everywhere, diagnosing it incorrectly and applying the wrong remedies.”
- Groucho Marx
| From | John Ames <commodorejohn@gmail.com> |
|---|---|
| Date | 2025-09-12 10:11 -0700 |
| Message-ID | <20250912101103.00007f46@gmail.com> |
| In reply to | #73972 |
On Fri, 12 Sep 2025 15:45:45 +0100
The Natural Philosopher <tnp@invalid.invalid> wrote:

> > The ATLAS program will automatically tune itself, using compiler
> > options, for the best performance on a particular machine.
>
> How does it know what machine is the target?

Presumably it targets the machine on which it's running.

Reminds me a bit of one of the few genuinely smart things MS's .NET framework does - part of the install/update process involves it auto-profiling/tuning its core VM interpreter/library in situ so it can accurately benchmark itself. That only affects raw VM performance (a bad algorithm running on top of a fast VM is still gonna suck,) and it'd be more involved to do something comparable with a native-code application (dynamic linking might save you the trouble of a full recompile, but it'd still be non-trivial,) but it *is* a nice touch.
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2025-09-12 00:15 -0400 |
| Message-ID | <wIadnWI-CYX5A171nZ2dnZfqn_GdnZ2d@giganews.com> |
| In reply to | #73897 |
On 9/11/25 11:11 AM, John McCue wrote:
> Follow-ups trimmed to comp.os.linux.misc
>
> In comp.os.linux.misc Lester Thorpe <lt@gnu.rocks> wrote:
>> Program optimization is essential, but yet it is difficult
>> to arrive at a best method.
> <snip>
>> I propose that GNU/Linux programmers should determine
>> the best options and then publish these recommendations
>> in the source tree to guide the interested user.
>
> I find O1 is good enough for all programs I create.

Yep. As for the actual written code, write it once, wait a week or two, then write it over again better. I tended to proto in Python, then re-do in Pascal. The re-do was always a lot tighter/smarter.

> To me, testing and retesting different optimizations is a
> huge waste of time and at most you might save 1 second :)

Yep, esp at the compiler level. Refined source - might improve 25% or so. Drop unneeded/weird steps.

> For programs created by others, I keep whatever setting
> they use since they know much better than me.

Well, not necessarily ....