


transpiling to low level C

Started by: Thiago Adams <thiago.adams@gmail.com>
First post: 2024-12-15 00:05 -0300
Last post: 2025-02-09 12:43 -0800
Articles: 20 on this page of 140; 19 participants



Contents

  transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 00:05 -0300
    Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-15 04:31 +0000
      Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 07:44 -0300
        Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-15 22:22 +0000
          Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 20:22 -0300
            Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-16 01:02 -0600
              Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-16 08:17 -0300
              Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 11:46 +0000
              Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-16 19:44 +0000
              Re: transpiling to low level C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-12-16 13:59 -0800
                Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 23:36 +0000
    Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-14 20:39 -0800
      Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 07:49 -0300
        Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-15 13:01 -0800
          Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-02-15 21:01 -0800
            USENET and spam (Was: Re: transpiling to low level C) Salvador Mirzo <smirzo@example.com> - 2025-02-16 10:17 -0300
    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-15 11:28 +0000
      Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 08:46 -0300
        Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 09:13 -0300
    Re: transpiling to low level C Bonita Montero <Bonita.Montero@gmail.com> - 2024-12-15 20:08 +0100
      Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-15 21:32 +0000
        Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-15 17:53 -0600
          Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-16 10:36 +0100
          Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-16 08:21 -0300
            Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 01:03 -0600
              Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 14:55 -0300
                Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 14:59 -0300
                  Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 15:16 -0300
                    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 18:37 +0000
                      Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 16:07 -0300
                        Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 19:42 +0000
                          Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 12:51 -0600
                            Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-18 16:43 -0300
                              Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 18:27 -0600
                                Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-19 00:35 +0000
                                  Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 23:46 -0600
                                    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-19 11:27 +0000
                                      Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-19 14:36 -0600
                                        Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-20 05:10 -0600
                                    Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-23 02:08 +0000
                                      Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-23 05:15 -0600
                Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 13:07 -0600
                  Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 16:33 -0300
                    Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 12:51 -0600
                  Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-21 05:34 +0000
          Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-16 18:12 +0100
            Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 18:37 +0000
              Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-16 21:39 +0100
                Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 23:26 +0000
                  Re: transpiling to low level C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-12-16 17:19 -0800
                    Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 00:40 -0600
                    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 16:17 +0000
                      Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-17 18:18 +0100
                      Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-17 18:46 +0000
                        Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 22:45 +0000
                          Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-18 00:23 +0000
                            Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-18 01:24 +0000
                              Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-18 03:51 +0000
                        Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-18 17:26 +0100
                      Re: transpiling to low level C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-12-17 12:13 -0800
                        Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-18 17:19 +0100
                  Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-17 18:29 +0100
            Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-20 17:28 -0800
              Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-21 21:31 +0100
                Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-21 13:51 -0800
                  Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 01:22 +0100
                    Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-13 08:10 -0800
                Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 00:20 +0200
                  Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 01:13 +0100
                    Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 02:18 +0200
                      Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 01:39 +0100
                        Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 03:04 +0200
                          Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 03:06 +0100
                            Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-22 17:39 -0800
                              Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-23 02:41 +0000
                                Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 08:43 +0100
                                  Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-25 00:51 -0600
                                    Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-28 09:20 -0800
                                Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-04 12:12 -0800
                                  Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-01-04 12:53 -0800
                                  Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2025-01-05 11:18 +0000
                                    Re: transpiling to low level C James Kuyper <jameskuyper@alumni.caltech.edu> - 2025-01-05 12:04 -0500
                                    Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-07 21:38 -0800
                          Re: transpiling to low level C James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-12-21 22:17 -0500
                            Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 19:51 +0100
                          Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-06-06 11:50 -0700
                  Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 13:02 -0800
                    Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-23 13:25 -0800
                      Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-23 15:50 -0800
                Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-22 06:01 +0000
                  Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 11:22 +0200
                    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-22 11:35 +0000
                  Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-22 10:38 -0800
                    Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-22 19:44 +0000
                      Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-04 11:18 -0800
                  Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 20:41 +0100
                    Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-23 00:20 +0200
                      Re: transpiling to low level C scott@slp53.sl.home (Scott Lurndal) - 2024-12-23 15:41 +0000
                        Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-23 15:51 +0000
                        Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-23 18:05 +0200
                        Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 14:05 -0800
                    Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-22 23:29 +0000
                    Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 09:46 +0100
                      Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-23 11:35 +0000
                        Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 13:18 +0100
                      Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-23 13:40 +0200
                        Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 13:24 +0100
                        Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 13:18 -0800
                          Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2024-12-24 00:41 +0000
                            Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 20:55 -0800
                          Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-25 03:41 -0600
                            Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-25 15:43 -0600
                            Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-28 09:24 -0800
                              Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-28 13:59 -0600
                                Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-31 04:57 -0800
                      Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-23 13:28 -0800
                    Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 14:00 -0800
                Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2024-12-22 14:19 +0000
                  Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2024-12-22 15:30 +0000
                Re: transpiling to low level C Kaz Kylheku <643-408-1753@kylheku.com> - 2024-12-22 21:45 +0000
                  Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-22 23:22 +0000
                    Re: transpiling to low level C Kaz Kylheku <643-408-1753@kylheku.com> - 2024-12-22 23:47 +0000
                  Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-22 17:22 -0800
          Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-16 21:23 +0000
            Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-17 11:16 +0200
            Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 12:04 +0000
              Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 12:51 -0600
                Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-18 12:08 +0000
                  Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 12:50 -0600
                    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-18 23:37 +0000
              Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-17 19:40 +0000
                Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 19:45 +0000
                  Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-17 22:25 +0000
                    Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 22:55 +0000
                      Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-18 05:55 +0000
                        Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-19 00:32 +0000
      Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-16 21:22 +0000
        Re: transpiling to low level C Rosario19 <Ros@invalid.invalid> - 2024-12-26 13:16 +0100
    Re: transpiling to low level C User One <noreply@invalid.com> - 2025-02-09 17:51 +0000
      Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-02-09 12:43 -0800



#389676

From: bart <bc@freeuk.com>
Date: 2024-12-15 21:32 +0000
Message-ID: <vjnhsq$oh1f$1@dont-email.me>
In reply to: #389674
On 15/12/2024 19:08, Bonita Montero wrote:
> C++ is more readable because it is magnitudes more expressive than C.
> You can easily write a C++ statement that would take hundreds of lines in
> C (imagine specializing an unordered_map by hand). Making a language
> less expressive makes it even less readable, and that's also true for
> your reduced C.
> 

That's not really the point of it. This reduced C is used as an 
intermediate language for a compiler target. It will not usually be 
read, or maintained.

An intermediate language needs to be at a lower level than the source language.

And for this project, it needs to be compilable by any C89 compiler.

Generating C++ would be quite useless.



#389679

From: BGB <cr88192@gmail.com>
Date: 2024-12-15 17:53 -0600
Message-ID: <vjnq5s$pubt$1@dont-email.me>
In reply to: #389676
On 12/15/2024 3:32 PM, bart wrote:
> That's not really the point of it. This reduced C is used as an 
> intermediate language for a compiler target. It will not usually be 
> read, or maintained.
> 
> An intermediate language needs to be at a lower level than the source 
> language.
> 
> And for this project, it needs to be compilable by any C89 compiler.
> 
> Generating C++ would be quite useless.
> 

As an IL, even C is a little overkill, unless turned into a restricted 
subset (say, along similar lines to GCC's GIMPLE).

Say:
   Only function-scope variables allowed;
   No high-level control structures;
   ...

Say:
   int foo(int x)
   {
     int i, v;
     for(i=x, v=0; i>0; i--)
       v=v*i;
     return(v);
   }

Becoming, say:
   int foo(int x)
   {
     int i;
     int v;
     i=x;
     v=0;
     if(i<=0)goto L1;
     L0:
     v=v*i;
     i=i-1;
     if(i>0)goto L0;
     L1:
     return v;
   }

...
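
A minimal sketch of the lowering above, written so the two forms can be compared directly. The names `foo_structured` and `foo_lowered` are made up for illustration, and `v` starts at 1 here (an assumption, not what the fragment above uses) so that the result is not always zero:

```c
/* Structured form: countdown product, with v starting at 1 (assumed). */
int foo_structured(int x)
{
    int i, v;
    for (i = x, v = 1; i > 0; i--)
        v = v * i;
    return v;
}

/* Same function with the loop lowered to labels and conditional gotos. */
int foo_lowered(int x)
{
    int i;
    int v;
    i = x;
    v = 1;
    if (i <= 0) goto L1;   /* skip the body when the loop runs zero times */
L0:
    v = v * i;
    i = i - 1;
    if (i > 0) goto L0;    /* back edge of the loop */
L1:
    return v;
}
```

Both functions should agree for every input, including zero and negative values, where the body never runs.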


Though, this still requires the backend to have a full parser.



It would be simpler still for a backend to use a plain binary 
serialization, and/or maybe a syntax more like BASIC or similar.

Say:
   SUB foo ( x as Int ) as Int
     DIM i as Int
     DIM v as Int
     i=x
     v=0
     IF i <= 0 THEN GOTO L1
     L0:
     v = v * i
     i = i - 1
     IF i > 0 THEN GOTO L0
     L1:
     RETURN v
   END SUB

Where one can have a parser that reads one line at a time, breaks it 
into tokens, and does simple pattern matching.
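
As a rough sketch of that kind of line-at-a-time matching, here is one pattern handled with `sscanf`. The helper name `match_if_goto` and the buffer sizes are hypothetical, and the sketch assumes tokens are separated by spaces as in the listing above:

```c
#include <stdio.h>

/* Hypothetical helper: match one line of the form
 *     IF <var> <op> <int> THEN GOTO <label>
 * var and label must have room for 16 chars, op for 3 chars.
 * Returns 1 when the whole pattern matched, 0 otherwise. */
int match_if_goto(const char *line, char *var, char *op,
                  int *value, char *label)
{
    /* Whitespace in the format skips any run of blanks; the literal
     * words IF, THEN and GOTO must appear exactly. */
    int n = sscanf(line, " IF %15s %2s %d THEN GOTO %15s",
                   var, op, value, label);
    return n == 4;
}
```

A real reader would try a handful of such patterns per line (DIM, assignment, GOTO, RETURN, and so on) and reject anything that matches none of them.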

Pretty much all higher level control flow can be expressed via goto.

Variables within sub-blocks can be promoted to function scope, possibly 
renamed:
   int i;
   if() {
     long i;
     ...
   }
Remaps:
   int i$0;
   long i$1;
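
A small sketch of the same renaming in compilable form. The suffixes `_0` and `_1` stand in for `$0` and `$1`, since `$` in identifiers is an extension rather than standard C; the function names and the body are invented for illustration:

```c
/* Original shape: the inner block declares a second i that shadows the outer one. */
long scoped(int x)
{
    int i = x;
    long r = 0;
    if (i > 0) {
        long i = 100L;      /* inner i, visible only in this block */
        r = i + x;
    }
    return r + i;           /* outer i again */
}

/* After promotion: every variable lives at function scope under a unique name. */
long flattened(int x)
{
    int i_0;                /* was the outer i */
    long r_0;
    long i_1;               /* was the inner i */
    i_0 = x;
    r_0 = 0;
    if (!(i_0 > 0)) goto L_end;
    i_1 = 100L;
    r_0 = i_1 + x;
L_end:
    return r_0 + i_0;
}
```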

"switch()" can be decomposed:
   switch(i)
   {
     case 1:
       A;
       break;
     case 2:
       B;
     case 3:
       C;
   }
To:
   if(i==1)goto CL1;
   if(i==2)goto CL2;
   if(i==3)goto CL3;
   goto CDFL;
   CL1:
     A;
     goto CEND;
   CL2:
     B;
   CL3:
     C;
   CDFL:
   CEND:

...
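
To check that the decomposition keeps the fall-through behaviour, here is a sketch where A, B and C each append a letter to a buffer. The function names and the buffer convention are assumptions made for the example:

```c
/* Structured version: case 2 falls through into case 3, there is no default. */
void switch_structured(int i, char *out)
{
    int n = 0;
    switch (i) {
    case 1:
        out[n++] = 'A';
        break;
    case 2:
        out[n++] = 'B';
    case 3:
        out[n++] = 'C';
    }
    out[n] = '\0';
}

/* Lowered version: compare-and-branch chain followed by the labelled bodies. */
void switch_lowered(int i, char *out)
{
    int n = 0;
    if (i == 1) goto CL1;
    if (i == 2) goto CL2;
    if (i == 3) goto CL3;
    goto CDFL;
CL1:
    out[n++] = 'A';
    goto CEND;              /* the break */
CL2:
    out[n++] = 'B';         /* no goto here: falls through into CL3 */
CL3:
    out[n++] = 'C';
CDFL:
CEND:
    out[n] = '\0';
}
```

For inputs 1, 2 and 3 both versions should produce "A", "BC" and "C", and an empty string for anything else.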



#389681

From: David Brown <david.brown@hesbynett.no>
Date: 2024-12-16 10:36 +0100
Message-ID: <vjosa4$12l76$1@dont-email.me>
In reply to: #389679
On 16/12/2024 00:53, BGB wrote:
> As an IL, even C is a little overkill, unless turned into a restricted 
> subset (say, along similar lines to GCC's GIMPLE).
> 

Have either you or Thiago looked at C-- as an alternative?  I don't know 
how practical it would be.

<https://en.wikipedia.org/wiki/C-->



#389683

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2024-12-16 08:21 -0300
Message-ID: <vjp2f3$13k4m$2@dont-email.me>
In reply to: #389679
On 15/12/2024 20:53, BGB wrote:
> As an IL, even C is a little overkill, unless turned into a restricted 
> subset (say, along similar lines to GCC's GIMPLE).
> 
> Say:
>    Only function-scope variables allowed;
>    No high-level control structures;
>    ...
> 
> Say:
>    int foo(int x)
>    {
>      int i, v;
>      for(i=x, v=0; i>0; i--)
>        v=v*i;
>      return(v);
>    }
> 
> Becoming, say:
>    int foo(int x)
>    {
>      int i;
>      int v;
>      i=x;
>      v=0;
>      if(i<=0)goto L1;
>      L0:
>      v=v*i;
>      i=i-1;
>      if(i>0)goto L0;
>      L1:
>      return v;
>    }
> 
> ...
> 

I have considered removing loops and keeping only goto.
But I don't think this brings much simplification.



#389696

From: BGB <cr88192@gmail.com>
Date: 2024-12-17 01:03 -0600
Message-ID: <vjr7np$1j57r$2@dont-email.me>
In reply to: #389683
On 12/16/2024 5:21 AM, Thiago Adams wrote:
> 
> I have considered removing loops and keeping only goto.
> But I don't think this brings much simplification.
> 

It depends.

If the compiler works like an actual C compiler, with a full parser and 
AST stage, yeah, it may not save much.


If the parser is a thin wrapper over 3AC operations (only allowing 
statements that map 1:1 with a 3AC IR operation), it may save a bit more...
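
As a rough illustration of what "maps 1:1 with a 3AC operation" could look like, each accepted statement becomes one record holding an operator, a destination, and two sources. The `tac` struct, the opcode names, and the slot-array convention below are hypothetical and not taken from any particular compiler:

```c
/* Hypothetical three-address-code instruction: slots[dst] = slots[a] OP slots[b]. */
enum tac_op { TAC_ADD, TAC_SUB, TAC_MUL };

struct tac {
    enum tac_op op;
    int dst, a, b;          /* indices into the slot array */
};

/* Run n instructions in order against slots[]. */
void tac_run(const struct tac *code, int n, int *slots)
{
    int k;
    for (k = 0; k < n; k++) {
        int x = slots[code[k].a];
        int y = slots[code[k].b];
        switch (code[k].op) {
        case TAC_ADD: slots[code[k].dst] = x + y; break;
        case TAC_SUB: slots[code[k].dst] = x - y; break;
        case TAC_MUL: slots[code[k].dst] = x * y; break;
        }
    }
}
```

With slots laid out as {a, b, c, t1, t2}, the two records {TAC_MUL, 3, 0, 1} and {TAC_ADD, 4, 3, 2} compute t1 = a * b and then t2 = t1 + c. Nothing here requires SSA: a slot may be assigned more than once.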



As for whether or not it makes sense to use a C like syntax here, this 
is more up for debate (for practical use within a compiler, I would 
assume a binary serialization rather than an ASCII syntax, though ASCII 
may be better in terms of inter-operation or human readability).


But, as can be noted, I would assume a binary serialization that is 
oriented around operators; and *not* about serializing the structures 
used to implement those operators. Also I would assume that the IR need 
not be in SSA form (conversion to full SSA could be done when reading in 
the IR operations).


My argument is that not using SSA form means fewer issues for both the 
serialization format and the compiler front-end to deal with (and SSA is 
comparatively easy to regenerate in the backend, with the backend 
operating on its internal IR in SSA form).

Well, in contrast to LLVM, which assumes everything is always in SSA form.

...



#389702

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2024-12-17 14:55 -0300
Message-ID: <vjsdum$1rfp2$1@dont-email.me>
In reply to: #389696
On 12/17/2024 4:03 AM, BGB wrote:
> 
> It depends.
> 
> If the compiler works like an actual C compiler, with a full parser and 
> AST stage, yeah, it may not save much.
> 
> 
> If the parser is a thin wrapper over 3AC operations (only allowing 
> statements that map 1:1 with a 3AC IR operation), it may save a bit more...
> 
> 
> 
> As for whether or not it makes sense to use a C like syntax here, this 
> is more up for debate (for practical use within a compiler, I would 
> assume a binary serialization rather than an ASCII syntax, though ASCII 
> may be better in terms of inter-operation or human readability).
> 
> 
> But, as can be noted, I would assume a binary serialization that is 
> oriented around operators; and *not* about serializing the structures 
> used to implement those operators. Also I would assume that the IR need 
> not be in SSA form (conversion to full SSA could be done when reading in 
> the IR operations).
> 
> 
> My argument is that not using SSA form means fewer issues for both the 
> serialization format and the compiler front-end to deal with (and SSA is 
> comparatively easy to regenerate in the backend, with the backend 
> operating on its internal IR in SSA form).
> 
> Well, in contrast to LLVM, which assumes everything is always in SSA form.
> 
> ...
> 
> 

I have also considered splitting expressions.

For instance

if (a*b+c) {}

into

register int r1 = a * b;
register int r2 = r1 + c;
if (r2) {}

This would make it easier to add overflow checks at runtime (if desired) 
and to implement things like _Complex.

Is this what you mean by 3AC or SSA?

This would definitely simplify the expression grammar.
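
A minimal sketch of how the split form could carry overflow checks, one per temporary. The helper names and the `ok` out-parameter are assumptions for the example, not what any existing tool emits:

```c
#include <limits.h>

/* Returns 1 if a + b would overflow int. */
static int add_overflows(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return 0;
}

/* Returns 1 if a * b would overflow int (checked by division, no wider type). */
static int mul_overflows(int a, int b)
{
    if (a == 0 || b == 0) return 0;
    if (a > 0) {
        if (b > 0) return a > INT_MAX / b;
        return b < INT_MIN / a;
    }
    if (b > 0) return a < INT_MIN / b;
    return a < INT_MAX / b;             /* both negative */
}

/* Evaluates a*b+c one operation at a time; *ok is 0 if any step overflowed. */
int eval_checked(int a, int b, int c, int *ok)
{
    int r1, r2;
    *ok = 0;
    if (mul_overflows(a, b)) return 0;
    r1 = a * b;
    if (add_overflows(r1, c)) return 0;
    r2 = r1 + c;
    *ok = 1;
    return r2;
}
```

Because every operation has its own temporary, the check goes before each single statement and no expression tree has to be walked.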



#389703

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2024-12-17 14:59 -0300
Message-ID: <vjse6l$1rfp2$2@dont-email.me>
In reply to: #389702
On 12/17/2024 2:55 PM, Thiago Adams wrote:
> 
> I also have considered split expressions.
> 
> For instance
> 
> if (a*b+c) {}
> 
> into
> 
> register int r1 = a * b;
> register int r2 = r1 + c;
> if (r2) {}
> 
> This would make easier to add overflow checks in runtime (if desired) 
> and implement things like _complex
> 
> Is this what you mean by 3AC or SSA?
> 
> This would definitely simplify expressions grammar.
> 
> 

I have also considered removing local scopes. But I think local scopes may 
be useful for making better use of the stack, reusing the same addresses 
once variables go out of scope.
For instance:

{
  int i =1;
  {
   int a  = 2;
  }
  {
   int b  = 3;
  }
}
I think scopes make it easier to reuse the same stack position for a and b, 
because it is easier to see that a no longer exists.
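
A minimal sketch of that idea, assuming a hypothetical lowering in which the block scopes are removed and the locals are laid out by hand in one frame (the names scoped, flattened and frame are only illustrative). Since a and b are never alive at the same time, they can share a slot; whether a real compiler actually does so is up to its allocator.

```c
/* Original form: a and b live in sibling scopes. */
int scoped(void)
{
    int sum = 0;
    int i = 1;
    {
        int a = 2;
        sum += i + a;
    }
    {
        int b = 3;
        sum += i + b;
    }
    return sum;
}

/* Hypothetical flattened form: one hand-laid-out frame.
   a and b share slot 1 because their lifetimes do not overlap. */
int flattened(void)
{
    int frame[2];               /* slot 0: i, slot 1: a, later b */
    int sum = 0;
    frame[0] = 1;               /* i = 1 */
    frame[1] = 2;               /* a = 2 */
    sum += frame[0] + frame[1];
    frame[1] = 3;               /* b = 3, reusing a's slot */
    sum += frame[0] + frame[1];
    return sum;
}
```

Both functions compute the same value; the second only makes the slot sharing explicit.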



#389704

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2024-12-17 15:16 -0300
Message-ID: <vjsf6g$1rlkq$1@dont-email.me>
In reply to: #389703
On 12/17/2024 2:59 PM, Thiago Adams wrote:
> On 12/17/2024 2:55 PM, Thiago Adams wrote:
>> On 12/17/2024 4:03 AM, BGB wrote:
>>> On 12/16/2024 5:21 AM, Thiago Adams wrote:
>>>> On 15/12/2024 20:53, BGB wrote:
>>>>> On 12/15/2024 3:32 PM, bart wrote:
>>>>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>>>>> C++ is more readable because it is magnitudes more expressive 
>>>>>>> than C.
>>>>>>> You can easily write a C++ statement that would be hundreds of
>>>>>>> lines in C (imagine specializing an unordered_map by hand).
>>>>>>> Making a language less expressive makes it even less readable,
>>>>>>> and that's also true for your reduced C.
>>>>>>>
>>>>>>
>>>>>> That's not really the point of it. This reduced C is used as an 
>>>>>> intermediate language for a compiler target. It will not usually 
>>>>>> be read, or maintained.
>>>>>>
>>>>>> An intermediate language needs to be at a lower level than the 
>>>>>> source language.
>>>>>>
>>>>>> And for this project, it needs to be compilable by any C89 compiler.
>>>>>>
>>>>>> Generating C++ would be quite useless.
>>>>>>
>>>>>
>>>>> As an IL, even C is a little overkill, unless turned into a 
>>>>> restricted subset (say, along similar lines to GCC's GIMPLE).
>>>>>
>>>>> Say:
>>>>>    Only function-scope variables allowed;
>>>>>    No high-level control structures;
>>>>>    ...
>>>>>
>>>>> Say:
>>>>>    int foo(int x)
>>>>>    {
>>>>>      int i, v;
>>>>>      for(i=x, v=0; i>0; i--)
>>>>>        v=v*i;
>>>>>      return(v);
>>>>>    }
>>>>>
>>>>> Becoming, say:
>>>>>    int foo(int x)
>>>>>    {
>>>>>      int i;
>>>>>      int v;
>>>>>      i=x;
>>>>>      v=0;
>>>>>      if(i<=0)goto L1;
>>>>>      L0:
>>>>>      v=v*i;
>>>>>      i=i-1;
>>>>>      if(i>0)goto L0;
>>>>>      L1:
>>>>>      return v;
>>>>>    }
>>>>>
>>>>> ...
>>>>>
>>>>
>>>> I have considered to remove loops and keep only goto.
>>>> But I think this is not bring too much simplification.
>>>>
>>>
>>> It depends.
>>>
>>> If the compiler works like an actual C compiler, with a full parser 
>>> and AST stage, yeah, it may not save much.
>>>
>>>
>>> If the parser is a thin wrapper over 3AC operations (only allowing 
>>> statements that map 1:1 with a 3AC IR operation), it may save a bit 
>>> more...
>>>
>>>
>>>
>>> As for whether or not it makes sense to use a C like syntax here, 
>>> this is more up for debate (for practical use within a compiler, I 
>>> would assume a binary serialization rather than an ASCII syntax, 
>>> though ASCII may be better in terms of inter-operation or human 
>>> readability).
>>>
>>>
>>> But, as can be noted, I would assume a binary serialization that is 
>>> oriented around operators; and *not* about serializing the structures 
>>> used to implement those operators. Also I would assume that the IR 
>>> need not be in SSA form (conversion to full SSA could be done when 
>>> reading in the IR operations).
>>>
>>>
>>> My argument is that not using SSA form means fewer issues for both 
>>> the serialization format and compiler front-end to need to deal with 
>>> (and is comparably easy to regenerate for the backend, with the 
>>> backend operating with its internal IR in SSA form).
>>>
>>> Well, contrast to LLVM assuming everything is always in SSA form.
>>>
>>> ...
>>>
>>>
>>
>> I also have considered split expressions.
>>
>> For instance
>>
>> if (a*b+c) {}
>>
>> into
>>
>> register int r1 = a * b;
>> register int r2 = r1 + c;
>> if (r2) {}
>>
>> This would make easier to add overflow checks in runtime (if desired) 
>> and implement things like _complex
>>
>> Is this what you mean by 3AC or SSA?
>>
>> This would definitely simplify expressions grammar.
>>
>>
> 
> I also have consider remove local scopes. But I think local scopes may 
> be useful to better use stack reusing the same addresses when variables 
> goes out of scope.
> For instance
> 
> {
>   int i =1;
>   {
>    int a  = 2;
>   }
>   {
>    int b  = 3;
>   }
> }
> I think scope makes easier to use the same stack position of a and b 
> because it is easier to see a does not exist any more.
> 

Also remove structs, replacing them with unsigned char[] and casting parts 
of the array to access members.

I think this is the lowest level possible in C.
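
A minimal sketch of what that could look like, assuming a hypothetical struct point and accessor names that are not from any real tool. It uses memcpy instead of a pointer cast for member access, since a plain unsigned char array has no alignment guarantee beyond 1 and a direct cast to int* or double* could be misaligned or violate aliasing rules.

```c
#include <stddef.h>
#include <string.h>

/* The struct as the front end sees it; used here only to derive
   the size and the member offsets. */
struct point { int x; double y; };

/* Lowered form: the object is just a block of bytes. */
typedef unsigned char point_bytes[sizeof(struct point)];

/* Member access becomes "base address + offset". */
void set_x(unsigned char *p, int v)
{
    memcpy(p + offsetof(struct point, x), &v, sizeof v);
}

int get_x(const unsigned char *p)
{
    int v;
    memcpy(&v, p + offsetof(struct point, x), sizeof v);
    return v;
}

void set_y(unsigned char *p, double v)
{
    memcpy(p + offsetof(struct point, y), &v, sizeof v);
}

double get_y(const unsigned char *p)
{
    double v;
    memcpy(&v, p + offsetof(struct point, y), sizeof v);
    return v;
}
```

In generated code the offsets would presumably be emitted as plain integer constants, so the struct declaration itself would not need to appear in the output.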






#389705

From: bart <bc@freeuk.com>
Date: 2024-12-17 18:37 +0000
Message-ID: <vjsgdr$1rrvs$1@dont-email.me>
In reply to: #389704
On 17/12/2024 18:16, Thiago Adams wrote:

> 
> also remove structs changing by unsigned char [] and cast parts of it to 
> access members.
> 
> I think this the lower level possible in c.

This is what I do in my IL, where structs are just fixed blocks of so 
many bytes.

But there are some things to consider:

* A struct may still need alignment corresponding to the strictest 
alignment among the members. (Any padding between members and at the end 
should already be taken care of.)

I use an alignment based on overall size, so a 40-byte struct is assumed 
to have a 64-bit max alignment, even though it may only need 16-bit 
alignment. That is harmless, but it can be fixed with some extra metadata.

With a C char[], you can choose to use a short[] array for example 
(obviously of half the length) to signal that it needs 16-bit alignment.


* Some machine ABIs, like SYS V for 64 bits, may need to know the 
internal layout of structs when they are passed 'by value'.

If reduced down to char[], this info will be missing.

I ignore this because I only target Win64 ABI. It only comes up in SYS 
V, when calling functions across an FFI, and when the API uses value 
structs, which is uncommon. And also because I can't make head or tail of 
the rules.
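
A small sketch of the alignment point, assuming a hypothetical 40-byte record whose strictest member is a short (the type and function names are made up for illustration). It uses C11's _Alignof only to observe the difference; the trick of picking short[] over char[] works in C89 as well.

```c
#include <stddef.h>

/* Hypothetical struct whose strictest member alignment is that of short;
   40 bytes on typical ABIs. */
struct rec { short tag; char name[38]; };

/* Two ways to lower it to a plain block of storage. */
typedef unsigned char rec_as_bytes[40];   /* alignment of unsigned char */
typedef short         rec_as_shorts[20];  /* alignment of short */

size_t align_bytes(void)  { return _Alignof(rec_as_bytes); }
size_t align_shorts(void) { return _Alignof(rec_as_shorts); }
size_t align_struct(void) { return _Alignof(struct rec); }
```

An array type has the alignment of its element type, so the element type chosen for the blob is what carries the alignment requirement.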



#389708

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2024-12-17 16:07 -0300
Message-ID: <vjsi61$1rlkq$2@dont-email.me>
In reply to: #389705
On 12/17/2024 3:37 PM, bart wrote:
> On 17/12/2024 18:16, Thiago Adams wrote:
> 
>>
>> also remove structs changing by unsigned char [] and cast parts of it 
>> to access members.
>>
>> I think this the lower level possible in c.
> 
> This is what I do in my IL, where structs are just fixed blocks of so 
> many bytes.
> 

How do you deal with struct parameters?



#389712

From: bart <bc@freeuk.com>
Date: 2024-12-17 19:42 +0000
Message-ID: <vjsk7q$1rrvs$2@dont-email.me>
In reply to: #389708
On 17/12/2024 19:07, Thiago Adams wrote:
> On 12/17/2024 3:37 PM, bart wrote:
>> On 17/12/2024 18:16, Thiago Adams wrote:
>>
>>>
>>> also remove structs changing by unsigned char [] and cast parts of it 
>>> to access members.
>>>
>>> I think this the lower level possible in c.
>>
>> This is what I do in my IL, where structs are just fixed blocks of so 
>> many bytes.
>>
> 
> How do you do with struct parameters?
> 


In the IL they are always passed notionally by value. This side of the 
IL (that is, the frontend compiler that generates IL), knows nothing 
about the target, such as ABI details.

(In practice, some things are known, like the word size of the target, 
since that can change characteristics of the source language, like the 
size of 'int' or of 'void*'. It also needs to assume, or request from 
the backend, argument evaluation order, although my IL can reverse order 
if necessary.)

It is the backend, on the other side of the IL, that needs to deal with 
those details.

That can include making copies of structs that the ABI says are passed 
by value. But when targeting SYS V ABI (which I haven't attempted yet), 
it may need to know the internal layout of a struct.

You can however do experiments with using SYS V on Linux (must be 64 bits):

* Create test structs with, say, int32 or int64 elements

* Write a test function where such a struct is passed by value, and
   then return a modified copy

* Rerun the test using a version of the function where a char[] version 
of the struct is passed and returned, and which contains the member 
access casts you suggested

* See if it gives the same results.

You might need a union of the two structs, or use memcpy to transfer 
contents, before and after calling the test function.
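
A sketch of such an experiment, under a few assumptions: the byte array is wrapped in a struct (C cannot pass a bare array by value), memcpy is used for the transfers, and all names are hypothetical. Built as a single program, both versions should agree. The interesting case is when caller and callee are compiled against different prototypes, or when the members are floating point: as far as I understand the x86-64 SYS V rules, a struct of doubles is classified for SSE registers while a struct holding only a byte array is classified as integer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Typed version: two 64-bit members. */
struct pair { int64_t a; int64_t b; };

/* Untyped version: same size, layout information erased. */
struct pair_blob { unsigned char bytes[sizeof(struct pair)]; };

/* Pass by value, return a modified copy. */
struct pair bump_typed(struct pair p)
{
    p.a += 1;
    p.b *= 2;
    return p;
}

/* Same operation on the blob, with member access done by offset. */
struct pair_blob bump_blob(struct pair_blob p)
{
    int64_t a, b;
    memcpy(&a, p.bytes + offsetof(struct pair, a), sizeof a);
    memcpy(&b, p.bytes + offsetof(struct pair, b), sizeof b);
    a += 1;
    b *= 2;
    memcpy(p.bytes + offsetof(struct pair, a), &a, sizeof a);
    memcpy(p.bytes + offsetof(struct pair, b), &b, sizeof b);
    return p;
}

/* Run both versions and report whether the resulting bytes match. */
int results_match(int64_t a, int64_t b)
{
    struct pair in, out;
    struct pair_blob bin, bout;
    in.a = a;
    in.b = b;
    memcpy(bin.bytes, &in, sizeof in);
    out = bump_typed(in);
    bout = bump_blob(bin);
    return memcmp(&out, bout.bytes, sizeof out) == 0;
}
```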



#389726

From: BGB <cr88192@gmail.com>
Date: 2024-12-18 12:51 -0600
Message-ID: <vjv5ir$2ds8r$2@dont-email.me>
In reply to: #389712
On 12/17/2024 1:42 PM, bart wrote:
> On 17/12/2024 19:07, Thiago Adams wrote:
>> On 12/17/2024 3:37 PM, bart wrote:
>>> On 17/12/2024 18:16, Thiago Adams wrote:
>>>
>>>>
>>>> also remove structs changing by unsigned char [] and cast parts of 
>>>> it to access members.
>>>>
>>>> I think this the lower level possible in c.
>>>
>>> This is what I do in my IL, where structs are just fixed blocks of so 
>>> many bytes.
>>>
>>
>> How do you do with struct parameters?
>>
> 
> 
> In the IL they are always passed notionally by value. This side of the 
> IL (that is, the frontend compiler that generates IL), knows nothing 
> about the target, such as ABI details.
> 
> (In practice, some things are known, like the word size of the target, 
> since that can change characteristics of the source language, like the 
> size of 'int' or of 'void*'. It also needs to assume, or request from 
> the backend, argument evaluation order, although my IL can reverse order 
> if necessary.)
> 
> It is the backend, on the other side of the IL, that needs to deal with 
> those details.
> 
> That can include making copies of structs that the ABI says are passed 
> by value. But when targeting SYS V ABI (which I haven't attempted yet), 
> it may need to know the internal layout of a struct.
> 
> You can however do experiments with using SYS V on Linux (must be 64 bits):
> 
> * Create test structs with, say, int32 or int64 elements
> 
> * Write a test function where such a struct is passed by value, and
>    then return a modified copy
> 
> * Rerun the test using a version of the function where a char[] version 
> of the struct is passed and returned, and which contains the member 
> access casts you suggested
> 
> * See if it gives the same results.
> 
> You might need a union of the two structs, or use memcpy to transfer 
> contents, before and after calling the test function.


I took a different approach:
In the backend IR stage, structs are essentially treated as references 
to the structure.

A local structure may be "initialized" via an IR operation, at which 
point it will be assigned storage in the stack frame, and the reference 
will be initialized to the storage area for the structure.

Most operations will pass them by reference.

Assigning a struct will essentially be turned into a struct-copy 
operation (using the same mechanism as inline memcpy).
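
Expressed back in C, the idea might look roughly like the sketch below. This is only an illustration with made-up names, not the actual IR: each struct local becomes a pointer to storage reserved in the frame, and assignment becomes a byte copy.

```c
#include <string.h>

struct vec3 { int x, y, z; };

/* Source-level view: struct locals and a struct assignment. */
int source_form(void)
{
    struct vec3 a = { 1, 2, 3 }, b;
    b = a;
    return b.y;
}

/* Hypothetical lowered view: the locals are references to storage
   reserved in the frame; assignment is a struct-copy operation. */
int lowered_form(void)
{
    struct vec3 frame[2];        /* storage reserved in the stack frame */
    struct vec3 *a = &frame[0];  /* "initialize" a */
    struct vec3 *b = &frame[1];  /* "initialize" b */
    a->x = 1;
    a->y = 2;
    a->z = 3;
    memcpy(b, a, sizeof *b);     /* b = a */
    return b->y;
}
```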


Type model could be seen as multiple levels:
   I: integer types of 'int' and smaller;
   L: integer types of 64 bits or less that are not I.
   D: 'double' and smaller floating-point types.
   A: Address (pointers, arrays, structs, ...)
   X: 128-bit types.
     int128, 'long double', SIMD vectors, ...

I:
   char, signed char, unsigned char
   short, unsigned short
   int, unsigned int
   _Bool, wchar_t, ...
L:
   long, long long, unsigned long, unsigned long long
   64-bit SIMD vectors
   variant (sorta)
D: double, float, short float
A:
   pointers
   arrays
   structs
   class instances
   ...
X:
   grab bag of pretty much everything that is 128 bits.

The toplevel types all basically have similar storage and behavior, so 
in many cases one can rely on this rather than the actual type.



...



#389728

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2024-12-18 16:43 -0300
Message-ID: <vjv8lv$2edrv$1@dont-email.me>
In reply to: #389726
On 12/18/2024 3:51 PM, BGB wrote:
> 
> I took a different approach:
> In the backend IR stage, structs are essentially treated as references 
> to the structure.
> 
> A local structure may be "initialized" via an IR operation, at which 
> point it will be assigned storage in the stack frame, and the reference 
> will be initialized to the storage area for the structure.
> 
> Most operations will pass them by reference.
> 
> Assigning a struct will essentially be turned into a struct-copy 
> operation (using the same mechanism as inline memcpy).

But what happens when calling an external C function that has a struct X 
as a parameter (not a pointer to struct)?



#389731

From: BGB <cr88192@gmail.com>
Date: 2024-12-18 18:27 -0600
Message-ID: <vjvp9g$2h6ck$1@dont-email.me>
In reply to: #389728
On 12/18/2024 1:43 PM, Thiago Adams wrote:
> On 12/18/2024 3:51 PM, BGB wrote:
>>
>> I took a different approach:
>> In the backend IR stage, structs are essentially treated as references 
>> to the structure.
>>
>> A local structure may be "initialized" via an IR operation, in which 
>> point it will be assigned storage in the stack frame, and the 
>> reference will be initialized to the storage area for the structure.
>>
>> Most operations will pass them by reference.
>>
>> Assigning a struct will essentially be turned into a struct-copy 
>> operation (using the same mechanism as inline memcpy).
> 
> But what happens with calling a external C function that has a struct X 
> as parameter? (not pointer to struct)


In my ABI, if larger than 16 bytes, it is passed by reference (as a 
pointer in a register or on the stack), callee is responsible for 
copying it somewhere else if needed.

For struct return, a pointer to return the struct into is provided by 
the caller, and the callee copies the returned struct into this address.

If the caller ignores the return value, the caller provides a dummy 
buffer for the return value.

If no prototype is provided... well, most likely the program crashes or 
similar.

So, in effect, the by-value semantics are mostly faked by the compiler.
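
As a rough illustration in C (names hypothetical, and the placement of the callee-side copy is just one way to do it), a by-value call on a struct larger than 16 bytes could be rewritten along these lines:

```c
#include <string.h>

struct big { long long v[4]; };   /* 32 bytes: larger than 16 */

/* What the programmer writes. */
struct big scale(struct big s, long long k)
{
    int i;
    for (i = 0; i < 4; i++)
        s.v[i] *= k;
    return s;
}

/* Hypothetical lowered form: hidden return pointer, argument by
   reference. The callee copies before modifying, which is what
   preserves the by-value semantics. */
void scale_lowered(struct big *ret, const struct big *s, long long k)
{
    struct big tmp;
    int i;
    memcpy(&tmp, s, sizeof tmp);       /* callee-side copy */
    for (i = 0; i < 4; i++)
        tmp.v[i] *= k;
    memcpy(ret, &tmp, sizeof *ret);    /* copy into caller's buffer */
}

/* Caller side: `r = scale(a, k)` becomes a call with two pointers. */
long long call_lowered(long long k)
{
    struct big a = { { 1, 2, 3, 4 } }, r;
    scale_lowered(&r, &a, k);
    return r.v[3] + a.v[3];            /* a is unchanged by the call */
}
```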


It is roughly similar to the handling of C array types, which in this 
case are also seen as a combination of a hidden pointer to the data, and 
the backing data (the array's contents). The code-generator mostly 
operates in terms of this hidden pointer.


By-value structs of 16 bytes or less are passed as if they were a 64 
or 128 bit integer type (as a single register or as a register pair, 
with a layout matching their in-memory representation).

...


But, yeah, at the IL level, one could potentially eliminate structs and 
arrays as a separate construct, and instead have bare pointers and a 
generic "reserve a blob of bytes in the frame and initialize this 
pointer to point to it" operator (with the business end of this operator 
happening in the function prolog).

...



#389734

From: bart <bc@freeuk.com>
Date: 2024-12-19 00:35 +0000
Message-ID: <vjvpos$2gsil$3@dont-email.me>
In reply to: #389731
On 19/12/2024 00:27, BGB wrote:
> On 12/18/2024 1:43 PM, Thiago Adams wrote:
>> On 12/18/2024 3:51 PM, BGB wrote:
>>>
>>> I took a different approach:
>>> In the backend IR stage, structs are essentially treated as 
>>> references to the structure.
>>>
>>> A local structure may be "initialized" via an IR operation, in which 
>>> point it will be assigned storage in the stack frame, and the 
>>> reference will be initialized to the storage area for the structure.
>>>
>>> Most operations will pass them by reference.
>>>
>>> Assigning a struct will essentially be turned into a struct-copy 
>>> operation (using the same mechanism as inline memcpy).
>>
>> But what happens with calling a external C function that has a struct 
>> X as parameter? (not pointer to struct)
> 
> 
> In my ABI, if larger than 16 bytes, it is passed by reference (as a 
> pointer in a register or on the stack), callee is responsible for 
> copying it somewhere else if needed.
> 
> For struct return, a pointer to return the struct into is provided by 
> the caller, and the callee copies the returned struct into this address.
> 
> If the caller ignores the return value, the caller provides a dummy 
> buffer for the return value.
> 
> If no prototype is provided... well, most likely the program crashes or 
> similar.
> 
> So, in effect, the by-value semantics are mostly faked by the compiler.
> 
> 
> It is roughly similar to the handling of C array types, which in this 
> case are also seen as a combination of a hidden pointer to the data, and 
> the backing data (the array's contents). The code-generator mostly 
> operates in terms of this hidden pointer.
> 
> 
> By-Value Structs smaller than 16 bytes are passed as-if they were a 64 
> or 128 bit integer type (as a single register or as a register pair, 
> with a layout matching their in-memory representation).
> 
> ...
> 
> 
> But, yeah, at the IL level, one could potentially eliminate structs and 
> arrays as a separate construct, and instead have bare pointers and a 
> generic "reserve a blob of bytes in the frame and initialize this 
> pointer to point to it" operator (with the business end of this operator 
> happening in the function prolog).

The problem with this, that I mentioned elsewhere, is how well it would 
work with SYS V ABI, since the rules for structs are complex, and 
apparently recursive.

Having just a block of bytes might not be enough.



#389738

From: BGB <cr88192@gmail.com>
Date: 2024-12-18 23:46 -0600
Message-ID: <vk0bvf$2nn4a$1@dont-email.me>
In reply to: #389734
On 12/18/2024 6:35 PM, bart wrote:
> On 19/12/2024 00:27, BGB wrote:
>> On 12/18/2024 1:43 PM, Thiago Adams wrote:
>>> On 12/18/2024 3:51 PM, BGB wrote:
>>>>
>>>> I took a different approach:
>>>> In the backend IR stage, structs are essentially treated as 
>>>> references to the structure.
>>>>
>>>> A local structure may be "initialized" via an IR operation, in which 
>>>> point it will be assigned storage in the stack frame, and the 
>>>> reference will be initialized to the storage area for the structure.
>>>>
>>>> Most operations will pass them by reference.
>>>>
>>>> Assigning a struct will essentially be turned into a struct-copy 
>>>> operation (using the same mechanism as inline memcpy).
>>>
>>> But what happens with calling a external C function that has a struct 
>>> X as parameter? (not pointer to struct)
>>
>>
>> In my ABI, if larger than 16 bytes, it is passed by reference (as a 
>> pointer in a register or on the stack), callee is responsible for 
>> copying it somewhere else if needed.
>>
>> For struct return, a pointer to return the struct into is provided by 
>> the caller, and the callee copies the returned struct into this address.
>>
>> If the caller ignores the return value, the caller provides a dummy 
>> buffer for the return value.
>>
>> If no prototype is provided... well, most likely the program crashes 
>> or similar.
>>
>> So, in effect, the by-value semantics are mostly faked by the compiler.
>>
>>
>> It is roughly similar to the handling of C array types, which in this 
>> case are also seen as a combination of a hidden pointer to the data, 
>> and the backing data (the array's contents). The code-generator mostly 
>> operates in terms of this hidden pointer.
>>
>>
>> By-Value Structs smaller than 16 bytes are passed as-if they were a 64 
>> or 128 bit integer type (as a single register or as a register pair, 
>> with a layout matching their in-memory representation).
>>
>> ...
>>
>>
>> But, yeah, at the IL level, one could potentially eliminate structs 
>> and arrays as a separate construct, and instead have bare pointers and 
>> a generic "reserve a blob of bytes in the frame and initialize this 
>> pointer to point to it" operator (with the business end of this 
>> operator happening in the function prolog).
> 
> The problem with this, that I mentioned elsewhere, is how well it would 
> work with SYS V ABI, since the rules for structs are complex, and 
> apparently recursive.
> 
> Having just a block of bytes might not be enough.

In my case, I am not bothering with the SysV style ABIs (well, along 
with there not being any x86 or x86-64 target...).


For my ISA, it is a custom ABI, but follows mostly similar rules to some 
of the other "Microsoft style" ABIs (where, I have noted that across 
multiple targets, MS tools have tended to use similar ABI designs).

For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
Argument passing is basically similar, but struct pass/return is 
different; and it passes floating-point values in GPRs (and, in my own 
ISA, all floating-point values use GPRs, as there are no FPU registers; 
though FPU registers do exist for RISC-V).

Not likely a huge issue as one is unlikely to use ELF and PE/COFF in the 
same program.


For the "OS" that runs on my CPU core, it is natively using PE/COFF, but 
ELF is supported for RISC-V (currently PIE only). It generally needs to 
use my own C library as I still haven't gotten glibc or musl libc to 
work on it (and they work in a different way from my own C library).

Seemingly, something is going terribly wrong in the "dynamic linking" 
process, but it is too hard to figure out in the absence of any real 
debugging interface (what debug mechanisms I have effectively lack any 
symbols for things inside "ld-linux.so"'s domain).

Theoretically, getting that working could make porting usermode software 
easier, as then I could compile stuff as-if it were running on an RV64 
port of Linux.

But, easier said than done.

...



#389739

From: bart <bc@freeuk.com>
Date: 2024-12-19 11:27 +0000
Message-ID: <vk0vu7$2qr2c$1@dont-email.me>
In reply to: #389738
On 19/12/2024 05:46, BGB wrote:
> On 12/18/2024 6:35 PM, bart wrote:
>> On 19/12/2024 00:27, BGB wrote:

>>> By-Value Structs smaller than 16 bytes are passed as-if they were a 
>>> 64 or 128 bit integer type (as a single register or as a register 
>>> pair, with a layout matching their in-memory representation).
>>>
>>> ...
>>>
>>>
>>> But, yeah, at the IL level, one could potentially eliminate structs 
>>> and arrays as a separate construct, and instead have bare pointers 
>>> and a generic "reserve a blob of bytes in the frame and initialize 
>>> this pointer to point to it" operator (with the business end of this 
>>> operator happening in the function prolog).
>>
>> The problem with this, that I mentioned elsewhere, is how well it 
>> would work with SYS V ABI, since the rules for structs are complex, 
>> and apparently recursive.
>>
>> Having just a block of bytes might not be enough.
> 
> In my case, I am not bothering with the SysV style ABI's (well, along 
> with there not being any x86 or x86-64 target...).

I'd imagine it's worse with ARM targets as there are so many more 
registers to try and deconstruct structs into.

> 
> For my ISA, it is a custom ABI, but follows mostly similar rules to some 
> of the other "Microsoft style" ABIs (where, I have noted that across 
> multiple targets, MS tools have tended to use similar ABI designs).

When you do your own thing, it's easy.

In the 1980s, I didn't need to worry about call conventions used for 
other software, since there /was/ no other software! I had to write 
everything, save for the odd calls to DOS which used some form of SYSCALL.

Then, arrays and structs were actually passed and returned by value (not 
via hidden references), by copying the data to and from the stack.

However, I don't recall ever using the feature, as I considered it 
inefficient. I always used explicit references in my code.

> For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
> Argument passing is basically similar, but struct pass/return is 
> different; and it passes floating-point values in GPRs (and, in my own 
> ISA, all floating-point values use GPRs, as there are no FPU registers; 
> though FPU registers do exist for RISC-V).

Supporting C's variadic functions, which is needed for many languages 
when calling C across an FFI, usually requires different rules. The Win64 
ABI, for example, passes low variadic arguments in both GPRs and FPU 
registers.

/Implementing/ variadic functions (which only occurs if implementing C) 
is another headache if it has to work with the ABI (which can be assumed 
for a non-static function).

I barely have a working solution for Win64 ABI, which needs to be done 
via stdarg.h, but wouldn't have a clue how to do it for SYS V.

(Even Win64 has problems, as it assumes a downward-growing stack; in my 
IL interpreter, the stack grows upwards!)
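
For reference, this is the kind of function the ABI has to support, written in standard C with stdarg.h. It is only a sketch (the function name and the format-string convention are made up): the callee pulls ints and doubles out of the argument list in whatever order the caller supplied them, without a prototype telling it which register class each one arrived in.

```c
#include <stdarg.h>

/* Sums its variadic arguments. `kinds` has one character per
   argument: 'i' for int, 'd' for double. */
double sum_mixed(const char *kinds, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, kinds);
    for (; *kinds != '\0'; kinds++) {
        if (*kinds == 'i')
            total += va_arg(ap, int);
        else if (*kinds == 'd')
            total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

A call such as sum_mixed("idi", 1, 2.5, 3) mixes integer and floating-point arguments in the first few positions, which is exactly the case where the register assignment rules matter.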

> Not likely a huge issue as one is unlikely to use ELF and PE/COFF in the 
> same program.
> 
> 
> For the "OS" that runs on my CPU core, it is natively using PE/COFF, but 

That's interesting: you deliberately used one of the most complex file 
formats around, when you could have devised your own?

I did exactly that at a period when my generated DLLs were buggy for 
some reason (it turned out to be two reasons). I created a simple 
dynamic library format of my own. Then I found the same format worked 
also for executables.

But I needed a loader program to run them, as Windows obviously didn't 
understand the format. Such a program can be written in 800 lines of C, 
and can dynamically load libraries in both my format, and proper DLLs (not 
the buggy ones I generated!).

A hello-world program is under 300 bytes, compared with 2 or 2.5KB of 
EXE. And the format is portable to Linux, so no need to generate ELF 
(but I haven't tried). Plus the format might be transparent to AV 
software (haven't tried that either).



#389747

From: BGB <cr88192@gmail.com>
Date: 2024-12-19 14:36 -0600
Message-ID: <vk204q$310pu$1@dont-email.me>
In reply to: #389739
On 12/19/2024 5:27 AM, bart wrote:
> On 19/12/2024 05:46, BGB wrote:
>> On 12/18/2024 6:35 PM, bart wrote:
>>> On 19/12/2024 00:27, BGB wrote:
> 
>>>> By-Value Structs smaller than 16 bytes are passed as-if they were a 
>>>> 64 or 128 bit integer type (as a single register or as a register 
>>>> pair, with a layout matching their in-memory representation).
>>>>
>>>> ...
>>>>
>>>>
>>>> But, yeah, at the IL level, one could potentially eliminate structs 
>>>> and arrays as a separate construct, and instead have bare pointers 
>>>> and a generic "reserve a blob of bytes in the frame and initialize 
>>>> this pointer to point to it" operator (with the business end of this 
>>>> operator happening in the function prolog).
>>>
>>> The problem with this, that I mentioned elsewhere, is how well it 
>>> would work with SYS V ABI, since the rules for structs are complex, 
>>> and apparently recursive.
>>>
>>> Having just a block of bytes might not be enough.
>>
>> In my case, I am not bothering with the SysV style ABI's (well, along 
>> with there not being any x86 or x86-64 target...).
> 
> I'd imagine it's worse with ARM targets as there are so many more 
> registers to try and deconstruct structs into.
> 

Not messed much with the ARM64 ABI or similar, but I will draw the line 
in the sand somewhere.

Struct passing/return is enough of an edge case that one can just sort 
of declare it "no go" between compilers with "mostly but not strictly 
compatible" ABIs.


>>
>> For my ISA, it is a custom ABI, but follows mostly similar rules to 
>> some of the other "Microsoft style" ABIs (where, I have noted that 
>> across multiple targets, MS tools have tended to use similar ABI 
>> designs).
> 
> When you do your own thing, it's easy.
> 
> In the 1980s, I didn't need to worry about call conventions used for 
> other software, since there /was/ no other software! I had to write 
> everything, save for the odd calls to DOS which used some form of SYSCALL.
> 
> Then, arrays and structs were actually passed and returned by value (not 
> via hidden references), by copying the data to and from the stack.
> 
> However, I don't recall ever using the feature, as I considered it 
> inefficient. I always used explicit references in my code.
> 

Most of the time, one is passing/returning structures as pointers, and 
not by value.

By value structures are usually small.


When a structure is not small, it is both simpler to implement, and 
usually faster, to internally pass it by reference.

If you pass a large structure to a function by value, via an on-stack 
copy, and the function assigns it to another location (say, a global 
variable):
   Pass by reference: Only a single copy operation is needed;
   Pass by value on-stack: At least two copy operations are needed.

One also needs to reserve enough space in the function arguments list to 
hold any structures passed, which could be bad if they are potentially 
large.
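
A small sketch of the two cases (struct and function names are hypothetical; the copy counts in the comments describe the on-stack by-value scheme discussed here, not what any particular optimizing compiler emits):

```c
struct large { char data[256]; };

struct large g_slot;    /* destination, e.g. a global variable */

/* Pass by value on-stack: the caller copies the struct into the
   argument area (copy 1), then the assignment copies it again (copy 2). */
void store_by_value(struct large s)
{
    g_slot = s;
}

/* Pass by reference: only the assignment copies. */
void store_by_ref(const struct large *s)
{
    g_slot = *s;
}
```

Both leave the same bytes in g_slot; the difference is only in how many times the 256 bytes get moved along the way.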



But, on my ISA, ABI is sort of like:
   R4 ..R7 : Arg0 ..Arg3
   R20..R23: Arg4 ..Arg7
   R36..R39: Arg8 ..Arg11 (optional)
   R52..R55: Arg12..Arg15 (optional)
Return Value:
   R2, R3:R2 (128 bit)
   R2 is also used to pass in the return value pointer.

'this':
   Generally passed in either R3 or R18, depending on ABI variant.

Where, callee-save:
   R8 ..R14,  R24..R31,
   R40..R47,  R56..R63
   R15=SP

Non-saved scratch:
   R2 ..R7 ,  R16..R23,
   R32..R39,  R48..R55


Arguments beyond the first 8/16 register arguments are passed on stack. 
In this case, a spill space for the first 8/16 arguments (64 or 128 
bytes) is provided on stack before the first non-register argument.

If the function accepts a fixed number of arguments and the number of 
argument registers is 8 or less, spill space need only be provided for 
the first 8 arguments (calling vararg functions will always reserve 
space for 16 registers in the 16-register ABI). This spill space 
effectively belongs to the callee rather than the caller.


Structures (by value):
   1.. 8 bytes: Passed in a single register
   9..16 bytes: Passed in a pair, padded to the next even pair
   17+: Pass as a reference.

Things like 128-bit types are also passed/returned in register pairs.
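
The size rule above, written out as a small C function (the enum and function names are made up for illustration; only the thresholds come from the table):

```c
#include <stddef.h>

enum pass_kind {
    PASS_ONE_REG,    /* 1..8 bytes: a single register */
    PASS_REG_PAIR,   /* 9..16 bytes: an even-aligned register pair */
    PASS_BY_REF      /* 17+ bytes: passed as a reference */
};

/* Classify a by-value struct argument by its size in bytes. */
enum pass_kind classify_struct(size_t size)
{
    if (size <= 8)
        return PASS_ONE_REG;
    if (size <= 16)
        return PASS_REG_PAIR;
    return PASS_BY_REF;
}
```

Note that only the size is consulted, not the member types, which is what keeps this so much simpler than the SYS V classification.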



Contrast, RV ABI:
   X10..X17 are used for arguments;
   No spill space is provided;
   ...

My variant uses similar rules to my own ABI for passing/returning 
structures, with:
   X28, structure return pointer
   X29, 'this'
Normal return values go into X10 or X11:X10.



Note that in both ABIs, passing 'this' in a register would mean that 
class instances and COM objects are not equivalent (COM object methods 
always pass 'this' as the first argument).

The 'this' register is implicitly also used by lambdas to pass in the 
pointer to the captured bindings area (which mostly resembles a 
structure containing each variable captured by the lambda).

Can note though that in this case, capturing a binding by reference 
means the lambda is limited to automatic lifetime (non-automatic lambdas 
may only capture by value). In this case, capture by value is the default.
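
In plain C terms, a by-value capturing lambda can be modeled roughly as below. This is a sketch with hypothetical names: here the bindings pointer is an ordinary first argument, whereas in the ABI described above it would travel in the 'this' register.

```c
/* Captured bindings area: one field per captured variable. */
struct adder_env { int base; };

/* Lambda body: env plays the role of 'this'. */
int adder_body(const struct adder_env *env, int x)
{
    return env->base + x;
}

/* A closure value: code pointer plus its captured bindings. */
struct adder {
    int (*fn)(const struct adder_env *, int);
    struct adder_env env;
};

/* Creating the lambda copies the captured variable (capture by value),
   so the closure can safely outlive the scope that created it. */
struct adder make_adder(int base)
{
    struct adder a;
    a.fn = adder_body;
    a.env.base = base;
    return a;
}

/* Calling the lambda passes the bindings pointer along with the
   visible arguments. */
int call_adder(const struct adder *a, int x)
{
    return a->fn(&a->env, x);
}
```

A by-reference capture would store a pointer to the original variable in the bindings struct instead, which is why such a lambda cannot outlive the enclosing frame.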


>> For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
>> Argument passing is basically similar, but struct pass/return is 
>> different; and it passes floating-point values in GPRs (and, in my own 
>> ISA, all floating-point values use GPRs, as there are no FPU 
>> registers; though FPU registers do exist for RISC-V).
> 
> Supporting C's variadic functions, which is needed for many languages 
> when calling C across an FFI, usually requires different rules. On Win64 
> ABI for example, by passing low variadic arguments in both GPRs and FPU 
> registers.
> 

I simplified things by assuming only GPRs are used.


> /Implementing/ variadic functions (which only occurs if implementing C) 
> is another headache if it has to work with the ABI (which can be assumed 
> for a non-static function).
> 
> I barely have a working solution for Win64 ABI, which needs to be done 
> via stdarg.h, but wouldn't have a clue how to do it for SYS V.
> 
> (Even Win64 has problems, as it assumes a downward-growing stack; in my 
> IL interpreter, the stack grows upwards!)
> 

Most targets use a downward growing stack.
Mine is no exception here...


>> Not likely a huge issue as one is unlikely to use ELF and PE/COFF in 
>> the same program.
>>
>>
>> For the "OS" that runs on my CPU core, it is natively using PE/COFF, but 
> 
> That's interesting: you deliberately used one of the most complex file 
> formats around, when you could have devised your own?
> 

For what I wanted, I would have mostly needed to recreate most of the 
same functionality as PE/COFF anyways.


When one considers the entire loading process (including DLLs/SOs), then 
PE/COFF loading is actually simpler than ELF loading (ELF subjects the 
loader to needing to deal with symbol and relocation tables), similar to 
PIE loading.


Things like the MZ stub are optional in my case, and mostly ignored if 
present (in my LZ compressed PE variants, the MZ stub is omitted entirely).


I had at one point considered doing a custom format resembling LZ 
compressed MachO, but ended up not bothering, as it wouldn't have really 
saved anything over LZ compressed PE/COFF.


Some "unneeded cruft" like the Resource Section was discarded, mostly 
replaced by an embedded WAD2 image. The header was modified some to 
allow for backwards compatibility with the Windows format (mostly 
creating a dummy header in the original format that points to the WAD2 
directory).


Idea is that icons, bitmaps, and other things, would mostly be held in 
WAD lumps. Though, resources which may be accessed via symbols in the 
EXE/DLL need to be stored uncompressed (where "__rsrc_lumpname" may be 
used to access the contents of resource-section lumps as an extern symbol).

Say, for example:
   extern byte __rsrc_mybitmap[];  //resolves to a DIB/BMP or similar

For now, resource formats:
   Images:
     BMP (various settings)
       4, 8, and 16 bpp typical
       Supports a non-standard 16-bpp alpha-blended mode (*1).
       Supports non-standard 16 color and 256 color with transparent.
       Supports CRAM BMP as well (2 bpp)
     QOI (assumes RGBA32, nominally lossless)
       QOI is a semi-simplistic non-entropy-coded format.
       Can give PNG-like compression in some cases.
       Reasonably fast/cheap to decode.
     LCIF, custom lossy format, color-cell compression.
       OK Q/bpp but mostly only on the low-end.
       Resembles a QOI+CRAM hybrid.
     UPIC, lossy or lossless, JPEG-like (*2)

*1:
   0rrrrrgggggbbbbb  Normal/Opaque
   1rrrraggggabbbba  With 3 bit alpha (4b/ch RGB).
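
A minimal decoding sketch for the two layouts above. The bit positions follow the patterns as written; the ordering of the three alpha bits (the one next to red taken as most significant) and the expansion to 8 bits per channel are assumptions made for this example.

```c
#include <stdint.h>

struct rgba8 { uint8_t r, g, b, a; };

/* Decode one 16-bit pixel into 8-bit RGBA. */
struct rgba8 decode_px16(uint16_t px)
{
    struct rgba8 out;
    if (!(px & 0x8000)) {
        /* 0rrrrrgggggbbbbb: opaque RGB555 */
        unsigned r = (px >> 10) & 31, g = (px >> 5) & 31, b = px & 31;
        out.r = (uint8_t)((r << 3) | (r >> 2));
        out.g = (uint8_t)((g << 3) | (g >> 2));
        out.b = (uint8_t)((b << 3) | (b >> 2));
        out.a = 255;
    } else {
        /* 1rrrraggggabbbba: 4 bits per channel, 3 alpha bits interleaved */
        unsigned r = (px >> 11) & 15, g = (px >> 6) & 15, b = (px >> 1) & 15;
        unsigned a = (((px >> 10) & 1u) << 2) |
                     (((px >> 5) & 1u) << 1) |
                     (px & 1u);                 /* assumed bit order */
        out.r = (uint8_t)(r * 17);
        out.g = (uint8_t)(g * 17);
        out.b = (uint8_t)(b * 17);
        out.a = (uint8_t)((a * 255) / 7);
    }
    return out;
}
```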

For 16 and 256 color, a variant is supported with a transparent color.
Generally the high intensity magenta is reused as the transparent color. 
This is encoded in the color palette (if all colors apart from one have 
the alpha bits set to FF, and one color has 00, then that color is 
assumed to be a transparent color).
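
The palette check described above could be sketched like this. The struct layout follows the usual BMP palette byte order, with the fourth byte treated as alpha; the function name is hypothetical.

```c
#include <stdint.h>

/* One palette entry in BMP order; the 4th byte is used here as alpha. */
struct pal_entry { uint8_t b, g, r, a; };

/* Returns the index of the transparent color, or -1 if the palette
 * does not match "every alpha is FF except exactly one that is 00". */
int find_transparent_index(const struct pal_entry *pal, int count)
{
    int found = -1;
    for (int i = 0; i < count; i++) {
        if (pal[i].a == 0xFF)
            continue;
        if (pal[i].a != 0x00 || found >= 0)
            return -1;            /* odd alpha value, or a second 00 entry */
        found = i;
    }
    return found;
}
```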

CRAM 2 bpp: Uses a limited form of the 8-bit CRAM format:
   16 bits, 4x4 pixels, 1 bit per pixel
   2x 8 bits: Color Endpoints
The rest of the format being unsupported, so it can simply assume a 
fixed 32-bits per 4x4 pixel cell.
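
With a fixed 32 bits per cell, decoding reduces to a small loop. The sketch below assumes the 16-bit selector comes first in little-endian order, that bit i corresponds to pixel i in row-major order, and that a set bit selects the second endpoint; none of these details are stated above, so treat them as placeholders.

```c
#include <stdint.h>

/* cell: 16-bit selector mask followed by two 8-bit palette indices.
 * out:  16 palette indices for the 4x4 cell, row-major. */
void decode_cram_cell(const uint8_t cell[4], uint8_t out[16])
{
    unsigned mask = cell[0] | ((unsigned)cell[1] << 8);
    uint8_t c0 = cell[2], c1 = cell[3];

    for (int i = 0; i < 16; i++)
        out[i] = ((mask >> i) & 1u) ? c1 : c0;   /* 1 bit per pixel */
}
```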



*2: The UPIC format is structurally similar to JPEG, but:
   Uses TLV packaging (vs FF-escape tagging);
   Uses Rice coding (vs Huffman)
   Uses Z3.V5 VLC, vs Z4.V4
   Uses Block-Haar and RCT
     Vs DCT and YCbCr.
   Supports an alpha channel.
     Y    1       (*2A)
     YA   1:1     (*2A)
     YUV  4:2:0
     YUV  4:4:4   (*2A)
     YUVA 4:2:0:4
     YUVA 4:4:4:4 (*2A)
   *2A: May be used in the lossless modes, depending on image.


VLC coding resembles Deflate's match distance encoding, with sign-folded 
values. Runs of zero coefficients have a shorter limit, but similar. 
Like with JPEG, an 0x00 symbol encodes an early EOB.
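
Sign folding maps signed coefficients onto non-negative integers so that small magnitudes get small codes. The exact mapping used by UPIC is not given here; the sketch below uses the common interleaving (0, -1, 1, -2, 2, ...) purely as an assumed example.

```c
#include <stdint.h>

/* Fold a signed value to unsigned: 0->0, -1->1, 1->2, -2->3, 2->4, ... */
uint32_t sign_fold(int32_t v)
{
    if (v >= 0)
        return (uint32_t)v << 1;
    return ((uint32_t)(-(v + 1)) << 1) | 1u;
}

/* Inverse of sign_fold(). */
int32_t sign_unfold(uint32_t u)
{
    if (u & 1u)
        return -(int32_t)(u >> 1) - 1;
    return (int32_t)(u >> 1);
}
```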

In tests, on my main PC:
   Vs JPEG: It is a little faster
     Q/bpp is similar, better/worse depends on image.
       Slightly worse on photos, but "similar".
       Generally somewhat better on artificial images.
   Vs PNG:
     Faster to decode (with less memory overhead);
     Better compression on many images (particularly photo-like).

Note that UPIC was designed to not require any large intermediate 
buffers, so will decode directly to an RGB555 or RGBA32 output buffer 
(decoding happens in terms of individual 16x16 pixel macroblocks).

It was designed to be moderately fast and to try to minimize memory 
overhead for decoding (vs either PNG or JPEG, which need a more 
significant chunk of working memory to decode).


Block-Haar is a Haar transform made to fit the same 8x8 pixel blocks as 
DCT, where Haar maps (A,B)->(C,D):
   C=(A+B)/2  (*: X/2 here being defined as (X>>1))
   D=A-B
But, can be reversed exactly, IIRC:
   B=C-(D/2)
   A=B+D
By doing multiple stages of Haar transform, one can build an 8-pixel 
version, and then use horizontal and vertical transforms for an 8x8 
block. It is computationally fairly cheap, and lossless.
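
A small sketch of the pairwise step and a three-stage 8-point version built from it. The ordering of the output coefficients and the helper names are choices made for this example, not taken from UPIC. Because right-shifting a negative int is implementation-defined in C, floor_half() spells out the floor-division that X>>1 is meant to give.

```c
/* Floor of x/2, matching an arithmetic X>>1 for negative values too. */
int floor_half(int x)
{
    return x >= 0 ? x / 2 : -((-x + 1) / 2);
}

/* Forward pair: (A,B) -> (C,D) */
void haar_fwd(int a, int b, int *c, int *d)
{
    *c = floor_half(a + b);
    *d = a - b;
}

/* Inverse pair: (C,D) -> (A,B), exact */
void haar_inv(int c, int d, int *a, int *b)
{
    *b = c - floor_half(d);
    *a = *b + d;
}

/* 8-point forward transform: out[0] is the overall average,
 * out[1] the last-stage difference, out[2..3] and out[4..7]
 * the differences from the earlier stages. */
void haar8_fwd(const int in[8], int out[8])
{
    int tmp[8], avg[4];
    for (int i = 0; i < 8; i++)
        tmp[i] = in[i];
    for (int n = 8; n > 1; n /= 2) {
        int half = n / 2;
        for (int i = 0; i < half; i++)
            haar_fwd(tmp[2 * i], tmp[2 * i + 1], &avg[i], &out[half + i]);
        for (int i = 0; i < half; i++)
            tmp[i] = avg[i];
    }
    out[0] = tmp[0];
}

/* 8-point inverse transform, undoing the stages in reverse order. */
void haar8_inv(const int in[8], int out[8])
{
    int tmp[8], wide[8];
    tmp[0] = in[0];
    for (int n = 2; n <= 8; n *= 2) {
        int half = n / 2;
        for (int i = 0; i < half; i++)
            haar_inv(tmp[i], in[half + i], &wide[2 * i], &wide[2 * i + 1]);
        for (int i = 0; i < n; i++)
            tmp[i] = wide[i];
    }
    for (int i = 0; i < 8; i++)
        out[i] = tmp[i];
}
```

The inverse works because C = B + floor(D/2) whenever D = A - B, so B = C - floor(D/2) recovers B without loss.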

The Walsh-Hadamard transform can give similar properties, but generally 
involves a few extra steps that make it more computationally expensive.

It is possible to use a lifting transform to make a Reversible DCT, but 
it is slow...


BGBCC accepts JPEG and PNG for input and can convert them to 
BMP/QOI/UPIC as needed.


For audio storage, generally using the RIFF WAV format. For bulk audio, 
both A-Law and IMA ADPCM work OK. Granted, IMA ADPCM is not space 
efficient for stereo, but mostly OK for mono (most common use-case for 
sound effects).


> I did exactly that at a period when my generated DLLs were buggy for 
> some reason (it turned out to be two reasons). I created a simple 
> dynamic library format of my own. Then I found the same format worked 
> also for executables.
> 
> But I needed a loader program to run them, as Windows obviously didn't 
> understand the format. Such a program can be written in 800 lines of C, 
> and can dynamically load libraries in both my format, and proper DLLs (not 
> the buggy ones I generated!).
> 
> A hello-world program is under 300 bytes compared with 2 or
> 2.5KB of EXE. And the format is portable to Linux, so no need to 
> generate ELF (but I haven't tried). Plus the format might be transparent 
> to AV software (haven't tried that either).
> 

OK.

By design, my PEL format (PE+LZ) isn't going to get under 2K (1K for 
headers, 1K for LZ'ed sections).

But, usually this is not a problem.



#389753

From: BGB <cr88192@gmail.com>
Date: 2024-12-20 05:10 -0600
Message-ID: <vk3jbp$3dldp$1@dont-email.me>
In reply to: #389747
On 12/19/2024 2:36 PM, BGB wrote:
> On 12/19/2024 5:27 AM, bart wrote:
>> On 19/12/2024 05:46, BGB wrote:
>>> On 12/18/2024 6:35 PM, bart wrote:
>>>> On 19/12/2024 00:27, BGB wrote:
>>
>>>>> By-Value Structs smaller than 16 bytes are passed as-if they were a 
>>>>> 64 or 128 bit integer type (as a single register or as a register 
>>>>> pair, with a layout matching their in-memory representation).
>>>>>
>>>>> ...
>>>>>
>>>>>
>>>>> But, yeah, at the IL level, one could potentially eliminate structs 
>>>>> and arrays as a separate construct, and instead have bare pointers 
>>>>> and a generic "reserve a blob of bytes in the frame and initialize 
>>>>> this pointer to point to it" operator (with the business end of 
>>>>> this operator happening in the function prolog).
>>>>
>>>> The problem with this, that I mentioned elsewhere, is how well it 
>>>> would work with SYS V ABI, since the rules for structs are complex, 
>>>> and apparently recursive.
>>>>
>>>> Having just a block of bytes might not be enough.
>>>
>>> In my case, I am not bothering with the SysV style ABI's (well, along 
>>> with there not being any x86 or x86-64 target...).
>>
>> I'd imagine it's worse with ARM targets as there are so many more 
>> registers to try and deconstruct structs into.
>>
> 
> Not messed much with the ARM64 ABI or similar, but I will draw the line 
> in the sand somewhere.
> 
> Struct passing/return is enough of an edge case that one can just sort 
> of declare it "no go" between compilers with "mostly but not strictly 
> compatible" ABIs.
> 
> 
>>>
>>> For my ISA, it is a custom ABI, but follows mostly similar rules to 
>>> some of the other "Microsoft style" ABIs (where, I have noted that 
>>> across multiple targets, MS tools have tended to use similar ABI 
>>> designs).
>>
>> When you do your own thing, it's easy.
>>
>> In the 1980s, I didn't need to worry about call conventions used for 
>> other software, since there /was/ no other software! I had to write 
>> everything, save for the odd calls to DOS which used some form of 
>> SYSCALL.
>>
>> Then, arrays and structs were actually passed and returned by value 
>> (not via hidden references), by copying the data to and from the stack.
>>
>> However, I don't recall ever using the feature, as I considered it 
>> inefficient. I always used explicit references in my code.
>>
> 
> Most of the time, one is passing/returning structures as pointers, and 
> not by value.
> 
> By value structures are usually small.
> 
> 
> When a structure is not small, it is both simpler to implement, and 
> usually faster, to internally pass it by reference.
> 
> If you pass a large structure to a function by value, via an on-stack 
> copy, and the function assigns it to another location (say, a global 
> variable):
>    Pass by reference: Only a single copy operation is needed;
>    Pass by value on-stack: At least two copy operations are needed.
> 
> One also needs to reserve enough space in the function arguments list to 
> hold any structures passed, which could be bad if they are potentially 
> large.
> 
> 
> 
> But, on my ISA, ABI is sort of like:
>    R4 ..R7 : Arg0 ..Arg3
>    R20..R23: Arg4 ..Arg7
>    R36..R39: Arg8 ..Arg11 (optional)
>    R52..R55: Arg12..Arg15 (optional)
> Return Value:
>    R2, R3:R2 (128 bit)
>    R2 is also used to pass in the return value pointer.
> 
> 'this':
>    Generally passed in either R3 or R18, depending on ABI variant.
> 
> Where, callee-save:
>    R8 ..R14,  R24..R31,
>    R40..R47,  R56..R63
>    R15=SP
> 
> Non-saved scratch:
>    R2 ..R7 ,  R16..R23,
>    R32..R39,  R48..R55
> 
> 
> Arguments beyond the first 8/16 register arguments are passed on stack. 
> In this case, a spill space for the first 8/16 arguments (64 or 128 
> bytes) is provided on stack before the first non-register argument.
> 
> If the function accepts a fixed number of arguments and the number of 
> argument registers is 8 or less, spill space need only be provided for 
> the first 8 arguments (calling vararg functions will always reserve 
> space for 16 registers in the 16-register ABI). This spill space 
> effectively belongs to the callee rather than the caller.
> 
> 
> Structures (by value):
>    1.. 8 bytes: Passed in a single register
>    9..16 bytes: Passed in a pair, padded to the next even pair
>    17+: Pass as a reference.
> 
> Things like 128-bit types are also passed/returned in register pairs.
> 
> 
> 
> Contrast, RV ABI:
>    X10..X17 are used for arguments;
>    No spill space is provided;
>    ...
> 
> My variant uses similar rules to my own ABI for passing/returning 
> structures, with:
>    X28, structure return pointer
>    X29, 'this'
> Normal return values go into X10 or X11:X10.
> 
> 
> 
> Note that in both ABI's, passing 'this' in a register would mean that 
> class instances and COM objects are not equivalent (COM object methods 
> always pass 'this' as the first argument).
> 
> The 'this' register is implicitly also used by lambdas to pass in the 
> pointer to the captured bindings area (which mostly resembles a 
> structure containing each variable captured by the lambda).
> 
> Can note though that in this case, capturing a binding by reference 
> means the lambda is limited to automatic lifetime (non-automatic lambdas 
> may only capture by value). In this case, capture by value is the default.
> 
> 
>>> For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
>>> Argument passing is basically similar, but struct pass/return is 
>>> different; and it passes floating-point values in GPRs (and, in my 
>>> own ISA, all floating-point values use GPRs, as there are no FPU 
>>> registers; though FPU registers do exist for RISC-V).
>>
>> Supporting C's variadic functions, which is needed for many languages 
>> when calling C across an FFI, usually requires different rules. On 
>> Win64 ABI for example, by passing low variadic arguments in both GPRs 
>> and FPU registers.
>>
> 
> I simplified things by assuming only GPRs are used.
> 
> 
>> /Implementing/ variadic functions (which only occurs if implementing 
>> C) is another headache if it has to work with the ABI (which can be 
>> assumed for a non-static function).
>>
>> I barely have a working solution for Win64 ABI, which needs to be done 
>> via stdarg.h, but wouldn't have a clue how to do it for SYS V.
>>
>> (Even Win64 has problems, as it assumes a downward-growing stack; in 
>> my IL interpreter, the stack grows upwards!)
>>
> 
> Most targets use a downward growing stack.
> Mine is no exception here...
> 
> 
>>> Not likely a huge issue as one is unlikely to use ELF and PE/COFF in 
>>> the same program.
>>>
>>>
>>> For the "OS" that runs on my CPU core, it is natively using PE/COFF, but 
>>
>> That's interesting: you deliberately used one of the most complex file 
>> formats around, when you could have devised your own?
>>
> 
> For what I wanted, I would have mostly needed to recreate most of the 
> same functionality as PE/COFF anyways.
> 
> 
> When one considers the entire loading process (including DLLs/SOs), then 
> PE/COFF loading is actually simpler than ELF loading (ELF subjects the 
> loader to needing to deal with symbol and relocation tables), similar to 
> PIE loading.
> 

My wording there sucked...

PIE loading is the same as the case for ELF shared object loading, so is 
fairly complex.

For normal loading, they try to make it simpler for the kernel loader by 
having a special "interpreter" program deal with it. The process it then 
uses to bootstrap itself is rather convoluted.

> 
> Things like the MZ stub are optional in my case, and mostly ignored if 
> present (in my LZ compressed PE variants, the MZ stub is omitted entirely).
> 

My loader will accept multiple sub-variants:
   With MZ stub (original format);
   Without MZ stub (but uncompressed);
   With LZ4 compression (no MZ stub allowed).


The format for the no-stub case is basically the same as the with-stub 
case, except that the stub is absent; the 'PE' sig is still present.

Note that in my variants, omitting the MZ stub does cause it to change 
to a different checksum algorithm (the original PE/COFF checksum being 
unacceptably weak).
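
A loader front-end for the first two sub-variants might distinguish them roughly as follows. This is only a sketch: the 'MZ' magic, the e_lfanew field at offset 0x3C, and the "PE\0\0" signature are standard PE/COFF facts, but the signature of the LZ4-compressed variant is not given here, so that case simply falls through to IMG_UNKNOWN. The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum image_kind { IMG_UNKNOWN, IMG_MZ_PE, IMG_BARE_PE };

enum image_kind classify_image(const uint8_t *buf, size_t len)
{
    /* No stub: the file begins directly with the PE signature. */
    if (len >= 4 && memcmp(buf, "PE\0\0", 4) == 0)
        return IMG_BARE_PE;

    /* With stub: 'MZ' header, e_lfanew at 0x3C points to the PE signature. */
    if (len >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        uint32_t off = (uint32_t)buf[0x3C] |
                       ((uint32_t)buf[0x3D] << 8) |
                       ((uint32_t)buf[0x3E] << 16) |
                       ((uint32_t)buf[0x3F] << 24);
        if (off <= len - 4 && memcmp(buf + off, "PE\0\0", 4) == 0)
            return IMG_MZ_PE;
    }

    /* Compressed variant (signature not specified in this post) or junk. */
    return IMG_UNKNOWN;
}
```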


> 
> I had at one point considered doing a custom format resembling LZ 
> compressed MachO, but ended up not bothering, as it wouldn't have really 
> saved anything over LZ compressed PE/COFF.
> 

The core process is still:
   Read stuff into memory;
   Apply post-load fixups.

This part of the process was essentially unavoidable.
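
For reference, the fixup step in standard PE/COFF walks the base relocation table and adds the load delta to each listed location. The sketch below handles only the standard HIGHLOW (type 3) and DIR64 (type 10) entries; a custom ISA would presumably need its own fixup types, so this is an illustration of the general shape rather than of my loader.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* image: the loaded image; reloc: the raw base relocation table;
 * delta: actual load address minus preferred image base.
 * Returns 0 on success, -1 on a malformed table or unsupported type. */
int apply_base_relocs(uint8_t *image, size_t image_size,
                      const uint8_t *reloc, size_t reloc_size,
                      uint64_t delta)
{
    size_t pos = 0;
    while (reloc_size - pos >= 8) {
        uint32_t page_rva = rd32(reloc + pos);       /* block's page RVA */
        uint32_t block_size = rd32(reloc + pos + 4); /* includes 8-byte header */
        if (block_size < 8 || block_size > reloc_size - pos)
            return -1;
        for (size_t i = 8; i + 2 <= block_size; i += 2) {
            unsigned entry = reloc[pos + i] | ((unsigned)reloc[pos + i + 1] << 8);
            unsigned type = entry >> 12;             /* high 4 bits */
            size_t off = (size_t)page_rva + (entry & 0x0FFFu);
            if (type == 0)
                continue;                            /* ABSOLUTE: padding */
            if (type == 3) {                         /* HIGHLOW: 32-bit */
                if (off > image_size || image_size - off < 4)
                    return -1;
                wr32(image + off, rd32(image + off) + (uint32_t)delta);
            } else if (type == 10) {                 /* DIR64: 64-bit */
                if (off > image_size || image_size - off < 8)
                    return -1;
                uint64_t v = rd32(image + off) |
                             ((uint64_t)rd32(image + off + 4) << 32);
                v += delta;
                wr32(image + off, (uint32_t)v);
                wr32(image + off + 4, (uint32_t)(v >> 32));
            } else {
                return -1;                           /* not handled here */
            }
        }
        pos += block_size;
    }
    return 0;
}
```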

> 
> Some "unneeded cruft" like the Resource Section was discarded, mostly 
> replaced by an embedded WAD2 image. The header was modified some to 
> allow for backwards compatibility with the Windows format (mostly 
> creating a dummy header in the original format that points to the WAD2 
> directory).
> 

Note that the change of resource section format was more because the 
original approach to the resource section made little sense to me.

Identifying things with short names made a lot more sense than magic 
numbers.

The WAD approach worked for Doom and similar, probably sufficient for 
things like inline bitmap images and icons.

> 
> Idea is that icons, bitmaps, and other things, would mostly be held in 
> WAD lumps. Though, resources which may be accessed via symbols in the 
> EXE/DLL need to be stored uncompressed (where "__rsrc_lumpname" may be 
> used to access the contents of resource-section lumps as an extern symbol).
> 

Note that it can also load blobs of text or binary data.
Though, BGBCC provides less in terms of format converters for arbitrary 
data.

A special text format is used both to define files to pull into the 
resource section (and what lump name to use), as well as format 
conversions to apply.


> Say, for example:
>    extern byte __rsrc_mybitmap[];  //resolves to a DIB/BMP or similar
> 
> For now, resource formats:
>    Images:
>      BMP (various settings)
>        4, 8, and 16 bpp typical
>        Supports a non-standard 16-bpp alpha-blended mode (*1).
>        Supports non-standard 16 color and 256 color with transparent.
>        Supports CRAM BMP as well (2 bpp)
>      QOI (assumes RGBA32, nominally lossless)
>        QOI is a semi-simplistic non-entropy-coded format.
>        Can give PNG-like compression in some cases.
>        Reasonably fast/cheap to decode.
>      LCIF, custom lossy format, color-cell compression.
>        OK Q/bpp but mostly only on the low-end.
>        Resembles a QOI+CRAM hybrid.
>      UPIC, lossy or lossless, JPEG-like (*2)
> 
> *1:
>    0rrrrrgggggbbbbb  Normal/Opaque
>    1rrrraggggabbbba  With 3 bit alpha (4b/ch RGB).
> 
> For 16 and 256 color, a variant is supported with a transparent color.
> Generally the high intensity magenta is reused as the transparent color. 
> This is encoded in the color palette (if all colors apart from one have 
> the alpha bits set to FF, and one color has 00, then that color is 
> assumed to be a transparent color).
> 
> CRAM bpp: Uses a limited form of the 8-bit CRAM format:
>    16 bits, 4x4 pixels, 1 bit per pixel
>    2x 8 bits: Color Endpoints
> The rest of the format being unsupported, so it can simply assume a 
> fixed 32-bits per 4x4 pixel cell.
> 

There are cases where one may want this...
If an image doesn't have more than 2 colors per 4x4 cell, it may give an 
acceptable image (and is often less space than 16-color).

Though, for small images, 16 color may use less space due to a smaller 
color palette (but, in theory, could add a special case to allow 
omitting the color palette when it is the default palette).

Say:
   biBitCount=8, biClrUsed=0, biClrImportant=256
Encoding a special "palette is absent, use fixed OS palette" case.
   As the BMP format burns 1K just to encode a 256-color palette.
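
If such a special case were added, detecting it could be as simple as the sketch below. It reads the three fields at their standard offsets within a 40-byte BITMAPINFOHEADER (biBitCount at 14, biClrUsed at 32, biClrImportant at 36); the function name and the idea of testing exactly this combination are hypothetical.

```c
#include <stdint.h>

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* hdr points at a BITMAPINFOHEADER (little-endian, 40 bytes).
 * Returns nonzero for the proposed "palette absent" encoding. */
int bmp_uses_implicit_palette(const uint8_t *hdr)
{
    unsigned bit_count     = hdr[14] | ((unsigned)hdr[15] << 8); /* biBitCount */
    uint32_t clr_used      = le32(hdr + 32);                     /* biClrUsed */
    uint32_t clr_important = le32(hdr + 36);                     /* biClrImportant */

    return bit_count == 8 && clr_used == 0 && clr_important == 256;
}
```

The 1K figure follows from 256 palette entries at 4 bytes each.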


> 
> 
> *2: The UPIC format is structurally similar to JPEG, but:
>    Uses TLV packaging (vs FF-escape tagging);
>    Uses Rice coding (vs Huffman)
>    Uses Z3.V5 VLC, vs Z4.V4
>    Uses Block-Haar and RCT
>      Vs DCT and YCbCr.
>    Supports an alpha channel.
>      Y    1       (*2A)
>      YA   1:1     (*2A)
>      YUV  4:2:0
>      YUV  4:4:4   (*2A)
>      YUVA 4:2:0:4
>      YUVA 4:4:4:4 (*2A)
>    *2A: May be used in the lossless modes, depending on image.
> 
> 
> VLC coding resembles Deflate's natch distance encoding, with sign-folded 
> values. Runs of zero coefficients have a shorter limit, but similar. 
> Like with JPEG, an 0x00 symbol encodes an early EOB.
> 

^ match. Also, UPIC is a custom format.

Add context:
Actually, it is using an entropy coding scheme I call STF+AdRice:
   Swap towards front, with Adaptive Rice Coding.

The Rice coding parameter (k) is adapted based on Q:
   0: k--;
   1: no change;
   2..7: k++
   8: k++; Symbol index encoded as a raw 8 bits.

Symbols are encoded as indices into a table. Whenever an index is 
encoded, the symbol swaps places with the symbol at (I*15)/16, causing 
more commonly used symbols to migrate towards 0.
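
The two moving parts can be sketched separately from the bitstream handling. In the sketch below, Q is assumed to be the length of the unary prefix of the Rice code, the clamping of k to 0..7 is an assumption, and the function names are invented; only the swap target (I*15)/16 and the k-- / no change / k++ rule come from the description.

```c
#include <stdint.h>

/* Start with the identity mapping from index to symbol. */
void stf_init(uint8_t tab[256])
{
    for (int i = 0; i < 256; i++)
        tab[i] = (uint8_t)i;
}

/* Decoder side: turn a decoded index into a symbol, then swap that
 * symbol with the one at (index*15)/16 so it drifts towards 0. */
uint8_t stf_take(uint8_t tab[256], unsigned index)
{
    index &= 255u;
    unsigned j = (index * 15u) / 16u;
    uint8_t sym = tab[index];
    tab[index] = tab[j];
    tab[j] = sym;
    return sym;
}

/* Adapt the Rice parameter k from the prefix length q (0..8). */
int adrice_adapt(int k, int q)
{
    if (q == 0) {
        if (k > 0) k--;           /* code was shorter than needed */
    } else if (q >= 2) {
        if (k < 7) k++;           /* q == 8 also means a raw 8-bit index follows */
    }
    return k;                     /* q == 1: no change */
}
```

An encoder would additionally need the reverse lookup (symbol to current index), for instance via a second table kept in sync with the swaps.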

Theoretically, the decoding process is more complex than a table-driven 
static Huffman decoder (as well as worse compression), but:
   Less memory is needed;
   Faster to initialize;
   On average, it is speed competitive.
     Lookup table initialization for static Huffman is expensive;
     Decode speed hindered by high L1 miss rates.

With a 15-bit symbol-length limit, Huffman has a very high L1 miss rate. 
Generally, to be fast, one needs to impose a 12 or 13 bit symbol length 
limit, reducing compression, but greatly reducing the number of L1 
misses. Though, 12 bits is a lower limit in practice (going much less 
than this, and Huffman coding becomes ineffective).
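
To put rough numbers on this: a single-level lookup table indexed by the next N bits of input needs 2^N entries. The entry size is an assumption here (2 bytes, e.g. symbol plus code length); the helper name is made up.

```c
#include <stddef.h>

/* Bytes needed by a single-level decode table for codes of up to
 * max_len bits, with entry_bytes per table entry. */
size_t huff_table_bytes(unsigned max_len, size_t entry_bytes)
{
    return ((size_t)1 << max_len) * entry_bytes;
}
```

With 2-byte entries this gives 64K for a 15-bit limit, 16K for 13 bits, and 8K for 12 bits, per table. L1 data caches are commonly in the tens of kilobytes, so the 15-bit table alone can exceed the cache while the 12-bit one fits with room to spare.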



> In tests, on my main PC:
>    Vs JPEG: It is a little faster
>      Q/bpp is similar, better/worse depends on image.
>        Slightly worse on photos, but "similar".
>        Generally somewhat better on artificial images.
>    Vs PNG:
>      Faster to decode (with less memory overhead);
>      Better compression on many images (particularly photo-like).
> 
> Note that UPIC was designed to not require any large intermediate 
> buffers, so will decode directly to an RGB555 or RGBA32 output buffer 
> (decoding happens in terms of individual 16x16 pixel macroblocks).
> 
> It was designed to be moderately fast and to try to minimize memory 
> overhead for decoding (vs either PNG or JPEG, which need a more 
> significant chunk of working memory to decode).
> 
> 
> Block-Haar is a Haar transform made to fit the same 8x8 pixel blocks as 
> DCT, where Haar maps (A,B)->(C,D):
>    C=(A+B)/2  (*: X/2 here being defined as (X>>1))
>    D=A-B
> But, can be reversed exactly, IIRC:
>    B=C-(D/2)
>    A=B+D
> By doing multiple stages of Haar transform, one can build an 8-pixel 
> version, and then use horizontal and vertical transforms for an 8x8 
> block. It is computationally fairly cheap, and lossless.
> 
> The Walsh-Hadamard transform can give similar properties, but generally 
> involves a few extra steps that make it more computationally expensive.
> 
> It is possible to use a lifting transform to make a Reversible DCT, but 
> it is slow...
> 

Also, the code-size footprint for UPIC is smaller than a JPEG decoder.


> 
> BGBCC accepts JPEG and PNG for input and can convert them to BMP/QOI/ 
> UPIC as needed.
> 
> 
> For audio storage, generally using the RIFF WAV format. For bulk audio, 
> both A-Law and IMA ADPCM work OK. Granted, IMA ADPCM is not space 
> efficient for stereo, but mostly OK for mono (most common use-case for 
> sound effects).
> 

This isn't used much yet in this project.

In general, for other cases where I use audio, 16kHz is a typical default.

Where:
   8 and 11 kHz sound poor.
   Also 8-bit linear PCM sounds poor.

I am less a fan of MP3:
   Very complex decoder;
   Much under 96 or 128 kbps, has very obvious audio distortions...
     At lower bitrates, the audio quality is decidedly unpleasant.
   IMHO: 16 kHz ADPCM sounds better than 64 kbps MP3.

Not sure why it is so popular, when, as noted, at lower bitrates it 
sounds pretty broken (but, then again, it mostly sounds fine at 128 
kbps or beyond, so dunno).

ADPCM's property of sounding tinny is still preferable to sounding like 
one is rattling a steel can full of broken glass, IMHO.


Did experimentally create an MP3-like audio codec (but much simpler), 
also using Block-Haar (rather than MDCT), and reused some amount of code 
from UPIC, which seems to avoid some of MP3's more obvious artifacts. 
But, the design did have a few of its own issues (might need to revisit 
later).

Mostly, it uses a half-cubic spline to approximate the low-frequency 
components and to try to reduce blocking artifacts (the spline is 
subtracted out so only higher frequency components use the Block-Haar). 
But, seemingly the spline was too coarse (one sample per block), and I 
would likely need a higher effective sampling rate for the spline to 
avoid blocking artifacts in some cases. This mostly shows up with sounds 
at roughly the same frequency as the block size, which effectively turn 
into square waves, which sound bad.

> 
>> I did exactly that at a period when my generated DLLs were buggy for 
>> some reason (it turned out to be two reasons). I created a simple 
>> dynamic library format of my own. Then I found the same format worked 
>> also for executables.
>>
>> But I needed a loader program to run them, as Windows obviously didn't 
>> understand the format. Such a program can be written in 800 lines of 
>> C, and can dynamically load libraries in both my format, and proper DLLs 
>> (not the buggy ones I generated!).
>>
>> A hello-world program is under 300 bytes compared with 2 or
>> 2.5KB of EXE. And the format is portable to Linux, so no need to 
>> generate ELF (but I haven't tried). Plus the format might be 
>> transparent to AV software (haven't tried that either).
>>
> 
> OK.
> 
> By design, my PEL format (PE+LZ) isn't going to get under 2K (1K for 
> headers, 1K for LZ'ed sections).
> 
> But, usually this is not a problem.
> 
> 



#389789

From: Lawrence D'Oliveiro <ldo@nz.invalid>
Date: 2024-12-23 02:08 +0000
Message-ID: <vkagne$sv9v$1@dont-email.me>
In reply to: #389738
On Wed, 18 Dec 2024 23:46:21 -0600, BGB wrote:

> ... (what debug mechanisms I have, effectively lack any symbols
> for things inside "ld-linux.so"'s domain).

    nm -D /lib/ld-linux.so.2


