Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.c > #389665 > unrolled thread
| Started by | Thiago Adams <thiago.adams@gmail.com> |
|---|---|
| First post | 2024-12-15 00:05 -0300 |
| Last post | 2025-02-09 12:43 -0800 |
| Articles | 20 on this page of 140 — 19 participants |
Back to article view | Back to comp.lang.c
transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 00:05 -0300
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-15 04:31 +0000
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 07:44 -0300
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-15 22:22 +0000
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 20:22 -0300
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-16 01:02 -0600
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-16 08:17 -0300
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 11:46 +0000
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-16 19:44 +0000
Re: transpiling to low level C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-12-16 13:59 -0800
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 23:36 +0000
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-14 20:39 -0800
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 07:49 -0300
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-15 13:01 -0800
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-02-15 21:01 -0800
USENET and spam (Was: Re: transpiling to low level C) Salvador Mirzo <smirzo@example.com> - 2025-02-16 10:17 -0300
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-15 11:28 +0000
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 08:46 -0300
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-15 09:13 -0300
Re: transpiling to low level C Bonita Montero <Bonita.Montero@gmail.com> - 2024-12-15 20:08 +0100
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-15 21:32 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-15 17:53 -0600
Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-16 10:36 +0100
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-16 08:21 -0300
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 01:03 -0600
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 14:55 -0300
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 14:59 -0300
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 15:16 -0300
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 18:37 +0000
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 16:07 -0300
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 19:42 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 12:51 -0600
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-18 16:43 -0300
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 18:27 -0600
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-19 00:35 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 23:46 -0600
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-19 11:27 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-19 14:36 -0600
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-20 05:10 -0600
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-23 02:08 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-23 05:15 -0600
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 13:07 -0600
Re: transpiling to low level C Thiago Adams <thiago.adams@gmail.com> - 2024-12-17 16:33 -0300
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 12:51 -0600
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-21 05:34 +0000
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-16 18:12 +0100
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 18:37 +0000
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-16 21:39 +0100
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-16 23:26 +0000
Re: transpiling to low level C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-12-16 17:19 -0800
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 00:40 -0600
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 16:17 +0000
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-17 18:18 +0100
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-17 18:46 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 22:45 +0000
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-18 00:23 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-18 01:24 +0000
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-18 03:51 +0000
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-18 17:26 +0100
Re: transpiling to low level C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-12-17 12:13 -0800
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-18 17:19 +0100
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-17 18:29 +0100
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-20 17:28 -0800
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-21 21:31 +0100
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-21 13:51 -0800
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 01:22 +0100
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-13 08:10 -0800
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 00:20 +0200
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 01:13 +0100
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 02:18 +0200
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 01:39 +0100
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 03:04 +0200
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 03:06 +0100
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-22 17:39 -0800
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-23 02:41 +0000
Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 08:43 +0100
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-25 00:51 -0600
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-28 09:20 -0800
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-04 12:12 -0800
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-01-04 12:53 -0800
Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2025-01-05 11:18 +0000
Re: transpiling to low level C James Kuyper <jameskuyper@alumni.caltech.edu> - 2025-01-05 12:04 -0500
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-07 21:38 -0800
Re: transpiling to low level C James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-12-21 22:17 -0500
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 19:51 +0100
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-06-06 11:50 -0700
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 13:02 -0800
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-23 13:25 -0800
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-23 15:50 -0800
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-22 06:01 +0000
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-22 11:22 +0200
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-22 11:35 +0000
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-22 10:38 -0800
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-22 19:44 +0000
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-04 11:18 -0800
Re: transpiling to low level C Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-12-22 20:41 +0100
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-23 00:20 +0200
Re: transpiling to low level C scott@slp53.sl.home (Scott Lurndal) - 2024-12-23 15:41 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-23 15:51 +0000
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-23 18:05 +0200
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 14:05 -0800
Re: transpiling to low level C antispam@fricas.org (Waldek Hebisch) - 2024-12-22 23:29 +0000
Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 09:46 +0100
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-23 11:35 +0000
Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 13:18 +0100
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-23 13:40 +0200
Re: transpiling to low level C David Brown <david.brown@hesbynett.no> - 2024-12-23 13:24 +0100
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 13:18 -0800
Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2024-12-24 00:41 +0000
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 20:55 -0800
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-25 03:41 -0600
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-25 15:43 -0600
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-28 09:24 -0800
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-28 13:59 -0600
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-31 04:57 -0800
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-12-23 13:28 -0800
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-23 14:00 -0800
Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2024-12-22 14:19 +0000
Re: transpiling to low level C Ben Bacarisse <ben@bsb.me.uk> - 2024-12-22 15:30 +0000
Re: transpiling to low level C Kaz Kylheku <643-408-1753@kylheku.com> - 2024-12-22 21:45 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-22 23:22 +0000
Re: transpiling to low level C Kaz Kylheku <643-408-1753@kylheku.com> - 2024-12-22 23:47 +0000
Re: transpiling to low level C Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-12-22 17:22 -0800
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-16 21:23 +0000
Re: transpiling to low level C Michael S <already5chosen@yahoo.com> - 2024-12-17 11:16 +0200
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 12:04 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-17 12:51 -0600
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-18 12:08 +0000
Re: transpiling to low level C BGB <cr88192@gmail.com> - 2024-12-18 12:50 -0600
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-18 23:37 +0000
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-17 19:40 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 19:45 +0000
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-17 22:25 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-17 22:55 +0000
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-18 05:55 +0000
Re: transpiling to low level C bart <bc@freeuk.com> - 2024-12-19 00:32 +0000
Re: transpiling to low level C Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-12-16 21:22 +0000
Re: transpiling to low level C Rosario19 <Ros@invalid.invalid> - 2024-12-26 13:16 +0100
Re: transpiling to low level C User One <noreply@invalid.com> - 2025-02-09 17:51 +0000
Re: transpiling to low level C "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-02-09 12:43 -0800
Page 3 of 7 — ← Prev page 1 2 [3] 4 5 6 7 Next page →
| From | BGB <cr88192@gmail.com> |
|---|---|
| Date | 2024-12-23 05:15 -0600 |
| Message-ID | <vkbgom$15p6c$2@dont-email.me> |
| In reply to | #389789 |
On 12/22/2024 8:08 PM, Lawrence D'Oliveiro wrote:
> On Wed, 18 Dec 2024 23:46:21 -0600, BGB wrote:
>
>> ... (what debug mechanisms I have, effectively lack any symbols
>> for things inside "ld-linux.so"'s domain).
>
> nm -D /lib/ld-linux.so.2
It is not actually on Linux, but rather trying to make my kernel mimic
Linux...
The issue isn't getting the symbol map, but rather that in this case,
there are multiple levels of abstraction and so, at the level of the CPU
emulator (where I can get instruction traces when something crashes), it
can no longer figure out what addresses map to where.
With the normal PE loader, it can send messages to the virtual debug
UART which signal where it has loaded things in memory (for every EXE
and DLL). But, things partly break down for ELF PIE binaries with glibc
or musl.
Granted, the ELF loader does at least know in theory where the main
binary and interpreter were loaded.
But, seemingly, process is sort of like:
Read in main ELF binary;
Read in interpreter;
Set up argument list, environment, and other stuff (*1), on the stack;
Branch to entry point on interpreter;
Magic happens.
(Currently, it just crashes).
*1:
(SP+ 0): argc
(SP+ 8): argv[0]
(SP+16): argv[1]
...
(SP+(argc+1)*8): NULL
(SP+xx): Env var pointers...
(SP+xx): NULL
(SP+xx): Auxiliary Vectors
Key/value pairs
Terminated by Key==0
Information on how exactly to set up the auxiliary vectors in a way that
glibc and musl are happy, is harder to figure out. At this stage, things
become rather poorly documented.
Theoretically the interpreter program is responsible for loading the
other SO's; or if the main ELF loader is supposed to do it, it is not
obvious how it is supposed to tell ld-so where it had loaded them.
...
[toc] | [prev] | [next] | [standalone]
| From | BGB <cr88192@gmail.com> |
|---|---|
| Date | 2024-12-17 13:07 -0600 |
| Message-ID | <vjsi62$1s5j5$2@dont-email.me> |
| In reply to | #389702 |
On 12/17/2024 11:55 AM, Thiago Adams wrote:
> Em 12/17/2024 4:03 AM, BGB escreveu:
>> On 12/16/2024 5:21 AM, Thiago Adams wrote:
>>> On 15/12/2024 20:53, BGB wrote:
>>>> On 12/15/2024 3:32 PM, bart wrote:
>>>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>>>> C++ is more readable because is is magnitudes more expressive than C.
>>>>>> You can easily write a C++-statement that would hunddres of lines in
>>>>>> C (imagines specializing a unordered_map by hand). Making a language
>>>>>> less expressive makes it even less readable, and that's also true for
>>>>>> your reduced C.
>>>>>>
>>>>>
>>>>> That's not really the point of it. This reduced C is used as an
>>>>> intermediate language for a compiler target. It will not usually be
>>>>> read, or maintained.
>>>>>
>>>>> An intermediate language needs to at a lower level than the source
>>>>> language.
>>>>>
>>>>> And for this project, it needs to be compilable by any C89 compiler.
>>>>>
>>>>> Generating C++ would be quite useless.
>>>>>
>>>>
>>>> As an IL, even C is a little overkill, unless turned into a
>>>> restricted subset (say, along similar lines to GCC's GIMPLE).
>>>>
>>>> Say:
>>>> Only function-scope variables allowed;
>>>> No high-level control structures;
>>>> ...
>>>>
>>>> Say:
>>>> int foo(int x)
>>>> {
>>>> int i, v;
>>>> for(i=x, v=0; i>0; i--)
>>>> v=v*i;
>>>> return(v);
>>>> }
>>>>
>>>> Becoming, say:
>>>> int foo(int x)
>>>> {
>>>> int i;
>>>> int v;
>>>> i=x;
>>>> v=0;
>>>> if(i<=0)goto L1;
>>>> L0:
>>>> v=v*i;
>>>> i=i-1;
>>>> if(i>0)goto L0;
>>>> L1:
>>>> return v;
>>>> }
>>>>
>>>> ...
>>>>
>>>
>>> I have considered to remove loops and keep only goto.
>>> But I think this is not bring too much simplification.
>>>
>>
>> It depends.
>>
>> If the compiler works like an actual C compiler, with a full parser
>> and AST stage, yeah, it may not save much.
>>
>>
>> If the parser is a thin wrapper over 3AC operations (only allowing
>> statements that map 1:1 with a 3AC IR operation), it may save a bit
>> more...
>>
>>
>>
>> As for whether or not it makes sense to use a C like syntax here, this
>> is more up for debate (for practical use within a compiler, I would
>> assume a binary serialization rather than an ASCII syntax, though
>> ASCII may be better in terms of inter-operation or human readability).
>>
>>
>> But, as can be noted, I would assume a binary serialization that is
>> oriented around operators; and *not* about serializing the structures
>> used to implement those operators. Also I would assume that the IR
>> need not be in SSA form (conversion to full SSA could be done when
>> reading in the IR operations).
>>
>>
>> Ny argument is that not using SSA form means fewer issues for both the
>> serialization format and compiler front-end to need to deal with (and
>> is comparably easy to regenerate for the backend, with the backend
>> operating with its internal IR in SSA form).
>>
>> Well, contrast to LLVM assuming everything is always in SSA form.
>>
>> ...
>>
>>
>
> I also have considered split expressions.
>
> For instance
>
> if (a*b+c) {}
>
> into
>
> register int r1 = a * b;
> register int r2 = r1 + c;
> if (r2) {}
>
> This would make easier to add overflow checks in runtime (if desired)
> and implement things like _complex
>
> Is this what you mean by 3AC or SSA?
>
3AC means that IR expressed 3 (or sometimes more) operands per IR op.
So:
MUL r1, a, b
Rather than, say, stack:
LOAD a
LOAD b
MUL
STORE r1
SSA:
Static Single Assignment
Generally:
Every variable may only be assigned once (more like in a functional
programming language);
Generally, variables are "merged" in the control-flow via PHI operators
(which variable merges in depending on which path control came from).
IMHO, while SSA is preferable for backend analysis, optimization, and
code generation; it is undesirable pretty much everywhere else as it
adds too much complexity.
Better IMO for the frontend compiler and main IL stage to assume that
local variables are freely mutable.
Typically, global variables are excluded in most variants, and remain
fully mutable; but may be handled as designated LOAD/STORE operations.
In BGBCC though, full SSA only applies to temporaries. Normal local
variables are merely flagged by "version", and all versions of the same
local variable implicitly merge back together at each branch/label.
This allows some similar advantages (for analysis and optimization)
while limiting some of the complexities. Though, this differs from
temporaries which are assumed to essentially fully disappear once they
go outside of the span in which they exist (albeit with an awkward case
to deal with temporaries that cross basic-block boundaries, which need
to actually "exist" in some semi-concrete form, more like local variables).
Note that unless the address is taken of a local variable, it need not
have any backing in memory. Temporaries can never have their address
taken, so generally exist exclusively in CPU registers.
> This would definitely simplify expressions grammar.
>
>
[toc] | [prev] | [next] | [standalone]
| From | Thiago Adams <thiago.adams@gmail.com> |
|---|---|
| Date | 2024-12-17 16:33 -0300 |
| Message-ID | <vjsjll$1rlkq$3@dont-email.me> |
| In reply to | #389709 |
Em 12/17/2024 4:07 PM, BGB escreveu:
> On 12/17/2024 11:55 AM, Thiago Adams wrote:
>> Em 12/17/2024 4:03 AM, BGB escreveu:
>>> On 12/16/2024 5:21 AM, Thiago Adams wrote:
>>>> On 15/12/2024 20:53, BGB wrote:
>>>>> On 12/15/2024 3:32 PM, bart wrote:
>>>>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>>>>> C++ is more readable because is is magnitudes more expressive
>>>>>>> than C.
>>>>>>> You can easily write a C++-statement that would hunddres of lines in
>>>>>>> C (imagines specializing a unordered_map by hand). Making a language
>>>>>>> less expressive makes it even less readable, and that's also true
>>>>>>> for
>>>>>>> your reduced C.
>>>>>>>
>>>>>>
>>>>>> That's not really the point of it. This reduced C is used as an
>>>>>> intermediate language for a compiler target. It will not usually
>>>>>> be read, or maintained.
>>>>>>
>>>>>> An intermediate language needs to at a lower level than the source
>>>>>> language.
>>>>>>
>>>>>> And for this project, it needs to be compilable by any C89 compiler.
>>>>>>
>>>>>> Generating C++ would be quite useless.
>>>>>>
>>>>>
>>>>> As an IL, even C is a little overkill, unless turned into a
>>>>> restricted subset (say, along similar lines to GCC's GIMPLE).
>>>>>
>>>>> Say:
>>>>> Only function-scope variables allowed;
>>>>> No high-level control structures;
>>>>> ...
>>>>>
>>>>> Say:
>>>>> int foo(int x)
>>>>> {
>>>>> int i, v;
>>>>> for(i=x, v=0; i>0; i--)
>>>>> v=v*i;
>>>>> return(v);
>>>>> }
>>>>>
>>>>> Becoming, say:
>>>>> int foo(int x)
>>>>> {
>>>>> int i;
>>>>> int v;
>>>>> i=x;
>>>>> v=0;
>>>>> if(i<=0)goto L1;
>>>>> L0:
>>>>> v=v*i;
>>>>> i=i-1;
>>>>> if(i>0)goto L0;
>>>>> L1:
>>>>> return v;
>>>>> }
>>>>>
>>>>> ...
>>>>>
>>>>
>>>> I have considered to remove loops and keep only goto.
>>>> But I think this is not bring too much simplification.
>>>>
>>>
>>> It depends.
>>>
>>> If the compiler works like an actual C compiler, with a full parser
>>> and AST stage, yeah, it may not save much.
>>>
>>>
>>> If the parser is a thin wrapper over 3AC operations (only allowing
>>> statements that map 1:1 with a 3AC IR operation), it may save a bit
>>> more...
>>>
>>>
>>>
>>> As for whether or not it makes sense to use a C like syntax here,
>>> this is more up for debate (for practical use within a compiler, I
>>> would assume a binary serialization rather than an ASCII syntax,
>>> though ASCII may be better in terms of inter-operation or human
>>> readability).
>>>
>>>
>>> But, as can be noted, I would assume a binary serialization that is
>>> oriented around operators; and *not* about serializing the structures
>>> used to implement those operators. Also I would assume that the IR
>>> need not be in SSA form (conversion to full SSA could be done when
>>> reading in the IR operations).
>>>
>>>
>>> Ny argument is that not using SSA form means fewer issues for both
>>> the serialization format and compiler front-end to need to deal with
>>> (and is comparably easy to regenerate for the backend, with the
>>> backend operating with its internal IR in SSA form).
>>>
>>> Well, contrast to LLVM assuming everything is always in SSA form.
>>>
>>> ...
>>>
>>>
>>
>> I also have considered split expressions.
>>
>> For instance
>>
>> if (a*b+c) {}
>>
>> into
>>
>> register int r1 = a * b;
>> register int r2 = r1 + c;
>> if (r2) {}
>>
>> This would make easier to add overflow checks in runtime (if desired)
>> and implement things like _complex
>>
>> Is this what you mean by 3AC or SSA?
>>
>
> 3AC means that IR expressed 3 (or sometimes more) operands per IR op.
>
> So:
> MUL r1, a, b
> Rather than, say, stack:
> LOAD a
> LOAD b
> MUL
> STORE r1
>
>
> SSA:
> Static Single Assignment
>
Oh sorry .. I knew what SSA is.
> Generally:
> Every variable may only be assigned once (more like in a functional
> programming language);
> Generally, variables are "merged" in the control-flow via PHI operators
> (which variable merges in depending on which path control came from).
>
I do similar merge in my flow analysis but without the concept of SSA.
> IMHO, while SSA is preferable for backend analysis, optimization, and
> code generation; it is undesirable pretty much everywhere else as it
> adds too much complexity.
>
> Better IMO for the frontend compiler and main IL stage to assume that
> local variables are freely mutable.
>
> Typically, global variables are excluded in most variants, and remain
> fully mutable; but may be handled as designated LOAD/STORE operations.
>
>
> In BGBCC though, full SSA only applies to temporaries. Normal local
> variables are merely flagged by "version", and all versions of the same
> local variable implicitly merge back together at each branch/label.
>
Sorry what is BGBCC ? (C compiler?)
> This allows some similar advantages (for analysis and optimization)
> while limiting some of the complexities. Though, this differs from
> temporaries which are assumed to essentially fully disappear once they
> go outside of the span in which they exist (albeit with an awkward case
> to deal with temporaries that cross basic-block boundaries, which need
> to actually "exist" in some semi-concrete form, more like local variables).
>
> Note that unless the address is taken of a local variable, it need not
> have any backing in memory. Temporaries can never have their address
> taken, so generally exist exclusively in CPU registers.
>
>
>> This would definitely simplify expressions grammar.
>>
>>
>
It can be added in the future.
[toc] | [prev] | [next] | [standalone]
| From | BGB <cr88192@gmail.com> |
|---|---|
| Date | 2024-12-18 12:51 -0600 |
| Message-ID | <vjv5jd$2ds8r$3@dont-email.me> |
| In reply to | #389710 |
On 12/17/2024 1:33 PM, Thiago Adams wrote:
> Em 12/17/2024 4:07 PM, BGB escreveu:
>> On 12/17/2024 11:55 AM, Thiago Adams wrote:
>>> Em 12/17/2024 4:03 AM, BGB escreveu:
>>>> On 12/16/2024 5:21 AM, Thiago Adams wrote:
>>>>> On 15/12/2024 20:53, BGB wrote:
>>>>>> On 12/15/2024 3:32 PM, bart wrote:
>>>>>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>>>>>> C++ is more readable because is is magnitudes more expressive
>>>>>>>> than C.
>>>>>>>> You can easily write a C++-statement that would hunddres of
>>>>>>>> lines in
>>>>>>>> C (imagines specializing a unordered_map by hand). Making a
>>>>>>>> language
>>>>>>>> less expressive makes it even less readable, and that's also
>>>>>>>> true for
>>>>>>>> your reduced C.
>>>>>>>>
>>>>>>>
>>>>>>> That's not really the point of it. This reduced C is used as an
>>>>>>> intermediate language for a compiler target. It will not usually
>>>>>>> be read, or maintained.
>>>>>>>
>>>>>>> An intermediate language needs to at a lower level than the
>>>>>>> source language.
>>>>>>>
>>>>>>> And for this project, it needs to be compilable by any C89 compiler.
>>>>>>>
>>>>>>> Generating C++ would be quite useless.
>>>>>>>
>>>>>>
>>>>>> As an IL, even C is a little overkill, unless turned into a
>>>>>> restricted subset (say, along similar lines to GCC's GIMPLE).
>>>>>>
>>>>>> Say:
>>>>>> Only function-scope variables allowed;
>>>>>> No high-level control structures;
>>>>>> ...
>>>>>>
>>>>>> Say:
>>>>>> int foo(int x)
>>>>>> {
>>>>>> int i, v;
>>>>>> for(i=x, v=0; i>0; i--)
>>>>>> v=v*i;
>>>>>> return(v);
>>>>>> }
>>>>>>
>>>>>> Becoming, say:
>>>>>> int foo(int x)
>>>>>> {
>>>>>> int i;
>>>>>> int v;
>>>>>> i=x;
>>>>>> v=0;
>>>>>> if(i<=0)goto L1;
>>>>>> L0:
>>>>>> v=v*i;
>>>>>> i=i-1;
>>>>>> if(i>0)goto L0;
>>>>>> L1:
>>>>>> return v;
>>>>>> }
>>>>>>
>>>>>> ...
>>>>>>
>>>>>
>>>>> I have considered to remove loops and keep only goto.
>>>>> But I think this is not bring too much simplification.
>>>>>
>>>>
>>>> It depends.
>>>>
>>>> If the compiler works like an actual C compiler, with a full parser
>>>> and AST stage, yeah, it may not save much.
>>>>
>>>>
>>>> If the parser is a thin wrapper over 3AC operations (only allowing
>>>> statements that map 1:1 with a 3AC IR operation), it may save a bit
>>>> more...
>>>>
>>>>
>>>>
>>>> As for whether or not it makes sense to use a C like syntax here,
>>>> this is more up for debate (for practical use within a compiler, I
>>>> would assume a binary serialization rather than an ASCII syntax,
>>>> though ASCII may be better in terms of inter-operation or human
>>>> readability).
>>>>
>>>>
>>>> But, as can be noted, I would assume a binary serialization that is
>>>> oriented around operators; and *not* about serializing the
>>>> structures used to implement those operators. Also I would assume
>>>> that the IR need not be in SSA form (conversion to full SSA could be
>>>> done when reading in the IR operations).
>>>>
>>>>
>>>> Ny argument is that not using SSA form means fewer issues for both
>>>> the serialization format and compiler front-end to need to deal with
>>>> (and is comparably easy to regenerate for the backend, with the
>>>> backend operating with its internal IR in SSA form).
>>>>
>>>> Well, contrast to LLVM assuming everything is always in SSA form.
>>>>
>>>> ...
>>>>
>>>>
>>>
>>> I also have considered split expressions.
>>>
>>> For instance
>>>
>>> if (a*b+c) {}
>>>
>>> into
>>>
>>> register int r1 = a * b;
>>> register int r2 = r1 + c;
>>> if (r2) {}
>>>
>>> This would make easier to add overflow checks in runtime (if desired)
>>> and implement things like _complex
>>>
>>> Is this what you mean by 3AC or SSA?
>>>
>>
>> 3AC means that IR expressed 3 (or sometimes more) operands per IR op.
>>
>> So:
>> MUL r1, a, b
>> Rather than, say, stack:
>> LOAD a
>> LOAD b
>> MUL
>> STORE r1
>>
>>
>> SSA:
>> Static Single Assignment
>>
>
> Oh sorry .. I knew what SSA is.
>
>> Generally:
>> Every variable may only be assigned once (more like in a functional
>> programming language);
>> Generally, variables are "merged" in the control-flow via PHI
>> operators (which variable merges in depending on which path control
>> came from).
>>
>
> I do similar merge in my flow analysis but without the concept of SSA.
>
>> IMHO, while SSA is preferable for backend analysis, optimization, and
>> code generation; it is undesirable pretty much everywhere else as it
>> adds too much complexity.
>>
>> Better IMO for the frontend compiler and main IL stage to assume that
>> local variables are freely mutable.
>>
>> Typically, global variables are excluded in most variants, and remain
>> fully mutable; but may be handled as designated LOAD/STORE operations.
>>
>>
>> In BGBCC though, full SSA only applies to temporaries. Normal local
>> variables are merely flagged by "version", and all versions of the
>> same local variable implicitly merge back together at each branch/label.
>>
>
> Sorry what is BGBCC ? (C compiler?)
>
It is my C compiler.
Can be found within my current main project:
https://github.com/cr88192/bgbtech_btsr1arch/tree/master/bgbcc22
It started out, long ago, as a fork off my scripting language, which was
originally a JavaScript clone.
First stage:
Originally written as a C interpreter of sorts.
Original idea was to use dynamically compiled C as an application
scripting language, but C wasn't a great language for this task (vs a JS
clone), and the compiler was a lot harder to debug.
Then, for a while, it was turned over to mining metadata from headers to
generate an FFI for the script language.
Its use as a C compiler was revived when I started my CPU ISA project,
as I needed a compiler for it, and other options (Clang, GCC, and LCC)
were unattractive in various ways.
Though, in all, a lot more effort in the project has gone into the C
compiler than into much of anything else, and it is still a bit of a
pain finding and fixing bugs (and avoiding causing new bugs).
It targets both BJX2 (my own ISA) or RISC-V, albeit using PE/COFF for
the latter (rather than ELF).
>> This allows some similar advantages (for analysis and optimization)
>> while limiting some of the complexities. Though, this differs from
>> temporaries which are assumed to essentially fully disappear once they
>> go outside of the span in which they exist (albeit with an awkward
>> case to deal with temporaries that cross basic-block boundaries, which
>> need to actually "exist" in some semi-concrete form, more like local
>> variables).
>>
>> Note that unless the address is taken of a local variable, it need not
>> have any backing in memory. Temporaries can never have their address
>> taken, so generally exist exclusively in CPU registers.
>>
>>
>>> This would definitely simplify expressions grammar.
>>>
>>>
>>
>
>
> It can be added in the future.
>
[toc] | [prev] | [next] | [standalone]
| From | Lawrence D'Oliveiro <ldo@nz.invalid> |
|---|---|
| Date | 2024-12-21 05:34 +0000 |
| Message-ID | <vk5k0f$3t9nf$1@dont-email.me> |
| In reply to | #389709 |
On Tue, 17 Dec 2024 13:07:44 -0600, BGB wrote: > Every variable may only be assigned once ... Note this only applies to registers.
[toc] | [prev] | [next] | [standalone]
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2024-12-16 18:12 +0100 |
| Message-ID | <vjpn29$17jub$1@dont-email.me> |
| In reply to | #389679 |
On 16.12.2024 00:53, BGB wrote: > [...] > > Pretty much all higher level control flow can be expressed via goto. A 'goto' may be used but it isn't strictly *necessary*. What *is* necessary, though, that is an 'if' (some conditional branch), and either 'goto' or recursive functions. Janis > [...]
[toc] | [prev] | [next] | [standalone]
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-12-16 18:37 +0000 |
| Message-ID | <vjps0l$18hon$1@dont-email.me> |
| In reply to | #389685 |
On 16/12/2024 17:12, Janis Papanagnou wrote: > On 16.12.2024 00:53, BGB wrote: >> [...] >> >> Pretty much all higher level control flow can be expressed via goto. > > A 'goto' may be used but it isn't strictly *necessary*. What *is* > necessary, though, that is an 'if' (some conditional branch), and > either 'goto' or recursive functions. You would do away with just about the simplest feature of any language, and insist users emulate intra-function branching via function calls? I'd like to see your design of IL that doesn't have branching! I guess it would also need closures and/or continuations. I suspect you either don't quite understand what an IL is for, or forgot that's what this is about. An IL is typically lower level than the source language, while higher level than a native code target which makes it easier to generate code for. A functional IL would make it harder than assembly.
[toc] | [prev] | [next] | [standalone]
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2024-12-16 21:39 +0100 |
| Message-ID | <vjq36r$19sis$1@dont-email.me> |
| In reply to | #389686 |
On 16.12.2024 19:37, bart wrote: > On 16/12/2024 17:12, Janis Papanagnou wrote: >> On 16.12.2024 00:53, BGB wrote: >>> [...] >>> >>> Pretty much all higher level control flow can be expressed via goto. >> >> A 'goto' may be used but it isn't strictly *necessary*. What *is* >> necessary, though, that is an 'if' (some conditional branch), and >> either 'goto' or recursive functions. > > You would do away with just about the simplest feature of any language, > and insist users emulate intra-function branching via function calls? Why are you, yet again, and still, making up nonsensical suppositions! > > I'd like to see your design of IL that doesn't have branching! I guess > it would also need closures and/or continuations. I suspect you either > don't quite understand what an IL is for, or forgot that's what this is > about. Why are you, yet again, and still, making up nonsensical suppositions! I wasn't commenting on any "IL", nor on any "emulation". I also didn't restrict my view to any specific machine model (as you have obviously been doing). You can still see what I was commenting on in the single sentence that I quoted in my post. (No more and no less.) - If you have anything to say (without any nonsensical suppositions), concerning the quote and my response, feel free to make another try. > > An IL is typically lower level than the source language, while higher > level than a native code target which makes it easier to generate code > for. A functional IL would make it harder than assembly. If you want to talk about "IL" or "assembly" search another target for discussing your ideas. Janis
[toc] | [prev] | [next] | [standalone]
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-12-16 23:26 +0000 |
| Message-ID | <vjqcut$1bld5$1@dont-email.me> |
| In reply to | #389688 |
On 16/12/2024 20:39, Janis Papanagnou wrote: > On 16.12.2024 19:37, bart wrote: >> On 16/12/2024 17:12, Janis Papanagnou wrote: >>> On 16.12.2024 00:53, BGB wrote: >>>> [...] >>>> >>>> Pretty much all higher level control flow can be expressed via goto. >>> >>> A 'goto' may be used but it isn't strictly *necessary*. What *is* >>> necessary, though, that is an 'if' (some conditional branch), and >>> either 'goto' or recursive functions. >> >> You would do away with just about the simplest feature of any language, >> and insist users emulate intra-function branching via function calls? > > Why are you, yet again, and still, making up nonsensical suppositions! > >> >> I'd like to see your design of IL that doesn't have branching! I guess >> it would also need closures and/or continuations. I suspect you either >> don't quite understand what an IL is for, or forgot that's what this is >> about. > > Why are you, yet again, and still, making up nonsensical suppositions! > > I wasn't commenting on any "IL", The subthread was about ILs. > nor on any "emulation". I also didn't > restrict my view to any specific machine model (as you have obviously > been doing). > > You can still see what I was commenting on in the single sentence that > I quoted in my post. (No more and no less.) - If you have anything to > say (without any nonsensical suppositions), concerning the quote and > my response, feel free to make another try. In that case I've no idea what you were trying to say. When somebody says that 'goto' can emulate any control structure, then clearly some of them need to be conditional; that is implied. Your reply suggested they you can do away with 'goto', and use recursive functions, in a scenario where no other control structures need exist. OK, if this is not for an IL, then it's not a language I would care for either. Why tie one hand behind your back for no good reason? > If you want to talk about "IL" or "assembly" search another target for > discussing your ideas. > So what was your proposal about?
[toc] | [prev] | [next] | [standalone]
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-12-16 17:19 -0800 |
| Message-ID | <877c7z85t2.fsf@nosuchdomain.example.com> |
| In reply to | #389692 |
bart <bc@freeuk.com> writes:
[SNIP]
> In that case I've no idea what you were trying to say.
>
> When somebody says that 'goto' can emulate any control structure, then
> clearly some of them need to be conditional; that is implied.
>
> Your reply suggested they you can do away with 'goto', and use
> recursive functions, in a scenario where no other control structures
> need exist.
>
> OK, if this is not for an IL, then it's not a language I would care
> for either. Why tie one hand behind your back for no good reason?
I read Janis's post. I saw a suggestion that certain constructs are
*theoretically* unnecessary. I saw no suggestion of any advocacy for
such an approach.
"""
A 'goto' may be used but it isn't strictly *necessary*. What *is*
necessary, though, that is an 'if' (some conditional branch), and
either 'goto' or recursive functions.
"""
[...]
> So what was your proposal about?
I saw no proposal.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
[toc] | [prev] | [next] | [standalone]
| From | BGB <cr88192@gmail.com> |
|---|---|
| Date | 2024-12-17 00:40 -0600 |
| Message-ID | <vjr6dn$1j57r$1@dont-email.me> |
| In reply to | #389694 |
On 12/16/2024 7:19 PM, Keith Thompson wrote:
> bart <bc@freeuk.com> writes:
> [SNIP]
>> In that case I've no idea what you were trying to say.
>>
>> When somebody says that 'goto' can emulate any control structure, then
>> clearly some of them need to be conditional; that is implied.
>>
>> Your reply suggested they you can do away with 'goto', and use
>> recursive functions, in a scenario where no other control structures
>> need exist.
>>
>> OK, if this is not for an IL, then it's not a language I would care
>> for either. Why tie one hand behind your back for no good reason?
>
> I read Janis's post. I saw a suggestion that certain constructs are
> *theoretically* unnecessary. I saw no suggestion of any advocacy for
> such an approach.
>
> """
> A 'goto' may be used but it isn't strictly *necessary*. What *is*
> necessary, though, that is an 'if' (some conditional branch), and
> either 'goto' or recursive functions.
> """
>
While if-call is technically possible, it is likely to result in a
significantly higher performance overhead if compared with if-goto.
Say:
if-goto:
can be a single CPU instruction on some targets.
Both RISC-V and BJX2 can do single-instruction compare-and-branch.
if-call:
A whole lot more than 1 instruction...
One may have call/return, stack frame creation, ...
Or, a backend that is very clever about inlining.
Also: if-goto can readily express pretty much every other intra-function
control flow construct:
for, while, switch, break, continue, ...
And, is less awkward for a compiler than trying to map control flow to
any of the other constructs. Also it is generally the fastest construct
for implementing these higher level constructs.
Meanwhile, if-call, good luck... You are also going to need something
beyond normal C style variable scoping to make this work acceptably. At
best, it is likely to be very awkward and likely to perform poorly.
Though, it is possible that one could make a case for a 3-way branch
operator in the IR, say:
if3(x,y) A, B, C
Which does the equivalent of, say:
if(x<y)goto A;
else if(x>y)goto B;
else goto C;
This would be a bit niche, but could be useful for things like
implementing "switch()" via binary subdivision.
Though, functionally could decay into normal if-goto in various cases.
...
> [...]
>
>> So what was your proposal about?
>
> I saw no proposal.
>
[toc] | [prev] | [next] | [standalone]
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-12-17 16:17 +0000 |
| Message-ID | <vjs871$1q22j$1@dont-email.me> |
| In reply to | #389694 |
On 17/12/2024 01:19, Keith Thompson wrote: > bart <bc@freeuk.com> writes: > [SNIP] >> In that case I've no idea what you were trying to say. >> >> When somebody says that 'goto' can emulate any control structure, then >> clearly some of them need to be conditional; that is implied. >> >> Your reply suggested they you can do away with 'goto', and use >> recursive functions, in a scenario where no other control structures >> need exist. >> >> OK, if this is not for an IL, then it's not a language I would care >> for either. Why tie one hand behind your back for no good reason? > > I read Janis's post. I saw a suggestion that certain constructs are > *theoretically* unnecessary. I saw no suggestion of any advocacy for > such an approach. > > """ > A 'goto' may be used but it isn't strictly *necessary*. What *is* > necessary, though, that is an 'if' (some conditional branch), and > either 'goto' or recursive functions. > """ This doesn't actually make much sense. So 'goto' is necessary, but 'goto' *is*? If you try to extract any meaning, it is that any control flow can be expressed either with 'goto' or with 'recursive functions'. This is what I picked up on. Who on earth would eschew 'goto' and use such a disproportionately more complex and inefficient method like recursive functions? How would you even express an arbitrary goto from random point X in a function to random point Y, which may be inside differently nested blocks, via a recursive function? If this is suppposed to be theoretically possible, then neither do you need a compiler OR a computer to run any program! Pencil and paper will work.
[toc] | [prev] | [next] | [standalone]
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2024-12-17 18:18 +0100 |
| Message-ID | <vjsbp3$1r3ns$1@dont-email.me> |
| In reply to | #389699 |
On 17.12.2024 17:17, bart wrote:
> On 17/12/2024 01:19, Keith Thompson wrote:
>> [...]
>>
>> """
>> [...] but it isn't strictly *necessary*. [...]
>> """
>
> This doesn't actually make much sense. So 'goto' is necessary, but
> 'goto' *is*?
Have you issues with reading? ("isn't" is not "is", and Keith's
"[*theoretically*] unnecessary" is not "necessary".)
>
> If you try to extract any meaning, it is that any control flow can be
> expressed either with 'goto' or with 'recursive functions'.
It's actually the other way round; you can specify functionality
using Recursive Functions, only a subset of these functions can
be expressed with simple loops by algorithmic transformations.[*]
(Of course you *could* thus also take an approach the other way
round, i.e. from an imperative 'while' to a recursive function,
but yet, beyond your [wrong] suppositions, no one was suggesting
that.)
> [...]
(I snipped the irrelevant rest that you made up in your confusion
of not knowing.)
Janis
[*] I had started to write a longer post to explain that in detail
to you, though when I saw Keith's terse post I thought that should
suffice. - Alas, no. So all I want to suggest is to read up things
yourself. I'd suggest books from F.L.Bauer ("Algorithmic Language
and Program Development", for example), or from H.Partsch on that
topic; e.g. "Specification and Transformation of Programs".
(You may come back after reading, in case you are still puzzled.)
[toc] | [prev] | [next] | [standalone]
| From | antispam@fricas.org (Waldek Hebisch) |
|---|---|
| Date | 2024-12-17 18:46 +0000 |
| Message-ID | <vjsgt6$32kfa$1@paganini.bofh.team> |
| In reply to | #389699 |
bart <bc@freeuk.com> wrote:
>
> If you try to extract any meaning, it is that any control flow can be
> expressed either with 'goto' or with 'recursive functions'.
>
> This is what I picked up on. Who on earth would eschew 'goto' and use
> such a disproportionately more complex and inefficient method like
> recursive functions?
Due to silly conding standard? Or in language that does not have
'goto'.
> How would you even express an arbitrary goto from random point X in a
> function to random point Y, which may be inside differently nested
> blocks, via a recursive function?
AFAICS in C main limitation is that you either pass all variables
as parameters (ugly and verbose) or use only global variables
(much worse than 'goto'). The following silly example shows
that 'if' can be simulated using array of function pointers and
indirect calls:
static int bar(int a) {
return a + 1;
}
static int baz(int a) {
return 2*a;
}
int
silly(int a) {
int (*t[2])(int) = {bar, baz};
return (*t[!!(a > 3)])(a);
}
If you compile it with 'gcc -S -O2' you can see that actually there
are no function calls in generated code (but generated code is clearly
crappy). However, needed optimication is really simple, so
in principle any compiler could do better. OTOH code like
this is rare in practice, so probably compiler writers did not
bother.
In similar way one can simulate dense C 'switch'.
Main point is that function call at the end of say 'F' to function
'G' which retruns in the same way as 'F' can be compiled to some
stack shuffling + goto (this is called 'tail call optimization').
IIUC at least some Scheme and ML compilers keep calls in intermediate
level representation (both have no 'goto') and convert them to
jumps only when emiting machine code.
Similar thing was used by one disassembly system: it generated "high
level code" by converting all jumps in machine code to function
calls. Later the result was cleaned up by transformations, in
particular recursion elimination.
Of course, for orginal purpose of this thread replacing 'if' by
indirect calls is useless
--
Waldek Hebisch
[toc] | [prev] | [next] | [standalone]
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-12-17 22:45 +0000 |
| Message-ID | <vjsutq$1u93o$1@dont-email.me> |
| In reply to | #389706 |
On 17/12/2024 18:46, Waldek Hebisch wrote: > bart <bc@freeuk.com> wrote: >> >> If you try to extract any meaning, it is that any control flow can be >> expressed either with 'goto' or with 'recursive functions'. >> >> This is what I picked up on. Who on earth would eschew 'goto' and use >> such a disproportionately more complex and inefficient method like >> recursive functions? > > Due to silly conding standard? Or in language that does not have > 'goto'. It was suggested that 'theoretically', 'goto' could be replaced by recursive function calls. Whether still within the context of a language with no other control flow instructions, is not known. The suggester also chose not to share examples of how it would work.
[toc] | [prev] | [next] | [standalone]
| From | antispam@fricas.org (Waldek Hebisch) |
|---|---|
| Date | 2024-12-18 00:23 +0000 |
| Message-ID | <vjt4le$38j95$1@paganini.bofh.team> |
| In reply to | #389716 |
bart <bc@freeuk.com> wrote:
> On 17/12/2024 18:46, Waldek Hebisch wrote:
>> bart <bc@freeuk.com> wrote:
>>>
>>> If you try to extract any meaning, it is that any control flow can be
>>> expressed either with 'goto' or with 'recursive functions'.
>>>
>>> This is what I picked up on. Who on earth would eschew 'goto' and use
>>> such a disproportionately more complex and inefficient method like
>>> recursive functions?
>>
>> Due to silly conding standard? Or in language that does not have
>> 'goto'.
>
> It was suggested that 'theoretically', 'goto' could be replaced by
> recursive function calls.
>
> Whether still within the context of a language with no other control
> flow instructions, is not known. The suggester also chose not to share
> examples of how it would work.
The example I gave (and you snipped) was supposed to explain how
the technique works, but it seems that it is not enough. So
let us look at another example. Start from ordinary C code that
only uses global variables (this is not strictly necessary, but
let as make such assumption for simplicity):
int n;
int * a;
int b;
int i;
...
/* Simple search loop */
for(i = 0; i < n; i++) {
if (a[i] == b) {
break;
}
}
First, express flow control using only conditional and unconditional
jump:
l0:
i = 0;
goto l3;
l1:
int c1 = a[i] == b;
if (c1) {
goto l4;
} else {
goto l2;
}
l2:
i++;
l3:
int c2 = i < n;
if (c2) {
goto l1;
} else {
goto l4;
}
l4:
;
Note, I introduced more jumps than strictly necessary, so that
hunks between labels end either in conditional or unconditional
jump.
Next, replace each hunk staring in a label, up to (but not
including) next label, by a new function. Replace final jumps
by function calls, for conditional jumps using the same trick
as in previous 'silly' example:
int n;
int * a;
int b;
int i;
void l2(void);
void l3(void);
void l4(void);
void l0(void) {
i = 0;
l3();
}
void l1(void) {
void (*(t[2]))(void) = {l4, l2};
int c1 = a[i] == b;
(*(t[c1]))();
}
void l2(void) {
i++;
l3();
}
void l3(void) {
void (*(t[]))(void) = {l1, l4};
int c2 = i < n;
(*(t[c2]))();
}
void l4(void) {
}
Note: 'l4' is different than other functions, intead of calling
something it returns, ensuring that the sequence of calls
eventually terminate.
I hope that principles are clear now. If you compile this
with gcc at -O2 you will see that there are no calls
in generated code, only jumps. Slightly better code is
generated by clang. Note that generated code uses stack
only for final return.
BTW: you can see that currently tcc do not support this
coding style, that is code generated by tcc dully performs
all calls leading possibly to stack overflow and to
lower performance. Code generated by tcc from "jumpy"
version looks slightly worse than code generated by
clang from version using calls.
--
Waldek Hebisch
[toc] | [prev] | [next] | [standalone]
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-12-18 01:24 +0000 |
| Message-ID | <vjt88q$20478$1@dont-email.me> |
| In reply to | #389718 |
On 18/12/2024 00:23, Waldek Hebisch wrote:
> bart <bc@freeuk.com> wrote:
>> On 17/12/2024 18:46, Waldek Hebisch wrote:
>>> bart <bc@freeuk.com> wrote:
>>>>
>>>> If you try to extract any meaning, it is that any control flow can be
>>>> expressed either with 'goto' or with 'recursive functions'.
>>>>
>>>> This is what I picked up on. Who on earth would eschew 'goto' and use
>>>> such a disproportionately more complex and inefficient method like
>>>> recursive functions?
>>>
>>> Due to silly conding standard? Or in language that does not have
>>> 'goto'.
>>
>> It was suggested that 'theoretically', 'goto' could be replaced by
>> recursive function calls.
>>
>> Whether still within the context of a language with no other control
>> flow instructions, is not known. The suggester also chose not to share
>> examples of how it would work.
>
> The example I gave (and you snipped) was supposed to explain how
> the technique works, but it seems that it is not enough.
It showed how to do conditional code without explicit branching. It
didn't seem to me to cover arbitrary gotos, or where recursion comes
into it.
(Actually I implemented it in my two languages to compare performance to
'straight' versions, however my test called silly() lots of times so it
wasn't a good test.)
> So
> let us look at another example. Start from ordinary C code that
> only uses global variables (this is not strictly necessary, but
> let as make such assumption for simplicity):
>
> int n;
> int * a;
> int b;
> int i;
>
> ...
> /* Simple search loop */
> for(i = 0; i < n; i++) {
> if (a[i] == b) {
> break;
> }
> }
>
> First, express flow control using only conditional and unconditional
> jump:
>
> l0:
> i = 0;
> goto l3;
> l1:
> int c1 = a[i] == b;
> if (c1) {
> goto l4;
> } else {
> goto l2;
> }
> l2:
> i++;
> l3:
> int c2 = i < n;
> if (c2) {
> goto l1;
> } else {
> goto l4;
> }
> l4:
> ;
>
> Note, I introduced more jumps than strictly necessary, so that
> hunks between labels end either in conditional or unconditional
> jump.
>
> Next, replace each hunk staring in a label, up to (but not
> including) next label, by a new function. Replace final jumps
> by function calls, for conditional jumps using the same trick
> as in previous 'silly' example:
>
> int n;
> int * a;
> int b;
> int i;
>
> void l2(void);
> void l3(void);
> void l4(void);
>
> void l0(void) {
> i = 0;
> l3();
> }
>
> void l1(void) {
> void (*(t[2]))(void) = {l4, l2};
> int c1 = a[i] == b;
> (*(t[c1]))();
> }
>
> void l2(void) {
> i++;
> l3();
> }
>
> void l3(void) {
> void (*(t[]))(void) = {l1, l4};
> int c2 = i < n;
> (*(t[c2]))();
> }
>
> void l4(void) {
> }
>
> Note: 'l4' is different than other functions, intead of calling
> something it returns, ensuring that the sequence of calls
> eventually terminate.
OK thanks for this. I tried to duplicate it based on this starting point:
#include <stdio.h>
int n=6;
int a[]={10,20,30,40,50,60};
int b=30;
int i;
int main(void) {
for(i = 0; i < n; i++) {
printf("%d\n",a[i]);
if (a[i] == b) {
break;
}
}
}
This prints 10 20 30 as it is. But the version with the function calls
showed only '10'. If I swapped '{l1, l4}' in l3(), then I got '10 10 20'.
I didn't spend too long to debug it further. I will take your word that
this works. (I tried 3 compilers all with the same results, including TCC.)
I don't fully understand it; what I got was that you first produce
linear code with labels. Each span between labels is turned into a
function. To 'step into' label L, or jump to L, I have to do L().
There would still be lots of questions (even ignoring the problems of
accessing locals), like what the return path is, or how an early return
would work (also returning a value). Or what kind of pressure the stack
would be under.
It looks like a crude form of threaded code (which, when I use that,
never returns, and it doesn't use a stack either).
I've seen enough to know that it would be last kind of IL I would choose
(unless it was the last IL left in the world - then maybe).
There is also the oddity that eliminating a simple kind of branching
relies on more elaborate branching: call and return mechanisms.
More interesting and more practical would be replacing call/return by
'goto'! (It would need to support label pointers or indirect jumps,
unless runtime code modification was allowed.)
(my test)
--------------------------
#include <stdio.h>
int n=6;
int a[]={10,20,30,40,50,60};
int b=30;
int i;
void k2(void);
void k3(void);
void k4(void);
void k0(void) {
i = 0;
k3();
}
void k1(void) {
void (*(t[2]))(void) = {k4, k2};
printf("%d\n",a[i]);
int c1 = a[i] == b;
(*(t[c1]))();
}
void k2(void) {
i++;
// k3();
}
void k3(void) {
void (*(t[]))(void) = {k4, k1};
int c2 = i < n;
(*(t[c2]))();
}
void k4(void) {
}
int main(void) {
k0();
k1();
k2();
k3();
k4();
}
[toc] | [prev] | [next] | [standalone]
| From | antispam@fricas.org (Waldek Hebisch) |
|---|---|
| Date | 2024-12-18 03:51 +0000 |
| Message-ID | <vjtgrj$396sg$1@paganini.bofh.team> |
| In reply to | #389719 |
bart <bc@freeuk.com> wrote:
> On 18/12/2024 00:23, Waldek Hebisch wrote:
>> bart <bc@freeuk.com> wrote:
>>> On 17/12/2024 18:46, Waldek Hebisch wrote:
>>>> bart <bc@freeuk.com> wrote:
>>>>>
>>>>> If you try to extract any meaning, it is that any control flow can be
>>>>> expressed either with 'goto' or with 'recursive functions'.
>>>>>
>>>>> This is what I picked up on. Who on earth would eschew 'goto' and use
>>>>> such a disproportionately more complex and inefficient method like
>>>>> recursive functions?
>>>>
>>>> Due to silly conding standard? Or in language that does not have
>>>> 'goto'.
>>>
>>> It was suggested that 'theoretically', 'goto' could be replaced by
>>> recursive function calls.
>>>
>>> Whether still within the context of a language with no other control
>>> flow instructions, is not known. The suggester also chose not to share
>>> examples of how it would work.
>>
>> The example I gave (and you snipped) was supposed to explain how
>> the technique works, but it seems that it is not enough.
>
> It showed how to do conditional code without explicit branching. It
> didn't seem to me to cover arbitrary gotos, or where recursion comes
> into it.
>
> (Actually I implemented it in my two languages to compare performance to
> 'straight' versions, however my test called silly() lots of times so it
> wasn't a good test.)
>
>> So
>> let us look at another example. Start from ordinary C code that
>> only uses global variables (this is not strictly necessary, but
>> let as make such assumption for simplicity):
>>
>> int n;
>> int * a;
>> int b;
>> int i;
>>
>> ...
>> /* Simple search loop */
>> for(i = 0; i < n; i++) {
>> if (a[i] == b) {
>> break;
>> }
>> }
>>
>> First, express flow control using only conditional and unconditional
>> jump:
>>
>> l0:
>> i = 0;
>> goto l3;
>> l1:
>> int c1 = a[i] == b;
>> if (c1) {
>> goto l4;
>> } else {
>> goto l2;
>> }
>> l2:
>> i++;
>> l3:
>> int c2 = i < n;
>> if (c2) {
>> goto l1;
>> } else {
>> goto l4;
>> }
>> l4:
>> ;
>>
>> Note, I introduced more jumps than strictly necessary, so that
>> hunks between labels end either in conditional or unconditional
>> jump.
>>
>> Next, replace each hunk staring in a label, up to (but not
>> including) next label, by a new function. Replace final jumps
>> by function calls, for conditional jumps using the same trick
>> as in previous 'silly' example:
>>
>> int n;
>> int * a;
>> int b;
>> int i;
>>
>> void l2(void);
>> void l3(void);
>> void l4(void);
>>
>> void l0(void) {
>> i = 0;
>> l3();
>> }
>>
>> void l1(void) {
>> void (*(t[2]))(void) = {l4, l2};
^^^^^^^
Should be
l2, l4
>> int c1 = a[i] == b;
>> (*(t[c1]))();
>> }
>>
>> void l2(void) {
>> i++;
>> l3();
>> }
>>
>> void l3(void) {
>> void (*(t[]))(void) = {l1, l4};
^^^^^^
l4, l2
>> int c2 = i < n;
>> (*(t[c2]))();
>> }
>>
>> void l4(void) {
>> }
>>
>> Note: 'l4' is different than other functions, intead of calling
>> something it returns, ensuring that the sequence of calls
>> eventually terminate.
>
> OK thanks for this. I tried to duplicate it based on this starting point:
>
> #include <stdio.h>
>
> int n=6;
> int a[]={10,20,30,40,50,60};
> int b=30;
> int i;
>
> int main(void) {
> for(i = 0; i < n; i++) {
> printf("%d\n",a[i]);
> if (a[i] == b) {
> break;
> }
> }
> }
>
> This prints 10 20 30 as it is. But the version with the function calls
> showed only '10'. If I swapped '{l1, l4}' in l3(), then I got '10 10 20'.
Sorry, there was a thinko: 1 is true and this is the second element
of the array, while I was thinking that the first one is true branch
and second is false branch.
> I didn't spend too long to debug it further. I will take your word that
> this works. (I tried 3 compilers all with the same results, including TCC.)
>
> I don't fully understand it; what I got was that you first produce
> linear code with labels. Each span between labels is turned into a
> function. To 'step into' label L, or jump to L, I have to do L().
Yes.
> There would still be lots of questions (even ignoring the problems of
> accessing locals), like what the return path is, or how an early return
> would work (also returning a value). Or what kind of pressure the stack
> would be under.
OK, you take a function F, it has some arguments and local variables.
And some retrun type. You create "entry function" to take the
same arguments as F and has the same return type as F. You tranform
body as above, but now each new function has the same return type
as F and arguments are arguments of original function + extra arguments,
one for each local variable of F. In "entry function" you call
function corresponding to first label passing it arguments and
initial values of local variables of F. In each subseqent call
you pass argument and values around so that they are available
in each new function. And the call is an argument to return
statement. When you want to return you simply return value,
without performing a call.
Stack use depend on optimizations in your compiler. With small
effort compiler can recognize that it will return value (possibly
void) from a call and replace such call by stack+register
shuffling + jump. Actually when there is return value, you
have something like
return lx(a0, a1, ..., ak);
which is easy to recognize due to 'return' keyword. One also
need to check that types agree (C automatically applies integer
convertions, but such convertions may produce real code, so in
such case one needs normal call). In void case one need to
check that there the call is textually last thing or that
it is followed by return statement. Stack+register
shuffling may require some code before control transfer, but
call can be replaced by jump.
So, if compiler has tail call optimization, then there is no
more stack use than maximum needed by any of the functions.
Note: I described general transformation, partially to show
that 'if' is _not_ needed. But similar style is used to
write code by hand. In hand written code people do not
bother with transforming 'if', which makes tail call
optimization a bit more complicated. OTOH, unlike rather
ugly code produced by mechanical transformation, hand
written code depending on tail call optimization may be quite
nice and readible. There is potential trouble: sometimes
author thinks that a call is a tail call, but compiler
disagrees, leading to lower efficiency.
Of course, when compiler do not have tail call optimization,
then stack use may be quite high.
> It looks like a crude form of threaded code (which, when I use that,
> never returns, and it doesn't use a stack either).
IMO it is quite different than what I know as threaded code.
> I've seen enough to know that it would be last kind of IL I would choose
> (unless it was the last IL left in the world - then maybe).
>
> There is also the oddity that eliminating a simple kind of branching
> relies on more elaborate branching: call and return mechanisms.
One motivation for eliminating 'goto' is that it is not easy to
say what effect 'goto' has on variables. I mean, variables keep
ther values, but when you may arrive to given point from several
places than values of variables depend on place that control came
from, and this may be hard to analyze. In a sense functions have
the same problem, but there is well-developed technique to reason
about function calls. So both jumps and function calls are
hard to analyze, but eliminating jumps allows re-use of work
done for functions.
> More interesting and more practical would be replacing call/return by
> 'goto'! (It would need to support label pointers or indirect jumps,
> unless runtime code modification was allowed.)
The point is that calls are strictly more powerful than jumps
(you get parameter passing and local variables).
--
Waldek Hebisch
[toc] | [prev] | [next] | [standalone]
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2024-12-18 17:26 +0100 |
| Message-ID | <vjut4b$2c9ul$1@dont-email.me> |
| In reply to | #389706 |
On 17.12.2024 19:46, Waldek Hebisch wrote:
> bart <bc@freeuk.com> wrote:
>> [...]
[ ponderings about where recursive functions might be used ]
> Due to silly conding standard? Or in language that does not have
> 'goto'.
(I'd rule out the "coding standards" hypothesis.)
Languages without 'goto', I suppose, would either have other control
constructs ('while', etc.) to formulate in an imperative style, or
be of the Functional Programming Languages type.
Janis
[toc] | [prev] | [next] | [standalone]
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-12-17 12:13 -0800 |
| Message-ID | <87ldwe6pb8.fsf@nosuchdomain.example.com> |
| In reply to | #389699 |
bart <bc@freeuk.com> writes:
> On 17/12/2024 01:19, Keith Thompson wrote:
>> bart <bc@freeuk.com> writes:
>> [SNIP]
>>> In that case I've no idea what you were trying to say.
>>>
>>> When somebody says that 'goto' can emulate any control structure, then
>>> clearly some of them need to be conditional; that is implied.
>>>
>>> Your reply suggested they you can do away with 'goto', and use
>>> recursive functions, in a scenario where no other control structures
>>> need exist.
>>>
>>> OK, if this is not for an IL, then it's not a language I would care
>>> for either. Why tie one hand behind your back for no good reason?
>> I read Janis's post. I saw a suggestion that certain constructs are
>> *theoretically* unnecessary. I saw no suggestion of any advocacy for
>> such an approach.
>> """
>> A 'goto' may be used but it isn't strictly *necessary*. What *is*
>> necessary, though, that is an 'if' (some conditional branch), and
>> either 'goto' or recursive functions.
>> """
>
> This doesn't actually make much sense. So 'goto' is necessary, but
> 'goto' *is*?
I presume you didn't write what you intended to write. Responding to
what I *think* you meant :
Either
"if" and "goto"
or
"if" and recursive functions
are theoretically sufficient to express certain kinds of algorithms
(I'm handwaving a bit). Which implies that "goto" is not strictly
necessary. It also implies that recursive functions are not strictly
necessary if you have "goto".
Since this is comp.lang.c, not comp.theory (or what comp.theory was
intended to be), I'm not going to go into the details, nor am I going to
take the time to express the concept in mathematically rigorous terms.
> If you try to extract any meaning, it is that any control flow can be
> expressed either with 'goto' or with 'recursive functions'.
Yes, either of those plus "if". It appears you understand the point.
> This is what I picked up on. Who on earth would eschew 'goto' and use
> such a disproportionately more complex and inefficient method like
> recursive functions?
Perhaps it wasn't clear initially, but it should be by now,
that Janis was talking about what's theoretically sufficient to
express general algorithms. You seized on the silly idea that
Janis was *advocating* the use of one of the two minimal methods in
an intermediate language for a compiler. The idea Janis brought
up (briefly, in passing) is about theoretical computer science,
not practical software engineering. (Janis, please correct me if
I'm mistaken.)
Repeatedly asking why anyone would do such a thing misses the point.
[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
[toc] | [prev] | [next] | [standalone]
Page 3 of 7 — ← Prev page 1 2 [3] 4 5 6 7 Next page →
Back to top | Article view | comp.lang.c
csiph-web